Chapter 28
Data Mining Concepts
Over the last three decades, many organizations have generated a large amount
of machine-readable data in the form of files and databases. To process this
data, we have database technology available that supports query languages like
SQL. The problem with SQL is that it is a structured language that assumes the
user is aware of the database schema. SQL supports operations of relational
algebra that allow a user to select rows and columns of data from tables or to
join related information from tables based on common fields. In the next chapter, we will see that data warehousing technology affords
several types of functionality: consolidation, aggregation, and summarization of data. Data warehouses
let us view the same information along multiple dimensions. In this chapter, we
will focus our attention on another very popular area of interest known as data
mining. As the term connotes, data mining
refers to the mining or discovery of new information in terms of patterns or
rules from vast amounts of data. To be practically useful, data mining must be
carried out efficiently on large files and databases. Although some data mining
features are being provided in RDBMSs, data mining is not well integrated with database management systems.
We will briefly review the state of the art of this rather extensive
field of data mining, which uses techniques from such areas as machine
learning, statistics, neural networks, and genetic algorithms. We will
highlight the nature of the information that is discovered, the types of
problems faced when trying to mine databases, and the types of applications of
data mining. We will also survey a large number of the commercial tools
available (see Section 28.7) and describe a number of research advances that
are needed to make this area viable.
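
To make the contrast between querying and mining concrete, the following is a minimal sketch in Python, not a method taken from this chapter. It assumes a hypothetical table named purchase with columns txn_id and item, filled with invented rows. The first function issues an SQL query, which works only because the user already knows the schema and the question to ask. The second function makes a naive pass over the same data and reports item pairs that occur together in at least a given fraction of transactions, without being told which items to look for. The function names and the min_support parameter are illustrative assumptions.

```python
import sqlite3
from collections import Counter
from itertools import combinations

# Hypothetical toy data: (txn_id, item) rows invented for illustration only.
ROWS = [
    (1, "milk"), (1, "bread"), (1, "eggs"),
    (2, "milk"), (2, "bread"),
    (3, "bread"), (3, "eggs"),
    (4, "milk"), (4, "bread"), (4, "juice"),
]


def load(rows):
    """Create an in-memory database with the assumed 'purchase' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchase (txn_id INTEGER, item TEXT)")
    conn.executemany("INSERT INTO purchase VALUES (?, ?)", rows)
    return conn


def txns_with_item(conn, item):
    """Querying: the user must know the table, its columns, and the question."""
    cur = conn.execute(
        "SELECT DISTINCT txn_id FROM purchase WHERE item = ? ORDER BY txn_id",
        (item,),
    )
    return [row[0] for row in cur]


def frequent_pairs(conn, min_support):
    """Naive pattern discovery: count every item pair sharing a transaction
    and keep the pairs whose fraction of transactions is >= min_support."""
    baskets = {}
    for txn_id, item in conn.execute("SELECT txn_id, item FROM purchase"):
        baskets.setdefault(txn_id, set()).add(item)

    counts = Counter()
    for items in baskets.values():
        # Sort so each pair has one canonical (alphabetical) form.
        counts.update(combinations(sorted(items), 2))

    total = len(baskets)
    return {
        pair: count / total
        for pair, count in counts.items()
        if count / total >= min_support
    }


conn = load(ROWS)
print(txns_with_item(conn, "milk"))
print(frequent_pairs(conn, 0.5))
```

On this toy data, the query returns the transactions that contain milk, while the pair-counting pass surfaces the pairs (bread, milk) and (bread, eggs) on its own. Counting every pair in this way does not scale to large databases, which is one reason efficiency is a central concern in data mining.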