Sunday, November 3, 2013

1. Knowledge Discovery in Databases (KDD) and Data Mining (DM)

More and more data of different types (text, audio, images, video, ...) are collected nowadays from a variety of sources (telecommunications, science, business, health-care systems, the WWW, ...).

Due to their quantity and complexity, it is impossible for humans to exploit these data collections manually. This is where Knowledge Discovery in Databases (KDD) comes in: it aims at discovering knowledge hidden in vast amounts of data.

The KDD process consists of the following steps (see the picture below):
  1. Selection of data which are relevant to the analysis task
  2. Preprocessing of these data, including tasks like data cleaning and data integration
  3. Transformation of the data into forms appropriate for mining
  4. Application of Data Mining algorithms for the extraction of patterns
  5. Interpretation/evaluation of the generated patterns so as to identify those patterns that represent real knowledge, based on some interestingness measures.
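To make the five steps more concrete, here is a minimal sketch in Python. It is only an illustration under assumptions of mine: the toy records, the function names, the choice of "items bought together" as the pattern type and the support threshold of 0.5 are all hypothetical and do not come from the KDD paper.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw records: (customer, basket, channel); None marks a missing basket.
RAW = [
    ("c1", "Bread, milk", "store"),
    ("c2", "bread,  Butter", "store"),
    ("c3", None, "store"),
    ("c4", "milk, bread, butter", "web"),
    ("c5", "Milk, Bread", "store"),
]

def select(records):
    # 1. Selection: keep only the records relevant to the task (in-store purchases).
    return [r for r in records if r[2] == "store"]

def preprocess(records):
    # 2. Preprocessing: a very simple cleaning step that drops missing baskets.
    return [r for r in records if r[1] is not None]

def transform(records):
    # 3. Transformation: turn each basket string into a set of normalized items.
    return [frozenset(item.strip().lower() for item in r[1].split(","))
            for r in records]

def mine(baskets):
    # 4. Data mining: count how often each pair of items occurs together.
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

def evaluate(counts, n_baskets, min_support=0.5):
    # 5. Interpretation/evaluation: keep pairs whose support reaches the
    #    threshold (support plays the role of the interestingness measure here).
    return {pair: c / n_baskets
            for pair, c in counts.items()
            if c / n_baskets >= min_support}

baskets = transform(preprocess(select(RAW)))
patterns = evaluate(mine(baskets), len(baskets))
print(patterns)
```

With these toy data only the pair (bread, milk) survives the evaluation step, since it appears in two of the three baskets that remain after selection and cleaning.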


The Data Mining (DM) step is one of the core steps of the KDD process.
Its goal is to apply data analysis and knowledge discovery algorithms that
produce a particular enumeration of patterns (or models) over the data.

The KDD process was introduced in the following paper:

Usama Fayyad, Gregory Piatetsky-Shapiro and Padhraic Smyth (1996), “From Data Mining to Knowledge Discovery: An Overview,” Advances in Knowledge Discovery and Data Mining, U. Fayyad et al. (Eds.), AAAI/MIT Press

What do we mean by the term patterns or data mining models?
One can think of patterns as compact summaries of the data or as higher-level descriptions of the data. Different types of patterns exist, such as clusters, decision trees, association rules, frequent itemsets and sequential patterns.
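As a small illustration of one of these pattern types, the sketch below evaluates a frequent itemset and an association rule with the usual support and confidence measures. The transactions and the function names are hypothetical, chosen only for this example.

```python
# Hypothetical transactions (sets of purchased items).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset):
    # Fraction of transactions that contain every item of the itemset.
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    # Share of the transactions containing the antecedent that also
    # contain the consequent.
    return support(antecedent | consequent) / support(antecedent)

print(support({"bread", "milk"}))        # support of the itemset {bread, milk}
print(confidence({"bread"}, {"milk"}))   # confidence of the rule bread -> milk
```

Here the itemset {bread, milk} is a summary of the data in the sense above: instead of listing all transactions, it states that half of them contain both items.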

There are many informative resources on DM and KDD that one can use for further reading. I list some of them below:
