M358 - Data Mining
Home| OU Study Rooms | M358 Index | Block 1 - Information Systems | Block 2 - Relational Theory | Block 3 - SQL | Block 4 - Database Development | Block 5 - Database Issues
 
Elicitation of Knowledge in Data

Facts and Rules

Facts that can be recorded for a customer's supermarket purchases are:

These facts can be used to identify patterns, which can be expressed as predicates. Such patterns constitute knowledge about customers.

Rules can be expressed in terms of antecedents and consequents: IF x THEN y, where x is the antecedent and y is the consequent.

If a customer purchases strawberries, then they will also purchase cream.

The antecedent can contain any number of conditions, but the consequent can contain only one term. A rule does not have to hold in every case; the proportion of cases in which it does hold can be recorded alongside it.

If a customer purchases strawberries, then they will also purchase cream. (proportion 80%)
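
The proportion attached to a rule can be obtained by simple counting. The following is a minimal sketch, assuming each purchase is recorded as a set of item names. The function name `rule_proportion` and the basket data are invented for illustration and are not part of the course material.

```python
def rule_proportion(baskets, antecedent, consequent):
    """Proportion of baskets satisfying the antecedent that also contain the consequent."""
    antecedent = set(antecedent)  # any number of conditions
    # Cases where the rule applies: every antecedent item is in the basket.
    matching = [b for b in baskets if antecedent <= b]
    if not matching:
        return None  # the rule never applies, so no proportion can be stated
    # Cases where the rule holds: the single consequent term is also present.
    holds = sum(1 for b in matching if consequent in b)
    return holds / len(matching)


# Hypothetical baskets, made up for this example.
baskets = [
    {"strawberries", "cream"},
    {"strawberries", "cream", "sugar"},
    {"strawberries", "cream", "milk"},
    {"strawberries", "cream", "bread"},
    {"strawberries", "bread"},
    {"milk", "bread"},
]

print(rule_proportion(baskets, {"strawberries"}, "cream"))  # 0.8
```

Five of the made-up baskets contain strawberries and four of those also contain cream, giving the 80% figure used above. The antecedent is passed as a set so that it can hold several conditions, while the consequent is a single item.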

Isolated facts do not constitute good analysis; good analysis requires examining a large number of facts. In hypothesis verification, a human analyst formulates and refines the hypothesis. Knowledge discovery in databases (KDD) automates this process.

Hypothesis Verification

Hypotheses can be developed and explored to reveal which products are sold together with other products, and in what proportions.

For complex questions the human analyst would need many queries to obtain an answer. This is time-consuming and dependent on the analyst's skill. Relationships within data are complex. As a result, hypothesis verification by SQL is an unsatisfactory method for discovering patterns in data.
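
To illustrate why this approach scales poorly, here is a rough sketch of verifying a single hypothesis with one hand-written SQL query, run through Python's standard `sqlite3` module. The `purchase` table, its columns, the sample rows and the function name `verify_hypothesis` are all assumptions made for this example.

```python
import sqlite3


def verify_hypothesis(conn, antecedent_item, consequent_item):
    """Proportion of baskets containing antecedent_item that also contain consequent_item."""
    query = """
        SELECT COUNT(DISTINCT c.basket_id) * 1.0 / COUNT(DISTINCT s.basket_id)
        FROM purchase AS s
        LEFT JOIN purchase AS c
               ON c.basket_id = s.basket_id AND c.item = ?
        WHERE s.item = ?
    """
    # SQLite yields NULL (None) on division by zero, i.e. when no basket matches.
    return conn.execute(query, (consequent_item, antecedent_item)).fetchone()[0]


# Hypothetical data: one row per item in each basket.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchase (basket_id INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO purchase VALUES (?, ?)",
    [
        (1, "strawberries"), (1, "cream"),
        (2, "strawberries"), (2, "cream"),
        (3, "strawberries"),
        (4, "cream"), (4, "bread"),
    ],
)

print(verify_hypothesis(conn, "strawberries", "cream"))
```

Each call checks exactly one hypothesis that the analyst has already thought of. Covering every pair of products, or antecedents with several conditions, would mean writing and running many such queries by hand.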

Knowledge Discovery

Knowledge discovery is the automated equivalent of the human analyst: it searches vast amounts of data for relationships. It is once again a multistep process, involving data preparation, a search for patterns, and interpretation and evaluation.
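
The three steps can be sketched as a brute-force search over every pair of items. This is only an illustration under simplifying assumptions (single-item antecedents, baskets small enough to hold in memory), not a particular KDD algorithm from the course. The function name `discover_rules`, its thresholds and the sample baskets are hypothetical.

```python
from itertools import permutations


def discover_rules(baskets, min_proportion=0.7, min_cases=2):
    """Search every ordered item pair (x, y) for rules of the form IF x THEN y."""
    # Data preparation: normalise each basket to a set of lower-case item names.
    prepared = [{item.strip().lower() for item in basket} for basket in baskets]
    items = sorted(set().union(*prepared))

    # Search for patterns: try each item as antecedent against each other item.
    rules = []
    for x, y in permutations(items, 2):
        with_x = [b for b in prepared if x in b]
        if len(with_x) < min_cases:
            continue  # too few cases to say anything about this rule
        proportion = sum(1 for b in with_x if y in b) / len(with_x)
        if proportion >= min_proportion:
            rules.append((x, y, proportion))

    # Interpretation and evaluation: list the strongest candidates first
    # so that a person can judge which ones are actually useful.
    return sorted(rules, key=lambda rule: rule[2], reverse=True)


# Hypothetical baskets with inconsistent capitalisation, made up for this example.
baskets = [
    ["Strawberries", "cream"],
    ["strawberries", "cream", "bread"],
    ["strawberries", "Cream"],
    ["strawberries", "bread"],
    ["bread", "milk"],
    ["milk", "bread"],
]

for x, y, proportion in discover_rules(baskets):
    print(f"IF {x} THEN {y} (proportion {proportion:.0%})")
```

Unlike hypothesis verification, nobody has to propose the rule about strawberries and cream in advance: the search tests every candidate and reports those that hold often enough.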

Move on to Data Mining Operations.


Comments, suggestions, ideas to
Stuart Banner