Abstract

Knowledge discovery in databases (KDD) is defined as the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data (W.J. Frawley et al., 1991). KDD is an iterative process involving five steps that lead to the final goal of extracting useful information. The five steps are: selection (determining which fields and records are to be analysed); preprocessing (cleaning the data by removing noise and outliers, if appropriate, and deciding on strategies for missing attribute values); transformation (representing the data by new features and reducing its dimensionality); data mining (deciding which algorithms to apply to the data, e.g., classification, regression, rule induction, or neural networks); and interpretation/evaluation (feasibility analysis of the results from the data mining step). There are two general 'goals' in KDD: verification of a hypothesis; and discovery, where the 'system' autonomously discovers patterns. Within the KDD process, a data warehouse is typically employed as the 'source' of the KDD exercise. The power industry has evolved to become dependent upon computerised environments, with more online data being stored for later extraction and investigation. Two key areas where KDD has been shown to be applicable are the analysis of energy pooling and settlement data, and the condition monitoring of power system plant.
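The five steps can be illustrated with a minimal Python sketch. It is not the method of the paper: the records, the field names (`temp_c`, `load_mw`, `operator`), the derived feature, and the threshold rule are all hypothetical, chosen only to show how selection, preprocessing, transformation, data mining, and interpretation might chain together on condition-monitoring style data.

```python
from statistics import median

# Hypothetical condition-monitoring readings, invented for illustration only.
RECORDS = [
    {"unit": "T1", "temp_c": 61.0, "load_mw": 40.0, "operator": "a"},
    {"unit": "T1", "temp_c": None, "load_mw": 42.0, "operator": "b"},
    {"unit": "T2", "temp_c": 95.0, "load_mw": 41.0, "operator": "a"},
    {"unit": "T2", "temp_c": 63.0, "load_mw": 39.0, "operator": "c"},
]


def select(records, fields):
    """Selection: keep only the fields that are to be analysed."""
    return [{f: r[f] for f in fields} for r in records]


def preprocess(records, field):
    """Preprocessing: fill missing values of `field` with the median of the observed values."""
    observed = [r[field] for r in records if r[field] is not None]
    fill = median(observed)
    return [{**r, field: fill if r[field] is None else r[field]} for r in records]


def transform(records):
    """Transformation: replace two raw fields with one derived feature (assumed, not from the paper)."""
    return [
        {"unit": r["unit"], "temp_per_mw": r["temp_c"] / r["load_mw"]}
        for r in records
    ]


def mine(records, threshold):
    """Data mining: a trivial threshold rule standing in for a real algorithm."""
    return [r for r in records if r["temp_per_mw"] > threshold]


def interpret(flagged):
    """Interpretation/evaluation: summarise which units were flagged."""
    return sorted({r["unit"] for r in flagged})


def run_kdd(records):
    """Chain the five steps in order; in practice the process is iterative."""
    selected = select(records, ["unit", "temp_c", "load_mw"])
    cleaned = preprocess(selected, "temp_c")
    features = transform(cleaned)
    flagged = mine(features, threshold=2.0)
    return interpret(flagged)


if __name__ == "__main__":
    print(run_kdd(RECORDS))
```

In a real exercise each function would be revisited as later steps expose problems, for example changing the missing-value strategy after the mining step yields implausible patterns, and the threshold rule would be replaced by a classification, regression, or rule-induction algorithm.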

