Abstract

Data mining is a complex process that involves the interaction of human knowledge and skills with technology, and it must be supported by clearly defined processes and procedures. This chapter describes CRISP-DM (Cross-Industry Standard Process for Data Mining), a fully documented, freely available, robust, and nonproprietary data mining model. The chapter analyzes the contents of the official Version 1.0 document and serves as a guide through the entire implementation process. The main purpose of data mining is to extract hidden and useful knowledge from large volumes of raw data. Data mining brings together different disciplines, such as software engineering, computer science, business intelligence, human-computer interaction, and analysis techniques. Phases from these disciplines must be combined to deliver the outcomes of a data mining project. The CRISP-DM methodology defines its processes hierarchically at four levels of abstraction, which allows a project to be structured modularly, makes it more maintainable and scalable, and, most importantly, reduces its complexity. CRISP-DM describes the life cycle of a data mining project as six phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.

