Abstract

Existing Data Mining process models propose one way or another of developing projects in a structured manner, trying to reduce their complexity through effective project management. It is well known in any engineering environment that one of the management tasks that helps to reduce project problems is systematic project documentation, but few of the existing Data Mining processes address documentation. Furthermore, these few note the need to produce documentation at each phase as an input for the next, but they do not show how to do it. On the other hand, the literature contains examples of UML extensions for data mining projects, but they always focus on the model implementation side and fail to take into account the remainder of the process. In this paper, we present an extension of the UML modeling language for data mining projects (DM-UML) covering all the documentation needs of a project conforming to a standard process, namely CRISP-DM, ranging from business understanding to deployment. We also show an example of a real application of the proposed DM-UML modeling. The result of this approach is that, besides the advantages of having a standardized way of producing the documentation, it constitutes a useful and transparent tool for modeling and connecting the business understanding or modeling phase with the remainder of the project right through to deployment. It also facilitates communication with the nontechnical stakeholders involved in the project. Both problems have long been open questions in data mining.

Highlights

  • Data mining projects are approached in an unstructured, ad hoc manner, and results are very dependent on the skills of the person(s) doing the job and on the tools they use [1,2,3,4,5].

  • Most data mining projects are beset by common development problems, including trouble defining project objectives that are achievable with the available data, effort focused on the data preparation phase, experimentation with data parameters and transformation in the data mining phase, lack of an approach and methodological support for project development, and project resource management problems [1,2,6].

  • Some of the data mining project development problems can be reduced through effective project management [1,7].


Summary

Introduction

Data mining projects are approached in an unstructured, ad hoc manner, and results are very dependent on the skills of the person(s) doing the job and on the tools they use [1,2,3,4,5]. One of the management tasks that helps to reduce data mining project problems is systematic project documentation [1,8,9]. To exploit documentation to the full, it should be a corporate resource, usable by all project members and created based on standards defined within the business [1]. This leads to effective management, planning and communication [7].

