Towards CRISP-ML(Q): A Machine Learning Process Model with Quality Assurance Methodology

Stefan Studer, Thanh Binh Bui, Ludwig Winkler, Klaus-Robert Müller, Christian Drescher, Alexander Hanuschkin, Steven Peters

doi:10.3390/make3020020

Abstract

Machine learning is an established and frequently used technique in industry and academia, but a standard process model to improve success and efficiency of machine learning applications is still missing. Project organizations and machine learning practitioners face manifold challenges and risks when developing machine learning applications and need guidance to meet business expectations. This paper therefore proposes a process model for the development of machine learning applications, covering six phases from defining the scope to maintaining the deployed machine learning application. Business and data understanding are executed simultaneously in the first phase, as both have considerable impact on the feasibility of the project. The next phases comprise data preparation, modeling, evaluation, and deployment. Special focus is applied to the last phase, as a model running in changing real-time environments requires close monitoring and maintenance to reduce the risk of performance degradation over time. For each task of the process, this work proposes a quality assurance methodology suited to addressing challenges in machine learning development that are identified in the form of risks. The methodology is drawn from practical experience and scientific literature, and has proven to be general and stable. The process model expands on CRISP-DM, a data mining process model that enjoys strong industry support but fails to address machine-learning-specific tasks. The presented work proposes an industry- and application-neutral process model tailored for machine learning applications with a focus on technical tasks for quality assurance.

Highlights

Many industries, such as manufacturing [1,2], personal transportation [3], and healthcare [4,5], are currently undergoing a process of digital transformation, challenging established processes with machine learning driven approaches
The expanding demand is highlighted by the Gartner report [6], claiming that organizations expect to double the number of Machine Learning (ML) projects within a year
It is best practice to hold back an additional test set that is disjoint from the validation and training sets, stored only for a final evaluation and never shipped to any partner, so that the performance metrics can be measured
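
The hold-out practice described in the last highlight can be illustrated with a minimal sketch. It assumes a simple random split over an in-memory list of samples; the function name `three_way_split`, the split fractions, and the seed are hypothetical choices for illustration and are not taken from the paper.

```python
import random


def three_way_split(items, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle items and return disjoint (train, validation, test) lists.

    Illustrative only: the fractions and seed are assumptions, not values
    prescribed by the paper.
    """
    if not 0 < val_fraction + test_fraction < 1:
        raise ValueError("validation and test fractions must sum to less than 1")

    shuffled = list(items)
    # A dedicated, seeded generator keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)

    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)

    # Slices of one shuffled list cannot overlap, so the sets are disjoint.
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


train, val, test = three_way_split(range(100))
```

In this sketch the `test` list would be stored separately and used only once, for the final evaluation, while `train` and `val` are the only parts used during model development.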

Summary

Introduction

Many industries, such as manufacturing [1,2], personal transportation [3], and healthcare [4,5], are currently undergoing a process of digital transformation, challenging established processes with machine learning driven approaches. The expanding demand is highlighted by the Gartner report [6], claiming that organizations expect to double the number of Machine Learning (ML) projects within a year. Data and software quality, among others, have been named as key challenges in the machine learning life cycle. Another challenge is the lack of guidance through standards and development process models specific to ML applications.

Journal: Machine Learning and Knowledge Extraction
Publication Date: Apr 22, 2021
Citations: 87
License type: CC BY 4.0

