What Gets Measured Gets Improved: Monitoring Machine Learning Applications in Their Production Environments

Abstract
Machine learning (ML) applications face many new, hardly predictable aspects in their production environments. Detecting new aspects in an ML production environment and understanding their impacts on the ML application is crucial if organizations are to ensure the functionality of their ML applications. A monitoring entity is essential for monitoring ML applications in their production environments, both to continually minimize risks and to improve the ML application’s performance. But existing monitoring approaches struggle to deal with the specifics that arise from ML applications. We aim to derive monitoring practices and provide a holistic view of the steps required for successful ML application monitoring. Since there has been little research on this topic, we followed a qualitative research approach, i.e., we conducted an interview study combined with a multivocal literature review. Thus, we provide a theoretical framework of an ML-enabled agent in its production environment, five characteristics of ML applications’ production environments, and 17 monitoring practices: 14 practices arranged sequentially on a typical quality management cycle and three cross-sectional practices. To outline the ML specifics that arise in monitoring ML applications, we investigate the influences of the five ML production environment characteristics on the ML monitoring practices.

Similar Papers
  • Research Article
  • Cite count: 5
  • 10.3390/info14010053
Tool Support for Improving Software Quality in Machine Learning Programs
  • Jan 16, 2023
  • Information
  • Kwok Sun Cheng + 3 more

Machine learning (ML) techniques discover knowledge from large amounts of data. Modeling in ML is becoming essential to software systems in practice. ML research communities have focused on the accuracy and efficiency of ML models, while less attention has been paid to validating the qualities of ML models. Validating ML applications is a challenging and time-consuming process for developers since prediction accuracy heavily relies on generated models. ML applications are written in a relatively more data-driven programming style on top of black-box ML frameworks. All of the datasets and the ML application need to be individually investigated. Thus, the ML validation tasks take a lot of time and effort. To address this limitation, we present a novel quality validation technique, called MLVal, that increases the reliability of ML models and applications. Our approach helps developers inspect the training data and the generated features for the ML model. A data validation technique is important and beneficial to software quality since the quality of the input data affects the speed and accuracy of training and inference. Inspired by software debugging/validation for reproducing the potential reported bugs, MLVal takes as input an ML application and its training datasets to build the ML models, helping ML application developers easily reproduce and understand anomalies in the ML application. We have implemented an Eclipse plugin for MLVal that allows developers to validate the prediction behavior of their ML applications, the ML model, and the training data on the Eclipse IDE. In our evaluation, we used 23,500 documents in the bioengineering research domain. We assessed the ability of the MLVal validation technique to effectively help ML application developers: (1) investigate the connection between the produced features and the labels in the training model, and (2) detect errors early to secure the quality of models from better data. Our approach reduces the cost of engineering efforts to validate problems, improving data-centric workflows of the ML application development.

  • Research Article
  • Cite count: 1
  • 10.54254/2755-2721/51/20241165
Research on the application of machine learning in business analytics: Cases of Amazon and eBay
  • Mar 25, 2024
  • Applied and Computational Engineering
  • Rongrong Zhang

With the rapid development of the Internet and the rise of e-commerce, commercial enterprises are faced with a large amount of data and a complex market environment. In this situation, machine learning, as a powerful tool, is widely used in the field of business analysis. In this dissertation, we take Amazon and eBay as examples to study the application of machine learning in the company's business analytics, focusing on its role in market prediction, customer behavior analysis and operation optimization. By analyzing the relevant cases, we find that machine learning technology plays an important role in helping companies make more accurate decisions and improve efficiency. Studying the application of Amazon machine learning in business analytics can promote in-depth research on the application of machine learning in business in academia, and promote the application and development of machine learning technology in other business scenarios. Overall, the application of machine learning in business analytics can help companies understand customer behavior, optimize operations, and improve sales results. However, there are still some challenges, such as data quality, algorithm selection and privacy protection. Therefore, further research and innovation are necessary to advance the development of machine learning applications in business analytics.

  • Conference Article
  • Cite count: 3
  • 10.1109/icidca56705.2023.10100252
Meta Analysis of Human Body Diseases with the Application of Machine Learning
  • Mar 14, 2023
  • Nikhil Verma + 2 more

Machine learning in medical applications is one of the focus areas of researchers these days. Machine learning, together with artificial intelligence, is not only providing solutions to complex problems but has also revolutionised the medical field. The main aim of machine learning is to improve its learning process over time by taking all the relevant data and information in the form of different inputs and observations. This study reviews different medical disease prediction and detection techniques based on distinct deep learning and machine learning models. Medical conditions such as cancer-related diseases and heart, lung, thyroid and kidney diseases are discussed in this article. Detecting and analysing medical diseases is one of the prominent applications of machine and deep learning. Deep learning as a technology offers a large set of different and innovative tools that are relevant to different issues faced in the field of medical image processing. This study discusses the applications of machine learning and then some of the advancements made for different diseases such as breast cancer, heart disease, skin disease, and kidney disease.

  • Conference Article
  • Cite count: 66
  • 10.1109/issrew.2018.00024
TensorFI: A Configurable Fault Injector for TensorFlow Applications
  • Oct 1, 2018
  • Guanpeng Li + 2 more

Machine Learning (ML) applications have emerged as the killer applications for next generation hardware and software platforms, and there is a lot of interest in software frameworks to build such applications. TensorFlow is a high-level dataflow framework for building ML applications and has become the most popular one in the recent past. ML applications are also being increasingly used in safety-critical systems such as self-driving cars and home robotics. Therefore, there is a compelling need to evaluate the resilience of ML applications built using frameworks such as TensorFlow. In this paper, we build a high-level fault injection framework for TensorFlow called TensorFI for evaluating the resilience of ML applications. TensorFI is flexible, easy to use, and portable. It also allows ML application programmers to explore the effects of different parameters and algorithms on error resilience.

  • Research Article
  • Cite count: 1
  • 10.1145/3729394
Automatically Detecting Numerical Instability in Machine Learning Applications via Soft Assertions
  • Jun 19, 2025
  • Proceedings of the ACM on Software Engineering
  • Shaila Sharmin + 5 more

Machine learning (ML) applications have become an integral part of our lives. ML applications extensively use floating-point computation and involve very large/small numbers; thus, maintaining the numerical stability of such complex computations remains an important challenge. Numerical bugs can lead to system crashes, incorrect output, and wasted computing resources. In this paper, we introduce a novel idea, namely soft assertions (SA), to encode safety/error conditions for the places where numerical instability can occur. A soft assertion is an ML model automatically trained using the dataset obtained during unit testing of unstable functions. Given the values at the unstable function in an ML application, a soft assertion reports how to change these values in order to trigger the instability. We then use the output of soft assertions as signals to effectively mutate inputs to trigger numerical instability in ML applications. In the evaluation, we used the GRIST benchmark, a total of 79 programs, as well as 15 real-world ML applications from GitHub. We compared our tool with 5 state-of-the-art (SOTA) fuzzers. We found all the GRIST bugs and outperformed the baselines. We found 13 numerical bugs in real-world code, one of which had already been confirmed by the GitHub developers. While the baselines mostly found the bugs that report NaN and INF, our tool found numerical bugs with incorrect output. We showed one case where the Tumor Detection Model, trained on Brain MRI images, should have predicted "tumor", but instead, it incorrectly predicted "no tumor" due to the numerical bugs. Our replication package is located at https://figshare.com/s/6528d21ccd28bea94c32.

  • Research Article
  • Cite count: 3
  • 10.1002/cben.70012
ML@ChemE: Past, Present, and Future of Machine Learning in Chemical Engineering
  • Jun 2, 2025
  • ChemBioEng Reviews
  • Pınar Özdemir + 1 more

This paper aims to review the machine learning (ML) applications in chemical engineering (ChemE) and provide perspectives for the future. First, the evolution of ML, data structures, and ML applications in ChemE were reviewed; then, the current state of the art in ML and its ChemE applications were summarized. Finally, a perspective for the future developments, including recently popularized tools like generative artificial intelligence (AI) and large language models (LLMs), as well as major challenges and limitations, was provided. Although the initial applications were mainly on fault detection, signal processing, and process modeling, the focus had been extended to other fields involving material development, property estimation, and performance analysis in later years with the use of more complex models and datasets. In future, new developments like LLMs will likely spread more; the other new applications like automated ML, physics‐informed ML, and transfer learning, as well as field‐specific databases, will also get more attention. ML applications in ChemE‐related fields, like new energy technologies, environmental issues, and new material discovery, are expected to grow further.

  • Supplementary Content
  • Cite count: 102
  • 10.3390/ijerph18042121
A Review on Human–AI Interaction in Machine Learning and Insights for Medical Applications
  • Feb 1, 2021
  • International Journal of Environmental Research and Public Health
  • Mansoureh Maadi + 2 more

Objective: To provide a human–Artificial Intelligence (AI) interaction review for Machine Learning (ML) applications to inform how to best combine both human domain expertise and computational power of ML methods. The review focuses on the medical field, as the medical ML application literature highlights a special necessity of medical experts collaborating with ML approaches. Methods: A scoping literature review is performed on Scopus and Google Scholar using the terms “human in the loop”, “human in the loop machine learning”, and “interactive machine learning”. Peer-reviewed papers published from 2015 to 2020 are included in our review. Results: We design four questions to investigate and describe human–AI interaction in ML applications. These questions are “Why should humans be in the loop?”, “Where does human–AI interaction occur in the ML processes?”, “Who are the humans in the loop?”, and “How do humans interact with ML in Human-In-the-Loop ML (HILML)?”. To answer the first question, we describe three main reasons regarding the importance of human involvement in ML applications. To address the second question, human–AI interaction is investigated in three main algorithmic stages: 1. data producing and pre-processing; 2. ML modelling; and 3. ML evaluation and refinement. The importance of the expertise level of the humans in human–AI interaction is described to answer the third question. The number of human interactions in HILML is grouped into three categories to address the fourth question. We conclude the paper by offering a discussion on open opportunities for future research in HILML.

  • Research Article
  • Cite count: 9
  • 10.14778/3352063.3352110
Ease.ml/ci and Ease.ml/meter in action
  • Aug 1, 2019
  • Proceedings of the VLDB Endowment
  • Cedric Renggli + 5 more

Developing machine learning (ML) applications is similar to developing traditional software: it is often an iterative process in which developers navigate within a rich space of requirements, design decisions, implementations, empirical quality, and performance. In traditional software development, software engineering is the field of study which provides principled guidelines for this iterative process. However, as of today, the counterpart of "software engineering for ML" is largely missing: developers of ML applications are left with powerful tools (e.g., TensorFlow and PyTorch) but little guidance regarding the development lifecycle itself. In this paper, we view the management of ML development life-cycles from a data management perspective. We demonstrate two closely related systems, ease.ml/ci and ease.ml/meter, that provide some "principled guidelines" for ML application development: ci is a continuous integration engine for ML models and meter is a "profiler" for controlling overfitting of ML models. Both systems focus on managing the "statistical generalization power" of datasets used for assessing the quality of ML applications, namely, the validation set and the test set. By demonstrating these two systems we hope to spawn further discussions within our community on building this new type of data management systems for statistical generalization.

  • Research Article
  • Cite count: 2
  • 10.2979/esj.2022.a886946
Integrating Fairness in Machine Learning Development Life Cycle: Fair CRISP-DM
  • Dec 1, 2022
  • e-Service Journal
  • Vivek K Singh + 1 more

ABSTRACT: Developing efficient processes for building machine learning (ML) applications is an emerging topic for research. One of the well-known frameworks for organizing, developing, and deploying predictive machine learning models is cross-industry standard for data mining (CRISP-DM). However, the framework does not provide any guidelines for detecting and mitigating different types of fairness-related biases in the development of ML applications. The study of these biases is a relatively recent stream of research. To address this significant theoretical and practical gap, we propose a new framework—Fair CRISP-DM, which groups and maps these biases corresponding to each phase of an ML application development. Through this study, we contribute to the literature on ML development and fairness. We present recommendations to ML researchers on including fairness as part of the ML evaluation process. Further, ML practitioners can use our framework to identify and mitigate fairness-related biases in each phase of an ML project development. Finally, we also discuss emerging technologies which can help developers to detect and mitigate biases in different stages of ML application development.

  • Research Article
  • Cite count: 43
  • 10.1016/j.matpr.2021.12.101
Machine learning applications in healthcare sector: An overview
  • Dec 18, 2021
  • Materials Today: Proceedings
  • Virendra Kumar Verma + 1 more

  • Book Chapter
  • 10.1007/978-3-031-54827-7_20
Adversarial Evasion on LLMs
  • Jan 1, 2024
  • Rachid Guerraoui + 1 more

While Machine Learning (ML) applications have shown impressive achievements in tasks such as computer vision, NLP, and control problems, such achievements were possible, first and foremost, in the best-case-scenario setting. Unfortunately, settings where ML applications fail unexpectedly abound, and malicious ML application users or data contributors can trigger such failures. This problem became known as adversarial example robustness. While this field is in rapid development, some fundamental results have been uncovered, allowing some insight into how to make ML methods resilient to input and data poisoning. Such ML applications are termed adversarially robust. While the current generation of LLMs is not adversarially robust, results obtained in other branches of ML can provide insight into how to make them adversarially robust. Such insight would complement and augment ongoing empirical efforts in the same direction (red-teaming).

  • Research Article
  • Cite count: 20
  • 10.3390/computation11060115
Machine Learning in X-ray Diagnosis for Oral Health: A Review of Recent Progress
  • Jun 10, 2023
  • Computation
  • Mónica Vieira Martins + 5 more

The past few decades have witnessed remarkable progress in the application of artificial intelligence (AI) and machine learning (ML) in medicine, notably in medical imaging. The application of ML to dental and oral imaging has also been developed, powered by the availability of clinical dental images. The present work aims to investigate recent progress concerning the application of ML in the diagnosis of oral diseases using oral X-ray imaging, namely the quality and outcome of such methods. The specific research question was developed using the PICOT methodology. The review was conducted in the Web of Science, Science Direct, and IEEE Xplore databases, for articles reporting the use of ML and AI for diagnostic purposes in X-ray-based oral imaging. Imaging types included panoramic, periapical, bitewing X-ray images, and oral cone beam computed tomography (CBCT). The search was limited to papers published in the English language from 2018 to 2022. The initial search included 104 papers that were assessed for eligibility. Of these, 22 were included for a final appraisal. The full text of the articles was carefully analyzed and the relevant data such as the clinical application, the ML models, the metrics used to assess their performance, and the characteristics of the datasets, were registered for further analysis. The paper discusses the opportunities, challenges, and limitations found.

  • Research Article
  • Cite count: 2
  • 10.1016/j.ijbiomac.2025.142374
Application of explainable machine learning in the production of pullulan by Aureobasidium pullulans CGMCCNO.7055.
  • May 1, 2025
  • International journal of biological macromolecules
  • Shiwei Chen + 7 more

  • Book Chapter
  • 10.1049/pbcs080e_ch1
Introduction: secured co-processors for machine learning and DSP applications using biometrics
  • May 10, 2023
  • Anirban Sengupta + 1 more

The chapter gives an introduction to the security requirements of co-processors for machine learning (ML) and digital signal processing (DSP) applications and the role of biometrics in securing them. This introduction to the book aims to build readers' interest in the various DSP and ML co-processors, the behavioral synthesis design process for generating secured DSP and ML co-processors, and the importance of biometric security for hardware authentication. The chapter is organized as follows: Section 1.1 introduces co-processors, different hardware threats, and conventional security solutions; Section 1.2 highlights the significance of behavioral synthesis in designing and securing co-processors; Section 1.3 introduces co-processors for ML applications, why ML co-processors need to be secured, and how behavioral synthesis plays a crucial role in securing ML co-processors; Section 1.4 introduces the behavioral synthesis perspective in designing and securing DSP co-processors; Section 1.5 introduces biometric security based on fingerprint, face, and palmprint for ML and DSP co-processors.

  • Conference Article
  • Cite count: 86
  • 10.1109/icaice51518.2020.00102
Towards MLOps: A Case Study of ML Pipeline Platform
  • Oct 1, 2020
  • Yue Zhou + 2 more

The development and deployment of machine learning (ML) applications differ significantly from traditional applications in many ways, which have led to an increasing need for efficient and reliable production of ML applications and supported infrastructures. Though platforms such as TensorFlow Extended (TFX), ModelOps, and Kubeflow have provided end-to-end lifecycle management for ML applications by orchestrating its phases into multistep ML pipelines, their performance is still uncertain. To address this, we built a functional ML platform with DevOps capability from existing continuous integration (CI) or continuous delivery (CD) tools and Kubeflow, constructed and ran ML pipelines to train models with different layers and hyperparameters while time and computing resources consumed were recorded. On this basis, we analyzed the time and resource consumption of each step in the ML pipeline, explored the consumption concerning the ML platform and computational models, and proposed potential performance bottlenecks such as GPU utilization. Our work provides a valuable reference for ML pipeline platform construction in practice.
