IDENTIFICATION AND PROCESSING OF DATA ANOMALIES IN MACHINE LEARNING TASKS

Iryna Kalinina, Oleksandr Gozhyj

doi:10.34185/1991-7848.itmm.2021.01.029

Abstract

The paper presents a procedure for identifying and processing data anomalies at the preliminary data processing stage of machine learning tasks. The procedure consists of three stages. At the first stage, outliers are detected in the data samples. Many methods are available for this; the choice of a particular method depends on the machine learning task, the structure of the data set, and the types of data being processed. The methods used at this stage include statistical tests, metric tests, model tests, iterative methods, machine learning methods, and ensemble methods. At the second stage, the causes of the outliers are analyzed. These causes include measurement errors, data processing errors, the results of external influences, and errors in data records. At the third stage, the data sets containing outliers undergo final processing, in which the outliers are either removed or subjected to normalizing transformations. The effectiveness of the procedure was tested on different data sets.
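As an illustration only, the first and third stages can be sketched in a few lines of Python. The abstract does not say which specific tests or transformations the authors applied, so this sketch assumes a Tukey interquartile-range rule for detection (one example of a statistical test) and a logarithmic transform as the normalizing transformation. The function names, the sample values, and the fence multiplier `k=1.5` are hypothetical. The second stage, cause analysis, is a manual step and is not modeled here.

```python
import math
import statistics


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) Tukey fences for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def detect_outliers(values, k=1.5):
    """Stage 1 (assumed method): flag points that fall outside the Tukey fences."""
    lower, upper = iqr_bounds(values, k)
    return [v < lower or v > upper for v in values]


def remove_outliers(values, flags):
    """Stage 3, option A: drop the flagged points."""
    return [v for v, is_outlier in zip(values, flags) if not is_outlier]


def log_transform(values):
    """Stage 3, option B (assumed transform): compress large values with log(1 + v).

    Requires every value to be greater than -1.
    """
    return [math.log1p(v) for v in values]


# Hypothetical sample with one suspicious reading.
sample = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 95.0]

flags = detect_outliers(sample)
cleaned = remove_outliers(sample, flags)
transformed = log_transform(sample)

print(flags)
print(cleaned)
```

Removal suits outliers traced to measurement or recording errors, while a transformation keeps every observation and only reduces the influence of extreme values. Which option fits depends on the cause identified at the second stage.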


Journal: International scientific and technical conference Information technologies in metallurgy and machine building
Publication Date: Apr 10, 2021
License type: cc-by

