The subject of the research is the Isolation Forest model, an efficient tool for detecting anomalies and outliers in measurement data, applicable in fields where high accuracy and reliability of measurements are important. The goal of the study is to apply the Isolation Forest model to identify anomalous patterns that differ from typical patterns in the output data. The model does this by constructing many randomized decision trees that isolate anomalous observations from normal ones; anomalies tend to be isolated after fewer splits. The task of the research is to detect outliers in data obtained during preparation for international comparisons on the state primary standard of mass and volume flow rate of fluid and of mass and volume of fluid flowing through a pipeline, measured with a Coriolis flowmeter. Data collected during metrological studies are processed by the model, which identifies anomalous or outlier values that may indicate systematic or random measurement errors. It enables quick detection of even small deviations in the data, helping to maintain high accuracy and reliability of measurement results. The main traditional statistical methods for detecting outliers are Grubbs' criterion, the interquartile range rule, and the standard deviation rule. They are sensitive to sample size but are simple and easy to interpret. The Isolation Forest model, however, also has limitations: it can be resource-demanding for large datasets, and it requires proper parameter tuning to achieve optimal results. The results of the research include an assessment of the Isolation Forest model's effectiveness by comparison with traditional outlier detection methods. 
Comparing the results of different approaches on the same task is an effective way to evaluate the model's performance. In conclusion, the article outlines prospects for further research in this direction. Future work will focus on developing methods for detecting anomalies in measurement data and on improving the accuracy and reliability of measurement results, with potential applications in science and industry.
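The comparison described above can be illustrated with a minimal sketch. It is not the study's implementation: the readings are synthetic values invented for illustration, the function names (`isolation_scores`, `iqr_outliers`) and all parameter values are hypothetical, and the isolation forest is a simplified one-dimensional version written with the Python standard library only. In practice a library implementation such as scikit-learn's `IsolationForest` would normally be used.

```python
import math
import random
import statistics

def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def build_tree(values, depth, max_depth, rng):
    """Recursively split the sample at random thresholds until points are isolated."""
    if depth >= max_depth or len(values) <= 1 or min(values) == max(values):
        return ("leaf", len(values))
    split = rng.uniform(min(values), max(values))
    left = [v for v in values if v < split]
    right = [v for v in values if v >= split]
    return ("node", split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))

def path_length(tree, x, depth=0):
    """Number of splits needed to reach x's leaf, adjusted for unsplit leaf size."""
    if tree[0] == "leaf":
        return depth + c_factor(tree[1])
    _, split, left, right = tree
    return path_length(left if x < split else right, x, depth + 1)

def isolation_scores(values, n_trees=100, sample_size=64, seed=0):
    """Anomaly score in (0, 1]; higher means easier to isolate (more anomalous)."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(values))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(values, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(sample_size)
    scores = []
    for x in values:
        mean_path = sum(path_length(t, x) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_path / norm))
    return scores

def iqr_outliers(values, k=1.5):
    """Traditional rule: flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]

# Synthetic flow readings (hypothetical units): 40 values near 100.00 plus one gross error.
readings = [100.00 + 0.01 * ((i * 7) % 11 - 5) for i in range(40)] + [100.60]

scores = isolation_scores(readings)
most_anomalous = max(range(len(readings)), key=scores.__getitem__)
print("Isolation forest, highest-scoring reading:", readings[most_anomalous])
print("IQR rule, flagged readings:", iqr_outliers(readings))
```

On this toy series both approaches point to the same reading. The isolation forest returns a continuous score, so a threshold (or an assumed contamination rate) still has to be chosen, which is one form of the parameter tuning mentioned above.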