Abstract
Froth flotation, a widely used mineral beneficiation technique, generates substantial volumes of data, offering the opportunity to extract valuable insights from these data for production line analysis. The quality of flotation data is critical to designing accurate prediction models and process optimisation. Unfortunately, industrial flotation data are often compromised by quality issues such as outliers that can produce misleading or erroneous analytical results. A general approach is to preprocess the data by replacing or imputing outliers with data values that have no connection with the real state of the process. However, this does not resolve the effect of outliers, especially those that deviate from normal trends. Outliers often occur across multiple variables, and their values may occur in normal observation ranges, making their detection challenging. An unresolved challenge in outlier detection is determining how far an observation must be to be considered an outlier. Existing methods rely on domain experts’ knowledge, which is difficult to apply when experts encounter large volumes of data with complex relationships. In this paper, we propose an approach to conduct outlier analysis on a flotation dataset and examine the efficacy of multiple machine learning (ML) algorithms—including k-Nearest Neighbour (kNN), Local Outlier Factor (LOF), and Isolation Forest (ISF)—in relation to the statistical 2σ rule for identifying outliers. We introduce the concept of “quasi-outliers” determined by the 2σ threshold as a benchmark for assessing the ML algorithms’ performance. The study also analyses the mutual coverage between quasi-outliers and outliers from the ML algorithms to identify the most effective outlier detection algorithm. We found that the outliers by kNN cover outliers of other methods. We use the experimental results to show that outliers affect model prediction accuracy, and excluding outliers from training data can reduce the average prediction errors.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.