Abstract

With an increasing demand for raw materials, predictive models that support successful mineral exploration targeting are of great importance. We evaluated different machine learning techniques, with an emphasis on boosting algorithms, and implemented them in an ArcGIS toolbox. The methods were tested on an exploration dataset from the Iberian Pyrite Belt (IPB) with respect to accuracy, performance, stability, and robustness. Boosting algorithms are ensemble methods used in supervised learning for regression and classification. They combine weak classifiers, i.e., classifiers that perform slightly better than random guessing, to obtain robust classifiers. Each time a weak learner is added, the learning set is reweighted to give more importance to misclassified samples. Our test area, the IPB, is one of the oldest mining districts in the world and hosts giant volcanic-hosted massive sulfide (VMS) deposits. The spatial density of ore deposits, as well as their size and tonnage, makes the area unique and, owing to the high data availability and the number of known deposits, well suited for testing machine learning algorithms. We combined several geophysical datasets, as well as layers derived from geological maps, as predictors of the presence or absence of VMS deposits. Boosting algorithms such as BrownBoost and Adaboost were tested and compared to Logistic Regression (LR), Random Forests (RF) and Support Vector Machines (SVM) in several experiments. Performance was relatively similar across methods, with BrownBoost slightly outperforming LR and SVM (accuracies of 0.96 compared to 0.89 and 0.93, respectively). Data augmentation by perturbing deposit location led to a 7% improvement in results. Variations in the split ratio of training and test data led to a reduction in the accuracy of the prediction result, with relative stability occurring at a critical point at around 26 training samples out of 130 total samples. When fewer training samples were used, accuracy dropped significantly. In comparison with other machine learning methods, Adaboost is user-friendly due to relatively short training and prediction times, the low likelihood of overfitting and the reduced number of hyperparameters for optimization. Boosting algorithms gave high predictive accuracies, making them a potential data-driven alternative for regional-scale and/or brownfields mineral exploration.
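The reweighting idea described in the abstract can be illustrated with a minimal sketch of textbook discrete AdaBoost using one-feature threshold rules (decision stumps) as weak learners. This is not the toolbox implementation and it is not BrownBoost; the function names (`train_stump`, `adaboost_fit`, `adaboost_predict`) and the toy data are hypothetical and chosen only for illustration. Labels are assumed to be +1 (deposit present) and -1 (deposit absent).

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] >= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] >= thr else -sign

def adaboost_fit(X, y, n_rounds=10):
    """Fit a weighted ensemble of stumps; y must contain +1 / -1 labels."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    ensemble = []
    for _ in range(n_rounds):
        stump = train_stump(X, y, w)
        # Clamp the error to avoid division by zero or log(0).
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)   # weight of this weak learner
        ensemble.append((alpha, stump))
        # Reweight: misclassified samples gain weight, correct ones lose it.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Made-up 1-D toy data that no single threshold can separate.
X_toy = [[0], [1], [2], [3], [4], [5]]
y_toy = [1, 1, -1, -1, 1, 1]

model = adaboost_fit(X_toy, y_toy, n_rounds=3)
predictions = [adaboost_predict(model, row) for row in X_toy]
```

On this toy data a single stump cannot separate the two classes, whereas three boosted stumps reproduce all training labels, because each round shifts weight onto the samples the previous stumps got wrong.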

Highlights

  • With an increasing demand for raw materials, predictive models that support successful mineral exploration targeting are of great importance

  • Logistic Regression (LR) performed worst according to all accuracy measures except prospective area (PPA), for which Support Vector Machines (SVM) performed worst

  • The most important evidential layer is the distance to undifferentiated metamorphic rocks, which is quite surprising considering the genesis of the deposits


Summary

Introduction

With an increasing demand for raw materials, predictive models that support successful mineral exploration targeting are of great importance. A wide selection of algorithms has been used to find favorable areas using knowledge-based methods such as evidential belief functions (e.g., Carranza et al 2005; Tien Bui et al 2012; Ford et al 2016), fuzzy logic (e.g., Knox-Robinson 2000; Nykanen et al 2008) or data-driven approaches like weights of evidence (e.g., Chung and Agterberg 1980; Agterberg 1992a, b; Tangestani and Moore 2001; Xiao et al 2015) and logistic regression (e.g., Reddy and Bonham-Carter 1991; Oh and Lee 2008). Currently, there is a trend toward machine learning techniques such as artificial neural networks (e.g., Singer and Kouda 1996; Porwal et al 2003), decision trees (Reddy and Bonham-Carter 1991), random forest (RF) (Carranza and Laborte 2015; Rodriguez-Galiano et al 2015) and support vector machines (SVM) (Zuo and Carranza 2011; Abedi et al 2012). The black-box nature of the algorithms, as well as the time and performance required to estimate potentially many hyper-parameters, can be seen as a drawback when applying these techniques (Rodriguez-Galiano et al 2015).
