Abstract

Background: The selection, development, or comparison of machine learning methods in data mining can be a difficult task, depending on the target problem and the goals of a particular study. Numerous publicly available real-world and simulated benchmark datasets have emerged from different sources, but their organization and adoption as standards have been inconsistent. As such, selecting and curating specific benchmarks remains an unnecessary burden on machine learning practitioners and data scientists.

Results: The present study introduces an accessible, curated, and developing public benchmark resource to facilitate identification of the strengths and weaknesses of different machine learning methodologies. We compare meta-features among the current set of benchmark datasets in this resource to characterize the diversity of available data. Finally, we apply a number of established machine learning methods to the entire benchmark suite and analyze how datasets and algorithms cluster in terms of performance. From this study, we find that existing benchmarks lack the diversity needed to properly evaluate machine learning algorithms, and that several gaps in benchmarking problems still need to be addressed.

Conclusions: This work represents another important step towards understanding the limitations of popular benchmarking suites and developing a resource that connects existing benchmarking standards to more diverse and efficient standards in the future.

Highlights

  • The term benchmarking is used in machine learning (ML) to refer to the evaluation and comparison of ML methods regarding their ability to learn patterns in ‘benchmark’ datasets that have been applied as ‘standards’

  • To provide a basis for comparison, we evaluated 13 supervised ML classification methods from scikit-learn [22] on the 165 datasets in the Penn Machine Learning Benchmark (PMLB)

  • We evaluated the ML methods using balanced accuracy [25, 26] as the scoring metric, a normalized version of accuracy that accounts for class imbalance by calculating accuracy on a per-class basis and then averaging the per-class accuracies (a short code sketch follows this list)
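
A minimal sketch of this evaluation setup is shown below, assuming the pmlb Python package is installed (pip install pmlb). The 'mushroom' dataset, the random-forest classifier, and the single train/test split are illustrative choices, not the paper's full 13-method, 165-dataset pipeline.

    import numpy as np
    from pmlb import fetch_data
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import balanced_accuracy_score
    from sklearn.model_selection import train_test_split

    # Fetch one PMLB classification dataset as a feature matrix X and labels y.
    X, y = fetch_data('mushroom', return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=42)

    clf = RandomForestClassifier(random_state=42).fit(X_train, y_train)
    y_pred = clf.predict(X_test)

    # Balanced accuracy is the mean of the per-class accuracies (recalls),
    # so minority classes count as much as majority classes.
    score = balanced_accuracy_score(y_test, y_pred)

    # Equivalent manual computation of the same quantity.
    per_class = [(y_pred[y_test == c] == c).mean() for c in np.unique(y_test)]
    assert np.isclose(score, np.mean(per_class))
    print(f'Balanced accuracy: {score:.3f}')

Looping the same pattern over pmlb's classification_dataset_names and a list of scikit-learn classifiers reproduces the shape of the benchmark comparison described above.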


Introduction

The term benchmarking is used in machine learning (ML) to refer to the evaluation and comparison of ML methods with respect to their ability to learn patterns in 'benchmark' datasets that have been applied as 'standards'. Comparisons can be made over a range of evaluation metrics, e.g., power to detect signal, prediction accuracy, computational complexity, and model interpretability. Benchmarking in this way is important for demonstrating new methodological capabilities and for guiding the selection of an appropriate ML method for a given problem. The selection, development, or comparison of machine learning methods in data mining can be a difficult task, depending on the target problem and the goals of a particular study. Numerous publicly available real-world and simulated benchmark datasets have emerged from different sources, but their organization and adoption as standards have been inconsistent. As such, selecting and curating specific benchmarks remains an unnecessary burden on machine learning practitioners and data scientists.
