Recent advances in scaling‐down sampling methods in machine learning

Amr Elrafey, Janusz Wojtusiak

doi:10.1002/wics.1414

Abstract

Data sampling methods have been investigated for decades in the context of machine learning and statistical algorithms, with significant progress made in the past few years, driven by strong interest in big data and distributed computing. Most recently, progress has been made in methods that can be broadly grouped into three categories: random sampling, including density‐biased and nonuniform sampling methods; active learning methods, which are a type of semi‐supervised learning and an area of intense research; and progressive sampling methods, which can be viewed as a combination of the first two approaches. This article presents a unified view of scaling‐down sampling methods, complemented with descriptions of the relevant published literature.

WIREs Comput Stat 2017, 9:e1414. doi: 10.1002/wics.1414

This article is categorized under: Statistical and Graphical Methods of Data Analysis > Sampling

Full Text