Abstract
This paper provides a unified description of Widening, a framework for using parallel (or otherwise abundant) computational resources to improve model quality. We discuss different theoretical approaches to Widening, with and without consideration of diversity. We then soften some of the underlying constraints so that Widening can be implemented in real-world algorithms. We summarize earlier experimental results that demonstrate the potential impact as well as promising implementation strategies, before concluding with a survey of related work.
Highlights
We distinguish explicit partitions of the model space and discuss how partitions can be closed under refinement, or only weakly closed once the selection operator has been applied as well
We then introduce the notion of path-based Widening, which relies on the selection operator to implicitly segment the model space
We discuss an aggregate of earlier results, highlighting potential pitfalls and providing an intuition for different approaches to realizing Widening in practice
Summary
The trend to add more cores to modern processors and the growing popularity of cloud-based compute resources have increased the importance of parallel algorithm development. Our goal is to improve model quality without increasing the overall time spent, by investing parallel resources into better exploration of the (model) search space. These types of search problems are widespread in machine learning and data mining: some models rely on numerical parameters that need optimization, discrete models turn this into a combinatorial search problem, and sometimes the problem is a hybrid of both. We provide a formalization of Widening that combines, expands, and unifies earlier publications (Akbar et al. 2012; Ivanova and Berthold 2013), which describe a number of ideal methods for Widening this type of search and reducing the impact of the greedy heuristic. These methods differ in how they widen the search using various partitioning methods.
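To make the idea of widening a greedy search concrete, the following is a minimal sketch, not the formalization from the paper. It assumes the simplest variant, in which a selection operator keeps the `width` best refinements at each step, and it ignores diversity and explicit partitioning. The names `widened_search`, `refine`, `score`, and the toy `SCORES` table are hypothetical and chosen only for illustration. The candidates are evaluated sequentially here; in a parallel setting each retained model could be refined by its own worker.

```python
from typing import Callable, Hashable, Iterable, TypeVar

M = TypeVar("M", bound=Hashable)


def widened_search(
    start: M,
    refine: Callable[[M], Iterable[M]],
    score: Callable[[M], float],
    width: int,
    max_steps: int = 100,
) -> M:
    """Greedy search that keeps `width` models per step (width=1 is plain greedy)."""
    frontier = [start]
    best = start
    for _ in range(max_steps):
        # Refinement: collect all refinements of the retained models,
        # dropping duplicates while preserving order.
        candidates = list(dict.fromkeys(c for m in frontier for c in refine(m)))
        if not candidates:
            break
        # Selection: keep only the `width` best-scoring candidates.
        frontier = sorted(candidates, key=score, reverse=True)[:width]
        if score(frontier[0]) <= score(best):
            break  # no candidate improves on the best model so far
        best = frontier[0]
    return best


# Toy model space (an assumption for illustration): models are bit tuples
# of length at most 2, and the locally best first step is misleading.
SCORES = {
    (): 0,
    (0,): 5, (1,): 4,
    (0, 0): 6, (0, 1): 6, (1, 0): 5, (1, 1): 10,
}


def refine(model):
    """Append one more bit, up to length 2."""
    return [model + (bit,) for bit in (0, 1)] if len(model) < 2 else []


def score(model):
    return SCORES[model]


greedy_result = widened_search((), refine, score, width=1)
widened_result = widened_search((), refine, score, width=2)
print(greedy_result, score(greedy_result))
print(widened_result, score(widened_result))
```

With `width=1` the search commits to the locally best first refinement and ends in a weaker model, while `width=2` also keeps the second-best refinement and reaches the better one. The number of refinement steps is the same in both runs, which is the sense in which additional resources are spent on exploration rather than on time.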