Selective AnDE for large data learning: a low-bias memory constrained approach

Shenglei Chen,Ana M Martínez,Geoffrey I Webb,Limin Wang

doi:10.1007/s10115-016-0937-9

Abstract

Learning from data that are too big to fit into memory poses great challenges to currently available learning approaches. Averaged n-Dependence Estimators (AnDE) allow flexible learning from out-of-core data by varying the value of n (the number of super parents). Hence, AnDE is especially appropriate for learning from large quantities of data. The memory requirement of AnDE, however, increases combinatorially with the number of attributes and the parameter n. In large data learning, the number of attributes is often large, and a high n is also desirable to achieve low-bias classification. To achieve the lower bias of AnDE with higher n but with a smaller memory requirement, we propose a memory constrained selective AnDE algorithm that makes two passes through the training examples. The first pass performs attribute selection on super parents according to the available memory, whereas the second learns an AnDE model with parents drawn only from the selected attributes. Extensive experiments show that the new selective AnDE has considerably lower bias and prediction error relative to A$n'$DE, where $n' = n-1$, while maintaining the same space complexity and similar time complexity. The proposed algorithm works well on categorical data; numerical data sets need to be discretized first.
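
To make the combinatorial growth in memory concrete, the following minimal sketch counts the cells in a joint frequency table under a simplifying assumption that is not stated in the abstract: every combination of n + 1 attributes stores one count per joint value assignment and per class, and all attributes have the same number of values. The function name and its parameters are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from math import comb


def ande_table_entries(num_attributes: int, n: int,
                       values_per_attribute: int, num_classes: int) -> int:
    """Rough cell count of an AnDE-style joint frequency table (assumed formula)."""
    # Assumption: one cell per class, per combination of n + 1 attributes,
    # per joint assignment of values to those attributes.
    attribute_combinations = comb(num_attributes, n + 1)
    return num_classes * attribute_combinations * values_per_attribute ** (n + 1)


if __name__ == "__main__":
    # Illustrative sizes only: 50 attributes, 5 values each, 2 classes.
    for n in (0, 1, 2, 3):
        cells = ande_table_entries(num_attributes=50, n=n,
                                   values_per_attribute=5, num_classes=2)
        print(f"n={n}: {cells} cells")
```

Under this assumption, each increment of n multiplies the table size by a factor that itself grows with the number of attributes, which is why restricting the super parents to a selected subset can bring a higher n within a fixed memory budget.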

Journal: Knowledge and Information Systems
Publication Date: Mar 31, 2016
Citations: 18
