Abstract

Dealing with very large databases is one of the defining challenges in data mining research and development. When a database is not a static repository of data, or when the data come from different sources and combining them would amass a huge database for centralized processing, knowledge discovery in such environments cannot be a one-time process. Existing techniques include data sampling, windowing, bagging, boosting, batch learning, hierarchical meta-learning, and parallel and distributed data mining. This talk reviews these techniques and presents our recent research on multi-layer induction and on synthesizing association rules from different data sources.

