Abstract

William A. Maniatty
Department of Computer Science, University at Albany, Albany, NY 12222
maniatty@cs.albany.edu

Mohammed J. Zaki
Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180
zaki@cs.rpi.edu

The current generation of data mining tools has limited capacity and performance, since these tools tend to be sequential. This paper explores a migration path out of this bottleneck by considering an integrated hardware and software approach to parallelizing data mining. Our analysis shows that parallel data mining solutions require the following components: parallel data mining algorithms, parallel and distributed databases, parallel file systems, parallel I/O, tertiary storage, management of online data, support for heterogeneous data representations, security, quality of service, and pricing metrics. State-of-the-art technology in these areas is surveyed with an eye towards an integration strategy leading to a complete solution.

General Terms
Scalable Knowledge Discovery and Data Mining
