A case-comparison study of automatic document classification utilizing both serial and parallel approaches

Beatriz Wilges, Gustavo Pereira Mateus, Rogério Cid Bastos, Mário A R Dantas

doi:10.1088/1742-6596/540/1/012001


Open Access

https://doi.org/10.1088/1742-6596/540/1/012001


Abstract

A well-known problem faced by organizations today is the high volume of available data and the processing required to transform that volume into differential information. This case-comparison study presents an automatic document classification (ADC) approach implemented under both serial and parallel paradigms. The serial approach was implemented with the RapidMiner software tool, which is recognized as the world-leading open-source system for data mining. The parallel approach used the Hadoop software environment, which follows the MapReduce programming model. The main goal of this case-comparison study is to examine the differences between these two paradigms, especially when large volumes of data, such as Web text documents, are used to build a category database. Many studies in the literature report that distributed processing of unstructured documents with Hadoop yields efficient results. Results from our research indicate a threshold to such efficiency.
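To make the contrast between the two paradigms concrete, the following is a minimal Python sketch, not the authors' implementation. It uses neither RapidMiner nor Hadoop; it only imitates the map, shuffle, and reduce steps in a single process. The keyword lists, the sample documents, and all function names are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical keyword lists per category (an assumption, not from the paper).
CATEGORY_KEYWORDS = {
    "sports": {"match", "goal", "team"},
    "finance": {"stock", "market", "bank"},
}

# Hypothetical sample documents.
DOCS = [
    "The team scored a late goal",
    "Bank stock fell as the market closed",
    "Rain is expected tomorrow",
    "The match ended in a draw",
]


def classify(text):
    """Return the category with the most keyword hits, or 'unknown' if none hit."""
    tokens = text.lower().split()
    scores = {
        category: sum(token in keywords for token in tokens)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def serial_counts(docs):
    """Serial paradigm: one loop classifies and tallies every document."""
    counts = {}
    for doc in docs:
        category = classify(doc)
        counts[category] = counts.get(category, 0) + 1
    return counts


def map_phase(doc):
    """Map step: emit one (category, 1) pair per document."""
    return [(classify(doc), 1)]


def shuffle(pairs):
    """Shuffle step: group the emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: sum the values collected for one key."""
    return key, sum(values)


def mapreduce_counts(docs):
    """MapReduce-style paradigm, simulated sequentially in one process."""
    pairs = [pair for doc in docs for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(serial_counts(DOCS))
    print(mapreduce_counts(DOCS))
```

Both functions produce the same category counts. In a real Hadoop deployment the map and reduce steps would run on separate nodes, and the job startup and data movement costs that this sketch omits are one plausible reason a threshold such as the one the abstract mentions can appear.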


Journal: Journal of Physics: Conference Series	Publication Date: Oct 13, 2014
Citations: 1	License type: cc-by


Similar Papers

Serial vs. parallel approach to screen sleep disorders: an exploratory study
Lorenzo Tonetti ... Vincenzo Natale
Biological Rhythm Research | VOL. 48
07 Apr 2017

A two‐stage committee machine of neural networks
Jen‐Feng Wang ... Mark L Nagurka
Journal of the Chinese Institute of Engineers | VOL. 32
01 Mar 2009

MO‐C‐17A‐10: Comparison of Dose Deformable Accumulation by Using Parallel and Serial Approaches
Z Gao ... J Wong
Medical Physics | VOL. 41
29 May 2014

Comparison of serial and parallel approaches using artificial neural networks for Algerian short term load forecasting
Kheir Eddine ... Oussama Laib
12 Apr 2015
