Research on Attribute Dimension Partition Based on SVM Classifying and MapReduce

Wenbin Zhao, Hou Wen, Feng Wu, Yongchuan Nie, Tongrang Fan

doi:10.1007/s11277-018-5301-9

Abstract

Data analysis is closely tied to the attribute dimensions of the data. Traditional extraction and partitioning of attribute dimensions is manual and too inefficient to meet the needs of big data analysis. This paper proposes an attribute dimension partition scheme based on SVM classification and MapReduce for big data analysis. The scheme improves the traditional SVM classification method by combining it with Euclidean distance theory to overcome its disadvantages, and adopts a penalty coefficient to reduce the imbalance of the data distribution. With the improved SVM classification method, the implementation of attribute dimension partition takes the MapReduce model of Hadoop as its processing engine, uses TF–IDF vectors to store the extracted attribute dimensions, and uses the k-means clustering algorithm to partition them into clusters. The experimental results show that the execution efficiency of the proposed method is improved and that, while the rationality of the partition is guaranteed, increasing the number of data attributes does not significantly increase the execution time.
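The TF–IDF and k-means stages named in the abstract can be illustrated with a minimal single-process sketch. This is not the authors' implementation: the map and reduce functions only imitate the shape of a Hadoop job, the improved SVM step is omitted, and every name here (`map_phase`, `reduce_phase`, `tfidf_vectors`, `kmeans`) as well as the sample records and the choice of initial centroids are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def map_phase(doc_id, text):
    """Emit (term, (doc_id, count)) pairs for one record, as a mapper might."""
    counts = Counter(text.lower().split())
    return [(term, (doc_id, n)) for term, n in counts.items()]

def reduce_phase(pairs):
    """Group the mapper output by term, imitating shuffle and reduce."""
    grouped = defaultdict(dict)
    for term, (doc_id, n) in pairs:
        grouped[term][doc_id] = n
    return grouped

def tfidf_vectors(docs):
    """Build one TF-IDF vector per record over a shared, sorted vocabulary."""
    pairs = [p for doc_id, text in docs.items() for p in map_phase(doc_id, text)]
    postings = reduce_phase(pairs)
    vocab = sorted(postings)
    n_docs = len(docs)
    lengths = {d: len(t.split()) for d, t in docs.items()}
    vectors = {}
    for doc_id in docs:
        vec = []
        for term in vocab:
            tf = postings[term].get(doc_id, 0) / lengths[doc_id]
            idf = math.log(n_docs / len(postings[term]))
            vec.append(tf * idf)
        vectors[doc_id] = vec
    return vocab, vectors

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(vectors, init_ids, iterations=10):
    """Plain k-means; centroids start at the vectors named in init_ids."""
    centroids = [list(vectors[i]) for i in init_ids]
    assignment = {}
    for _ in range(iterations):
        # Assign each vector to its nearest centroid.
        assignment = {
            d: min(range(len(centroids)), key=lambda k: euclidean(v, centroids[k]))
            for d, v in vectors.items()
        }
        # Recompute each centroid as the mean of its members.
        for k in range(len(centroids)):
            members = [vectors[d] for d, c in assignment.items() if c == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

# Hypothetical attribute descriptions, invented for this sketch.
docs = {
    "a": "price cost price",
    "b": "cost price discount",
    "c": "latitude longitude",
    "d": "longitude latitude altitude",
}
vocab, vectors = tfidf_vectors(docs)
assignment = kmeans(vectors, init_ids=["a", "c"])
print(assignment)
```

In a real deployment the map and reduce functions would run as distributed Hadoop tasks rather than in-memory list operations, and the fixed initial centroids would be replaced by whatever initialization strategy the system uses.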

Journal: Wireless Personal Communications
Publication Date: Feb 22, 2018
Citations: 9

Similar Papers

VTWM: An Incremental Data Extraction Model Based on Variable Time-Windows
Weixing Jia ... Yang Xu
EAI Endorsed Transactions on Collaborative Computing | VOL. -
13 Jul 2018

Extractive text summarization system to aid data extraction from full text in systematic review development
Duy Duc An Bui ... Siddhartha Jonnalagadda
Journal of Biomedical Informatics | VOL. 64
27 Oct 2016

RDCRMG: A Raster Dataset Clean & Reconstitution Multi-Grid Architecture for Remote Sensing Monitoring of Vegetation Dryness
Sijing Ye ... Zuliang Zhao
Remote Sensing | VOL. 10
30 Aug 2018

Cross-Domain Transfer Learning for Demand Forecasting: Using Social Media Sentiment from Related Industries
Sweta Kumari
Journal for Research in Applied Sciences and Biotechnology | VOL. 1
30 Jun 2022
