Data are inherently uncertain in most applications. Uncertainty arises whenever an experiment such as sampling is about to be performed: its result is unknown in advance and may take any of a variety of outcomes. With the rapid development of data collection and distributed storage technologies, big data have become a bigger-than-ever problem, and dealing with big data whose distribution is uncertain is one of the most important issues in big data research. In this paper, we propose a Parallel Sampling method based on Hyper Surface for big data with uncertainty distribution, namely PSHS, which adopts the universal concept of the Minimal Consistent Subset (MCS) of Hyper Surface Classification (HSC). Our approach to handling uncertainty when sampling from big data rests on three observations: (1) the inherent structure of the original sample set is unknown to us; (2) the boundary set formed by all possible separating hyper surfaces is a fuzzy set; and (3) membership of elements in the MCS is itself uncertain. PSHS is implemented on the MapReduce framework, a powerful parallel programming model currently used in many fields. Experiments have been carried out on several data sets, including real-world data from the UCI repository and synthetic data. The results show that our algorithm shrinks data sets while preserving their distribution, which is useful for uncovering the inherent structure of the data. Furthermore, evaluations in terms of speedup, scaleup, and sizeup validate its efficiency.
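To make the map/reduce decomposition concrete, the following is a minimal sketch of the general pattern the abstract describes: each map task condenses its data split to a locally consistent subset of samples, and the reduce step merges and re-condenses the candidates. The abstract does not give PSHS's actual algorithm, so the greedy consistency test below (a 1-nearest-neighbour stand-in for the HSC boundary check) and all function names are hypothetical.

```python
# Illustrative sketch only: the per-split "consistent subset" criterion and
# the names below are hypothetical stand-ins, not the paper's PSHS algorithm.
from typing import List, Tuple

Point = Tuple[List[float], int]  # (feature vector, class label)

def predict(subset: List[Point], x: List[float]) -> int:
    """1-nearest-neighbour consistency check (a stand-in for classifying
    x against the hyper-surface boundary induced by the subset)."""
    nearest = min(subset,
                  key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))
    return nearest[1]

def map_phase(split: List[Point]) -> List[Point]:
    """Per-split work: greedily keep only the points needed to classify
    the rest of the split correctly, approximating a local MCS."""
    kept: List[Point] = []
    for x, y in split:
        if not kept or predict(kept, x) != y:  # misclassified -> keep it
            kept.append((x, y))
    return kept

def reduce_phase(partial_subsets: List[List[Point]]) -> List[Point]:
    """Merge the per-split candidates and re-run the same consistency
    pass so the global sample stays small."""
    merged = [p for subset in partial_subsets for p in subset]
    return map_phase(merged)

def parallel_sampling(data: List[Point], n_splits: int) -> List[Point]:
    """Split the data, condense each split independently (the parallel
    map step), then merge the results (the reduce step)."""
    splits = [data[i::n_splits] for i in range(n_splits)]
    return reduce_phase([map_phase(s) for s in splits])
```

Under this pattern, the map tasks are embarrassingly parallel over the splits, which is what the speedup, scaleup, and sizeup evaluations mentioned above would measure.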