Abstract

Interval-valued data (IVD) is data in which each feature is an interval. The midpoint and boundary methods are the two common ways of representing IVD. However, because they adopt only the midpoint or only the endpoints, the structural information they capture (such as location and size) may be incomplete, which leads to poor data-processing results. To better depict the structural information of IVD, a unified representation frame (URF) is proposed. It takes into account not only the size and location information but also the relationship between them. The frame can also express the midpoint and boundary methods in a unified way. In addition, symmetrical uncertainty (SU) is adopted to quantitatively measure the relationship between features and classes, and irrelevant features are eliminated based on SU. The proposed URF_SU is applied with traditional classifiers such as LIBSVM, CART and KNN. Experimental results on synthetic and real-world datasets demonstrate that the proposed approach is more effective than other IVD representation methods in classification tasks.
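As an illustration of the SU-based relevance measure mentioned above, the following is a minimal sketch using the standard definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)). It assumes the feature values have already been discretized; the function names are hypothetical, and the paper's own discretization and elimination threshold are not reproduced here.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(feature, labels):
    """Standard SU between a discrete feature and class labels, in [0, 1].

    Assumption: `feature` is already discrete; continuous values
    would need to be binned first.
    """
    h_x = entropy(feature)
    h_y = entropy(labels)
    if h_x + h_y == 0:
        # Both variables are constant, so there is nothing to measure.
        return 0.0
    # Mutual information via I(X; Y) = H(X) + H(Y) - H(X, Y).
    h_xy = entropy(list(zip(feature, labels)))
    mutual_info = h_x + h_y - h_xy
    return 2.0 * mutual_info / (h_x + h_y)


labels = ["a", "a", "b", "b"]
su_relevant = symmetrical_uncertainty([0, 0, 1, 1], labels)    # feature determines the class
su_irrelevant = symmetrical_uncertainty([0, 1, 0, 1], labels)  # feature independent of the class
```

A feature whose SU with the class is close to 0 carries little class information and is a candidate for elimination; a value close to 1 indicates a strongly relevant feature.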

Highlights

  • In many real situations, important available information may be inaccurate, uncertain or variable

  • For LIBSVM, the RBF kernel is adopted with γ = 0.2. To reduce experimental error, the penalty factor is set to the default (C = 1)

  • The unified representation frame is proposed to solve the representation problem of interval-valued data (IVD). It can incorporate the existing representation methods and uses an adjustment factor to reach a good compromise between the midpoint and radius

Summary

INTRODUCTION

In many real situations, important available information may be inaccurate, uncertain or variable. The midpoint method takes the midpoint as a special value of IVD and uses traditional methods to deal with it [1]–[3]. This method only considers the internal condition of IVD and loses the size information. Reference [8] represented IVD by the midpoint and radius and predicted these two independent variables with a symmetric linear regression model. That method considers both internal condition and size information, but it does not capture any correlation between these two elements. The existing representation methods lose either size information or location information and do not account for the relationship between them. They may even have twice the number of features of the original IVD. A unified representation frame, which contains only the same number of features as the original IVD and considers the relationship between midpoint and radius, is proposed. The proposed classification method will be explained in depth.
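To make the trade-offs between the existing representations concrete, here is a small sketch of the midpoint, midpoint-radius and boundary encodings of a single interval feature. The function names are hypothetical and chosen for illustration; the URF formula itself, including its adjustment factor, is not given in this summary and is not reproduced here.

```python
def midpoint(lower, upper):
    """Midpoint encoding: one value per interval, size information is lost."""
    return (lower + upper) / 2.0


def midpoint_radius(lower, upper):
    """Midpoint-radius encoding: two values per interval (location, size)."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return (lower + upper) / 2.0, (upper - lower) / 2.0


def boundary(lower, upper):
    """Boundary encoding: the two endpoints, again two values per interval."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return lower, upper


# Two intervals with the same location but different sizes.
narrow = (3.0, 5.0)
wide = (1.0, 7.0)

same_midpoint = midpoint(*narrow) == midpoint(*wide)                # True: sizes are indistinguishable
distinct_pairs = midpoint_radius(*narrow) != midpoint_radius(*wide)  # True: radius separates them
```

The midpoint encoding maps both intervals to 4.0, which is the loss of size information described above. The two-value encodings keep them apart but double the feature count, which is the cost the unified frame is intended to avoid.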

THE UNIFIED REPRESENTATION FRAME FOR IVD
FEATURE SELECTION BASED ON THE UNITED
1) CLASSIFICATION RESULTS
CONCLUSION
