Abstract

Rapid advances in sensing and information technology have given rise to big data, a gold mine of the 21st century. However, big data also bring significant challenges for data-driven decision making. In particular, big data often involve a large number of variables (or features), and the complex interdependence structures among these variables challenge the traditional framework of predictive modeling. This paper presents a new self-organizing network methodology for variable clustering and predictive modeling. Specifically, we develop a new approach, nonlinear coupling analysis, to measure nonlinear interdependence structures among variables. All the variables are then embedded as nodes in a complex network, and nonlinear-coupling forces move these nodes to derive a self-organizing network topology. As a result, variables are clustered as sub-network communities in the embedding space. Experimental results demonstrate that the proposed methodology not only outperforms traditional variable clustering algorithms such as hierarchical clustering and oblique principal component analysis, but also effectively identifies interdependence structures among variables and improves the performance of predictive modeling. The proposed idea of a self-organizing network is generally applicable to predictive modeling in the many disciplines that involve a large number of highly redundant variables.
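The pipeline outlined above (measure coupling, embed variables as nodes, let forces organize the nodes, read off communities) can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not define nonlinear coupling analysis or the force law, so this sketch assumes absolute Spearman rank correlation as a stand-in coupling measure, a simple spring-like force with target distance 1 - coupling, and a distance threshold for grouping nodes. All function names and parameters (`coupling_matrix`, `self_organize`, `communities`, `radius`, `lr`) are hypothetical.

```python
import numpy as np

def coupling_matrix(X):
    """Stand-in coupling (assumption): absolute Spearman rank correlation.

    X has shape (n_samples, n_variables). Ranks ignore ties, which is
    adequate for continuous data.
    """
    ranks = np.argsort(np.argsort(X, axis=0), axis=0).astype(float)
    C = np.abs(np.corrcoef(ranks, rowvar=False))
    np.fill_diagonal(C, 0.0)
    return C

def self_organize(C, dim=2, steps=1000, lr=0.05, seed=0):
    """Move nodes with spring-like forces (assumed force law).

    Each pair of nodes is pulled toward a target distance of 1 - coupling,
    so strongly coupled variables end up close together.
    """
    rng = np.random.default_rng(seed)
    n = C.shape[0]
    P = rng.normal(size=(n, dim))          # random initial node positions
    target = 1.0 - C
    off_diag = ~np.eye(n, dtype=bool)      # ignore self-interactions
    for _ in range(steps):
        diff = P[:, None, :] - P[None, :, :]           # pairwise offsets
        dist = np.linalg.norm(diff, axis=2) + 1e-9     # avoid divide by zero
        strength = np.where(off_diag, (dist - target) / dist, 0.0)
        P -= lr * (strength[:, :, None] * diff).sum(axis=1)
    return P

def communities(P, radius=0.5):
    """Group nodes connected by chains of pairwise distances below radius."""
    n = len(P)
    dist = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=2)
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] >= 0:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero((dist[i] < radius) & (labels < 0)):
                labels[j] = current
                stack.append(j)
        current += 1
    return labels

# Synthetic example: two independent latent signals, each observed through
# three nonlinear (monotonic) transformations with a little noise.
rng = np.random.default_rng(1)
z1, z2 = rng.normal(size=(2, 400))
noise = 0.05 * rng.normal(size=(400, 6))
X = np.column_stack([z1, z1**3, np.exp(z1), z2, np.tanh(z2), z2**3]) + noise

labels = communities(self_organize(coupling_matrix(X)))
print(labels)
```

In this synthetic setup, columns 0 to 2 share one latent signal and columns 3 to 5 share another, so the two groups are expected to settle into separate communities. Spearman correlation only captures monotonic dependence; a measure sensitive to more general nonlinear coupling would be needed to reflect the approach the paper describes.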
