Abstract
Finding relevant features in a dataset is critical to designing and implementing effective machine learning solutions. Even for datasets of relatively low dimension this can be challenging, and for high-dimensional datasets it is considerably harder. This paper presents a nonparametric approach for finding feature relationships in large datasets. The approach is aimed at high-dimensional time series data but can be used on any dataset. It uses rank correlation together with a technique for identifying feature relationships in the data. The result is a collection of sets of feature relationships: some sets are relevant (weakly, strongly, etc.) to a specific feature, while others may be irrelevant. The number of feature relationships identified varies with the data. The approach is ultimately noise limited, that is, limited by the ability to distinguish correlated data from random data. The end goal is to identify relevant features and to support knowledge discovery, since these sets of feature relationships can reveal a framework that helps explain the complex systems described by the data. The paper also considers implementing the approach on Field Programmable Gate Arrays (FPGAs), which support high-performance, scalable, high-throughput implementations for processing big data. Design data is provided for implementing the approach on challenging high-dimensional data, and an example FPGA design is given for the Xilinx Adaptive Compute Acceleration Platform (ACAP) using a Versal series device.
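To make the idea of rank-correlation-based feature relationships concrete, the following is a minimal sketch, not the paper's method. It assumes Spearman's coefficient as the rank correlation (the abstract does not say which one is used) and a fixed cutoff of 0.5 that is purely illustrative. The function names `rank_columns`, `spearman_matrix`, and `related_features` are hypothetical, and the data is synthetic.

```python
import numpy as np

def rank_columns(data):
    """Rank each column from 1..n. Ties are not averaged in this sketch."""
    n, m = data.shape
    order = np.argsort(data, axis=0, kind="stable")
    ranks = np.empty((n, m), dtype=float)
    for j in range(m):
        ranks[order[:, j], j] = np.arange(1, n + 1)
    return ranks

def spearman_matrix(data):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    return np.corrcoef(rank_columns(data), rowvar=False)

def related_features(data, target, threshold=0.5):
    """Features whose |rank correlation| with `target` meets the cutoff.

    The threshold is an assumed, illustrative value, not one from the paper.
    """
    rho = spearman_matrix(data)[target]
    return [(j, float(rho[j]))
            for j in range(data.shape[1])
            if j != target and abs(rho[j]) >= threshold]

# Synthetic example: column 1 is a monotone, nonlinear function of column 0;
# column 2 is independent noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
data = np.column_stack([x, np.exp(x), rng.normal(size=200)])
result = related_features(data, target=0)
print(result)
```

Because ranks depend only on ordering, the monotone relationship between columns 0 and 1 is detected even though it is nonlinear, while the noise column falls below the cutoff. With real data, how well such a cutoff separates correlated features from random ones depends on the sample size and noise level, which is the noise limit the abstract refers to.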