Abstract

The growth of biomedical data has created a need for data sampling techniques. With the emergence of big data and the rising popularity of data science, sampling and data reduction techniques have helped to speed up the data analytics process significantly. Without such techniques, it is difficult to extract useful patterns from a large dataset efficiently. Sampling makes the analysis of huge datasets practical by producing a relatively small subset that contains the most representative objects of the original dataset. To support sound conclusions and predictions, however, the sample must preserve the behavior of the original data. In this paper, we propose a unique data sampling technique that exploits the notion of formal concept analysis. Machine learning experiments are performed on the resulting sample to evaluate its quality, and the performance of our method is compared with another sampling technique proposed in the literature. The results demonstrate the effectiveness and competitiveness of the proposed approach in terms of sample size and quality, as measured by accuracy and the F1-measure.
