Abstract

Advances in information science have enabled explosive growth in data, drawing more and more researchers into big data analytics. In many real-world applications, much of the data is imbalanced because the events of interest occur infrequently. Detecting these events is nonetheless an important research problem and has attracted significant effort, since many real-world big data sets have skewed class distributions. Despite this effort, rare class mining remains one of the most challenging problems in information science, especially for multimedia big data. Although inter-concept correlations have recently been used to address this issue, the very small number of instances in the minority class often leads to imprecise correlations and unsatisfactory classification results. This paper proposes a novel concept correlation analysis framework that uses the correlations between retrieval scores and labels. By integrating this correlation information, the proposed framework can help imbalanced data classification and enhance rare class (or concept) mining even with trivial scores from the minority class. Experimental results on the TRECVID multimedia big data benchmark data set show promising performance and demonstrate the effectiveness of the proposed framework.
