Abstract
Multisource incomplete mixed data fusion (MsIMDF) plays a crucial role in outlier detection by utilizing complementary, informative, interpretable, and less noisy single-source data to identify unexpected errors or behaviors. However, existing multisource data fusion approaches consider only homogeneous data and a single type of uncertainty information, overlooking mixed heterogeneous data and diverse uncertainty information. This limitation can degrade outlier detection performance. To address this issue, we propose MsIMDF-USF, a novel two-stage model that fully leverages the rich multisource knowledge and uncertainty information in incomplete mixed data. In the information fusion stage, the MsIMDF model combines multisource data into new single-source data using a minimum uncertainty strategy based on rough and fuzzy information. In the outlier detection stage, we then reconstruct a neighborhood information network under a united-similar-fuzzy (USF) relationship using the fused data. This reconstruction considers both single-attribute and multi-attribute information, strengthening the connections between similar objects while weakening those among dissimilar ones. Outlier scores are obtained from the stationary distribution of a Markov random walk on the reconstructed network. Experimental results on 16 real datasets demonstrate that the MsIMDF-USF model effectively extracts higher-quality data and exhibits high applicability and robustness in outlier detection tasks.
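The final scoring step can be illustrated with a small example. The following is a minimal Python sketch, not the authors' implementation. It assumes a generic nonnegative similarity matrix (here a Gaussian kernel on toy one-dimensional points) in place of the USF relationship, and it adds a PageRank-style damping term, which is an assumption made so that the walk has a unique stationary distribution. The function names, the `damping` parameter, and the score normalization are all hypothetical.

```python
import numpy as np


def stationary_distribution(similarity, damping=0.9, tol=1e-10, max_iter=1000):
    """Approximate the stationary distribution of a damped random walk.

    `similarity` is any nonnegative (n, n) matrix; it stands in for the
    reconstructed network and is NOT the USF relation from the abstract.
    """
    w = np.asarray(similarity, dtype=float)
    n = w.shape[0]
    row_sums = w.sum(axis=1, keepdims=True)
    # Row-normalize into transition probabilities; rows with no outgoing
    # weight fall back to a uniform jump (assumption).
    safe_sums = np.where(row_sums > 0, row_sums, 1.0)
    p = np.where(row_sums > 0, w / safe_sums, 1.0 / n)

    pi = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        # Damped power iteration: follow an edge with probability `damping`,
        # otherwise jump to a uniformly chosen object.
        nxt = damping * (pi @ p) + (1.0 - damping) / n
        converged = np.abs(nxt - pi).sum() < tol
        pi = nxt
        if converged:
            break
    return pi / pi.sum()


def outlier_scores(similarity, **kwargs):
    """Map low visiting probability to a high score in [0, 1)."""
    pi = stationary_distribution(similarity, **kwargs)
    return 1.0 - pi / pi.max()


# Toy data: three nearby points and one isolated point.
points = np.array([0.0, 0.1, 0.2, 5.0])
sim = np.exp(-(points[:, None] - points[None, :]) ** 2)
np.fill_diagonal(sim, 0.0)  # no self-loops
scores = outlier_scores(sim)
```

In this toy setting the isolated point receives almost no incoming transition probability, so the walk rarely visits it and its score is the largest. Any similarity matrix could be passed to `outlier_scores` in place of `sim`.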