Abstract

Co-occurrence analysis of Medical Subject Headings (MeSH) terms extracted from the PubMed database is widely used in bibliometrics. To keep the results interpretable, the co-occurrence matrix is usually filtered to remove low-frequency items, which are poorly representative. However, little research has addressed how to determine a critical threshold for removing noise from the co-occurrence matrix. Here, we propose a probabilistic model for co-occurrence analysis that provides statistical inference on whether paired items co-occur at random. With this model, the dimensionality of the co-occurrence matrix can be reduced according to a selected threshold. The conceptual framework of the model, simulations, and practical applications are presented in the manuscript. Further details, including all reproducible code, are available from the project website: https://github.com/xizhou/co-occurrence-analysis.git.
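
The abstract does not spell out the model itself, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes a hypergeometric null model (two terms assigned to articles independently) and an arbitrary significance threshold `alpha`; the function names `cooccurrence_p_value` and `filter_pairs` and all counts are hypothetical.

```python
from math import comb


def cooccurrence_p_value(n_total, n_a, n_b, n_ab):
    """Upper-tail hypergeometric probability P(X >= n_ab).

    n_total: number of articles; n_a, n_b: articles tagged with term A / term B;
    n_ab: articles tagged with both. A small value suggests the pair co-occurs
    more often than random assignment would explain (assumed null model).
    """
    upper = min(n_a, n_b)
    denom = comb(n_total, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_ab, upper + 1)
    )
    return tail / denom


def filter_pairs(pair_counts, term_counts, n_total, alpha=0.05):
    """Keep only pairs whose p-value falls below alpha (an assumed threshold)."""
    kept = {}
    for (a, b), n_ab in pair_counts.items():
        p = cooccurrence_p_value(n_total, term_counts[a], term_counts[b], n_ab)
        if p < alpha:
            kept[(a, b)] = p
    return kept


# Toy counts, invented purely for illustration.
term_counts = {"A": 5, "B": 5, "C": 5}
pair_counts = {("A", "B"): 5, ("A", "C"): 2}
significant = filter_pairs(pair_counts, term_counts, n_total=10)
```

In this toy input, only the pair that always appears together survives the filter, which is how thresholding on a p-value would shrink the co-occurrence matrix. In practice a multiple-testing correction would likely be needed, since many pairs are tested at once.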
