Abstract

Biomedical researchers urgently need a system that helps them survey the literature and then formulate hypotheses from the knowledge it contains. One approach is to cluster the papers that discuss a topic of interest and reveal its sub-topics, giving researchers an overview of the topic. We developed such a system, called McSyBi. It accepts a set of citation data retrieved with PubMed and clusters the citations both hierarchically and non-hierarchically, based on their titles and abstracts, using statistical and natural language processing methods. A novel feature is that users can change the clustering by entering a MeSH term or UMLS Semantic Type, and can therefore view a set of citation data from multiple aspects. We evaluated McSyBi quantitatively, by clustering 27 sets of citation data (40,643 distinct papers), and qualitatively, by scrutinizing several of the resulting clusters. Non-hierarchical clustering provides an overview of the target topic, whereas hierarchical clustering reveals more detail and the relationships among citation data. McSyBi is freely available at http://textlens.hgc.jp/McSyBi/.
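
To make the idea of clustering citations by their text concrete, the following is a minimal sketch and not McSyBi's actual implementation. The abstract does not specify the term weighting, similarity measure, or linkage rule, so the choices here (smoothed TF-IDF, cosine similarity, average linkage) are assumptions, as are the function names `tfidf_vectors`, `cosine`, and `agglomerate` and the four made-up titles. It covers only the hierarchical case, and cutting the merge process at `k` clusters gives a flat grouping for comparison.

```python
import math
import re
from collections import Counter


def tfidf_vectors(texts):
    """Turn each text into a sparse TF-IDF vector (dict: term -> weight)."""
    docs = [Counter(re.findall(r"[a-z0-9]+", t.lower())) for t in texts]
    n = len(docs)
    # Document frequency: in how many texts each term occurs.
    df = Counter(term for d in docs for term in d)
    # Smoothed IDF (an assumption), so ubiquitous terms keep a small weight.
    return [
        {t: tf * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, tf in d.items()}
        for d in docs
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def agglomerate(texts, k):
    """Average-link agglomerative clustering.

    Returns (clusters, merges): the flat clusters left after merging down
    to k groups, and the ordered list of merged pairs (the hierarchy).
    """
    vecs = tfidf_vectors(texts)
    sim = [[cosine(a, b) for b in vecs] for a in vecs]
    clusters = [[i] for i in range(len(texts))]
    merges = []
    while len(clusters) > max(k, 1):
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise similarity between the two clusters.
                s = sum(sim[a][b] for a in clusters[i] for b in clusters[j])
                s /= len(clusters[i]) * len(clusters[j])
                if best is None or s > best[0]:
                    best = (s, i, j)
        _, i, j = best
        merges.append((clusters[i], clusters[j]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, merges


# Hypothetical titles, invented for illustration only.
titles = [
    "p53 mutation in breast cancer",
    "breast cancer and p53 mutation screening",
    "insulin resistance in type 2 diabetes",
    "type 2 diabetes and insulin therapy",
]
clusters, merges = agglomerate(titles, 2)
print(clusters)
```

The `merges` list records which groups were joined and in what order, which is the information a dendrogram would display. The quadratic search over cluster pairs is acceptable for a toy example but would be too slow for citation sets of the size evaluated here.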
