Abstract
Bibliographic search engines allow endless possibilities for building queries based on specific words or phrases in article titles and abstracts, indexing terms, and other attributes. Unfortunately, deciding which attributes to use in a methodologically sound query is a non-trivial process. In this paper, we describe a system to help with this task, given an example set of PubMed articles to retrieve and a corresponding set of articles to exclude. The system provides users with unigram and bigram features from the title, abstract, MeSH terms, and MeSH qualifier terms in decreasing order of precision, given a recall threshold. From this information and their knowledge of the domain, users can formulate a query and evaluate its performance. We apply the system to the task of distinguishing original research articles on functional magnetic resonance imaging (fMRI) of sensorimotor function from fMRI studies of higher cognitive functions.
Highlights
Although the classification of abstracts in PubMed has been studied extensively, there are few tools to help end users develop effective classification queries for use in PubMed.
Query development features: We began with a set of functional magnetic resonance imaging (fMRI) research articles from the period 1991-2001 that had been manually curated based on the degree of cognitive function under observation (Illes et al., 2010).
We described a simple mechanism for formulating effective queries for use in PubMed, given a set of example true positives and true negatives.
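Evaluating a formulated query against the example sets can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `evaluate_query`, the placeholder identifiers, and the choice to ignore retrieved articles that belong to neither example set are all assumptions made here.

```python
def evaluate_query(retrieved, true_positives, true_negatives):
    """Score a query's result set against labeled example articles.

    Articles outside both example sets are ignored (an assumption of this
    sketch), since their correct label is unknown.
    """
    retrieved = set(retrieved)
    positives = set(true_positives)
    negatives = set(true_negatives)

    tp = len(retrieved & positives)   # wanted articles that were retrieved
    fp = len(retrieved & negatives)   # unwanted articles that were retrieved
    fn = len(positives - retrieved)   # wanted articles that were missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Placeholder identifiers, not real PubMed IDs.
example_positives = ["101", "102", "103"]
example_negatives = ["201", "202"]
query_results = ["101", "102", "201", "999"]  # "999" is in neither set

precision, recall = evaluate_query(query_results, example_positives, example_negatives)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the three positives are retrieved alongside one negative, so both precision and recall come out at two thirds.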
Summary
Although the classification of abstracts in PubMed has been studied extensively, there are few tools to help end users develop effective classification queries for use in PubMed. Several tools exist to illustrate the relative recall of features, but these only provide results for a single query, rather than the differential attributes between two queries. Plikus et al.'s PubFocus (2006) provides citation analytics and sorting by impact factor, but lacks any means of comparison. We propose a method to suggest query components given a user-provided list of true positive and true negative PubMed identifiers. We recently developed a system to extract features from the full text of open access articles, for query execution in existing full-text portals like PubMed Central, HighWire Press, and Google Scholar (Piwowar and Chapman, 2010). The current implementation evaluates unigram and bigram features of the article title and abstract, as well as medical subject heading (MeSH) indexing terms, MeSH major terms, MeSH qualifiers, and MeSH major qualifiers.
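The feature-suggestion step described above can be sketched roughly as follows. This is an illustrative approximation only: the tokenizer, the function names `ngrams` and `rank_features`, and the example titles are assumptions made here, and MeSH-based features are omitted. Precision is taken as the share of example articles containing a feature that are true positives, and recall as the share of true positives containing it.

```python
import re
from collections import defaultdict


def ngrams(text, n_values=(1, 2)):
    """Return the set of lowercase unigram and bigram features in a text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    features = set()
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            features.add(" ".join(tokens[i:i + n]))
    return features


def rank_features(positives, negatives, min_recall=0.2):
    """List features in decreasing order of precision, given a recall threshold.

    positives / negatives: one text string (e.g. a title) per example article.
    """
    pos_counts = defaultdict(int)
    neg_counts = defaultdict(int)
    for text in positives:
        for feature in ngrams(text):
            pos_counts[feature] += 1
    for text in negatives:
        for feature in ngrams(text):
            neg_counts[feature] += 1

    ranked = []
    for feature, tp in pos_counts.items():
        fp = neg_counts.get(feature, 0)
        recall = tp / len(positives)
        precision = tp / (tp + fp)
        if recall >= min_recall:
            ranked.append((feature, precision, recall))
    # Highest precision first; ties broken by recall, then alphabetically.
    ranked.sort(key=lambda r: (-r[1], -r[2], r[0]))
    return ranked


# Invented example titles, for illustration only.
sensorimotor = [
    "Motor cortex activation during finger tapping",
    "Finger tapping and sensorimotor cortex fMRI",
]
cognitive = [
    "Working memory and prefrontal cortex fMRI",
    "Language processing in prefrontal cortex",
]

ranked = rank_features(sensorimotor, cognitive, min_recall=1.0)
for feature, precision, recall in ranked:
    print(f"{feature!r}: precision={precision:.2f}, recall={recall:.2f}")
```

In this toy example, "finger tapping" appears in every positive title and no negative title, so it ranks above "cortex", which appears in all four titles and therefore has a precision of only 0.5. A user would then combine the high-ranking features with domain knowledge when writing the actual query.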