Abstract

Background: Consumer health vocabularies (CHVs) have been developed to aid consumer health informatics applications. This purpose is best served if the vocabulary evolves with consumers’ language.

Objective: Our objective was to create a computer assisted update (CAU) system that works with live corpora to identify new candidate terms for inclusion in the open access and collaborative (OAC) CHV.

Methods: The CAU system consisted of three main parts: a Web crawler and an HTML parser, a candidate term filter that utilizes natural language processing tools including term recognition methods, and a human review interface. In evaluation, the CAU system was applied to the health-related social network website PatientsLikeMe.com. The system’s utility was assessed by comparing the candidate term list it generated to a list of valid terms hand extracted from the text of the crawled webpages.

Results: The CAU system identified 88,994 unique 1- to 7-gram terms (“n-grams” are n consecutive words within a sentence) in 300 crawled PatientsLikeMe.com webpages. The manual review of the crawled webpages identified 651 valid terms not yet included in the OAC CHV or the Unified Medical Language System (UMLS) Metathesaurus, a collection of vocabularies amalgamated to form an ontology of medical terms (ie, 1 valid term per 136.7 candidate n-grams). The term filter selected 774 candidate terms, of which 237 were valid terms, that is, 1 valid term among every 3 or 4 candidates reviewed.

Conclusion: The CAU system is effective for generating a list of candidate terms for human review during CHV development.
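The n-gram definition and the yield figures in the Results can be illustrated with a minimal sketch. It is not the CAU system's code: the function names, the naive sentence splitter, and the tokenizer are assumptions made for illustration only, since the abstract does not specify which natural language processing tools were used.

```python
import re


def sentence_ngrams(text, max_n=7):
    """Collect unique 1- to max_n-grams that never cross a sentence boundary."""
    ngrams = set()
    # Assumption: a naive split on ., ! and ? stands in for real sentence detection.
    for sentence in re.split(r"[.!?]+", text):
        # Assumption: lowercase alphanumeric tokens stand in for real tokenization.
        words = re.findall(r"[a-z0-9']+", sentence.lower())
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                ngrams.add(" ".join(words[i:i + n]))
    return ngrams


def candidates_per_valid(n_candidates, n_valid):
    """Number of candidates a reviewer reads, on average, per valid term found."""
    return n_candidates / n_valid


if __name__ == "__main__":
    print(sorted(sentence_ngrams("I have brain fog. It hurts.")))
    # Figures reported in the abstract: unfiltered n-grams vs. filtered candidates.
    print(round(candidates_per_valid(88994, 651), 1))
    print(round(candidates_per_valid(774, 237), 1))
```

The two ratios correspond to the reviewer workload before and after filtering: about 136.7 n-grams per valid term without the filter, against roughly 3.3 candidates per valid term with it.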

Highlights

  • Controlled vocabularies play an important role in the development of biomedical informatics applications because data used by clinical, bibliometric, and research applications need to be coded for easy retrieval and analysis

  • The computer assisted update (CAU) system is effective for generating a list of candidate terms for human review during consumer health vocabulary (CHV) development

  • VA Medical Record Term Filter: To filter the nonmedical terms from the non-CHV n-grams, we looked them up in a database of 70,000 medical records, obtained from the US Department of Veterans Affairs, of patients with amyotrophic lateral sclerosis (ALS), Parkinson’s, and multiple sclerosis (MS) dated from January 1, 1998, through December 31, 2008
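The lookup idea behind the VA medical record term filter can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function name, the in-memory list of record texts, the whole-word matching, and the `min_records` threshold are all assumptions, because the highlight says only that non-CHV n-grams were looked up in the record database.

```python
import re


def filter_by_record_presence(ngrams, records, min_records=1):
    """Keep n-grams that occur as whole words in at least min_records records.

    Assumption: appearing in clinical record text is treated as evidence that
    an n-gram is medical; the real decision rule is not given in the source.
    """
    lowered = [record.lower() for record in records]
    kept = []
    for term in ngrams:
        pattern = re.compile(r"\b" + re.escape(term.lower()) + r"\b")
        hits = sum(1 for record in lowered if pattern.search(record))
        if hits >= min_records:
            kept.append(term)
    return kept


if __name__ == "__main__":
    # Made-up record snippets for illustration only.
    toy_records = [
        "Patient reports muscle weakness and fatigue.",
        "Fatigue noted; tremor present.",
    ]
    print(filter_by_record_presence(["fatigue", "my dog", "tremor"], toy_records))
```

A database of 70,000 records would call for an index rather than a linear scan per term; the loop above is kept simple to show the filtering logic only.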

Summary

Introduction

Controlled vocabularies play an important role in the development of biomedical informatics applications because data used by clinical, bibliometric, and research applications need to be coded for easy retrieval and analysis. Research and development activities have been carried out to provide standardized health vocabularies, for example, SNOMED (Systematized Nomenclature of Medicine) and LOINC (Logical Observation Identifiers Names and Codes). Consumer health vocabularies (CHVs) have been developed to aid consumer health informatics applications. This purpose is best served if the vocabulary evolves with consumers’ language. In this background section, we will first briefly review the prior research and current practice for updating controlled health vocabularies. For the “Execution” component, almost all of the quartile IV systems satisfied the main criteria, that is, 67% included standardized change proposals, 100% validated the change proposals, 100% had maintenance teams that verified accepted proposals, 100% had structured and standardized documentation, 100% documented changes made, and 100% produced new versions with unique IDs, while only 70% produced twice-yearly updates. The CAU system we describe here is designed to automate the production and collection of change proposals and assist with the validation of those proposals.

