Abstract

Background: To facilitate the implementation of the Family Smoking Prevention and Tobacco Control Act of 2009, the Food and Drug Administration (FDA) Center for Tobacco Products (CTP) has identified research priorities under the umbrella of tobacco regulatory science (TRS). Because TRS is a newly integrated field, its current boundaries and landscape need to be defined. In this work, we conducted a bibliometric study of TRS research by applying author topic modeling (ATM) to MEDLINE citations published by currently funded TRS principal investigators (PIs).

Results: We compared topics generated by ATM from a dataset collected using TRS PIs with topics generated by ATM from a dataset collected using a TRS keyword list. Topics from both datasets align well with the FDA's funding protocols. The model also reveals clear relationships among PIs and between PIs and topics. From these relationships, we can determine how diverse each PI's research is, how productive each PI is, which topics are more popular, and what the main components of each topic are. Temporal trend analysis of keywords shows significant change over time in four prime TRS areas.

Conclusions: The results show that ATM can efficiently group articles into discriminative categories without any supervision. This suggests that ATM could be incorporated into author identification systems to infer the identity of an article's author from the topics the model generates. It can also help grantees and funding administrators by suggesting potential collaborators or identifying those who share common research interests, for data harmonization or other purposes. Incorporating temporal analysis makes it possible to assess how TRS changes over time as new projects are funded, and the extent to which new research reflects the funding priorities of the FDA.

Electronic supplementary material: The online version of this article (doi:10.1186/s13040-015-0043-7) contains supplementary material, which is available to authorized users.

Highlights

  • In standard Latent Dirichlet Allocation (LDA), each document in a collection of D documents is modeled as a multinomial distribution over T topics, and each topic is a multinomial distribution over W words; both sets of multinomials are drawn from Dirichlet distributions

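  • In the author topic model, each word in a document is attributed to one of the document's authors; as in LDA, a topic is then chosen from the distribution over topics specific to that author, and the word is generated from the chosen topic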
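
The generative process described in the two points above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the standard author topic formulation in which the author of each word is picked uniformly at random from the document's authors, and it assumes symmetric Dirichlet priors. The function name `generate_corpus` and the hyperparameter names `alpha` and `beta` are hypothetical and chosen only for this example.

```python
import numpy as np

def generate_corpus(doc_authors, doc_lengths, n_authors, n_topics, n_words,
                    alpha=0.5, beta=0.1, seed=0):
    """Sample a toy corpus from the author topic generative process.

    doc_authors: list of author-index lists, one per document.
    doc_lengths: number of words to generate for each document.
    alpha, beta: assumed symmetric Dirichlet hyperparameters.
    """
    rng = np.random.default_rng(seed)
    # One distribution over the T topics per author, drawn from a Dirichlet.
    theta = rng.dirichlet(np.full(n_topics, alpha), size=n_authors)
    # One distribution over the W words per topic, drawn from a Dirichlet.
    phi = rng.dirichlet(np.full(n_words, beta), size=n_topics)

    corpus = []
    for authors, length in zip(doc_authors, doc_lengths):
        doc = []
        for _ in range(length):
            # Assumption: the author is chosen uniformly among the document's authors.
            a = int(rng.choice(authors))
            # A topic is chosen from that author's distribution over topics.
            t = int(rng.choice(n_topics, p=theta[a]))
            # The word is generated from the chosen topic.
            w = int(rng.choice(n_words, p=phi[t]))
            doc.append((a, t, w))
        corpus.append(doc)
    return theta, phi, corpus
```

Calling `generate_corpus([[0, 1], [2]], [5, 3], n_authors=3, n_topics=4, n_words=10)` returns the author-topic matrix, the topic-word matrix, and two documents of (author, topic, word) triples. Fitting the model to real citations runs this process in reverse, inferring `theta` and `phi` from the observed words and author lists.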

Summary

Introduction

To facilitate the implementation of the Family Smoking Prevention and Tobacco Control Act (FSPTCA) of 2009, the Food and Drug Administration (FDA) Center for Tobacco Products (CTP) was formed to oversee tobacco regulatory activities, and it has identified research priorities under the umbrella of tobacco regulatory science (TRS). We conducted a bibliometric study of TRS research by applying author topic modeling (ATM) to MEDLINE citations published by currently funded TRS principal investigators (PIs). Given the breadth of the TRS research field, we sought to discover topics and general research interests from these citations.
