Refining drug safety signals requires a clear definition of the object of study. While research has explored the effect of different event definitions, the definition of the drug is often overlooked. The US FDA Adverse Event Reporting System (FAERS) records drug names as free text, which must be mapped to active ingredients. Although pre-mapped databases exist, the subjectivity and lack of transparency of their mapping process lead to a loss of control over the object of study. We implemented the DiAna dictionary, which systematically maps individual free-text instances to their corresponding active ingredients and links them to the World Health Organization Anatomical Therapeutic Chemical (WHO-ATC) classification. We retrieved all drug names reported to the FAERS from 2004 to December 2022. Using existing vocabularies and string editing, we automatically mapped free text to ingredients; we then manually revised the mapping and linked it to the ATC classification. We retrieved 18,151,842 reports containing 74,143,411 drug entries. We manually checked the 14,832 most frequent terms, down to those occurring more than 200 times (96.88% of total drug entries), mapping them to 6282 unique active ingredients. Automatic, unchecked translations extend the standardization to 346,854 terms (98.94%). The DiAna dictionary showed higher sensitivity than RxNorm alone, particularly for specific drugs (e.g., rimegepant, adapalene, drospirenone, umeclidinium). The most prominent drug classes in the FAERS were immunomodulating (37.40%) and neurologic drugs (29.19%). As a dynamic open-source tool, the DiAna dictionary provides transparency and flexibility, enabling researchers to actively shape drug definitions during the mapping phase. This control enhances the accuracy, reproducibility, and interpretability of results.
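
The automatic step described in the abstract (existing vocabularies plus string editing) could be sketched roughly as follows. This is a minimal illustration, not the DiAna implementation: the toy vocabulary, the `normalize` and `map_term` helpers, and the 0.85 similarity cutoff are all assumptions, and Python's standard `difflib` stands in for whatever string-editing method the authors actually used.

```python
import re
from difflib import get_close_matches

# Hypothetical toy vocabulary (known name -> active ingredient).
# A real dictionary would be built from existing drug vocabularies.
VOCAB = {
    "adapalene": "adapalene",
    "differin": "adapalene",
    "drospirenone": "drospirenone",
    "rimegepant": "rimegepant",
    "nurtec": "rimegepant",
    "umeclidinium": "umeclidinium",
}


def normalize(raw):
    """Lowercase, drop strength tokens such as '75 mg', strip non-letters."""
    text = raw.lower()
    text = re.sub(r"\d+(?:\.\d+)?\s*(?:mg|mcg|g|ml)\b", " ", text)
    text = re.sub(r"[^a-z\s]", " ", text)
    return " ".join(text.split())


def map_term(raw, vocab=VOCAB, cutoff=0.85):
    """Return the active ingredient for a free-text drug name, or None.

    Tries, in order: exact match on the cleaned string, exact match on
    any single token, then an approximate match on any single token.
    The cutoff is an assumed value, not one taken from the paper.
    """
    cleaned = normalize(raw)
    if cleaned in vocab:
        return vocab[cleaned]
    tokens = cleaned.split()
    for token in tokens:
        if token in vocab:
            return vocab[token]
    for token in tokens:
        close = get_close_matches(token, list(vocab), n=1, cutoff=cutoff)
        if close:
            return vocab[close[0]]
    return None  # left unmapped, to be resolved by manual review


if __name__ == "__main__":
    for name in ["RIMEGEPANT 75 MG", "Nurtec ODT", "DROSPIRENON", "xyz"]:
        print(name, "->", map_term(name))
```

Terms that come back as `None`, and approximate matches in general, are the ones a manual review such as the one described in the abstract would need to check, since a loose cutoff can silently map a misspelling to the wrong ingredient.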