Individual case reports are essential to identify and assess previously unknown adverse effects of medicines. On these reports, information on adverse events (AEs) and drugs is encoded in hierarchical terminologies. Encoding differences may hinder the retrieval and analysis of clinically related reports relevant to a topic of interest. Recent studies have explored the use of data-driven semantic vector representations to support analysis of pharmacovigilance data. This study aims to evaluate the stability and clinical relatedness of vigiVec, a semantic vector representation for codes of AEs and drugs. vigiVec is a published adaptation to pharmacovigilance of the publicly available Word2Vec model, applied to structured data instead of free text. It provides vector representations for MedDRA® Preferred Terms and WHODrug Global active ingredients, learned from reporting patterns in VigiBase, the WHO global database of adverse event reports for medicines and vaccines. For this study, a 20-dimensional Skip-gram architecture with window size 250 was used. Our evaluation focused on nearest neighbors identified by the cosine similarity of vigiVec vector representations. Clinical relatedness was measured through term intruder detection, whereby a medical doctor was tasked with identifying a randomly selected term (the intruder) included among the four nearest neighbors to a specific AE or drug. Stability was measured as the average overlap in the ten nearest neighbors for each AE or drug, in repeated fittings of vigiVec. Among the ten nearest neighbors, 1.8 AEs on average belonged to the same MedDRA High Level Term (HLT; e.g., coagulopathies), and 1.3 drugs belonged to the same Anatomical Therapeutic Chemical level 3 (ATC-3; e.g., opioids). In the intruder detection task, when neighbors and intruders were both chosen from the same HLT, the intruder detection rate was 46%. When selected from different HLTs, it was 79%. By random chance, we would expect 20% (1 in 5).
Corresponding rates for drugs were 42% in the same ATC-3 and 65% in different ATC-3. The stability of nearest neighbors was 80% for AEs and 64% for drugs. Nearest neighbors identified with vigiVec are stable and show a high level of clinical relatedness. They are often from different parts of the existing hierarchies and complement these.
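As an illustration of the two quantities evaluated here, the following is a minimal sketch, not the study's implementation. It assumes each fitting is available as a plain mapping from term code to vector, and that stability is computed between a pair of fittings as the mean fraction of shared nearest neighbors. The function names (`cosine`, `nearest_neighbors`, `stability`) and the toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def nearest_neighbors(vectors, term, k=10):
    """Return the k terms most cosine-similar to `term`, excluding the term itself."""
    target = vectors[term]
    others = [t for t in vectors if t != term]
    others.sort(key=lambda t: cosine(target, vectors[t]), reverse=True)
    return others[:k]

def stability(fit_a, fit_b, k=10):
    """Mean fraction of shared k-nearest neighbors over terms present in both fittings.

    Assumption: a pairwise comparison of two fittings; the study may aggregate
    repeated fittings differently.
    """
    terms = [t for t in fit_a if t in fit_b]
    overlaps = []
    for t in terms:
        a = set(nearest_neighbors(fit_a, t, k))
        b = set(nearest_neighbors(fit_b, t, k))
        overlaps.append(len(a & b) / k)
    return sum(overlaps) / len(overlaps)

# Toy 2-dimensional "fittings" (hypothetical codes and values).
fit_a = {"a": (1.0, 0.0), "b": (0.9, 0.1), "c": (0.0, 1.0), "d": (0.1, 0.9)}
# Same geometry rotated by 90 degrees: cosine similarities are unchanged.
fit_b = {"a": (0.0, 1.0), "b": (-0.1, 0.9), "c": (-1.0, 0.0), "d": (-0.9, 0.1)}
# A fitting in which every term has a different closest neighbor than in fit_a.
fit_c = {"a": (1.0, 0.0), "b": (0.0, 1.0), "c": (0.1, 0.9), "d": (0.9, 0.1)}

print(nearest_neighbors(fit_a, "a", k=1))
print(stability(fit_a, fit_b, k=1))
print(stability(fit_a, fit_c, k=1))
```

Because cosine similarity ignores rotations of the embedding space, comparing neighbor sets rather than raw coordinates lets fittings be compared even when their axes are not aligned.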