Abstract

High‐throughput pharmacoinformatics studies that apply a case‐control design to large‐scale longitudinal health data can detect new adverse drug event (ADE) signals. Existing control selection approaches for the case‐control design include the dynamic/super control selection approaches. These approaches require every individual to be evaluated at every ADE case index date, because an individual's eligibility as a control depends on ADE/enrollment history. With large‐scale longitudinal health data, they therefore require extraordinarily long computation times. We proposed a random control selection approach in which ADE case index dates are matched with randomly generated control index dates. The random control selection approach does not depend on ADE/enrollment history, so each individual needs to be evaluated only once, which substantially reduces the computation time needed to prepare case‐control data sets. We compared the performance of all control selection approaches using two large‐scale longitudinal health data sets and a drug‐ADE gold standard comprising 399 drug‐ADE pairs. The F‐scores for the random control selection approach ranged from 0.586 to 0.600, compared with 0.545 to 0.562 for the dynamic/super control selection approaches. The random control selection approach was approximately 1000 times faster than the dynamic/super control selection approaches at preparing case‐control data sets. With large‐scale longitudinal health data, a case‐control design‐based pharmacoinformatics study using random control selection generates ADE signals comparable to those from existing control selection approaches, while substantially reducing the computation time needed to prepare the case‐control data sets.
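The single‐pass idea behind random control selection can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `assign_random_index_dates`, the uniform draw over a fixed study window, and the one‐date‐per‐individual rule are all assumptions made for illustration, since the abstract does not specify how the control index dates are generated.

```python
import random
from datetime import date, timedelta


def assign_random_index_dates(person_ids, window_start, window_end, seed=0):
    """Assign one random control index date to each individual in a single pass.

    Assumption (not from the abstract): dates are drawn uniformly from the
    closed window [window_start, window_end], independently of any
    ADE or enrollment history.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    span_days = (window_end - window_start).days
    index_dates = {}
    for pid in person_ids:  # each individual is visited exactly once
        offset = rng.randint(0, span_days)  # inclusive on both ends
        index_dates[pid] = window_start + timedelta(days=offset)
    return index_dates


# Hypothetical usage: five individuals, one calendar year as the study window.
controls = assign_random_index_dates(
    ["p1", "p2", "p3", "p4", "p5"], date(2010, 1, 1), date(2010, 12, 31)
)
```

Because no date depends on another individual's history, the cost of this step grows with the number of individuals rather than with the number of individuals multiplied by the number of case index dates, which is the source of the speed‐up the abstract reports.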
