Abstract
As big data sources multiply and computing performance increases, real-world evidence from such sources is gaining importance. While this mostly applies to population-averaged results from analyses based on all available data, it is also possible to conduct so-called personalized analyses based on a data subset whose observations resemble a particular patient for whom a decision is to be made. Claims data from statutory health insurance companies could provide the necessary information for such personalized analyses. Deriving treatment recommendations from them for a particular patient in everyday care would require an automated, reproducible, and efficiently programmed workflow. We introduce the R package SimBaCo (Similarity-Based Cohort generation), which offers a simple, modular, and intuitive framework for this task. With its six built-in R functions, this framework allows the user to create similarity cohorts tailored to the characteristics of particular patients. An exemplary workflow illustrates the distinct steps, beginning with an initial cohort selection according to inclusion and exclusion criteria. A plotting function facilitates investigating a particular patient's characteristics relative to their distribution in a reference cohort, for example the initial cohort or the precision cohort obtained after the data have been trimmed in accordance with the variables chosen for similarity finding. Such precision cohorts allow any form of personalized analysis, for example personalized analyses of comparative effectiveness or customized prediction models. In our exemplary workflow, we provide such a treatment comparison, on the basis of which a treatment decision for a particular patient could be made. This is only one field of application where personalized results can directly support the process of clinical reasoning by leveraging information from individual patient data.
With this modular package at hand, personalized studies can efficiently weigh the benefits and risks of treatment options for particular patients.
Highlights
Analyses of large, routinely collected data sources can support decision-making in new patients whose data in turn contribute to that data source again [1]
The decisive information on individual benefits and harms is difficult to obtain from controlled observational studies because averaged responses in heterogeneous treatment groups might not apply to a particular patient, even if they were derived by appropriate methods approximating causal inference [4]
We developed the R package SimBaCo (Similarity-Based Cohort generation) for generating precision cohorts within a standardized and flexible workflow to be applied to claims data
Summary
Routinely collected data sources can support decision-making in new patients whose data in turn contribute to that data source again [1]. Comparative effectiveness research and personalized prediction of outcomes are two major analytical applications of big healthcare data repositories such as statutory health insurance databases [2]. Such information could guide medical treatment recommendations in situations with limited evidence from randomized controlled trials, as frequently encountered in frailty, multimorbidity, older patients, and children [3]. The data source can be personalized in accordance with individual patient characteristics to derive more specific subsets called precision cohorts. Analyses based on such precision cohorts hold the promise of improving predictions compared to models developed from an unselected source [7]. To achieve adequate (internal and external) validity and transportability, it is fundamental to generate a precision cohort in a reproducible and preferably automated way [8].
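The idea of trimming a data source down to a precision cohort can be illustrated with a minimal sketch. It is written in plain Python rather than R and does not use or reproduce the SimBaCo API; every name in it (`Patient`, `initial_cohort`, `precision_cohort`, `event_rates`), the choice of similarity variables, the age window, and the toy records are assumptions made purely for illustration. The final step reports crude event proportions per treatment, which is only a stand-in for a proper comparative effectiveness analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    # Hypothetical, simplified claims record (not the SimBaCo data model).
    age: int
    sex: str
    has_diabetes: bool
    treatment: str  # e.g. "A" or "B"; empty for the index patient
    event: bool     # outcome observed during follow-up


def initial_cohort(records, min_age=18):
    """Apply an assumed inclusion criterion: adults only."""
    return [r for r in records if r.age >= min_age]


def precision_cohort(cohort, index, age_window=5):
    """Keep observations that resemble the index patient on the chosen variables."""
    return [
        r for r in cohort
        if abs(r.age - index.age) <= age_window
        and r.sex == index.sex
        and r.has_diabetes == index.has_diabetes
    ]


def event_rates(cohort):
    """Crude event proportion per treatment group (no confounder adjustment)."""
    rates = {}
    for arm in sorted({r.treatment for r in cohort}):
        group = [r for r in cohort if r.treatment == arm]
        rates[arm] = sum(r.event for r in group) / len(group)
    return rates


# Toy data, invented for this example only.
records = [
    Patient(72, "f", True, "A", False),
    Patient(70, "f", True, "A", True),
    Patient(74, "f", True, "B", True),
    Patient(69, "f", True, "B", True),
    Patient(45, "m", False, "A", False),
    Patient(16, "f", True, "B", False),
    Patient(80, "f", True, "A", True),
]
index_patient = Patient(71, "f", True, "", False)

cohort = initial_cohort(records)                    # inclusion/exclusion step
similar = precision_cohort(cohort, index_patient)   # similarity-based trimming
rates = event_rates(similar)                        # crude comparison by treatment
print(len(cohort), len(similar), rates)
```

Exact matching on categorical variables combined with a fixed window on age is only one possible similarity rule; distance-based or weighted rules would fit the same three-step structure.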