Abstract

Understanding the underlying structure of clinical data is a topic of growing importance. Topological data analysis enables data scientists to uncover the “shape” of data by extracting its underlying topological structure, which allows distinct regions to be identified. For example, certain regions may be associated with early-stage disease whilst others may represent different advanced disease sub-types. Identifying these regions can help clinicians to better understand a specific patient’s symptoms based on where the patient lies in the disease topology, and therefore to make more targeted interventions. However, these topologies do not capture any sequential or temporal information. Pseudo-time series analysis can generate realistic trajectories through non-time-series data by combining graph theory with expert knowledge (e.g. disease staging information). In this paper, we explore the combination of pseudo-time series analysis and topological data analysis to build realistic trajectories over disease topologies. Using three datasets (simulated, diabetes and genomic data), we explore how the combined method can highlight distinct temporal phenotypes in each disease based on the possible trajectories through the disease process.
