Abstract

Rare clinical events occur infrequently in a population by definition, but their consequences can be drastic. Developing risk stratification algorithms for these conditions typically requires collecting large volumes of data to capture enough positive and negative cases for training. This process is slow, expensive, and often burdensome to both patients and caregivers. In this paper, we propose an unsupervised machine learning approach that addresses this challenge by risk-stratifying patients for adverse outcomes without the use of a priori knowledge or labeled training data. The key idea of our approach is to identify high-risk patients as anomalies in a population (i.e., patients lying in sparse regions of the feature space). We identify these cases through a novel algorithm that finds an approximate solution to the k-nearest neighbor problem using locality-sensitive hashing (LSH) based on p-stable distributions. Our algorithm is optimized to use multiple LSH searches, each with a geometrically increasing radius, to find the k-nearest neighbors of patients in a dynamically changing dataset where patients are added or removed over time. When evaluated on data from the National Surgical Quality Improvement Program (NSQIP), this approach successfully identified patients at an elevated risk of mortality and rare morbidities. The LSH-based algorithm substantially improved runtime over an exact k-nearest neighbor algorithm while achieving similar accuracy.
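
To make the idea concrete, the following is a minimal sketch of k-nearest-neighbor anomaly scoring with LSH searches at geometrically increasing radii. It is not the authors' implementation. The class names, the Gaussian (2-stable) projections, the bucket width of four times the search radius, the table and hash counts, the full-scan fallback, and the use of the distance to the k-th neighbor as the anomaly score are all assumptions chosen for illustration.

```python
import math
import random
from collections import defaultdict


class PStableLSH:
    """One LSH index for a single radius, using Gaussian (2-stable) projections."""

    def __init__(self, dim, width, n_tables=8, n_hashes=4, seed=0):
        rng = random.Random(seed)
        self.width = width
        # Each table concatenates n_hashes functions h(x) = floor((a.x + b) / width).
        self.params = [
            [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
             for _ in range(n_hashes)]
            for _ in range(n_tables)
        ]
        self.tables = [defaultdict(set) for _ in range(n_tables)]

    def _keys(self, x):
        for funcs in self.params:
            yield tuple(
                math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b) / self.width)
                for a, b in funcs
            )

    def add(self, pid, x):
        for table, key in zip(self.tables, self._keys(x)):
            table[key].add(pid)

    def remove(self, pid, x):
        for table, key in zip(self.tables, self._keys(x)):
            table[key].discard(pid)

    def candidates(self, x):
        found = set()
        for table, key in zip(self.tables, self._keys(x)):
            found |= table.get(key, set())
        return found


class MultiRadiusKNN:
    """Approximate k-NN over a changing point set (hypothetical helper class)."""

    def __init__(self, dim, r0=1.0, growth=2.0, levels=6, seed=0):
        self.points = {}
        # One index per radius r0 * growth**i; width = 4 * radius is an assumption.
        self.indexes = [
            PStableLSH(dim, width=4.0 * r0 * growth ** i, seed=seed + i)
            for i in range(levels)
        ]

    def add(self, pid, x):
        self.points[pid] = x
        for idx in self.indexes:
            idx.add(pid, x)

    def remove(self, pid):
        x = self.points.pop(pid)
        for idx in self.indexes:
            idx.remove(pid, x)

    def knn(self, x, k, exclude=None):
        cands = set()
        for idx in self.indexes:  # smallest radius first
            cands |= idx.candidates(x)
            cands.discard(exclude)
            if len(cands) >= k:
                break
        # Fall back to a full scan if even the largest radius finds too few.
        if len(cands) < k:
            cands = set(self.points) - {exclude}
        dists = sorted(((math.dist(x, self.points[p]), p) for p in cands),
                       key=lambda t: t[0])
        return dists[:k]

    def anomaly_score(self, pid, k):
        """Distance to the k-th approximate neighbor; larger means a sparser region."""
        return self.knn(self.points[pid], k, exclude=pid)[-1][0]


# Synthetic data: 200 points in the cube [-1, 1]^3 plus one distant point.
data_rng = random.Random(1)
model = MultiRadiusKNN(dim=3)
for i in range(200):
    model.add(i, tuple(data_rng.uniform(-1.0, 1.0) for _ in range(3)))
model.add("outlier", (10.0, 10.0, 10.0))

outlier_score = model.anomaly_score("outlier", 5)
inlier_max = max(model.anomaly_score(i, 5) for i in range(200))
print(outlier_score, inlier_max)
```

Ranking patients by `anomaly_score` would then place those in sparse regions of the feature space first. Because `add` and `remove` only touch the affected hash buckets, the index can follow a population that changes over time without being rebuilt.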
