K-nearest Neighbor Clustering Research Articles

Background: The second wave of the COVID-19 pandemic led to substantial differences in incidence rates across Germany.

Methods: Assumption-free k-nearest neighbour clustering, applied to the principal component analysis of weekly incidence rates of German counties, groups counties with similar spreading behaviour. Different spreading dynamics were analysed using derivative plots of the temporal evolution of tuples [x(t), x'(t)] of weekly incidence rates and their derivatives. The effectiveness of the different shutdown measures in Germany during the second wave was assessed by the difference in weekly incidences before and after the respective time periods.

Findings: The implementation of non-pharmaceutical interventions of different extents resulted in four distinct time periods of complex, spatially diverse, and age-related spreading patterns during the second wave of the COVID-19 pandemic in Germany. Clustering gave three regions of coincident spreading characteristics. October 2020 showed nationwide exponential growth of weekly incidence rates with a doubling time of 10 days. A partial shutdown during November 2020 decreased the overall infection rates by 20–40%, with plateau-like behaviour in northern and southwestern Germany. The eastern parts exhibited further near-linear growth of 30–80%. Overall, the incidence rates among people above 60 years still increased by 15–35% during the partial shutdown. Only an extended shutdown led to a substantial decrease in incidence rates. These measures decreased the numbers among all age groups and in all regions by 15–45%. This decline until January 2021 was about -1.25 times the October 2020 growth rates, with a strong correlation of -0.96.

Interpretation: Three regional groups with different dynamics and different degrees of effectiveness of the applied measures were identified. The partial shutdown was moderately effective and at most stopped the exponential growth, but the spread remained partly plateau-like and, in some regions, continued to grow in a nearly linear fashion. Only the extended shutdown reversed the linear growth.

Funding Statement: Institutional support and physical resources were provided by the University Witten/Herdecke and Kliniken der Stadt Köln, German Ministry of Education and Research ‘Netzwerk Universitätsmedizin’ (NUM), egePan Unimed (01KX2021).

Declaration of Interests: Dr. Karagiannidis reports personal fees from Maquet, personal fees from Xenios, personal fees from Bayer, non-financial support from Speaker of the German register of ICUs, grants from German Ministry of Research and Education, during the conduct of the study. Dr. Schuppert reports grants from Bayer AG, outside the submitted work. Dr. Jens Karschau has nothing to disclose. Dr. Polotzek has nothing to disclose. Dr. Schmitt reports personal fees from Sanofi, Lilly, ALK, Novartis, grants from Sanofi, Pfizer, ALK, Novartis, outside the submitted work. Dr. Busse reports grants from Berlin University Alliance, non-financial support from German Federal Ministry of Health, during the conduct of the study.
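As a rough illustration of the 10-day doubling time reported in the Findings above, the figure can be converted into a week-over-week growth factor. This is a minimal sketch that assumes idealised exponential growth, x(t) = x(0) · 2^(t/T); the function names are hypothetical and are not taken from the study.

```python
import math


def weekly_growth_factor(doubling_time: float) -> float:
    """Factor by which incidence grows over 7 days, assuming pure
    exponential growth with the given doubling time (in days)."""
    return 2.0 ** (7.0 / doubling_time)


def implied_doubling_time(weekly_factor: float) -> float:
    """Inverse relation: doubling time (in days) implied by an observed
    week-over-week growth factor greater than 1."""
    return 7.0 * math.log(2.0) / math.log(weekly_factor)


factor = weekly_growth_factor(10.0)
# A 10-day doubling time corresponds to a weekly factor of about 1.62,
# i.e. weekly incidence rising by roughly 62% from one week to the next.
print(f"weekly growth factor: {factor:.2f}")
```

Under the same assumption, the inverse function recovers the doubling time from two consecutive weekly incidence values by passing their ratio as `weekly_factor`.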

Introduction: Nearly 80% of all patients with heart failure (HF) are older adults (≥65 years of age). Prior studies have built predictive models that relied on structured data from electronic health records (EHRs) to predict the risk of 30-day rehospitalization for patients with HF. Structured data mostly included simple vocabularies such as age and ethnicity. Prior studies rarely include clinical narrative data in a free-text format (i.e., unstructured data). No previous study has focused on using clinical narrative notes specifically for Medicare patients with HF in the acute-care setting.

Aim: To identify clinical notes for building a predictive model for risk of 30-day rehospitalization among Medicare patients with HF.

Methods: This study first used free-text discharge summary notes and nursing care plans collected from June 1, 2015 to December 31, 2019, for 500 randomly selected Medicare patients with HF. Natural Language Processing (NLP): we iterated over standard text pre-processing steps, exploring the impact of n-gram length, term document-frequency, word stemming, and the added value of parts-of-speech. We chose two models: 1) the classification model called Bag-of-Words (BOW), where each document is represented by a vector based on the pre-processed text, and 2) Document Embedding, where document terms are mapped to a dimension-reducing layer (length equals 300). The latter runs exceptionally fast (on the order of tens of seconds for 2,000 documents). Machine Learning (ML): the outputs of the NLP BOW and Document Embedding models were fed to six different conventional machine learning systems (logistic regression, support vector machine, random forest, k-nearest neighbor clustering, a three-layer neural network, and Naïve Bayes).

Results: The mean age was 77 ± 7.9 years, and the mean length of hospital stay was 4.9 ± 4.8 days. The best BOW model we found using discharge summaries (n = 387) produced an Area Under the Receiver Operating Characteristics Curve (AUC) of 0.71 and an F1 score of 0.65. The best Document Embedding model yielded an AUC of 0.65 and an F1 score of 0.61. Using nursing care notes as the unit of analysis (n = 2,046), the NLP/ML performed far better. The best BOW model on care plans achieved an AUC of 0.85 and an F1 score of 0.77. The best Document Embedding model produced an AUC of 0.83 and an F1 score of 0.75. In all cases we held out 33% of the data set for validation, repeating a random draw 10 times and averaging the performance results.

Conclusions: We conclude that nursing care plans are a better predictor of 30-day rehospitalization risk than discharge summaries. Because nursing care plans are shorter than discharge summaries, they have the added advantage of faster processing. Since the faster Document Embedding model's performance is comparable to that of BOW, we suggest its use in future work in the area of 30-day rehospitalization risk in Medicare patients with HF.
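The validation scheme described in the Results above (33% of the data held out, 10 random draws, performance averaged) might look roughly like the sketch below. It is not the authors' code: the `fit` and `predict` callables, the toy threshold classifier, and the synthetic data are all hypothetical stand-ins for the NLP/ML models, which the abstract does not specify in detail.

```python
import random
from statistics import mean


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def repeated_holdout(X, y, fit, predict, test_fraction=0.33, repeats=10, seed=0):
    """Average F1 over `repeats` random train/test splits."""
    rng = random.Random(seed)
    n_test = round(len(X) * test_fraction)
    scores = []
    for _ in range(repeats):
        idx = list(range(len(X)))
        rng.shuffle(idx)  # fresh random draw for every repetition
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        y_pred = predict(model, [X[i] for i in test_idx])
        scores.append(f1_score([y[i] for i in test_idx], y_pred))
    return mean(scores)


# Hypothetical toy classifier: threshold halfway between the class means.
def fit_threshold(X_train, y_train):
    lo = mean(x for x, label in zip(X_train, y_train) if label == 0)
    hi = mean(x for x, label in zip(X_train, y_train) if label == 1)
    return (lo + hi) / 2


def predict_threshold(threshold, X_test):
    return [1 if x >= threshold else 0 for x in X_test]


# Synthetic one-dimensional data, used only to exercise the procedure.
X = list(range(100))
y = [1 if x >= 50 else 0 for x in X]
score = repeated_holdout(X, y, fit_threshold, predict_threshold)
print(f"mean F1 over 10 hold-out draws: {score:.2f}")
```

Averaging over several random draws reduces the dependence of the reported score on any single split, which matters for data sets as small as the one described here.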

Articles published on K-nearest Neighbor Clustering

Investigations on Brain Tumor Classification Using Hybrid Machine Learning Algorithms.

SRIQ clustering: A fusion of Random Forest, QT clustering, and KNN concepts

Pear Defect Detection Method Based on ResNet and DCGAN

An Optimized Nature-Inspired Metaheuristic Algorithm for Application Mapping in 2D-NoC.

Different spreading dynamics throughout Germany during the second wave of the COVID-19 pandemic: a time series study based on national surveillance data.

Different Spreading Dynamics Throughout Germany During the Second Wave of the COVID-19 Pandemic: Link to Public Health Interventions

Online Methodology for Separating the Power Consumption of Lighting Sockets and Air-Conditioning in Public Buildings Based on an Outdoor Temperature Partition Model and Historical Energy Consumption Data

Diagnostic of pathology on the vertebral column machine learning - Cluster K-nearest Neighbor (CKNN) part (I)

An adaptive new state recognition method based on density peak clustering and voting probabilistic neural network

Predictive Model for Risk of 30-Day Rehospitalization Using a Natural Language Processing/Machine Learning Approach Among Medicare Patients with Heart Failure

Data Mining Approach to Effort Modeling On Agile Software Projects

Green Computing Process and its Optimization Using Machine Learning Algorithm in Healthcare Sector

Cumulative belief peaks evidential K-nearest neighbor clustering

An Intelligent Big Data Analytics System using Enhanced Map Reduce Techniques

Hierarchical Clustering Algorithm Based on Density Peaks using Kernel Function for Thalassemia Classification

Detection and Classification of Breast Cancer in Mammography Images Using Pattern Recognition Methods

Multi-Objective Optimization of Time-of-Use Price for Tertiary Industry Based on Generalized Seasonal Multi-Model Structure

A Self-Adaptive Mapping Approach for Network on Chip With Low Power Consumption

Performance study of K-nearest neighbor classifier and K-means clustering for predicting the diagnostic accuracy
