Application of Open-Source, Low-Code Machine-Learning Library in Python to Diagnose Parkinson's Disease Using Voice Signal Features

Abstract

Parkinson's disease (PD), the second most prevalent neurodegenerative disorder after Alzheimer's disease, affects approximately 10 million individuals worldwide. The disease is characterized by both motor and non-motor symptoms, and clinical aspects are pivotal for diagnosis. Vocal abnormalities can be identified in about 90% of PD patients in the early stages of the condition. Machine Learning (ML), a prominent subfield of Artificial Intelligence (AI), holds significant promise in the medical domain, particularly for early disease detection, which enables effective preventive measures and treatments. In this paper, we considered the unique characteristics of each ML algorithm. Seventeen ML algorithms were applied to a dataset of voice recordings from Healthy Control and PD individuals, sourced from a publicly available repository. We used the ML algorithms and functions of the PyCaret Python library, which this article introduces, to demonstrate their simplicity and effectiveness in dealing with real-world data. Among these algorithms, the Extra Trees Classifier (ETC), Gradient Boosting Classifier (GBC), and K Neighbors Classifier (KNN) performed best on the given dataset. To further improve the models' performance, we employed several techniques: the Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance, correlation-based feature selection, and hyperparameter tuning. Our findings highlight the potential of the PyCaret ML library as a valuable tool for applying ML to the classification of Parkinson's disease through voice analysis. The application of ML in this context can support clinical decision-making, leading to more informed and precise interventions.
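The abstract names SMOTE as the technique used against class imbalance. The following is a minimal pure-Python sketch of the core interpolation idea only; it is not the PyCaret or any library implementation. The function name `smote_sketch`, the parameters `k` and `seed`, and the toy points are assumptions made for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (Euclidean)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by distance to the chosen one.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature minority class (values are illustrative only).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(minority, n_new=6)
```

Each synthetic point lies on a line segment between two existing minority samples, so the minority class grows without simply duplicating rows.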

Similar Papers
  • Research Article
  • Cited by 7
  • 10.1080/08839514.2022.2158273
Analysis of Birth Data using Ensemble Modeling Techniques
  • Feb 28, 2023
  • Applied Artificial Intelligence
  • Sohaib Latif + 5 more

Machine learning and data mining are used in many fields, such as data analysis, prediction, and image processing, and particularly in healthcare. Over the past decade, numerous studies have applied machine learning and data mining to extract insights from historical data and to predict outcomes. Machine learning algorithms play a vital role in improving healthcare systems, driven by continuous research into their applications. Researchers have used machine learning algorithms to build decision support systems, analyze clinical aspects, extract useful information from historical data, make predictions, and categorize diseases, all to help physicians make better decisions. In this study, we used an ensemble voting technique to classify a birth dataset. Ensemble models combine individual machine learning algorithms and improve accuracy by predicting from the combined output of the base classifiers. Gradient boosting classifier (GBC), random forest (RF), bagging classifier (BC), and extra trees classifier (ETC) were used as base learners for a voting ensemble model that classifies the birth dataset. The results show that the voting classifier built from support vector machine (SVM), random forest (RF), extra trees classifier, and bagging classifier gave the best result, with an accuracy of 94.78%; the gradient boosting classifier reached 84.39% accuracy, the random forest 94.26%, the extra trees classifier 94.02%, and the bagging classifier 93.65%. The accuracy achieved by ensemble modeling is higher than that of the individual machine learning algorithms. Ensemble models increase the accuracy of machine learning algorithms by reducing variance and classification errors.
The development of such a system will not only help health organizations to take effective measures to improve the maternal health assessment process but will also open the doors for interdisciplinary research in two different fields in the region.
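The voting idea described in this abstract can be sketched in a few lines. This is a minimal hard-voting illustration in plain Python, not the study's pipeline; the three prediction lists and the names `majority_vote` and `accuracy` are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample (hard voting)."""
    combined = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest vote count.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical predictions from three base classifiers on six samples.
y_true = [1, 0, 1, 1, 0, 0]
model_a = [1, 0, 0, 1, 0, 1]
model_b = [1, 1, 1, 1, 0, 0]
model_c = [0, 0, 1, 1, 1, 0]
ensemble = majority_vote([model_a, model_b, model_c])
```

In this toy case each base model errs on different samples, so the majority vote corrects every individual mistake. Real base classifiers usually make correlated errors, and the gain is correspondingly smaller.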

  • Research Article
  • Cited by 135
  • 10.3389/frai.2023.1084001
Machine learning approaches to identify Parkinson's disease using voice signal features.
  • Mar 28, 2023
  • Frontiers in Artificial Intelligence
  • Raya Alshammri + 3 more

Parkinson's Disease (PD) is the second most common age-related neurological disorder that leads to a range of motor and cognitive symptoms. A PD diagnosis is difficult since its symptoms are quite similar to those of other disorders, such as normal aging and essential tremor. When people reach 50, visible symptoms such as difficulties walking and communicating begin to emerge. Even though there is no cure for PD, certain medications can relieve some of the symptoms. Patients can maintain their lifestyles by controlling the complications caused by the disease. At this point, it is essential to detect this disease and prevent it from progressing. The diagnosis of the disease has been the subject of much research. In our project, we aim to detect PD using different types of Machine Learning (ML), and Deep Learning (DL) models such as Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), K-Nearest Neighbor (KNN), and Multi-Layer Perceptron (MLP) to differentiate between healthy and PD patients by voice signal features. The dataset taken from the University of California at Irvine (UCI) machine learning repository consisted of 195 voice recordings of examinations carried out on 31 patients. Moreover, our models were trained using different techniques such as Synthetic Minority Over-sampling Technique (SMOTE), Feature Selection, and hyperparameter tuning (GridSearchCV) to enhance their performance. At the end, we found that MLP and SVM with a ratio of 70:30 train/test split using GridSearchCV with SMOTE gave the best results for our project. MLP performed with an overall accuracy of 98.31%, an overall recall of 98%, an overall precision of 100%, and f1-score of 99%. In addition, SVM performed with an overall accuracy of 95%, an overall recall of 96%, an overall precision of 98%, and f1-score of 97%. 
The experimental results of this research imply that the proposed method can be used to reliably predict PD and can be easily incorporated into healthcare for diagnosis purposes.
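The accuracy, precision, recall, and F1 figures quoted above all derive from the four confusion-matrix counts. The sketch below shows that relationship in plain Python; the label lists are hypothetical and are not data from the study, and the function names are chosen for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Hypothetical labels (1 = PD, 0 = healthy control); not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 1]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
```

As a consistency check on the reported figures, `f1_score(1.0, 0.98)` rounds to 0.99 and `f1_score(0.98, 0.96)` rounds to 0.97, in line with the F1 scores the abstract gives for MLP and SVM.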

  • Research Article
  • Cited by 2
  • 10.3389/fendo.2025.1486350
Machine learning applications to classify and monitor medication adherence in patients with type 2 diabetes in Ethiopia.
  • Mar 20, 2025
  • Frontiers in endocrinology
  • Ewunate Assaye Kassaw + 3 more

Medication adherence plays a crucial role in determining the health outcomes of patients, particularly those with chronic conditions like type 2 diabetes. Despite its significance, there is limited evidence regarding the use of machine learning (ML) algorithms to predict medication adherence within the Ethiopian population. The primary objective of this study was to develop and evaluate ML models designed to classify and monitor medication adherence levels among patients with type 2 diabetes in Ethiopia, to improve patient care and health outcomes. Using a random sampling technique in a cross-sectional study, we obtained data from 403 patients with type 2 diabetes at the University of Gondar Comprehensive Specialized Hospital (UoGCSH), excluding 13 subjects who were unable to respond and 6 with incomplete data from an initial cohort of 422. Medication adherence was assessed using the General Medication Adherence Scale (GMAS), an eleven-item Likert scale questionnaire. The responses served as features to train and test machine learning (ML) models. To address data imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) was applied. The dataset was split using stratified K-fold cross-validation to preserve the distribution of adherence levels. Eight widely used ML algorithms were employed to develop the models, and their performance was evaluated using metrics such as accuracy, precision, recall, and F1 score. The best-performing model was subsequently deployed for further analysis. Out of 422 enrolled patients, 403 data samples were collected, with 11 features extracted from each respondent. To mitigate potential class imbalance, the dataset was increased to 620 samples using the Synthetic Minority Over-sampling Technique (SMOTE). 
Machine learning models including Logistic Regression (LR), Support Vector Machine (SVM), K Nearest Neighbor (KNN), Decision Tree (DT), Random Forest (RF), Gradient Boost Classifier (GBC), Multilayer Perceptron (MLP), and 1D Convolutional Neural Network (1DCNN) were developed and evaluated. Although the performance differences among the models were subtle (within a range of 0.001), the SVM classifier outperformed the others, achieving a recall of 0.9979 and an AUC of 0.9998. Consequently, the SVM model was selected for deployment to monitor and detect patients' medication adherence levels, enabling timely interventions to improve patient outcomes. This study highlights a variety of machine learning (ML) models that can be effectively used to monitor and classify medication adherence in diabetic patients in Ethiopia. However, to fully realize the potential impact of digital health applications, further studies that include patients from diverse settings are necessary. Such research could enhance the generalizability of these models and provide insights into the broader applicability of digital tools for improving medication adherence and patient outcomes in varying healthcare contexts.
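Stratified K-fold cross-validation, mentioned above as the way the class distribution was preserved, can be sketched as follows. This is a simplified round-robin version in plain Python under assumed names (`stratified_kfold`, the toy `labels`); library implementations typically also support shuffling.

```python
from collections import defaultdict


def stratified_kfold(labels, k):
    """Yield (train_idx, test_idx) pairs whose folds keep roughly the class mix."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's indices round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            j for f, fold in enumerate(folds) if f != i for j in fold
        )
        yield train_idx, test_idx


# Hypothetical adherence labels with a 2:1 class ratio.
labels = ["high"] * 6 + ["low"] * 3
splits = list(stratified_kfold(labels, k=3))
```

With these toy labels, every test fold contains two "high" and one "low" sample, mirroring the 2:1 ratio of the full set.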

  • Research Article
  • Cited by 69
  • 10.1016/j.ascom.2018.02.002
Separation of pulsar signals from noise using supervised machine learning algorithms
  • Feb 22, 2018
  • Astronomy and Computing
  • S Bethapudi + 1 more

  • Research Article
  • Cited by 26
  • 10.1016/j.dajour.2023.100381
An artificial intelligence-based decision support system for early and accurate diagnosis of Parkinson’s Disease
  • Dec 13, 2023
  • Decision Analytics Journal
  • Mahesh T.R + 6 more

  • Research Article
  • 10.52756/ijerr.2024.v45spl.005
User Interface Bug Classification Model Using ML and NLP Techniques: A Comparative Performance Analysis of ML Models
  • Nov 30, 2024
  • International Journal of Experimental Research and Review
  • Sara Khan + 1 more

Analyzing user interface (UI) bugs is an important step taken by testers and developers to assess the usability of a software product. UI bug classification helps in understanding the nature and cause of software failures. Manually classifying thousands of bugs is an inefficient and tedious job for both testers and developers. The objective of this research is to develop a classification model for UI-related bugs using supervised Machine Learning (ML) algorithms and Natural Language Processing (NLP) techniques, and to assess the effect of different sampling and feature vectorization techniques on the performance of ML algorithms. Classification is based on the 'Summary' feature of the bug report and uses six classifiers: Gaussian Naïve Bayes (GNB), Multinomial Naïve Bayes (MNB), Logistic Regression (LR), Support Vector Machines (SVM), Random Forest (RF), and Gradient Boosting (GB). The dataset is vectorized using two NLP vectorization techniques: Bag of Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF). The ML models are trained after vectorization and data balancing. The models' hyperparameter tuning (HT) has also been done using the grid search approach to improve their efficacy. This work provides a comparative performance analysis of ML techniques using Accuracy, Precision, Recall, and F1 Score. The results showed that a UI bug classification model can be built by training a tuned SVM classifier using TF-IDF and SMOTE (Synthetic Minority Oversampling Technique). The SVM classifier provided the highest performance, with Accuracy: 0.88, Precision: 0.86, Recall: 0.85, and F1: 0.85. The results also indicated that the performance of ML algorithms with TF-IDF is better than with BoW in most cases. This work classifies only bugs that are related to the user interface.
Also, the effect of two different feature extraction techniques and sampling techniques on the algorithms was analyzed, adding novelty to the research work.
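The difference between the two vectorization techniques compared in this abstract can be shown with a small plain-Python sketch. It uses the textbook weighting tf × ln(N / df); libraries commonly apply smoothed variants, so absolute values will differ. The bug summaries and function names are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Raw term counts per document (whitespace tokenization, lower-cased)."""
    return [Counter(doc.lower().split()) for doc in docs]


def tf_idf(docs):
    """Textbook TF-IDF: term frequency times ln(N / document frequency)."""
    counts = bag_of_words(docs)
    n_docs = len(docs)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for c in counts for term in c)
    weights = []
    for c in counts:
        total = sum(c.values())
        weights.append(
            {t: (n / total) * math.log(n_docs / doc_freq[t]) for t, n in c.items()}
        )
    return weights


# Hypothetical bug-report summaries, for illustration only.
docs = ["button overlaps text", "button not clickable", "crash on login"]
bow = bag_of_words(docs)
weights = tf_idf(docs)
```

In the first summary, BoW gives "button" and "overlaps" the same count of 1, whereas TF-IDF weights "overlaps" higher because it appears in only one document.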

  • Research Article
  • Cited by 48
  • 10.1016/j.isprsjprs.2023.05.015
Utilization of synthetic minority oversampling technique for improving potato yield prediction using remote sensing data and machine learning algorithms with small sample size of yield data
  • May 24, 2023
  • ISPRS Journal of Photogrammetry and Remote Sensing
  • Hamid Ebrahimy + 2 more

  • Book Chapter
  • 10.1007/978-981-19-5403-0_31
Graded Classification of Liver Cirrhosis Using Machine Learning Algorithms on a Highly Unbalanced Dataset
  • Nov 29, 2022
  • Diganta Sengupta + 3 more

Liver cirrhosis is fibrosis of the liver caused by long-term damage to the organ. This study classifies the disease into four classes using machine learning (ML) algorithms on a highly unbalanced dataset with 18 features and 6800 records, of which 465, 1507, 1322, and 3506 are labeled for the four classes. Twelve ML algorithms were deployed for classification, with the Histogram Gradient Boost Classifier reaching the highest accuracy of 68.21%. To improve accuracy further, hyper-parameter tuning was performed on all the ML algorithms, which yielded the highest accuracy of 77.97% for the Gradient Boost Classifier (GBC). A further improvement was observed with a stacking model, which achieved an accuracy of 84.24%. The stacked model comprised the GBC as the meta-learner, and K-Nearest Neighbor (KNN), Xtreme Gradient Boost algorithm (XGB), Support Vector Machine (SVM), and the Light Gradient Boost Machine (LGBM) as the base-learners. To the best of our knowledge, this is the first attempt at graded classification of liver cirrhosis using all the ML algorithms, including hyper-parameter tuned and stacked models.

  • Research Article
  • Cited by 16
  • 10.1007/s12012-024-09843-8
Development and Validation of Machine Learning Algorithms to Predict 1-Year Ischemic Stroke and Bleeding Events in Patients with Atrial Fibrillation and Cancer
  • Mar 18, 2024
  • Cardiovascular Toxicology
  • Bang Truong + 5 more

In this study, we used a machine learning (ML) approach to develop and validate new assessment tools for predicting stroke and bleeding among patients with atrial fibrillation (AFib) and cancer. We conducted a retrospective cohort study including patients who were newly diagnosed with AFib with a record of cancer from the 2012–2018 Surveillance, Epidemiology, and End Results (SEER)-Medicare database. The ML algorithms were developed and validated separately for each outcome by fitting elastic net, random forest (RF), extreme gradient boosting (XGBoost), support vector machine (SVM), and neural network models with tenfold cross-validation (train:test = 7:3). We obtained area under the curve (AUC), sensitivity, specificity, and F2 score as performance metrics. Model calibration was assessed using the Brier score. In a sensitivity analysis, we resampled the data using the Synthetic Minority Oversampling Technique (SMOTE). Among 18,388 patients with AFib and cancer, 523 (2.84%) had ischemic stroke and 221 (1.20%) had major bleeding within one year after AFib diagnosis. In prediction of ischemic stroke, RF significantly outperformed the other ML models [AUC (0.916, 95% CI 0.887–0.945), sensitivity 0.868, specificity 0.801, F2 score 0.375, Brier score = 0.035]. However, the performance of the ML algorithms in prediction of major bleeding was low, with the highest AUC achieved by RF (0.623, 95% CI 0.554–0.692). RF models performed better than CHA2DS2-VASc and HAS-BLED scores. SMOTE did not improve the performance of the ML algorithms. Our study demonstrated a promising application of ML in stroke prediction among patients with AFib and cancer. This tool may be leveraged in assisting clinicians to identify patients at high risk of stroke and optimize treatment decisions.
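Two of the metrics named above are less common than accuracy: the F2 score weights recall more heavily than precision, and the Brier score is the mean squared error of predicted probabilities. The sketch below implements the standard definitions in plain Python; the example inputs are hypothetical and unrelated to the study's data.

```python
def f_beta(precision, recall, beta=2.0):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0


def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


# Hypothetical values, for illustration only.
f2 = f_beta(precision=0.5, recall=1.0)          # (5 * 0.5) / (2 + 1)
brier = brier_score([1, 0, 0, 1], [0.9, 0.2, 0.1, 0.6])
```

With a rare outcome, precision is often low even when sensitivity is high, which is one reason an F2 score can look modest next to a strong AUC.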

  • Dissertation
  • 10.53846/goediss-10997
Computertomographie bei COVID-19 Pneumonien und pulmonalen Non-COVID Erkrankungen: Diagnostische Genauigkeit der radiologischen Beurteilung und maschineller Lernverfahren
  • Jan 1, 2025
  • Karla Sophie Hähnle

Since early 2020, the novel Covid-19 pneumonia has been spreading worldwide and poses a diagnostic challenge. Medical imaging techniques such as computed tomography (CT) play an important role in diagnosing Covid-19. CT-based diagnosis generates a large number of CT images, whose interpretation by radiologists is both time-consuming and subject to subjective assessment. In the present study, the diagnostic performance of radiologists in distinguishing between Covid and non-Covid patients using the CO-RADS and COV-RADS classification systems is compared with that of machine learning (ML) algorithms, in order to evaluate whether the diagnostic efficiency of radiologists in clinical routine can be improved by ML algorithms. Included were patients who tested PCR-positive for SARS-CoV-2 between March 1, 2020, and January 31, 2021 (Covid cohort), or who, between 2001 and 2020, had a differential diagnosis relevant to Covid-19 (non-Covid cohort) and therefore received a CT scan. The non-Covid cohort covers a wide range of differential diagnoses, such as pneumonias of other etiologies, interstitial lung diseases (ILD), vasculitides, malignancies, COPD, cystic fibrosis, tuberculosis, and cardiovascular diseases. In this study, the CT scans were systematically assessed by radiologists in a blinded manner and evaluated using CO-RADS and COV-RADS. The resulting data were extracted as semantic annotations for training the ML algorithms. After data preprocessing, addressing class imbalances, and performing feature selection, ML algorithms were implemented to predict Covid-19 pneumonia using 10-fold cross-validation. Both the diagnostic performance of the radiologists (based on CO-RADS and COV-RADS) and that of the ML algorithms were evaluated as area under the curve (AUC). A total of n=237 individuals (75% male, mean age 60 ± 18 years) with n=74 Covid-19 and n=163 non-Covid cases from the patient collective of the University Medical Center Göttingen were included retrospectively.
First, the entire cohort was analyzed; subsequently, a smaller subgroup (n = 139) was examined in more detail, which, in addition to Covid-19 pneumonias, also included pneumonias of other etiologies, ILDs, and vasculitides. In the overall cohort analysis, the radiologists achieved diagnostic accuracies of AUC=0.74 and AUC=0.73 using CO-RADS and COV-RADS, respectively. The best-performing ML algorithm, the Gradient Boosting Classifier (GBC), achieved an AUC=0.75 under feature selection. In the subgroup analysis, the radiologists achieved AUC=0.61 and AUC=0.62 with CO-RADS and COV-RADS, respectively, while the best-performing GBC achieved an AUC=0.67 under feature selection. The diagnostic difference between radiologists and ML algorithms was not significant in either the overall cohort analysis or the subgroup analysis. In summary, both the diagnostic performance of radiologists based on CO-RADS and COV-RADS, as well as the ML algorithms, for detecting Covid-19 pneumonia is moderate. Differentiating Covid-19 pneumonia within the subgroup analysis appears particularly complex. The results indicate that there is no statistically significant difference in the diagnostic accuracy of the ML algorithms compared to the radiologists. Thus, the information processing of the available CT image features in Covid-19 pneumonia by radiologists seems to be adequate.
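Because both the radiologists' ordinal ratings and the classifier outputs are compared via AUC, it may help to see how AUC is computed from scores. The sketch below uses the pairwise (rank-based) definition in plain Python, with ties counted as half; the labels and ordinal suspicion scores are hypothetical.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = Covid-19) and ordinal suspicion scores.
y_true = [1, 1, 0, 0, 1, 0]
scores = [5, 3, 3, 1, 4, 2]
auc = roc_auc(y_true, scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which puts the reported values of 0.61 to 0.75 in the moderate range the author describes.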

  • Research Article
  • Cited by 1
  • 10.22214/ijraset.2024.59236
Heart Failure Prediction Using Machine Learning
  • Mar 31, 2024
  • International Journal for Research in Applied Science and Engineering Technology
  • Jeevan Babu Maddala + 5 more

Cardiovascular Disease (CVD) currently stands as the leading cause of death worldwide. Accurately predicting cardiac disease is a significant challenge for clinical data analytics. The healthcare industry generates vast volumes of raw data, which must be transformed into meaningful insights through machine learning techniques. The objective is to leverage machine learning models to improve the predictability of survival among cardiac patients. This study employs the following machine learning classifiers: Random Forest, Gradient Boosting Classifier, Extra Tree Classifier, XG-Boost, Ada Boost, and Hybrid models. The Synthetic Minority Oversampling Technique (SMOTE) addresses the challenge posed by imbalanced datasets. Experimental results indicate that employing SMOTE enhances the accuracy of the chosen classifier's predictions. Among these classifiers, the Hybrid Model stands out with the highest accuracy of 89.82% when predicting the survival of cardiac patients after implementing SMOTE.

  • Front Matter
  • Cited by 63
  • 10.1002/aps3.11371
Plants meet machines: Prospects in machine learning for plant biology
  • Jun 1, 2020
  • Applications in Plant Sciences
  • Pamela S Soltis + 3 more

  • Book Chapter
  • Cited by 1
  • 10.1007/978-981-19-4863-3_55
Top Five Machine Learning Libraries in Python: A Comparative Analysis
  • Oct 28, 2022
  • Mothe Rajesh + 1 more

Nowadays machine learning (ML) is used in all sorts of fields, such as health care, retail, travel, finance, and social media. An ML system learns from input data to construct a suitable model by continuously estimating, optimizing, and tuning the model's parameters. To that end, Python is one of the most flexible programming languages, and it offers dedicated libraries for ML applications, namely SciKit-Learn, TensorFlow, PyTorch, Keras, Theano, etc., which are well suited to linear algebra and to learning the kernel methods of machine learning. Python is a good fit for working with ML algorithms and has a relatively easy syntax. When taking a deep dive into ML, choosing a framework can be daunting. The most common concern is to understand which of these libraries has the most momentum in ML system modeling and development. The major objective of this paper is to provide extensive knowledge of various Python libraries and different ML algorithms, and to compare them in order to meet multiple application requirements. This paper also reviews various ML algorithms and application domains. Keywords: Machine Learning, Libraries, Python

  • Research Article
  • Cited by 77
  • 10.1186/s12874-023-02078-1
Application of machine learning in predicting survival outcomes involving real-world data: a scoping review
  • Nov 13, 2023
  • BMC medical research methodology
  • Yinan Huang + 3 more

Background: Despite the interest in machine learning (ML) algorithms for analyzing real-world data (RWD) in healthcare, the use of ML in predicting time-to-event data, a common scenario in clinical practice, is less explored. ML models are capable of algorithmically learning from large, complex datasets and can offer advantages in predicting time-to-event data. We reviewed the recent applications of ML for survival analysis using RWD in healthcare. Methods: PUBMED and EMBASE were searched from database inception through March 2023 to identify peer-reviewed English-language studies of ML models for predicting time-to-event outcomes using RWD. Two reviewers extracted information on the data source, patient population, survival outcome, ML algorithms, and the Area Under the Curve (AUC). Results: Of 257 citations, 28 publications were included. Random survival forests (N = 16, 57%) and neural networks (N = 11, 39%) were the most popular ML algorithms. There was variability across AUC for these ML models (median 0.789, range 0.6–0.950). ML algorithms were predominantly considered for predicting overall survival in oncology (N = 12, 43%). ML survival models were often used to predict disease prognosis or clinical events (N = 27, 96%) in oncology, while fewer were used for treatment outcomes (N = 1, 4%). Conclusions: The ML algorithms random survival forests and neural networks are mainly used with RWD to predict survival outcomes such as disease prognosis or clinical events in oncology. This review shows that more opportunities remain to apply these ML algorithms to inform treatment decision-making in clinical practice. More methodological work is also needed to ensure the utility and applicability of ML models in survival outcomes.

  • Preprint Article
  • 10.5194/epsc2020-963
Investigating Machine Learning as a Basis for Asteroid Taxonomies in the 3-Micron Spectral Region
  • May 2, 2024
  • Matthew Richardson + 2 more

Abstract: As part of a larger study to elucidate the presence of hydrated minerals on asteroid surfaces, we are developing a robust taxonomic classification system using spectroscopic observations in the vicinity of 3 µm. We have constructed a Python algorithm to identify band centers and band depths near 3 µm for a set of normalized, thermally-corrected asteroid spectra, to serve as inputs to Python's Scikit-Learn library of Machine Learning (ML) algorithms. We anticipate a thorough investigation of both Principal Component Analysis and ML (supervised, unsupervised, and Artificial Neural Network) techniques to assess which technique is likely to be better suited for classifying the 3-µm data. At this writing, we have run tests using Python's Agglomerative clustering ML algorithm to examine possible clustering scenarios. These initial steps have given us some familiarity with the mechanics of using ML on the 3-µm dataset, and have served to identify some possible pitfalls or cul-de-sacs. Presented here are the preliminary results we have obtained.
Introduction: Although various techniques have been used, asteroid classification has typically been done via Principal Component Analysis (PCA: [1,2]). PCA is a statistical technique that reduces the dimensionality of a dataset by identifying the most important parameters within a dataset based on their variance. Parameters that exhibit the greatest amount of variance are considered to be of greater importance, while parameters with the least amount of variance are considered to be of lower importance. While the PCA technique produces better visualizations of the data by reducing the dimensionality of a dataset, it comes with some drawbacks. Disadvantages such as its dependence on scale and information loss due to the orthogonal property of PCA can make interpretation of PCA results a more critical and time-consuming process. Therefore, exploring other means of classification may prove to be worthwhile. Machine Learning (ML) algorithms have had a significant impact on the way in which data is analyzed and interpreted, and have already proven to be a powerfully reliable resource in the field of planetary science. Accordingly, the application of ML to an asteroid taxonomy has the potential to be more efficient, objective, and easy to implement than PCA. ML algorithms can be supervised, in which the program "learns" from training data and is able to classify new inputs, or unsupervised, in which the program analyzes the dataset to determine patterns such as clusters. [3] used an Artificial Neural Network (ANN, a subset of ML) to classify asteroids, work followed up by [4]. Recent explorations of supervised ML for asteroid taxonomy are promising, and have applied training sets from existing databases to new visible and/or NIR photometric data (e.g. [5,6,7]). We seek to explore the benefits of ML algorithms, as well as compare and contrast them with the PCA technique, in the production of an asteroid taxonomy. Our initial exploration has utilized a set of normalized, thermally-corrected asteroid spectra in the vicinity of 3 µm. We have identified band centers and band depths and supplied this parameter space as input to Python's Agglomerative clustering ML algorithm.
Methodology: Thermal corrections of the asteroid spectra were performed via a forward model that uses a modified version of the Standard Thermal Model (STM: [8]). The forward model treats the beaming parameter as a free parameter, adjusting its value for each iteration of the STM until it converges onto a value that yields expected long-wavelength continuum behavior. Spectra were then normalized to unity at a wavelength of 2.3 µm, followed by identification of band centers and band depths near 3 µm using both polynomial and Gaussian fits. In addition, band depths were measured at wavelengths of 2.9 µm and 3.2 µm to gather more information on asteroid band shapes. Lastly, the aforementioned calculated spectral features were input into Python's Agglomerative clustering algorithm to determine which asteroid spectra shared similar features.
Summary: As part of a larger investigation to better understand hydrated mineralogies as they apply to asteroids, we have begun work towards developing a quantitative taxonomic framework derived from asteroid spectra in the wavelength range from 2.0-4.0 µm. Our exploration thus far of Python's Agglomerative clustering algorithm has proven to be fruitful. Minor changes to the parameterization of this algorithm can yield very different results, which naturally can lead to different interpretations. The Agglomerative clustering algorithm is one of the many powerful ML algorithms we will explore against the PCA technique, all of which we will be discussing in our presentation.
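The agglomerative clustering step described above can be illustrated with a small plain-Python sketch using average linkage. It is a naive illustration of the technique and not the Scikit-Learn implementation the authors used; the feature pairs (band centre in microns, band depth) are invented for illustration, and `agglomerative` is a hypothetical name.

```python
import math


def agglomerative(points, n_clusters):
    """Merge the two closest clusters (average linkage) until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between members of a and b.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        _, x, y = min(
            (linkage(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical (band centre in microns, band depth) pairs, illustrative only.
features = [(2.90, 0.10), (2.92, 0.12), (3.10, 0.30), (3.12, 0.28), (2.91, 0.11)]
groups = agglomerative(features, n_clusters=2)
```

Changing `n_clusters`, the linkage rule, or the scaling of the two features can regroup the spectra, which is consistent with the authors' remark that minor parameter changes can yield very different results.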
