Decision tree analysis of the prospects of organic food: Evidence from China and Hungary
- Research Article
2
- 10.3233/jcm-2009-0256
- Oct 1, 2009
- Journal of Computational Methods in Sciences and Engineering
In the present study, an in silico approach using decision tree, random forest and moving average analysis was applied to a data set comprising 53 analogues of 5-alkyl-2-alkylamino-6-(2,6-difluorophenylalkyl)-3,4-dihydropyrimidin-4(3$H$)-one to develop models for predicting anti-HIV-1 activity. A total of 46 2D and 3D molecular descriptors of diverse nature were used for decision tree and random forest analysis. The values of the majority of these descriptors for each analogue in the dataset were computed using E-Dragon software (version 1.0). An in-house computer program was also employed to calculate additional topological descriptors that were not included in E-Dragon. Random forest correctly classified the analogues into active and inactive with an accuracy of 85%. A decision tree was also employed to determine the importance of molecular descriptors. The decision tree learned the information from the input data with an accuracy of 98% and correctly predicted the cross-validated (10-fold) data with an accuracy of up to 77%. The best five descriptors identified by decision tree analysis were subsequently used to build suitable models using moving average analysis. The models based upon these non-correlating molecular descriptors predicted anti-HIV-1 activity with an overall accuracy of 83-96%. Moreover, the active ranges of the proposed models not only revealed high potency but also exhibited improved safety, as indicated by relatively high values of the selectivity index. The statistical significance of the models/indices was assessed through intercorrelation analysis, sensitivity, specificity and the Matthews correlation coefficient. The high predictability of the proposed models offers vast potential for providing lead structures for the development of potent but safe anti-HIV-1 agents.
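The abstract above names sensitivity, specificity and the Matthews correlation coefficient as its evaluation statistics. As a rough illustration of how these are derived from a binary confusion matrix, here is a minimal sketch using the standard textbook definitions. The function name `binary_metrics` and the counts passed to it are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification statistics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for a 53-compound data set (illustrative only, not the paper's results).
metrics = binary_metrics(tp=20, fp=4, tn=25, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, which makes it a useful companion statistic when the active and inactive classes are unbalanced.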
- Research Article
- 10.3233/isb-2010-0436
- Nov 1, 2010
- In Silico Biology
Antagonism of cannabinoid receptor-1 has emerged as a most promising therapeutic target for the development of anti-obesity drugs. In the present study, an in silico approach using decision tree, random forest and moving average analysis was applied to a data set comprising 76 analogues of substituted 2-(3-pyrazolyl)-1,3,4-oxadiazoles to develop models for predicting cannabinoid receptor-1 antagonistic activity. A total of 46 2D and 3D molecular descriptors of diverse nature were employed for decision tree and random forest analysis. The values of the majority of these descriptors for each analogue in the dataset were computed using E-Dragon software (version 1.0). Random forest correctly classified the analogues into active and inactive with an accuracy of 95%. A decision tree was also utilized to determine the importance of molecular descriptors. The decision tree learned the information from the input data with an accuracy of 99% and correctly predicted the cross-validated (10-fold) data with an accuracy of up to 90%. Finally, three molecular descriptors of diverse nature (including the best descriptor identified by decision tree analysis) were used to build suitable models using moving average analysis. These models predicted cannabinoid receptor-1 antagonistic activity with an accuracy of 95-96%. The high predictability of the proposed models offers vast potential for providing lead structures for the development of potent cannabinoid receptor-1 antagonists for the treatment of obesity.
- Research Article
- 10.3389/fped.2025.1569913
- Mar 19, 2025
- Frontiers in pediatrics
This study aimed to investigate and understand predictor variables and isolate the exact roles of anthropometric and demographic variables in the hand grip strength of young children. In total, 315 male and female children participated in the study; 11 participants were excluded, so 304 participants completed the assessments. Age and anthropometric measurements (height, weight, hand circumference, hand span, hand length, and palm length) were collected at the time of the study, and hand grip strength (HGS) was measured. Both decision tree and regression machine learning analyses were used to isolate the relative contribution of independent features in predicting the grip strength of children. Two predictive models were developed to understand the role of predictor variables in dominant-hand HGS for boys and girls. For boys, the decision tree was found to be the best model, with the lowest error in predicting HGS. The respondents' age, hand span, and weight were the most significant contributors to male hand grip strength. For boys under 9.5 years of age, based on the decision tree analysis, weight (split at 27.5 kg) was found to be the most significant predictor. Furthermore, for boys under 14.5 years of age, weight (split at 46.7 kg) remained the most important predictor. For boys 14.5 years and older, hand span was important in predicting handgrip strength. Backward regression was found to be the best model for predicting female hand grip strength. The R² value for the model was 0.6646, and the significant variables were body mass index (BMI), hand length, hand span, and palm length, each significant at p ≤ 0.05. This model explained 66.46% of the variance in handgrip strength among the girls. Anthropometric factors played a significant role in hand grip strength. Age, weight, and a larger hand span were found to be significant in impacting male HGS, while BMI, hand length, and palm length contributed to higher grip strength among the girls.
- Research Article
47
- 10.3168/jds.2018-14422
- Aug 16, 2018
- Journal of Dairy Science
Association between metabolic diseases and the culling risk of high-yielding dairy cows in a transition management facility using survival and decision tree analysis
- Research Article
- 10.2139/ssrn.3810726
- Mar 24, 2021
- SSRN Electronic Journal
Decision trees can help lawyers better counsel clients and predict outcomes. They can serve as a bridge from indecipherable data to a better informed and counseled client. Marjorie Corman Aaron’s new book, Risk and Rigor: A Lawyer’s Guide to Decision Trees for Assessing Cases and Advising Clients, has the potential to help bring about more widespread and effective use of decision tree analysis in lawyering. Risk and Rigor makes a comprehensive contribution to decision tree analysis learning. This review proceeds in four parts. Part I provides an overview of how decision tree analysis can serve the lawyer, on one hand, and law school teachers and their students, on the other. Part II examines the nuts-and-bolts parts of the book where the author teaches the willing reader how decision trees work and how to build them. Part III engages with the author’s assessment of the risks and limitations of decision tree analysis. Part IV queries whether decision trees may pose a barrier to some. Risk and Rigor had me wondering if decision tree analysis is equally accessible and welcomed by all, or if there might be some barriers to using decision trees with clients and law students. Risk and Rigor leaves relatively unexplored whether or not educational, gender, racial, and other biases may compromise the effectiveness of the use of decision tree analysis as a means of cultivating better client communication and counseling. Ultimately, Risk and Rigor makes a persuasive case for the use of decision tree analysis. A practical, user-friendly addition to dispute resolution literature on decision analysis, Risk and Rigor challenges the reader to consider not just the benefits but also the limitations of decision trees.
- Research Article
- 10.17265/1537-1506/2014.06.004
- Jun 28, 2014
- Chinese Business Review
This paper discusses the theory of Human Resources Management (HRM) and the use of quantitative methods in HRM. First, five variables establish HRM theory: HRM practices, positive organizational behaviors, individual performance, performance of business departments, and firm performance. Transactions among those variables enable Human Resources (HR) practitioners to apply HRM theory in their organizations. Second, the paper discusses the use of two quantitative methods in HRM: vector analysis and decision tree analysis. Those analyses enable HR practitioners to make effective HR decisions. A decision tree sets out HR alternatives so that HRM practices can be implemented efficiently in organizations. The research question is how HR practitioners apply quantitative methods in the HRM departments of firms. Finally, the research concludes that quantitative methods may be used in HRM.
- Research Article
26
- 10.1089/neu.2008.0841
- Apr 16, 2009
- Journal of Neurotrauma
Traumatic brain injury is a major socioeconomic burden, and the use of statistical models to predict outcomes after head injury can help to allocate limited health resources. Earlier prediction models analyzing admission data have been used to achieve prediction accuracies of up to 80%. Our aim was to design statistical models utilizing a combination of both physiological and biochemical variables obtained from multimodal monitoring in the neurocritical care setting as a complement to earlier models. We used decision tree and logistic regression analysis on variables including intracranial pressure (ICP), mean arterial pressure (MAP), cerebral perfusion pressure (CPP), and pressure reactivity index (PRx), as well as multimodal monitoring parameters to assess brain tissue oxygenation (PbtO2), and microdialysis parameters to predict outcomes based on a dichotomized Glasgow Outcome Score. Further analysis was carried out on various subgroup combinations of physiological and biochemical parameters. The reliability of the head injury models was assessed using a 10-fold cross-validation technique. In addition, the confusion matrix was used to assess the sensitivity, specificity, and the F-ratio. In all, 2,413 time series records were extracted from 26 patients treated at our neurocritical care unit over a 1-year period. Decision tree analysis was found to be superior to logistic regression analysis in predictive accuracy of outcome. The combined use of microdialysis variables and PbtO2, in addition to ICP, MAP, and CPP, was found to have the best predictive accuracy. The use of physiological and biochemical variables based on a decision tree analysis model has been shown to provide an improvement in predictive accuracy compared with previous models. The potential application is outcome prediction in the multivariate setting of advanced multimodality monitoring, and the results validate the use of multimodal monitoring in the neurocritical care setting as potentially beneficial for predicting outcomes of patients with severe head injury.
- Research Article
- 10.53713/nhsj.v4i3.318
- Sep 10, 2024
- Nursing and Health Sciences Journal (NHSJ)
No studies have analyzed pathways for predicting the experience of cognitive dysfunction in the elderly by considering various characteristics, especially sleep duration. Thus, this study aimed to predict the experience of cognitive dysfunction according to sleep duration in older individuals. This cross-sectional study used data from 3,361 older individuals from the 2021 Community Health Survey (CHS). Participants were divided into two groups according to their experience of cognitive dysfunction (yes or no). Sleep duration was categorized into the following three groups: lack of sleep (<6h), normal sleep (6 to <10h), and oversleep (≥10h). Decision tree and logistic regression analyses were used to identify factors related to cognitive dysfunction in the elderly. According to the decision model, those who slept for ≥10h and had depression experienced the highest rate (89.2%) of cognitive dysfunction. In contrast, people aged 65-74 years with a lack of sleep or average sleep duration and low stress levels were the least likely to experience cognitive dysfunction (63.0%). Community-based programs to improve cognition in the elderly, and healthcare providers caring for the elderly, need to continuously assess and consider age, sleep time, and depression to prevent and manage cognitive dysfunction in the elderly.
- Research Article
- 10.15294/sji.v10i2.44027
- May 19, 2023
- Scientific Journal of Informatics
Purpose: The goal of this research is to create a precise prediction model that can differentiate between spiral and non-spiral galaxies using the Zoo galaxy dataset. Decision tree analysis and random forest models will be used to construct the model, and various conditions within the dataset will be employed to classify the data accurately. The model's performance will be evaluated using a confusion matrix, and the probability of predicting spiral galaxies will be analyzed. The research will also investigate the differences in Total Power among signal types and identify Peak Frequency and Bandwidth values consistent across all signal types. This study is expected to provide important insights into galaxy classification and signal characteristics, specifically in the fields of astronomy and astrophysics. Methods: This study utilized the decision tree analysis research method to create a predictive model for identifying spiral galaxies using the Zoo galaxy dataset. The research approach focused on analyzing data before constructing a prediction model. The study did not involve random sampling, making it an observational study. Decision tree analysis was employed to classify galaxies into homogeneous groups, and a random forest model was used to classify galaxy types. This research provides insights into how decision tree analysis can be utilized to comprehend galaxy classification and can serve as a foundation for future research. To strengthen the conclusions, combining this research with other approaches such as experiments or random sampling can be considered. Result: This study developed a predictive model for classifying galaxies based on their Spiral type using decision tree analysis on the Zoo galaxy dataset. The model divided the data into specific groups based on certain conditions, and the results demonstrated exceptional accuracy of the random forest model in categorizing galaxy types. In addition, the study investigated various signal types in galaxies and found variations in Total Power, but consistent values for Peak Frequency and Bandwidth at 2 in all signals. These findings provide valuable insights into galaxy classification and signal characteristics, which could have practical applications in communication, signal processing, and analysis. The utilization of decision tree analysis and random forest models for galaxy classification and signal analysis represents an innovative approach in this field. Novelty: The novelty of this research lies in the new approach to categorizing galaxy types using decision tree and random forest models. Previously, the approach used to categorize galaxy types was through visual methods and observations via telescopes. This new approach provides a new and potentially more efficient way of processing galaxy image data, resulting in faster and more accurate categorization. Moreover, this research contributes to the development of signal analysis applications such as Total Power, Peak Frequency, and Bandwidth, which were previously only used in the fields of astronomy and astrophysics. However, they have the potential for wider applications in the fields of communication, signal processing, and analysis beyond astronomy.
- Research Article
- 10.1080/07317115.2025.2597965
- Dec 11, 2025
- Clinical Gerontologist
Objectives Young-onset dementia (YOD) presents unique care challenges, particularly due to behavioral and psychological symptoms of dementia (BPSD). BPSD impacts long-term care acceptance; nonetheless, which symptoms most influence facility acceptance decisions remains understudied. We aimed to investigate how long-term care insurance (LTCI) facilities’ acceptance policies for individuals with YOD relate to their perceptions of BPSD difficulty. Methods A cross-sectional survey was conducted in 360 LTCI facilities in Sapporo City. Perceived difficulty of 12 BPSD domains and facility acceptance policies were assessed. Statistical analyses included chi-squared tests, decision tree analysis, and logistic regression. Results Eight BPSD – such as delusions, anxiety, and nighttime behavioral disturbances – were significantly associated with negative acceptance policies. Decision tree and regression analyses showed that facilities perceiving nighttime disturbances and irritability as difficult were significantly less likely to accept individuals with YOD. Conversely, universally challenging symptoms such as agitation/aggression did not distinguish acceptance decisions. Conclusions Specific combinations of perceived BPSD difficulties – particularly nighttime disturbances and irritability – were associated with the willingness of these facilities to accept individuals with YOD. Clinical Implications Targeted training focusing on nighttime disturbances, irritability, and delusions, along with enhanced information-sharing and YOD-specific support networks, may reduce care barriers and promote acceptance in LTCI settings.
- Research Article
- 10.1080/21622965.2025.2526380
- Jul 15, 2025
- Applied Neuropsychology: Child
Executive functions are fundamental to the success of students in higher education. Our objective was to develop an explanatory model based on the interaction of executive functions. This study used a cross-sectional design with a sample of 1,233 Cuban university students. The Cuban adaptation of the University Executive Function Scale, which assesses seven executive function dimensions, was employed. In the first phase, descriptive statistics were used to analyze the scores of these functions in the studied population, identifying those with the lowest performance. The second phase applied a decision tree analysis using the CHAID method, considering risk and accuracy estimators, to determine the main predictors of executive functions. Finally, in the third phase, a structural equation model was developed to examine the relationships between variables and the predictors of the least developed executive functions, assessing model fit using the CFI, TLI, RMSEA, and SRMR indices. The results indicate that the majority of scores in the executive functions of Cuban university students fall within the average range, although below-average scores were observed in Conscious Regulation of Behavior and Conscious Monitoring of Responsibilities. The decision tree analysis identified that the Supervisory Attention System is the main predictor of Conscious Monitoring of Responsibilities, while Conscious Regulation of Emotions emerged as the strongest predictor of Conscious Regulation of Behavior. Structural equation models reveal that the Supervisory Attention System and Verification of Behavior for Learning are key predictors of Conscious Monitoring of Responsibilities, and that the latter also positively influences Emotional and Behavioral Regulation.
- Research Article
82
- 10.1097/00005768-199808000-00007
- Aug 1, 1998
- Medicine & Science in Sports & Exercise
Few studies have examined the relationship between directly measured oxygen uptake (VO2) and self-reported physical function (PF). The purpose of this study was: 1) to examine the relationship between peak VO2 and PF and 2) to determine whether a threshold or cut point exists that distinguishes between individuals reporting required assistance in the performance of functional tasks (low PF) and those who report ability to perform tasks independently (high PF). Participants were 161 community-dwelling adults, ages 65-90, who had a baseline evaluation for a clinical trial that included measurement of peak VO2. PF consisted of a summary score combining scores from the Older Americans Resources and Services Multidimensional Functional Assessment Questionnaire, Nagi Disability Study, Rosow-Breslau Scale, Physical Function Scale of the Medical Outcomes Study, and the Falls Efficacy Scale. Decision tree, cubic spline, and logistic regression analyses explored these relationships with age, gender, education, race, body mass index, depression, and total number of chronic diseases included as important covariates. Among all covariates examined, peak VO2 was most strongly associated with PF (P = 0.004). There was no threshold effect. Decision tree analyses indicated that 18.3 mL·kg⁻¹·min⁻¹ was the optimal cut point distinguishing between low PF and high PF (P < 0.0001). Between-gender differences in PF (P = 0.002) were no longer significant when peak VO2 was included in the PF model (P = 0.17). These data indicate that individuals with a VO2 < 18 mL·kg⁻¹·min⁻¹ report significant difficulty in the performance of daily tasks and that differences in peak VO2 may explain, in part, why women report more impairment in PF.
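The abstract above reports a decision-tree cut point on a single continuous predictor. As a loose illustration of how a tree can locate such a cut point, the sketch below tries every midpoint between adjacent observed values and keeps the one with the lowest weighted Gini impurity. The Gini criterion is an assumption chosen for simplicity; the abstract does not state which splitting rule the authors used. The helper names and the eight data points are invented for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_cut_point(values, labels):
    """Return (cut, weighted_impurity) for the best single split on one variable."""
    pairs = list(zip(values, labels))
    unique_values = sorted(set(values))
    n = len(pairs)
    best_cut, best_score = None, float("inf")
    # Candidate cuts are midpoints between adjacent distinct values.
    for low, high in zip(unique_values, unique_values[1:]):
        cut = (low + high) / 2.0
        left = [y for x, y in pairs if x < cut]
        right = [y for x, y in pairs if x >= cut]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_cut, best_score = cut, score
    return best_cut, best_score


# Invented, perfectly separable example data (not from the study):
# peak VO2 values and a 0/1 flag for "low physical function".
peak_vo2 = [12.0, 14.5, 16.0, 17.5, 19.0, 21.0, 24.0, 27.5]
low_pf = [1, 1, 1, 1, 0, 0, 0, 0]

cut, impurity = best_cut_point(peak_vo2, low_pf)
print(f"best cut point: {cut}, weighted Gini impurity: {impurity}")
```

Real data are rarely this cleanly separable, so the chosen cut would normally leave some impurity on both sides and would need validation on held-out data.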
- Conference Article
16
- 10.2118/194795-ms
- Mar 15, 2019
As oil prices fluctuate, decision makers are challenged to make the "best" decisions for field developments. Decision Tree Analysis (DTA) can help decision makers to make the "best" decisions. DTA focuses on managerial decisions, such as whether to do a workover or not, or whether additional information will be valuable or not. The aim of this work is to review the applications of DTA in petroleum engineering and provide a clear methodology on how to apply DTA for any petroleum engineering application. The combination of Expected Monetary Value (EMV) and DTA is one of the most common methods used in the decision-making process. If EMV is positive, the decision is considered to be feasible. However, that does not mean the decision will be successful at all times. It simply means that if a similar decision is made for a larger number of cases, the decision will be successful. DTA will account for the uncertainty in the probability. A good number of papers about the applications of DTA in petroleum engineering were read and summarized into three categories. Also, a clear methodology on how to apply DTA for any petroleum engineering application was established. After reading and summarizing a good number of papers and case histories about the applications of DTA in petroleum engineering, it was concluded that the applications can be classified into three main categories: applications of DTA and EMV for whole oil and gas prospect projects; applications of DTA and EMV for a specific operation or development; and applications of DTA, EMV, Monte Carlo simulations, and other methods to assess the value of information. These applications were summarized into tables. In addition, a clear methodology, accompanied by a flowchart that explains how to successfully apply EMV and DTA for any petroleum engineering application, was provided. The method consists of three main steps: 1) determine how many scenarios need to be considered and what they are; 2) collect the required data; 3) use the visual tool (DTA) or programming to find EMV. Each of these steps has its own challenges; these challenges were addressed and solutions to overcome them were provided. Finally, practical guidelines were developed that, when used with the accompanying flowchart, serve as a quick reference for applying DTA to any petroleum engineering application. As petroleum engineering applications become more complicated, accompanied by oil price fluctuations, decision-making processes become more difficult. DTA is a very important tool for decision makers to make the "best" decision. This paper provides a clear methodology on how to successfully apply DTA, which can serve as a reference for any future DTA applications in petroleum engineering.
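The three steps above (list the scenarios, collect the data, compute EMV) can be sketched for a single decision node as follows. This is only an illustrative sketch: the function name, the two alternatives, and every probability and payoff are hypothetical and do not come from the paper. EMV is computed with its usual definition, the probability-weighted sum of outcome values.

```python
def expected_monetary_value(outcomes):
    """EMV of one alternative: sum of probability * monetary value over its outcomes."""
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * value for p, value in outcomes)


# Step 1: scenarios for each alternative (hypothetical workover decision).
# Step 2: data for each scenario as (probability, net monetary value in USD).
decisions = {
    "workover": [(0.6, 500_000.0), (0.4, -200_000.0)],
    "no_workover": [(1.0, 0.0)],
}

# Step 3: compute EMV per alternative and pick the largest.
emv = {name: expected_monetary_value(outcomes) for name, outcomes in decisions.items()}
best = max(emv, key=emv.get)

for name, value in emv.items():
    print(f"{name}: EMV = {value:,.0f}")
print(f"preferred alternative: {best}")
```

A positive EMV marks an alternative as feasible in the sense used above, but, as the abstract notes, it describes the average over many similar decisions rather than the outcome of any single one.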
- Research Article
2
- 10.14309/00000434-201810001-00073
- Oct 1, 2018
- American Journal of Gastroenterology
Introduction: Though cyst fluid glucose levels have been shown to be a reliable marker for differentiating Mucinous and Non-Mucinous pancreatic cysts, a number of cysts have a more complex association with CEA and amylase enzyme values. Our project aimed to utilize machine learning to develop a decision tree that will provide clinicians with a practical tool to optimally determine the mucinous character of the lesion from lab results alone. Methods: This is a retrospective study conducted at a high-volume advanced endoscopy center. Clinical and lab data were abstracted from the electronic medical records of all patients who underwent pancreatic cyst aspiration between June 2015 and November 2017. Statistical analysis was done using STATA v 15.1. A decision tree was built using the J48 classifier algorithm in Weka 3.8.0. Factors used for data modelling were age, gender, diabetic status, and levels of glucose, CEA and amylase. The decision tree thus generated was validated using 10-fold cross validation. Results: A total of 57 lesions were included in the analysis. The average age of the cohort was 62.6 years (SD ±16.4). The most common diagnosis was pseudocyst (n=21, 36.8%), followed by 20 IPMNs (14 branch duct (24.6%), 3 main duct (5.3%), and 3 mixed duct cysts (5.3%)), 11 mucinous cystadenomas (19.3%), 4 lymphoepithelial cysts (7.0%), and 1 ciliated foregut cyst (1.6%). Thus, there were 31 Mucinous and 26 Non-Mucinous pancreatic cysts in our cohort. Mucinous cysts had higher CEA levels compared to Non-Mucinous cysts (median 56.4 ng/mL vs 3.1 ng/mL, Mann-Whitney U, p-value = 0.0018). However, glucose and amylase levels were lower in Mucinous cysts compared to Non-Mucinous cysts (both Mann-Whitney U, p-value = 0.0002 and 0.0107, respectively). The decision tree algorithm revealed that pancreatic cysts will be non-mucinous if the glucose level is greater than 16 mg/dL. Additionally, cysts will be non-mucinous if their CEA is < 3.1 ng/mL and glucose levels are undetectable. This model thus correctly classified 77.2% of all cysts and had a kappa statistic of 0.5473. Its ROC was 0.757 and r2-statistic was 0.85. Conclusion: Our decision tree analysis reveals that in addition to the expected non-mucinous cysts with high glucose values, there exists a small group of zero-glucose, low-CEA cysts with CEA < 3.1 ng/mL and undetectable glucose levels. Our decision tree can be used by endoscopists in clinical practice as a simple and easy tool to predict the mucinous/non-mucinous character of pancreatic cysts.
Figure 1. Pancreatic Cysts Descriptive Statistics
Figure 2. Pancreatic Cyst Fluid Decision Tree Analysis - Results
Figure 3. Pancreatic Cyst Decision Tree Model
- Research Article
3
- 10.3785/j.issn.1008-9292.2019.12.02
- Dec 25, 2019
- Zhejiang da xue xue bao. Yi xue ban = Journal of Zhejiang University. Medical sciences
To evaluate the application of the decision tree method and logistic regression in the prediction of acute myocardial infarction (AMI) events. The clinical data of 295 patients who underwent coronary angiography due to angina or chest pain with unidentified causes in Zhejiang Provincial People's Hospital between October 2018 and April 2019 were retrospectively analyzed. Fifty-five patients were identified as having AMI. Logistic regression and decision tree methods were used to establish predictive models for the occurrence of AMI; the models created by decision tree analysis were divided into a logistic regression-independent model (Tree 1) and a logistic regression-dependent model (Tree 2). The performance of the logistic regression and decision tree models was compared using the area under the receiver operating characteristic (ROC) curve. Logistic regression analysis showed that history of coronary artery disease, multi-vessel coronary artery disease, statin use and apolipoprotein (ApoA1) level were independent influencing factors of AMI events (all P<0.05). The logistic regression-independent decision tree model (Tree 1) showed that multi-vessel coronary artery disease was the root node, and history of coronary artery disease, ApoA1 level (cutoff value: 1.314 g/L) and anti-platelet drug use were descendant nodes. In the logistic regression-dependent decision tree model (Tree 2), multi-vessel coronary artery disease was still the root node, but was followed by only two descendant nodes: history of coronary artery disease and ApoA1 level. The area under the curve (AUC) of the ROC for the logistic regression model was 0.826, and the AUCs of the decision tree models were 0.765 and 0.726, respectively. The AUC of the logistic regression model was significantly higher than that of Tree 2 (95% CI=0.041-0.145, Z=3.534, P<0.001), but was not higher than that of Tree 1 (95% CI=-0.014-0.121, Z=-1.173, P>0.05). The predictive value for AMI events was comparable between the logistic regression-independent decision tree model and the logistic regression model, implying that data mining methods are feasible and effective in AMI prevention and control.