A design methodology of crystal growth furnace and process aided by two-step optimization using machine learning models and genetic algorithm
- Research Article
- Citations: 2
- 10.1016/j.cjche.2024.04.015
- May 18, 2024
- Chinese Journal of Chemical Engineering
Thiourea crystal growth kinetics, mechanism and process optimization during cooling crystallization
- Research Article
- Citations: 9
- 10.3390/s17061248
- May 31, 2017
- Sensors (Basel, Switzerland)
Many techniques are used to monitor one or more of the phenomena involved in the crystallization process. One of the challenges in crystal growth monitoring is finding techniques that allow direct interpretation of the data. The present study used a low-cost system, composed of a commercial webcam and a simple white LED (light-emitting diode) illuminator, to follow the calcium carbonate crystal growth process. The experiments were also followed with focused beam reflectance measurement (FBRM), a common technique for obtaining information about the formation and growth of crystals. The images obtained in real time were processed in the red, green, and blue (RGB) color system. The results showed a qualitative response of the system to crystal formation and growth processes, as there was an observed decrease in the signal as the growth process occurred. Control of the crystal growth was managed by increasing the viscosity of the test solution with the addition of monoethylene glycol (MEG) at 30% and 70% (mass/mass), providing different profiles of the average RGB curves. The decrease in the average RGB value became slower as the concentration of MEG was increased; this reflected a lag in the growth process that was confirmed by the FBRM measurements.
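To make the monitoring idea above concrete, the following is a minimal Python sketch (not the authors' code) that extracts a per-frame average-RGB trace from a recorded webcam video; the OpenCV dependency and the file name "growth_run.avi" are assumptions for illustration.

```python
# A minimal sketch of per-frame average-RGB monitoring (not the authors'
# code); assumes OpenCV and a hypothetical video file "growth_run.avi".
import cv2
import numpy as np

def mean_rgb_trace(video_path):
    """Return one mean intensity per colour channel for every frame."""
    cap = cv2.VideoCapture(video_path)
    trace = []
    while True:
        ok, frame = cap.read()             # OpenCV frames arrive in BGR order
        if not ok:
            break
        b, g, r = frame.mean(axis=(0, 1))  # spatial mean of each channel
        trace.append((r, g, b))
    cap.release()
    return np.array(trace)

rgb = mean_rgb_trace("growth_run.avi")     # shape: (n_frames, 3)
overall = rgb.mean(axis=1)                 # single average-RGB curve
```

A steadily decreasing `overall` curve would correspond to the signal drop the authors associate with crystal formation and growth.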
- Research Article
- Citations: 2
- 10.3390/en17122840
- Jun 9, 2024
- Energies
Accurate prediction of energy consumption in district heating systems plays an important role in supporting effective and clean energy production and distribution in dense urban areas. Predictive models are needed for flexible and cost-effective operation of energy production and usage, e.g., using peak shaving or load shifting to compensate for heat losses in the pipeline. This helps to avoid exceeding power plant capacity. The purpose of this study is to automate the process of building machine learning (ML) models to solve a short-term power demand prediction problem. The dataset contains a district heating network's measured hourly power consumption and ambient temperature for 415 days. In this paper, we propose a hybrid evolutionary algorithm, named GA-SHADE, for the simultaneous optimization of ML models and feature selection. GA-SHADE combines a genetic algorithm (GA) with success-history-based parameter adaptation for differential evolution (SHADE). The results of the numerical experiments show that the proposed GA-SHADE algorithm identifies simplified ML models with good prediction performance in terms of both the optimized feature subset and the model hyperparameters. The main contributions of the study are as follows: (1) using the proposed GA-SHADE, ML models with varying numbers of features and levels of performance are obtained; (2) the proposed GA-SHADE algorithm self-adapts during operation and has only one control parameter, so no fine-tuning is required before execution; (3) owing to its evolutionary nature, the algorithm is not sensitive to the number of features and hyperparameters to be optimized in the ML models. In conclusion, this study confirms that each optimized ML model uses a unique set and number of features. Of the six ML models considered, SVR and NN are the best candidates, demonstrating the best performance across several metrics. All numerical experiments were compared against the measurements and validated by standard statistical tests.
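As a rough illustration of optimizing a feature subset and hyperparameters in a single evolutionary loop, here is a toy GA sketch in Python; it is not the GA-SHADE implementation, and the population size, mutation rates, and SVR parameter ranges are arbitrary assumptions.

```python
# Toy GA sketch: one chromosome holds a binary feature mask plus two SVR
# hyperparameters, so features and hyperparameters evolve together. This is
# NOT GA-SHADE; population size, mutation rates, and ranges are assumptions.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X, y = make_regression(n_samples=200, n_features=12, noise=10.0, random_state=0)

def fitness(ind):
    mask = ind[:12].astype(bool)            # which features to keep
    if not mask.any():
        return -np.inf                      # empty subsets are invalid
    model = SVR(C=10.0 ** ind[12], gamma=10.0 ** ind[13])
    # negative MSE from 3-fold cross-validation; higher is better
    return cross_val_score(model, X[:, mask], y, cv=3,
                           scoring="neg_mean_squared_error").mean()

def mutate(ind):
    child = ind.copy()
    flip = rng.random(12) < 0.1                     # flip feature bits
    child[:12] = np.where(flip, 1 - child[:12], child[:12])
    child[12:] += rng.normal(0.0, 0.3, size=2)      # jitter hyperparameters
    return child

pop = [np.concatenate([rng.integers(0, 2, 12).astype(float),
                       rng.uniform(-2, 2, 2)]) for _ in range(20)]
for _ in range(15):                                 # evolutionary loop
    elite = sorted(pop, key=fitness, reverse=True)[:10]
    pop = elite + [mutate(elite[rng.integers(0, 10)]) for _ in range(10)]

best = max(pop, key=fitness)
print("selected features:", np.flatnonzero(best[:12]))
```

Encoding the mask and the hyperparameters in one chromosome is what makes the optimization "simultaneous": a candidate is only as fit as its feature subset and model settings are jointly.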
- Research Article
- Citations: 1
- 10.1038/s41598-022-20012-1
- Sep 30, 2022
- Scientific Reports
Deep neural networks (DNNs) have shown success in image classification, with high accuracy in recognition of everyday objects. Performance of DNNs has traditionally been measured assuming human accuracy is perfect. In specific problem domains, however, human accuracy is less than perfect, and a comparison between humans and machine learning (ML) models can be performed. In recognising everyday objects, humans have the advantage of a lifetime of experience, whereas DNN models are trained only with a limited image dataset. We have tried to compare the performance of human learners and two DNN models on an image dataset which is novel to both, i.e. histological images. We thus aim to eliminate the advantage of prior experience that humans have over DNN models in image classification. Ten classes of tissues were randomly selected from the undergraduate first-year histology curriculum of a medical school in North India. Two ML models were developed based on the VGG16 (VML) and Inception V2 (IML) DNNs, using transfer learning, to produce a 10-class classifier. One thousand (1000) images belonging to the ten classes (i.e. 100 images from each class) were split into training (700) and validation (300) sets. After training, the VML and IML models achieved 85.67% and 89% accuracy on the validation set, respectively. The training set was also circulated to medical students (MS) of the college for a week. An online quiz, consisting of a random selection of 100 images from the validation set, was conducted on students (after obtaining informed consent) who volunteered for the study. Sixty-six students participated in the quiz, providing 6557 responses. In addition, we prepared a set of 10 images which belonged to different classes of tissue not present in the training set (i.e. out-of-training-scope, or OTS, images). A second quiz was conducted on medical students with OTS images, and the ML models were also run on these OTS images. The overall accuracy of MS in the first quiz was 55.14%. The two ML models were also run on the first quiz questionnaire, producing accuracy between 91% and 93%. The ML models outscored more than 80% of the medical students. Analysis of confusion matrices of both ML models and all medical students showed dissimilar error profiles. However, when comparing the subset of students who achieved accuracy similar to the ML models, the error profile was also similar. Recognition of 'stomach' proved difficult for both humans and ML models. For four images in the first quiz set, both the VML model and the medical students produced highly equivocal responses. Within these images, a pattern of bias was uncovered: the tendency of medical students to misclassify 'liver' tissue. The 'stomach' class proved most difficult for both MS and VML, producing 34.84% of all errors of MS and 41.17% of all errors of the VML model; the IML model, however, committed most errors in recognising the 'skin' class (27.5% of all errors). Analysis of the convolution layers of the DNN outlined features in the original image which might have led to misclassification by the VML model. On OTS images, however, the medical students produced a better overall score than both ML models, i.e. they successfully recognised patterns of similarity between tissues and could generalise their training to a novel dataset. Our findings suggest that within the scope of training, ML models perform better than 80% of medical students, with a distinct error profile. However, students who reached accuracy close to that of the ML models tended to replicate the error profile of the ML models. This suggests a degree of similarity between how machines and humans extract features from an image. When asked to recognise images outside the scope of training, humans perform better at recognising patterns and likeness between tissues. This suggests that 'training' is not the same as 'learning', and humans can extend their pattern-based learning to domains outside of the training set.
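For readers unfamiliar with the transfer-learning setup described above, a minimal Keras sketch of a frozen-VGG16, 10-class classifier follows; the classifier head, optimizer, and input size are plausible guesses, not the study's reported configuration.

```python
# Minimal Keras sketch of frozen-VGG16 transfer learning for a 10-class
# problem; head architecture and hyperparameters are illustrative guesses.
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False                       # keep pretrained features fixed

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax"),  # one unit per tissue class
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)  # datasets assumed
```

Freezing the convolutional base is what lets a 700-image training set suffice: only the small classification head is learned from scratch.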
- Research Article
- Citations: 18
- 10.3390/app13106138
- May 17, 2023
- Applied Sciences
A mortality prediction model can be a great tool to assist physicians in decision making in the intensive care unit (ICU) in order to ensure optimal allocation of ICU resources according to the patient's health conditions. The entire world witnessed a severe ICU patient capacity crisis a few years ago during the COVID-19 pandemic. Various widely utilized machine learning (ML) models in this research field can perform poorly due to a lack of proper feature selection. Although nature-based algorithms perform well for feature selection in other sectors, no comparative study of their feature-selection performance has been conducted in the ICU mortality prediction field. Therefore, in this research, a comparison of the performance of ML models with and without feature selection was performed. In addition, explainable artificial intelligence (AI) was used to examine the contribution of features to the decision-making process. Explainable AI focuses on establishing transparency and traceability for statistical black-box machine learning techniques. Explainable AI is essential in the medical industry to foster public confidence and trust in machine learning model predictions. Three nature-based algorithms, namely the flower pollination algorithm (FPA), particle swarm optimization (PSO), and the genetic algorithm (GA), were used in this study. For the classification task, the most widely used and diverse classifiers from the literature were employed, including logistic regression (LR), the decision tree (DT) classifier, the gradient boosting (GB) algorithm, and the random forest (RF) algorithm. The Medical Information Mart for Intensive Care III (MIMIC-III) dataset was used to collect data on heart failure patients. On the MIMIC-III dataset, it was discovered that feature selection significantly improved the performance of the described ML models. Without applying any feature selection process on the MIMIC-III heart failure patient dataset, the accuracy of the four mentioned ML models, namely LR, DT, RF, and GB, was 69.9%, 82.5%, 90.6%, and 91.0%, respectively, whereas with feature selection in combination with the FPA, the accuracy increased to 71.6%, 84.8%, 92.8%, and 91.1%, respectively, for the same dataset. Again, the FPA showed the highest area under the receiver operating characteristic curve (AUROC) value of 83.0% with the RF algorithm among all other algorithms utilized in this study. Thus, it can be concluded that the use of feature selection with the FPA has a profound impact on the outcome of ML models. Shapley additive explanation (SHAP) was used in this study to interpret the ML models. SHAP was used because it offers mathematical assurances for the precision and consistency of explanations; it is trustworthy and suitable for both local and global explanations. It was found that the features selected by SHAP as most important largely overlapped with the features selected by the FPA. Therefore, we hope that this study will help physicians to predict ICU mortality for heart failure patients with a limited number of features and with high accuracy.
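The SHAP interpretation step the authors describe can be sketched as below; the random-forest settings and the synthetic stand-in data are assumptions, and the handling of the returned SHAP values covers both the list and array return formats of the shap library.

```python
# Hedged sketch of SHAP interpretation of a random-forest classifier; the
# features here are synthetic placeholders, not the MIMIC-III variables.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.random((500, 8))                         # stand-in patient features
y = (X[:, 0] + 0.5 * X[:, 3] > 0.9).astype(int)  # stand-in mortality label

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
explainer = shap.TreeExplainer(rf)               # exact SHAP values for trees
sv = explainer.shap_values(X)
# Older shap versions return one array per class; newer ones a single array.
sv_pos = sv[1] if isinstance(sv, list) else sv[..., 1]
importance = np.abs(sv_pos).mean(axis=0)         # global importance ranking
print("features ranked by mean |SHAP|:", np.argsort(importance)[::-1])
```

A ranking like this is what the authors cross-checked against the FPA-selected feature subset.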
- Research Article
- Citations: 2
- 10.1108/ecam-06-2024-0706
- Sep 26, 2024
- Engineering, Construction and Architectural Management
Purpose: The cash flow from government agencies to contractors, called progress payment, is a critical step in public projects. Delays in progress payments significantly affect the project performance of contractors and lead to conflicts between the two parties in the Turkish construction industry. Although some previous studies have focused on issues in the internal cash flows (e.g. inflows and outflows) of construction companies, the context of cash flows from public agencies to contractors in public projects is still unclear. Therefore, the primary objective of this study is to develop and test diverse machine learning-based predictive models of the progress payment performance of Turkish public agencies, and to improve the predictive performance of these models with two different optimization algorithms (first-order and second-order). In addition, this study explored the attributes that make the most significant contribution to predicting the payment performance of Turkish public agencies. Design/methodology/approach: In total, project information for 2,319 building projects tendered by Turkish public agencies was collected. Six different machine learning algorithms were developed and two different optimization methods were applied to achieve the best machine learning (ML) model for Turkish public agencies' cash flow performance. The current research tested the effectiveness of each optimization algorithm for each ML model developed. In addition, the effect size achieved in the ML models was evaluated and ranked for each attribute, making it possible to observe which attributes contribute significantly to predicting the cash flow performance of Turkish public agencies. Findings: The results show that the attributes "inflation rate" (F5; 11.2%), "total project duration" (T1; 10.9%) and "consumer price index" (F6; 10.55%) are the most significant factors affecting the progress payment performance of government agencies. While the decision tree (DT) shows the best performance among the ML models before the optimization process, the prediction performance of the support vector machine (SVM) and genetic algorithm (GA) models was significantly improved by the Broyden–Fletcher–Goldfarb–Shanno (BFGS)-based quasi-Newton optimization algorithm, by 14.3% and 18.65%, respectively, based on accuracy, AUROC (area under the receiver operating characteristic curve) and F1 values. Practical implications: The most effective ML model can be used and integrated into proactive systems in real Turkish public construction projects, supporting the management of cash flow issues from public agencies to contractors and reducing conflicts between the two parties. Originality/value: The development and comparison of various predictive ML models of the progress payment performance of Turkish public owners in construction projects is the first empirical attempt of its kind in the body of knowledge. This study was carried out using a large volume of project information with 27 diverse attributes, which distinguishes it in the body of knowledge. For the optimization process, a Bayesian hyperparameter tuning strategy was adopted for the two optimization methods. Thus, it is possible to find the best predictive model to be integrated into real proactive systems for forecasting the cash flow performance of Turkish public agencies in public works projects.
This study will also make novel contributions to the body of knowledge in understanding the key parameters that have a negative impact on the payment progress of public agencies.
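As a generic illustration of the second-order (quasi-Newton) optimization mentioned in the findings, the sketch below fits logistic-regression weights with SciPy's BFGS routine; the synthetic features and the model form are placeholders, not the study's actual models or its 27 project attributes.

```python
# Generic second-order (quasi-Newton) illustration: fitting logistic-
# regression weights with SciPy's BFGS on synthetic placeholder data.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))            # stand-ins for project attributes
true_w = np.array([1.2, -0.8, 0.5, 0.0, 0.3])
y = (1.0 / (1.0 + np.exp(-X @ true_w)) > rng.random(300)).astype(float)

def nll(w):
    z = X @ w
    # numerically stable negative log-likelihood of logistic regression
    return np.sum(np.logaddexp(0.0, z) - y * z)

res = minimize(nll, x0=np.zeros(5), method="BFGS")  # quasi-Newton updates
print("fitted weights:", res.x.round(2))
```

BFGS builds an approximation to the inverse Hessian from successive gradients, which is the sense in which such methods are "second-order" without ever forming the Hessian explicitly.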
- Research Article
- Citations: 342
- 10.1016/j.actamat.2019.11.067
- Dec 4, 2019
- Acta Materialia
Phase prediction in high entropy alloys with a rational selection of materials descriptors and machine learning models
- Research Article
- Citations: 37
- 10.1016/j.jcrysgro.2020.125828
- Aug 6, 2020
- Journal of Crystal Growth
Optimization of the melt/crystal interface shape and oxygen concentration during the Czochralski silicon crystal growth process using an artificial neural network and a genetic algorithm
- Preprint Article
- 10.5194/egusphere-egu23-11636
- May 15, 2023
In recent years, machine learning (ML) models have proven useful for solving problems in a wide variety of fields, such as medicine, economics, manufacturing, transportation, energy, and education. With increased interest in ML models and advances in sensor technologies, ML models are being widely applied even in the civil engineering domain. ML models enable the analysis of large amounts of data, automation, and improved decision making, and provide more accurate predictions. While several state-of-the-art reviews have been conducted in individual sub-domains of civil engineering (e.g., geotechnical engineering, structural engineering) or on specific application problems (e.g., structural damage detection, water quality evaluation), little effort has been devoted to a comprehensive review of ML models applied in civil engineering that compares them across sub-domains. A systematic but domain-specific literature review framework should be employed to effectively classify and compare the models. To that end, this study proposes a novel review approach based on the hierarchical classification tree "D-A-M-I-E" (Domain, Application problem, ML models, Input data, Example case). The "D-A-M-I-E" classification tree classifies ML studies in civil engineering based on (1) the domain of civil engineering, (2) the application problem, (3) the applied ML models and (4) the data used in the problem. Moreover, the data used for the ML models in each application example are examined based on the specific characteristics of the domain and the application problem. For a comprehensive review, five different domains (structural engineering, geotechnical engineering, water engineering, transportation engineering and energy engineering) are considered, and the ML application problem is divided into five different problems (prediction, classification, detection, generation, optimization). Based on the "D-A-M-I-E" classification tree, about 300 ML studies in civil engineering are reviewed. For each domain, analysis and comparison were conducted on the following questions: (1) which problems are mainly solved with ML models, (2) which ML models are mainly applied in each domain and problem, (3) how advanced the ML models are and (4) what kinds of data are used and what data processing is performed for the application of ML models. This paper also assesses the extension and applicability of the proposed methodology to other areas (e.g., Earth system modeling, climate science). Furthermore, based on the identification of research gaps for ML models in each domain, this paper provides future directions for ML in civil engineering based on approaches to handling data (e.g., collection, handling, storage, and transmission) and aims to support the application of ML models in other fields.
- Research Article
- Citations: 6
- 10.1016/j.pcrysgrow.2016.04.023
- Jun 1, 2016
- Progress in Crystal Growth and Characterization of Materials
Observing crystal growth processes in computer simulations
- Research Article
- Citations: 1
- 10.1515/revce-2024-0047
- Jan 29, 2025
- Reviews in Chemical Engineering
Amine absorption has been regarded as an efficient solution for reducing the atmospheric carbon dioxide (CO2) concentration. Machine learning (ML) models are applied in the CO2 capture field to predict the CO2 solubility in amine solvents. Although there are other similar reviews, this systematic review presents a more comprehensive review of the ML models and their training algorithms applied to predict CO2 solubility in amine-related solvents over the past 10 years. A total of 55 articles were collected from Scopus, ScienceDirect and Web of Science following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The neural network is the most frequently applied model, while the committee machine intelligence system is the most accurate model. However, largely the same optimisation algorithm was applied for each type of ML model. The genetic algorithm has been applied in most of the discussed ML models, yet only a limited number of studies were found. The advantages and limitations of each ML model are discussed. The findings of this review could provide a database of data points for future research, as well as information for future researchers studying ML applications in amine absorption, including but not limited to the implementation of different optimisation algorithms, structure optimisation and larger-scale applications.
- Research Article
- Citations: 24
- 10.1175/jcli-d-21-0113.1
- Jun 8, 2021
- Journal of Climate
In this study, four machine learning (ML) models (gradient boosting decision tree (GBDT), light gradient boosting machine (LightGBM), categorical boosting (CatBoost) and extreme gradient boosting (XGBoost)) are used to perform seasonal forecasts of non-monsoonal winter precipitation over the Eurasian continent (30-60°N, 30-105°E) (NWPE). The seasonal forecast results are compared with those of a traditional linear regression (LR) model and two dynamic models. The ML and LR models are trained using data for the period 1979-2010, and these empirical models are then used to perform seasonal forecasts of the NWPE for 2011-2018. Our results show that the four ML models have reasonable seasonal forecast skill for the NWPE and clearly outperform the LR model. The ML models and the dynamic models produce skillful forecasts of the NWPE over different regions. The ensemble mean of the forecasts including both the ML models and the dynamic models shows higher forecast skill for the NWPE than the ensemble mean of the dynamic-only models. The forecast skill of the ML models mainly benefits from a skillful forecast of the third empirical orthogonal function (EOF) mode (EOF3) of the NWPE, which has a good and consistent prediction among the ML models. Our results also illustrate that Arctic sea ice in the previous autumn is the most important predictor in the ML models for forecasting the NWPE. This study suggests that ML models may be useful tools for helping improve seasonal forecasts of the NWPE.
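The train/forecast protocol described above (fit on 1979-2010, predict 2011-2018) can be sketched with scikit-learn's gradient boosting as a stand-in for the four boosted-tree libraries; the predictor matrix here is random placeholder data, not real sea-ice or precipitation fields.

```python
# Sketch of the hindcast protocol: train a boosted-tree regressor on
# 1979-2010 and forecast 2011-2018. All data below are placeholders.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(2)
years = np.arange(1979, 2019)
X = rng.normal(size=(years.size, 6))        # stand-in autumn predictors
y = 0.8 * X[:, 0] + rng.normal(scale=0.5, size=years.size)  # stand-in NWPE index

train, test = years <= 2010, years >= 2011
model = GradientBoostingRegressor(random_state=0).fit(X[train], y[train])
forecast = model.predict(X[test])
skill = np.corrcoef(forecast, y[test])[0, 1]  # anomaly correlation as skill
print(f"2011-2018 forecast correlation: {skill:.2f}")
```

Keeping the verification years entirely out of training, as in this split, is what makes the reported skill an out-of-sample estimate.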
- Research Article
- Citations: 20
- 10.2196/47833
- Nov 20, 2023
- JMIR Medical Informatics
Machine learning (ML) models provide more choices to patients with diabetes mellitus (DM) to more properly manage blood glucose (BG) levels. However, because of the numerous types of ML algorithms, choosing an appropriate model is vitally important. In a systematic review and network meta-analysis, this study aimed to comprehensively assess the performance of ML models in predicting BG levels. In addition, we assessed ML models used to detect and predict adverse BG (hypoglycemia) events by calculating pooled estimates of sensitivity and specificity. PubMed, Embase, Web of Science, and Institute of Electrical and Electronics Engineers (IEEE) Xplore databases were systematically searched for studies on predicting BG levels and predicting or detecting adverse BG events using ML models, from inception to November 2022. Studies that assessed the performance of different ML models in predicting or detecting BG levels or adverse BG events of patients with DM were included. Studies with no derivation or performance metrics of ML models were excluded. The Quality Assessment of Diagnostic Accuracy Studies tool was applied to assess the quality of included studies. Primary outcomes were the relative ranking of ML models for predicting BG levels in different prediction horizons (PHs) and pooled estimates of the sensitivity and specificity of ML models in detecting or predicting adverse BG events. In total, 46 eligible studies were included in the meta-analysis. Regarding ML models for predicting BG levels, the means of the absolute root mean square error (RMSE) in a PH of 15, 30, 45, and 60 minutes were 18.88 (SD 19.71), 21.40 (SD 12.56), 21.27 (SD 5.17), and 30.01 (SD 7.23) mg/dL, respectively. The neural network model (NNM) showed the highest relative performance in different PHs. Furthermore, the pooled estimates of the positive likelihood ratio and the negative likelihood ratio of ML models were 8.3 (95% CI 5.7-12.0) and 0.31 (95% CI 0.22-0.44), respectively, for predicting hypoglycemia and 2.4 (95% CI 1.6-3.7) and 0.37 (95% CI 0.29-0.46), respectively, for detecting hypoglycemia. Statistically significant high heterogeneity was detected in all subgroups, with different sources of heterogeneity. For predicting precise BG levels, the RMSE increases with a rise in the PH, and the NNM shows the highest relative performance among all the ML models. Meanwhile, current ML models have sufficient ability to predict adverse BG events, while their ability to detect adverse BG events needs to be enhanced. PROSPERO CRD42022375250; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=375250.
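For reference, the pooled likelihood ratios quoted above follow from sensitivity and specificity by a simple identity; the sensitivity/specificity values in this sketch are illustrative back-calculations, not figures reported in the review.

```python
# LR+ = sensitivity / (1 - specificity); LR- = (1 - sensitivity) / specificity.
def likelihood_ratios(sensitivity, specificity):
    return (sensitivity / (1.0 - specificity),
            (1.0 - sensitivity) / specificity)

# Illustrative values chosen to land near the pooled 8.3 / 0.31:
print(likelihood_ratios(0.73, 0.91))   # -> (8.11..., 0.296...)
```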
- Research Article
- Citations: 3
- 10.13031/jnrae.15647
- Jan 1, 2023
- Journal of Natural Resources and Agricultural Ecosystems
Highlights:
- Machine learning (ML) models are identified, reviewed, and analyzed for HAB predictions.
- Data preprocessing is vital for efficient ML model development.
- ML models for toxin production and monitoring are limited.
Abstract: Harmful algal blooms (HABs) are detrimental to livestock, humans, pets, the environment, and the global economy, which calls for a robust approach to their management. While process-based models can inform practitioners about HAB enabling conditions, they have inherent limitations in accurately predicting harmful algal blooms. To address these limitations, Machine Learning (ML) models can potentially leverage large volumes of IoT data to aid in near real-time predictions. ML models have evolved as efficient tools for understanding patterns and relationships between water quality parameters and HAB expansion. This review describes ML models currently used for predicting and forecasting HABs in freshwater ecosystems and presents model structures and their application for predicting algal parameters and related toxins. The review revealed that regression trees, random forest, Artificial Neural Network (ANN), Support Vector Regression (SVR), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) are the most frequently used models for HABs monitoring. This review shows ML models' prowess in identifying significant variables influencing algal growth, HAB drivers, and multistep HAB prediction. Hybrid models also improve the prediction of algal-related parameters through improved optimization techniques and variable selection algorithms. While ML models often focus on algal biomass prediction, few studies apply ML models for toxin monitoring and prediction. This limitation can be associated with a lack of high-frequency toxin datasets for model development, and exploring this domain is encouraged. This review serves as a guide for policymakers and researchers to implement ML models for HAB prediction and reveals the potential of ML models for decision support and early prediction for HAB management.
Keywords: Cyanobacteria, Freshwater, Harmful algal blooms, Machine learning, Water quality.
- Research Article
- Citations: 2
- 10.1038/s41598-024-70530-3
- Aug 21, 2024
- Scientific Reports
Microscopic evaluation is one of the most effective methods in materials research. High-quality images are essential for analyzing microscopic images with artificial intelligence, but they are not always obtainable. To overcome this challenge, we propose machine learning on "fake micrographs" in this study. To verify the effectiveness of this method, we chose to analyze optical microscopic images of the crystal growth process of a Ge thin film, a material in which it is difficult to obtain contrast between the crystalline and amorphous states. By learning from automatically generated fake micrographs that mimic the crystal growth process, the machine learning model can identify low-resolution real micrographs as crystalline or amorphous. Comparing three types of machine learning models, it was found that ResUNet++ exhibited high accuracy, exceeding 90%. The technology developed in this study for the automatic and rapid analysis of low-resolution images is broadly useful in materials research.
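A toy version of the "fake micrograph" strategy, reduced to its essentials: synthesize labeled images that mimic crystalline versus amorphous contrast, train only on the synthetic set, and evaluate on held-out images. The texture model and logistic-regression classifier below are deliberate simplifications, not the paper's ResUNet++ pipeline.

```python
# Toy "fake micrograph" sketch: crystalline images get an oriented stripe
# texture on a noisy background; amorphous images are noise only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)

def fake_micrograph(crystalline, size=32):
    img = rng.normal(0.5, 0.1, (size, size))      # amorphous background
    if crystalline:
        xx, yy = np.meshgrid(np.arange(size), np.arange(size))
        angle = rng.uniform(0, np.pi)             # random grain orientation
        img += 0.2 * np.sin((xx * np.cos(angle) + yy * np.sin(angle)) / 2.0)
    return img.ravel()

X = np.array([fake_micrograph(i % 2 == 1) for i in range(400)])
y = np.arange(400) % 2                            # 1 = crystalline
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=2000).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```

The point carried over from the paper is that the labels come for free with the synthetic generator, sidestepping the scarcity of well-contrasted real micrographs.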