Application of XGBoost Algorithm in Sentiment Classification of MOBA Game Reviews on Google Play Store

TL;DR

This study compares XGBoost and Random Forest algorithms for sentiment classification of 10,000 Google Play Store reviews of the GoBiz app, finding XGBoost achieves an accuracy of 86.81% with score-based labeling, while Random Forest reaches 84.98%, demonstrating effective methods for enhancing digital business service quality.

Abstract

In the rapidly evolving digital era, business applications like GoBiz play a crucial role in supporting the operations of Micro, Small, and Medium Enterprises (MSMEs). This study analyzes user sentiment toward the GoBiz app based on Google Play Store reviews by applying two machine learning algorithms: Extreme Gradient Boosting (XGBoost) and Random Forest. Two labeling approaches were used: score-based labeling, which relies on star ratings, and lexicon-based labeling using the VADER method. Data from 10,000 reviews were collected through web scraping and then passed through preprocessing, labeling, TF-IDF feature extraction, model training, and evaluation. The evaluation showed that XGBoost performed best with score-based labeling, reaching the highest accuracy of 86.81%, while Random Forest was more stable under the VADER approach, with an accuracy of 84.98%. Both models performed well, but their effectiveness depended on the type of labeling used. This research contributes to the development of sentiment classification systems for digital business applications and can be used by GoBiz developers to improve service quality based on user perceptions.
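The labeling and feature-extraction steps named in the abstract can be illustrated with a minimal, standard-library sketch. Everything specific here is an assumption made for illustration: the abstract does not give the star-rating cut-offs, the tiny `TOY_LEXICON` is a stand-in and is not the VADER lexicon or its rules, and the TF-IDF variant is the plain textbook form `tf * log(N / df)` rather than whatever the authors used.

```python
import math
from collections import Counter


def label_from_score(stars):
    """Score-based labeling: map a 1-5 star rating to a sentiment label.

    The cut-offs (>=4 positive, <=2 negative) are assumed, not from the paper.
    """
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return "neutral"


# Toy word scores standing in for a sentiment lexicon (NOT the real VADER lexicon).
TOY_LEXICON = {"good": 1.0, "great": 1.5, "easy": 0.8,
               "slow": -0.8, "bad": -1.0, "crash": -1.5}


def label_from_lexicon(text, threshold=0.05):
    """Lexicon-based labeling: sum word scores and threshold the total."""
    score = sum(TOY_LEXICON.get(tok, 0.0) for tok in text.lower().split())
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    weights = []
    for toks in tokenized:
        tf = Counter(toks)
        weights.append({t: (c / len(toks)) * math.log(n_docs / df[t])
                        for t, c in tf.items()})
    return weights


# Hypothetical reviews as (text, star rating) pairs.
reviews = [
    ("great app easy to use", 5),
    ("app is slow and crash often", 1),
    ("good app but slow", 3),
]
score_labels = [label_from_score(stars) for _, stars in reviews]
lexicon_labels = [label_from_lexicon(text) for text, _ in reviews]
weights = tfidf([text for text, _ in reviews])
print(score_labels)
print(lexicon_labels)
```

In this toy data the third review is labeled differently by the two schemes (neutral by stars, positive by lexicon), which is the kind of disagreement that can make a model's accuracy depend on the labeling approach. Library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing and normalization by default, so their weights will differ from this sketch.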

Similar Papers
  • Research Article
  • Cite count: 21
  • 10.1155/2022/2043369
The Construction of Corporate Financial Management Risk Model Based on XGBoost Algorithm
  • Jan 1, 2022
  • Journal of Mathematics
  • Rongyuan Qin

Corporate financial management is a tedious task, and it is difficult to manage by relying solely on financial personnel. The continuous development of intelligent and machine learning algorithms has brought new ideas to enterprise financial risk assessment. Such methods not only save considerable financial and material resources but also improve the accuracy of enterprise financial risk assessment. Compared with machine learning algorithms such as random forests and support vector machines, the extreme gradient boosting (XGBoost) algorithm is more widely used, and it has unique advantages in terms of speed and accuracy. This study selects the XGBoost algorithm to predict risk in corporate finance. The enterprise financial data are first preprocessed and classified, the XGBoost algorithm is then used to assess the risk in those data, and finally an enterprise financial risk assessment model is established. The results show that the selected XGBoost model is highly reliable in predicting enterprise financial risk, with all prediction errors within 3%. The largest forecast error is only 2.68%, which comes from the profit and loss of the enterprise’s financial situation. The smallest error is only 0.56%, which is small enough to be trusted for corporate financial forecasting. There is a high correlation between the predicted type of enterprise financial risk and the actual type of risk. At the same time, the results also depend heavily on the preprocessing method applied to the enterprise financial data.

  • Research Article
  • Cite count: 29
  • 10.2196/26139
Use of Multiprognostic Index Domain Scores, Clinical Data, and Machine Learning to Improve 12-Month Mortality Risk Prediction in Older Hospitalized Patients: Prospective Cohort Study
  • Jun 21, 2021
  • Journal of Medical Internet Research
  • Richard John Woodman + 4 more

Background: The Multidimensional Prognostic Index (MPI) is an aggregate, comprehensive, geriatric assessment scoring system derived from eight domains that predict adverse outcomes, including 12-month mortality. However, the prediction accuracy of using the three MPI categories (mild, moderate, and severe risk) was relatively poor in a study of older hospitalized Australian patients. Prediction modeling using the component domains of the MPI together with additional clinical features and machine learning (ML) algorithms might improve prediction accuracy. Objective: This study aims to assess whether the accuracy of prediction for 12-month mortality using logistic regression with maximum likelihood estimation (LR-MLE) with the 3-category MPI together with age and gender (feature set 1) can be improved with the addition of 10 clinical features (sodium, hemoglobin, albumin, creatinine, urea, urea-to-creatinine ratio, estimated glomerular filtration rate, C-reactive protein, BMI, and anticholinergic risk score; feature set 2) and the replacement of the 3-category MPI in feature sets 1 and 2 with the eight separate MPI domains (feature sets 3 and 4, respectively), and to assess the prediction accuracy of the ML algorithms using the same feature sets. Methods: MPI and clinical features were collected from patients aged 65 years and above who were admitted to either the general medical or acute care of the elderly wards of a South Australian hospital between September 2015 and February 2017. The diagnostic accuracy of LR-MLE was assessed together with nine ML algorithms: decision trees, random forests, extreme gradient boosting (XGBoost), support-vector machines, naïve Bayes, K-nearest neighbors, ridge regression, logistic regression without regularization, and neural networks. A 70:30 training set:test set split of the data and a grid search of hyper-parameters with 10-fold cross-validation were used during model training. The area under the curve was used as the primary measure of accuracy. Results: A total of 737 patients (female: 370/737, 50.2%; male: 367/737, 49.8%) with a median age of 80 (IQR 72-86) years had complete MPI data recorded on admission and had completed the 12-month follow-up. The area under the receiver operating curve for LR-MLE was 0.632, 0.688, 0.738, and 0.757 for feature sets 1 to 4, respectively. The best overall accuracy for the nine ML algorithms was obtained using the XGBoost algorithm (0.635, 0.706, 0.756, and 0.757 for feature sets 1 to 4, respectively). Conclusions: The use of MPI domains with LR-MLE considerably improved the prediction accuracy compared with that obtained using the traditional 3-category MPI. The XGBoost ML algorithm slightly improved accuracy compared with LR-MLE, and adding clinical data improved accuracy. These results build on previous work on the MPI and suggest that implementing risk scores based on MPI domains and clinical data by using ML prediction models can support clinical decision-making with respect to risk stratification for the follow-up care of older hospitalized patients.
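Since this abstract uses the area under the ROC curve as its primary accuracy measure, a small sketch of how that quantity can be computed may help. It uses the pairwise (Mann-Whitney) interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels and scores below are made up for illustration and are not data from the study.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = died within 12 months) and predicted risks.
y_true = [1, 0, 1, 0, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.3, 0.6, 0.65]
auc = roc_auc(y_true, y_score)
print(round(auc, 3))
```

The quadratic pair loop is chosen for readability; rank-based formulas give the same value more efficiently on large samples.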

  • Research Article
  • Cite count: 4
  • 10.3760/cma.j.cn121430-20230930-00832
Constructing a predictive model for the death risk of patients with septic shock based on supervised machine learning algorithms
  • Apr 1, 2024
  • Zhonghua wei zhong bing ji jiu yi xue
  • Zheng Xie + 7 more

To construct and validate the best predictive model for 28-day death risk in patients with septic shock based on different supervised machine learning algorithms. The patients with septic shock meeting the Sepsis-3 criteria were selected from Medical Information Mart for Intensive Care-IV v2.0 (MIMIC-IV v2.0). According to the principle of random allocation, 70% of these patients were used as the training set, and 30% as the validation set. Relevant predictive variables were extracted from three aspects: demographic characteristics and basic vital signs, serum indicators within 24 hours of intensive care unit (ICU) admission and complications possibly affecting indicators, functional scoring and advanced life support. The predictive efficacy of models constructed using five mainstream machine learning algorithms including decision tree classification and regression tree (CART), random forest (RF), support vector machine (SVM), linear regression (LR), and super learner [SL; combined CART, RF and extreme gradient boosting (XGBoost)] for 28-day death in patients with septic shock was compared, and the best algorithm model was selected. The optimal predictive variables were determined by intersecting the results from LASSO regression, RF, and XGBoost algorithms, and a predictive model was constructed. The predictive efficacy of the model was validated by drawing receiver operator characteristic curve (ROC curve), the accuracy of the model was assessed using calibration curves, and the practicality of the model was verified through decision curve analysis (DCA). A total of 3 295 patients with septic shock were included, with 2 164 surviving and 1 131 dying within 28 days, resulting in a mortality of 34.32%. Of these, 2 307 were in the training set (with 792 deaths within 28 days, a mortality of 34.33%), and 988 in the validation set (with 339 deaths within 28 days, a mortality of 34.31%). Five machine learning models were established based on the training set data. 
After including variables at three aspects, the area under the ROC curve (AUC) of RF, SVM, and LR machine learning algorithm models for predicting 28-day death in septic shock patients in the validation set was 0.823 [95% confidence interval (95%CI) was 0.795-0.849], 0.823 (95%CI was 0.796-0.849), and 0.810 (95%CI was 0.782-0.838), respectively, which were higher than that of the CART algorithm model (AUC = 0.750, 95%CI was 0.717-0.782) and SL algorithm model (AUC = 0.756, 95%CI was 0.724-0.789). Thus above three algorithm models were determined to be the best algorithm models. After integrating variables from three aspects, 16 optimal predictive variables were identified through intersection by LASSO regression, RF, and XGBoost algorithms, including the highest pH value, the highest albumin (Alb), the highest body temperature, the lowest lactic acid (Lac), the highest Lac, the highest serum creatinine (SCr), the highest Ca2+, the lowest hemoglobin (Hb), the lowest white blood cell count (WBC), age, simplified acute physiology score III (SAPS III), the highest WBC, acute physiology score III (APS III), the lowest Na+, body mass index (BMI), and the shortest activated partial thromboplastin time (APTT) within 24 hours of ICU admission. ROC curve analysis showed that the Logistic regression model constructed with above 16 optimal predictive variables was the best predictive model, with an AUC of 0.806 (95%CI was 0.778-0.835) in the validation set. The calibration curve and DCA curve showed that this model had high accuracy and the highest net benefit could reach 0.3, which was significantly outperforming traditional models based on single functional score [APS III score, SAPS III score, and sequential organ failure assessment (SOFA) score] with AUC (95%CI) of 0.746 (0.715-0.778), 0.765 (0.734-0.796), and 0.625 (0.589-0.661), respectively. 
The Logistic regression model, constructed using 16 optimal predictive variables including pH value, Alb, body temperature, Lac, SCr, Ca2+, Hb, WBC, SAPS III score, APS III score, Na+, BMI, and APTT, is identified as the best predictive model for the 28-day death risk in patients with septic shock. Its performance is stable, with high discriminative ability and accuracy.
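The cohort figures and the variable-selection step in this abstract can be checked and illustrated with a short sketch. The counts are taken from the abstract; the three per-method variable sets are hypothetical placeholders (the abstract reports only the final 16 variables, not what each method selected), so only the intersection technique itself is being shown.

```python
# Counts reported in the abstract.
total, survived, died = 3295, 2164, 1131
train_n, train_deaths = 2307, 792
valid_n, valid_deaths = 988, 339


def pct(part, whole):
    """Percentage rounded to two decimals, as reported in the abstract."""
    return round(100.0 * part / whole, 2)


overall_mortality = pct(died, total)
train_mortality = pct(train_deaths, train_n)
valid_mortality = pct(valid_deaths, valid_n)
print(overall_mortality, train_mortality, valid_mortality)

# Hypothetical variable sets selected by each method (illustrative names only).
lasso_selected = {"age", "max_lactate", "min_hb", "bmi", "max_scr"}
rf_selected = {"age", "max_lactate", "bmi", "max_scr", "max_wbc"}
xgb_selected = {"age", "max_lactate", "bmi", "min_na", "max_scr"}

# Keep only variables that all three methods agree on.
optimal_variables = sorted(lasso_selected & rf_selected & xgb_selected)
print(optimal_variables)
```

Intersecting the selections is a conservative choice: a variable survives only if every method independently ranks it as useful, which trades some recall for stability.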

  • Research Article
  • Cite count: 5
  • 10.3389/fpubh.2024.1303958
Physical frailty identification using machine learning to explore the 5-item FRAIL scale, Cardiovascular Health Study index, and Study of Osteoporotic Fractures index.
  • May 9, 2024
  • Frontiers in Public Health
  • Chen-Cheng Yang + 7 more

Physical frailty is an important issue in aging societies. Three models of physical frailty assessment, the 5-Item fatigue, resistance, ambulation, illness and loss of weight (FRAIL); Cardiovascular Health Study (CHS); and Study of Osteoporotic Fractures (SOF) indices, have been regularly used in clinical and research studies. However, no previous studies have investigated the predictive ability of machine learning (ML) for physical frailty assessment. The aim was to use two ML algorithms, random forest (RF) and extreme gradient boosting (XGBoost), to predict these three physical frailty assessment models. Questionnaires regarding demographic characteristics, lifestyle habits, living environment, and physical frailty assessment were answered by 445 participants aged 60 years and above. The RF and XGBoost algorithms were used to assess their scores for the three physical frailty indices. Furthermore, feature importance and Shapley additive explanations (SHAP) were used to determine the important physical frailty factors. The XGBoost algorithm obtained higher accuracy for predicting the three physical frailty indices; the areas under the curve obtained by the XGBoost algorithm for the 5-Item FRAIL, CHS, and SOF indices were 0.84, 0.79, and 0.69, respectively. The feature importance and SHAP of the XGBoost algorithm revealed that systolic blood pressure, diastolic blood pressure, age, and body mass index play important roles in all three physical frailty models. The XGBoost algorithm has a more accurate predictive rate than RF across all three physical frailty assessments. Thus, ML can be a useful tool for the early detection of physical frailty.

  • Conference Article
  • Cite count: 3
  • 10.54941/ahfe1001820
Classifying mental workload using EEG data: A machine learning approach
  • Jan 1, 2022
  • AHFE international
  • Şeniz Harputlu Aksu + 1 more

Mental workload is related to the difference between the available mental resource capacity of the operator and the mental resources required by the job. To decide the number of tasks assigned to an operator and the difficulty levels of those tasks, it is important to know the operator's mental workload. An overload occurs if the amount of resources required by the task exceeds the available capacity of the person. Mental workload analysis helps to recognize mental fatigue, evaluate human performance on tasks of different levels, and adjust cognitive resources for safe and efficient human-machine interactions. Excessive levels of mental workload can lead to errors or delays in information processing. Monitoring brain activity has been verified to be a sensitive and consistent reflector of mental workload changes. Classification, regression, clustering, anomaly detection, dimensionality reduction, and reward maximization are common machine learning tasks. Classification of mental workload has critical importance in the domain of human factors and ergonomics. In recent years, with the need to analyze continuous and large-scale data obtained by physiological methods, the use of machine learning algorithms has become widespread in estimating and classifying mental workload. The objectives of the current study were two-fold: (1) to investigate the relationship among EEG features, task difficulty levels and subjective self-assessment (NASA-TLX) scores and (2) to develop machine learning algorithms for classifying mental workload using EEG features. N-back tasks have been commonly used in the literature. In this study, N-back memory tests were performed at four different difficulty levels. As n increases, so does the difficulty of the task. Four participants performed the tests. Seventy EEG features (5 frequency band powers for each of 14 channels) were selected as independent variables. One output variable reflecting the difficulty level of the N-back memory task was classified. The machine learning algorithms used in our study were K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Artificial Neural Network (ANN), Random Forest (RF), Gradient Boosting Machine (GBM), Light Gradient Boosting Machine (LightGBM) and Extreme Gradient Boosting (XGBoost). As the task difficulty increased, theta activity in prefrontal and frontal regions increased. In particular, frontal theta power and parietal and occipital gamma power were significantly correlated with perceived workload scores obtained via NASA-TLX. Prefrontal beta-high activity had a significant negative relationship with self-assessment workload ratings. Prefrontal and frontal theta, prefrontal beta-high, occipital, parietal and temporal gamma and occipital alpha activities were found to be the most effective parameters. For the four-class classification problem, accuracy reached 68% with EEG features as input and the Random Forest algorithm. For the two-class classification problem, accuracy reached 87% with EEG features as input and the GBM algorithm. The results from the analysis indicate that EEG signals play an important role in the classification of mental workload. Another remarkable result was the high classification performance of the GBM, LightGBM and XGBoost algorithms, which were developed only recently and are therefore not frequently used in studies on this subject in the literature.

  • Research Article
  • 10.20895/centive.v2025i1.526
Sentiment Classification of FatSecret Application Reviews with Machine Learning Models
  • Jan 28, 2026
  • Proceedings of the National Conference on Electrical Engineering, Informatics, Industrial Technology, and Creative Media
  • Mayang Gumelar + 1 more

In the current digital era, mobile applications have become an indispensable part of daily life, leading to a surge in user reviews as invaluable repositories of opinions. Health and fitness applications, such as FatSecret, generate millions of reviews rich with insights. However, specific sentiment analysis on FatSecret reviews using a structured Machine Learning (ML) approach remains limited. This study presents a comprehensive approach for sentiment classification of FatSecret application reviews using ML models. We collected Indonesian-language reviews from the Google Play Store, performed extensive data pre-processing (case folding, tokenization, filtering, normalization), and extracted features using Term Frequency-Inverse Document Frequency (TF-IDF) and Bag of Words (BoW). Subsequently, we trained and evaluated five distinct sentiment classification algorithms: Random Forest, Decision Tree, Logistic Regression, SVM, and XGBoost, utilizing the StratifiedKFold method for automatic splitting in training and validation. Evaluation metrics include accuracy, precision, recall, and F1-score. The results of this research are expected to provide deep insights into user perceptions of FatSecret, identify favored and criticized features, and offer a replicable methodological framework for sentiment analysis of other applications in the future.
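The stratified splitting mentioned in this abstract can be sketched in a few lines. This is a simplified, unshuffled stand-in written for illustration, not scikit-learn's `StratifiedKFold` implementation; the helper name `stratified_kfold` and the toy labels are assumptions. The idea is that each validation fold keeps roughly the same class proportions as the full dataset, which matters when review sentiment classes are imbalanced.

```python
from collections import Counter, defaultdict


def stratified_kfold(labels, k):
    """Yield (train_idx, valid_idx) pairs whose folds preserve the class mix."""
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's indices round-robin so every fold gets a near-equal share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    for i in range(k):
        valid = sorted(folds[i])
        train = sorted(j for f in folds[:i] + folds[i + 1:] for j in f)
        yield train, valid


# Hypothetical imbalanced labels: 6 positive reviews, 3 negative reviews.
labels = ["pos"] * 6 + ["neg"] * 3
splits = list(stratified_kfold(labels, 3))
fold_mix = [Counter(labels[i] for i in valid) for _, valid in splits]
print(fold_mix)
```

With these toy labels every validation fold contains two positive and one negative review, mirroring the 2:1 ratio of the whole set.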

  • Research Article
  • Cite count: 16
  • 10.3389/fbioe.2022.903426
Performance of Machine Learning Algorithms for Predicting Adverse Outcomes in Community-Acquired Pneumonia
  • Jun 29, 2022
  • Frontiers in Bioengineering and Biotechnology
  • Zhixiao Xu + 4 more

Background: The ability to assess adverse outcomes in patients with community-acquired pneumonia (CAP) could improve clinical decision-making to enhance clinical practice, but the studies remain insufficient, and similarly, few machine learning (ML) models have been developed. Objective: We aimed to explore the effectiveness of predicting adverse outcomes in CAP through ML models. Methods: A total of 2,302 adults with CAP who were prospectively recruited between January 2012 and March 2015 across three cities in South America were extracted from DryadData. After a 70:30 training set:test set split of the data, nine ML algorithms were executed and their diagnostic accuracy was measured mainly by the area under the curve (AUC). The nine ML algorithms included decision trees, random forests, extreme gradient boosting (XGBoost), support vector machines, Naïve Bayes, K-nearest neighbors, ridge regression, logistic regression without regularization, and neural networks. The adverse outcomes included hospital admission, mortality, ICU admission, and one-year post-enrollment status. Results: The XGBoost algorithm had the best performance in predicting hospital admission. Its AUC reached 0.921, and accuracy, precision, recall, and F1-score were better than those of other models. In the prediction of ICU admission, a model trained with the XGBoost algorithm showed the best performance with AUC 0.801. XGBoost algorithm also did a good job at predicting one-year post-enrollment status. The results of AUC, accuracy, precision, recall, and F1-score indicated the algorithm had high accuracy and precision. In addition, the best performance was seen by the neural network algorithm when predicting death (AUC 0.831). Conclusions: ML algorithms, particularly the XGBoost algorithm, were feasible and effective in predicting adverse outcomes of CAP patients. The ML models based on available common clinical features had great potential to guide individual treatment and subsequent clinical decisions.
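The accuracy, precision, recall, and F1-score reported alongside AUC in this abstract all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the counts passed in are invented for illustration and are not results from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of the cases predicted positive, how many were truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the truly positive cases, how many were predicted positive.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for a 100-patient test set.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
print(metrics)
```

Reporting all four together is useful because accuracy alone can look strong on imbalanced outcomes even when recall for the rare class is poor.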

  • Research Article
  • Cite count: 5
  • 10.1002/cl2.130
Protocol for a Systematic Review: The Impacts of Business Support Services for Small and Medium Enterprises on Firm Performance in Low‐and Middle‐Income Countries: A Systematic Review
  • Jan 1, 2014
  • Campbell Systematic Reviews
  • Lauro Gonzalez + 4 more

Small and medium enterprises (SMEs), defined in this review as businesses with up to 250 employees, are believed to be both an important tool in the fight against poverty and an important contributor to economic growth in developing countries. SMEs are responsible for the majority of employment generation in developed as well as in developing countries (Ayyagari et al., 2007). Given that SMEs play an important role in the formal labour force, the health of the sector has implications for employment generation policies and growth. Ayyagari et al. (2007) show that formal SMEs are responsible for most of the private sector employment in developed countries - for example, SMEs are responsible for around 60-70 per cent of employment generation in Germany, Finland, Belgium and Canada. However, in African countries SMEs are responsible for a smaller share of formal employment generation, providing only about 20 per cent of employment in Nigeria, Cote d'Ivoire and Cameroon. Ayyagari et al. also note that the SME sector's contribution to employment shows a strong positive correlation with GDP per capita. Thus, the evidence suggests that increasing this sector's contribution to employment might generate growth (Ayyagari et al., 2007; Beck et al., 2005), and therefore that effective business support services may positively affect GDP per capita. African economies have a lower percentage of formal workers in SMEs due to the fact that these economies have a larger (not computed) and less productive informal sector. Thus, in the path towards a more formalised labour market, employment generation by the SME sector plays a very important role. SMEs can further be linked to economic growth through their ability to link knowledge, product commercialisation and total factor productivity (Acs et al., 2009; Solow, 2007). A seminal study using a cross-section of countries to analyse SMEs and growth was provided by Beck et al. 
(2005), who found a positive but not causal relationship between SMEs and growth. An exploration of other available empirical evidence, however, shows that while studies that focus on developed nations suggest a positive impact of SMEs and entrepreneurship on economic growth, studies examining developing countries suggest a negative impact (for example, Audretsch and Keilbach, 2004; Mueller, 2007; Cravo 2010; Cravo et al., 2012; Cravo et al., 2014). Acs et al. (2008) have attributed these differences in empirical results to different entrepreneurship responses to institutional arrangements. Moreover, heterogeneity in institutional arrangements is likely to provide different incentives to rent-seeking activities (Baumol, 1990). Thus, the role of SMEs in a given economy can be expected to vary depending on the institutional settings and level of development. Development agencies provide a considerable amount of targeted assistance to SMEs in low- and middle-income country economies (Beck et al., 2006). For instance, the World Bank devoted US$9.8 billion to SME projects during the period 2006–12 (IEG, 2013). For the same period, the support of the International Finance Corporation (IFC) of the World Bank Group directed to SMEs amounted to US$25 billion. However, there is limited evidence on the impact of SME support in the literature, due either to an insufficient number of studies employing convincing identification strategies to isolate the causal impact of the intervention under consideration, or to limited information regarding the mechanism underlying such interventions. This systematic review will draw on economic theory and qualitative studies to uncover the channels through which a particular intervention can affect the outcomes of interest. This research will therefore separate the outcomes into two categories, intermediate and final, wherever possible in order to uncover the theory of change of each intervention. 
In developing countries, programmes that support SMEs are based on the view that there are institutional constraints that impede SMEs from reaching their full potential to generate jobs and profits. Thus, the large amount of financial resources allocated to the development of the SME sector by governments and development organisations is designed to address institutional failures, and allow SMEs to operate more efficiently, thus leading to productivity growth (Beck et al., 2005). Various approaches are used to provide support services to SMEs. These mainly aim to improve the institutional setting and to remove those institutional constraints that prevent these firms from reaching their full potential and thus contributing effectively to economic growth and poverty alleviation. Based on a preliminary review of the literature, we have identified the main approaches to SME support as programs related to formalisation and the business environment, access to external markets, value chains and clusters, training and technical assistance, SME financing and innovation policy. This literature can be divided into two distinct themes. The first considers indirect support that addresses the constraints that prevent SMEs from getting access to credit, whereas the second addresses the impact of direct business support to SMEs. In the first strand, many studies look at the impact of an indirect type of public support aimed at SMEs, such as tax simplification, which intends to provide incentives for informal SMEs to formalise. The underlying assumption is that formal firms are less credit-constrained than their informal counterparts and therefore formalisation would be an effective way of helping entrepreneurs. Formalised firms are expected to have higher economies of scale and consequently be more productive, demand a more skilled labour force, and have higher profits. 
If informal firms are prevented from growing due to credit constraints, reducing the cost of formalisation should, indirectly, give firms the opportunity to escape from the low-scale-low-productivity trap. This intervention is an indirect form of public support because it is targeted to all firms with annual revenues below some threshold. All informal firms are incentivised to formalise through tax simplification. Those that decide to formalise are not directly offered any other type of public support. The second group of studies addresses the impact of direct business support to SMEs. They generally estimate the impact of a support programme to SMEs within a specific sector in a specific country, with the intervention based on the assumption that SMEs face constraints such as a limited pool of skilled labour, limited innovation capability and coordination failures. In this view, SMEs need public support to break the vicious circle of low investment and low productivity. A successful intervention might even generate (spillover) effects on firms that do not belong to the target group of the programme – firms from other sectors and/or informal firms in the same sector. This kind of support comes in the form of training programs, support for innovation or value chain and association strategies (for example, clusters) to address coordination failures. Notice that, unlike the indirect public support programmes, the unit of intervention is the firm itself. Firms are directly targeted with programmes that aim to help them shift from a low equilibrium (small size and scale) to a high equilibrium (bigger scale and dynamism). Workers are offered training, and transportation costs, spillover effects and coordination failures are directly affected by the creation of productive agglomerates. Since this review will investigate the impact of a diverse array of interventions, it is challenging to come up with a general theory of change. 
Although we provide a general theory of change based on our preliminary search of the literature in this section, it is with the caveat that each type of intervention identified in the initial search of the literature is based on an institution's belief in a particular causal chain. Therefore our approach to building out this theory of change will involve taking a case-by-case perspective on the assumptions regarding the causal chain of each of the programs analysed. As mentioned in Section 1.2, in general, support to SMEs is related to productivity growth and employment generation. Overall, the theory of change behind SME support services is linked to the improvement or creation of institutions that allow SMEs to reach their full potential. Figure 1 below provides a more general illustration of the theory of change for the intervention models we aim to survey in this review, as detailed in Table 1. [Figure 1: Theory of change] Within this general theory of change are contained those which are specific to the particular interventions shown: Tax simplification initiatives can be seen as a type of indirect business support to SMEs. These interventions aim at improving firm performance through the channel of formalisation. Economic theory suggests that formal firms will be able to grow with access to credit markets and by taking advantage of economies of scale. A tax simplification program could affect outcomes such as employment and profit through two intermediate outcomes: 1) formalisation rate, and 2) access to credit. The causal chain could be simplified as follows: The necessary conditions for a tax simplification program shifts the informal entrepreneurs trapped in one equilibrium, characterised by low productivity and profits, to another where they face fewer constraints to growth after formalisation. There are plenty of studies that concentrate only on final outcomes, however, and shed no light on the mechanisms. 
Consequently, policy makers interested in knowing how such an intervention worked are given no guidance. We note that sub-components within the business support interventions that this review analyses may overlap. We will develop a conceptual model of intervention types to ensure appropriate categorisation of interventions for the analysis. A review such as this has the potential for significant policy relevance, given the amount of attention governments, development agencies and organisations around the world have dedicated to sponsoring a range of assistance programs targeted to SMEs and aimed at spurring firms' performance regarding innovation, productivity, exports and employment generation. Broader impacts on the economy such as higher wages and poverty reduction are also seen as by-products of such interventions (Beck et al., 2006). However, in spite of their prevalence worldwide, too little is known about the impact of SME support interventions. In a recent survey on SME policies in African countries, McKenzie (2011) shows that African firms are in general small, with up to 10 employees, but very heterogeneous in terms of employment, sales and access to external market. He also shows that although SMEs have been supported in several ways in African countries, rigorous evaluation of such policies is scant. This is surprising given that the SME sector is one of the main targets of international and national aid agencies (Cravo et al., 2014). This research intends to fill part of this gap by summarising systematically the rigorous evaluations done in the field so far, and feeding back the results to policymakers working on this problem worldwide. The policy relevance of this review is increased by the fact that it aims to distill the evidence on what works in Africa, and should therefore be particularly useful to policymakers and donor organisations interested in supporting SMEs in Africa. 
Among the Africa-specific issues we aim to address with this review are the question of SMEs' potentially limited contribution to employment in African countries relative to other regions and, in contrast, the potentially greater contribution to poverty reduction that these enterprises may make in the African region in comparison to larger ones. The initial literature search for impact evaluations of indirect business support services suggests the existence of a considerable number of studies for Asian and Latin American low- and middle-income economies. Fajnzylber et al. (2011) and Monteiro and Assunção (2012) use quasi-experimental techniques to analyse the effect of a tax simplification program in Brazil on formalisation and firms' performance. McKenzie and Sakho (2010) use instrumental variable (IV) estimations and provide evidence on how tax registration affects profitability in Bolivia. Mel et al. (2012) study the effect of formalisation on profit, sales, new workers and other outcomes in Sri Lanka using IV estimations, and Rand and Torm (2012) use matching and difference-in-difference techniques to assess how formalisation affects profit, access to credit and investment in Vietnam. For the African context, the available evidence is likely to be more limited. However, a detailed, comprehensive search and synthesis of the literature is necessary, with a particular focus on its applicability to the African context. As with the indirect interventions, the initial search of the literature for impact evaluations of direct support services indicates that there is limited evidence for Africa. In one of the few studies available, Mano et al. (2012) conduct a randomised experiment in Ghana to analyse the effect of SME training programs on sales, added value and profit. In the context of low- and middle-income countries as a whole, a considerable amount of evidence is available for Latin America.
Benavente and Crespi (2003) analyse the effect of an association strategy on productivity in Chile, using difference-in-difference and matching methods. In another study of the Chilean case, Arraiz et al. (2012) analyse the effect of value chain support on sales, employment and exports using propensity score matching and difference-in-difference estimators. The literature also presents evidence on support for innovation in low- and middle-income countries. Castillo et al. (2011) provide evidence of the impact of process and innovation support on exporting, employment, wages and survival in Argentina, by combining propensity score matching and a difference-in-difference approach. Other studies analyse different types of support. Tan (2009) provides evidence for Chile on the effect of different SME programs (technical assistance, cluster programs, technology programs and credit programs) on sales, output, employment, wages, productivity and exports. In addition, Ibarraran et al. (2009) study how training programs, access to credit, product innovation and ISO certification affect productivity in Latin American countries, using instrumental variables and matching methods. Though most of the papers cited above indicate a positive effect of SME support programs on selected outcomes, there is a need to systematically review and synthesise the evidence to provide an unbiased account of the impact of these programs on firm performance. As the evidence appears to be predominantly from Latin America, its applicability to African countries, or to any other context, is not straightforward because of the limited external validity that marks these studies. A comprehensive understanding of the mechanisms underlying the causal chain of an SME intervention is therefore crucial if one is interested in designing SME interventions in different contexts.
Therefore, one of the aims of this review is to shed light on the impact of various programs, as well as on the mechanisms that could help us understand why similar programs succeed in some countries or contexts but fail in others. This review has some similarities with another Campbell-registered review, by Grimm and Paffhausen (2013). That review, however, focuses on employment creation and business creation and will not systematically review evidence on firm performance such as productivity, revenues, profits, innovation, formalisation and access to credit, all of which are the main outcomes of interest of this review. To answer these questions, the research will cover both intermediate outcomes, such as access to credit, training, and formalisation, and final outcomes, such as higher profits, employment generation, productivity and access to external markets, and will look for context-specific variables that can help us understand the causal chain of the intervention. We recognise that this is a very challenging exercise to address fully in this systematic review. In fact, the main objective is to shed some light on the potential moderator variables linked to the institutional setting and level of development of each country. Assessing the applicability of the results to a specific local African context is not an easy task and goes beyond the scope of the systematic review. However, to allow the reader to relate the review findings to a specific context, the document will present relevant contextual and implementation information. This review will focus only on studies that evaluate policies aimed at supporting SMEs in low- and middle-income countries (as defined by the World Bank's classification), with an emphasis on African countries wherever possible.
The focus on LMICs is justified firstly because private firms in these countries tend to be more labour intensive and less innovative, and consequently are the main employer of a large proportion of the labour force. Secondly, restricting the scope to LMICs helps to identify the binding constraints that SMEs might face in similar institutional contexts, such as in some African countries. The term SME covers a wide range of definitions and measures, varying from country to country and between the sources reporting SME statistics. Some of the commonly used criteria are the number of employees, total net assets, sales and investment level (Ayyagari et al., 2007). The most common criterion used to classify SMEs is based on employment information, due to data availability, and the cut-off used to define SMEs is usually 250 employees. This review will use this cut-off of 250 employees. Consequently, other types of interventions aimed only at supporting entrepreneurship and the creation of microenterprises, such as microfinance, will not be part of this research. This is because the self-employed and micro-entrepreneurs differ in nature from SMEs. The former, especially in LMICs, comprise less productive or informal enterprises with few employees on the fringe of the markets. Furthermore, these enterprises are not eligible for most of the public interventions to be covered in this review. Thus, the definition of SME based on number of employees fits our purpose well: it covers a broad set of interventions and remains relevant for African countries.
Since our prior assumption is that there will be only a few studies examining public interventions in African countries, a proper contextualisation of the interventions, together with a comprehensive understanding of the designs, the target groups, and the moderator variables, ranging from those related to the firms themselves (size, sector, number of years in operation) to those related to the country where the intervention takes place, will be crucial to this review. This will allow us to shed some light on whether the intervention has some external validity and consequently whether it could potentially work in an African context. In order to address the likely problem of limited evidence, particularly of relevance to Africa, the scope of the review will include all studies identifying final and intermediate outcomes. This will also better inform the causal chain analysis, which will help inform our tentative findings about generalisability to African countries. In the studies selected, we will then search for any information on how and why interventions worked or did not work. The literature recommends that synthesis is informed by the theory of change embedded in the design of an intervention (see Waddington et al., 2012b). However, our focus is not only on the impacts directly anticipated by the intervention but also on unanticipated impacts. We will include the following interventions: Formalisation/Business Environment (Institutional Improvement): such as tax simplification, intended to provide incentives for informal SMEs to formalise. Underlying assumption: that formal firms are less credit-constrained than their informal counterparts and therefore formalisation would be an effective way to help entrepreneurs. Indirect support to SMEs may include policies regarding business registration, property registration and regulatory frameworks (Fajnzylber et al., 2011; Monteiro and Assunção, 2012; McKenzie, 2013).
Exports/Access to External Markets: defined as interventions that correct market failures such as information externalities and help SMEs overcome obstacles to exporting (Volpe and Carballo, 2010; Volpe et al., 2010; World Bank, 2010). Support for innovation policies is based on the idea that social returns to innovation exceed private returns (Lundvall and Borras, 2005; Acs and designed to support innovation This review will different types of innovation support such as matching and tax as identified in the preliminary For instance, et al. evaluate the of matching provided after an for et al. (2012) analyse the impact of matching and credit for innovation, and et al. (2011) evaluate the effect of tax on Other of innovation support may also be identified during the search and interventions: defined as interventions that help firms from externalities and overcome the coordination failures that prevent SMEs from these externalities and et al., and technical defined as interventions that provide support for training and technical assistance, based on the idea that improve and wages of workers and to firm productivity et al., 2011; et al., 2007). This type of intervention also services and such as those by the World Bank et al. and et al. (2013). SME and in credit markets generate financial constraints, which in SME activities (Beck and and 2007; et al., The review will in this of interventions that provide or services to SMEs, such as those in World Bank (2010) for credit and in et al. (2009) for credit We note that sub-components within the business support interventions this review analyses may overlap. In this case, it will be important to them as as possible. 
If there is a analysis will be using detailed information on intervention however, this may not be possible if only a number of are To the of our knowledge, most of the papers the impact of a public policy targeted to SMEs a group with a group comparison group in the of quasi-experimental However, we will be studies that and from studies that have more than two we will also separate the evidence to the intervention In the of for instance, an intervention can use a an cluster or (see et al., have two they identify different and so and they in terms of data different rate, different of and so The selected studies on at one impact to do with outcomes, either or For the of this review, we will define firm performance impacts to to objective such as revenues, profits, innovation, formalisation, number of workers and access to credit. of firm performance impacts will be on and will be outcomes of SME support around better firm performance and growth and therefore can be revenues, profits, employment, productivity, innovation, and survival The following are of studies that we would to include in the review at these outcomes: Mano et (2012) experiment in Ghana to analyse the effect of SMEs training programme on sales, value added and Benavente and (2003) study of the effects of an association strategy on productivity in Arraiz et (2012) of the effect of value chain support on sales, employment and exports in (2009) evaluation of different Chilean SMEs programs for technical assistance, cluster programs, technology programs and credit programs on sales, output, employment, wage, productivity and and Castillo et (2011) study of the effects of process and innovation support on exporting, employment, wages and survival in outcomes vary to the type of but can be defined access to credit, training, tax simplification aimed at firms' formalisation, formalisation rate, policies aimed at improving the value and growth. These are all of direct intervention through outcomes. 
that provide access to credit aim to allow firms to an economic and/or As the firms in the market and the intended outcomes are survival and in productivity. with SME support related to innovation, training and the value chain the underlying assumption is that more skilled workers and a better value chain will in higher productivity, employment generation, access to markets and others. For instance, Ibarraran et al. (2009) focus on how interventions such as training programs, access to credit, product innovation and certification affect productivity of SMEs in Latin American countries. The review will draw on a broad search to identify studies that relate to the interventions aimed at SMEs in To address to the review will focus on analysis and include only studies that use and quasi-experimental such as design instrumental matching on propensity score matching and any other methods that to for (for example, selected have for the of program or into the and quasi-experimental methods are seen as the the main objective is to estimate the causal impact of an intervention or policy (see for et al., an intervention is designed or the identification strategy of an study convincing the findings on the impact of the program or intervention are to have that one can that the in the outcomes between and was by the This review will thus only studies that assess the impact of an intervention the and the at one or more in In where more than two are the can also involve comparison of the two The studies will therefore be from and data studies that on data show or use a matching to for in using matching for instance, should the intervention of the program to be able to make the that the problem of is due to the studies included will document the impact of any business support on SMEs to as In addition, the review will the impact of different types of business support on firm performance. As in Waddington et al. 
on studies that use and quasi-experimental methods may the studies that can be included in the review. Although this might be a particularly if one is interested in different interventions, we this because findings of studies that do not for their are of little relevance, and for The search strategy aims to cover as comprehensive a set of and sources as within the period because in the of the interventions of it is most likely that these have been in the formal literature on SMEs or in the literature on the part of national and international

  • Research Article
  • 10.36085/jsai.v8i1.7454
Predicting the Sustainability of Small and Medium Enterprises (SMEs) Using Machine Learning Algorithms
  • Jan 1, 2025
  • JSAI (Journal Scientific and Applied Informatics)
  • Terttiaavini Terttiaavini

Small and Medium Enterprises (SMEs) contribute approximately 60% to Indonesia's Gross Domestic Product (GDP) and absorb more than 97% of the workforce. However, SMEs face various challenges that hinder sustainability, such as limited capital and market instability. This study aims to develop a predictive model to map the sustainability of SMEs based on variables that influence business continuity. The methods used include clustering with Agglomerative Clustering, K-Means, and DBSCAN, as well as classification using algorithms such as Logistic Regression, Random Forest, and XGBoost. The results show that the Agglomerative Clustering method provides the best performance with a Silhouette Score of 0.68. All classification models initially achieved an accuracy of 1.0 with a standard deviation of 0.0, but indicated overfitting due to class imbalance between the "Continues" and "Does Not Continue" categories, where the minority class consists of only 16 data points. To address this issue, the application of the SMOTE (Synthetic Minority Over-sampling Technique) method and 5-Fold Cross-Validation was implemented. The results showed an improvement in the model's ability to recognize patterns in the minority class, making the model's accuracy more representative of both classes. This research is expected to provide valuable insights for the Office of Cooperatives and SMEs in Palembang to support the sustainability of the SME sector in Palembang.
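
The oversampling idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `smote_like_oversample`, the toy data and the parameter values are hypothetical, and only the core SMOTE step is shown (a synthetic minority point is placed on the segment between a minority sample and one of its nearest minority neighbours). In a cross-validated setting, oversampling of this kind would normally be applied to the training folds only, so that the held-out fold contains real observations.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    minority: list of equal-length numeric tuples (hypothetical toy data).
    k: number of nearest minority neighbours to choose from.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Other minority points, sorted by squared Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        t = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Toy minority class with two features (values are illustrative only).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_like_oversample(minority, n_new=6)
print(len(new_points))
```

Because each synthetic point is a convex combination of two real minority points, it always stays inside the bounding box of the minority class; it adds no new information, only more weight for the minority region during training.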

  • Research Article
  • 10.37899/journallamultiapp.v5i5.1541
Impact of Feature Extraction on Multi-Aspect Sentiment Classification for Livin'byMandiri Using BiLSTM
  • Sep 2, 2024
  • Journal La Multiapp
  • Balqis Sayyidahtul Atikah + 2 more

Mobile applications are currently developing very rapidly, including applications in the financial sector. Livin'byMandiri is a mobile application used to transact online without the need to go to the bank, which makes it very easy for customers to transact anywhere and anytime. Application reviews are user reviews that reflect the reputation of the application among the community. Because these reviews are widely available, many companies use them as a reference in developing their applications. However, people's opinions on apps can vary and are influenced by many aspects. Therefore, aspect-based sentiment analysis can be applied to app reviews to get better results. This research focuses on analyzing the sentiment of Livin'byMandiri app reviews on the Google Play Store. In this research, the Bidirectional LSTM (Bi-LSTM) method is combined with TF-IDF and Word2Vec feature extraction. In the experiments, the best results for the access aspect are an accuracy of 81.18% and an F1-Score of 81.03%; the service aspect reaches an accuracy of 82.82% and an F1-Score of 82.74%; and the convenience aspect reaches an accuracy of 77.28% and an F1-Score of 77.19%. The experiments also show that feature extraction affects sentiment analysis: accuracy increases by more than 1% for each aspect when TF-IDF feature extraction, or the combination of TF-IDF and Word2Vec, is added to the initial model, which was built using only the neural network embedding layer.

  • Book Chapter
  • 10.1007/978-981-16-8656-6_10
Research on Time Window Prediction and Scoring Model for Trauma-Related Sepsis
  • Jan 1, 2022
  • Ke Luo + 2 more

Based on the MIMIC-III database of the Massachusetts Institute of Technology, this paper studies and analyzes the symptoms of trauma-related sepsis. Using the SOFA score as the inclusion and exclusion criterion, the relevant patient medical index data are extracted under the guidance of a professional clinician. Sequential forward search is applied to find the optimal index combination based on the eXtreme Gradient Boosting (XGBoost) algorithm. Twenty independent replicates are performed to obtain 7 key risk indicators (Urea Nitrogen, Prothrombin Time, PO2, Sodium, Red Blood Cells, Carbon Dioxide, International Normalized Ratio). The time window prediction model is built with four machine learning algorithms (decision tree, random forest, decision-tree-based adaptive boosting (AdaBoost), and XGBoost). The results show that the time window prediction model of trauma-related sepsis has good generalization ability. The random forest and XGBoost algorithms predict better than the other two. Finally, multi-factor logistic regression is used to build a risk scoring tool for sepsis induced by trauma-related infection, based on the key risk indicators and the opinions of professional clinicians. The results show that the data-driven risk scoring tool can effectively predict the outcome of patients with trauma-related sepsis, which has high clinical significance.
Keywords: Trauma-related sepsis; Big data; Key risk indicators; Time window forecast; Risk score; Machine learning

  • Research Article
  • Cite Count Icon 1
  • 10.31849/jieb.v21i1.17184
MICRO, SMALL AND MEDIUM ENTERPRISES (MSME) FINANCIAL MANAGEMENT IN INDONESIA AND MALAYSIA: A COMPARISON
  • Mar 29, 2024
  • Jurnal Ilmiah Ekonomi Dan Bisnis
  • Jeni Wardi + 4 more

This study examines the financial management of Micro, Small and Medium Enterprises (MSMEs) in Indonesia and Malaysia. MSMEs drive the economy in both countries, but the sector has not yet become independent or the foundation of the national economy in either. The problem is that MSME financial management ignores the importance of financial management standards, and this poor financial management keeps MSMEs from contributing significantly to economic progress. The method used is descriptive qualitative with a case study approach. Data were obtained from MSME actors through questionnaires and interviews. The results indicate that MSME financial management in Indonesia is not as good as in Malaysia, as seen across the research indicators: planning, budget use, recording, reporting and controlling. Compared with their Malaysian counterparts, Indonesian MSMEs do not plan well, do not keep standard records or produce standardised reports, pay little attention to standard financial statements (balance sheet, profit and loss, cash flow), and have not installed systems in their business units, such as controls over systems and procedures or billing records for sales notes. In addition, the Indonesian MSME respondents operate at a very small scale, particularly micro cart businesses and small shops without their own buildings.
In Malaysia, by contrast, conditions are not comparable to those in Indonesia: MSMEs there already have a more appropriate place. Thus, it is easier for MSMEs in Malaysia to get banking support, while MSMEs in Indonesia still find it difficult to upgrade and to access banks.

  • Research Article
  • Cite Count Icon 40
  • 10.3390/rs13122242
Comparative Analysis of Two Machine Learning Algorithms in Predicting Site-Level Net Ecosystem Exchange in Major Biomes
  • Jun 8, 2021
  • Remote Sensing
  • Jianzhao Liu + 12 more

The net ecosystem CO2 exchange (NEE) is a critical parameter for quantifying terrestrial ecosystems and their contributions to the ongoing climate change. The accumulation of ecological data is calling for more advanced quantitative approaches for assisting NEE prediction. In this study, we applied two widely used machine learning algorithms, Random Forest (RF) and Extreme Gradient Boosting (XGBoost), to build models for simulating NEE in major biomes based on the FLUXNET dataset. Both models accurately predicted NEE in all biomes, while XGBoost had higher computational efficiency (6~62 times faster than RF). Among environmental variables, net solar radiation, soil water content, and soil temperature are the most important variables, while precipitation and wind speed are less important variables in simulating temporal variations of site-level NEE as shown by both models. Both models perform consistently well for extreme climate conditions. Extreme heat and dryness led to much worse model performance in grassland (extreme heat: R2 = 0.66~0.71, normal: R2 = 0.78~0.81; extreme dryness: R2 = 0.14~0.30, normal: R2 = 0.54~0.55), but the impact on forest is less (extreme heat: R2 = 0.50~0.78, normal: R2 = 0.59~0.87; extreme dryness: R2 = 0.86~0.90, normal: R2 = 0.81~0.85). Extreme wet condition did not change model performance in forest ecosystems (with R2 changing −0.03~0.03 compared with normal) but led to substantial reduction in model performance in cropland (with R2 decreasing 0.20~0.27 compared with normal). Extreme cold condition did not lead to much changes in model performance in forest and woody savannas (with R2 decreasing 0.01~0.08 and 0.09 compared with normal, respectively). Our study showed that both models need training samples at daily timesteps of >2.5 years to reach a good model performance and >5.4 years of daily samples to reach an optimal model performance. 
In summary, both RF and XGBoost are applicable machine learning algorithms for predicting ecosystem NEE, and XGBoost algorithm is more feasible than RF in terms of accuracy and efficiency.

  • Conference Article
  • Cite Count Icon 2
  • 10.1109/iccaie.2011.6162165
A hybrid framework of Digital Business Ecosystem for Malaysian small and medium Enterprises (SMEs)
  • Dec 1, 2011
  • Muhammad Abdul Tawab Khalil + 4 more

Digital Business Ecosystem (DBE), as the name suggests, is a digitized form of business ecosystem that amalgamates ICT (Information and Communication Technology) with business networks. Malaysian small and medium enterprises (SMEs) can be turned into a collaborative and interdependent socio-economic business environment when they are embedded in such an ecosystem. In a DBE, small and medium enterprises are given the freedom to integrate their services across organizations and turn them into offerings. This helps them in many ways, resulting in improved employee performance and customer satisfaction. Surveys have shown that the Perceived Usefulness (PU) and Perceived Ease of Use of SME employees are among the key variables that are amplified. In the Malaysian context, small and medium enterprises are very important to the regional economy, which is largely dependent on the open state market. DBE links small and medium enterprises in a way that creates a win-win situation for all the stakeholders. The paper provides an evolutionary framework for small and medium enterprises which is hoped to achieve the long-standing goal of a truly collaborative network. It should help small and medium enterprises climb the ladder of e-adoption by taking a step beyond e-commerce and e-business.

  • Conference Article
  • Cite Count Icon 7
  • 10.1109/fit.2011.37
A Study to Examine If Integration of Technology Acceptance Model's (TAM) Features Help in Building a Hybrid Digital Business Ecosystem Framework for Small and Medium Enterprises (SMEs)
  • Dec 1, 2011
  • Muhammad Abdul Tawab Khalil + 3 more

Digital Business Ecosystem (DBE) is a rapidly growing technology which caters to online businesses in both the Business to Consumer (B2C) and Business to Business (B2B) sectors. It takes networked business organizations into account and helps them establish inter- and intra-organizational links digitally. We focus on Small and Medium Enterprises (SMEs), which provide a healthy breeding ground for a Digital Business Ecosystem framework, mainly because small and medium enterprises show more adaptability in adopting the technology. Digital Business Ecosystem fuses ICT (Information and Communication Technology) with the business network of small and medium enterprises. Its ecosystem character provides the necessary collaboration among peer organizations and with consumers. Digital Business Ecosystem allows small and medium enterprises to create and integrate their services. In pursuit of a hybrid framework for Digital Business Ecosystem, one of the critical factors for its future implementation is the willingness of employees of small and medium enterprises. The Technology Acceptance Model (TAM) suggests that acceptance of a new technology depends upon the Perceived Usefulness (PU) and Perceived Ease of Use (PEOU) of the users, who in our study are employees of small and medium enterprises. In this paper, we identify and discuss different DBE features and assess the Perceived Usefulness and Perceived Ease of Use of SME employees through a pilot study. The result of the preliminary but comprehensive survey is seen as vital for further study on Digital Business Ecosystem and its subsequent implementation.
