LSTM neural network combined with data mining techniques for financial crisis early warning model construction

Abstract

Inderscience is a global company and a dynamic, leading independent journal publisher that disseminates the latest research across the broad fields of science, engineering and technology; management, public and business administration; environment, ecological economics and sustainable development; computing, ICT and internet/web services; and related areas.

Similar Papers
  • Front Matter
  • Cite count: 10
  • 10.2174/1874431101004020021
Data Mining Techniques in Medical Informatics
  • May 28, 2010
  • The Open Medical Informatics Journal
  • U Rajendra Acharya + 1 more

The advent of high-performance computing has benefited various disciplines in finding practical solutions to their problems, and health care is no exception. Signal processing, image processing, and data mining tools have been developed for effective analysis of medical information, in order to help clinicians make better diagnoses for treatment purposes. Data mining has become a fundamental methodology for computing applications in medical informatics. Progress in data mining applications and its implications are manifested in the areas of information management in healthcare organizations, health informatics, epidemiology, patient care and monitoring systems, assistive technology, and large-scale image analysis, through to information extraction and automatic identification of unknown classes. Various algorithms associated with data mining have significantly helped to understand medical data more clearly, by distinguishing pathological data from normal data, supporting decision-making, and visualizing and identifying hidden complex relationships between diagnostic features of different patient groups. There are nine papers in this special issue, covering different areas in medical informatics.

  • Research Article
  • Cite count: 18
  • 10.4067/s0718-18762014000100001
Editorial: Data Mining in Electronic Commerce - Support vs. Confidence
  • Jan 1, 2014
  • Journal of theoretical and applied electronic commerce research
  • Cesar Astudillo + 2 more

In the year 2001, one of the authors of this editorial wrote an article about support versus confidence in the data mining technique, association rules. […]
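The excerpt above is truncated, so the editorial's own argument is not reproduced here. As a reminder of what the two association-rule measures in its title compute, the following is a minimal sketch using the textbook definitions; the function names and the toy transaction data are assumptions made purely for illustration.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Support of (antecedent and consequent) divided by support of antecedent."""
    antecedent = frozenset(antecedent)
    both = support(transactions, antecedent | frozenset(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante else 0.0


# Hypothetical market-basket data, invented for this example.
TRANSACTIONS = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

if __name__ == "__main__":
    print(support(TRANSACTIONS, {"bread", "milk"}))       # rule support
    print(confidence(TRANSACTIONS, {"bread"}, {"milk"}))  # rule confidence
```

In this toy data the rule bread -> milk appears in two of four transactions, while bread alone appears in three, so support and confidence differ even for the same rule.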

  • Research Article
  • Cite count: 35
  • 10.17485/ijst/2016/v9i39/102078
Prediction of Heart Disease using Data Mining Techniques
  • Oct 24, 2016
  • Indian Journal of Science and Technology
  • S Kiruthika Devi + 2 more

Objectives: The objective of our work is to analyse various data mining tools and techniques in the health care domain that can be employed in heart disease prediction systems and in efficient diagnosis. Methods/Statistical Analysis: A heart disease prediction model that implements data mining techniques can help medical practitioners detect heart disease status based on the patient’s clinical data. The data mining classification techniques addressed for good decision making in the field of health care are Decision Trees, Naive Bayes, Neural Networks and Support Vector Machines. Hybridizing or combining any of these algorithms helps to make decisions quicker and more precise. Findings: Data mining is a powerful new technology for the extraction of hidden predictive and actionable information from large databases that can be used to gain deep and novel insights. Using advanced data mining techniques to excavate valuable information has been considered an activist approach to improving the quality and accuracy of healthcare services while lowering healthcare cost and diagnosis time. Using this technique, the presence of heart disease can be predicted accurately. Using more input attributes, such as controllable and uncontrollable risk factors, more accurate results could be achieved. Applications/Improvements: This method can be further expanded. It can use many input attributes. Other data mining techniques, such as clustering, time series and association rules, can also be used for prediction. The unstructured data available in healthcare industry databases can also be mined using text mining.

  • Research Article
  • Cite count: 3
  • 10.11648/j.se.20180604.13
Risk Assessment Predictive Modelling in Ethiopian Insurance Industry Using Data Mining
  • Jan 18, 2019
  • Sisay Wuyu + 1 more

Risk management has long been a topic worth pursuing, and indeed several industries are based on its successful application, insurance companies and banks being the most notable. Data mining (DM) is one of the most effective ways to extract knowledge from large volumes of data, discovering hidden relationships and patterns and generating rules to predict and correlate data, which can help institutions make faster decisions or even reach a greater degree of confidence. This research was conducted as a case study at the Ethiopian Insurance Corporation (EIC), at its main branch located at Legehar, Addis Ababa. The general objective of the study is to examine the potential of data mining tools and techniques in developing models that could help in risk level pattern analysis, with the aim of supporting insurance risk assessment activities at EIC. Two data mining techniques were used in this research: decision tree and neural network. The best decision tree model, which was selected as the working model from the numerous models generated during the training phase, correctly classified 75% of the 3100 policies in the validation data set. 96% of low-risk policies were correctly classified. A significant number of misclassifications was observed at the high risk level. The output of these experiments indicated that, in classifying records by risk level, both decision tree and neural network performed with significant error. The decision tree showed an accuracy rate of 75%, while the neural network classified 58% of records correctly. The overall performance of the decision tree was better than that of the neural network.

  • Research Article
  • Cite count: 5
  • 10.5897/sre11.817
Signal based approach for data mining in fault detection of induction motor
  • Oct 31, 2011
  • Scientific Research and Essays
  • Selim Güllülü + 1 more

The aim of this paper is to introduce a new method which combines data mining and signal processing techniques for identifying potential faults in electric motors. The vibration signals measured in the initial (healthy) state of the electric motor are used as source data for the data mining technique. In this sense, a new data mining technique is introduced by defining a feature transfer function based on the Continuous Wavelet Transform. Hence it constitutes a blind algorithm which can extract the features that are hidden in the data, and all characteristic features are detected by an auto-associative neural network from the error variation. Key words: Signal processing, data mining, wavelet transform, neural networks, feature extraction.

  • Conference Article
  • Cite count: 9
  • 10.1109/iccs.2018.00039
Performance Analysis of Data Mining Techniques in IoT
  • Aug 1, 2018
  • Isha Batra + 2 more

The Internet of Things (IoT) accumulates bulk data from heterogeneous devices fitted with sensors. This data is accumulated over a period of time from sensory devices and is maintained on a server. To take optimal decisions in real time, meaningful information needs to be extracted from the accumulated data. Numerous data mining (DM) techniques are available for analyzing the data and then making future predictions based on the discovered information. The number of device connections in IoT is expected to reach 25 to 30 billion by 2020, and new applications of IoT are expected to emerge as well. Therefore, an efficient and fast DM technique is required to make predictions and take decisions in real time to maintain the goodwill of IoT in society. This paper first discusses three different DM techniques (classification, clustering, and association-based mining) and their possible combinations. Later, this work highlights the applications of using a particular DM technique in IoT. Finally, a comparative analysis of the DM techniques is made on the basis of precision, accuracy and recall. This will help to identify the best DM technique that can be applied in IoT.
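The abstract compares techniques by precision, accuracy and recall without stating how they are computed. A minimal sketch of the usual binary-classification definitions follows; the function name and the example labels are assumptions for illustration and are not taken from the paper.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, float]:
    """Accuracy, precision and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when nothing is predicted/actually positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical labels: 1 = event of interest, 0 = no event.
    actual = [1, 1, 1, 0, 0, 0, 0, 1]
    predicted = [1, 0, 1, 0, 1, 1, 0, 1]
    print(binary_metrics(actual, predicted))
```

The three numbers can diverge on the same predictions, which is why a comparison that reports only one of them can rank techniques differently from one that reports all three.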

  • Conference Article
  • Cite count: 18
  • 10.2118/163829-ms
Development of the Brittle Shale Fracture Network Model
  • Feb 4, 2013
  • Amir M Nejad + 5 more

This paper discusses the workflow for the development of a brittle shale model using a data mining approach. A database of more than 1,000 fracture stages and associated microseismic mapping results in the Barnett Shale was assembled. The fracture database comprises fracture design parameters including treatment volumes, rates, proppant mass and size, perforation length, fracture pressure, surface pressure trend and fracture dimensions on horizontal well bores. The goal of this analysis is to establish the relationship between frac design, pressure and frac network geometry. Data mining techniques are used on this complex database to find possible hidden relationships to explain the nature of the data. The outcome of this study is a predictive model for fracture networks in shale. Using the predictive model, improvements to the current fracture design in the Barnett Shale are also made. Various aspects of this dataset are examined using data modeling and mining techniques including self-organizing maps (SOM). SOMs are unsupervised artificial neural networks that can cluster large amounts of data into two-dimensional maps. Using SOM, frac design parameters are clustered and studied in depth. Then, a forward predictive neural network model is trained with fracture design parameters as inputs and fracture network length, width, height, and fracture volume as outputs. The network is trained with the help of a genetic algorithm (GA). A sensitivity study on the trained network demonstrates the effect of different parameters on the fracture geometry. For example, an increase in slick-water volume will have a positive effect on fracture network width and length and a negative effect on height. On the other hand, higher injection rates tend to accelerate height growth. Perforation length also has a negative impact on the total stimulated or affected reservoir volume, and tighter perforation designs are preferred.
The results of this work potentially help in understanding the development of fracture networks in shale reservoirs and in making recommendations on improving stimulated reservoir volume. This will potentially help operators develop more effective treatment designs and reduce the operational costs associated with fracturing in a brittle shale environment.

  • Research Article
  • Cite count: 12
  • 10.4236/iim.2015.73014
Comparing Data Mining Techniques in HIV Testing Prediction
  • Jan 1, 2015
  • Intelligent Information Management
  • Tesfay Gidey Hailu

Introduction: The present work compared the predictive power of the different data mining techniques used to develop the HIV testing prediction model. Four popular data mining algorithms (decision tree, Naive Bayes, neural network, logistic regression) were used to build a model that predicts whether an individual had been tested for HIV among adults in Ethiopia, using EDHS 2011. The final experimental results indicated that the decision tree (random tree algorithm) performed best, with an accuracy of 96%; the decision tree induction method (J48) was second best, with a classification accuracy of 79%, followed by the neural network (78%). Logistic regression achieved the lowest classification accuracy, 74%. Objectives: The objective of this study is to compare the predictive power of the different data mining techniques used to develop the HIV testing prediction model. Methods: The Cross-Industry Standard Process for Data Mining (CRISP-DM) was used to develop the HIV testing prediction model and to explore association rules between HIV testing and the selected attributes. Data preprocessing was performed, and missing values for categorical variables were replaced by the modal value of the variable. Different data mining techniques were used to build the predictive model. Results: The target dataset contained 30,625 study participants, of whom 16,515 (54%) were women and the remaining 14,110 (46%) were men. The age of the participants in the dataset ranged from 15 to 59 years, with a modal age group of 15 - 19 years. Among the study participants, 17,719 (58%) had never been tested for HIV while the remaining 12,906 (42%) had been tested. Residence, educational level, wealth index, HIV related stigma, knowledge related to HIV, region, age group, risky sexual behaviour attributes, knowledge about where to test for HIV and knowledge on family planning through mass media were found to be predictors for HIV testing.
Conclusion and Recommendation: The results obtained from this research reveal that data mining is crucial in extracting relevant information for the effective utilization of HIV testing services, which has clinical, community and public health importance at all levels. It is vital to apply different data mining techniques in the same settings and compare the model performances (based on accuracy, sensitivity, and specificity) with each other. Furthermore, this study invites interested researchers to explore further the application of data mining techniques in the healthcare industry and in related or similar settings in the future.
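The preprocessing step mentioned in the Methods (replacing missing categorical values with the variable's modal value) can be sketched as follows. This is a minimal illustration, assuming missing entries are marked with `None`; the function name and the sample column are hypothetical and do not come from the study's data.

```python
from collections import Counter
from typing import Hashable, List, Optional


def impute_mode(
    values: List[Optional[Hashable]], missing: Optional[Hashable] = None
) -> List[Optional[Hashable]]:
    """Replace entries equal to `missing` with the most frequent observed value.

    Ties are resolved in favour of the value seen first, which is how
    Counter.most_common orders equal counts.
    """
    observed = [v for v in values if v != missing]
    if not observed:
        # Nothing to learn a mode from, so return the column unchanged.
        return list(values)
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v == missing else v for v in values]


if __name__ == "__main__":
    # Hypothetical categorical column with two missing entries.
    residence = ["urban", None, "rural", "urban", None]
    print(impute_mode(residence))
```

Mode imputation keeps every record in the dataset, at the cost of inflating the share of the most common category.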

  • Book Chapter
  • 10.1201/9781003144526-3
A Review of Different Data Mining Techniques Used in Big Data Applications
  • Dec 20, 2021
  • Chandrakanta Mahanty + 2 more

Big data includes a substantial amount of unorganized, unclear, and inaccurate data. Big data cannot be handled with traditional data handling tools. Big data is produced in many fields, including analytics, health care, social networking, and so on. Big data analytics is the capacity to generate useful information from such enormous datasets. Big data mining is the capacity to extract valuable information from these large data sets. In the field of big data analytics, data mining (DM) techniques provide great assistance. DM techniques such as classification techniques (Naïve Bayes, neural network, decision tree, genetic algorithm), clustering algorithms (K-nearest neighbors (KNN), support vector machine (SVM)), regression, association rule mining, and rough set analysis are used to process massive data. In this chapter, we evaluate how distinct DM methods have developed to deal with big data analysis. A comprehensive study is presented on various processes of big DM techniques such as dimensionality reduction, clustering, and classification for big data analysis. This book chapter analyzes DM algorithms implemented with Hadoop and Spark technology. We also address hybrid DM algorithms using the MapReduce framework. We discuss neutrosophic association rule mining for mining big data efficiently and effectively. We also focus on how various DM techniques are used in the fields of health care and agriculture to fix big data problems.

  • Research Article
  • Cite count: 11
  • 10.4025/actascitechnol.v24i0.2549
Parameters for Choosing Data Mining Techniques and Tools (Parâmetros na escolha de técnicas e ferramentas de mineração de dados)
  • Jan 1, 2002
  • Acta Scientiarum-technology
  • Maria Madalena Dias

Despite the existence of data mining techniques and tools, many organizations are still unaware of how much the computer can support decision making. The limited use of these techniques and tools may be related to the difficulty of choosing the data mining technique and/or tool best suited to the type of application. The choice of data mining technique depends on the business problem to be solved and on the characteristics of the data available for analysis, whereas the choice of data mining tool should take into account several parameters, such as the general characteristics of the tool, database connectivity, computational performance criteria, functionality criteria, usability criteria, etc. This article presents parameters to be considered when choosing the data mining technique and tool, as suggested by various authors. The results obtained by applying two data mining techniques are also shown.

  • Book Chapter
  • 10.1007/978-3-031-14841-5_33
Estimation of the Local and Global Coherence of Ukrainian Texts Using Transformer-Based, LSTM, and Graph Neural Networks
  • Jan 1, 2022
  • Artem Kramov + 1 more

In this paper, the different models for the estimation of both local and global coherence of Ukrainian-language texts have been considered. In order to evaluate the local coherence of a document, Transformer-based and LSTM neural networks have been proposed with further training on a Ukrainian-language news corpus. It has been shown that the LSTM-based approach outperforms the corresponding network based on the Transformer architecture according to the accuracy metrics while solving typical tasks on both test datasets. In order to investigate the connection between sentences revealed by the neural network, the Uniform Manifold Approximation and Projection dimension reduction technique has been utilized for the projection of sentences’ embedding into 2D space. The clusters obtained may indicate the consideration of both the structure of a sentence and different types of connections between them by the designed model. In order to estimate the global coherence of a document, a model based on a graph convolutional neural network has been suggested. The appropriateness of taking into account the connection between all sentences despite their positions has been shown. The results obtained for the designed and trained global coherence estimation model may indicate the different aspects of the analysis of a text by the designed models that can lead to the usage of both local and global coherence estimation models according to an assigned task.
Keywords: Local and global coherence of a document; Transformer-based neural network; Sentence embedding; Graph convolutional network; Ukrainian corpora

  • Research Article
  • Cite count: 1
  • 10.13088/jiis.2012.18.4.059
Development of Predictive Models for Rights Issues Using Financial Analysis Indices and Decision Tree Technique
  • Jan 1, 2012
  • Journal of Intelligence and Information Systems
  • Myeong-Kyun Kim + 1 more

This study focuses on predicting which firms will increase capital by issuing new stocks in the near future. Many stakeholders, including banks, credit rating agencies and investors, perform a variety of analyses of firms' growth, profitability, stability, activity, productivity, etc., and regularly report the firms' financial analysis indices. In this paper, we develop predictive models for rights issues using these financial analysis indices and data mining techniques. This study approaches the building of the predictive models from two different analytical perspectives. The first is the analysis period. We divide the analysis period into before and after the IMF financial crisis, and examine whether there is a difference between the two periods. The second is the prediction time. In order to predict when firms increase capital by issuing new stocks, the prediction time is categorized as one year, two years and three years later. Therefore, a total of six prediction models are developed and analyzed. In this paper, we employ the decision tree technique to build the prediction models for rights issues. The decision tree is the most widely used prediction method, which builds decision trees to label or categorize cases into a set of known classes. In contrast to neural networks, logistic regression and SVM, decision tree techniques are well suited for high-dimensional applications and have strong explanation capabilities. There are well-known decision tree induction algorithms such as CHAID, CART, QUEST, C5.0, etc. Among them, we use the C5.0 algorithm, which is the most recently developed algorithm and yields better performance than the other algorithms. We obtained data for the rights issues and financial analysis from TS2000 of the Korea Listed Companies Association. A record of financial analysis data consists of 89 variables, which include 9 growth indices, 30 profitability indices, 23 stability indices, 6 activity indices and 8 productivity indices.
For model building and testing, we used 10,925 financial analysis records from a total of 658 listed firms. PASW Modeler 13 was used to build C5.0 decision trees for the six prediction models. A total of 84 variables from the financial analysis data were selected as the input variables of each model, and the rights issue status (issued or not issued) was defined as the output variable. To develop prediction models using the C5.0 node (Node Options: Output type = Rule set, Use boosting = false, Cross-validate = false, Mode = Simple, Favor = Generality), we used 60% of the data for model building and 40% for model testing. The results of experimental analysis show that the prediction accuracies of data after the IMF financial crisis (59.04% to 60.43%) are about 10 percent higher than ones before IMF financial crisis (68.78% to 71.41%). These results indicate that since the IMF financial crisis, the reliability of financial analysis indices has increased and firms' intentions regarding rights issues have become more obvious. The experimental results also show that the stability-related indices have a major impact on conducting rights issues in the case of short-term prediction. On the other hand, the long-term prediction of conducting rights issues is affected by financial analysis indices on profitability, stability, activity and productivity. All the prediction models include the industry code as one of the significant variables. This means that companies in different types of industries show different patterns for rights issues. We conclude that it is desirable for stakeholders to take into account stability-related indices for short-term prediction and a wider variety of financial analysis indices for long-term prediction. The current study has several limitations. First, we need to compare the differences in accuracy by using different data mining techniques such as neural networks, logistic regression and SVM.
Second, we need to develop and evaluate new prediction models that include variables which research on capital structure theory has identified as relevant to rights issues.

  • Research Article
  • Cite count: 12
  • 10.1111/j.1468-0394.2008.00446.x
Data mining technique for medical informatics: detecting gastric cancer using case‐based reasoning and single nucleotide polymorphisms
  • Apr 16, 2008
  • Expert Systems
  • Se‐Chul Chun + 4 more

Abstract: Although data mining and knowledge discovery techniques have recently been used to diagnose human disease, little research has been conducted on disease diagnostic modelling using human gene information. Furthermore, to our knowledge, no study has reported on diagnosis models using single nucleotide polymorphism (SNP) information. A disease diagnosis model using data mining techniques and SNP information should prove promising from a practical perspective as more information on human genes becomes available. Data mining and knowledge discovery techniques can be put to practical use detecting human disease, since a haplotype analysis using high‐density SNP markers has gained great attention for evaluating human genes related to various human diseases. This paper explores how data mining and knowledge discovery can be applied to medical informatics using human gene information. As an example, we applied case‐based reasoning to a cancer detection problem using human gene information and SNP analysis because case‐based reasoning has been applied in medicine relatively less often than other data mining techniques. We propose a modified case‐based reasoning method that is appropriate for associated categorical variables to use in detecting gastric cancer.

  • Research Article
  • 10.2139/ssrn.300679
Use of Recurrent Neural Networks for Strategic Data Mining of Sales
  • Apr 3, 2002
  • SSRN Electronic Journal
  • Jayavel Shanmugasundaram + 3 more

An increasing number of organizations are involved in the development of strategic information systems for effective linkages with their suppliers, customers, and other channel partners involved in transportation, distribution, warehousing and maintenance activities. An efficient inter-organizational inventory management system based on data mining techniques is a significant step in this direction. This paper discusses the use of neural network based data mining and knowledge discovery techniques to optimize inventory levels in a large medical distribution company. The paper defines the inventory patterns, describes the process of constructing and choosing an appropriate neural network, and highlights problems related to mining of very large quantities of data. The paper identifies the strategic data mining techniques used to address the problem of estimating the future sales of medical products using past sales data. We have used recurrent neural networks to predict future sales because of their power to generalize trends and their ability to store relevant information about past sales. The paper introduces the problem domain and describes the implementation of a distributed recurrent neural network using the real time recurrent learning algorithm. We then describe the validation of this implementation by providing results of tests with well-known examples from the literature. The description and analysis of the predictions made on real world data from a large medical distribution company are then presented.

  • Conference Article
  • Cite count: 5
  • 10.1109/iccsec.2017.8446725
Modeling of Heat Supply Heating Load Based on LSTM Neural Network
  • Dec 1, 2017
  • Li Qi + 1 more

The structure of a central heating system is complex, with serious hysteresis, strong coupling and nonlinear characteristics. To address the difficulty of identifying such a system through mechanism modeling, a modeling method for heat source energy-saving control of central heating systems based on an LSTM neural network is proposed. A heat supply load model for the central heating system is established using an LSTM neural network to predict the backwater temperature of the primary heat source, and the neural network model is built using TensorFlow as the computation framework. Compared with traditional BP neural network modeling, the simulation results show that the heat source model established by the LSTM neural network is better than the BP neural network model: it effectively improves modeling accuracy and model generalization ability and meets the demands of heat source heating load modeling. The analysis concludes that the data-driven neural network modeling method is simpler and has more reference value than traditional experience modeling and mechanism modeling, and that it provides an effective method for modeling other complex nonlinear systems.
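The abstract names the LSTM architecture but gives no details of the network. As a rough illustration of what a single LSTM time step computes, the sketch below implements the standard cell equations in plain Python rather than TensorFlow. The function names, the parameter layout and the all-zero placeholder weights are assumptions made for illustration; they are not the authors' model.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Gate = Tuple[Matrix, Matrix, Vector]  # (input weights W, recurrent weights U, bias b)


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def affine(gate: Gate, x: Vector, h: Vector) -> Vector:
    """Compute W x + U h + b, one output per hidden unit."""
    W, U, b = gate
    return [
        sum(w * xj for w, xj in zip(w_row, x))
        + sum(u * hj for u, hj in zip(u_row, h))
        + bias
        for w_row, u_row, bias in zip(W, U, b)
    ]


def lstm_step(
    x: Vector, h_prev: Vector, c_prev: Vector, params: Dict[str, Gate]
) -> Tuple[Vector, Vector]:
    """One time step of a standard LSTM cell; returns (hidden state, cell state)."""
    i = [sigmoid(z) for z in affine(params["input"], x, h_prev)]        # input gate
    f = [sigmoid(z) for z in affine(params["forget"], x, h_prev)]       # forget gate
    o = [sigmoid(z) for z in affine(params["output"], x, h_prev)]       # output gate
    g = [math.tanh(z) for z in affine(params["candidate"], x, h_prev)]  # candidate state
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def zero_params(n_in: int, n_hidden: int) -> Dict[str, Gate]:
    """Placeholder (untrained) parameters, used only to exercise the code."""
    def gate() -> Gate:
        return (
            [[0.0] * n_in for _ in range(n_hidden)],
            [[0.0] * n_hidden for _ in range(n_hidden)],
            [0.0] * n_hidden,
        )
    return {name: gate() for name in ("input", "forget", "output", "candidate")}


if __name__ == "__main__":
    params = zero_params(n_in=2, n_hidden=1)
    # Hypothetical two-feature input, e.g. scaled sensor readings.
    h, c = lstm_step([0.3, -1.2], h_prev=[0.0], c_prev=[1.0], params=params)
    print(h, c)
```

The forget gate scaling the previous cell state is what lets the cell carry information across many time steps, which is the property usually cited when LSTMs are preferred for systems with long delays.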
