Abstract
Analysis of financial data is critical, because the conclusions communicated from it can strongly affect business processes at both personal and enterprise scale. The primary source of financial data is the business process itself, and the data is often collected by automation tools deployed at various points of the process data flow. Data is entered primarily by the stakeholders of the process, and at various levels of the process it is modified, translated and sometimes completely transformed, which introduces impurities or anomalies. These impurities, such as outliers and missing values, strongly affect the final decisions derived from the datasets. Appropriate pre-processing of financial data is therefore an open research need. A good number of parallel research efforts address these problems. Nonetheless, the majority of the solutions either have high time complexity or are not sufficiently accurate. This work therefore proposes an automated framework for identification and imputation of outliers using an iterative clustering method, identification and imputation of missing values using a differential-count-based binary iterations method, and finally secure data storage using regression-based key generation. The proposed framework showed nearly 100% accuracy in detecting outliers and missing values, with substantially improved time complexity.
Highlights
Financial data is primarily treated as time series data, which varies with time
Financial data analysis is a recent research trend, and many parallel research efforts focus primarily on cleaning the data to make the dataset free of anomalies
In the course of the study, this work identified that, firstly, outlier detection is primarily based on the mean deviation, which is a time-consuming process and needs to be optimized
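To make the mean-deviation baseline mentioned in the highlights concrete, the following is a minimal Python sketch, not the framework proposed in this work. The function names `mean_deviation_outliers` and `impute_with_clean_mean`, the threshold of 2.0 standard deviations, and the sample series are all assumptions chosen for illustration. Each call scans the full series to compute the mean and deviation, so repeating it whenever new observations arrive becomes costly on fast-changing data.

```python
from statistics import mean, pstdev


def mean_deviation_outliers(values, threshold=2.0):
    """Return indices of points lying more than `threshold` population
    standard deviations from the mean (threshold is an assumed default)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing can be flagged.
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) > threshold * sigma]


def impute_with_clean_mean(values, outlier_indices):
    """Replace flagged points with the mean of the unflagged points.
    This is one simple imputation choice among many."""
    flagged = set(outlier_indices)
    clean = [v for i, v in enumerate(values) if i not in flagged]
    if not clean:
        # Nothing reliable to impute from; return the data unchanged.
        return list(values)
    fill = mean(clean)
    return [fill if i in flagged else v for i, v in enumerate(values)]


# Hypothetical price series with one obvious spike at index 5.
series = [10.0, 10.2, 9.9, 10.1, 10.0, 55.0, 10.3, 9.8]
flagged = mean_deviation_outliers(series, threshold=2.0)
cleaned = impute_with_clean_mean(series, flagged)
```

Note that a single large spike inflates both the mean and the deviation, so a stricter threshold can fail to flag it. This sensitivity is one reason alternatives to the plain mean-deviation test are studied.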
Summary
Financial data is primarily treated as time series data, which varies with time. Time series analysis poses two major difficulties. First, the data changes at a very high rate, so the algorithm designed to analyse it must be highly time efficient. Second, the data is collected from various sources, so the format of the data is highly critical. These problems are well described in the work by C. Time series datasets are the only option for valid data analysis in support of financial decisions, as these decisions are expected to be highly time dependent [13]
More From: International Journal of Advanced Computer Science and Applications