Abstract
With the introduction of computers into our lives, the volume of digital data is steadily increasing. The data produced in the digital world can contain non-standard values (outliers) that behave differently from the rest. Detecting these values, especially in big data sets, is of great importance in fields such as security, insurance, finance, medicine and genetics. Among data mining techniques, clustering methods are frequently used for outlier detection in big data sets. Among the clustering algorithms that are sensitive to noise and outliers, the density-based DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm is one of the most important methods for outlier detection. In this study, an application that uses the DBSCAN algorithm to detect outliers was developed in the C# programming language. Two data sets of different sizes were examined and analyzed with the application. To keep the analysis time as short as possible, serial and parallel programming techniques were each applied separately. To shorten the analysis time for big data sets, members of the Parallel class in the TPL (Task Parallel Library) provided with .NET 4.0 were used. Across the analyses of the data sets, DBSCAN produced more accurate results and proved more practical than the other selected algorithms for outlier detection. In terms of computing performance, parallel programming became more efficient as the number of data points increased.
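As an illustration of how DBSCAN separates outliers from clusters, the following is a minimal serial sketch in Python. It is not the C# application described in the abstract. The function names (`region_query`, `dbscan`), the parameters `eps` and `min_pts`, and the use of Euclidean distance are assumptions made for this example only.

```python
import math

NOISE = -1  # label given to points that belong to no cluster (outliers)


def region_query(points, i, eps):
    """Return the indices of all points within eps of points[i], including i."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    Brute-force neighbour search, so the cost is quadratic in len(points).
    """
    labels = [None] * len(points)  # None means "not yet visited"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
    return labels
```

Points that keep the `NOISE` label after the run are the detected outliers. Each call to `region_query` is independent of the others, which makes the neighbourhood search a plausible candidate for a parallel loop; the abstract does not say which part of the authors' application was parallelised.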
More From: Bilecik Şeyh Edebali Üniversitesi Fen Bilimleri Dergisi