Data Cleaning in Text File

Arup Kumar Bhattacharjee

doi:10.9790/0661-0921721

Journal: IOSR Journal of Computer Engineering
Publication Date: Jan 1, 2013
Citations: 1

Keywords: Text File, Numeric Errors

Abstract

Data cleaning is the automated process of detecting, correcting, and removing incomplete, incorrect, inaccurate, and irrelevant data in a record set. Our system works on plain text (*.txt) files using the Extract, Transform and Load (ETL) model. In this paper we present a set of algorithms that correct alphanumeric errors, invalid gender values, invalid ID patterns, and redundant IDs. The text files serve as data storage, holding records in a tabular format, and the algorithms are applied to each field value according to the nature of the field.
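
The abstract does not give the algorithms themselves, so the following is only a minimal Python sketch of the four checks it names, not the paper's method. Everything specific here is an assumption made for illustration: the four-column layout (`id,name,gender,age`), the comma delimiter, the ID pattern (one uppercase letter followed by three digits), the gender mapping, and the choice to repair alphanumeric errors by dropping the stray characters.

```python
import re

# Assumption: a valid ID is one uppercase letter followed by three digits, e.g. "A001".
ID_PATTERN = re.compile(r"[A-Z]\d{3}")

# Assumption: accepted gender spellings and their canonical form.
GENDER_MAP = {"m": "M", "male": "M", "f": "F", "female": "F"}


def clean_alpha(value):
    """Drop characters that are not letters or spaces from an alphabetic field."""
    cleaned = re.sub(r"[^A-Za-z ]", "", value).strip()
    return cleaned or None


def clean_numeric(value):
    """Drop non-digit characters from a numeric field."""
    cleaned = re.sub(r"\D", "", value)
    return cleaned or None


def clean_gender(value):
    """Map a gender spelling to its canonical form, or None if unrecognised."""
    return GENDER_MAP.get(value.strip().lower())


def clean_records(lines, delimiter=","):
    """Clean rows of the hypothetical form id,name,gender,age.

    Returns (clean_rows, rejected), where rejected holds (line, reason) pairs.
    """
    seen_ids = set()
    clean_rows, rejected = [], []
    for line in lines:
        fields = [field.strip() for field in line.split(delimiter)]
        if len(fields) != 4:
            rejected.append((line, "wrong field count"))
            continue
        rec_id, name, gender, age = fields
        rec_id = rec_id.upper()
        if not ID_PATTERN.fullmatch(rec_id):
            rejected.append((line, "invalid ID pattern"))
            continue
        if rec_id in seen_ids:
            rejected.append((line, "redundant ID"))
            continue
        gender = clean_gender(gender)
        if gender is None:
            rejected.append((line, "invalid gender"))
            continue
        name, age = clean_alpha(name), clean_numeric(age)
        if name is None or age is None:
            rejected.append((line, "alphanumeric error"))
            continue
        seen_ids.add(rec_id)
        clean_rows.append([rec_id, name, gender, age])
    return clean_rows, rejected
```

In ETL terms, reading the lines of the .txt file would be the extract step, `clean_records` the transform step, and writing `clean_rows` back to a file the load step. Whether a bad field should be repaired or the whole row rejected is a design choice; this sketch repairs alphanumeric errors and rejects the other three.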
