Abstract

This paper reviews some of the recently developed computer methods for checking data quality. The work started in the USA and Canada, but a substantial project is now underway at Newcastle based on these methods. It is commonplace today to process very large quantities of data on computers, not only in censuses and surveys, but also in other applications. The object of this paper is to review computerised methods of ensuring good data quality. The procedures divide into three types. First, there are procedures to check the quality of data coding and key punching operations. Secondly, there are theories of logical and arithmetic edits, which can assist both in detecting erroneous records and in locating the erroneous fields on those records. Thirdly, there are statistical checks which can be run on the data once it is input. The simultaneous application of these procedures can achieve very high quality at reasonable cost. A popular (in theory at least) method of controlling data quality is the use of double or triple entry, but in any case this clearly does not identify logical and arithmetic errors already existing on the data sheets. Minton (1969) drew attention to the defects in this type of verification procedure and devised a new approach, outlined below, based on sampling of the processed data. Fellegi & Holt (1976) were the first to propose the use of theories of logical edits. Their system facilitates the verification of sets of edits, and enables these edits to be used to help pinpoint errors. This paper will not be concerned with procedures which are not computer-aided, such as interviewer training, systems for feedback of error types to field staff, etc. We shall also ignore aspects of survey design and of questionnaire design, which can be computer-aided. Some statisticians feel that errors in data can often be detected at the analysis stage, so that the data entry phase can be rushed through. We feel that this is a mistaken view, and that the procedures outlined below should have widespread application.
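
To make the second type of procedure concrete, the following is a minimal sketch of how logical and arithmetic edits might be used both to detect an erroneous record and to suggest which fields are at fault. It is an illustration only and not the system of Fellegi & Holt (1976): the field names, the three edits and the helper functions `failed_edits` and `suspect_fields` are all hypothetical, and the field-location step simply searches for a smallest set of fields that is involved in every failed edit, without generating the implied edits that a full treatment would require.

```python
from itertools import combinations

# Hypothetical edits. Each edit has a name, the fields it involves, and a
# predicate that returns True when the record FAILS the edit.
EDITS = [
    ("child_married", ("age", "marital_status"),
     lambda r: r["age"] < 16 and r["marital_status"] == "married"),
    ("child_head", ("age", "relation_to_head"),
     lambda r: r["age"] < 16 and r["relation_to_head"] == "head"),
    ("income_sum", ("wages", "other_income", "total_income"),
     lambda r: r["wages"] + r["other_income"] != r["total_income"]),
]


def failed_edits(record):
    """Return (name, fields) for every edit that the record fails."""
    return [(name, fields) for name, fields, fails in EDITS if fails(record)]


def suspect_fields(record):
    """Return a smallest set of fields involved in every failed edit.

    Simplification (an assumption of this sketch): only the explicit edits
    are used, and ties between equally small sets are broken arbitrarily.
    """
    failures = failed_edits(record)
    if not failures:
        return set()
    candidates = sorted({f for _, fields in failures for f in fields})
    for size in range(1, len(candidates) + 1):
        for combo in combinations(candidates, size):
            chosen = set(combo)
            if all(chosen & set(fields) for _, fields in failures):
                return chosen
    return set(candidates)


# Hypothetical record: a 9-year-old recorded as a married head of household.
record = {
    "age": 9,
    "marital_status": "married",
    "relation_to_head": "head",
    "wages": 100,
    "other_income": 20,
    "total_income": 120,
}

print([name for name, _ in failed_edits(record)])
print(suspect_fields(record))
```

In this example the record fails both of the logical edits, and `age` is the only single field common to the two failures, so it is the field flagged for attention. Double entry would not catch such a record if the error were already present on the data sheet.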
