Because only a few points' difference distinguishes treatment groups in Alzheimer's disease clinical trials, scoring errors put trial outcomes at risk. Neither the initial publication nor subsequent ADAS-cog manuals provide much detail on administration procedures or on rating language and drawings, and the global language items require formal language assessment skills that typical raters do not have. Despite training and certification programs, raters make errors; some are identified and addressed in central monitoring programs.

An independent blinded review by a neuropsychologist of a sample of ADAS-cog data from a phase 2 clinical trial in subjects with Mild to Moderate Probable AD was requested to determine the quality of the data. The review revealed a pattern of errors typical of paper-and-pencil tests that would be precluded by computerized testing. The review covered a sample of 102 subjects, 6 visits each, and focused on the accuracy of the 10 psychometric item scores of each ADAS-cog 13; the three global language ratings were not reviewed for accuracy. Not all errors were counted: identification of Constructional Praxis errors was minimized because scoring of that item is subjective, and discordance between Orientation scoring and written responses was not counted.

In total, 220 ADAS-cog psychometric item scores (3.7%) were incorrect. Four items accounted for most of the errors, and some sites contributed disproportionately to the error count: Remembering Test Instructions (49% of the 220 incorrect item scores; 30% of the sites made 84% of these errors), Word Recognition (22%; 8.5% of sites made 69% of these errors), and Digit Cancellation (15%; 11% of sites made 45% of these errors). International sites made more errors than US sites. Constructional Praxis accounted for 10% of the errors, with wide variation in what was considered correct or incorrect. Computerized edit checks comparing Word Recall, Delayed Word Recall, and Word Recognition identified potential scoring of correct rather than incorrect responses.
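The cross-item edit check mentioned in the last sentence lends itself to automation. The sketch below is a minimal illustration rather than the trial's actual monitoring code: the record layout, field names, and discrepancy threshold are assumptions. It treats the three memory item scores as error counts (Word Recall 0-10, Delayed Word Recall 0-10, Word Recognition 0-12) and flags visits where recall and recognition scores are implausibly discordant, which can indicate that a rater tallied correct rather than incorrect responses.

```python
from dataclasses import dataclass

# Assumed ADAS-cog score ranges: higher = more errors (worse performance).
WORD_RECALL_MAX = 10        # mean words not recalled over 3 trials
DELAYED_RECALL_MAX = 10     # words not recalled after a delay
WORD_RECOGNITION_MAX = 12   # recognition errors

@dataclass
class Visit:
    subject_id: str
    visit: str
    word_recall: float
    delayed_word_recall: float
    word_recognition: float

def flag_possible_inverted_scores(visits, discrepancy=8.0):
    """Flag visits where the memory item scores are implausibly discordant.

    Heuristic (illustrative only): a subject who misses nearly every word on
    Word Recall / Delayed Word Recall should not score near-perfectly on
    Word Recognition, and vice versa. A large gap suggests the rater may
    have scored correct responses instead of errors on one of the items.
    """
    flagged = []
    for v in visits:
        # Rescale recognition errors to the 0-10 range of the recall items
        # so the three scores are directly comparable.
        recognition_scaled = v.word_recognition * WORD_RECALL_MAX / WORD_RECOGNITION_MAX
        recall_worst = max(v.word_recall, v.delayed_word_recall)
        recall_best = min(v.word_recall, v.delayed_word_recall)
        high_recall_low_recognition = (
            recall_best >= discrepancy
            and recognition_scaled <= WORD_RECALL_MAX - discrepancy
        )
        high_recognition_low_recall = (
            recognition_scaled >= discrepancy
            and recall_worst <= WORD_RECALL_MAX - discrepancy
        )
        if high_recall_low_recognition or high_recognition_low_recall:
            flagged.append((
                v.subject_id,
                v.visit,
                "recall/recognition scores discordant; verify that errors, "
                "not correct responses, were tallied",
            ))
    return flagged

if __name__ == "__main__":
    sample = [
        Visit("1001", "Week 12", word_recall=9.3, delayed_word_recall=10, word_recognition=1),
        Visit("1002", "Week 12", word_recall=4.0, delayed_word_recall=5, word_recognition=6),
    ]
    for row in flag_possible_inverted_scores(sample):
        print(row)
```

In this sketch only the first sample visit is flagged, because near-total recall failure combined with near-perfect recognition is inconsistent; the threshold and rescaling are design choices that a real monitoring program would tune against observed score distributions.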