LACK OF RELIABILITY of scores was pointed out by Holroyd (148) as the first defect of the essay test. The reasons given for the unreliability of scores were the lack of objectivity and the influence on the grade of such factors as English construction, spelling, penmanship, neatness, arrangement of form, sympathy for the hard-working but slow student, general improvement, and personal attributes. Other criticisms were: (a) restricted usefulness, with almost no opportunity for diagnosis; (b) encouragement of cramming; (c) little basis for comparison between students or classes; (d) encouragement of bluffing; (e) consumption of an excessive share of students' and instructor's time; (f) lack of any known formula for correction for guessing, as in objective examinations; and (g) the restricted range of material that can be tested in a given time. Criticisms of the essay test as used by many English teachers were listed by Stalnaker (157). The first objection was that teachers try to teach the pupil to write charming bits of nonsense on subjects of no interest to him instead of aiding him to express himself clearly and accurately within the range of his interests and abilities. Another weakness was the vagueness of the instructions given to students. The essay test is rarely read with a reliability of over .60, when it should be read with a reliability of at least .90. Explanations offered for this inconsistency in the rating of essay tests or themes were: (a) disagreement among masters of English on what constitutes a good theme, (b) the influence of the reader's physical condition on his grades, (c) the objection to grading a theme high, and (d) the traditional use of optional topics. Kandel (150) offered as his objections to the essay examination the unreliability of scoring and the time involved in the construction and marking of the tests.
Wrightstone (160) objected to the essay test on the basis of (a) time consumption, (b) the narrow range of information tested, and (c) unreliability and subjective variation in grading. One of the real limitations of the essay test in actual practice may be that it does not measure what it is assumed to measure. Doty (146) analyzed 214 different essay test items, with their answers, prepared by teachers in the fifth and sixth grades and found that only twelve of these items, less than 6 percent, unquestionably measured something more than recall. Doty set up a number of criteria for determining whether the answers involved a significant amount of reorganization of knowledge,