Teaching competence is not easily documented. The university teacher is one of the few professionals whose work is seldom observed by his peers. His reputation is often based more on hearsay than on substantive evidence. For this reason, promotion committees frequently make final tenure decisions without seriously considering information about teaching. To counter this tendency, many campuses have recently instituted a system of student instructional ratings. Unfortunately, this approach has little more credibility with faculty than the hearsay system. Even though students are the only people who observe a teacher's course daily, many faculty seriously question the reliability and validity of student instructional ratings. My present purpose is to analyze these issues in the light of recent research developments and to indicate what implications this work may have for the faculty evaluation process.
Although there are several acceptable procedures for assessing the reliability of a measuring instrument, the most meaningful index of reliability for student rating data is the extent to which students agree about the strengths and weaknesses of different teachers. In my Science report of October 5, 1973, students in two multisection calculus courses rated 13 different instructors, and I examined the ability of students to discriminate among teachers on several different traits. When asked to comment on the clarity of the teacher's presentations, students were consistent in selecting the same teachers as the best and worst instructors on this trait. Statistical analysis of this outcome indicated that a difference as large as that observed would be expected on a chance basis less than one time in a million. When students were asked to comment on the work load in their calculus classes, the differences in their mean ratings for each instructor were also extraordinarily reliable. These results suggest that students can clearly discriminate among the performances of different instructors.
To analyze this reliability issue in more detail, I re-examined some data from an investigation that I had previously reported in the Spring 1973 issue of the American Educational Research Journal. Ratings from 72 students who had enrolled in both biology and organic chemistry were analyzed. The students had been asked to respond on a seven-point rating scale to the item "The teacher explained concepts clearly and made good use of examples and illustrations." Their responses were analyzed with a two-way analysis of variance (2 instructors x 72 raters). The point of interest was whether variation among the ratings could be attributed to differences among students or to differences between teachers. My analysis indicated that 44 percent of the variance could be attributed to differences between teachers, 5 percent was attributable to differences among student raters in their tendencies to use high or low ratings in general, and 51 percent was attributable to differences among students in the way they perceived the two teachers. This result clearly indicates that student ratings provide considerable information about the teacher. It is a rare event when an independent variable in a behavioral science experiment accounts for more than 10 percent of the total variance. The fact that 44 percent of the variance in my ratings was related to instructors argues strongly that students can function as reliable observers.
This demonstration that students can make consistent ratings does not necessarily mean, however, that the ratings are valid in the sense of indicating which instructors are best. This validity issue is more difficult to analyze than the reliability issue. Some faculty insist that their real goal in teaching is to impart philosophical values or to inculcate a special attitude toward learning rather than simply to help the student master the subject matter.
Intangible outcomes such as these are not easily measured, and therefore it is difficult to determine whether teachers who successfully accomplish these goals are also the ones who are rated highly by their students. Only if we define teaching in a relatively narrow way, such as preparing students to solve calculus problems on a written examination, is it feasible to determine whether some teachers do a more adequate job than others. In Science, I recently reported evidence indicating that the teachers who were best at preparing their students for a calculus exam were also the ones who received the highest student ratings. Since there is contradictory evidence on this point, let us consider this research and its implications in more detail.
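As an aside on the two-way analysis of variance described earlier (instructors x raters), the following is a minimal sketch of how rating variation can be partitioned into a teacher component, a rater component, and a rater-by-teacher component. It rests on assumptions not found in the source: "percent of variance" is taken here to mean each component's share of the total sum of squares (the source does not state which estimator was used), the function name `variance_shares` is hypothetical, and the ratings are invented for illustration. It does not reproduce the 44, 5, and 51 percent figures.

```python
# Two-way layout without replication: each rater scores each teacher once.
# Rows are raters, columns are teachers.

def variance_shares(ratings):
    """Split the total sum of squares into teacher, rater, and
    rater-by-teacher (interaction) parts, each returned as a
    proportion of the total sum of squares."""
    n_raters = len(ratings)
    n_teachers = len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n_raters * n_teachers)

    # Mean rating given by each rater, and mean rating received by each teacher.
    rater_means = [sum(row) / n_teachers for row in ratings]
    teacher_means = [
        sum(row[j] for row in ratings) / n_raters for j in range(n_teachers)
    ]

    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_teachers = n_raters * sum((m - grand_mean) ** 2 for m in teacher_means)
    ss_raters = n_teachers * sum((m - grand_mean) ** 2 for m in rater_means)
    # With one rating per cell, what remains is the rater-by-teacher
    # interaction, which cannot be separated from measurement error.
    ss_interaction = ss_total - ss_teachers - ss_raters

    return {
        "teachers": ss_teachers / ss_total,
        "raters": ss_raters / ss_total,
        "interaction": ss_interaction / ss_total,
    }


# Invented seven-point ratings: 6 raters, 2 teachers (not the study's data).
hypothetical_ratings = [
    [6, 3],
    [7, 4],
    [5, 5],
    [6, 2],
    [4, 4],
    [7, 3],
]
shares = variance_shares(hypothetical_ratings)
```

In this framing, a large "teachers" share corresponds to raters agreeing about which teacher is better, a large "raters" share corresponds to some raters being generally lenient or severe, and the "interaction" share corresponds to raters disagreeing about the relative standing of the teachers.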