Given the widespread use of student evaluations of teaching in tenure and promotion decisions, it is important to be aware of possible bias in the evaluations. By one definition, bias exists when a teacher or course characteristic affects teacher evaluations, either positively or negatively, but is unrelated to criteria of good teaching, such as increased student learning. Class size, for example, affects evaluations in that teachers of classes with fewer than 15 students receive higher evaluations. But if students learn more in small classes than they do in large classes, perhaps because small classes allow for more personal attention, then class size is not truly biasing the evaluations; rather, the evaluations are reflecting increased learning. By a second, more general definition, bias exists when a known characteristic of students systematically affects their ratings of teachers. The gender of the student, particularly how it interacts with the gender of the teacher, is an example of this possible bias in student evaluations. Do male students tend to rate women teachers lower than men teachers because of a gender bias, especially in male-dominated fields such as the natural sciences? Do female students judge women teachers to be more effective than men because they feel more comfortable with them? These are important questions that directly affect the validity of the evaluations when used for personnel decisions. Ideally, student evaluations should be related to what students learn from a teacher and not to gender or to other personal characteristics of the teacher (e.g., age, ethnicity). Studies that have investigated gender bias have thus far produced conflicting results. Some studies have found no (or extremely small) differences between the evaluations of female and male instructors on the basis of student gender alone (Basow & Distenfeld, 1985; Basow & Howe, 1987; Bennett, 1982; Elmore & LaPointe, 1974; Harris, 1975; Kaschak, 1981).
Other studies reported gender bias, with male students rating female instructors lower than male instructors (Basow & Silberg, 1987; Etaugh & Riley, 1983; Kaschak, 1978; Lombardo & Tocci, 1979; Paludi & Bauer, 1983). Two studies conducted in actual classrooms did not report gender bias in overall evaluations. Bennett (1982) used a course evaluation questionnaire that included teaching performance ratings, perceptual orientation scales, and indicators of the degree and context of student-instructor interaction. Her data included the evaluations of 11 female and 28 male instructors by 253 students enrolled in nonscience introductory courses at a liberal arts college. Female and male students did not differentiate between female and male faculty members. Although there was no evidence of direct bias in formal student evaluations of instructors, there was evidence of gender-related differences in regard to student-instructor relationships and instructor warmth, support, and accessibility. Elmore and LaPointe (1974) found no interaction between faculty gender and student gender in their analysis of 38 pairs of courses (paired on the basis of course number and gender of instructor) evaluated by 1,259 students. The courses were selected from a variety of departments and colleges within one research university. No attempt was made, however, to compare results across disciplines or to compare results for students in the same classes. Research that reported gender bias included Basow and Silberg's (1987) study of 16 pairs of instructors (paired on rank, discipline, and years of experience), in which male students rated female instructors less favorably than male instructors. Kaschak (1978) and Lombardo and Tocci (1979) found similar results in simulated rather than actual classroom settings. In these studies, female students saw no difference in the effectiveness of male and female teachers.
Feldman conducted two reviews of students' views of male and female college teachers. …