Student Evaluation of Teachers: Professional Practice or Punitive Policy? (continued)
A self-fulfilling prophecy
A concern to address in the use of student evaluations is the impact
the act of evaluation has on the students' perceptions of the teachers
and on the teachers themselves.
There are biases in evaluating a person's personality, performance
and competence – biases that can lead to flawed information-gathering
strategies that are self-fulfilling (Harris, 1994). A
self-fulfilling prophecy, as defined by Merton (1948),
means that an incorrect perception, belief or definition of a
set of circumstances can evoke behaviour that makes the incorrect
perception or belief come true.
In composing the SETEs, administrators bring their own
expectations about the teachers to the procedure. These expectations
profoundly affect the way they design the SETEs and the
information-gathering strategies they use.
In clinical psychology, in the study of interpersonal expectancy
effects or behavioural confirmation, the problem of making an incorrect
diagnosis supported by a presumptive questioning strategy is a serious
ethical issue that remains a central focus. Observers, no matter how
well trained and how ethical, will carry out their evaluations based
on incorrect hypotheses.
Snyder and Swann (1978), in a classic study, gave subjects
a list (personality profile) describing either an extroverted personality or
an introverted personality and then asked them to choose 12 questions from a
longer list that would best allow them to test the hypothesis for the profile
they received for a target person. Analysis demonstrated a heavy emphasis on
hypothesis-confirming questions.
The process of question selection and the process of applying those
questions to the evaluation of a person's behaviour are difficult for
well-trained clinicians to perform objectively – the situation of
untrained students, administrators and teachers is even more problematic.
When an administration or administrator has decided that teachers fit
certain stereotypes or engage in certain types of behaviour – negative or
constructive – the administrator will select hypothesis-confirming questions
for the students to answer.
For example, students are asked whether the teacher is humorous, whether they
like the teacher, whether the teacher stimulates or encourages them, whether the
teacher is enthusiastic and dynamic – an entire battery of subjective parameters
appears on SETEs, leading the students to believe that the teacher must conform
to certain, possibly irrelevant, behavioural parameters that actually have a
different appeal to each individual student.
When a student answers objective and subjective questions, what will the
student rely on: what they feel confident they can answer, or what they
are unsure about?
The nature of objective questions presents certain problems. How can a
student know whether a teacher is well prepared – how do they assess
preparedness? How can a student evaluate a teacher's expertise in their
field – if they know so much about the field, why are they the student?
Yet students will give answers to these types of questions, which shows that
even when they do not have a defensible point of view, they will give an
opinion. This is not the way to solicit informed opinions.
Additionally, it is not the students' opinions that have necessarily been
solicited; they will be answering someone else's questions without having
given the matter any thought until the point in time when they are supposed
to 'evaluate' the teacher.
The administrators' perceptions of the teachers can also profoundly affect
the teachers' perceptions of their own effectiveness. Teachers who are told
that they are teaching poorly because they don't satisfy the parameters
the students are asked to rate on the SETEs may in fact be teaching at a
competent level, but the effect of the tainted SETEs can be amplified when
the administration insists that they are accurate and show the teacher to
be less than competent.
Running through all of this is the underlying belief that the process of
education is predominantly, if not solely, the burden of the teacher. The
assumption that the teacher is primarily responsible completely colours the
students' attitude and the evaluation designer's intent. In this scenario, there
is no room for a well-rounded evaluation of the students, the management, the
facility, the social pressures and inhibitions – a long list of variables
is ignored.
In real classrooms
Students' subjective opinions can be so varied that the overall results
are untrustworthy. Students who are specifically shown that certain SETE
parameters have been fulfilled may still evaluate related criteria ambivalently.
Students may pointedly refer to a teacher's physical characteristics or manner
in very negative or positive terms and judge the teacher on the basis of these
characteristics – as if teachers who are not aesthetically acceptable are
rendered less capable of teaching.
The entire process of SETEs becomes a convenient matter of picking and
choosing what serves to comply with the original hypothesis of the SETE
designer/administrator rather than actually engaging in an honest evaluation.
This means the evaluation is rather like a shopping list of potentially
conforming characteristics that further the administrators' personal biases.
A proposed paradigm
The following steps, adapted from Arnoult and Anderson (1988), offer a better
paradigm for the evaluation of teacher effectiveness in the academic environment
so as to reduce an evaluator's biases: (a) gather as much evidence as possible, (b)
employ multiple evaluators who have different viewpoints and interests, (c)
vary the observational circumstances to provide for different emphases in
the environment, (d) review videotapes for greater accuracy, (e) compare
the criteria on balance sheets to establish evidence for and against an
evaluation, and (f) solicit an explanation of the results and the subsequent
conclusions made by evaluators to reveal gaps in reasoning. This paradigm
constitutes constructive advice for the evaluations we make of others in a
This type of evaluation is an example of a structured attempt at
measuring professional competence with regard for the various facets of
the evaluating process. It is primarily designed to inform the teachers
rather than to judge them – a philosophy that serves better to encourage
improvement than to punish.
Arnoult, L., & Anderson, C. A. (1988). Identifying and reducing causal reasoning
biases in clinical practice. In D. C. Turk & P. Salovey (Eds.), Reasoning,
inference, and judgment in clinical psychology (pp. 209-232). New York:
The Free Press.
Basow, S. A. (1995). Student evaluations of college professors:
When gender matters. Journal of Educational Psychology, 87, 656-665.
Darley, J. M., Fleming, J. H., Hilton, J. L., & Swann, W. B. (1988).
Dispelling negative expectancies: The impact of interactional goals and
target practices on the expectancy of the confirmation process.
Journal of Experimental Social Psychology, 24, 19-36.
Feldman, K. A. (1978). Course characteristics and college students'
ratings of their teachers: What we know and what we don't.
Research in Higher Education, 9, 199-242.
Feldman, K. A. (1984). Class size and college students' evaluations of
teachers and courses: A closer look. Research in Higher Education,
Harris, M. J. (1993). Information gathering strategies in social perception.
Unpublished manuscript, University of Kentucky, Lexington. Cited in Harris, 1994.
Harris, M. J. (1994). Self-fulfilling prophecies in the clinical context:
Review and implications for clinical practice. Applied and Preventive
Psychology, 3(3), 145-158.
Kayne, N. T., & Alloy, L. B. (1988). Clinician and patient as aberrant
actuaries: Expectation-based distortions in assessment of covariation.
In L. Y. Abramson (Ed.), Social cognition and clinical psychology:
A synthesis (pp. 295-365). New York: Guilford Press.
Kishor, N. (1995). The effect of implicit theories on raters' inference
in performance judgement: consequences for the validity of student ratings
of instruction. Research in Higher Education, 36(2), 177-195.
Marsh, H. W., & Dunkin, M. J. (1992). Students' evaluations of university
teaching: A multidimensional perspective. In J. C. Smart (Ed.),
Higher education: Handbook of theory and research
(Vol. 8, pp. 143-233). New York: Agathon Press.
Merton, R. K. (1948). The self-fulfilling prophecy. Antioch Review,
Nielsen, R. S. (1993). The impact of the 1985 reform legislation on the
formative evaluation practices of one central Illinois school district.
Doctoral Dissertation, University of Illinois at Urbana-Champaign (in Harris,
O'Connell, D. Q., & Dickinson, D. J. (1993). Student ratings of instruction as
a function of testing conditions and perceptions of amount learned.
Journal of Research and Development in Education, 27(1), 18-23.
Sackett, P. R. (1982). The interviewer as hypothesis tester: The effects of
impressions of an applicant on interviewer questioning strategy.
Personnel Psychology, 35, 789-804.
Seldin, P. (1993, July 21). The use and abuse of student ratings of professors.
The Chronicle of Higher Education, p. A40.
Shiozawa, T. (1995). The change of the Monbusho guidelines and their
impact on language education. Paper. JALT 95, Nagoya, Japan. Reprinted
in PALE Newsletter (1996), 2, 1.
Smith, M. L., & Glass, G. V. (1980). Meta-analysis of research on class size
and its relationship to attitudes and instruction. American Educational Research
Journal, 17, 419-433.
Snyder, M., & Campbell, B. (1980). Testing hypotheses about other people: The
role of the hypothesis. Personality and Social Psychology Bulletin,
Snyder, M., & Swann, W. B. (1978). Hypothesis-testing processes in social
interaction. Journal of Personality and Social Psychology, 36, 1202-1212.
Snyder, M., & Thomsen, C. J. (1988). Interactions between therapists and clients:
Hypothesis testing and behavioural confirmation. In D. C. Turk & P. Salovey (Eds.),
Reasoning, inference, and judgment in clinical psychology. New York:
The Free Press.
Stedman, C. H. (1983). The reliability of teaching effectiveness rating scale for
assessing faculty performance. Tennessee Education, 12(3), 25-32.
Sugeno, K. (1992). Japanese Labour Law (Leo Kanowitz, Trans.).
Tokyo: University of Tokyo Press.
Swann, W. B., Jr., & Ely, R. J. (1984). A battle of wills: Self-verification
versus behavioural confirmation. Journal of Personality and Social
Psychology, 46, 1287-1302.
Swann, W. B., Jr., & Giuliano, T. (1987). Confirmatory search strategies in
social interaction: How, when, why, and with what consequences. Journal of
Social and Clinical Psychology, 5, 511-524.
Tagomori, H. T. (1993). A content analysis of instruments used for student
evaluation of faculty in schools of education at universities and colleges
accredited by the national council for accreditation of teacher education.
Unpublished Ed. Doctorate dissertation. University of San Francisco.
Turk, D. C., & Salovey, P. (Eds.). (1988). Reasoning, inference, and judgment in
clinical psychology. New York: The Free Press.
Wigington, H., Tollefson, N., & Rodriguez, E. (1989). Students' ratings of
instructors revisited: Interactions among class and instructor variables.
Research in Higher Education, 30(3), 331-344.
Whitten, B. J., & Umble, M. M. (1980). The relationship of class size, class
level and core vs. non-core classification for class to student ratings of
faculty: Implications for validity. Educational and Psychological
Measurement, 40, 419-423.