Some new proposals and responses in ascertaining the reliability
and validity of Japanese university entrance exams
Michael Guest (University of Miyazaki)
One can easily find criticisms of Japan's university entrance English exams. Claims of a lack of
reliability and/or validity are legion, leading to a widespread view that poorly designed or ill-considered university entrance exams are
to blame for outdated and unproductive pedagogical practices in high schools (McVeigh, 2001; Gorsuch, 1998; Chujo, 2006). Most foundational
among the critical research is that of Brown and Yamashita (1995), with follow-up research and proposals from Brown (1996, 2000, 2002), Kikuchi
(2006), and Ichige (2006). But could it be that some of these viewpoints and interpretations are based upon notions of validity and reliability
that do not do justice to the parameters surrounding university entrance exams in Japan? And could some of these criticisms have failed
to note the bigger picture? Are some out of date, missing the point, or even contradictory?
[ p. 7 ]
[ p. 8 ]
[ p. 9 ]
[ p. 10 ]
[ p. 11 ]
A focus upon individual lexical items or specific grammar patterns tends to fall into this category. If a test is very specific in its goals and has a clear, set criterion, then one might be able to justify more discrete-item questions, and the specific skills measured by a very narrowly defined criterion would render an integrative approach invalid. However, a multiple-choice item need not also be a discrete-point item (Oller, 1979), and the two should not be conflated, as they seem to be in Brown and Yamashita (p. 9). Therefore, it behooves more generalized tests, such as both Japanese university entrance exams, to have more integrative items, as an indicator of validity. Also, an integrative approach generally warrants more passage-dependent tasks so as to allow the examinee to display a more holistic comprehension of a complete passage (as opposed to a narrow item within that passage). Therefore, criticisms that more passage-dependent items somehow threaten test validity and reliability (Brown and Yamashita, p. 28) would seem to contradict a call for a more integrative approach.
[ p. 12 ]