Optimizing scoring formulas for yes/no vocabulary tests with linear models

Article appearing in Shiken 16.2 (Nov 2012) pp. 2-7.

Authors: Raymond Stubbe¹ & Jeffrey Stewart²
1. Kyushu Sangyo University
2. Kyushu Sangyo University, Cardiff University

Yes/No tests offer an expedient method of testing learners' vocabulary knowledge. A drawback is that, because the format relies on self-report, actual knowledge cannot be confirmed. "Pseudowords" have been included in such tests to check whether learners report knowing words they cannot possibly know, but it is unclear how to use this information to adjust scores. Although a variety of scoring formulas have been proposed in the literature, empirical research (e.g., Mochida & Harrington, 2006) has found little evidence of their efficacy.

The authors propose that a standard least squares model (multiple regression), in which the counts of words reported known and counts of pseudowords reported known are added as separate predictor variables, can be used to generate scoring formulas that have substantially higher predictive power. This is demonstrated on pilot data, and limitations of the method and goals of future research are discussed.
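As a rough illustration of the kind of model described above, the sketch below fits an ordinary least squares regression with two predictors: the count of real words reported known and the count of pseudowords reported known. It is not the authors' code. The function names, the toy numbers, and the idea of a separate criterion score (for example, a translation test of the same words) are assumptions made for illustration only; the abstract does not specify them.

```python
# Minimal sketch, standard library only. All names and numbers are hypothetical.

def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot fit")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_scoring_formula(real_yes, pseudo_yes, criterion):
    """Return [b0, b1, b2] for criterion ~ b0 + b1*real_yes + b2*pseudo_yes."""
    rows = [[1.0, float(h), float(f)] for h, f in zip(real_yes, pseudo_yes)]
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, criterion)) for i in range(3)]
    return solve_linear_system(xtx, xty)


def predict(coef, real_yes, pseudo_yes):
    """Apply the fitted scoring formula to one learner's counts."""
    b0, b1, b2 = coef
    return b0 + b1 * real_yes + b2 * pseudo_yes


# Toy data, invented for illustration (not from the study): six learners.
real_yes = [30, 42, 25, 38, 45, 33]   # real words reported known
pseudo_yes = [1, 6, 0, 3, 8, 2]       # pseudowords reported known
# Criterion scores generated from a known formula so the fit can be checked.
criterion = [2.0 + 0.8 * h - 1.5 * f for h, f in zip(real_yes, pseudo_yes)]

coef = fit_scoring_formula(real_yes, pseudo_yes, criterion)
print("intercept, real-word weight, pseudoword weight:", coef)
print("predicted score for 40 real / 4 pseudo:", predict(coef, 40, 4))
```

Because the toy criterion is generated from a known formula, the fit should recover weights close to 2.0, 0.8 and -1.5. With real data the weights would be estimated from learners' responses, and a statistics package would normally replace the hand-rolled solver.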
