Teacher development and assessment literacy

Appendix A:

Foreign Language Assessment Literacy Test - Preliminary Item Screening

Part 1:
Part 2:
Part 3: Test Interpretation
Part 4: Assessment Ethics

PART III. Test Interpretation

(A) Open questions

60. The following statement contains a statistical error. Identify the error and provide a better rewording:
A recent survey of the TOEIC® scores for male and female university students showed a
correlation of 0.75 between the gender of the student and their TOEIC® scores.

Hint: Think about the types of variables involved in this comparison.

Level A Level B Level C
61-62. Four possible reasons for an observed correlation between two variables are listed below.
Think of at least two more possible reasons that would account for an observed correlation.

Possibility 1: X influences Y.
Possibility 2: X and Y are both influenced by Z.
Possibility 3: A confounding factor is present.
Possibility 4: Random chance is at play.
Possibility 5:
Possibility 6:
Level A Level B Level C
63-64. Mention two accommodations that might compromise the validity of a high-stakes EFL reading/writing test.
Accommodation 1:
Accommodation 2:
Level A Level B Level C
65. Cite one example of an ecological fallacy from any recently published testing article
and mention what claim should have actually been made in that article.

Level A Level B Level C
66. Read the article from the source cited below and mention
at least three ways in which the information is presented incorrectly.

Source: ELT News. (2004, July). Japan's English Lagging Behind E. Asian Neighbors.
Available online at http://www.eltnews.com/news/archives/2004_07.shtml. [24 Aug. 2006].
Level A Level B Level C

The table below compares the writing of an experimental group of 50 Taiwanese EFL students who received one semester of CAI instruction with that of a control group of 50 students who received "traditional instruction". The respondents wrote a narrative essay, which was marked according to a list of pre-set grammatical criteria. The null hypothesis for this study was that no significant difference existed between the experimental and control groups at the .05 level.

Table 6 One-Way ANOVA for Overall Error Rates
                 Sum of Squares   df   Mean Square   F      Sig.
Between Groups   .005             1    .005          .736   .393
Within Groups    .722             98   .007
Total            .727             99

Source: Chen, L. L. (2006, June). The effect of the use of L1 in a multimedia tutorial on grammar learning:
An error analysis of Taiwanese beginning EFL learners' English essay. Asian EFL Journal, 8(2), Article 4.
Retrieved on August 24, 2006 from http://www.asian-efl-journal.com/June_06_llc.php.
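
For readers who want to see how the columns of a one-way ANOVA summary table relate to one another, here is a minimal Python sketch. It is not part of the original test or of Chen's analysis. The function name `anova_from_summary` and its dictionary keys are hypothetical, and the inputs are the rounded figures printed in Table 6, so recomputed values need not match the printed F exactly.

```python
# Illustrative sketch only: shows how mean squares and F follow from the
# sums of squares and degrees of freedom in a one-way ANOVA summary table.
# The function name and dictionary keys are hypothetical (not from the source).

def anova_from_summary(ss_between, df_between, ss_within, df_within):
    """Return derived quantities for a one-way ANOVA summary table."""
    ms_between = ss_between / df_between   # mean square = SS / df
    ms_within = ss_within / df_within
    return {
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f": ms_between / ms_within,        # F = MS between / MS within
        "ss_total": ss_between + ss_within,  # totals are simple sums
        "df_total": df_between + df_within,
    }

# Inputs are the values as printed in Table 6 (rounded to three decimals).
result = anova_from_summary(0.005, 1, 0.722, 98)
for name, value in result.items():
    print(f"{name:>10}: {value:.3f}")
```

Because the inputs are rounded to three decimal places, small differences between recomputed and printed values are to be expected and are not in themselves evidence of an error.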

67. Based on the available information, what statements can be made about
these two groups?
Level A Level B Level C
68. Do you have any concerns about the statistics or information in the table?
Level A Level B Level C
69. Read the article online, then mention at least two concerns about the overall study.
Concern 1:
Concern 2:
Level A Level B Level C

[ p. 66 ]

70. A teacher is interested in conducting Rasch research. He writes a paper comparing the multiple-choice reading scores of 62 students from three different populations within the same university: one group of 33, another group of 21, and a third group of 8. Analyzing their test performance in terms of chi-square values, location order, and fit statistics, he notes how these three groups of students perform differently on a 30-item MC test.

Mention at least one problem with this study.

Level A Level B Level C

71. Here is one multiple-choice TOEIC® Part 3 listening item.
The task is to select the best response (A-D) for the question.
Man: This job is the worst, isn't it?
Woman: I know I shouldn't complain, but things could be better.
Man: I don't think so. I've never had a more interesting job than this one.
Q: What does the man think of the job?

Source: Longman Preparation Series for the TOEIC Test - Introductory Course,
(3rd. Edition, 2005, Pearson Education.) (p. 71 of text and p. 205 of tape script)

Please write at least two points about this test item that are problematic.
Level A Level B Level C


72. Here is a multiple-choice item from an entrance exam.
The task is to select the correct accent of the two underlined words.

A group of international journalists published (a) photographs of whales in a magazine which discusses (b) ecological changes.
(1) (a) phótographs (b) ecólogical
(2) (a) phótographs (b) ecológical
(3) (a) photógraphs (b) ecólogical
(4) (a) photógraphs (b) ecológical

Source: 2006 Japanese University Center English Examination, Section 1-A, Version 2110.

What construct, if any, is this test item measuring?

Level A Level B Level C

Here is a table comparing how 47 Japanese university students performed on two different listening tests.
Notice the way the table is constructed and the way the data appear, then complete the tasks below.

Table 1: Subjects' listening scores
Test     No. of Items   Average
TOEFL    50             17.96 (35.91%)
IELTS    40             13.32 (33.32%)
TOEFL    90             31.28 (34.75%)

Source: Squires, T. (2004, Spring). Classroom implications of authentic listening tasks
on standardized tests of English proficiency. On CUE, (12), 1, 10.

Mention at least two problematic points in the information presented in the table above,
and also two additional statistics that should probably be included with these data.

73. Problematic Issue #1:
Problematic Issue #2:

74. Missing Statistic #1:
Missing Statistic #2:

75. Based on the data presented here, what sort of interpretations can be made about how closely
the three tests in the table above correlate for this survey group?



[ p. 67 ]