Confidence intervals, limits, and levels?
James Dean Brown
University of Hawai'i at Mānoa
One simple way to look at the mean of a set of scores is to think of it as a sample-based estimate of the mean of the population from which the sample was drawn. Since that estimate is never perfect, it is reasonable to want to know how much error there may be in it. The magnitude of this error can be estimated using the standard error of the mean (seM), which is the standard deviation of the scores divided by the square root of the number of scores; that is, seM = S / √N, where S is the standard deviation and N is the number of scores in the sample.
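As a rough illustration of the standard error of the mean, the short Python sketch below divides the standard deviation of a set of scores by the square root of the number of scores. The function name and the five scores are hypothetical, invented purely for illustration, and the sketch assumes the sample standard deviation (N - 1 in the denominator); using N instead would give a slightly smaller value.

```python
import math
import statistics


def standard_error_of_mean(scores):
    """Return S / sqrt(N) for a list of scores."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are needed")
    # Assumption: S is the sample standard deviation (N - 1 denominator).
    s = statistics.stdev(scores)
    return s / math.sqrt(n)


# Hypothetical scores, not taken from any real test administration.
scores = [40, 45, 50, 55, 60]
se_m = standard_error_of_mean(scores)
print(f"mean = {statistics.mean(scores)}, seM = {se_m:.2f}")
```

The larger the sample, the smaller the seM becomes for the same standard deviation, which is why means based on more examinees are treated as more trustworthy estimates of the population mean.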
The confidence intervals, limits, and levels that you asked about in your question all have to do with the next step after you have calculated the standard error: interpreting it. In order to do so, we need to understand the differences among confidence intervals, limits, and levels so that we can think, talk, and write clearly about our interpretations of standard errors.
The standard error of estimate (see) calculated in the example above turned out to be 5.72, which can be used to further estimate confidence intervals (CIs) that indicate how many score points of variation can reasonably be expected with 68%, 95%, or 99% probability around any given predicted Test Y score in a regression analysis. Test users need to know that the actual Test Y score for any examinee is likely to fall within plus or minus one see of the Test Y score predicted from Test X 68% of the time. Let's say a student's predicted Test Y score is 50; that student (or any student with that same predicted score) has a 68% probability of actually getting a score between 44.28 and 55.72 (50 - 5.72 = 44.28; 50 + 5.72 = 55.72) by chance alone. Similarly, any examinee with a predicted score of 50 is likely to fall within plus or minus two sees (5.72 + 5.72 = 11.44; 50 - 11.44 = 38.56; 50 + 11.44 = 61.44), for a band from 38.56 to 61.44, 95% of the time by chance alone. And finally, any examinee with a predicted score of 50 is likely to fall within plus or minus three sees (3 x 5.72 = 17.16; 50 - 17.16 = 32.84; 50 + 17.16 = 67.16), for a band from 32.84 to 67.16, 99% of the time.
In practical terms, language testers may want to use this information to examine the degree to which the prediction is accurate (e.g., the see of 5.72 in the example here does not seem to indicate a terribly accurate prediction, and a glance at the correlation coefficient of .80 above further supports this conclusion). They may also want to make their predictions fairer by at least taking into account the fact that examinees' actual scores on Test Y would likely fall within the band of plus or minus one see of the predicted score. Whether the tester chooses to use the 68%, 95%, or 99% confidence level is once again a judgment call.
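The band arithmetic described above can be sketched in a few lines of Python. The see of 5.72 and the predicted score of 50 come from the example in the text; the function name and the mapping of one, two, and three sees to confidence levels are simply a restatement of the rounded rule-of-thumb percentages used in this column, not exact normal-curve values.

```python
def confidence_band(predicted, see, n_sees=1):
    """Return the (lower, upper) limits of a band of n_sees standard errors."""
    margin = n_sees * see
    return (round(predicted - margin, 2), round(predicted + margin, 2))


SEE = 5.72      # standard error of estimate from the example in the text
PREDICTED = 50  # predicted Test Y score from the example in the text

# Rounded confidence levels as used in the text (an approximation).
LEVELS = {1: "68%", 2: "95%", 3: "99%"}

for n_sees, level in LEVELS.items():
    lower, upper = confidence_band(PREDICTED, SEE, n_sees)
    print(f"{level} band: {lower:.2f} to {upper:.2f}")
```

The two numbers returned for each band are the confidence limits, the distance between them is the confidence interval, and the percentage attached to the band is the confidence level.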
Where to Submit Questions:
Please submit questions for this column to the following e-mail or snail-mail addresses:
Department of ESL, UHM
1890 East-West Road
Honolulu, HI 96822 USA