The present study employed the Rasch model to investigate Differential Item Functioning (DIF) in the performance of participants on a high-stakes language proficiency test in a foreign-language learning context. A general academic English test consisting of three separate sections, i.e., structure, vocabulary, and reading comprehension, was used. The participants (N = 5,236), screened from a population of 7,355 examinees, came from two general academic backgrounds, i.e., the Humanities (N = 3,585) and Science and Technology (N = 1,651). The DIF analysis showed that 68 percent of the items functioned differentially for the two groups, and the distribution of DIF items across the three subtests was unequal. Because the Science and Technology group outperformed the Humanities students both on the total test and on an item composite constructed solely of DIF-free items, it may be concluded that the presence of a large number of DIF items on the test did not bias the total test against either group. Whereas the majority of the DIF items in the grammar section favored the Science and Technology students, the DIF items in the reading and vocabulary sections mostly favored the Humanities group. Thus, although a subtest may be biased against a particular group, the distribution of DIF items across the total test may be such that the total test itself is not biased. It is therefore suggested that both the total test and its subtests be examined in bias analysis.
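For orientation, the Rasch-based DIF approach summarized above can be sketched in standard notation (the symbols below are conventional illustrations, not taken from the study itself): the model expresses the probability of a correct response as a function of the difference between person ability and item difficulty, and DIF for an item is typically quantified as the contrast between its difficulty estimates calibrated separately in the two groups.

```latex
% Rasch model: probability that person n answers item i correctly,
% where \theta_n is person ability and b_i is item difficulty
P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{e^{\theta_n - b_i}}{1 + e^{\theta_n - b_i}}

% Illustrative DIF contrast for item i: the difference between the
% item's difficulty estimates in the two groups (here, Humanities
% vs. Science and Technology); a nonzero contrast beyond sampling
% error indicates the item functions differentially
\mathrm{DIF}_i = b_i^{(\mathrm{Hum})} - b_i^{(\mathrm{SciTech})}
```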
II. LITERATURE REVIEW