Testlet Response Model for IRT True Score Equating

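This study addresses several issues related to item response theory (IRT) true score equating as applied to tests composed of testlets. Because testlet-composed tests often violate local independence, a basic assumption of IRT, standard equating methods that rest on this assumption can produce biased equating results (Lee, Kolen, Frisbie, & Ankenmann, 2001; Li, Bolt, & Fu, 2005). In this study, response data for testlet-composed tests were generated by simulation. Responses of 1,500 examinees were generated for the new test form and for the old test form to which it was equated; each test consisted of 7 testlets with 6 items per testlet, for a total of 42 items. The results showed that the graded response model and testlet response model true score equating methods produced results closer to the true equating results than did the three-parameter logistic true score equating method. For testlet-composed tests, dichotomous item response models such as the three-parameter logistic model violate the IRT assumptions, so relatively larger bias is expected than with the graded response model or the testlet response model. The graded response model produced equating results for testlet-composed tests similar to those of the testlet response model, and is therefore expected to be usable as an alternative in practice. Analysis of the equating results for testlet-composed tests showed that a large share of the equating error was attributable to bias arising from the choice of item response model, while the random error arising from estimation was relatively small.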

The present study addressed several issues in item response theory (IRT) true score equating for testlet-composed tests (e.g., reading comprehension tests). Because the local independence assumption fundamental to IRT is often violated in testlet-composed tests, standard IRT equating methods based on that assumption can produce biased equating relationships (Lee, Kolen, Frisbie, & Ankenmann, 2001; Li, Bolt, & Fu, 2005). Response data sets for testlet-composed tests were simulated: 50 data sets of 1,500 examinees were generated for both the old and new test forms, each composed of seven testlets with six items per testlet (42 items in total). The graded response model (GRM) and testlet response model (TRM) true score equating methods provided equating relationships that were more similar to the true equating equivalents than did the three-parameter logistic (3PL) true score equating method. Because the IRT assumption underlying dichotomous item response models is often violated in tests composed of testlets, a larger bias would be expected for the 3PL method than for the GRM and TRM methods. The GRM and TRM methods could therefore be considered for equating scores on tests composed of testlets. Finally, the total error in equating was driven mainly by the bias component associated with the choice of IRT model rather than by the random estimation error component.
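
The simulation design and the equating step described above can be illustrated with a minimal sketch. It is not the study's actual procedure: the abstract does not report the generating model or parameter values, so the sketch assumes a 3PL testlet model of the form P = c + (1 - c) / (1 + exp(-a(theta - b - gamma))), where gamma is a person-specific testlet effect, and draws all item parameters from arbitrary illustrative distributions. The function names `simulate_testlet_responses` and `true_score_equate` are hypothetical. The equating function shows only the standard 3PL true score equating idea (invert the test characteristic curve of one form, then evaluate the other form's curve at the same ability), assuming both forms' parameters are already on a common scale.

```python
import numpy as np


def simulate_testlet_responses(n_examinees=1500, n_testlets=7,
                               items_per_testlet=6, testlet_sd=1.0, seed=0):
    """Simulate 0/1 responses under an ASSUMED 3PL testlet model.

    Sample size and test structure follow the abstract; the item parameter
    distributions and testlet_sd are illustrative assumptions only.
    """
    rng = np.random.default_rng(seed)
    n_items = n_testlets * items_per_testlet
    a = rng.lognormal(0.0, 0.3, n_items)      # discrimination (assumed)
    b = rng.normal(0.0, 1.0, n_items)         # difficulty (assumed)
    c = rng.uniform(0.10, 0.25, n_items)      # lower asymptote (assumed)
    testlet_of_item = np.repeat(np.arange(n_testlets), items_per_testlet)

    theta = rng.normal(0.0, 1.0, n_examinees)                  # ability
    gamma = rng.normal(0.0, testlet_sd, (n_examinees, n_testlets))

    # The shared testlet effect induces local dependence within a testlet.
    z = a * (theta[:, None] - b - gamma[:, testlet_of_item])
    p = c + (1.0 - c) / (1.0 + np.exp(-z))
    responses = (rng.random((n_examinees, n_items)) < p).astype(int)
    return responses, (a, b, c)


def tcc(theta, a, b, c):
    """Test characteristic curve: expected number-correct score at theta."""
    return float(np.sum(c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))))


def true_score_equate(score_new, params_new, params_old,
                      lo=-10.0, hi=10.0, tol=1e-8):
    """Map a true score on the new form to its old-form equivalent.

    Only scores strictly between sum(c) and the number of items can be
    inverted, because the 3PL curve never drops below sum(c).
    """
    a, b, c = params_new
    if not (np.sum(c) < score_new < len(a)):
        raise ValueError("score is outside the invertible true score range")
    # Bisection works because the curve increases monotonically in theta.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tcc(mid, a, b, c) < score_new:
            lo = mid
        else:
            hi = mid
    theta_hat = 0.5 * (lo + hi)
    return tcc(theta_hat, *params_old)
```

Calling `simulate_testlet_responses()` once per form (with different seeds) and repeating over replications would mimic the general shape of the design; note that the equating function above ignores the testlet effect, which is the simplification the abstract identifies as a source of bias.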
