Academic journal

Analyzing Rater Characteristics Affecting the Reliability of Essay Examinations

(Original Korean title: 논술고사의 신뢰성에 영향을 미치는 채점자 특성 분석)


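  This study applied the many-facet Rasch model to mock college-entrance essay examinations in order to examine how rater severity and rater expertise affect the reliability of essay examinations. Two parallel-form essay examinations were administered, with 647 and 873 examinees respectively, and both were scored by the same 10 professors. Four of the raters were experts and the remaining six were non-experts. The separation reliability index for rater severity showed that significant differences in severity exist among raters. Comparing the two groups, however, the experts showed smaller between-rater differences in severity and better fit. The expert group was also more stable than the non-expert group in severity and fit across the two examinations. These findings mean that improving the reliability of essay examinations requires accounting for differences in severity among raters, and that raising rater expertise must come first.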
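  For orientation, the many-facet Rasch model is commonly written in the form below. This is a sketch of the standard rating-scale parameterization with the three facets of examinee, task, and rater; the record does not state which parameterization the study used, and the symbols are assumptions chosen here for illustration.

```latex
% Assumed notation (not from the source record):
%   P_{nijk}  probability that rater j gives examinee n category k on task i
%   \theta_n  ability of examinee n
%   \delta_i  difficulty of task i
%   \alpha_j  severity of rater j
%   \tau_k    threshold between categories k-1 and k
\[
  \log \frac{P_{nijk}}{P_{nij(k-1)}} = \theta_n - \delta_i - \alpha_j - \tau_k
\]
```

  Under this form, a more severe rater (larger \alpha_j) lowers the log-odds of awarding the higher category to every examinee on every task, which is why severity differences among raters bear directly on score reliability.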

  This paper analyzes the effects of rater severity and rater expertise on the reliability of essay examinations using the many-facet Rasch model (FACETS model). Two essay examinations, administered to different student groups at a university, were used for this study. Both examinations included three tasks with the same format and the same scoring method. There were 647 examinees in the first examination and 873 in the second. Ten raters scored both examinations. Of these, four could be considered experts and six non-experts in terms of their major field and rating experience. The data were analyzed with the FACETS computer program using three facets: examinee, task, and rater. The separation reliability statistics and chi-square tests produced by FACETS showed significant differences in severity among raters. A comparison of experts and non-experts found that experts tended to fit the model better than non-experts, which suggests that experts have higher intra-rater reliability. Experts were also more stable in severity and infit across the two administrations. These results suggest that rater severity and rater expertise should be taken seriously when assessing the reliability of essay scoring. In particular, rater severity can be influenced by rater expertise, which implies that rater training is required to improve the reliability of essay scoring.
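  To make the two rater-level statistics mentioned above concrete, the following minimal Python sketch computes a separation reliability and a fixed-effect ("all same") chi-square from a set of rater severity estimates and their standard errors. The formulas follow the usual textbook definitions and are an assumption about what the FACETS output corresponds to; the function names and the ten severity values are hypothetical and are not the study's data.

```python
from statistics import fmean, pvariance


def separation_reliability(measures, std_errors):
    """Share of the observed variance in the measures that is not estimation error."""
    observed_var = pvariance(measures)
    if observed_var == 0:
        return 0.0
    # Mean squared standard error approximates the error variance.
    error_var = fmean([se ** 2 for se in std_errors])
    true_var = max(observed_var - error_var, 0.0)
    return true_var / observed_var


def fixed_chi_square(measures, std_errors):
    """Homogeneity chi-square testing whether all measures share one common value."""
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    sum_wd = sum(w * d for w, d in zip(weights, measures))
    sum_wd2 = sum(w * d * d for w, d in zip(weights, measures))
    chi_square = sum_wd2 - sum_wd ** 2 / sum_w
    degrees_of_freedom = len(measures) - 1
    return chi_square, degrees_of_freedom


# Hypothetical severity estimates (logits) for ten raters, NOT the study's data.
severity = [-0.60, -0.35, -0.20, -0.05, 0.00, 0.05, 0.15, 0.25, 0.30, 0.45]
standard_error = [0.05] * len(severity)

reliability = separation_reliability(severity, standard_error)
chi_square, df = fixed_chi_square(severity, standard_error)
print(f"separation reliability = {reliability:.3f}")
print(f"fixed chi-square = {chi_square:.1f} on {df} degrees of freedom")
```

  A reliability near 1 together with a large chi-square relative to its degrees of freedom indicates that the raters differ in severity by more than estimation error alone would explain, which is the pattern the abstract reports.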

Ⅰ. Introduction
Ⅱ. Theoretical Background
Ⅲ. Method
Ⅳ. Results
Ⅴ. Conclusion and Discussion
References
About the Authors
〈ABSTRACT〉
