Evaluating Quality in Second Language Performance Tests
- Korea English Language Testing Association
- English Language Assessment
- Vol.1
2007.12, 3 - 19 (17 pages)
Scoring performance tests involves making judgments about what is valued in language samples gathered from a small number of tasks, and summarizing these judgments in a score. We invest the score with a meaning that is interpreted in terms of what a learner can do in the real world. This process assumes that we are able to summarize complex performances in numbers, and that we can generalize that summary to contexts and tasks beyond the sample that generated the numbers. All performance testing contains an implicit validity claim that from scores we can predict communicative success (or failure) in the real world. Much validity research is therefore concerned with the development of rating scales that allow movement from performance to score, and from score to inference. In this paper we outline some of the major scales that have been used in performance testing, and a number of approaches to scale development. We then identify a range of problems that language testers still face and research questions that remain to be addressed. It is argued that scoring and interpreting performance tests is a much more complex process than is often imagined.
I. INTRODUCTION
II. PERFORMANCE TESTING AND THE REAL WORLD
III. EARLY RATING SCALES
IV. SUBSEQUENT DEVELOPMENTS IN THE UNITED STATES
V. OTHER APPROACHES TO SCALE DEVELOPMENT
VI. TWO INSTITUTIONALISED SYSTEMS
VII. RECENT DEVELOPMENTS IN THE UNITED STATES
VIII. PROBLEMS WITH EVALUATING LANGUAGE IN PERFORMANCE TESTS
IX. CONCLUSION
REFERENCES