Abstract
This paper reports a validation study, framed as an assessment-use argument, of four level tests developed within a larger project for adult learners of Turkish as a second language (L2). We treat the test scores as the data supporting the validity claim that the tests accurately classify learners into the intended Common European Framework of Reference for Languages (CEFR) levels. The four tests (A1, A2, B1, and B2), each comprising listening and reading tasks, were administered to mixed groups of students, including students at the Pre-A1 and C1 levels. Cut scores for each test were set with the Angoff method and served as backing for the validity claim. To provide warrants, we assumed a 50% probability that a test-taker was at a given CEFR level and ran chi-square goodness-of-fit tests to determine whether the observed numbers of students below and above the cut score for each level differed significantly from the expected numbers. The distribution of student scores, together with acceptable item difficulty and discrimination indices, the placement of cut scores in the intervals between adjacent levels, and the chi-square analyses for all four tests led us to conclude that the tests have the potential to validly demonstrate the intended learner performance. Through its design and its data collection and analysis techniques, the paper offers theoretical, methodological, and practical insights, grounded in empirical evidence, for practitioners in Turkish L2.
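The two quantitative steps named above, Angoff cut scores and a chi-square goodness-of-fit test against a 50/50 expectation, can be illustrated with a minimal sketch. It is not the study's actual analysis: the function names, the judge ratings, and the student counts are hypothetical, and the Angoff variant shown (each judge's item probabilities summed, then averaged across judges) is one common formulation that the abstract does not confirm. The p-value uses the closed form for one degree of freedom, so only the standard library is needed.

```python
import math


def angoff_cut_score(ratings):
    """Cut score from Angoff ratings (hypothetical helper).

    ratings[j][i] is judge j's estimate of the probability that a
    minimally competent test-taker answers item i correctly.
    Each judge's estimates are summed, then averaged across judges.
    """
    judge_totals = [sum(judge) for judge in ratings]
    return sum(judge_totals) / len(judge_totals)


def chi_square_gof_5050(n_below, n_above):
    """Chi-square goodness-of-fit test against a 50/50 split.

    Returns (statistic, p_value) with 1 degree of freedom.
    For df = 1 the survival function is erfc(sqrt(x / 2)).
    """
    total = n_below + n_above
    expected = total / 2  # assumed 50% probability per category
    statistic = (
        (n_below - expected) ** 2 + (n_above - expected) ** 2
    ) / expected
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical data: two judges rating two items.
cut = angoff_cut_score([[0.5, 0.7], [0.7, 0.9]])

# Hypothetical counts of students below and above a cut score.
stat, p = chi_square_gof_5050(n_below=30, n_above=10)
print(f"cut score = {cut:.2f}, chi-square = {stat:.2f}, p = {p:.4f}")
```

In this sketch, a small p-value indicates that the observed split around the cut score departs from the assumed 50/50 expectation. How that outcome bears on the validity claim depends on the composition of the tested group, which the full paper specifies.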
Keywords
DOI: http://dx.doi.org/10.15390/EB.2024.12959