Abstract
In this study, how variation in the number of facets affects reliability was analyzed by estimating reliability with generalizability theory under different designs. The data were obtained by scoring the non-routine problem-solving performances of 132 sixth-, seventh-, and eighth-grade students at a primary school in Kütahya during the 2011-2012 spring term. Two crossed designs were used, p x t x r and p x t x r x a, in which person (p) was the object of measurement, and task (t), rater (r), and rubric (a) were treated as sources of variation. The results show that the design used in generalizability theory affects the G and phi coefficients: as the number of sources of variability increases, the percentage of total variance explained by the person facet, which is the object of measurement, decreases. It was also found that the type of rubric affects reliability: scores obtained with an analytical rubric were more reliable than those obtained with a holistic rubric.
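The G (relative) and phi (absolute) coefficients mentioned above are computed from estimated variance components of the crossed design. The sketch below illustrates the standard formulas for a fully crossed p x t x r design; the variance component values, facet sizes, and function name are hypothetical placeholders, not the study's estimates.

```python
# Minimal sketch: G and phi coefficients for a fully crossed p x t x r design.
# All numbers below are illustrative, not taken from the study.

def g_and_phi(vc, n_t, n_r):
    """Compute relative (G) and absolute (phi) coefficients from the
    variance components of a person x task x rater crossed design."""
    # Relative error variance: interactions involving the object of
    # measurement (p), divided by the number of conditions sampled.
    rel_err = vc["pt"] / n_t + vc["pr"] / n_r + vc["ptr"] / (n_t * n_r)
    # Absolute error variance additionally includes the facet main
    # effects and the task x rater interaction.
    abs_err = rel_err + vc["t"] / n_t + vc["r"] / n_r + vc["tr"] / (n_t * n_r)
    g = vc["p"] / (vc["p"] + rel_err)
    phi = vc["p"] / (vc["p"] + abs_err)
    return g, phi

# Hypothetical variance components for the seven effects of the design.
vc = {"p": 0.50, "t": 0.10, "r": 0.02,
      "pt": 0.20, "pr": 0.05, "tr": 0.01, "ptr": 0.12}
g, phi = g_and_phi(vc, n_t=5, n_r=2)
```

Because the absolute error variance contains every term of the relative error variance plus the facet main effects, phi can never exceed G for the same components, which is why absolute-decision reliability is reported separately.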
Keywords
Performance-based assessment, Generalizability theory, Interrater reliability, Analytical rubric, Holistic rubric
DOI: http://dx.doi.org/10.15390/EB.2015.2454