Findings from an Empirical Vertical Scaling Study with BILOG-MG

Hüseyin Hüsnü Yıldırım

Abstract

This study compared two procedures for vertical scaling in the Item Response Theory (IRT) context: fixed estimation and simultaneous estimation. The results favored simultaneous estimation over fixed estimation, especially when there were few anchor items. However, the results also revealed that using expected a posteriori (EAP) estimates of ability in the 3-parameter IRT model may degrade vertically scaled test results obtained through simultaneous estimation. Overall, the results of this empirical study showed that, in large-scale tests that aim to monitor development across grade levels, the simultaneous estimation procedure with the 2-parameter or the 3-parameter IRT model would be a reasonable choice.

Keywords

Vertical scaling, item response theory, BILOG-MG, simultaneous estimation

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.