Abstract: Rating scales are widely used in the social sciences. Within the framework of item response theory, the rating scale model (RSM; Andrich, 1978) is commonly used to fit rating scale data. The RSM assumes that the threshold parameters are identical across items and that their effects are fixed across persons. However, rating scales usually require respondents to make subjective judgments, which are likely to vary from person to person, so the fixed-effects assumption may not hold. In this study, we introduce the recently developed random-effects rating scale model (RE-RSM; Wang & Wilson, 2004), in which the threshold parameters are treated as random effects rather than fixed effects, so as to better reflect the random nature of the subjective judgments involved in responding to rating scales. Through empirical analyses of three rating scale data sets, we describe and compare several models commonly used for rating scale data: the RSM, the partial credit model (Masters, 1982), the mixed rating scale model (von Davier & Rost, 1995), the RE-RSM, and the constrained random-effects partial credit model (Wang & Wilson). We also demonstrate how these models can be used to diagnose item quality and to explore the magnitude and possible causes of the randomness in subjective judgments, providing guidance for item writing and revision.
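
As a brief illustration of the contrast drawn in the abstract (a sketch in standard IRT notation based on the cited literature, not a formula quoted from this paper), let θ_n denote the latent trait of person n, δ_i the location of item i, τ_k the k-th threshold, and P_{nik} the probability that person n responds in category k of item i. The RSM (Andrich, 1978) treats the thresholds as fixed effects shared by all items and persons:

\[ \log\frac{P_{nik}}{P_{ni(k-1)}} = \theta_n - \delta_i - \tau_k \]

In the RE-RSM (Wang & Wilson, 2004), the threshold becomes person-specific and is treated as a random effect; a normal distribution is the commonly assumed form, and its variance σ_k^2 indexes how much the subjective interpretation of threshold k varies across persons:

\[ \log\frac{P_{nik}}{P_{ni(k-1)}} = \theta_n - \delta_i - \tau_{nk}, \qquad \tau_{nk} \sim N(\tau_k, \sigma_k^2) \]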