Reference: | 余民寧(民80 )試題反應理論的介紹(一):測驗理論的發展趨勢。
研習資訊, 8 卷( 6 期) , 13-18 頁。
余民寧(民81a )試題反應理論的介紹(二):基本概念與假設。
研習資訊, 9 卷( 1 期) , 5-9 頁。
余民寧(民81b )試題反應理論的介紹(三):試題反應模式及其
特性。研習資訊, 9 卷( 2 期) , 6-10 頁。
余民寧(民81c )測驗理論的發展趨勢。政治大學心理研究所主辦:
心理測驗之學術及實務研討會論文。
余民寧(民81d )試題反應理論的介紹(六):能力量尺。 研習資
訊, 9 卷( 5 期) , 8-12 頁。
余民寧(民81e )試題反應理論的介紹(七):訊息函數。 研習資
訊, 9 卷( 6 期) , 5-9 頁。
吳裕益(民75 )標準參照測驗通過分數設定方法之研究。政大教研
所博士論文,未出版。
林惠芬(民82 )通過分數設定方法在護理人員按竅筆試測驗之研究
。測驗年刊, 40 輯, 253-262 頁。
許擇基、劉長萱(民81) 試題作答理論簡介。 臺北:中國行為科學社。
郭生玉(民74) 心理與教育測驗。 臺北:精華。
陳英豪、吳裕益(民75 )新舊測驗理論之比較及其應用。 臺南師專
學報,19 期, 253-290 頁。
Andrew, B. J. & Hecht, J. (1976). A preliminary
investigation of two procedures for setting examination
standards. Educational and Psychological Measurement,
36, 45-50.
Angoff, W. H. (1971). Scales, norms, and equivalent scores.
In R. L. Thorndike (Ed.), Educational measurement (pp.
508-600). Washington, DC: American Council on
Education.
Beaton, A. E., & Allen, N. L. (1992). Interpreting scales
through scale anchoring. Journal of Educational
Statistics, 17, 191-204.
Behuniak, P., Jr., Archambault, F. X., & Gable, R. K. (1982).
Angoff and Nedelsky standard setting procedures:
Implications for the validity of proficiency test score
interpretation. Educational and Psychological
Measurement, 42, 247-255.
Berk, R. A. (1986). A consumer's guide to setting
performance standards on criterion-referenced tests.
Review of Educational Research, 56(1), 137-172.
Berk, R. A. (1976). Determination of optimal cutting scores
in criterion-referenced measurement. Journal of
Experimental Education, 45, 4-9.
Beuk, C. H. (1984). A method for reaching a compromise
between absolute and relative standards in examinations.
Journal of Educational Measurement, 21, 147-152.
Birnbaum, A. (1968). Estimation of an ability. In F. M. Lord
& M. R. Novick, Statistical theories of mental test
scores (chapter 20). Reading, MA: Addison-Wesley.
Block, J. H. (1971). Criterion-referenced measurements:
Potential. School Review, 69, 289-298.
Block, J. H. (1972). Student learning and the setting of
mastery performance standards. Educational Horizons, 50,
183-190.
Block, J. H. (1978). Standards and criteria: A response.
Journal of Educational Measurement, 15, 291-295.
Brennan, R. L., & Lockwood, R. E. (1980). A comparison of
the Nedelsky and Angoff cutting score procedures using
generalizability theory. Applied Psychological
Measurement, 4, 219-240.
Burton, N. W. (1978). Societal standards. Journal of
Educational Measurement, 15, 263-271.
Cascio, W. F., Alexander, R. A., & Barrett, G. V. (1988).
Setting cutoff scores: Legal, psychometric, and
professional issues and guidelines. Personnel
Psychology, 41, 1-24.
Crocker, L., & Algina, J. (1986). Introduction to classical
and modern test theory. New York: Holt, Rinehart &
Winston.
Cross, L. H., Impara, J. C., Frary, R. B., & Jaeger, R. M.
(1984). A comparison of three methods for establishing
minimum standards on the National Teacher Examinations.
Journal of Educational Measurement, 21, 113-129.
Davis, F. B., & Diamond, J. J. (1974). The preparation of
criterion-referenced tests. In C. W. Harris, M. C.
Alkin, & W. J. Popham (Eds.), Problems in
criterion-referenced measurement. Los Angeles: UCLA Graduate
School of Education, Center for the Study of Evaluation.
de Gruijter, D. N. M., & Hambleton, R. K. (1984). On
problems encountered using decision theory to set cutoff
scores. Applied Psychological Measurement, 8, 1-8.
Ebel, R. L. (1971). Criterion-referenced measurements:
Limitation. School Review, 69, 282-288.
Ebel, R. L. (1972). Essentials of educational measurement.
Englewood Cliffs, NJ: Prentice-Hall.
Ebel, R. L. (1978). The case for minimum competency testing.
Phi Delta Kappan, April, 546-549.
Ebel, R. L. (1979). Essentials of educational measurement
(3rd ed.). Englewood Cliffs, NJ: Prentice-Hall.
Emrick, J. A. (1971). An evaluation model for mastery
testing. Journal of Educational Measurement, 8(4),
321-326.
Gagné, R. M. (1985). The conditions of learning and theory
of instruction. New York: Holt, Rinehart & Winston.
Garcia-Quintana, R. A., & Mappus, L. L. (1980). Using
norm-referenced data to set standards for a minimum
competency program in the state of South Carolina:
A feasibility study. Educational Evaluation and Policy
Analysis, 2, 47-52.
Glass, G. V. (1978). Standards and criteria. Journal of
Educational Measurement, 15(4), 237-261.
Glaser, R. (1963). Instructional technology and the
measurement of learning outcomes. American Psychologist,
18, 519-521.
Glaser, R., & Klaus, D. J. (1962). Proficiency measurement:
Assessing human performance. In R. M. Gagné (Ed.),
Psychological principles in systems development (pp.
419-474). New York: Holt, Rinehart and Winston.
Glaser, R., & Nitko, A. J. (1971). Measurement in learning and
instruction. In R. L. Thorndike (Ed.), Educational
measurement (pp. 625-670). Washington, DC: American Council on
Education.
Guion, R. M., & Ironson, G. H. (1983). Latent trait theory
for organizational research. Organizational Behavior and
Human Performance, 31, 54-87.
Haladyna, T. M., & Roid, G. H. (1983). A comparison of two
approaches to criterion-referenced test construction.
Journal of Educational Measurement, 20, 271-281.
Halpin, G., Sigmon, G., & Halpin, G. (1983). Minimum
competency standards set by three judgmental
procedures: Implications for validity. Educational and
Psychological Measurement, 43, 185-196.
Hambleton, R. K. (1978). On the use of cut-off scores with
criterion-referenced tests in instructional settings.
Journal of Educational Measurement, 15(4), 277-290.
Hambleton, R. K. (1979). Latent trait models and their
applications. In R. T. Guest (Ed.), Methodological
developments. Washington: Jossey-Bass.
Hambleton, R. K. (1980). Test score validity and
standard-setting methods. In R. A. Berk (Ed.),
Criterion-referenced measurement: The state of the
art (pp. 80-128). Baltimore, MD: Johns Hopkins University
Press.
Hambleton, R. K. (1983). Application of item response models
to criterion-referenced assessment. Applied
Psychological Measurement, 7, 33-44.
Hambleton, R. K. (1989). Principles and selected
applications of item response theory. In R. L. Linn
(Ed.), Educational measurement (3rd ed., pp. 147-200).
New York: Macmillan.
Hambleton, R. K. (1990). Criterion-referenced testing
methods and practices. In T. B. Gutkin & C. R. Reynolds
(Eds.), The handbook of school psychology (pp. 388-415).
New York: John Wiley & Sons.
Hambleton, R. K., Algina, J., & Coulson, D. S. (1978).
Criterion-referenced testing and measurement: A review
of technical issues and developments. Review of
Educational Research, 48, 1-47.
Hambleton, R. K., & Cook, L. L. (1977). Latent trait models
and their use in the analysis of educational test data.
Journal of Educational Measurement, 14, 75-96.
Hambleton, R. K., & de Gruijter, D. N. M. (1983).
Application of item response model to
criterion-referenced test selection. Journal of
Educational Measurement, 20, 355-367.
Hambleton, R. K., & Eignor, D. R. (1978). Guidelines for
evaluating criterion-referenced tests and test
manuals. Journal of Educational Measurement, 15, 321-327.
Hambleton, R. K., & Eignor, D. R. (1980). Competency test
development, validation, and standard setting. In R. M.
Jaeger & C. K. Tittle (Eds.), Minimum competency
achievement testing: Motives, models, measures, and
consequences (pp. 367-396). Berkeley, CA: McCutchan.
Hambleton, R. K., Mills, C. N., & Simon, R. (1983).
Determining the lengths for criterion-referenced tests.
Journal of Educational Measurement, 20, 27-38.
Hambleton, R. K., & Novick, M. R. (1973). Toward an
integration of theory and method for
criterion-referenced tests. Journal of Educational
Measurement, 10, 159-170.
Hambleton, R. K., Swaminathan, H., Algina, J., & Coulson, D.
S. (1978). Criterion-referenced testing and measurement:
A review of technical issues and developments. Review of
Educational Research, 48, 1-47.
Hambleton, R. K., & Swaminathan, H. (1985). Item response
theory: Principles and applications. Boston, MA:
Kluwer-Nijhoff.
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991).
Fundamentals of item response theory. Newbury Park,
CA: SAGE.
Harasym, P. H. (1981). A comparison of the Nedelsky and
modified Angoff standard-setting procedure on evaluation
outcome. Educational and Psychological Measurement,
41, 725-734.
Harris, C. W. (1972). An interpretation of Livingston's
reliability coefficient for criterion-referenced tests.
Journal of Educational Measurement, 9, 27-29.
Harris, D. J., & Subkoviak, M. J. (1986). Item analysis: A
short-cut statistic for mastery tests. Educational and
Psychological Measurement, 46, 494-507.
Hulin, C. L., Drasgow, F., & Parsons, C. K. (1983). Item
response theory: Application to psychological
measurement. Homewood, IL: Dow Jones-Irwin.
Huynh, H. (1976). On the reliability of decisions in
domain-referenced testing. Journal of Educational
Measurement, 13, 253-264.
Huynh, H. (1978). Reliability of multiple classifications.
Psychometrika, 45, 317-325.
Huynh, H. (1985). Assessing mastery of basic skills through
summative testing. In D. V. Levine (Ed.), Improving
student achievement through mastery learning programs.
San Francisco, California: Jossey-Bass.
Huynh, H., & Castel, J. (1985). A comparison of the minimax
and Rasch approaches to set simultaneous passing scores
for subtests. Journal of Educational Statistics, 10,
334-344.
Jaeger, R. M. (1991). Selection of judges for
standard-setting. Educational Measurement: Issues and
Practice, 10(2), 3-6.
Jaeger, R. M. (1989). Certification of student competence.
In R. L. Linn (Ed.), Educational measurement (3rd ed.,
pp. 147-200). New York: Macmillan.
Jaeger, R. M. (1982). An iterative structured judgment
process for establishing standards on competency tests:
Theory and application. Educational Evaluation and
Policy Analysis, 4, 461-476.
Kane, M. T. (1987). On the use of IRT models with
judgemental standard setting procedures. Journal of
Educational Measurement, 24, 333-345.
Koffler, S. L. (1980). A comparison of approaches for
setting proficiency standards. Journal of Educational
Measurement, 17, 167-178.
Kriewal, T. E. (1972). Aspects and applications of
criterion-referenced tests. Illinois School Research,
9, 5-18.
Levin, H. M. (1978). Educational performance standards: Image
or substance? Journal of Educational Measurement, 15,
309-319.
Livingston, S. A. (1975). A utility-based approach to the
evaluation of pass/fail testing decision procedures
(Rep. No. COPA-75-01). Princeton, NJ: Center for
Occupational and Professional Assessment, Educational
Testing Service.
Livingston, S. A. (1980). Choosing minimum passing score by
stochastic approximation techniques. Educational and
Psychological Measurement, 40, 859-873.
Livingston, S. A., & Zieky, M. J. (1982). Manual for setting
standards on the basic skills assessment tests.
Princeton, NJ: Educational Testing Service.
Lord, F. M. (1980). Applications of item response theory to
practical test problem. Hillsdale, NJ: Lawrence Erlbaum
Associates.
Mislevy, R. J., & Bock, R. D. (1983). BILOG: Item analysis
and test with binary logistic models. Mooresville, IN:
Scientific Software, Inc.
Mislevy, R. J., Johnson, E. G., & Muraki, E. (1992). Scaling
procedures in NAEP. Journal of Educational Statistics, 17,
131-154.
Mislevy, R. J., & Stocking, M. L. (1989). A consumer's guide
to LOGIST and BILOG. Applied Psychological Measurement,
13, 57-75.
Nedelsky, L. (1954). Absolute grading standards for
objective tests. Educational and Psychological
Measurement, 14, 3-19.
Norcini, J. J., Lipner, R. S., Langdon, L. O., & Strecker,
C. A. (1987). A comparison of three variations on a
standard-setting method. Journal of Educational
Measurement, 24, 56-64.
Novick, M. R., & Lewis, C. (1974). Prescribing test length
for estimation criterion-referenced measurement. In C.
W. Harris, M. C. Alkin, & W. J. Popham (Eds.), Problems
in criterion-referenced measurement (CSE Monograph Series
in Evaluation, No. 3, pp. 139-158). Los Angeles: Center
for the Study of Evaluation, University of California.
Novick, M. R., Lewis, C., & Jackson, P. H. (1973). The
estimation of proportions in m groups. Psychometrika,
38, 19-46.
Peng, C.-Y. J., & Subkoviak, M. J. (1980). A note on Huynh's
normal approximation procedure for estimating
criterion-referenced reliability. Journal of
Educational Measurement, 10(2), 359-368.
Plake, B. S., Melican, G. J., & Mills, C. N. (1991). Factors
influencing intrajudge consistency during
standard-setting. Educational Measurement: Issues and
Practice, 10(2), 15-16, 22.
Plake, B. S., & Kane, M. T. (1991). Comparison of method for
combining the minimum passing levels for individual item
into a passing. Journal of Educational Measurement, 28,
249-256.
Popham, W. J., & Husek, T. R. (1969). Implications of
criterion-referenced measurement. Journal of
Educational Measurement, 6, 1-9.
Popham, W. J. (1978). As always, provocative. Journal of
Educational Measurement, 15, 297-300.
Popham, W. J. (1981). Modern educational measurement.
Prentice-Hall.
Rasch, G. (1980). Probabilistic models for some intelligence
and attainment tests. Chicago: The University of Chicago
Press (Original edition was published in 1960).
Reid, J. B. (1991). Training judges to generate
standard-setting data. Educational Measurement: Issues
and Practice, 10(2), 11-14.
Rowley, G. L. (1982). Historical antecedents of the
standard-setting debate: An inside account of the
minimal-beardedness controversy. Journal of Educational
Measurement, 19, 87-95.
Shannon, G. A., & Cliver, B. A. (1987). An application of
item response theory in the comparison of four
conventional item discrimination indices for
criterion-referenced tests. Journal of Educational
Measurement, 24, 347-356.
Saunders, J. C., Ryan, J. P., & Huynh, H. (1981). A
comparison of two approaches to setting passing scores
based on the Nedelsky procedure. Applied Psychological
Measurement, 5, 209-217.
Shepard, L. (1980). Technical issues in minimum competence
testing. In D. C. Berliner (Ed.), Review of research in
education (Vol. 8). Itasca, Illinois: F. E. Peacock.
Shepard, L. A. (1984). Setting performance standards. In R.
A. Berk (Ed.), A guide to criterion-referenced test
construction (pp. 169-198). Baltimore, MD: Johns Hopkins
University Press.
Skakun, E. N., & Kling, S. (1980). Comparability of methods
for setting standards. Journal of Educational
Measurement, 17, 229-235.
Smith, R. L., & Smith, J. K. (1988). Differential use of
item information by judges using Angoff and Nedelsky
procedures. Journal of Educational Measurement,
25, 259-285.
Subkoviak, M. J. (1976). Estimating reliability from a
single administration of a criterion-referenced test.
Journal of Educational Measurement, 13, 265-276.
Subkoviak, M. J. (1978). Empirical investigation of
procedures for estimating reliability for mastery tests.
Journal of Educational Measurement, 15, 111-115.
Subkoviak, M. J. (1980). Decision-consistency approaches. In
R. A. Berk (Ed.), Criterion-referenced measurement: The
state of the art (pp. 129-185). Baltimore, MD: Johns
Hopkins University Press.
Subkoviak, M. J. (1988). A practitioner's guide to
computation and interpretation of reliability indices
for mastery tests. Journal of Educational Measurement,
25, 47-55.
Swaminathan, H., Hambleton, R. K., & Algina, J. (1975). A
Bayesian decision-theoretic procedure for use with
criterion-referenced tests. Journal of Educational
Measurement, 12, 87-98.
Thissen, D., & Steinberg, L. (1986). A taxonomy of item
response models. Psychometrika, 51, 567-577.
van der Linden, W. J. (1978). Forgetting, guessing, and
mastery: The Macready and Dayton models revisited and
compared with a latent trait approach. Journal of
Educational Statistics, 3, 305-317.
van der Linden, W. J. (1981). A latent trait look at
pretest-posttest validation of criterion-referenced
test items. Review of Educational Research, 51, 379-402.
van der Linden, W. J. (1982). A latent trait method for
determining intrajudge inconsistency in
the Angoff and Nedelsky techniques of standard setting.
Journal of Educational Measurement, 19, 295-308.
van der Linden, W. J. (1984). Some thoughts on the use of
decision theory to set cutoff scores: Comment on de
Gruijter and Hambleton. Applied Psychological
Measurement, 8, 9-17.
Warm, T. A. (1978). A primer of item response theory.
Springfield, VA: National Technical Information Service.
Wilcox, R. R. (1979). Prediction analysis and the
reliability of a mastery test. Educational and
Psychological Measurement, 39, 825-839.
Woehr, D. J., Arthur, W., Jr., & Fehrmann, M. L. (1991). An
empirical comparison of cutoff score method for
content-related and criterion-related validity settings.
Educational and Psychological Measurement, 51,
1029-1039.
Wright, B. D. (1977). Solving measurement problems with the
Rasch model. Journal of Educational Measurement, 14,
97-166.
Wright, B. D., & Stone, M. H. (1979). Best test design.
Chicago: MESA Press.
Yen, W. M. (1987). A comparison of the efficiency and
accuracy of BILOG and LOGIST. Psychometrika, 52,
275-291.
Zieky, M. J., & Livingston, S. A. (1977). Manual for setting
standards on the basic skills assessment tests.
Princeton, NJ: Educational Testing Service. |