National Chengchi University Institutional Repository (NCCUR): Item 140.119/58646

Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/58646

Title:	Applying person-fit in faking detection: The simulation and practice of non-parametric item response theory
Authors:	許嘉家 Syu, Jia Jia
Contributors:	余民寧 Yu, Min Ning; 許嘉家 Syu, Jia Jia
Keywords:	nonparametric item response theory; faking; sample size; person-fit; R
Date:	2012
Issue Date:	2013-07-01 14:06:16 (UTC+8)
Abstract:	Detecting faking is a crucial issue in psychological testing because faking affects the relations among variables, the accuracy of model testing, and test fairness. Social desirability scales are widely used to detect faking, but adding items also increases the burden on respondents. This study therefore explores person-fit statistics as an alternative. Earlier studies have used person-fit techniques under parametric item response theory (PIRT) for faking detection, but PIRT assumptions such as a large sample, normally distributed ability, and a large number of items are hard to satisfy in real data, which can lead to inaccurate results and applications. This study focuses on person-fit under nonparametric item response theory (NIRT), which applies more flexibly and is closer to practical conditions. Both simulated and real data were used to test the hypotheses. In Study 1, responses were simulated in R under varying sample sizes, ability distributions, faking motivations, and item aberrance rates; person-fit values were computed, and the detection rates of the parametric and nonparametric person-fit indices were compared as the criterion of effectiveness. In Study 2, the technique was applied to empirical data from a social desirability scale and an interest inventory to check how the three statistics used in this study (lz, U3p, and Guttman errors) detect faking in a practical setting. The results indicate that the best person-fit statistic depends on the condition. Guttman errors performed best when the sample size was below 100, the ability distribution was normal or platykurtic, and only part of the items were answered aberrantly. When the aberrance rate reached 100%, the ability distribution was negatively skewed or platykurtic, and faking was severe, U3p detected faking better than the other indices. By comparison, lz was best suited to the conditions with a moderate degree of faking. The empirical analysis showed that the assumption of normally distributed ability is hard to satisfy in both small and large samples, and that using person-fit statistics for faking detection is feasible, particularly with the nonparametric U3p index.
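The three person-fit statistics named in the abstract can be illustrated with a minimal sketch. It is not the thesis's code (the thesis worked in R, and U3p is a polytomous index); it assumes dichotomous 0/1 items and uses the standard textbook definitions of the Guttman error count, U3, and lz. The function names, the example item proportions, and the use of those proportions as model probabilities for lz are all hypothetical choices made for illustration.

```python
import math

def order_by_easiness(responses, p_correct):
    """Sort one respondent's 0/1 answers from the easiest item to the hardest."""
    pairs = sorted(zip(p_correct, responses), key=lambda t: -t[0])
    return [x for _, x in pairs], [p for p, _ in pairs]

def guttman_errors(responses, p_correct):
    """Count item pairs where the easier item is failed and the harder one passed."""
    x, _ = order_by_easiness(responses, p_correct)
    errors, zeros_seen = 0, 0
    for xi in x:
        if xi == 0:
            zeros_seen += 1
        else:
            errors += zeros_seen  # each earlier (easier) failure forms one error pair
    return errors

def u3(responses, p_correct):
    """U3 for dichotomous items: 0 for a perfect Guttman pattern, 1 for its reverse.

    Assumes every proportion lies strictly between 0 and 1.
    Returns None when the statistic is undefined (all-0 or all-1 patterns,
    or items that all share the same proportion correct).
    """
    x, p = order_by_easiness(responses, p_correct)
    n, r = len(x), sum(x)
    if r == 0 or r == n:
        return None
    w = [math.log(pi / (1 - pi)) for pi in p]  # logit of each item proportion
    best = sum(w[:r])        # the r easiest items passed
    worst = sum(w[n - r:])   # the r hardest items passed
    if best == worst:
        return None
    observed = sum(wi * xi for wi, xi in zip(w, x))
    return (best - observed) / (best - worst)

def lz(responses, p_model):
    """Standardized log-likelihood of a pattern, given model probabilities.

    p_model[i] is the probability of a positive response to item i for this
    respondent; in practice it would come from a fitted parametric IRT model.
    Large negative values flag misfit.
    """
    l0 = sum(x * math.log(p) + (1 - x) * math.log(1 - p)
             for x, p in zip(responses, p_model))
    expected = sum(p * math.log(p) + (1 - p) * math.log(1 - p) for p in p_model)
    variance = sum(p * (1 - p) * math.log(p / (1 - p)) ** 2 for p in p_model)
    if variance == 0:
        return None
    return (l0 - expected) / math.sqrt(variance)

# Hypothetical four-item example: proportions correct, easiest item first.
p = [0.9, 0.7, 0.5, 0.2]
consistent = [1, 1, 0, 0]   # passes the easy items, fails the hard ones
reversed_ = [0, 0, 1, 1]    # the most aberrant pattern with the same total score
for pattern in (consistent, reversed_):
    print(pattern, guttman_errors(pattern, p), u3(pattern, p), lz(pattern, p))
```

The two nonparametric indices need only the ordering and proportions of the items, whereas lz needs model-based probabilities, which is one reason the parametric index depends more heavily on sample size and distributional assumptions.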
Description:	Doctoral degree; National Chengchi University; Graduate Institute of Education; 97152515; 101
Source URI:	http://thesis.lib.nccu.edu.tw/record/#G0097152515
Data Type:	thesis
Appears in Collections:	[Department of Education] Theses

Files in This Item:

File	Size	Format
251501.pdf	11011Kb	Adobe PDF

All items in 政大典藏 are protected by copyright, with all rights reserved.


