Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/141005
|
Title: | 分類錯誤資料在母體異質下的馬可夫模型 A mixture model for heterogeneous ordinal data with misclassification |
Authors: | 李依璇 Lee, Yi-Shiuan |
Contributors: | 黃佳慧 Huang, Chia-Hui 李依璇 Lee, Yi-Shiuan |
Keywords: | Hidden Markov Model; Latent class; Logistic regression; Longitudinal data; Misclassification |
Date: | 2022 |
Issue Date: | 2022-08-01 17:15:10 (UTC+8) |
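Abstract: | This study considers longitudinal data on an ordinal variable and assumes that the population consists of two subgroups with distinct characteristics. Dividing the population into subgroups handles the between-group heterogeneity common in longitudinal data, while the correlation among repeated measurements on the same subject is explained by a Markov model. The ordinal variable has three categories, each treated as a Markov state, and different subgroups are assumed to have different state spaces. During data collection, measurement error causes some observations to be misclassified, so the observed Markov chains are not necessarily correct. To handle both individual heterogeneity and measurement error, this study draws on the mixture Markov model and the hidden Markov model, using logistic regression to model the probabilities of subgroup membership, of the initial state given the subgroup, and of state transitions given the subgroup. In computing the likelihood, the probability of the observed data is expressed as the sum of the joint probabilities over all possible Markov chains and subgroups, which removes the erroneous information produced by measurement error. The log-likelihood and the score function are then supplied to the "constrOptim" function in R to obtain the maximum likelihood estimates. Finally, computer simulations are run under four sets of parameter values, and the proposed model is evaluated by bias, standard deviation, standard error, and coverage probability. The results show that the distribution of the sample does not affect performance, and that the relationship between estimation bias and measurement error is as expected. The aim of this work is to provide a model for longitudinal data that exhibit both heterogeneity in the population and correlation within subjects. In this study, the former is explained by supposing that the population consists of several unobservable subgroups with distinct features, while the latter is captured by Markov models in which the Markov states are assumed to be ordinal variables. Furthermore, some observed states are subject to misclassification owing to measurement error; hence both the groups and the correctly classified Markov states are latent variables. To address this, a mixture Markov chain model and a hidden Markov model are combined in the analysis of misclassified heterogeneous ordinal data. Subpopulation membership, subpopulation-specific initial states, and transition patterns are each modeled with logistic regression. Simulations are conducted under four different parameter settings, and maximum likelihood estimates are obtained with the function "constrOptim" in R. The simulation results suggest that the estimates, in terms of bias, standard deviation, standard error, and coverage probability, are robust to the frequencies of the observed states. In addition, the dependence of estimation bias on the measurement error rate is in line with expectations. |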
Reference: | Albert, P. S. (1994). A Markov model for sequences of ordinal data from a relapsing-remitting disease. Biometrics, pages 51–60.
Bahl, L., Brown, P., De Souza, P., and Mercer, R. (1986). Maximum mutual information estimation of hidden Markov model parameters for speech recognition. In ICASSP’86. IEEE International Conference on Acoustics, Speech, and Signal Processing, volume 11, pages 49–52. IEEE.
Bartolucci, F., Farcomeni, A., and Pennoni, F. (2012). Latent Markov models for longitudinal data. CRC Press.
Baum, L. E., Petrie, T., Soules, G., and Weiss, N. (1970). A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. The Annals of Mathematical Statistics, 41(1):164–171.
Chaijareenont, K., Sirimai, K., Boriboonhirunsarn, D., and Kiriwat, O. (2004). Accuracy of Nugent’s score and each Amsel’s criteria in the diagnosis of bacterial vaginosis. J Med Assoc Thai, 87(11):1270–1274.
Cheon, K., Thoma, M. E., Kong, X., and Albert, P. S. (2014). A mixture of transition models for heterogeneous longitudinal ordinal data: with applications to longitudinal bacterial vaginosis data. Statistics in Medicine, 33(18):3204–3213.
Clark, T. S. and Linzer, D. A. (2015). Should I use fixed or random effects? Political Science Research and Methods, 3(2):399–408.
Cook, R. J. (1999). A mixed model for two-state Markov processes under panel observation. Biometrics, 55(3):915–920.
Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1):1–22.
Goodman, L. A. (1961). Statistical methods for the mover-stayer model. Journal of the American Statistical Association, 56(296):841–868.
Haussler, D. K. D. and Eeckman, M. G. R. F. H. (1996). A generalized hidden Markov model for the recognition of human genes in DNA. In Proc. Int. Conf. on Intelligent Systems for Molecular Biology, St. Louis, pages 134–142.
Koumans, E. H. and Kendrick, J. S. (2001). Preventing adverse sequelae of bacterial vaginosis: a public health program and research agenda. Sexually Transmitted Diseases, pages 292–297.
Krumbein, W. C. and Dacey, M. F. (1969). Markov chains and embedded Markov chains in geology. Journal of the International Association for Mathematical Geology, 1(1):79–96.
Laird, N. M. and Ware, J. H. (1982). Random-effects models for longitudinal data. Biometrics, pages 963–974.
Markov, A. A. (1906). Rasprostranenie zakona bol’shih chisel na velichiny, zavisyaschie drug ot druga. Izvestiya Fiziko-matematicheskogo obschestva pri Kazanskom universitete, 15(135-156):18.
Norris, J. R. (1998). Markov chains. Number 2. Cambridge University Press.
Nugent, R. P., Krohn, M. A., and Hillier, S. L. (1991). Reliability of diagnosing bacterial vaginosis is improved by a standardized method of Gram stain interpretation. Journal of Clinical Microbiology, 29(2):297–301.
Poulsen, C. S. (1983). Latent structure analysis with choice modeling applications. PhD thesis, University of Pennsylvania.
Sanders, K. L., Thoma, M. E., Yu, K., and Albert, P. S. (2011). An evaluation of the natural history of bacterial vaginosis using transition models. Sexually Transmitted Diseases, 38(12):1131. |
Description: | Master's thesis; National Chengchi University; Department of Statistics; 109354006 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0109354006 |
Data Type: | thesis |
DOI: | 10.6814/NCCU202200708 |
Appears in Collections: | [Department of Statistics] Theses
|
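The abstract describes the likelihood as a sum of joint probabilities over all possible Markov chains and subgroups. The following is a minimal Python sketch of that idea, not the thesis's implementation (which uses R and logistic regression with covariates). All names and numbers here (`WEIGHTS`, `INITS`, `TRANSS`, `EMIT`, and the choice that the second subgroup never occupies the third state) are hypothetical values chosen for illustration. The forward recursion gives the same value as explicit enumeration of every hidden chain, but at much lower cost.

```python
from itertools import product

# Hypothetical parameters: 2 latent subgroups, 3 ordinal states (0, 1, 2).
WEIGHTS = [0.6, 0.4]                      # subgroup membership probabilities
INITS = [[0.5, 0.3, 0.2],                 # initial-state probabilities, subgroup 0
         [0.7, 0.3, 0.0]]                 # subgroup 1 never starts in state 2
TRANSS = [[[0.7, 0.2, 0.1],               # transition matrix, subgroup 0
           [0.2, 0.6, 0.2],
           [0.1, 0.3, 0.6]],
          [[0.8, 0.2, 0.0],               # subgroup 1: state 2 is unreachable
           [0.3, 0.7, 0.0],
           [0.0, 0.0, 1.0]]]
EMIT = [[0.90, 0.10, 0.00],               # EMIT[x][y] = P(observed y | true x)
        [0.05, 0.90, 0.05],
        [0.00, 0.10, 0.90]]


def forward_likelihood(obs, init, trans, emit):
    """P(obs | subgroup), summing over hidden chains by the forward recursion."""
    n = len(init)
    alpha = [init[x] * emit[x][obs[0]] for x in range(n)]
    for y in obs[1:]:
        alpha = [sum(alpha[p] * trans[p][x] for p in range(n)) * emit[x][y]
                 for x in range(n)]
    return sum(alpha)


def mixture_likelihood(obs, weights, inits, transs, emit):
    """Marginal P(obs): weighted sum of the subgroup-specific likelihoods."""
    return sum(w * forward_likelihood(obs, i, t, emit)
               for w, i, t in zip(weights, inits, transs))


def brute_force_likelihood(obs, weights, inits, transs, emit):
    """Same quantity by enumerating every (subgroup, hidden chain) pair."""
    n = len(emit)
    total = 0.0
    for w, init, trans in zip(weights, inits, transs):
        for chain in product(range(n), repeat=len(obs)):
            p = init[chain[0]] * emit[chain[0]][obs[0]]
            for t in range(1, len(obs)):
                p *= trans[chain[t - 1]][chain[t]] * emit[chain[t]][obs[t]]
            total += w * p
    return total


if __name__ == "__main__":
    observed = [0, 1, 2]
    print(mixture_likelihood(observed, WEIGHTS, INITS, TRANSS, EMIT))
    print(brute_force_likelihood(observed, WEIGHTS, INITS, TRANSS, EMIT))
```

In an estimation setting, the negative logarithm of `mixture_likelihood`, summed over subjects, would be the objective passed to a constrained optimizer. Here the parameters are fixed rather than estimated.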
Files in This Item:
File | Description | Size | Format |
400601.pdf | | 1084Kb | Adobe PDF |
|
All items in 政大典藏 are protected by copyright, with all rights reserved.
|