National Chengchi University Institutional Repository (NCCUR): Item 140.119/136971
    Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/136971


    Title: A Teacher-Student Approach to Cross-domain Transfer Learning with Multi-level Attention
    Original title: 基於師生方法學習多層次注意力的跨領域轉移學習
    Author: Tang, Ying-Jhe (唐英哲)
    Contributors: Huang, Hen-Hsen (黃瀚萱)
    Tang, Ying-Jhe (唐英哲)
    Keywords: Natural language processing
    Domain adaptation
    Multi-task learning
    Attention mechanism
    Date: 2021
    Uploaded: 2021-09-02 16:57:32 (UTC+8)
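    Abstract: This study addresses the cross-domain transfer problem, in which a machine learning model is trained on data from one domain and then applied to data from other, different domains. The difficulty lies in the discrepancy between the source domain and the target domain. For example, the adjective "fast" is positive when it describes a sports car but negative when it describes a battery. Models trained on labeled data can already achieve very good performance, but in most cases there is not enough labeled data to train them. For these reasons, this study aims to build a model that handles both the cross-domain transfer problem and the scarcity of labeled data.
    The model architecture is a multi-task learning setup with three parts: supervised learning, a teacher-student cross-domain transfer attention model, and a relevance detection task. The supervised learning task trains the model on data and their labels. In the teacher-student cross-domain transfer model, the teacher model provides pseudo-labeled data for training the student model, and instance-level (data-level) attention and domain-level attention select the pseudo-labeled data that are suitable for training the student. The relevance detection task detects the relationship between a sentence and the subject it describes.
    The approach is applied to sentiment classification of product reviews and to stance classification of online public opinion about artists and nuclear power. Experimental results show that the proposed method achieves the best performance on both the sentiment and the stance classification tasks.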
    The lack of training data poses a challenge for applying NLP models in a new domain. Previous work on cross-domain transfer learning aims to exploit information from the source domains to make predictions for the target domain. To reduce the noise from out-of-domain data and improve the model's generalization ability, this work proposes a novel teacher-student approach with multi-task learning that transfers information from the source domains to the target domain, with sophisticated weights determined by the attention mechanism at both the instance level and the domain level. The generalization ability is further enhanced by unsupervised data augmentation. We also introduce a subject detection task for co-training the main model. Our approach is evaluated not only on the widely adopted English dataset, Amazon product reviews, but also on Chinese datasets including product reviews, artist reviews, and public opinions on nuclear power. Experimental results show that our approach outperforms state-of-the-art models.
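    The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's actual formulation: the function names, the softmax normalization, and the product of domain-level and instance-level attention are all assumptions made here for illustration. The sketch computes a cross-entropy loss for a student on teacher pseudo-labels, where each pseudo-labeled instance is weighted by the attention of its source domain times its own attention within that domain.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pseudo_label_loss(student_probs, teacher_labels,
                      instance_scores, domain_ids, domain_scores):
    """Attention-weighted cross-entropy on teacher pseudo-labels.

    Hypothetical weighting (an assumption, not taken from the thesis):
    weight_i = domain_attention[domain of i] * instance_attention_i,
    with instance attention normalized inside each source domain.
    """
    domain_att = softmax(domain_scores)
    loss = 0.0
    for d in range(len(domain_scores)):
        # Indices of the pseudo-labeled instances that come from domain d.
        idx = [i for i, dom in enumerate(domain_ids) if dom == d]
        if not idx:
            continue
        inst_att = softmax([instance_scores[i] for i in idx])
        for att, i in zip(inst_att, idx):
            # Probability the student assigns to the teacher's pseudo-label.
            p = student_probs[i][teacher_labels[i]]
            loss += domain_att[d] * att * -math.log(max(p, 1e-12))
    return loss

# Two source domains with two pseudo-labeled instances each.
student_probs = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
teacher_labels = [0, 1, 1, 0]
loss = pseudo_label_loss(student_probs, teacher_labels,
                         instance_scores=[0.0, 0.0, 0.0, 0.0],
                         domain_ids=[0, 0, 1, 1],
                         domain_scores=[0.0, 0.0])
print(loss)
```

    In this sketch, an instance with a low attention score contributes almost nothing to the loss, which is one simple way to down-weight noisy out-of-domain pseudo-labels. In a trained model the scores would be learned rather than fixed.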
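    Description: Master's
    National Chengchi University
    Department of Computer Science
    108753207
    Source: http://thesis.lib.nccu.edu.tw/record/#G0108753207
    Type: thesis
    DOI: 10.6814/NCCU202101251
    Appears in collections: [Department of Computer Science] Theses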

    Files in this item:

    File          Description   Size     Format      Views
    320701.pdf                  1556Kb   Adobe PDF   27

