政大機構典藏-National Chengchi University Institutional Repository(NCCUR):Item 140.119/136971
    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/136971


Title: A Teacher-Student Approach to Cross-domain Transfer Learning with Multi-level Attention
    Authors: 唐英哲
    Tang, Ying-Jhe
    Contributors: 黃瀚萱
    Huang, Hen-Hsen
    唐英哲
    Tang, Ying-Jhe
Keywords: Natural language processing
Domain adaptation
Multi-task learning
Attention mechanism
    Date: 2021
    Issue Date: 2021-09-02 16:57:32 (UTC+8)
Abstract: This study addresses the cross-domain transfer (domain adaptation) problem: a model is trained on labeled data from one domain and then applied to data from other, different domains. The difficulty lies in the gap between the source and target domains. For example, the adjective "fast" is positive when describing a sports car but negative when describing a battery. With sufficient labeled data, supervised models already achieve very good performance, but in many cases enough labeled data is simply not available. For these reasons, this study builds a model that handles both cross-domain transfer and the scarcity of labeled data.
The model is a multi-task architecture with three parts: a supervised learning task, a teacher-student cross-domain transfer attention model, and a relevance detection task. The supervised task learns from labeled data. In the teacher-student model, the teacher provides pseudo-labeled data for training the student, and instance-level and domain-level attention help select the pseudo-labeled examples most suitable for training the student. The relevance detection task detects the relation between a sentence and the subject it describes.
The method is applied to sentiment classification of product reviews and to stance classification of online public opinions about artists and nuclear power. Experimental results show that it achieves the best performance on all of these sentiment and stance classification tasks.
The lack of training data forms a challenging issue for applying NLP models in a new domain. Previous work on cross-domain transfer learning aims to exploit the information from the source domains to make predictions for the target domain. To reduce the noise from the out-of-domain data and improve the model's generalization ability, this work proposes a novel teacher-student approach with multi-task learning that transfers the information from source domains to the target domain with sophisticated weights determined by using the attention mechanism at both instance level and domain level. The generalization ability is further enhanced by unsupervised data augmentation. We also introduce a subject detection task for co-training the main model. Our approach is evaluated not only on the widely-adopted English dataset, Amazon product reviews, but also on Chinese datasets including product reviews, artist reviews, and public opinions of nuclear power. Experimental results show that our approach outperforms state-of-the-art models.
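The abstract above describes weighting the teacher's pseudo-labels with attention at both the instance level and the domain level before training the student. The thesis text itself is not reproduced on this page, so the following is only a minimal NumPy sketch of that idea; the function name `weight_pseudo_labels`, the dot-product similarity scoring, and the multiplicative combination of the two attention levels are illustrative assumptions, not the thesis's actual formulation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def weight_pseudo_labels(instance_emb, domain_emb, domain_of, target_emb):
    """Combine instance-level and domain-level attention into one weight
    per pseudo-labeled source instance.

    instance_emb : (n, d) embeddings of pseudo-labeled source instances
    domain_emb   : (k, d) one centroid embedding per source domain
    domain_of    : (n,)   source-domain index of each instance
    target_emb   : (d,)   centroid embedding of the target domain

    Domain-level attention scores each source domain by its similarity to
    the target domain; instance-level attention does the same per instance.
    An instance's final weight is the product of the two, renormalized, so
    pseudo-labels that resemble the target domain dominate the student's
    training signal.
    """
    dom_att = softmax(domain_emb @ target_emb)     # (k,) one score per domain
    inst_att = softmax(instance_emb @ target_emb)  # (n,) one score per instance
    w = inst_att * dom_att[domain_of]              # combine the two levels
    return w / w.sum()                             # renormalize to sum to 1
```

A student model would then minimize a pseudo-label loss weighted by `w`, alongside the supervised and relevance-detection losses of the multi-task setup.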
Reference: [1] Anthony Aue and Michael Gamon. "Customizing Sentiment Classifiers to New Domains: A Case Study". Jan. 2005.
[2] John Blitzer, Mark Dredze, and Fernando Pereira. "Biographies, Bollywood, Boomboxes and Blenders: Domain Adaptation for Sentiment Classification". In: Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. Prague, Czech Republic: Association for Computational Linguistics, June 2007, pp. 440–447. URL: https://www.aclweb.org/anthology/P07-1056.
[3] John Blitzer, Ryan McDonald, and Fernando Pereira. "Domain Adaptation with Structural Correspondence Learning". In: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing. Sydney, Australia: Association for Computational Linguistics, July 2006, pp. 120–128. URL: https://www.aclweb.org/anthology/W06-1615.
[4] Danushka Bollegala, David Weir, and John Carroll. "Cross-Domain Sentiment Classification Using a Sentiment Sensitive Thesaurus". In: IEEE Transactions on Knowledge and Data Engineering 25.8 (2013), pp. 1719–1731. DOI: 10.1109/TKDE.2012.103.
[5] Danushka Bollegala, David Weir, and John Carroll. "Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification". In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Portland, Oregon, USA: Association for Computational Linguistics, June 2011, pp. 132–141. URL: https://www.aclweb.org/anthology/P11-1014.
[6] Minmin Chen et al. "Marginalized Denoising Autoencoders for Domain Adaptation". In: CoRR abs/1206.4683 (2012). arXiv: 1206.4683. URL: http://arxiv.org/abs/1206.4683.
[7] Junyoung Chung et al. "Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling". In: CoRR abs/1412.3555 (2014). arXiv: 1412.3555. URL: http://arxiv.org/abs/1412.3555.
[8] Xia Cui and Danushka Bollegala. Multi-source Attention for Unsupervised Domain Adaptation. 2020. arXiv: 2004.06608 [cs.CL].
[9] Xia Cui and Danushka Bollegala. "Self-Adaptation for Unsupervised Domain Adaptation". In: RANLP. 2019.
[10] Yong Dai et al. Adversarial Training Based Multi-Source Unsupervised Domain Adaptation for Sentiment Analysis. 2020. arXiv: 2006.05602 [cs.CL].
[11] Jacob Devlin et al. "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding". In: CoRR abs/1810.04805 (2018). arXiv: 1810.04805. URL: http://arxiv.org/abs/1810.04805.
[12] X. Ding et al. "Learning Multi-Domain Adversarial Neural Networks for Text Classification". In: IEEE Access 7 (2019), pp. 40323–40332. DOI: 10.1109/ACCESS.2019.2904858.
[13] Hady Elsahar and Matthias Gallé. "To Annotate or Not? Predicting Performance Drop under Domain Shift". In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Hong Kong, China: Association for Computational Linguistics, Nov. 2019, pp. 2163–2173. DOI: 10.18653/v1/D19-1222. URL: https://www.aclweb.org/anthology/D19-1222.
[14] Yaroslav Ganin et al. "Domain-Adversarial Training of Neural Networks". In: Journal of Machine Learning Research 17.59 (2016), pp. 1–35. URL: http://jmlr.org/papers/v17/15-239.html.
[15] Deepanway Ghosal et al. "KinGDOM: Knowledge-Guided DOMain Adaptation for Sentiment Analysis". In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Online: Association for Computational Linguistics, July 2020, pp. 3198–3210. DOI: 10.18653/v1/2020.acl-main.292. URL: https://www.aclweb.org/anthology/2020.acl-main.292.
[16] Arthur Gretton et al. "A Kernel Two-Sample Test". In: Journal of Machine Learning Research 13.25 (2012), pp. 723–773. URL: http://jmlr.org/papers/v13/gretton12a.html.
[17] Han Guo, Ramakanth Pasunuru, and Mohit Bansal. Multi-Source Domain Adaptation for Text Classification via DistanceNet-Bandits. 2020. arXiv: 2001.04362 [cs.CL].
[18] Jiang Guo, Darsh J. Shah, and Regina Barzilay. "Multi-Source Domain Adaptation with Mixture of Experts". In: CoRR abs/1809.02256 (2018). arXiv: 1809.02256. URL: http://arxiv.org/abs/1809.02256.
[19] Sepp Hochreiter and Jürgen Schmidhuber. "Long Short-Term Memory". In: Neural Computation 9.8 (1997), pp. 1735–1780. DOI: 10.1162/neco.1997.9.8.1735. URL: https://doi.org/10.1162/neco.1997.9.8.1735.
[20] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. "Deep Learning". In: Nature 521.7553 (2015), pp. 436–444. DOI: 10.1038/nature14539. URL: https://doi.org/10.1038/nature14539.
[21] Zheng Li et al. "End-to-End Adversarial Memory Network for Cross-domain Sentiment Classification". In: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI-17. 2017, pp. 2237–2243. DOI: 10.24963/ijcai.2017/311. URL: https://doi.org/10.24963/ijcai.2017/311.
[22] Zheng Li et al. "Hierarchical Attention Transfer Network for Cross-domain Sentiment Classification". Jan. 2018.
[23] Jensen–Shannon divergence. URL: https://en.wikipedia.org/wiki/Jensen%E2%80%93Shannon_divergence.
[24] Sinno Jialin Pan et al. "Cross-Domain Sentiment Classification via Spectral Feature Alignment". In: Proceedings of the 19th International Conference on World Wide Web. WWW '10. Raleigh, North Carolina, USA: Association for Computing Machinery, 2010, pp. 751–760. ISBN: 9781605587998. DOI: 10.1145/1772690.1772767. URL: https://doi.org/10.1145/1772690.1772767.
[25] Alan Ramponi and Barbara Plank. Neural Unsupervised Domain Adaptation in NLP—A Survey. 2020. arXiv: 2006.00632 [cs.CL].
[26] Sebastian Ruder. "An Overview of Multi-Task Learning in Deep Neural Networks". In: CoRR abs/1706.05098 (2017). arXiv: 1706.05098. URL: http://arxiv.org/abs/1706.05098.
[27] Sebastian Ruder and Barbara Plank. "Strong Baselines for Neural Semi-supervised Learning under Domain Shift". In: CoRR abs/1804.09530 (2018). arXiv: 1804.09530. URL: http://arxiv.org/abs/1804.09530.
[28] Sainbayar Sukhbaatar et al. "Weakly Supervised Memory Networks". In: CoRR abs/1503.08895 (2015). arXiv: 1503.08895. URL: http://arxiv.org/abs/1503.08895.
[29] Pascal Vincent et al. "Extracting and Composing Robust Features with Denoising Autoencoders". In: Proceedings of the 25th International Conference on Machine Learning. ICML '08. Helsinki, Finland: Association for Computing Machinery, 2008, pp. 1096–1103. ISBN: 9781605582054. DOI: 10.1145/1390156.1390294. URL: https://doi.org/10.1145/1390156.1390294.
[30] Jason W. Wei and Kai Zou. "EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks". In: CoRR abs/1901.11196 (2019). arXiv: 1901.11196. URL: http://arxiv.org/abs/1901.11196.
[31] Fangzhao Wu and Yongfeng Huang. "Sentiment Domain Adaptation with Multiple Sources". In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Berlin, Germany: Association for Computational Linguistics, Aug. 2016, pp. 301–310. DOI: 10.18653/v1/P16-1029. URL: https://www.aclweb.org/anthology/P16-1029.
[32] Qizhe Xie et al. "Unsupervised Data Augmentation". In: CoRR abs/1904.12848 (2019). arXiv: 1904.12848. URL: http://arxiv.org/abs/1904.12848.
[33] Shujuan Yu et al. "Hierarchical Data Augmentation and the Application in Text Classification". In: IEEE Access (Dec. 2019). DOI: 10.1109/ACCESS.2019.2960263.
[34] Han Zhao et al. "Adversarial Multiple Source Domain Adaptation". In: Advances in Neural Information Processing Systems. Ed. by S. Bengio et al. Vol. 31. Curran Associates, Inc., 2018, pp. 8559–8570. URL: https://proceedings.neurips.cc/paper/2018/file/717d8b3d60d9eea997b35b02b6a4e867-Paper.pdf.
Description: Master's thesis
National Chengchi University
Department of Computer Science
108753207
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0108753207
    Data Type: thesis
    DOI: 10.6814/NCCU202101251
Appears in Collections: [Department of Computer Science] Theses

    Files in This Item:

File: 320701.pdf
Size: 1556 Kb
Format: Adobe PDF


All items in NCCUR are protected by copyright, with all rights reserved.



Copyright Announcement
1. The digital content of this website is part of the National Chengchi University Institutional Repository. It is provided free of charge for academic research, public education, and other non-commercial uses. Please use the content of this website in a proper and reasonable manner and respect the rights of copyright owners; for commercial use, please obtain authorization from the copyright owner in advance.

2. Every effort has been made in building this website to avoid infringing the rights of copyright owners. If you believe that any digital content on this website nevertheless infringes copyright, please notify the site maintainers (nccur@nccu.edu.tw); they will immediately take remedial measures, such as removing the work, and investigate the claim.