National Chengchi University Institutional Repository (NCCUR): Item 140.119/150169


    Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/150169


    Title: Visualization of Social Media Opinion Detection Using Transformers
    (original title: 基於Transformers的社群媒體輿論風向變化視覺化分析系統)
    Author: Chen, Yue-Hung (陳岳紘)
    Contributors: Chi, Ming-Te (紀明德)
    Chen, Yue-Hung (陳岳紘)
    Keywords: Visualization
    Large language models
    Social media
    Date: 2024
    Uploaded: 2024-03-01 13:41:54 (UTC+8)
    Abstract: In recent years, social media has become an indispensable part of people's lives. As internet technology advances, the volume of social media data keeps growing, making efficient information extraction a crucial challenge. Meanwhile, the emergence of large language models has made text analysis more feasible and broadened its scope. Against this background, this study explores the possibility of building a text-based visualization system on transformer-based language models. It uses topic modeling to extract shifts in opinion on social media and proposes a two-stage clustering approach to make the analysis of these shifts more efficient. To combine conversational language models with the visualization system, the study also investigates how to make GPT produce output in a specific pattern: through prompt engineering experiments, the prompt for stance analysis of comments was refined so that the output can be used directly by downstream programs. The study further proposes a visualization based on physical collision that lets users quickly grasp opinion shifts on social media and explore topics of interest in more depth. Stance analysis results are shown on a timeline and combined with other information so that users can examine the data from different perspectives. Finally, a series of quantitative metrics is used to evaluate the results, and several use cases are presented.
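    Description: Master's
    National Chengchi University
    Department of Computer Science
    110753121
    Source: http://thesis.lib.nccu.edu.tw/record/#G0110753121
    Type: thesis
    Appears in collections: [Department of Computer Science] Theses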

    Files in this item:

    File          Description   Size     Format      Views
    312101.pdf                  8490Kb   Adobe PDF   0


    All items in NCCUR are protected by the original copyright.



    Copyright Announcement
    1. The digital content of this website is part of the National Chengchi University Institutional Repository. It is provided free of charge for non-commercial uses such as academic research and public education. Please use it in a proper and reasonable manner and respect the rights of copyright owners. For commercial use, please obtain authorization from the copyright owner in advance.

    2. Every effort has been made in building this website to avoid infringing the rights of copyright owners. If you believe that any material on the website infringes copyright, please contact our staff (nccur@nccu.edu.tw). We will remove the work from the repository and investigate your claim.