Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/150169
|
Title: | 基於Transformers的社群媒體輿論風向變化視覺化分析系統 Visualization of Social Media Opinion Detection Using Transformers |
Authors: | 陳岳紘 Chen, Yue-Hung |
Contributors: | 紀明德 Chi, Ming-Te; 陳岳紘 Chen, Yue-Hung |
Keywords: | 資訊視覺化; 大型語言模型; 社群媒體; Visualization; Large language models; Social media |
Date: | 2024 |
Issue Date: | 2024-03-01 13:41:54 (UTC+8) |
Abstract: | 近年來,社群媒體逐漸成為人們生活中不可或缺的一部分,而大型語言模型的出現提升了文本分析的可行性與發展性,在這樣的背景下,本研究探討了使用基於 transformers 的語言模型實現基於文本的視覺化系統的可能性,利用主題建模的技術擷取社群媒體中的風向變化,並且提出兩階段分群的作法提升風向變化分析的效率。為了結合對話式語言模型與視覺化系統,本研究也探討了如何使用 GPT 輸出特定模式的結果,透過提示工程的實驗,我們改良了留言的立場分析的提示詞,使輸出的結果能夠直接為後續程式所用。本研究也提出基於物理碰撞的視覺化方式,能夠讓使用者快速了解社群媒體中的風向變化,並且對感興趣的主題進行進一步的瞭解。我們利用時間軸表示立場分析的結果,並結合各種資訊,讓使用者能夠從各種不同面向對資料進行觀察。最後,我們也使用一連串量化分析的指標來測試這些結果,並提出一些使用案例。 In recent years, social media has become an indispensable part of people's lives. As internet technology has advanced, the volume of data on social media has grown steadily, making the efficient extraction of information from social media a crucial challenge. Meanwhile, the emergence of large language models has made text analysis more feasible and broadened its scope. This study therefore explores the possibility of implementing a text-based visualization system using transformer-based language models. It uses topic modeling techniques to extract opinion changes within social media and proposes a two-stage clustering approach to make the analysis of opinion changes more efficient. To combine a conversational language model with the visualization system, the study also investigates how to make GPT models produce output in a specific pattern. Through prompt engineering experiments, the prompt for stance analysis of comments is refined so that the results can be used directly by subsequent programs. In addition, a visualization approach based on physical collision is proposed, allowing users to quickly grasp changes in opinion on social media and to explore topics of interest further. The stance analysis results are represented on a timeline and combined with various information so that users can observe the data from different perspectives. Finally, a series of quantitative experiments is conducted to evaluate these results, and several use cases are presented. |
Description: | Master's thesis; National Chengchi University; Department of Computer Science; 110753121 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0110753121 |
Data Type: | thesis |
Appears in Collections: | [Department of Computer Science] Theses
|
Files in This Item:
File | Description | Size | Format |
312101.pdf | | 8490Kb | Adobe PDF |
|
All items in 政大典藏 are protected by copyright, with all rights reserved.
|