National Chengchi University Institutional Repository (NCCUR): Item 140.119/157832
    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/157832


    Title: Interpretable Mixture of Experts Based on Large Language Models: A Multimodal Data-Driven Framework for Innovative Automated Trading (基於大型語言模型的可解釋性混合專家系統:多模態資料驅動的創新自動交易框架)
    Authors: Liu, Kuan-Ming (劉冠銘)
    Contributors: Chiang, Mi-Hsiu (江彌修)
    Liu, Kuan-Ming (劉冠銘)
    Keywords: Large Language Models
    Mixture-of-Experts
    Dynamic Trading Strategies
    Date: 2025
    Issue Date: 2025-07-01 15:16:48 (UTC+8)
    Abstract: With the rapid development of deep learning and large language models (LLMs), the application of Mixture-of-Experts (MoE) models in stock investment has gained momentum. While these models demonstrate strong trading performance, most remain limited to single-modal data processing, overlooking the rich information provided by other modalities such as text. Moreover, traditional neural-network-based router mechanisms fail to adequately account for contextual and real-world nuances, leading to suboptimal expert selection. To address these issues, this study proposes a novel framework that incorporates a large language model as the router within the MoE architecture, leveraging the pre-trained world knowledge and reasoning capabilities of LLMs to dynamically select experts for processing historical price data and stock news. This approach not only improves the efficiency and accuracy of expert selection but also enhances model interpretability.
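
    To make the routing mechanism concrete, the following Python sketch shows one way an LLM could act as the router of a MoE trading system. It is a minimal illustration under stated assumptions, not the thesis's implementation: the two experts, the prompt format, and the `query_llm` stub (a stand-in for any chat-completion API) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class MarketView:
    prices: list[float]   # recent closing prices, oldest first
    headlines: list[str]  # recent news headlines for the ticker

def query_llm(prompt: str) -> str:
    """Stub for a chat-completion call. A real implementation would send
    `prompt` to an LLM and return its reply; here we return a canned answer
    so the sketch runs end to end."""
    return "optimistic | earnings beat and rising price momentum"

def price_momentum_expert(view: MarketView) -> int:
    """Hypothetical 'optimistic' expert: go long if momentum is positive."""
    return 1 if view.prices[-1] > view.prices[0] else 0

def news_caution_expert(view: MarketView) -> int:
    """Hypothetical 'pessimistic' expert: stay flat unless prices clearly rise."""
    recent_gain = view.prices[-1] / view.prices[0] - 1.0
    return 1 if recent_gain > 0.02 else 0

EXPERTS: dict[str, Callable[[MarketView], int]] = {
    "optimistic": price_momentum_expert,
    "pessimistic": news_caution_expert,
}

def llm_route(view: MarketView) -> tuple[int, str]:
    """LLM-as-router: the model reads both modalities (numbers and text),
    names an expert, and justifies the choice in natural language, which is
    the source of the framework's interpretability."""
    prompt = (
        "Recent closes: " + ", ".join(f"{p:.2f}" for p in view.prices) + "\n"
        "Headlines:\n- " + "\n- ".join(view.headlines) + "\n"
        f"Pick one expert from {list(EXPERTS)} and justify briefly, "
        "formatted as '<expert> | <reason>'."
    )
    expert_name, _, rationale = query_llm(prompt).partition("|")
    # Fall back to a default expert if the LLM answer is malformed.
    expert = EXPERTS.get(expert_name.strip(), price_momentum_expert)
    signal = expert(view)  # 1 = long, 0 = flat
    return signal, rationale.strip()

view = MarketView(prices=[101.2, 102.8, 104.1],
                  headlines=["ACME beats Q2 earnings estimates"])
signal, why = llm_route(view)
print(f"signal={signal}  rationale={why!r}")
```

    The key design point, as described in the abstract, is that the router's output is a natural-language rationale plus an expert choice, so every trade can be traced back to a human-readable explanation.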

    Experimental results show that the proposed framework, LLMoE, evaluated on real multimodal stock data, significantly outperforms traditional MoE models and other deep neural network methods across core metrics including Total Return (TR), Sharpe Ratio (SR), and Calmar Ratio (CR). By integrating numerical and textual data, LLMoE achieves more effective expert selection and trading decisions, demonstrating superior risk-adjusted performance. Furthermore, the framework's flexible architecture adapts easily to various downstream tasks, while its transparency, conveyed through natural-language reasoning, strengthens the credibility of trading decisions. In conclusion, this study provides an innovative intelligent trading solution that addresses the shortcomings of traditional models and opens new directions for future applications and research in financial markets.
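
    For reference, here is a minimal sketch of the three evaluation metrics under standard definitions (daily returns, zero risk-free rate, and 252 trading days per year; the thesis may use different annualization conventions):

```python
import math

def total_return(returns: list[float]) -> float:
    """TR: compounded growth over the whole backtest, e.g. 0.25 = +25%."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0

def sharpe_ratio(returns: list[float], periods_per_year: int = 252) -> float:
    """SR: mean return over volatility, annualized (risk-free rate
    assumed zero here for simplicity)."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

def max_drawdown(returns: list[float]) -> float:
    """Largest peak-to-trough loss of the compounded equity curve."""
    equity, peak, mdd = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        mdd = max(mdd, 1.0 - equity / peak)
    return mdd

def calmar_ratio(returns: list[float], periods_per_year: int = 252) -> float:
    """CR: annualized return divided by maximum drawdown."""
    years = len(returns) / periods_per_year
    annualized = (1.0 + total_return(returns)) ** (1.0 / years) - 1.0
    return annualized / max_drawdown(returns)

daily = [0.004, -0.002, 0.006, -0.005, 0.003] * 50  # toy 250-day series
print(f"TR={total_return(daily):.2%}  "
      f"SR={sharpe_ratio(daily):.2f}  "
      f"CR={calmar_ratio(daily):.2f}")
```
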
    Reference: Ding, H., Li, Y., Wang, J., & Chen, H. (2024a). Large language model agent in financial trading: A survey. arXiv preprint arXiv:2408.06361.

    Ding, Q., Shi, H., & Liu, B. (2024b). Tradexpert: Revolutionizing trading with mixture of expert LLMs. arXiv preprint arXiv:2411.00782.

    Hu, Z., Liu, W., Bian, J., Liu, X., & Liu, T.-Y. (2018). Listening to chaotic whispers: A deep learning framework for news-oriented stock trend prediction. In Proceedings of the 11th ACM International Conference on Web Search and Data Mining, 261–269.

    Iacovides, G., Konstantinidis, T., Xu, M., & Mandic, D. (2024). Finllama: LLM-based financial sentiment analysis for algorithmic trading. In Proceedings of the 5th ACM International Conference on AI in Finance, 134–141.

    Bidec Innovations (2018). Stock price and news related to it [Dataset]. Kaggle. Available at: https://www.kaggle.com/datasets/BidecInnovations/stock-price-and-news-realted-to-it/ (Accessed: 2025-05-06).

    Jin, M., Wang, S., Ma, L., Chu, Z., Zhang, J. Y., Shi, X., Chen, P.-Y., Liang, Y., Li, Y.-F., Pan, S., et al. (2023). Time-LLM: Time series forecasting by reprogramming large language models. arXiv preprint arXiv:2310.01728.

    Gu, W., Zhong, Y., Li, S., Wei, C., Dong, L., Wang, Z., & Yan, C. (2024). Predicting stock prices with FinBERT-LSTM: Integrating news sentiment analysis. In Proceedings of the 2024 8th International Conference on Cloud and Big Data Computing, 67–72.

    Kou, Z., Yu, H., Peng, J., & Chen, L. (2024). Automate strategy finding with LLM in quant investment. arXiv preprint arXiv:2409.06289.

    Li, K., & Xu, J. (2023). An attention-based multi-gate mixture-of-experts model for quantitative stock selection. International Journal of Trade, Economics and Finance, 14(3), 165–173.

    Li, Y., Yu, Y., Li, H., Chen, Z., & Khashanah, K. (2023). TradingGPT: Multi-agent system with layered memory and distinct characters for enhanced financial trading performance. arXiv preprint arXiv:2309.03736.

    Lopez-Lira, A., & Tang, Y. (2023). Can ChatGPT forecast stock price movements? Return predictability and large language models. arXiv preprint arXiv:2304.07619.

    Sawhney, R., Agarwal, S., Wadhwa, A., & Shah, R. (2020). Deep attentive learning for stock movement prediction from social media text and company correlations. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 8415–8426.

    Shen, S., Hou, L., Zhou, Y., Du, N., Longpre, S., Wei, J., Chung, H. W., Zoph, B., Fedus, W., Chen, X., et al. (2023). Mixture-of-experts meets instruction tuning: A winning combination for large language models. arXiv preprint arXiv:2305.14705.

    Shi, X., Wang, S., Nie, Y., Li, D., Ye, Z., Wen, Q., & Jin, M. (2024). Time-MoE: Billion-scale time series foundation models with mixture of experts. arXiv preprint arXiv:2409.16040.

    Soun, Y., Yoo, J., Cho, M., Jeon, J., & Kang, U. (2022). Accurate stock movement prediction with self-supervised learning from sparse noisy tweets. In 2022 IEEE International Conference on Big Data (Big Data), 1691–1700. IEEE.

    Sun, S., Wang, X., Xue, W., Lou, X., & An, B. (2023). Mastering stock markets with efficient mixture of diversified trading experts. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2109–2119.

    Toner, W., & Darlow, L. (2024). An analysis of linear time series forecasting models. arXiv preprint arXiv:2403.14587.

    Vallarino, D. (2024). A dynamic approach to stock price prediction: Comparing RNN and mixture of experts models across different volatility profiles. arXiv preprint arXiv:2410.07234.

    Xu, W., Liu, W., Xu, C., Bian, J., Yin, J., & Liu, T.-Y. (2021). REST: Relational event-driven stock trend forecasting. In Proceedings of the Web Conference 2021, 1–10.

    Yoo, J., Soun, Y., Park, Y.-C., & Kang, U. (2021). Accurate multivariate stock movement prediction via data-axis transformer with multi-level contexts. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 2037–2045.

    Yu, Y., Li, H., Chen, Z., Jiang, Y., Li, Y., Zhang, D., Liu, R., Suchow, J. W., & Khashanah, K. (2024a). FinMEM: A performance-enhanced LLM trading agent with layered memory and character design. In Proceedings of the AAAI Symposium Series, 3, 595–597.

    Yu, Z., Wu, Y., Wang, G., & Weng, H. (2024b). MIGA: Mixture-of-experts with group aggregation for stock market prediction. arXiv preprint arXiv:2410.02241.

    Zeng, A., Chen, M., Zhang, L., & Xu, Q. (2023a). Are transformers effective for time series forecasting? In Proceedings of the AAAI Conference on Artificial Intelligence, 37, 11121–11128.

    Zeng, Z., Kaur, R., Siddagangappa, S., Rahimi, S., Balch, T., & Veloso, M. (2023b). Financial time series forecasting using CNN and transformer. arXiv preprint arXiv:2304.04912.

    Zhao, H., Liu, Z., Wu, Z., Li, Y., Yang, T., Shu, P., Xu, S., Dai, H., Zhao, L., Mai, G., et al. (2024). Revolutionizing finance with LLMs: An overview of applications and insights. arXiv preprint arXiv:2401.11641.

    Zhou, Y., Lei, T., Liu, H., Du, N., Huang, Y., Zhao, V., Dai, A. M., Le, Q. V., Laudon, J., et al. (2022). Mixture-of-experts with expert choice routing. Advances in Neural Information Processing Systems, 35, 7103–7114.
    Description: Master's thesis
    National Chengchi University
    Department of Money and Banking
    112352014
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0112352014
    Data Type: thesis
    Appears in Collections:[Department of Money and Banking] Theses

    Files in This Item:

    201401.pdf (2053 KB, Adobe PDF)


    All items in the NCCU Institutional Repository (政大典藏) are protected by copyright, with all rights reserved.

