National Chengchi University Institutional Repository (NCCUR): Item 140.119/114285


    Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/114285


    Title: The Application of Deep Reinforcement Learning on Dynamic Asset Allocation: A Case Study of U.S. ETFs
    Author: 劉上瑋
    Contributors: 廖四郎
    劉上瑋
    Keywords: Dynamic asset allocation
    Deep reinforcement learning
    Q-Learning
    Neural network
    Date: 2017
    Upload time: 2017-11-01 14:21:03 (UTC+8)
    Abstract: Reinforcement learning learns through continuous interaction with an environment, with the goal of maximizing the sum of the rewards received in each period, and it is widely used in multi-period decision problems. These properties make it a natural fit for dynamic asset allocation strategies, which must repeatedly adjust the allocation weights of a portfolio.
    This study applies the Deep Q-Learning algorithm to build a dynamic asset allocation strategy and examines how to find the optimal allocation weights under the environment state observed in each period. The portfolio is built from U.S. large-cap and mid-cap equity ETFs and investment-grade bond ETFs, and the model is trained on their daily returns from July 2, 2007 to June 30, 2017. Its performance is compared with buy-and-hold and constant-mix strategies to assess the suitability of deep reinforcement learning for dynamic asset allocation.
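The abstract does not state how the thesis defines its states, actions, or rewards, so the following is only a minimal sketch of the standard tabular Q-learning update that Deep Q-Learning approximates with a neural network. The action set `STOCK_WEIGHTS`, the two-state table used in the usage note, and all function names are illustrative assumptions, not the author's design.

```python
import random

# Hypothetical action set (an assumption, not taken from the thesis):
# the fraction of the portfolio held in the equity ETF; the remainder
# is held in the bond ETF.
STOCK_WEIGHTS = [0.0, 0.25, 0.5, 0.75, 1.0]


def portfolio_return(stock_weight, stock_ret, bond_ret):
    """One-period return of a two-asset portfolio, used here as the reward."""
    return stock_weight * stock_ret + (1.0 - stock_weight) * bond_ret


def new_q_table(n_states, n_actions):
    """Create a Q-table of zeros, indexed as q_table[state][action]."""
    return [[0.0] * n_actions for _ in range(n_states)]


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Apply one Q-learning update in place and return the TD error.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    td_target = reward + gamma * max(q_table[next_state])
    td_error = td_target - q_table[state][action]
    q_table[state][action] += alpha * td_error
    return td_error


def epsilon_greedy(q_table, state, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    row = q_table[state]
    if rng.random() < epsilon:
        return rng.randrange(len(row))
    # Ties are broken in favour of the lowest action index.
    return max(range(len(row)), key=row.__getitem__)
```

As a usage illustration, with two made-up market states one could call `q = new_q_table(2, len(STOCK_WEIGHTS))`, choose `a = epsilon_greedy(q, state, 0.1)`, observe the next day's ETF returns, and then call `q_update(q, state, a, portfolio_return(STOCK_WEIGHTS[a], stock_ret, bond_ret), next_state)`. A deep variant would replace the table lookup with a network that maps the state to one Q-value per action.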
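    Description: Master's thesis
    National Chengchi University
    Department of Money and Banking (金融學系)
    104352029
    Source: http://thesis.lib.nccu.edu.tw/record/#G0104352029
    Type: thesis
    Appears in collections: [Department of Money and Banking] Theses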

    Files in this item:

    File          Size      Format       Views
    202901.pdf    1621Kb    Adobe PDF    2992


    All items in NCCUR are protected by the original copyright.

