Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/118697
Title: | A Study on Establishing a Stock Trading Agent with AI Reinforcement Learning: The Case of Taiwan Semiconductor Manufacturing Company Stock
Authors: | Lin, Jui-Feng (林睿峰)
Contributors: | 姜國輝; 季延平; Lin, Jui-Feng (林睿峰)
Keywords: | Machine learning; Reinforcement learning; Q-learning; Deep Q-learning
Date: | 2018 |
Issue Date: | 2018-07-17 11:25:51 (UTC+8) |
Abstract: | Reinforcement learning, one of the techniques of machine learning, is inspired by behaviorism in psychology: it imitates the way living creatures learn from interaction with the environment, gradually changing their behavior by pursuing rewards and avoiding punishment. Reinforcement learning excels at control problems that require a sequence of decisions, and stock trading fits the nature of this type of problem. The states of the stock market environment, however, are so diverse that they cannot be summarized by a finite set of state types, and training an agent to respond to every possible state would be prohibitively expensive. This study therefore adopts two training models. The first uses the clustering ability of unsupervised learning to group environmental states and then trains the agent with the Q-learning algorithm. The second trains the value function with the Deep Q-learning algorithm, which combines reinforcement learning with deep learning, exploiting the function-approximation ability of deep networks to build a stock trading agent based on a Deep Q Network (DQN). In the system design, the trading agent observes the market state through technical indicators including MA, MACD, RSI, BIAS, and KD. To determine which indicators best represent the market state, this study designs seven combinations of technical indicators and compares their measured performance. The difference between the funds held at the end of the investment and the funds held at the beginning, i.e. the total profit or loss, serves as the reward signal that drives the agent to change its trading behavior in pursuit of higher profit. Taking Taiwan Semiconductor Manufacturing Company stock as an example, this study uses six years of after-hours data published on the Taiwan Stock Exchange website, from November 3, 2011 to December 1, 2017, to train and test the trading agents. In the best-performing model, the trading agent achieves an average annual return of 16.14% and forms a stable trading strategy capable of earning profits effectively.
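The first training model described in the abstract (clustering environmental states with unsupervised learning, then training with Q-learning) can be illustrated with a minimal sketch. The clustering algorithm (k-means here), the indicator matrix, the action set, the lot size, and all hyperparameters are assumptions for illustration, not the thesis's actual design; the thesis rewards the total profit or loss at the end of trading, whereas this sketch uses the per-step change in asset value for simplicity.

```python
# Minimal illustrative sketch: cluster indicator-based market states with
# unsupervised learning (k-means assumed) and train a tabular Q-learning agent.
# Indicator columns, cluster count, action set, and hyperparameters are
# illustrative assumptions, not taken from the thesis.
import numpy as np
from sklearn.cluster import KMeans

N_CLUSTERS = 20                      # assumed number of state clusters
ACTIONS = (0, 1, 2)                  # 0 = hold, 1 = buy, 2 = sell
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

def train_q_table(indicators: np.ndarray, close: np.ndarray, episodes: int = 50) -> np.ndarray:
    """indicators: (T, d) matrix of MA/MACD/RSI/BIAS/KD values; close: (T,) closing prices."""
    states = KMeans(n_clusters=N_CLUSTERS, n_init=10).fit_predict(indicators)
    q = np.zeros((N_CLUSTERS, len(ACTIONS)))
    rng = np.random.default_rng(0)
    for _ in range(episodes):
        cash, shares = 1_000_000.0, 0
        for t in range(len(close) - 1):
            s = states[t]
            # epsilon-greedy action selection
            a = rng.integers(len(ACTIONS)) if rng.random() < EPSILON else int(q[s].argmax())
            if a == 1 and shares == 0:          # buy one share (lot size simplified to 1)
                shares, cash = 1, cash - close[t]
            elif a == 2 and shares == 1:        # sell the held share
                shares, cash = 0, cash + close[t]
            # per-step reward: change in total asset value over the next step
            # (the thesis instead rewards total profit or loss at the end of trading)
            reward = shares * (close[t + 1] - close[t])
            q[s, a] += ALPHA * (reward + GAMMA * q[states[t + 1]].max() - q[s, a])
    return q
```

Once the table is learned, the greedy action for a new trading day would be `int(q[state].argmax())`, where `state` is the cluster assigned to that day's indicator values.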
Description: | Master's thesis, Department of Management Information Systems, National Chengchi University, 105356036
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0105356036 |
Data Type: | thesis |
DOI: | 10.6814/THE.NCCU.MIS.004.2018.A05 |
Appears in Collections: | [Department of Management Information Systems] Theses
Files in This Item:
File | Size | Format
603601.pdf | 2,279 KB | Adobe PDF
All items in 政大典藏 are protected by copyright, with all rights reserved.