Abstract |
In this paper we present a deep reinforcement learning-based methodology for computing optimized trading policies. In the first stage, we employ Gated Recurrent Units (GRUs) to predict the immediate future behaviour of the time series that describe how the values of a set of assets evolve over time. In the second stage, we employ a Deep Q-Learning architecture to compute optimized trading policies that specify, at every point in time, which assets to buy and which to sell in order to maximize profit. Our experimental results, which are based on trading cryptocurrencies, show that the proposed algorithm effectively computes trading policies that achieve incremental profits from an initial budget. © 2020, Springer Nature Switzerland AG. |