A General Framework of Optimal Investment

1. Introduction

In this paper, we propose a general framework under which optimal portfolio construction and investment activities can be carried out, along with a number of trading strategies. This paper adds to the investment management literature by introducing the following ideas. First, we recognize the time-varying property of asset return distributions^{1}, and we develop and call for new methods, potentially based on machine learning and panel regression, to conduct model inference and portfolio optimization. Second, in order to build the investment management framework, we propose to use machine learning methods to handle big data inputs and the dynamic programming problems that might arise when we model the market uncertainty and formulate a dynamic optimal investment problem. Third, and most importantly, we propose a completely different way to look at the randomness of the financial market. Previous work focuses on forecasting-type investment management techniques and tries to build various types of models to predict the uncertainties; technical analysis is inevitably used. The authors of this paper, however, do not rely on the serial correlations of asset returns or on analysis based on technical indicators. Instead, we use the information in the cross-sectional data of asset returns to build various statistics, i.e., the crystal balls, that enable us to observe the future realizations of the market uncertainty at an aggregate or portfolio level. Investment strategies that are accurate at certain confidence levels are proposed and tested via simulation and backtesting studies. The analysis in this paper, for the first time in the financial literature, utilizes the cross-sectional data of financial assets to infer their aggregate time series behavior. The proposed theories prove to be effective both in simulations, in an artificial environment, and in backtesting studies with real market data.
Last, but not least, we propose a combination of brute-force, model-free approaches, such as machine learning (reinforcement learning or Q-learning) in financial analysis, which can be found in [1] - [6] , and purely theoretical approaches such as no arbitrage pricing, hedging and dynamic stochastic general equilibrium (DSGE) studies. We try to find a balance between those methodologies in order to yield better results, i.e., investment frameworks, investment management strategies and portfolio construction methodologies with good empirical performance. For example, we suggest using reinforcement learning (RL) to solve, potentially, the dynamic portfolio optimization problems with minimal assumptions on the underlying asset return dynamics. Moreover, we argue that the underlying asset dynamics, often modeled via an Itô process in the stochastic analysis literature, can be combined with artificial neural networks to yield better fits to the market data without losing the theoretical grounding^{2}.

Throughout the history of financial analysis, researchers and practitioners have strived to build short-, medium- or long-term investment or trading strategies that benefit from economic and market movements. However, the work in the literature invariably focuses on predicting future market movements with the information available from the past. Popular methods include, but are not limited to, trend following, mean-reversion and long-short strategies. Taking the class of trend following strategies as an example, a literature review can be found in [7] , which surveys methods that identify the trend of market movements and benefit from riding that trend. A loss occurs when the market reverts and oscillates. More descriptions can also be found in [8] , [9] and [10] , as an incomplete list. On the other hand, mean reversion strategies build on the belief that market variables revert to their long-run equilibrium levels, with short-term adjustments or fluctuations creating trading opportunities. Work in this category includes [11] , [12] and [13] , among others. However, this investment method suffers greatly if the spread between the two sets of chosen assets widens. Therefore, both classes of investment strategies incur losses under adverse market movements. The root cause is that information from the past might not be a good indicator of future market movements, or at least it can only predict or explain a limited fraction of future volatility.

Moreover, current practice estimates model parameters directly using historical time-series data, which, potentially, introduces some problems. First, the distributions of economic or financial time series might be time varying, therefore a brute-force inference using historical data might result in significant estimation bias. Second, imposing additional model structure on the historical data series introduces further model risks.

All those facts encourage the authors of this paper to search for a new framework under which portfolio management and trading activities are conducted. Ideally, first, this framework should be model-free, so that it can incorporate and accommodate all approaches, whether model dependent or model independent, static or dynamic. Second, this framework should allow efficient and accurate parameter inference that captures the time-dependent features of the coefficients involved.

In addition to the proposal of a general framework, we work with a large class of optimal investment strategies based on a rotation of the original asset space. For some of the strategies, we do not try to predict the market movements from the past data directly; we utilize the past data only to obtain model parameter estimates. All the parameters mentioned here are theoretically measurable at the current time and do not involve prediction. Instead, we try to identify, orthogonalize and isolate the random sources in the market and diversify away the randomness (asymptotically or exactly). Moreover, we do not try to model the serial correlation structure of the asset returns and mainly focus on their cross-sectional properties. Simulation and backtesting studies show good applicability of the investment models that we have developed. To the best of our knowledge, this paper is the first to discuss such orthogonalization and diversification in the literature of investment and portfolio management.

In the end, numerical experiments are carried out. We find consistently good performance under both simulation and backtesting studies, which coincides with the basic intuition that if we get good parameter estimates, performance will be guaranteed by mathematical and probability laws. In spite of the good performance, limitations of the testing approach and remediation are discussed.

The organization of this paper is as follows. Section 2 describes the main investment framework. Section 3 introduces concrete investment strategies under the proposed framework. Section 4 performs simulation studies to test the models proposed, Section 5 backtests the models using equity data in the US and China markets and Section 6 concludes. Readers who are only interested in the investment strategies can skip Section 2, which contains the rigorous mathematical derivations, and directly start from Section 3.

2. The Optimal Investment Framework

This section contains the mathematical description and construction of the optimal investment framework. We first introduce the necessary probability space and tools required for further analysis. Next, we propose a rotated asset space, which we will work with instead of the original asset space. Afterwards, we write down the general formulation of the optimization problem and propose an investment framework that jointly solves the optimization problem and performs parameter inference via machine learning (ML). Readers can directly start from Section 3 for concrete trading models, with the understanding that skipping this section does not prevent them from understanding the investment strategies.

2.1. Mathematical Setup

Assume that the randomness in the financial economic system under consideration is modeled by a filtered probability space $\left(\Omega \mathrm{,}F\mathrm{,}{\mathcal{F}}_{(\cdot )}\mathrm{,}\mathbb{P}\right)$, where $\Omega $ represents the sample space, modeling the entire collection of possible outcomes of the system, $F$ represents all the information in the system and ${\mathcal{F}}_{(\cdot )}\mathrm{:}={\left\{{F}_{t}\right\}}_{t\ge 0}$ is the information filtration with $F\mathrm{:}={\displaystyle {\cup}_{t\ge 0}{F}_{t}}$, satisfying the usual conditions, and $\mathbb{P}$ is the historical probability measure on $F$. There are M financial assets in the economic system, whose rate of return processes are denoted by an M-vector ${\left\{{R}_{t}\right\}}_{t\ge 0}$. Suppose that ${R}_{t}\in {L}^{2}\left({F}_{t}\right)$, meaning that ${R}_{t}$ has finite variance-covariance structure for all $t\ge 0$. Obviously we have the following (trivial) decomposition

$\begin{array}{rl}{R}_{t+h}&=\underbrace{\mathbb{E}\left[{R}_{t+h}|{\mathcal{F}}_{t}\right]}_{{\mu}_{t}}+\underbrace{\left({R}_{t+h}-\mathbb{E}\left[{R}_{t+h}|{\mathcal{F}}_{t}\right]\right)}_{\text{martingale difference sequence }{\hat{U}}_{t+h}}\\ &={\mu}_{t}+\underbrace{{\hat{U}}_{t+h}}_{{\sigma}_{t}{U}_{t+h}}\\ &={\mu}_{t}+{\sigma}_{t}{U}_{t+h}\\ &:=\underbrace{\mu \left({t}_{0},{t}_{0}+h,\cdots ,t,\omega ,{R}_{{t}_{0}},\cdots ,{R}_{t},{U}_{{t}_{0}},\cdots ,{U}_{t}\right)}_{\text{deterministic and predictable at time }t}\\ &\quad +\underbrace{\sigma \left({t}_{0},{t}_{0}+h,\cdots ,t,\omega ,{R}_{{t}_{0}},\cdots ,{R}_{t},{U}_{{t}_{0}},\cdots ,{U}_{t}\right)}_{\text{deterministic and predictable at time }t}\underbrace{{U}_{t+h}}_{\text{random at time }t}.\end{array}$ (1)

Here process ${\stackrel{^}{U}}_{t}\in {\mathbb{R}}^{M}$ is an M-vector generating the randomness, all the quantities are defined with proper dimensions, and it is obvious that ${F}_{t}=\sigma \left({\stackrel{^}{U}}_{0\to t}\right)$, i.e., $\stackrel{^}{U}$ generates all the information in the financial economic system. The last line of the above equation tries to impose some model structures on $\mu $ and $\sigma $, which can be of any functional form. $\stackrel{^}{U}$ can be modeled by, for example, a joint Lévy process, a system of stochastic differential equations, a linear or nonlinear time series or even a collection of artificial neural networks. A detailed explanation of Equation (1) is postponed to the next section.

2.2. Economic Settings and Financial Market

Following Section 2.1, we assume that the source of randomness in the economy and the financial market can be represented by an N-dimensional ($1\le N\le \infty $) jointly independent and identically distributed (i.i.d.)^{3} stochastic process ${U}_{t}={\left\{{U}_{t}^{j}\right\}}_{j=1}^{N}$ (conditional on ${F}_{t-h}$) with zero mean and ${\text{COV}}_{t}\left({U}_{t+h}^{i},{U}_{t+h}^{j}\right)={\delta}_{i,j}$, for any $t\ge 0$ and $h>0$, where ${\delta}_{i,j}$ is the Kronecker delta and h is the smallest time increment under our consideration. The information filtration ${\left\{{F}_{t}\right\}}_{t\ge 0}$ is generated by U. Recall that there are M primary financial assets traded in the market, whose rates of return are denoted by an M-vector ${R}_{t}={\left\{{R}_{t}^{m}\right\}}_{m=1}^{M}$. Suppose that we have the conditional decomposition^{4} (conditional on ${F}_{t-h}$)

${R}_{t}={\mu}_{t-h}+{\sigma}_{t-h}{U}_{t}$ (2)

where ${\mu}_{t-h}\in {F}_{t-h}$ is an $M\times 1$ vector and ${\sigma}_{t-h}\in {F}_{t-h}$ is an $M\times N$ matrix, which can both be estimated at some precision and accuracy at time $t-h$. Then, we know that ${\mathbb{E}}_{t-h}\left[{R}_{t}\right]={\mu}_{t-h}$ ^{5} and ${\text{COV}}_{t-h}\left[{R}_{t}-{\mu}_{t-h}\right]={\sigma}_{t-h}{\sigma}_{t-h}^{\top}$ ^{6}.

Note that, this setting is very general as $h\in {\mathbb{R}}^{+}$ can be 1-second, 1-day, 1-week or 1-year and it encompasses all the possible time frequencies. $\left({\mu}_{t-h}\mathrm{,}{\sigma}_{t-h}\right)$ can be any stochastic process materialized at time $t-h$, for every $\left(t\mathrm{,}h\right)\in {\mathbb{R}}_{2}^{+}$. $\left(\mu \mathrm{,}\sigma \right)$ also helps to model the cross-sectional and time series correlation structure of R. When $h\to 0+$, we can consider the limiting case of Equation (2) as a system of stochastic differential equations with jumps (SDEJ), when ${\mu}_{t-h}=\mu \left(t-h,{U}_{t-h},{R}_{t-h}\right)$ and ${\sigma}_{t-h}=\sigma \left(t-h,{U}_{t-h},{R}_{t-h}\right)$, i.e., they are functions of random sources ${U}_{t-h}$ and ${R}_{t-h}$ at time $t-h$. Equation (2) defines a general semi-martingale R when proper technical conditions are satisfied.
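As a minimal numerical sketch of Equation (2), the snippet below draws returns from $R_t=\mu_{t-h}+\sigma_{t-h}U_t$ and compares the sample moments with $\mu_{t-h}$ and with the $M\times M$ matrix $\sigma_{t-h}\sigma_{t-h}^{\top}$ implied by the unit covariance of U. The Gaussian choice for U, the helper name `simulate_returns` and all numerical values are assumptions made for illustration only.

```python
import numpy as np

def simulate_returns(mu, sigma, n_draws, seed=0):
    """Draw R = mu + sigma @ U, where U has zero mean and identity covariance."""
    rng = np.random.default_rng(seed)
    n_sources = sigma.shape[1]
    # Assumption: U is standard Gaussian; Equation (2) only fixes its first two moments.
    u = rng.standard_normal((n_draws, n_sources))
    return mu + u @ sigma.T

# Hypothetical parameters with M = 3 assets and N = 2 random sources.
mu = np.array([0.01, 0.02, 0.015])
sigma = np.array([[0.10, 0.00],
                  [0.05, 0.08],
                  [0.02, 0.03]])

r = simulate_returns(mu, sigma, n_draws=200_000)
sample_mean = r.mean(axis=0)          # should be close to mu
sample_cov = np.cov(r, rowvar=False)  # should be close to sigma @ sigma.T
```

Any other zero-mean, unit-covariance distribution could replace the Gaussian draw without changing the two moments being checked.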

Remark 1 (On M, N and Factor Structure). If $M\ge N$, i.e., the number of financial assets R is larger than the number of random sources U, we are in an effectively complete market. As some ( $M-N$, to be accurate) assets are redundant, we can choose N linearly independent assets in this case. However, if $M<N$, the market is incomplete. For the sake of generality, we study the case where $N=\infty $. Consider an orthonormal basis ${\left\{{e}_{t}^{n}\right\}}_{n=1}^{\infty}$ and decompose ${R}_{t}^{m}$ as

${R}_{t}^{m}={\displaystyle \underset{n=1}{\overset{\infty}{\sum}}}{\langle {R}^{m},{e}^{n}\rangle}_{t}{e}_{t}^{n}$ (3)

where ${\langle \cdot ,\cdot \rangle}_{t}$ is the canonical inner product in ${L}^{2}\left({F}_{t}\right)$, i.e., the Hilbert space of all the stochastic processes that have finite second order moments at time t, and m runs from 1 to M. Equation (3) defines an infinite series expansion of ${R}_{t}^{m}$, which we truncate after the first M terms and write

${R}_{t}^{m}={\displaystyle \underset{n=1}{\overset{M}{\sum}}}{\langle {R}^{m},{e}^{n}\rangle}_{t}{e}_{t}^{n}+{\phi}_{t}^{m,M}.$ (4)

Here ${\phi}_{t}^{m,M}:={\displaystyle {\sum}_{n=M+1}^{\infty}}{\langle {R}^{m},{e}^{n}\rangle}_{t}{e}_{t}^{n}$ is considered to be the residual term. Then, if we denote ${\Theta}_{t}:={\left({\theta}_{t}^{m,n}\right)}_{m,n}:={\left({\langle {R}^{m},{e}^{n}\rangle}_{t}\right)}_{m,n}$, ${e}_{t}=\left({e}_{t}^{1}\mathrm{,}\cdots \mathrm{,}{e}_{t}^{M}\right)$, likewise for ${R}_{t}$ and ${\varphi}_{t}^{M}=\left({\phi}_{t}^{\mathrm{1,}M}\mathrm{,}\cdots \mathrm{,}{\phi}_{t}^{M\mathrm{,}M}\right)$, we will have

${\Theta}_{t}^{-1}{R}_{t}={e}_{t}+{\Theta}_{t}^{-1}{\varphi}_{t}^{M}\mathrm{.}$ (5)

Clearly, our analysis is asymptotic in nature under the assumption that

$\frac{1}{M}{1}_{M}\cdot {\Theta}_{t}^{-1}{\varphi}_{t}^{M}\cong 0$

where ${1}_{M}$ is an M-vector with entries all equal to 1. The above description justifies the analysis in this paper, the asymptotic investment framework and related strategies to be proposed in Section 3.

Remark 2 (More on Factor Structure). Mapping Equation (2) to the popular factor representation of asset returns R, we have

${R}_{t}={\alpha}_{t-h}+{\beta}_{t-h}\cdot {F}_{t-\tau}+{\sigma}_{t-h}{\epsilon}_{t}$ (6)

where $\tau $ can be 0 or h. Equation (2) can be viewed as an equivalent form of Equation (6) after a proper Gram-Schmidt orthogonalization process on $\left(F,\epsilon \right)$. The validity of a factor representation is justified in Remark 1. Moreover, the determination of the factor space F requires a thorough theoretical and empirical study. For example, one choice of factor is the VIX index, studied in [14] . A deeper treatment of linear factor models can be found in [15] . Classical results can also be found in [16] , [17] , and references therein.

2.3. The Optimal Investment Asset Space

2.3.1. The Optimal Rotation

We illustrate ideas under the discrete-time setting, with the understanding that to solve continuous time models, time discretization is inevitable, which essentially reduces a continuous time problem to a discrete time one. We are seeking a rotation matrix ${\lambda}_{t-h}$ and a portfolio weight vector ${w}_{t-h}$ of appropriate dimensions, such that the w-weighted average of the rotated asset space

$\underbrace{{w}_{t-h}\cdot {\lambda}_{t-h}\cdot {R}_{t}}_{\text{random at time }t-h}=\underbrace{{w}_{t-h}\cdot {\lambda}_{t-h}\cdot {\mu}_{t-h}}_{\text{deterministic at time }t-h}+\underbrace{{w}_{t-h}\cdot {\lambda}_{t-h}\cdot {\sigma}_{t-h}\cdot {U}_{t}}_{\cong 0}$ (7)

is what we want to work with.

An Example of the Rotation

Rewrite Equation (2) as

$\begin{array}{rl}{\hat{R}}_{t}&={\left[{\sigma}_{t-h}^{\top}{\sigma}_{t-h}\right]}^{-1}{\sigma}_{t-h}^{\top}{R}_{t}\\ &={\left[{\sigma}_{t-h}^{\top}{\sigma}_{t-h}\right]}^{-1}{\sigma}_{t-h}^{\top}{\mu}_{t-h}+{U}_{t}\\ &={\theta}_{t-h}+{U}_{t}\end{array}$ (8)

assuming that ${\sigma}_{t-h}^{\top}{\sigma}_{t-h}$ is of full rank. Define the rotated assets as ${\hat{R}}_{t}={\left[{\sigma}_{t-h}^{\top}{\sigma}_{t-h}\right]}^{-1}{\sigma}_{t-h}^{\top}{R}_{t}$, with the Moore-Penrose inverse ${\lambda}_{t-h}:={\left[{\sigma}_{t-h}^{\top}{\sigma}_{t-h}\right]}^{-1}{\sigma}_{t-h}^{\top}$ defining the rotation^{7}. Note that, conditional on the information filtration ${F}_{t-h}$, i.e., all the public or private information available at time $t-h$, the components of ${\hat{R}}_{t}$ are mutually orthogonal. Our goal is to find optimal portfolio weights ${w}_{t-h}=\left({w}_{t-h}^{1},{w}_{t-h}^{2},\cdots ,{w}_{t-h}^{N}\right)$ on the rotated asset space ${\hat{R}}_{t}$, for all $\left(t,h\right)\in {\mathbb{R}}_{2}^{+}$ with $t\ge h$. The optimally realized return at time t is therefore

${\stackrel{^}{R}}_{t}^{w}={w}_{t-h}\cdot {\stackrel{^}{R}}_{t}$ (9)

$={w}_{t-h}\cdot {\theta}_{t-h}+{w}_{t-h}\cdot {U}_{t}$ (10)

and

${\mathbb{E}}_{t-h}\left[{\stackrel{^}{R}}_{t}^{w}\right]={w}_{t-h}\cdot {\theta}_{t-h}\mathrm{.}$ (11)

With the rotated assets ${\stackrel{^}{R}}_{t}$, we can do the following optimization, i.e., to minimize the impact of the error term ${w}_{t-h}\cdot {U}_{t}$ on the investment strategies that we try to develop. In the sequel, we will always assume $M=N$, without loss of generality, unless we want to discuss the cases with incomplete market.
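The rotation in Equation (8) and the diversification idea behind Equations (9) to (11) can be illustrated numerically. The sketch below is a toy under stated assumptions: the loading matrix, the Gaussian shocks and the equal-weight choice of w are hypothetical, and `rotation_matrix` is an illustrative helper name rather than anything defined in the paper.

```python
import numpy as np

def rotation_matrix(sigma):
    """Left pseudo-inverse (sigma^T sigma)^{-1} sigma^T; assumes full column rank."""
    return np.linalg.solve(sigma.T @ sigma, sigma.T)

rng = np.random.default_rng(1)
M, N = 6, 4                               # hypothetical numbers of assets and random sources
sigma = rng.normal(size=(M, N))           # hypothetical M x N loading matrix
mu = rng.normal(scale=0.01, size=M)       # hypothetical conditional means

lam = rotation_matrix(sigma)              # N x M rotation, lam @ sigma = I_N
theta = lam @ mu                          # drift of the rotated assets

u = rng.standard_normal((100_000, N))     # assumption: Gaussian U with identity covariance
r = mu + u @ sigma.T                      # original returns, Equation (2)
r_hat = r @ lam.T                         # rotated returns, equal to theta + U

w = np.full(N, 1.0 / N)                   # equal weights: one illustrative choice of w
portfolio = r_hat @ w                     # w . theta + w . U, Equation (10)
noise_variance = portfolio.var()          # close to 1/N for equal weights
```

With equal weights the variance of the error term $w\cdot U$ is $1/N$, so it shrinks as the number of orthogonal random sources grows; this is only one simple way to reduce the impact of the error term.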

The Pricing Kernel

After the rotation stated in the previous section, we can define the stochastic discount factor in the rotated asset space, projected onto the space spanned by the assets $\hat{R}$, as ${M}_{t}={R}_{t}^{f}{\sum}_{n=1}^{N}{U}_{t}^{n}+{\xi}_{t}$, where ${R}_{t}^{f}$ is the (locally) risk-free rate and ${\xi}_{t}\perp {\hat{R}}_{t}$, meaning that ${\xi}_{t}$ lies in the orthogonal space of ${\hat{R}}_{t}$ in ${L}^{2}\left({F}_{t}\right)$. A linear transformation can bring the pricing kernel back to the original asset space.

2.3.2. Parameter Estimation via Regression Techniques

The General No Arbitrage Asset Pricing Formula. The classic asset pricing relation reads, under no arbitrage condition

${P}_{t}=\mathbb{E}\left[{m}_{t+h}{P}_{t+h}\mathrm{|}{F}_{t}\right]$ (12)

where h is the smallest time increment, ${P}_{t}$ denotes the asset price at time t, ${m}_{t}$ is the stochastic discount factor evaluated at time t and $F$ is the information filtration. Simple algebra transforms the above equation to

$\mathbb{E}\left[{m}_{t+h}\left({R}_{t+h}-{R}_{t+h}^{f}\right)\mathrm{|}{F}_{t}\right]=0$ (13)

where ${R}_{t+h}^{f}\in {F}_{t}$ is the return of the locally risk-free asset. Of course, stochastic discount factor m depends on the information filtration $F$, which is assumed to be generated by an r-dimensional process X. Further assume that ${m}_{t}=g\left(t\mathrm{,}{X}_{0}\mathrm{,}{X}_{h}\mathrm{,}\cdots \mathrm{,}{X}_{t}\right)$.

Suppose that the asset span is denoted by $\mathbb{S}$ and the corresponding return process is ${R}_{t}\in {L}^{2}\left({F}_{t}\right)$, i.e., $R\in {\mathbb{R}}^{M}$ is square-integrable. Equation (13) is equivalent to

${\text{cov}}_{t}\left({m}_{t+h}\mathrm{,}{R}_{t+h}-{R}_{t+h}^{f}\right)+{B}_{t\mathrm{,}t+h}{\mathbb{E}}_{t}\left[{R}_{t+h}-{R}_{t+h}^{f}\right]=0.$ (14)

Here ${B}_{t,t+h}$ denotes the t-price of a zero-coupon bond with maturity time $t+h$. Because we have $\sigma \left({R}_{t}\right)\subseteq {\mathcal{F}}_{t}$ and $\text{Span}\left(R\right)\subseteq \text{Span}\left(X\right)$ ^{8}, we can, without loss of generality, assume that ${m}_{t+h}=1+{\omega}_{t}\cdot {R}_{t+h}$. Then, we have

${\omega}_{t}{\Sigma}_{t}+{B}_{t\mathrm{,}t+h}\cdot {\mathbb{E}}_{t}\left[{R}_{t+h}-{R}_{t+h}^{f}\right]=0$ (15)

where $\Sigma $ is the conditional variance-covariance matrix and therefore

${\omega}_{t}=-{B}_{t\mathrm{,}t+h}\cdot {\mathbb{E}}_{t}\left[{R}_{t+h}-{R}_{t+h}^{f}\right]\cdot {\Sigma}_{t}^{-1}\mathrm{.}$ (16)

Once we obtain the weight process $\omega $, we can price any asset in the span $\mathbb{S}$. Therefore, the problem is to compute ${\mathbb{E}}_{t}\left[{R}_{t+h}\right]$ and ${\mathbb{E}}_{t}\left[{R}_{t+h}^{i}\cdot {R}_{t+h}^{j}\right]$ for any $\left(i,j\right)$ pair. An alternative method to compute the weight process can be found in [6] . Suppose that the asset returns depend on the aforementioned set of state variables, denoted by X. Here we separate X into $\left(Y,Z\right)$, where Y is not asset specific and serves as a general factor process, while Z is asset specific, i.e., the sub-vector ${Z}_{t}^{i}$ corresponds to asset ${R}_{t}^{i}$ and $Z=\left({Z}^{1},\cdots ,{Z}^{n}\right)$.
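Equation (16) can be evaluated directly once estimates of the conditional moments are available. The following sketch uses made-up inputs (the excess mean vector, the covariance matrix and the bond price are all hypothetical) and a helper named `sdf_weights` introduced here for illustration; it then checks that the result satisfies Equation (15).

```python
import numpy as np

def sdf_weights(excess_mean, cov, bond_price):
    """omega = -B * E[R - R^f] * Sigma^{-1}, following Equation (16)."""
    # Sigma is symmetric, so solving Sigma x = e yields the row vector e Sigma^{-1}.
    return -bond_price * np.linalg.solve(cov, excess_mean)

# Hypothetical conditional moments for three assets.
excess_mean = np.array([0.004, 0.006, 0.002])   # E_t[R_{t+h} - R^f_{t+h}]
cov = np.array([[0.040, 0.010, 0.000],
                [0.010, 0.090, 0.020],
                [0.000, 0.020, 0.060]])         # Sigma_t
bond_price = 0.999                              # B_{t,t+h}

omega = sdf_weights(excess_mean, cov, bond_price)
# Left-hand side of Equation (15); it should vanish up to rounding error.
residual = omega @ cov + bond_price * excess_mean
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a numerical design choice; it gives the same weights when the covariance matrix is well conditioned.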

Time-Series Regression. For most of the papers in the literature, time-series regression is utilized to find the relation

${R}_{t+h}=f\left(t,{X}_{t},{X}_{t-h},\cdots ,{X}_{0}\right)+{\epsilon}_{t,t+h}.$ (17)

This, of course, can serve as an option in our analysis. The functional form f can be represented by a basis function expansion or a deep artificial neural network. To obtain the values of ${\mathbb{E}}_{t}\left[{R}_{t+h}^{i}\cdot {R}_{t+h}^{j}\right]$, we can run a time-series regression of ${R}_{t+h}^{i}\cdot {R}_{t+h}^{j}$ on $\left({X}_{t},{X}_{t-h},\cdots \right)$. This means that we can utilize regression techniques to compute the volatility estimates.
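A small sketch of the time-series route may help fix ideas. It regresses the product $R^{i}_{t+h}R^{j}_{t+h}$ on a basis expansion of the state variables by ordinary least squares. The simulated data, the quadratic basis and the helper `fit_linear` are assumptions for illustration; the rows are taken to be already aligned so that each row of `x` is observed one step before the corresponding returns.

```python
import numpy as np

def fit_linear(features, target):
    """Least-squares fit of target on [1, features]; the intercept comes first."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

rng = np.random.default_rng(2)
T = 5_000
x = rng.normal(size=(T, 2))                            # hypothetical state variables X_t
r_i = 0.5 * x[:, 0] + rng.normal(scale=0.5, size=T)    # hypothetical return of asset i
r_j = -0.3 * x[:, 1] + rng.normal(scale=0.5, size=T)   # hypothetical return of asset j

# Basis expansion: the two state variables plus their cross term.
basis = np.column_stack([x, x[:, 0] * x[:, 1]])
coef = fit_linear(basis, r_i * r_j)

# In this simulation E[R^i R^j | X] = -0.15 * X1 * X2, so the last
# coefficient should be near -0.15 and the others near zero.
cross_term_coef = coef[3]
```

A neural network could replace the hand-written basis; the least-squares fit is used here only because it keeps the example short.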

Panel Regression. In this paper, we emphasize the methodology using the panel regression technique. Following the notation above, we have common risk factors Y, asset specific risk factors $Z=\left({Z}^{1},\cdots ,{Z}^{n}\right)$ and $X=\left(Y,Z\right)$. We can run a panel regression of the following form

$\left({R}_{t+h},{X}_{t+h}\right)=p\left(t,{R}_{t},{X}_{t},\cdots ,{R}_{t-kh},{X}_{t-kh}\right)+{\epsilon}_{t,t+h}.$ (18)

^{8}$\sigma \left(\xi \right)$ is the information sigma-algebra generated by $\xi $ and $\text{Span}\left(R\right)$ is the linear space spanned by R.

Then, we have

${\mathbb{E}}_{t}\left[\left({R}_{t+h},{X}_{t+h}\right)\right]=p\left(t,{R}_{t},{X}_{t},\cdots ,{R}_{t-kh},{X}_{t-kh}\right).$ (19)

The benefit of doing so is three-fold. First, it can make use of the entire cross-section of asset return data. Second, it can give estimates of future risk factor returns X. Third, it generates more observations and can reduce the reliance on past historical data series, i.e., k can be a small integer. To obtain the values of ${\mathbb{E}}_{t}\left[{R}_{t+h}^{i}\cdot {R}_{t+h}^{j}\right]$, we can run a panel regression of ${R}_{t+h}^{i}\cdot {R}_{t+h}^{j}$ on ${\hat{X}}_{t}$, where ${\hat{X}}_{t}$ is the new risk factor process adjusted for the dimensions of the variance-covariance matrix. In order to achieve better precision in function approximation via machine learning, we can interpolate the cross-sectional asset returns and formulate a cross-sectional curve to get as many samples as possible at any granularity, and then perform the regression function approximation via machine learning. For a meaningful interpolation, we can first sort the cross-sectional asset returns and then interpolate the resulting curve via any interpolation technique, for example, linear or polynomial interpolation or even an artificial neural network. The detailed interpolation should be done as follows. First, sort ${R}_{t+h}$ and denote the sorted return series by $\left({R}_{t+h}^{\left[1\right]},\cdots ,{R}_{t+h}^{\left[M\right]}\right)$. Do the same for X and denote the sorted series by $\left({X}_{t+h}^{\left[1\right]},\cdots ,{X}_{t+h}^{\left[M\right]}\right)$. Suppose the regressor variable is $\left(1,\cdots ,M\right)$. We interpolate p points between $\left({R}_{t+h}^{\left[i\right]},{R}_{t+h}^{\left[i+1\right]}\right)$, where p here denotes a number of points rather than the regression function in Equation (18). Assume that the pair $\left({X}_{t+h}^{\left[{k}_{i}\right]},{X}_{t+h}^{\left[{k}_{i+1}\right]}\right)$ corresponds to $\left({R}_{t+h}^{\left[i\right]},{R}_{t+h}^{\left[i+1\right]}\right)$. We choose p equally spaced points from the interpolated sequence for $\left({X}_{t+h}^{\left[1\right]},\cdots ,{X}_{t+h}^{\left[M\right]}\right)$ and the goal is achieved.
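One possible reading of the sort-and-interpolate step is sketched below. It is a simplification under stated assumptions: assets are ranked by return, each asset's factor values travel with it, and both are linearly interpolated on a finer grid of the rank variable $(1,\cdots ,M)$. The function name `densify_sorted_cross_section` and the toy inputs are hypothetical, and linear interpolation is only one of the techniques the text allows.

```python
import numpy as np

def densify_sorted_cross_section(returns, factors, points_between):
    """Sort assets by return, then interpolate returns and factors on a finer rank grid."""
    order = np.argsort(returns)
    r_sorted = returns[order]
    x_sorted = factors[order]                  # factor rows follow their asset
    m = len(returns)
    ranks = np.arange(1, m + 1)                # regressor variable (1, ..., M)
    n_fine = (m - 1) * (points_between + 1) + 1
    fine = np.linspace(1, m, n_fine)           # equally spaced points between ranks
    r_fine = np.interp(fine, ranks, r_sorted)
    x_fine = np.column_stack(
        [np.interp(fine, ranks, x_sorted[:, k]) for k in range(x_sorted.shape[1])]
    )
    return fine, r_fine, x_fine

# Toy cross-section with three assets and one asset-specific factor.
returns = np.array([0.03, -0.01, 0.02])
factors = np.array([[1.0], [2.0], [3.0]])
fine, r_fine, x_fine = densify_sorted_cross_section(returns, factors, points_between=1)
```

In this toy case one extra point is inserted between each pair of neighbouring ranks, turning three observations into five.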

General Discussions. Time-series regression seeks the same functional relation between the dependent and independent variables across time, while under panel regression the functional relation can be time-varying. However, in time-series regression the dependency on risk factors can differ across the assets in the universe, while in panel regression the functional dependency is the same across assets. Moreover, in time-series regression, we need more samples than the number of factors. In panel regression, by contrast, we can incorporate asset specific factors, for example, earnings per share or the book-to-market ratio, into the regression framework. If the asset span contains 7000 assets and we use 10 different asset specific risk factors and look back 2 periods in time, the panel regression will have 7000 observations and 20 independent variables. As discussed previously, we can also interpolate the 7000 asset returns to get more samples, potentially infinitely many, to run the regression. The reasons for preferring panel regression over time-series regression are as follows. First, the functional dependency of asset returns on state variables, i.e., factors, might be time varying. Therefore, using long historical data series might be inappropriate. Second, according to derivatives pricing theory, the functional form of asset returns on the state variables is the same across different assets, which means a panel regression is suitable.
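The dimensions quoted above (7000 observations and 20 independent variables) can be reproduced with a short sketch. Everything here is an assumption for illustration: the factor data are simulated, the map p in Equation (18) is taken to be linear, and `build_panel_design` is a hypothetical helper.

```python
import numpy as np

def build_panel_design(z_lags):
    """Stack k lagged (M x q) asset-specific factor matrices into an M x (k*q) design."""
    return np.hstack(z_lags)

rng = np.random.default_rng(3)
M, q, k = 7000, 10, 2                                   # assets, factors per asset, look-back periods
z_lags = [rng.normal(size=(M, q)) for _ in range(k)]    # hypothetical Z_t and Z_{t-h}
design = build_panel_design(z_lags)                     # one row per asset: 7000 x 20

# Simulated next-period returns sharing one coefficient vector across assets.
true_beta = rng.normal(size=k * q)
r_next = design @ true_beta + rng.normal(scale=0.1, size=M)

# A single cross-section already provides far more rows than columns.
beta_hat, *_ = np.linalg.lstsq(design, r_next, rcond=None)
```

Because every asset contributes one row, the coefficients can be re-estimated at each date from that date's cross-section alone, which is what allows the fitted relation to vary over time.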

2.4. The General Optimization Problem

Consider the following dynamic portfolio optimization problem in a stochastically varying financial environment

$\underset{\left(w\mathrm{,}\tau \right)\in \mathcal{W}\times \mathcal{S}\left[\mathrm{0,}T\right]}{\mathrm{sup}}\phi \left[{D}_{\tau}{G}_{\tau}^{w}\right]$ (20)

where $\tau $ is a stopping time and $\mathcal{S}\left[0,T\right]$ is a subset of the stopping times taking values in $\left[0,T\right]$. ${D}_{t}$ is the discount factor under the physical measure and ${G}_{t}^{w}$ is the risk-adjusted cumulative payoff measure of the investment portfolio process ${w}_{t}$, which can be path-dependent, where ${w}_{t}\in \mathcal{W}$, a proper space of portfolio weights. $\phi (\cdot )$ is an appropriate measure of performance, which can be a conditional expectation, a conditional quantile or another metric. Note that Equation (20) relates the investment portfolio choice practice to a dynamic problem with optimal stopping. Therefore, the optimal exercise boundary can be computed. Specifically, G can be the Sharpe ratio, the Sortino ratio, the maximum drawdown, or any other risk-adjusted performance measure. The function G can depend on realized values of state variables, or on conditional expected values that require artificial neural networks (ANNs) to predict. The ultimate output of the above optimization problem is a set of portfolio weights at each time t. Note that the weight process $\omega $ can also be modeled via ANNs, according to [6] .
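The performance measures named as candidates for G can be computed from a realized return path as in the sketch below. These are common textbook definitions written with per-period returns and without annualization; the paper does not fix a specific convention, so the exact formulas (sample standard deviation, zero target for the Sortino ratio, drawdown measured from an initial wealth of one) are assumptions.

```python
import numpy as np

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation."""
    excess = np.asarray(returns, dtype=float) - risk_free
    return float(excess.mean() / excess.std(ddof=1))

def sortino_ratio(returns, target=0.0):
    """Mean excess return divided by the downside deviation below the target."""
    excess = np.asarray(returns, dtype=float) - target
    downside = np.sqrt(np.mean(np.minimum(excess, 0.0) ** 2))
    return float(excess.mean() / downside)

def max_drawdown(returns):
    """Largest peak-to-trough loss of cumulative wealth, starting from wealth 1."""
    growth = np.cumprod(1.0 + np.asarray(returns, dtype=float))
    wealth = np.concatenate([[1.0], growth])
    peak = np.maximum.accumulate(wealth)
    return float(np.max(1.0 - wealth / peak))

path = [0.10, -0.50, 0.20]        # hypothetical per-period portfolio returns
drawdown = max_drawdown(path)     # wealth falls from 1.10 to 0.55, a 50% drawdown
```

All three functions depend on the whole return path, which is the path dependence of G mentioned in the text.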

2.5. A New Investment Framework Incorporating Reinforcement Learning

2.5.1. The Big Picture

In [4] , [18] - [23] , reinforcement learning (RL) is introduced to solve dynamic hedging, portfolio optimization and asset pricing problems. The ability to handle dynamic programming makes RL suitable for those three types of problems, which are essentially the optimization of a target function (or functional) under dynamic constraints on control variables, where the state of nature might evolve in a stochastic manner. These technical advantages, together with the ability to process huge amounts of data and perform high dimensional computations, enable researchers to consider more complicated problems under realistic assumptions, for example, market frictions, information asymmetry, stochastic differential games (which arise in the pricing practice of some types of equity swaps), market making and therefore mean-field games. In contrast to the traditional approach of relating those problems to backward stochastic differential equations (BSDEs) and solving them using machine learning methods, which is documented in [24] and [25] as an incomplete list, the RL approach seeks to solve the aforementioned problems in a more direct and brute-force manner, under the theoretical economic framework, therefore resulting in a model-free solution.

Back to our framework of investment, we propose to work under the rotated asset space $\hat{R}$ and use Equation (20) as our main optimization problem. We call for a general methodology to combine the steps of model estimation and dynamic optimization. The reason is that, given that the distributions of asset returns are time varying, estimating the model parameters $\left({\mu}_{t},{\sigma}_{t}\right)$ at time t directly from the data prior to t introduces bias. RL, however, makes it possible to optimize the portfolio weights in a model-free manner. Equation (20) is more general than mean-variance optimization because G can be a functional of past or future trajectories of $\left(\mu ,\sigma ,U,R\right)$, which results in a dynamic stochastic optimal control problem. Dynamic portfolio optimization takes market regime changes into account. As [26] illustrates, there are two additional hedging terms in the portfolio decomposition formula that account for the stochastically varying interest rate and market price of risk processes.
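To make the model-free idea concrete, here is a deliberately small tabular Q-learning sketch. It is not the method of the paper or of the cited references: the two observable regimes, the two actions (cash or a single risky asset), the regime means, the transition probability and all learning parameters are hypothetical choices that keep the example short.

```python
import numpy as np

def q_learning_allocation(n_steps=200_000, alpha=0.01, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning for a toy two-regime, two-action allocation problem."""
    rng = np.random.default_rng(seed)
    regime_mean = np.array([0.02, -0.02])   # hypothetical mean risky return in each regime
    q = np.zeros((2, 2))                    # q[state, action]; action 0 = cash, 1 = risky asset
    state = 0
    for _ in range(n_steps):
        # Epsilon-greedy exploration.
        if rng.random() < epsilon:
            action = int(rng.integers(2))
        else:
            action = int(np.argmax(q[state]))
        risky_return = regime_mean[state] + 0.01 * rng.standard_normal()
        reward = action * risky_return      # cash earns zero in this toy setup
        # The regime persists with probability 0.9, independently of the action.
        next_state = state if rng.random() < 0.9 else 1 - state
        target = reward + gamma * q[next_state].max()
        q[state, action] += alpha * (target - q[state, action])
        state = next_state
    return q

q_table = q_learning_allocation()
policy = np.argmax(q_table, axis=1)         # learned action in each regime
```

The agent never estimates the regime means explicitly; it only observes rewards, yet the learned policy should hold the risky asset in the favourable regime and cash in the other. Realistic applications would replace the table with a function approximator and the reward with a risk-adjusted measure G.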

To summarize, the authors of this paper propose a new way of practice in both pricing and trading, to eliminate model dependence^{9}, while maintaining the general economic or finance framework. This calls for the development of more advanced Artificial Neural Network (ANN) and RL techniques^{10}.

2.5.2. Factor Construction

The first step in creating the general investment framework is data processing and factor construction. Basically, factors represent the risk decomposition of any asset return in the universe, and by bearing these risks investors get rewarded in the financial market. Factors, which can be both qualitative and quantitative, should fall into the following categories. First, the political environment and policies made by governments and other authorities should be included. Second, macroeconomic factors, such as the economic cycle, GDP, inflation, monetary policies and fiscal policies, should be included. Third, micro-economic and financial factors, such as market returns and yields to maturity for bonds, are helpful. Fourth, fundamental factors, such as earnings per share and book-to-market ratios, are crucial to equities. Fifth, technical indicator factors should also be included, such as the output from other predictive or trading models (namely, the resulting portfolio weights or predictions), stock market technical indicators and indicators from behavioral finance theories, i.e., market sentiment and real-time news from natural language processing. Please note that there might be factors that do not fall into the above categories, such as weather conditions; as long as they are helpful in identifying the risk characteristics of an asset, we should include them. To summarize, this module processes raw market data and formulates different factors for a meaningful risk decomposition of asset returns. Moreover, different factors might have different observation frequencies, so interpolation in the time dimension is sometimes needed for granularity considerations.

2.5.3. Variables Prediction

^{9}Eliminating model dependence introduces data dependence.

^{10}One reference on this topic is [34] .

After the factors are constructed, we can use the methodologies outlined in Section 2.3.2 to compute the conditional expected values of asset returns and factor returns. The predictions can be made at any time frequency, e.g., a second, an hour, a day, a week, or even a quarter. The frequency of the factor input should match the frequency of prediction. Instead of using the calendar as a measure of time flow, we can use trading volume or volatility as the metric. For example, we can divide the data into small blocks of equal trading volume or equal accumulated volatility. If a reinforcement learning technique is used, this step is merged into the next one: Portfolio Construction.
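As an illustration of measuring time by trading volume rather than by the calendar, the following minimal sketch groups consecutive observations into blocks of roughly equal traded volume and compounds returns inside each block. The function names and the `block_volume` threshold are hypothetical and are not part of the original framework.

```python
from typing import List, Sequence, Tuple


def volume_blocks(volumes: Sequence[float], block_volume: float) -> List[Tuple[int, int]]:
    """Split observations into consecutive blocks of roughly equal traded volume.

    Returns (start, end) index pairs with `end` exclusive. A block closes as
    soon as its cumulative volume reaches `block_volume` (hypothetical threshold).
    """
    blocks = []
    start, running = 0, 0.0
    for i, v in enumerate(volumes):
        running += v
        if running >= block_volume:
            blocks.append((start, i + 1))
            start, running = i + 1, 0.0
    if start < len(volumes):  # keep the trailing partial block
        blocks.append((start, len(volumes)))
    return blocks


def block_returns(returns: Sequence[float], blocks: List[Tuple[int, int]]) -> List[float]:
    """Compound simple returns inside each block."""
    out = []
    for s, e in blocks:
        growth = 1.0
        for r in returns[s:e]:
            growth *= 1.0 + r
        out.append(growth - 1.0)
    return out
```

The same pattern would apply to accumulated volatility by passing, for instance, squared returns in place of volumes.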

2.5.4. Portfolio Construction

After we obtain the predicted asset and factor returns, we can compute the portfolio weights using them as inputs. The first candidate is the mean-variance frontier or the Black-Litterman model. As a second choice, we can use machine learning grouping techniques, e.g., k-means clustering, to partition the asset universe into small groups based on the predicted and realized values and make long-short decisions accordingly. Moreover, Bayesian constrained deep reinforcement learning can be used to formulate the dynamic optimal portfolios. We emphasize that outputs from other investment models can be used as inputs to our framework, either by formulating them as factors or via ensemble methods, which are quite popular in the machine learning literature and practice.

2.5.5. Risk Management and Attribution

There are many risk management techniques available in the literature. We mention two of them. First, dynamic portfolio insurance techniques can be used to reduce the maximum drawdown of the constructed portfolio. Second, we can set stop-gain or stop-loss thresholds to formulate a more prudent strategy. The last step is to perform a sensitivity and risk-attribution analysis to better understand the strategy performance.

3. Investment Strategies under the General Framework

3.1. A Brute Force Mean-Variance or Dynamic Portfolio Optimization

^{11}According to [26] , as long as ${\mathrm{lim}}_{M\to \infty}\frac{1}{{M}^{2}}{\mathrm{Var}}_{t-h}\left({\sum}_{m=1}^{M}{R}_{t}^{m}\right)=0$, we will have $\frac{1}{M}{\sum}_{m=1}^{M}\left({R}_{t}^{m}-{\mathbb{E}}_{t-h}\left[{R}_{t}^{m}\right]\right)\to 0$ in probability as $M\to \infty $. Therefore, in this situation, it is fine to assign equal weights to the original asset space.

^{12}Alternatively, the joint distribution of U can be recovered directly from historical data.

^{13}Otherwise empirical quantiles have to be used.

We refer interested readers to [27] , [28] and [29] for an introduction to mean-variance optimization, a Bayesian approach that builds on [27] and an illustration of a typical dynamic portfolio optimization problem. Machine learning based methods for dynamic portfolio optimization are introduced in [19] and [23] . Applications of reinforcement learning to portfolio management can be found in [18] , [21] , [22] , [30] and [31] . Please note that we are now working with the rotated asset returns $\stackrel{^}{R}$ ^{11}. All of the above analysis can be applied to the rotated asset space under the proposed framework.

3.2. Quantile Investing

With Equation (9), we can build a trading model which is profitable at some confidence level. To do this, we need to assume a joint distribution for U^{12} in order to compute its quantile values^{13}. Taking the joint Gaussian distribution as an example, denote the top α-quantile of ${w}_{t-h}\cdot {U}_{t}$ as ${q}_{\alpha}^{+}\left(t\right)$ and the bottom α-quantile of ${w}_{t-h}\cdot {U}_{t}$ as ${q}_{\alpha}^{-}\left(t\right)$. Then, if ${w}_{t-h}\cdot {\theta}_{t-h}>{q}_{\alpha}^{+}\left(t\right)$, we long ${w}_{t-h}\cdot {\stackrel{^}{R}}_{t}$ and if ${w}_{t-h}\cdot {\theta}_{t-h}<{q}_{\alpha}^{-}\left(t\right)$, we short ${w}_{t-h}\cdot {\stackrel{^}{R}}_{t}$. This method applies at the portfolio level or the individual asset level.
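The quantile rule can be sketched in a few lines. The sketch below assumes, purely for illustration, that $U_t$ is standard Gaussian with independent components, so that $w_{t-h}\cdot U_t$ is Gaussian with mean zero and standard deviation $\|w_{t-h}\|$; the function name and the default `alpha` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def quantile_signal(w, theta, alpha=0.05):
    """Return +1 (long), -1 (short) or 0 (no trade) for the portfolio w.

    Assumes U_t ~ N(0, I), so w . U_t ~ N(0, ||w||^2). This is an
    illustrative assumption, not a calibrated distribution.
    """
    scale = sqrt(sum(x * x for x in w))
    dist = NormalDist(mu=0.0, sigma=scale)
    q_plus = dist.inv_cdf(1.0 - alpha)   # top alpha-quantile of w . U_t
    q_minus = dist.inv_cdf(alpha)        # bottom alpha-quantile of w . U_t
    drift = sum(wi * ti for wi, ti in zip(w, theta))  # w . theta_{t-h}
    if drift > q_plus:
        return 1
    if drift < q_minus:
        return -1
    return 0
```

With a different distributional assumption, only the two quantile lines would change, for example by substituting empirical quantiles of historical residuals.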

3.3. Long-Short Portfolio

Long-short portfolio method means that we can score each asset, e.g., based on the past values and predictions of its returns or other variables, in the universe, and rank the cross section. Long the top (bottom) and short the bottom (top) quantiles of the asset span and formulate a trend following (contrarian) strategy. The classification of different groups of assets can be done via machine learning classification methods based on the realized and predicted values, i.e., the computed conditional expectations. Please note that, the financial market can be a perfect blend of momentum effect and mean-reversion effect. For example, the stocks that perform well might continue to perform well. However, the stocks with worst performance in the past might tend to perform better in the next period. This leads us to ask, whether, in the long run, mean-reversion effect is significant and, in short run, momentum effect is dominating or the opposite? Careful empirical investigations should be carried out to answer this question.
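A minimal sketch of the ranking step is given below. It assumes that a score per asset is already available (for example a predicted return) and assigns equal weights within the top and bottom quantiles; the function name, the `quantile` default and the `contrarian` flag are hypothetical choices made for illustration.

```python
def long_short_weights(scores, quantile=0.2, contrarian=False):
    """Equal-weight long-short portfolio from a cross-section of scores.

    Trend following: long the top quantile and short the bottom quantile.
    Contrarian: the signs are flipped. `quantile` should not exceed 0.5.
    """
    n = len(scores)
    k = max(1, int(n * quantile))
    order = sorted(range(n), key=lambda i: scores[i])  # ascending by score
    bottom, top = order[:k], order[-k:]
    sign = -1.0 if contrarian else 1.0
    weights = [0.0] * n
    for i in top:
        weights[i] = sign / k
    for i in bottom:
        weights[i] = -sign / k
    return weights
```

The weights sum to zero, so the resulting portfolio is dollar neutral by construction.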

3.4. Eliminating the Randomness Asymptotically

3.4.1. The Case with Short-Selling

As a third attempt, we try to utilize the strong law of large numbers (S-LLN hereafter) to eliminate the randomness represented by ${U}_{t}$ at time $t-h$, if M is large enough. Let ${w}_{t-h}^{j}\equiv \frac{1}{M}$ and we have

$\frac{{1}_{M}\cdot {\stackrel{^}{R}}_{t}}{M}\equiv \frac{{1}_{M}\cdot {\theta}_{t-h}}{M}+\frac{{1}_{M}\cdot {U}_{t}}{M}\mathrm{.}$ (21)

Here ${1}_{M}=\left(\mathrm{1,1,}\cdots \mathrm{,1}\right)$ is an M-vector. But note that, as U is jointly independent (or uncorrelated), when M (and therefore N) is large, we have

$\frac{{1}_{M}\cdot {U}_{t}}{M}\to 0$ in probability or almost surely, according to the (strong) LLN. Therefore, we have, approximately

$\frac{{1}_{M}\cdot {\stackrel{^}{R}}_{t}}{M}\cong \frac{{1}_{M}\cdot {\theta}_{t-h}}{M}\mathrm{.}$ (22)

Equation (22) is extremely powerful. It equates the realization of a future random variable at time t to a variable which is known currently at $t-h$. If $\frac{{1}_{M}\cdot {\theta}_{t-h}}{M}>{\lambda}_{+}$, we can long the asset $\frac{{1}_{M}\cdot {\stackrel{^}{R}}_{t}}{M}$ and when $\frac{{1}_{M}\cdot {\theta}_{t-h}}{M}<{\lambda}_{-}$, we short $\frac{{1}_{M}\cdot {\stackrel{^}{R}}_{t}}{M}$. Here ${\lambda}_{+}\ge 0$ and ${\lambda}_{-}\le 0$ are two threshold values that trigger the algorithm, which can be time dependent or even stochastic.
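The threshold rule and the averaging effect behind Equation (22) can be sketched as follows. The sketch assumes independent standard Gaussian residuals for the simulation part; the function names, thresholds and seed are hypothetical.

```python
import random
from statistics import fmean


def lln_signal(theta, lam_plus=0.0, lam_minus=0.0):
    """Trade the equal-weight rotated portfolio based on the average of theta_{t-h}."""
    avg = fmean(theta)
    if avg > lam_plus:
        return 1    # long the equal-weight portfolio
    if avg < lam_minus:
        return -1   # short the equal-weight portfolio
    return 0        # stay out of the market


def simulate_residual_average(M, seed=0):
    """Average of M independent N(0, 1) residuals (illustrative assumption)."""
    rng = random.Random(seed)
    return fmean(rng.gauss(0.0, 1.0) for _ in range(M))
```

Under these assumptions the residual average has standard deviation $1/\sqrt{M}$, so calling `simulate_residual_average` with increasing `M` gives a rough sense of how large the cross-section must be before the approximation in Equation (22) becomes tight.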

3.4.2. The Case without Short-Selling

^{14}T is a g-vector.

For countries or regions where short-selling is not permitted or is costly, we can use futures contracts to continue our analysis. Consider the asset space ${R}_{t}$ and g futures contracts ${F}_{t\mathrm{,}T}\in {\mathbb{R}}^{g}$ which start at time t with maturity time T^{14}. Consider the following M-factor regression

${F}_{t,T}=a+b{R}_{t}+\gamma \cdot {U}_{t,T}$ (23)

where ${U}_{t,T}:=\left({U}_{t,T}^{1},\cdots ,{U}_{t,T}^{g}\right)$ is an i.i.d. sequence and $\gamma $ describes the covariance structure. Because we have

${R}_{t}:={\mu}_{t-h}+{\sigma}_{t-h}{U}_{t}$ (24)

then

${F}_{t,T}=a+b{\mu}_{t-h}+\left[b{\sigma}_{t-h},\gamma \right]\left[{U}_{t},{U}_{t,T}\right]$ (25)

$={\theta}_{t-h}+{\upsilon}_{t-h}{\stackrel{^}{U}}_{t\mathrm{,}T}\mathrm{.}$ (26)

Here

${\stackrel{^}{U}}_{t,T}=GS\left[{U}_{t},{U}_{t,T}\right]$

where GS denotes Gram-Schmidt orthogonalization under the canonical ${L}^{2}\left({F}_{t}\right)$ -norm. Then, perform similar transformation

${\left[{\upsilon}_{t-h}^{\top}{\upsilon}_{t-h}\right]}^{-1}{\upsilon}_{t-h}^{\top}\cdot {F}_{t\mathrm{,}T}={\left[{\upsilon}_{t-h}^{\top}{\upsilon}_{t-h}\right]}^{-1}{\upsilon}_{t-h}^{\top}\cdot {\theta}_{t-h}+{\stackrel{^}{U}}_{t\mathrm{,}T}\mathrm{.}$ (27)

The same analysis then follows from Equation (27). As the above algorithm shows, regression (23) has a large number of factors, which makes it difficult to implement in practice with limited computation power. Therefore, we do not test this algorithm in this paper.

3.4.3. The Impact of Estimation Error

In this section, we study the impact of estimation errors on the proposed optimal trading strategy in Section 3.4. First, assume that ${\mu}_{t-h}$ is estimated with a zero-mean error ${\stackrel{^}{\mu}}_{t-h}={\mu}_{t-h}+{\iota}_{t-h}$, where ${\iota}_{t-h}$ is the error term with ${\mathbb{E}}_{t-h}\left[{\iota}_{t-h}\right]=0$. Then, we have

${\stackrel{^}{R}}_{t}\mathrm{:}={P}_{t-h}^{\sigma}\left({\mu}_{t-h}+{\iota}_{t-h}\right)+\left({U}_{t}-{P}_{t-h}^{\sigma}{\iota}_{t-h}\right)$ (28)

where ${P}_{t-h}^{\sigma}\mathrm{:}={\left[{\sigma}_{t-h}^{\top}{\sigma}_{t-h}\right]}^{-1}{\sigma}_{t-h}^{\top}$. It can be seen that when $M=N$ is sufficiently large, we will have

$\frac{{1}_{M}}{M}\cdot {P}_{t-h}^{\sigma}\left({\mu}_{t-h}+{\iota}_{t-h}\right)$ (29)

$=\underbrace{\frac{{1}_{M}}{M}\cdot {\theta}_{t-h}}_{\text{true value}}+\underbrace{\frac{{1}_{M}}{M}\cdot {P}_{t-h}^{\sigma}{\iota}_{t-h}}_{\text{error term}}\mathrm{.}$ (30)

Because ${P}_{t-h}^{\sigma}{\iota}_{t-h}$ has zero mean, according to [32] , as $M\to \infty $, we will have convergence in probability $\frac{{1}_{M}}{M}\cdot {P}_{t-h}^{\sigma}{\iota}_{t-h}\to 0$, under mild technical conditions. Therefore, the error term will not impact the estimation of the true value of $\frac{{1}_{M}}{M}\cdot {\theta}_{t-h}$ if M is large, based on the assumption that ${\mathbb{E}}_{t-h}\left[{\iota}_{t-h}\right]=0$.

The next step is to estimate the impact of estimation error of volatility matrix on the algorithm given that the drift term $\mu $ is estimated correctly. Suppose we have

${\stackrel{^}{R}}_{t}=\left({P}_{t-h}^{\sigma}+{\varepsilon}_{t-h}\right){\mu}_{t-h}+\left(I+{\epsilon}_{t-h}\right){U}_{t}$ (31)

$=\underbrace{\left({P}_{t-h}^{\sigma}{\mu}_{t-h}+{U}_{t}\right)}_{\text{true estimator}}+\underbrace{\left({\varepsilon}_{t-h}{\mu}_{t-h}+{\epsilon}_{t-h}{U}_{t}\right)}_{\text{estimation error}}\mathrm{.}$ (32)

Here $\left(\varepsilon \mathrm{,}\epsilon \right)$ are estimation error terms which are uncorrelated with U, with ${\mathbb{E}}_{t-h}\left[{\varepsilon}_{t-h}\right]=0$ and ${\mathbb{E}}_{t-h}\left[{\epsilon}_{t-h}\right]=0$. A similar argument shows that the impact of the error terms will vanish if M is sufficiently large. The discussion of the joint impact of estimation errors from both the drift $\mu $ and the volatility term $\sigma $ is analogous, only with more complicated formulas.

3.5. Eliminating the Randomness Completely

Following Equations (6) and (7), we can write

${\lambda}_{t-h}\cdot {R}_{t}={\lambda}_{t-h}\cdot {\alpha}_{t-h}+{\lambda}_{t-h}\cdot {\beta}_{t-h}\cdot {F}_{t}+{\lambda}_{t-h}\cdot {\sigma}_{t-h}\cdot {\epsilon}_{t}$ (33)

$={\lambda}_{t-h}\cdot {\alpha}_{t-h}+{\lambda}_{t-h}\cdot {\Phi}_{t-h}\cdot {\Upsilon}_{t}.$ (34)

Let us remind the readers that ${R}_{t}\in {\mathbb{R}}^{M}$, ${\lambda}_{t-h}\in {\mathbb{R}}^{J\times M}$, ${\sigma}_{t-h}\in {\mathbb{R}}^{M\times M}$ and ${\epsilon}_{t}\in {\mathbb{R}}^{M\times M}$, where J is the number of rotated assets chosen to formulate the optimal portfolio. We denote $\Upsilon \mathrm{:}=\left[F\mathrm{,}\epsilon \right]\in {\mathbb{R}}^{M\times N}$ and $\Phi \mathrm{:}=\left[\beta \mathrm{,}\sigma \right]\in {\mathbb{R}}^{M\times N}$. In this section, we assume that the market is incomplete with $M<N$ and $J\le N-M$. The null space of matrix ${\Phi}_{t-h}$ is denoted by $Ker\left({\Phi}_{t-h}\right)$ and its dimension is $N-M$. Further denote ${\left\{{v}_{t-h}^{n}\right\}}_{n=1}^{J}$ as a set of linearly independent vectors in $Ker\left({\Phi}_{t-h}\right)$. Taking ${\left\{{v}_{t-h}^{n}\right\}}_{n=1}^{J}$ as portfolio weights, we have

${\stackrel{^}{R}}_{t}^{n}={v}_{t-h}^{n}\cdot {R}_{t}={v}_{t-h}^{n}\cdot {\alpha}_{t-h}.$ (35)

Therefore, we have J rotated assets ${\stackrel{^}{R}}_{t}\mathrm{:}={\left\{{\stackrel{^}{R}}_{t}^{n}\right\}}_{n=1}^{J}$, which are conditionally deterministic. We can perform mean-variance optimization or simply assign equal weights to ${\stackrel{^}{R}}_{t}$, based on the signs^{15} of ${\left\{{v}_{t-h}^{n}\cdot {\alpha}_{t-h}\right\}}_{n=1}^{J}$.

3.6. Constructing Factor Mimicking Portfolios

From the decomposition Formula (2), we can derive the equations to construct the factor mimicking portfolios. Suppose we have

${R}_{t}=\alpha +\beta {F}_{t}+{\varepsilon}_{t}$ (36)

where ${R}_{t}$ is $N\times 1$, $\alpha $ is $N\times 1$, $\beta $ is $N\times K$, ${F}_{t}$ is $K\times 1$ and ${\varepsilon}_{t}$ is of $N\times 1$ dimension. A simple multiplication of both sides of Equation (36) by the Moore-Penrose inverse of $\gamma =\left(\beta ,I\right)$ yields

$\left({F}_{t}\mathrm{,}{\varepsilon}_{t}\right)\cong {\left({\gamma}^{\top}\gamma \right)}^{-1}{\gamma}^{\top}\left({R}_{t}-\alpha \right)\mathrm{.}$ (37)

^{15}It is also interesting to find vectors ${\left\{{v}_{t-h}^{n}\right\}}_{n=1}^{J}\in Ker\left({\Phi}_{t-h}\right)$ such that ${\left\{\left|{v}_{t-h}^{n}\cdot {\alpha}_{t-h}\right|\right\}}_{n=1}^{J}$ achieves maximum.

We can construct indexes of F by constructing portfolios of R according to Equation (37).
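A minimal numerical sketch of Equation (37) is shown below. Because $\gamma =\left(\beta ,I\right)$ has more columns than rows, ${\gamma}^{\top}\gamma$ is singular, so the sketch computes the Moore-Penrose inverse directly with `numpy.linalg.pinv`, which returns the minimum-norm solution. The function name and the random test inputs are hypothetical.

```python
import numpy as np


def mimicking_estimates(R, alpha, beta):
    """Minimum-norm solution of gamma @ x = R - alpha with gamma = (beta, I).

    Returns the factor part (length K) and the residual part (length N).
    The result is an approximation of (F_t, eps_t), not an exact recovery.
    """
    N, K = beta.shape
    gamma = np.hstack([beta, np.eye(N)])        # N x (K + N)
    x = np.linalg.pinv(gamma) @ (R - alpha)     # shape (K + N,)
    return x[:K], x[K:]
```

The first K rows of `np.linalg.pinv(gamma)` can be read as the weights on R that define each factor mimicking portfolio in this sketch.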

4. Simulation Study

4.1. Methodology

4.1.1. Direct Simulation

For the direct simulation approach, we assume that $\left({\mu}_{t-h}\mathrm{,}{\sigma}_{t-h}\right)$ is estimated correctly and Equation (8) is already obtained. It is justifiable to use independent draws for daily realizations of ${\theta}_{t-h}$ across time, as our methods do not try to forecast future values based on serial correlations, which is the main difference between the proposed trading models and almost all the other methods in the existing literature. Of course, directly simulating the market price of risk vector ${\theta}_{t-h}$ avoids estimating the model parameters from historical data, which eliminates some estimation errors. Transaction cost is simplified to 5.00% per year and subtracted from the realized strategy return each month. The real backtesting study is postponed to Section 5. Note that, for risk management purposes, in real-world backtesting in the US equity market, we can mix our portfolio with some percentage of the VIX index or add risk management to the current methodologies.

4.1.2. Calibration and Simulation 1

Next, we work with a more realistic setting in order to illustrate how calibration impacts the model performance. The data generating process (DGP) of asset returns R is assumed to be a Gaussian process ${R}_{t}^{m}\sim N\left({\mu}_{t}^{m}\mathrm{,}{\sigma}_{t}^{m}\right)$, where ${\mu}_{t}^{m}={\stackrel{^}{\mu}}^{m}+{\varepsilon}_{t}^{m}$ is a combination of a trend term ${\stackrel{^}{\mu}}^{m}$ plus a noise ${\varepsilon}_{t}^{m}$ which also follows a joint normal distribution $N\left(\mathrm{0,}{\Sigma}_{\varepsilon}\right)$ at each time t. ${\left\{{\sigma}_{t}^{m}\right\}}_{m=1}^{M}$ is again sampled from a joint Gaussian distribution $N\left(\sigma \mathrm{,}{\Sigma}_{\sigma}\right)$ at each time t. The covariance structure of the DGP is modeled through $\left({\Sigma}_{\varepsilon}\mathrm{,}{\Sigma}_{\sigma}\right)$.

We first determine the number of assets M and the time scope T, which are set to be 200 and 240 months, respectively. Then, we simulate one trajectory of the M assets with T months into the future and perform the model calibration for $\left({\mu}_{t}\mathrm{,}{\sigma}_{t}\right)$ in Equation (2) and the optimization analysis. For realistic values of $\left(\stackrel{^}{\mu}\mathrm{,}\sigma \mathrm{,}{\Sigma}_{\varepsilon}\mathrm{,}{\Sigma}_{\sigma}\right)$, we record the Sharpe and Sortino ratios of the strategies under the equal weight assumption, scaled weight assumption and risk parity assumption, which are documented in Section 4.1.1.

4.1.3. Calibration and Simulation 2

The simulation study in this section compares the simple NAV curve produced by a naive trend following model and our equal-weight model. The trend following model proceeds as follows. Compute the T-period moving average of asset returns for each stock and thus formulate a vector of moving averages ${\bar{R}}_{t}$ at time t. Long the cross section with equal weights if ${1}_{M}\cdot {\bar{R}}_{t}>0$ and short otherwise. The DGP of the simulation study in this section is an ARMA(1,1)-GARCH model with realistic coefficients, simulated using the ugarchpath function offered by the rugarch package in the R programming language.
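The naive trend following benchmark can be sketched as follows. The function names are hypothetical, and the input is assumed to be a list of the last T rows of returns, one row per period and one column per asset.

```python
def trend_signal(returns_window):
    """+1 (long) if the summed T-period moving averages are positive, else -1 (short).

    returns_window: list of T rows, each a list of M asset returns.
    """
    T = len(returns_window)
    M = len(returns_window[0])
    moving_avg = [sum(row[m] for row in returns_window) / T for m in range(M)]
    return 1 if sum(moving_avg) > 0 else -1


def strategy_return(signal, next_returns):
    """Return of the equal-weight cross-section held in the direction of `signal`."""
    return signal * sum(next_returns) / len(next_returns)
```

Applying `trend_signal` on a rolling window and feeding the result into `strategy_return` for the following period yields the per-period returns from which a NAV curve can be accumulated.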

4.1.4. Calibration and Simulation 3

In this section, we test a long-only strategy based on the computed conditional expectations via a simulation study. The study proceeds as follows. First, simulate asset-specific factor processes $H\in {\mathbb{R}}^{M\times T}$ and $B\in {\mathbb{R}}^{M\times T}$ as independent draws from normal random variables, where M is the number of assets in the cross-section and T is the number of periods under consideration. Then transform the processes H and B by the following logic: replace the i-th column by the average of the $\left(i-1\right)$ -th and i-th columns for each of H and B. This step adds some serial correlation across time for $\left(H\mathrm{,}B\right)$. Suppose that the asset return process follows $R=5\times {H}^{2}+H-H\times B$. We compute ${\mathbb{E}}_{t}\left[{R}_{t+h}\right]=\varphi \left(t,{H}_{t},{B}_{t}\right)$ via machine learning regression. Then, we sort the computed conditional expected returns, long the top 100 assets and record the performance. We set $M=1000$ and $T=500$ days in this simulation study. Two other possible strategies that extend the one introduced in Section 3.4 are as follows. First, compute the mean value of the conditional expected returns; if it is positive, long the asset universe, and short the asset universe otherwise. The asset span is rotated as proposed in Section 2.3.1. Second, find the maximum expected rotated asset return at each time t; long this asset when the expected value is positive and hold it to the next period, and go short otherwise.
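The data generating process and the long-only selection step can be sketched as below. Two points are assumptions of this sketch: the column averaging uses the original (unsmoothed) neighbouring columns, and the products in the return formula are taken elementwise. The function names and default arguments are hypothetical, and the machine learning regression step is not shown.

```python
import numpy as np


def smooth_columns(X):
    """Replace column i by the average of original columns i-1 and i (column 0 is kept)."""
    Y = X.copy()
    Y[:, 1:] = 0.5 * (X[:, :-1] + X[:, 1:])
    return Y


def simulate_returns(M=1000, T=500, seed=0):
    """Simulate factors H, B and returns R = 5 H^2 + H - H B (elementwise)."""
    rng = np.random.default_rng(seed)
    H = smooth_columns(rng.standard_normal((M, T)))
    B = smooth_columns(rng.standard_normal((M, T)))
    R = 5.0 * H**2 + H - H * B
    return H, B, R


def top_k_long_only(expected, k=100):
    """Equal weights on the k assets with the highest expected return."""
    idx = np.argsort(expected)[-k:]
    w = np.zeros_like(expected, dtype=float)
    w[idx] = 1.0 / k
    return w
```

In a full experiment, `expected` would be the output of a regression fitted on lagged values of H and B.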

4.2. Numerical Results

4.2.1. Simulation Study 1

In this simulation study, we test the methods outlined in Sections 2.4, 3.4 and 4.1.1. Specifically, we generate 36 months of Sharpe ratios $\theta $ of the M assets, perform optimization at each month and record the net asset value curve (NAV curve). After obtaining the values of $\theta $, we generate U and therefore $\stackrel{^}{R}$. We generate ${w}_{t-h}\cdot {U}_{t}$ randomly at each month and add it to the realized returns to account for the remaining randomness of the market. We assume that $\theta $ is sampled from a Gaussian distribution with mean 0.95 and standard deviation 1.40, annualized. Those values are estimated from S&P500 historical data from 1990/01/01 to 2018/02/27. The frequency of data is set to be monthly. After obtaining the time series of $\stackrel{^}{R}$, we estimate ${\bar{\theta}}_{t+T-1}:=\frac{1}{T}{\sum}_{j=0}^{T-1}{\stackrel{^}{R}}_{t+j}$. Table 1 shows the results in various cases. Figure 1 shows the net asset value (NAV) curves.

4.2.2. Simulation Study 2

In this simulation study, we test the methods outlined in Sections 2.4, 3.4 and 4.1.1. Table 2 shows the results in various cases. Figure 2 shows the net asset value (NAV) curves.

Table 1. Sharpe ratios of different strategies.

Figure 1. Net asset value curves of simulation study 1 for unit and equal weights with M = 300.

Figure 2. Net asset value curves of simulation study 1 for unit and equal weights and M = 300.

Table 2. Sharpe ratios of different strategies.

4.2.3. Simulation Study 3

Simulation Study 3 corresponds to the ideas presented in Sections 2.4, 3.4 and 4.1.2. The methodology is as follows. First, simulate the market rate of returns R using the DGP specified in Section 4.1.2. Second, with the realized time series of ${\left\{{R}_{t}\right\}}_{t=1}^{K}$, use a moving window of T periods to compute

${\stackrel{^}{\mu}}_{t+T-1}\mathrm{:}=\frac{1}{T}{\sum}_{j=0}^{T-1}{R}_{t+j}$ (38)

${\stackrel{^}{\sigma}}_{t+T-1}^{\top}{\stackrel{^}{\sigma}}_{t+T-1}\mathrm{:}=COV\left({R}_{t}\mathrm{,}{R}_{t+1}\mathrm{,}\cdots \mathrm{,}{R}_{t+T-1}\right)\mathrm{.}$ (39)

Third, compute ${\stackrel{^}{\theta}}_{t+T-1}\mathrm{:}={\left[{\stackrel{^}{\sigma}}_{t+T-1}^{\top}{\stackrel{^}{\sigma}}_{t+T-1}\right]}^{-1}{\stackrel{^}{\sigma}}_{t+T-1}^{\top}{\stackrel{^}{\mu}}_{t+T-1}$. Fourth, U can be recovered by applying ${\stackrel{^}{U}}_{t+T-1}\mathrm{:}={\left[{\stackrel{^}{\sigma}}_{t+T-1}^{\top}{\stackrel{^}{\sigma}}_{t+T-1}\right]}^{-1}{\stackrel{^}{\sigma}}_{t+T-1}^{\top}{R}_{t+T-1}-{\stackrel{^}{\theta}}_{t+T-1}$. We report the Sharpe ratios in Table 3, and the NAV curves are shown in Figure 3. We use a moving window of 12 months to estimate the joint variance-covariance matrix ${\Sigma}_{t}={\sigma}_{t}^{\top}{\sigma}_{t}$ via the cov.shrink function in the R language and follow the rule below to decompose ${\Sigma}_{t}={B}_{t}^{\top}{B}_{t}$, where ${B}_{t}={V}_{t}{\lambda}_{t}{V}_{t}^{\top}$, ${V}_{t}$ is the orthogonal matrix of eigenvectors of ${\Sigma}_{t}$ and ${\lambda}_{t}$ is a diagonal matrix collecting the square roots of the eigenvalues of ${\Sigma}_{t}$.

Figure 3. Net asset value curves of simulation study 3 for scaled, equal and rotated weights.

Table 3. Sharpe ratios of different strategies in simulation study 3.
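The eigendecomposition rule and the moving-window estimates can be sketched as follows. The sketch uses the plain sample covariance from NumPy as a stand-in for the shrinkage estimator, and a pseudo-inverse in place of the matrix inverse so that it also runs when the estimated covariance is rank deficient; both choices, and the function names, are assumptions made for illustration.

```python
import numpy as np


def symmetric_sqrt(Sigma):
    """B = V diag(sqrt(eigenvalues)) V^T, so that B^T B = Sigma."""
    eigvals, V = np.linalg.eigh(Sigma)
    eigvals = np.clip(eigvals, 0.0, None)   # guard against tiny negative eigenvalues
    return V @ np.diag(np.sqrt(eigvals)) @ V.T


def rolling_estimates(R_window):
    """Estimate (mu_hat, B, theta_hat) from a T x M window of returns."""
    mu_hat = R_window.mean(axis=0)
    Sigma = np.cov(R_window, rowvar=False)  # sample covariance, not shrinkage
    B = symmetric_sqrt(Sigma)
    theta_hat = np.linalg.pinv(B) @ mu_hat  # equals [B^T B]^{-1} B^T mu_hat when B is invertible
    return mu_hat, B, theta_hat
```

Given these outputs, the residual for the last observation in the window could be computed as `np.linalg.pinv(B) @ R_window[-1] - theta_hat`.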

4.2.4. Simulation Study 4

This experiment corresponds to Section 4.1.3. The NAV curves are displayed in Figure 4. We set $T=6$ and $M=\left(25,50,250\right)$ for this study.

4.2.5. Simulation Study 5

The simulation study in this section corresponds to the method documented in Section 4.1.4. NAV plot for long only strategy is on the left, and the NAV for law of large numbers based strategy is on the right (Figure 5).

5. Backtesting

5.1. US Equity Market

5.1.1. The Market Data

We use monthly US stock return data downloaded from the CRSP database provided by WRDS. The data span from January of 1999 to December of 2018. After excluding missing return data, there are 1,521,909 security-month observations. The data set is comprised of various publicly traded securities, of which common stock is the major type, accounting for 81.00% of the sample. Returns are winsorized at the 1st and 99th percentile within each share code-month group to mitigate the effect of outliers or simply truncated at $\pm \Upsilon \mathrm{\%}$ -level for some chosen $\Upsilon $.
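The two outlier treatments mentioned above can be sketched for a single group of returns as follows. The function names are hypothetical, and the percentile convention is NumPy's default linear interpolation, which may differ from the one used in the original study.

```python
import numpy as np


def winsorize(returns, lower_pct=1.0, upper_pct=99.0):
    """Clip returns to the [lower_pct, upper_pct] percentile range of the group."""
    lo, hi = np.percentile(returns, [lower_pct, upper_pct])
    return np.clip(returns, lo, hi)


def truncate(returns, cap):
    """Clip returns symmetrically at +/- cap (cap expressed in the same units as returns)."""
    return np.clip(returns, -cap, cap)
```

In a panel setting, `winsorize` would be applied separately to each share code-month group.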

5.1.2. The Testing Methodologies

For US equities, we apply the methodology introduced in Section 2.3.2 to estimate the conditional expected asset returns, and we apply shrinkage to the variance-covariance matrix using a T-day moving average window. Two scenarios are considered. In the first scenario, we aim to test the claim that, as long as the conditional expected asset returns can be correctly estimated, the model performance is guaranteed by S-LLN. Therefore, we assume an imperfect foresight for one period ahead, meaning that at each time t we can use the returns materialized at time $t+h$ to infer the conditional expectations of asset returns ${\mathbb{E}}_{t}\left[{R}_{t+h}\right]$, where the estimation methodology and R are described in Section 2.3.2. The factor process is also chosen to be R, i.e., we let the cross-sectional returns explain their own behavior. If the cross-sectional average of the expected returns is positive, we go long the entire universe with equal weights. We short the equally-weighted universe if this quantity is negative. The second scenario infers the conditional expected asset returns in the same way and obtains the variance-covariance matrix through the shrinkage method, using the cov.shrink function in R. In the rotated asset space $\stackrel{^}{R}$, we compute the conditional expected returns and rank them, then long the top θ-percentage and short the bottom θ-percentage. The performance is evaluated quarterly. We do not account for transaction cost in this backtesting study. A preliminary test shows that a 30 bps cost per unit of portfolio weight change does not affect the testing results. Nor does the restriction of the asset space to domestic US common shares affect the outcome. Our tests are robust in this sense.

Figure 4. Net asset value curves of simulation study 4 for naive trend following and equal weights strategy with (25, 50, 250) assets.

Figure 5. Net asset value curves of simulation study 5.

5.1.3. Testing Results

For the first scenario, we get 100% positive quarterly returns and a Sharpe ratio of 1.87. Sortino ratio is therefore $\infty $. The result of the second scenario is summarized in Table 4, where $T=4$. Figure 6 shows the same methodology (Scenario 2) applied monthly. Annual Sortino ratio is again $\infty $.
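To make the reported ratios concrete, the sketch below computes annualized Sharpe and Sortino ratios from periodic returns under one common convention (downside deviation measured against a target of zero). The original study does not state its exact convention, so the function names, defaults and annualization factor here are assumptions. Under this convention the Sortino ratio is infinite whenever no return falls below the target, which matches the situation of 100% positive quarterly returns.

```python
import math
from statistics import fmean, stdev


def sharpe_ratio(returns, periods_per_year=4, risk_free=0.0):
    """Annualized Sharpe ratio from periodic returns."""
    excess = [r - risk_free for r in returns]
    return fmean(excess) / stdev(excess) * math.sqrt(periods_per_year)


def sortino_ratio(returns, periods_per_year=4, target=0.0):
    """Annualized Sortino ratio; infinite if no return falls below the target."""
    excess = [r - target for r in returns]
    downside = math.sqrt(fmean(min(e, 0.0) ** 2 for e in excess))
    if downside == 0.0:
        return math.inf if fmean(excess) > 0 else float("nan")
    return fmean(excess) / downside * math.sqrt(periods_per_year)
```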

5.2. China A-Shares

5.2.1. The Market Data

In addition to the US equity market, we test our algorithm with China A-shares. The monthly return data are downloaded from the Wind terminal and span from January of 2002 to August of 2018. We consider the Shanghai and Shenzhen A-share markets. HS300 index data are also downloaded from Wind. Returns are winsorized as follows. For positive returns, we apply a cap of 99.00% and for negative returns, a floor of −99.00% is imposed.

Table 4. US equity performance with 500 stocks randomly selected at the beginning of each quarter and performing the long-short strategy on the rotated asset space. Results are stable for different random draws. Sortino ratio is above 9.00.

Figure 6. Net asset value curve of scenario 2, applied monthly.

5.2.2. The Testing Methodology

We test the algorithm introduced in Section 3.4.1 with equally-weighted portfolios in the original asset space. The methodology is as follows. We compute the T-quarter moving average of past returns for each of the stocks in the A-share universe. Then, apply equal weights on the vector of the moving averages. If this number is positive, we long the equal-weight portfolio and if this number is negative, we short the equal-weight portfolio.

5.2.3. Testing Results

The net asset value curve (NAV curve) is shown in Figure 7. Annualized Sortino ratio is 4.67. Percentage of positive annual returns is 81.25%.

5.3. Limitations and Challenges

Although theoretically sound, our general investment framework relies heavily on the quality of the parameter estimates for $\left(\mu \mathrm{,}\sigma \mathrm{,}\theta \right)$ and ${\left[{\sigma}^{\top}\sigma \right]}^{-1}$, as illustrated by both the simulation and backtesting studies. Better estimates result in better strategy performance. However, given that the distributions of asset returns are time varying, it becomes very hard to estimate accurate values of the model parameters based merely on the time-series data, as each point in the time series is sampled from a different distribution. Also, the time varying property might differ across sampling frequencies. Our general framework poses two challenges to the field of parameter estimation. First, find the right time frequency such that the model parameters can be estimated accurately. Second, find the correct estimation method to minimize the numerical errors.

Figure 7. Net asset value curve of backtesting using equal weights with moving window 4 quarters and 3101 stocks in the universe of China A-share market. The figure corresponds to long-short portfolios. Time ranges from 2003-8 to 2018-8.

6. Conclusions and Future Research

In this paper, we outline a general framework of optimal investment and discuss several concrete investment strategies under this framework. The basic idea of the proposed investment methodologies is proper diversification and the elimination of future market randomness. Simulation studies and backtesting show good performance of the proposed methods under this framework. Note that the same ideas apply to all categories of investment strategies, e.g., trend following, mean-reversion and long-short. For example, the long-short strategy scores the assets and goes long or short certain classes of assets whose scores fall in predetermined sets. We can apply this type of analysis on the rotated asset space. The integration of the proposed investment framework with other classes of strategies is also interesting, which we leave to future research.

Moreover, the real market environment is, of course, much more complicated than the data generating processes we consider in the simulation study. Research on more advanced model parameter estimation techniques, in the presence of model uncertainty, time-varying parameters and measurement errors, is highly important and necessary. Examples can be found in [33] and [1] . Moreover, similar to [22] , our framework can be combined with deep reinforcement learning techniques to estimate the model parameters and perform portfolio optimization simultaneously. In [34] , a flexible stochastic volatility model framework is proposed based on neural networks, which might also be a future direction for imposing flexible model structures on $\left(\mu \mathrm{,}\sigma \right)$ and performing parameter estimation. We leave the open questions in estimation to future research and direct interested readers to the relevant literature for more information.

Last, but not least, the simulation studies and backtesting in this paper focus on equity data. However, the methodologies can be applied to any asset class, for which the rate of return can be defined and computed.

NOTES

^{1}This means that asset returns at different time might be sampled from different distributions.

^{2}For example, let the diffusion coefficients be an ANN function-form.

^{3}The requirement of being identically distributed for U can actually be relaxed under Lyapunov central limit theorem (L-CLT), which will cause a slight variation in our discussions later. However, this will not impact the algorithms to be introduced in Section 3.

^{4}Actually, the conditionally joint independence of U can be replaced by being jointly uncorrelated under strong law of large numbers (S-LLN).

^{5}Sometimes we write ${\mathbb{E}}_{t}[\cdot ]\mathrm{:}=\mathbb{E}\left[\cdot \mathrm{|}{F}_{t}\right]$.

^{6}Equation (2) can be justified as an approximate relationship due to the infinite orthogonal basis expansion of R in the conditional Hilbert space of ${L}^{2}\left({F}_{t}\right)$ random variables at each time t, see Remark 1. $\left(\mu \mathrm{,}\sigma \right)$ can be any stochastic process. Of course, they are market, asset class and regime dependent, meaning that different situations might result in different modeling practice for them.

^{7}Here we define rotation as a linear transformation.

References

[1] Gu, S., Kelly, B. and Xiu, D. (2018) Empirical Asset Pricing via Machine Learning. Chicago Booth Research Paper.

https://doi.org/10.3386/w25398

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3159577

[2] Halperin, I. and Feldshteyn, I. (2018) Market Self-Learning of Signals, Impact and Optimal Trading: Invisible Hand Inference with Free Energy (or, How We Learned to Stop Worrying and Love Bounded Rationality). SSRN Electronic Journal, 1-57.

https://doi.org/10.2139/ssrn.3174498

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3174498

[3] Bianchi, D., Buchner, M. and Tamoni, A. (2019) Bond Risk Premia via Machine Learning. Working Paper.

https://doi.org/10.2139/ssrn.3400941

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3232721

[4] Yu, P., Lee, J., Kulyatin, I., Shi, Z. and Dasgupta, S. (2019) Model-Based Deep Reinforcement Learning for Dynamic Portfolio Optimization. Computer Science, 1-21.

https://arxiv.org/abs/1901.08740

[5] Feng, G., Polson, N. and Xu, J. (2019) Deep Learning in Asset Pricing. Working Paper.

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3350138

[6] Chen, L., Pelger, M. and Zhu, J. (2019) Deep Learning in Asset Pricing. Quantitative Finance, 1-89.

https://doi.org/10.2139/ssrn.3350138

https://arxiv.org/abs/1904.00745

[7] Faber, M. (2010) Relative Strength Strategies for Investing. Working Paper.

https://doi.org/10.2139/ssrn.1585517

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1585517

[8] Antonacci, G. (2017) Risk Premia Harvesting through Dual Momentum. Journal of Management & Entrepreneurship, 2, 27-55.

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2042750

[9] Clare, A., Seaton, J., Smith, P. and Thomas, S. (2015) The Trend Is Our Friend: Risk Parity, Momentum and Trend Following in Global Asset Allocation. Working Paper.

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2126478

[10] Han, Y., Yang, K. and Zhou, G. (2013) A New Anomaly: The Cross-Sectional Profitability of Technical Analysis. Journal of Financial and Quantitative Analysis, 48, 1433-1461.

https://doi.org/10.2139/ssrn.1656460

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1656460

[11] Avellaneda, M. and Lee, J. (2008) Statistical Arbitrage in the U.S. Equities Market. Working Paper.

https://doi.org/10.2139/ssrn.1153505

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1153505

[12] Meucci, A. (2009) Review of Statistical Arbitrage, Cointegration, and Multivariate Ornstein-Uhlenbeck. Working Paper, 1-19.

https://doi.org/10.2139/ssrn.1404905

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1404905

[13] Kakushadze, Z. (2015) Mean-Reversion and Optimization. Journal of Asset Management, 16, 14-40.

https://doi.org/10.1057/jam.2014.37

[14] Adrian, T., Crump, R. and Vogt, E. (2017) Nonlinearity and Flight to Safety in the Risk-Return Trade-off for Stocks and Bonds. FRB of NY Staff Report No. 723.

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2646046

[15] Meucci, A. (2014) Linear Factor Models: Theory, Applications and Pitfalls. Working Paper.

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1635495

[16] Fama, E. and French, K. (1992) The Cross-Section of Expected Stock Returns. Journal of Finance, 47, 427-465.

https://doi.org/10.1111/j.1540-6261.1992.tb04398.x

[17] Fama, E. and French, K. (2016) Dissecting Anomalies with a Five-Factor Model. Review of Financial Studies, 29, 69-103.

https://doi.org/10.1093/rfs/hhv043

[18] Samo, Y. and Vervuurt, A. (2016) Stochastic Portfolio Theory: A Machine Learning Perspective. Working Paper, 1-9.

https://arxiv.org/pdf/1605.02654.pdf

[19] Heaton, J., Polson, N. and Witte, J. (2016) Deep Learning for Finance: Deep Portfolios. Applied Stochastic Models in Business and Industry, 33, 3-12.

https://doi.org/10.2139/ssrn.2838013

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2838013

[20] Halperin, I. (2017) QLBS: Q-Learner in the Black-Scholes(-Merton) Worlds. Quantitative Finance, 1-34.

https://doi.org/10.2139/ssrn.3087076

https://arxiv.org/abs/1712.04609v2

[21] Deng, Y., Bao, F., Kong, Y., Ren, Z. and Dai, Q. (2017) Deep Direct Reinforcement Learning for Financial Signal Representation and Trading. IEEE Transactions on Neural Networks and Learning Systems, 28, 653-664.

https://doi.org/10.1109/TNNLS.2016.2522401

[22] Jiang, Z., Xu, D. and Liang, J. (2017) A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem. Deep Portfolio Management, 1-31.

https://arxiv.org/pdf/1706.10059.pdf

[23] Ritter, G. (2017) Machine Learning for Trading. NBER Working Paper Series, 1-79.

https://doi.org/10.2139/ssrn.3015609

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3015609

[24] Sirignano, J. and Spiliopoulos, K. (2017) DGM: A Deep Learning Algorithm for Solving Partial Differential Equations. Working Paper, 1-31.

https://arxiv.org/pdf/1708.07469.pdf

[25] E, W., Han, J. and Jentzen, A. (2017) Deep Learning-Based Numerical Methods for High-Dimensional Parabolic Partial Differential Equations and Backward Stochastic Differential Equations. Mathematics, 1-39.

https://arxiv.org/abs/1706.04702

[26] Detemple, J., Garcia, R. and Rindisbacher, M. (2003) A Monte Carlo Method for Optimal Portfolios. Journal of Finance, 58, 401-446.

https://doi.org/10.1111/1540-6261.00529

[27] Markowitz, H. (1952) Portfolio Selection. Journal of Finance, 7, 77-91.

https://doi.org/10.1111/j.1540-6261.1952.tb01525.x

[28] Black, F. and Litterman, R. (1992) Global Portfolio Optimization. Financial Analysts Journal, 48, 28-43.

https://doi.org/10.2469/faj.v48.n5.28

[29] Kraft, H., Seiferling, T. and Seifried, F. (2016) Optimal Consumption and Investment with Epstein-Zin Recursive Utility. Finance and Stochastics, 21, 187-226.

https://doi.org/10.1007/s00780-016-0316-0

[30] Kartoun, U. (2013) White Paper: A Method for Comparing Hedge Funds. Working Paper, 1-27.

https://arxiv.org/ftp/arxiv/papers/1303/1303.0073.pdf

[31] Dangi, A. (2012) Financial Portfolio Optimization: Computationally Guided Agents to Investigate, Analyse and Invest!? Master Thesis, University of Pune, Maharashtra.

https://arxiv.org/pdf/1301.4194.pdf

[32] Lu, Y. (2015) Weak Laws of Large Numbers for Sequences or Arrays of Correlated Random Variables. International Mathematical Forum, 10, 165-173.

https://doi.org/10.12988/imf.2015.5111

[33] Lai, T., Xing, H. and Chen, Z. (2011) Mean-Variance Portfolio Optimization When Means and Covariances Are Unknown. The Annals of Applied Statistics, 5, 798-823.

[34] Luo, R., Zhang, W., Xu, X. and Wang, J. (2017) A Neural Stochastic Volatility Model. Computer Science, 1-17.