A Machine Learning-based Recommendation System for Swaptions Strategies

2018·arXiv

Abstract

1 Introduction

Derivatives traders are usually required to scan through hundreds, even thousands, of possible trades on a daily basis. A concrete case is the so-called Mid-Curve Calendar Spread (MCCS), a derivatives package that involves selling an option on a forward-starting swap and buying an option on a spot-starting swap with longer expiration [8,26]. In such a package, traders look at the historical carry and breakeven width levels, metrics that can be easily inferred from the terminal or aged payoff profile of the MCCS and that are shown in several heatmaps produced by the research team. They then rank the most prominent trades to offer to a client or to pursue in proprietary trading. The main upside of this framework is the straightforwardness and swiftness with which decisions are made.

However, this approach has several downsides: (i) substantial information on the underlying, such as sensitivities and implied volatility, is usually not taken into account; (ii) using the previous example, high historical values for carry and breakeven width are necessary rather than sufficient conditions for a profitable MCCS trade, an argument that extends to other trades as well; (iii) a trader can quickly judge whether an individual trade is worth investing in, but may take some time to find it; and (iv) after a given period, traders tend to look only at a small subset of possible trades (a small area of the heatmap) rather than at the whole available selection. Hence, a systematic approach in which more of the information at hand is combined and aggregated can help find good trading picks and increase the trader’s productivity.

Therefore, the objective of this work is to develop a trading recommendation system that can aid derivatives traders in their day-to-day routine. More specifically, our solution is based on the following pipeline: (i) on a certain trade date, we compute metrics and sensitivities related to an MCCS; (ii) these metrics are fed into a model that predicts its expected return for a given holding period; and, after repeating (i) and (ii) for all trades, (iii) we rank the trades using some dominance criteria. Our final solution is a model-based heatmap with the attractiveness scores for each trade, which can be offered to traders and salespeople on a daily basis.

We organised this work as follows: the next section presents a literature review on existing approaches to return/price prediction/estimation in different areas and instruments, as well as a brief description of MCCS trades. The third section presents the dataset that comprises the MCCS trades, showing how the information is computed and gathered, which variables are the inputs and outputs, and the main assumptions embedded in it. Then, we move to the modelling strategy, highlighting the main models used as candidates for the recommendation system and how they are tested and assessed. Finally, we present the results and discussion, closing with some concluding remarks and future directions for research.

2 Background

2.1 Related Works

The literature provides a growing body of evidence that price changes can be predicted, that is, that in particular circumstances and periods securities violate the Efficient Market Hypothesis [5, 24]. In this sense, researchers have employed different modelling approaches and information sets to predict price changes across a range of assets. When we scan the literature on cash instruments (equities, bonds, foreign exchange, etc.) that uses only past returns as the main source for prediction, we find works that tap into Bayesian forecasting [33], Nonparametric Predictive Inference [2], Forecasting Combination [12], Generalized Exponential Weighted Moving Average [25], Support Vector Machines (SVM) [22], Shallow and Deep Neural Networks architectures [7,9,18,32], Random Forest and Gradient Boosting Trees [23], and so forth. The list of proposed methodologies keeps growing, with equities or indices appearing as the dominant asset class to which these algorithms are applied. Collectively, they provide evidence that some forecastability of returns can be achieved by putting in place complex models with a suitable training scheme.

In contrast with the emphasis that researchers in cash instruments put on return predictability, research on derivatives instruments (options, swaps, swaptions, etc.) concentrates most of its effort on pricing these contracts. In parallel to the traditional framework, alternative ways of pricing and trading have started to emerge that rely on fewer assumptions and are more data-driven. We can pinpoint approaches that use Neural Networks for option pricing and hedging with daily S&P 500 index call options [17], as well as for real-time pricing and hedging of options on EUR/USD currency futures at tick level [31]. Other approaches in the derivatives realm are also worth mentioning, such as the prediction of pricing and hedging errors for equity-linked warrants with Gaussian Process models [19], machine learning models for predicting option prices on KOSPI 200 Index options [27], and a general study on forecasting option price distributions using Bayesian kernel methods [28].

When we turn to the asset type to which this work is dedicated, interest rate swaptions, a similar pattern persists: most of the research is related to pricing and not to return prediction. Regarding pricing, the same tradition of relying on stochastic calculus techniques is followed [4,29]. Regarding more data-driven alternatives, such as those we saw for currency, index and equity options, we can only mention the work of Souza et al. [30], which calibrates the Vasicek interest rate model under the risk-neutral measure by learning the model parameters using Gaussian processes for regression. Regarding trading strategies and return prediction, we find even less academic research, with most of it perhaps residing inside the counterparties that trade such products (banks, hedge funds, etc.). This shortage of published research might be linked to the absence of ready-to-use, publicly available datasets similar to the ones found for cash products, since these instruments are traded off-exchange.

Based on this review of existing approaches to return/price prediction/estimation in different areas and instruments, to the best of our knowledge, our work is the first attempt to build a trading recommendation system in the context of derivatives. Our approach is novel not only from a modelling perspective: instead of trading the vanilla product (receiver/payer interest rate swaption), we focus on options strategies (calendar spreads, straddles, etc.), which in many cases are the packages traded in practice. By thinking in terms of the package, in this case a Mid-Curve Calendar Spread, rather than its individual constituents, we unlock some features that can only be computed in this situation, like the carry at expiry, breakeven width and so on.

Therefore, we can train our models using not only past returns but also sensitivities and information derived from the package payoff function. By framing our investment strategy in this manner, we have a large information set that can substantially aid the forecasting of returns. As a counter-effect, this poses a new challenge: separating relevant features in a dynamic context. In this respect, the combination of temporal cross-validation, a diverse set of models and regularisation/feature selection can provide a robust framework for backtesting and assessing trading strategies. Before presenting this framework, the next section gives a brief overview of MCCS trades.

2.2 Mid-Curve Calendar Spreads

Mid-Curve Calendar Spread (MCCS) is a package involving short selling an option on a forward-starting swap and going long a longer-expiry swaption on the same underlying swap [8]. There is a counterpart with many similarities for equities; see [26] for more information. Investors typically use MCCS to take a view on forward volatility. This comes from the fact that, conceptually, spot volatility can be decomposed into forward volatility and mid-curve volatility. Taking 10y10y as an example, Figure 1 illustrates the time periods covered by different interest rate volatilities and their instruments. The red lines indicate the time over which interest rate volatility exposure is taken, and the grey line indicates the underlying forward swap rate.

Figure 1: Mid-curve Swaption: 5y mid-curve on 5y10y swap rate – the volatility of a forward-starting swaption, called mid-curve, whose strike is set at inception but whose underlying swap starts several years after the option expiry date.

Figure 2 presents the payoff profile for an EUR 1m1y2y.

Figure 2: Payoff profile for an EUR 1m1y2y.

We plot the payoff profiles for current volatility and for up and down volatility scenarios, noting that the long vega position means that the payoff profile shifts up in a rising volatility environment and correspondingly shifts down in a falling volatility environment. We calculate the (volatility adjusted) breakevens as being 0.41% 47%, giving little protection against selloffs. We note that forwards in a 1 volatility band leave them at 0.40% 48%, a range just marginally larger than our breakeven range (i.e., the trade should pay off just slightly less than 66% of the time). MCCS can result in what we think of as turbocharged carry, primarily because of the risks that they carry (which fortunately can be balanced with the returns in a way that results in relatively attractive trades). Based on these characteristics of MCCS, the next section presents the methodology we developed to build this trading recommendation system.

3 Methodology

In summary, our solution develops the following roadmap (also schematically described in Figure 3):

Figure 3: Flowchart describing the input-output schemes from the proposed trading recommendation system for MCCS trades.

1. Data: On a certain trade date, we calculate metrics and sensitivities related to an MCCS package;

2. Modelling: These metrics are fed into a predictive model that outputs its expected return for a given holding period (e.g., one year);

3. Recommendation: After repeating steps 1 and 2 for all MCCS, we rank them based on the expected returns using some criteria.

Following this outlined structure, the next three subsections describe in more detail when and which MCCS trades were recorded (Dataset), which predictive models were trained and how they were assessed (Modelling), and how the long/short trading signal is computed for each MCCS (Recommendation). Finally, the last subsection presents the metrics used to evaluate the performance of the recommendation system when a certain predictive model candidate underpins it.

3.1 Dataset

During our experiments, we opted to use the trades displayed in Table 1. Although many other configurations are available in practice, these are the ones with the longest historical data available, which is important when it is necessary to fit a predictive model. As can be seen, all trades are in Euro, spanning different expiries (1y-5y), forwards (1y-5y) and swap tenures (1y-5y and 8y).

Table 1: Configuration of the MCCS trades used.

For each configuration, at time t we agree with a counterparty to trade this package using the At the Money Forward (ATMF) rate as the strike, paying or receiving the present value $PV_t$. The $PV_t$ is computed via the SABR model [29], using information and parameters (e.g., spot, forward rates and rate-rate correlation) calibrated to market data on a daily basis. From the same model that computed the $PV_t$, we can also obtain other metrics and sensitivities, such as those displayed in Table 2.

Table 2: Metrics and sensitivities computed for each available package at time t.

• PV
• Strike
• Carry at Expiry (Carry)
• Breakeven Width (BE Width)
• Aged 1y Carry
• Theta
• ATMF Implied Volatility (Implied Vol)
• Gamma
• Vega
• Curve Carry (Aged 1y)
• Time Carry (Aged 1y)
• Volatility Carry (Aged 1y) (Vol Carry)

Carry and BE Width are those obtained by looking at the payoff profile at expiry. The Aged 1y Carry is produced by ageing the trade by one year (moving closer to expiration), estimating the payoff profile and computing the carry. Theta, Vega and Gamma are the sensitivities of the instrument to a change in time, volatility and a wider range of underlying rate movements, respectively. These and the ATMF Implied Vol are backed by the SABR model too. Curve, Time and Volatility Carry are the amounts of Aged 1y Carry that can be attributed to the changes in certain sensitivities from spot to forward, such as the Delta (Curve), Theta (Time) and Vega (Volatility). These can also be used as tools to understand which factors most influence the instrument value over time.

After computing all these metrics at time t, we hold the trade until t + h, where h can be two weeks, one month, one year, and so on, as long as t + h is before or at expiration. At time t + h we compute the $PV_{t+h}$ of the same trade again, using the new economic scenario available (e.g., rates, change in model parameters). By agreeing to buy back or sell the current trade for $PV_{t+h}$, we can compute the holding h-period return of the trade started at time t as the arithmetic return between $PV_t$ and $PV_{t+h}$.
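As a minimal sketch of this computation, the snippet below evaluates an arithmetic holding-period return from two present values and applies it along a series of PVs. The normalisation by the absolute entry PV is an assumption made here so that the sign of the return follows the change in PV even when a premium is received; the function names and the sample numbers are hypothetical and not taken from the dataset.

```python
from typing import List


def holding_period_return(pv_start: float, pv_end: float) -> float:
    """Arithmetic return from entering at pv_start and unwinding at pv_end.

    Assumption: the change in PV is divided by the absolute entry PV.
    """
    if pv_start == 0:
        raise ValueError("entry PV must be non-zero")
    return (pv_end - pv_start) / abs(pv_start)


def holding_returns(pv_series: List[float], h: int) -> List[float]:
    """Return one holding-period return per start date t with t + h in sample.

    The last h observations have no PV at t + h, so they are dropped.
    """
    return [
        holding_period_return(pv_series[t], pv_series[t + h])
        for t in range(len(pv_series) - h)
    ]


# Illustrative PVs only (weekly observations, h = 2 weeks)
example = holding_returns([1.0, 2.0, 4.0, 8.0], h=2)
```

With four observations and h = 2, only the first two start dates yield a return, which is the end-of-sample trimming that a long holding period imposes.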

In summary, Table 3 presents an example of information in a wide format that is available when we combine the data from time t and t + h.

Table 3: Example of information available at time t and t + h for the MCCS.

Note that in the most contemporaneous period (close to T) we do not have $PV_{t+h}$ and so cannot compute the holding-period return. Conversely, if we want to use lagged returns as explanatory variables for this return, then at the beginning of the sample this information is also not available. Therefore, our dataset is trimmed at the beginning and at the end, mainly according to the value of h. If h is small, such as two weeks or one month, the trimming is barely perceptible and, therefore, may not affect the model fitting and validation. However, if h is large, such as two or three years, this might substantially reduce the samples available, narrowing the range of models and cross-validation schemes that can be employed for this task. Based on these procedures, metrics and observations, Table 4 lists the other details that we used during our experiments to generate the dataset.

Table 4: Details used to generate the MCCS trade dataset.

Therefore, we gathered data from trades entered on a weekly basis from September 2006 to September 2016. These trades are struck ATMF, using the PV computed from the Middle Rate (in practice, some bid-ask spread proportional to the Vega would be included). After holding the trade for one year (h = 1y), we compute the arithmetic returns, which are therefore annualised by definition. These returns are gross, so we need to take into account the transaction costs (hedging costs and fixed fees charged by the derivatives desk) as well as some future funding rate. These values are also outlined in Table 4, where the transaction costs of 0.75 as a fraction of Vega were chosen to take into account not only the transaction cost but also some potential bid-ask spread on the start/unwind of the trade. The 3-month London Interbank Offered Rate (LIBOR) was chosen as the funding cost/benchmark rate to compute excess returns.

Using these assumptions, next subsection presents the modelling strategy that taps into this dataset to create the recommendation system for the MCCS trades.

3.2 Modelling

Regarding modelling, our general model is a system of uncoupled equations:

$$r^{(i)}_{t+h} = f_i\left(\mathbf{x}^{(i)}_t\right) + \varepsilon^{(i)}_{t+h}, \qquad i = 1, \ldots, n,$$

where for each MCCS trade (i = 1, ..., n) there is an i-th predictive model $f_i$ that is fed with a set of pre-calculated features $\mathbf{x}^{(i)}_t$ (BE Width, Carry, etc.) and returns an estimate of the holding 1y-period return $\hat{r}^{(i)}_{t+h}$. As the model is an approximation, some noise/error is expected, which is expressed by the component $\varepsilon^{(i)}_{t+h}$. After defining which variable is to be predicted, the remaining points are: which models are available to embody $f_i$, and how the fitting, validation and selection of these models are going to be made.

About the first point, in the first rows of Table 5 we display the models that we used during our experiments, with their mathematical descriptions and usage found in the following references [3,11,16,20].

In Table 5, the Model column presents the range of models fitted in this work for this prediction purpose: we started from simple predictive models such as Classical Linear Regression, k-Nearest Neighbours and Classification and Regression Tree, and moved towards those that can seamlessly exhibit nonlinear behaviour, like Random Forest, Kernel Ridge Regression, Multi-Layer Perceptron and Support Vector Regression. Some of these methods had their hyperparameters held constant across all experiments (Fixed Hyperparameters column), either because we wanted to apply a particular form of a method (RBF kernel, single hidden layer, etc.) or because during a warm-up phase we noticed that they did not substantially affect the results (hyperbolic tangent, increasing number of trees, etc.).

For certain models, the Cross-Validated Parameters column shows which hyperparameters were optimised before the prediction step. For instance, take the case of Ridge Regression and the need to define the regularisation value ($\lambda$) appropriately. Consider that we have a set of training pairs of size L, and that we split this sample into k-rolling-cross-validation (k-rolling-cv) folds (better explained later in this subsection). Then, using this scheme, we train and test the Ridge Regression model with one of the predefined values of $\lambda$, say $\lambda$ = 10. We compute some performance function on the test set (Mean Squared Error, MSE) and repeat this process for all the values of $\lambda$ available. The final model uses the $\lambda$ that on average had the lowest MSE.
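The selection loop described above can be sketched as follows. This is an illustration under simplifying assumptions rather than the exact scheme used in the experiments: the ridge model has a single feature and no intercept (so it has a closed-form slope), and the rolling folds are taken to be a fixed-length training window that slides forward by one test block at a time. The names `fit_ridge_1d`, `rolling_folds` and `select_lambda` are hypothetical.

```python
from statistics import fmean


def fit_ridge_1d(xs, ys, lam):
    """Closed-form ridge slope for one feature and no intercept."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def rolling_folds(n, k, train_size, test_size):
    """Yield up to k (train, test) index ranges that move forward in time."""
    for j in range(k):
        start = j * test_size
        train_end = start + train_size
        test_end = train_end + test_size
        if test_end > n:
            break  # not enough data left for another fold
        yield range(start, train_end), range(train_end, test_end)


def select_lambda(xs, ys, lambdas, k, train_size, test_size):
    """Pick the regularisation value with the lowest average test MSE."""
    avg_mse = {}
    for lam in lambdas:
        fold_mse = []
        for train, test in rolling_folds(len(xs), k, train_size, test_size):
            beta = fit_ridge_1d([xs[i] for i in train],
                                [ys[i] for i in train], lam)
            errors = [(ys[i] - beta * xs[i]) ** 2 for i in test]
            fold_mse.append(fmean(errors))
        avg_mse[lam] = fmean(fold_mse)
    best = min(avg_mse, key=avg_mse.get)
    return best, avg_mse


# Noise-free toy data (y = 2x): no shrinkage should win here.
xs = [float(i) for i in range(1, 13)]
ys = [2.0 * x for x in xs]
best_lam, scores = select_lambda(xs, ys, [0.0, 10.0, 100.0],
                                 k=3, train_size=6, test_size=2)
```

Each test block lies strictly after its training window, so no future observation is used when scoring a candidate value.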

We fitted the usual benchmarks found in the literature for regression and forecasting modelling: the Average and Naive models [16,21]. We also implemented the benchmarks that traders use to assess whether a particular MCCS is worth pitching or trading: BE Width and Carry at Expiry. We replicated the way traders look at these features by computing z-scores based on the average and standard deviation over a rolling window of 1 year. The long/short signal follows a rule of thumb with a simple rationale: if a certain metric has a z-score above or equal to 3, the trader goes fully long (+)/short (-) in the trade, since it is a very extreme event. Otherwise, the trader reduces the leverage on it until the metric is less than one standard deviation away from the rolling average.
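A minimal sketch of this rule of thumb is given below. Several details are assumptions made for illustration: the rolling window excludes the current observation, a positive z-score maps to a long position and a negative one to a short position, and the leverage between one and three standard deviations is scaled linearly as z/3. The function names are hypothetical.

```python
from statistics import mean, stdev


def rolling_zscore(values, window):
    """z-score of the latest value against the preceding `window` values."""
    if len(values) < window + 1:
        raise ValueError("need at least window + 1 observations")
    history = values[-window - 1:-1]
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        raise ValueError("rolling standard deviation is zero")
    return (values[-1] - mu) / sigma


def zscore_signal(z, full=3.0, floor=1.0):
    """Map a z-score to a position in [-1, 1].

    |z| >= full  -> fully long (+1) or fully short (-1)
    |z| <  floor -> no position
    otherwise    -> reduced leverage, assumed linear in z
    """
    if abs(z) >= full:
        return 1.0 if z > 0 else -1.0
    if abs(z) < floor:
        return 0.0
    return z / full


# Illustrative metric history with an extreme last value
z = rolling_zscore([1.0, 2.0, 3.0, 4.0, 5.0, 10.0], window=5)
position = zscore_signal(z)
```

With weekly data, a 1-year window would correspond to roughly 52 observations; the short window above is only for readability.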

We removed any missing data and clipped extreme values, mainly returns above the 95th percentile (in our case these can be due to numerical problems or to extreme scenarios related to the 2008-2009 financial crisis period). The next subsection presents the final component of our roadmap: the recommendation system.
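The cleaning step mentioned above can be sketched as follows, assuming a nearest-rank definition of the percentile and clipping on the upper side only; both choices, and the name `clip_upper`, are illustrative assumptions rather than the exact procedure used.

```python
import math


def clip_upper(values, pct=95.0):
    """Drop missing entries, then cap values above the pct-th percentile.

    Assumptions: missing data are encoded as None or NaN, and the
    percentile follows the nearest-rank definition.
    """
    clean = [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]
    if not clean:
        return []
    ordered = sorted(clean)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    cap = ordered[rank - 1]
    return [min(v, cap) for v in clean]


# Illustrative returns with one missing entry and one extreme value
cleaned = clip_upper([0.01, None, 0.02, 1.50], pct=50.0)
```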

3.3 Recommendation

The recommendation of a certain trade can be made solely on some normalised version of the expected return for holding the i-th trade for a 1y-period. Given that each model provides individual forecasts for each MCCS, and that their performance is then assessed locally and globally, a more suitable approach is to assign a credit based on the track record of a model in predicting a particular MCCS trade. Hence, a signal is weighted up or down based not only on the magnitude of a model prediction but also on its quality. Then, consider $\hat{r}^{(i)}$ as the expected return for holding the i-th trade for a 1y-period. Now, define the new signal function by:

where the strength of the i-th long/short signal is given by its expected return, scaled by the maximum weighted return that a long/short position on the same trade was expected to yield in the previous h-period (in this case 1 year); this is why the returns are taken in absolute terms. Therefore, the trade with the maximum weighted return in absolute terms will have a signal of magnitude 1, while those with weighted returns close to zero will yield a signal near 0. The weight/credit of a certain prediction is based on the historical Pearson correlation coefficient, that is, the adherence between the actual and predicted values.
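One possible reading of this scaling is sketched below. It is an assumption-laden illustration, not the exact signal function: each prediction is multiplied by its credit (here a given historical correlation per trade), and the result is divided by the largest absolute weighted prediction, taken over the trades on the current date and, optionally, over a supplied maximum from the previous h-period. The names `scaled_signals` and `history_max`, and the numbers, are hypothetical.

```python
def scaled_signals(preds, weights, history_max=0.0):
    """Scale credit-weighted predictions into signals within [-1, 1].

    preds:       predicted return per trade on the current date
    weights:     credit per trade (e.g. historical Pearson correlation)
    history_max: largest absolute weighted prediction seen over the
                 previous h-period (assumed to be supplied by the caller)
    """
    weighted = {name: weights[name] * preds[name] for name in preds}
    scale = max([abs(v) for v in weighted.values()] + [abs(history_max)])
    if scale == 0:
        return {name: 0.0 for name in preds}
    return {name: v / scale for name, v in weighted.items()}


# Illustrative predictions and credits for three hypothetical trades
signals = scaled_signals(
    preds={"A": 0.10, "B": -0.05, "C": 0.0},
    weights={"A": 0.8, "B": 0.4, "C": 0.9},
)
```

Under this reading, trade A (largest weighted prediction) receives a full long signal, trade B a reduced short signal, and trade C no position.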

3.4 Evaluation Metrics

Below we outline two types of metrics: one that focuses on the predictive performance of the model, and three others that are based on the profit/loss harvested by its application during the backtest. Defining the strategy return as the combination of the realised/observed excess returns and the signal (a function of a model prediction), we can compute the following metrics:

• Pearson Correlation Coefficient (Rho): a dimensionless measure of the linear dependence between the actual values $r$ and the predicted values $\hat{r}$:

$$\rho = \frac{\mathrm{Cov}(r, \hat{r})}{\sqrt{\mathrm{Var}(r)\,\mathrm{Var}(\hat{r})}}$$

where Cov and Var are the covariance and variance operators. It ranges over [-1, +1], with -1 representing a perfect inverse linear association and +1 the opposite. In our case, we benefit more when Rho is close to +1. In the context of linear models, a higher predictive power is a necessary condition for profitable trades (see [1]); hence, by minimising the predictive error we are to some extent paving a path towards profit maximisation, although the link is not clear-cut since this is not a sufficient condition.

• Average Return: the average of the strategy returns over the backtest.

• Standard Deviation: the estimator of the dispersion of the strategy returns around their average (a risk measure in a certain sense).

• Information Ratio: the average strategy return in excess of the average return of the benchmark (e.g., treasury bond, equity index), adjusted by the risk assumed. In our case, the benchmark was already set to the 3-month LIBOR rate (Table 4). The Information Ratio makes the performance of each strategy comparable: since we adjust average returns by the risk assumed for each strategy, it removes the leverage component that magnifies/shrinks the returns provided by a certain strategy.
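These metrics can be sketched with the standard library as follows. The snippet assumes the sample (n - 1) estimator for the standard deviation and divides the excess average return by the standard deviation of the strategy returns; the exact estimators used in the backtest are not restated here, so both are assumptions, as are the function names and the toy numbers.

```python
import math
from statistics import mean, stdev


def pearson(actual, predicted):
    """Pearson correlation between actual and predicted values."""
    ma, mp = mean(actual), mean(predicted)
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    var_a = sum((a - ma) ** 2 for a in actual)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def information_ratio(strategy_returns, benchmark_return=0.0):
    """Average return over the benchmark, per unit of return dispersion.

    benchmark_return defaults to 0 on the assumption that the inputs are
    already excess returns over the funding rate.
    """
    excess = mean(strategy_returns) - benchmark_return
    return excess / stdev(strategy_returns)


# Illustrative strategy returns (already in excess of the benchmark)
returns = [0.10, 0.20, 0.30]
avg_return = mean(returns)
risk = stdev(returns)
ir = information_ratio(returns)
rho = pearson([0.1, 0.2, 0.3], [0.2, 0.4, 0.6])
```

Because the Information Ratio is a ratio of mean to dispersion, multiplying every return by a constant leverage factor leaves it unchanged.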

4 Results and Discussions

Figure 4 displays a heatmap with the results of all models for each trade in terms of Average Return (%). Several remarks can be made about the global picture: (i) the Naive and Mean Pred models underperformed, whereas the traders’ benchmarks performed reasonably well, surpassing the predictive models on many occasions; (ii) within the linear regression family, Lasso Regression, followed by Ridge Regression, performed best; (iii) most nonlinear models failed to provide a decent average return; and (iv) MLP fared well for the trades in the EUR Xy1y1y range, but did not repeat this performance consistently across other trades.

When we take into account the variability seen in the stream of returns generated by the recommendation system, we may encounter a different picture. Figure 5 shows a heatmap with the Information Ratio for all the available combinations of models and trades.

In general, the models kept their positions unaltered in comparison to the Average Return (%) – linear models still fared better than the nonlinear ones –, but now all are standing on a similar scale. Based on these Information Ratio results, Table 6 presents a statistical analysis using the average ranks, Friedman test and Holm posthoc procedure [10].
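As a minimal sketch of the kind of rank-based comparison behind Table 6, the snippet below computes the usual z statistic for the difference between two average ranks, a one-sided normal p-value, and a Holm step-down decision. The number of models (14) is an assumption, chosen because with 35 trades it reproduces the z-score reported for Lasso (average rank 3.23) against Ridge (4.86); the one-sided p-value is likewise an assumption, and the p-values passed to `holm_reject` are illustrative.

```python
import math


def rank_z(rank_a, rank_b, n_models, n_datasets):
    """z statistic for the difference between two average ranks."""
    se = math.sqrt(n_models * (n_models + 1) / (6.0 * n_datasets))
    return (rank_b - rank_a) / se


def one_sided_p(z):
    """Upper-tail probability of a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure; returns one reject flag per hypothesis."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for step, i in enumerate(order):
        if p_values[i] <= alpha / (m - step):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject


# Lasso vs Ridge average ranks; 14 models is an assumed count
z = rank_z(3.23, 4.86, n_models=14, n_datasets=35)
p = one_sided_p(z)
decisions = holm_reject([p, 0.001, 0.0001])
```

With these inputs the standard error equals 1, so the z-score is simply the difference in average ranks, and the Lasso versus Ridge hypothesis is the only one of the three that is not rejected.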

When we look at the average rank, Lasso Regression was the top positioned (3.23) while Mean Pred remained most of the time the worst choice (12.86). The trader’s benchmarks performed well, being placed third and fourth. When we test whether the result attained by Lasso Regression was substantially different from that of Ridge Regression (4.86), we arrive at a Z-score equal to 1.63 and a p-value of 0.0517. If we set our initial significance level at 0.05 and correct it using the Holm procedure (last column), we can assert that Lasso did not perform significantly differently from Ridge Regression, but did perform significantly better than the other models. Therefore, Lasso Regression is capturing some information beyond that spanned by the trader’s benchmarks, as well as beating almost all other predictive models for this particular task. Our thorough analysis suggests that Lasso Regression provides, in general, the best results across a range of metrics and criteria. Figure 6 uses boxplots as a visualisation tool to decompose the aggregated results shown before, reporting per trade the returns obtained from using the Lasso Regression trading recommendation system.

Figure 4: Heatmap with the historical Average Return (%) during test set.

Overall, some patterns can be spotted from the boxplots: (i) in general the medians are located above zero, meaning that more than half of the trades tended to yield positive returns; (ii) Lasso Regression takes long/short positions regardless of the most frequent outcome for each MCCS tenure; (iii) it tended to perform well for EUR 3yXyXy trades, and since these have historically been a challenging pick (their medians are centred on zero), this suggests that the model is actually capturing some signal from the data and not naively guessing long/short positions; and (iv) the returns distributions, mostly those with higher forward and swap tenures (second and third rows), tended to be right-skewed, because the third quartile is far from the median whereas the first is squeezed towards the median. This last fact indicates that Lasso Regression frequently generates small negative outcomes while dangerous scenarios are less likely, which tends to be a desired property for quantitative strategies in general.

Figure 5: Heatmap with the Information Ratio during the test set.

Given that, we now look at the aggregated returns harvested by Lasso Regression during its test phase. These results are consolidated in Figure 7, where: (i) the top plot shows the average return with standard deviations obtained across all MCCS per trading week; (ii) the middle row shows histograms, where the left one represents the returns obtained across all MCCS regardless of the trading date, while the right ones display the same data conditioned on position; and (iii) the bottom row presents the trading success rate for each long/short position suggested by the model (left), with the breakdown by long/short position (right). To clarify, trading success in this context means being long/short when the returns of the trade were positive/negative, regardless of their magnitude.

Table 6: Average ranks, Friedman and Holm post-hoc statistical tests and analysis for Information Ratio.

Regarding the top image, we can see that Lasso Regression started well during the first year but suffered a drawdown in the second to third year. This period was marked by higher volatility, mainly due to the final developments of the Euro Crisis period (2010-2012). However, from the third year onwards the average returns were always positive, usually ranging from 10% to 20% on average. This performance is reflected in the middle left histogram, where the bulk of returns lies above zero and is concentrated close to 15%. It was largely generated by Lasso Regression suggesting short positions (middle right), while the long positions were not as successful. This pattern can be seen more clearly in the histogram at the bottom, where a bimodal distribution of the trading recommendation success rate is depicted. The outperformance from taking short positions in the MCCS is probably linked with betting against the volatility/variance risk-premium trade [6]. Roughly speaking, this strategy harvests the premium paid by a counterparty for insurance against large swings in the market (almost the same as selling a put in equity options). Since, in general, the market tends to remain range-bound, the investor shorting the trade can repurchase it later for a smaller premium, profiting from the differential. Lasso Regression dynamically did the opposite and profited from it, largely because the last 5-6 years were populated by higher-volatility periods and tail events.

Figure 8 helps us to analyse which features Lasso Regression finds most significant for each particular trade.

Each cell corresponds to a normalised t-stat from the model coefficients fitted in the last step of the k-rolling-cv. Implied Vol was the most significant feature pointed out by Lasso Regression, and it is negatively related to the MCCS returns. Other important features were the BE Width, slightly positively correlated with returns, and the Carry at Expiry, negatively related, but probably due to the depressed levels of carry seen in the last batch of data. Lasso Regression promoted, in general, very sparse models, with the other features playing specific roles for some trades, like Time Carry for short-dated trades. Finally, the lagged returns were relevant for just a few trades, and perhaps could be omitted for certain trades in the future to guarantee a broader dataset. We close this section by showing the results of Lasso Regression for a specific trade, EUR 4y5y5y (Figure 9).

Figure 6: Boxplot with the returns obtained from Lasso Regression for each instrument. All y-axes were fixed between 0.3 and -0.3 (30% and -30% annualised return) to facilitate their visualisation and comparison.

Starting from the top, it can be seen that Lasso Regression was able to predict reasonably well the observed returns after holding this trade for 1 year. No systematic over/underestimation of values is perceived, with the mirror-shaped format indicating a good fit. This observation is reflected in the middle image, where the long/short positions track well the observed returns of the EUR 4y5y5y. The main mistakes are due to events that were not incorporated into the model and perhaps were also unforecastable: (i) March to April 2012, possibly due to the Greek Debt Restructuring Agreement; and (ii) May to November 2013, linked with the Taper Tantrum event in the US and its effect on European rates.

Figure 7: Aggregated returns over the period and histogram of aggregated returns and success rate from trades suggested using Lasso Regression.

Although such events have influenced the strategy’s returns, the last plot shows that the trading success of Lasso Regression attained a historical Area Under the Curve (AUC) of 0.85, with the capacity to keep the false alarms at 5% while still recommending trades accurately 40% of the time. This is a good indicator since, at the onset of the recommendation system, it is better to reduce the chances of suggesting a bad trade than to avoid missing a good pick.
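For reference, an AUC of this kind can be computed directly from its pairwise definition, as in the sketch below. It assumes that each recommendation carries a score (for instance the signal strength) and a binary label marking whether the trade was a success; how scores and labels were actually built for Figure 9 is not restated here, so the inputs are purely illustrative and the function name is hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparisons.

    Equals the probability that a randomly chosen successful trade gets
    a higher score than a randomly chosen unsuccessful one (ties count
    as one half).
    """
    pos = [s for s, flag in zip(scores, labels) if flag]
    neg = [s for s, flag in zip(scores, labels) if not flag]
    if not pos or not neg:
        raise ValueError("need at least one success and one failure")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative scores: three of the four success/failure pairs are ordered correctly
example_auc = auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
```

A value of 0.5 corresponds to random ordering and 1.0 to a perfect separation of successful and unsuccessful recommendations.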

Figure 8: Heatmap with the Feature Significance (%) obtained from normalised t-stats of Lasso Regression coefficients.

Figure 9: Lasso Regression results for EUR 4y5y5y: predicted versus observed values, returns and long/short signals over the period and receiver operating characteristic curve based on the success rate.

5 Conclusions

This work proposed a trading recommendation system that can analyse and rank a set of fixed income derivatives trades. Our first experiment designs and applies this method for Mid-Curve Calendar Spread (MCCS) trades. We started the methodology by presenting the dataset: it comprised 35 MCCS trades, ranging from September 2006 to September 2016, with different expirations, forward and swap tenures. For each particular trade, we described how the inputs (metrics, sensitivities and lagged returns) and outputs (returns from unwinding the trade one year after its start) were sampled and computed on a weekly basis. Then, we presented the modelling strategy, highlighting the models that were trained as well as the hyperparameters that were investigated during the nested resampling step. Before entering the results section, we presented the backtesting setting with the performance measures used to compare different methodologies.

Most models provided better results than the modelling benchmarks (Mean and Naive), yet very few were able to outperform the traders' benchmarks. Our results suggested that linear models with shrinkage procedures (e.g., Ridge and Lasso) tended to perform better than their nonlinear counterparts (such as Kernel Ridge Regression, SVR and MLP). Regarding interpretability, they also tend to be easier to convey to traders, since most are versed in linear models. When we delved into the Lasso Regression results, we found that this model exhibited some interesting features: (i) it learned a type of volatility buying/selling strategy without being programmed to do so; (ii) its returns distribution across all MCCS tended to be right-skewed, meaning that we are better hedged against dangerous scenarios, with greater chances of upside; (iii) it matched the traders' view on selecting good trades, while adding a dynamic element to it, since Carry at Expiry is now negatively linked with returns, in contrast to the traders' original view. We believe that Lasso Regression will be our choice for a first version of the trading recommendation system, with future developments giving space to different models and mixed approaches.

Acknowledgment

Adriano Soares Koshiyama wants to acknowledge the funding for his PhD studies provided by the Brazilian Research Council (CNPq) through the Science Without Borders program. The authors would also like to thank Guillaume Andrieux, Tomoya Horiuchi, Gerald Rushton, Tam Rajendran, and Anthony Morris for all the comments and support during this research.

References

[1] Emmanuel Acar and Stephen Satchell. Advanced trading rules. Butterworth-Heinemann, 2002.

[2] Rebecca M. Baker, Tahani Coolen-Maturi, and Frank P. A. Coolen. Nonparametric predictive inference for stock returns. Journal of Applied Statistics, 44(8):1333–1349, 2017.

[3] C Bishop. Pattern Recognition and Machine Learning. Springer, New York, 2007.

[4] Damiano Brigo and Fabio Mercurio. Interest rate models-theory and practice: with smile, inflation and credit. Springer Science & Business Media, 2007.

[5] John Y Campbell, Andrew Wen-Chuan Lo, and Archie Craig MacKinlay. The econometrics of financial markets. Princeton University Press, 1997.

[6] Hoyong Choi, Philippe Mueller, and Andrea Vedolin. Bond variance risk premiums. Review of Finance, 21(3):987–1022, 2017.

[7] Eunsuk Chong, Chulwoo Han, and Frank C. Park. Deep learning networks for stock market analysis and prediction: Methodology, data representations, and case studies. Expert Systems with Applications, 83:187–205, 2017.

[8] Howard Corb. Interest Rate Swaps and Other Derivatives. Columbia University Press, 2012.

[9] Y. Deng, F. Bao, Y. Kong, Z. Ren, and Q. Dai. Deep direct reinforcement learning for financial signal representation and trading. IEEE Transactions on Neural Networks and Learning Systems, 28(3):653–664, 2017.

[10] Joaquín Derrac, Salvador García, Daniel Molina, and Francisco Herrera. A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms. Swarm and Evolutionary Computation, 1(1):3–18, 2011.

[11] Richard O. Duda, Peter E. Hart, and David G. Stork. Pattern Classification (2nd Edition). Wiley-Interscience, 2000.

[12] Graham Elliott, Antonio Gargano, and Allan Timmermann. Complete subset regressions. Journal of Econometrics, 177(2):357–373, 2013.

[13] Nick Firoozye and Qilong Zhang. Turbo carry zooms ahead: A performance update of the turbo-carry trades. Nomura International plc, Nomura Research, 2014.

[14] Nick Firoozye and Qilong Zhang. USD short-term front-end turbo carry USD 1m1y2y trades: Short horizon for sizeable carry. Nomura International plc, Nomura Research, 2014.

[15] Nick Firoozye and Xiaowei Zheng. Market update: Forward vol and mid-curve calendar spreads in USD and EUR recent levels and carry and trades of note. Nomura International plc, Nomura Research, 2016.

[16] Jerome Friedman, Trevor Hastie, and Robert Tibshirani. The elements of statistical learning, volume 1. Springer Series in Statistics. Springer, Berlin, 2001.

[17] R. Gencay and Min Qi. Pricing and hedging derivative securities with neural networks: Bayesian regularization, early stopping, and bagging. IEEE Transactions on Neural Networks, 12(4):726–734, 2001.

[18] Eduardo A. Gerlein, Martin McGinnity, Ammar Belatreche, and Sonya Coleman. Evaluating machine learning classification for financial trading: An empirical approach. Expert Systems with Applications, 54:193–207, 2016.

[19] Gyu-Sik Han and Jaewook Lee. Prediction of pricing and hedging errors for equity linked warrants with Gaussian process models. Expert Systems with Applications, 35(12):515–523, 2008.

[20] Simon S Haykin. Neural networks and learning machines, volume 3. Pearson, 2009.

[21] Rob Hyndman, Anne B Koehler, J Keith Ord, and Ralph D Snyder. Forecasting with exponential smoothing: the state space approach. Springer Science & Business Media, 2008.

[22] Andreas Karathanasopoulos, Konstantinos Athanasios Theofilatos, Georgios Sermpinis, Christian Dunis, Sovan Mitra, and Charalampos Stasinakis. Stock market prediction using evolutionary support vector machines: an application to the ASE20 index. The European Journal of Finance, 22(12):1145–1163, 2016.

[23] Christopher Krauss, Xuan Anh Do, and Nicolas Huck. Deep neural networks, gradient-boosted trees, random forests: Statistical arbitrage on the S&P 500. European Journal of Operational Research, 259(2):689–702, 2017.

[24] Burton G Malkiel. The efficient market hypothesis and its critics. The Journal of Economic Perspectives, 17(1):59–82, 2003.

[25] Masafumi Nakano, Akihiko Takahashi, and Soichiro Takahashi. Generalized exponential moving average (EMA) model with particle filtering and anomaly detection. Expert Systems with Applications, 73:187–200, 2017.

[26] Sheldon Natenberg. Option volatility and pricing: advanced trading strategies and techniques. McGraw Hill Professional, 2014.

[27] Hyejin Park, Namhyoung Kim, and Jaewook Lee. Parametric models and non-parametric machine learning models for predicting option prices: Empirical comparison study over KOSPI 200 index options. Expert Systems with Applications, 41(11):5227–5237, 2014.

[28] Hyejin Park and Jaewook Lee. Forecasting nonnegative option price distributions using Bayesian kernel methods. Expert Systems with Applications, 39(18):13243–13252, 2012.

[29] Riccardo Rebonato, Kenneth McKay, and Richard White. The SABR/LIBOR Market Model: Pricing, calibration and hedging for complex interest-rate derivatives. John Wiley & Sons, 2011.

[30] J. Beleza Sousa, M. L. Esquível, and R. M. Gaspar. Machine learning Vasicek model calibration with Gaussian processes. Communications in Statistics - Simulation and Computation, 41(6):776–786, 2012.

[31] Christian von Spreckelsen, Hans-Jörg von Mettenheim, and Michael H. Breitner. Real-time pricing and hedging of options on currency futures with artificial neural networks. Journal of Forecasting, 33(6):419–432, 2014.

[32] Tianle Zhou, Shangce Gao, Jiahai Wang, Chaoyi Chu, Yuki Todo, and Zheng Tang. Financial time series prediction using a dendritic neuron model. Knowledge-Based Systems, 105:214–224, 2016.

[33] Xiaocong Zhou, Jouchi Nakajima, and Mike West. Bayesian forecasting and portfolio decisions using dynamic dependent sparse factor models. International Journal of Forecasting, 30(4):963–980, 2014.
