
Algorithmic trading based on mean-variance optimization in Python


Learn how to create and implement trading strategies using Markowitz’s optimization!

This is the fifth part of a series of articles on backtesting trading strategies in Python. The previous ones described the following topics:

  • introducing the zipline framework and presenting how to test basic strategies (link)
  • importing custom data to use with zipline (link)
  • evaluating the performance of trading strategies (link)
  • implementing trading strategies based on Technical Analysis (link)

This time, the goal of the article is to show how to create trading strategies using Markowitz’s portfolio optimization and the Modern Portfolio Theory.

In this article, I first give a brief introduction/reminder on the mean-variance optimization and then show how to implement it into trading strategies. Just as before, I will backtest them using the zipline framework.

The Setup

For this article I use the following libraries:

zipline    1.3.0
matplotlib 3.0.0
json       2.0.9
empyrical  0.5.0
numpy      1.14.6
pandas     0.22.0
pyfolio    0.9.2

Primer on mean-variance optimization

In 1952, Harry Markowitz published 'Portfolio Selection', which described an investment theory now known as the Modern Portfolio Theory (MPT for short). Some of the key takeaways are:

  • the portfolio return is the weighted average of the individual portfolio constituents, however, the volatility is also impacted by the correlation between the assets
  • the investors should not evaluate the performance of the assets separately, but see how they would influence the performance of a portfolio
  • diversification (spreading the allocation over multiple assets instead of one or very few) can greatly decrease the portfolio’s volatility

We will not go deeply into the assumptions of MPT, but the main ones are that all investors share the goal of maximizing returns while avoiding as much risk as possible, that they can borrow and lend money at the risk-free rate without limits, and that transaction costs are not taken into account.

Based on all of the above, the mean-variance analysis is the process of finding optimal asset allocation that provides the best trade-off between the expected return and risk (measured as the variance of returns). A key concept connected to the mean-variance analysis is the Efficient Frontier — a set of optimal portfolios providing the highest expected portfolio return for a given level of risk — or framing it differently — providing the minimum level of risk for the expected portfolio return.

[Figure: Visualization of the Efficient Frontier (source: Wikipedia)]

Mathematically, one of the possible formulations of the problem is the following:

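$$
\begin{aligned}
\min_{w} \quad & w^{T} \Sigma w \\
\text{subject to} \quad & w^{T} \mu = \mu_{p} \\
& \mathbf{1}^{T} w = 1 \\
& w \geq 0
\end{aligned}
$$
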
where w is the vector of weights, μ is a vector of asset returns, Σ is the covariance matrix, and μ_p is the target expected portfolio return. Two of the constraints are:

  • non-negative weights (w ≥ 0): short-selling is not allowed
  • weights must sum up to 1: no leverage is allowed

To solve the problem and obtain the Efficient Frontier, we could define a range of possible expected portfolio returns and then for each value find the weights that minimize the variance. Fortunately, there is a library that makes the process very simple.

PyPortfolioOpt makes it possible to solve the entire optimization problem with only a few lines of code. In this article, we will create portfolios that either maximize the expected Sharpe ratio (portfolio’s excess return per unit of risk) or minimize the overall volatility. Both of these portfolios lie on the Efficient Frontier.

We present how to work with pypfopt in the following short example. First, we download the historical stock prices using yahoofinancials.

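A minimal sketch of this download step (assuming daily data for the four tickers used later in the article; the date range is illustrative):

import pandas as pd
from yahoofinancials import YahooFinancials

# download daily price data for the four stocks considered in this article
assets = ['TSLA', 'MSFT', 'FB', 'TWTR']
raw = YahooFinancials(assets).get_historical_price_data(
    '2015-01-02', '2017-12-31', 'daily')

# keep the adjusted close prices, one column per asset
prices_df = pd.DataFrame({
    asset: {row['formatted_date']: row['adjclose'] for row in raw[asset]['prices']}
    for asset in assets
})
prices_df.index = pd.to_datetime(prices_df.index)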

pypfopt allows us to easily calculate the expected returns and the covariance matrix directly from the prices, with no need for converting to returns beforehand.

# calculate expected returns and the sample covariance matrix
avg_returns = expected_returns.mean_historical_return(prices_df)
cov_mat = risk_models.sample_cov(prices_df)

We obtain the weights maximizing the Sharpe ratio by running the following lines of code:

# get weights maximizing the Sharpe ratio
ef = EfficientFrontier(avg_returns, cov_mat)
weights = ef.max_sharpe()
cleaned_weights = ef.clean_weights()
cleaned_weights

This results in the following weights:

{'FB': 0.03787, 'MSFT': 0.83889, 'TSLA': 0.0, 'TWTR': 0.12324}

For convenience, we use the clean_weights() method, as it truncates very small weights to zero and rounds the rest.
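
If we also want the expected statistics of this allocation, pypfopt exposes the portfolio_performance method, which reports the expected annual return, annual volatility, and Sharpe ratio of the chosen weights:

# expected annual return, annual volatility and Sharpe ratio of the optimized portfolio
ret, vol, sharpe = ef.portfolio_performance(verbose=True)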

Strategies

In this article we use the following problem setting:

  • the investor has a capital of $50,000
  • the investment horizon covers years 2016–2017
  • the investor can only invest in the following stocks: Tesla, Microsoft, Facebook, Twitter
  • we assume no transaction costs (see the setup sketch after this list)
  • there is no short selling (the investor can only sell what he/she currently owns)
  • when performing optimization, the investor considers 252 past trading days to calculate the historical returns and covariance matrix
  • the first trading decision is made on the last day of December, but the orders are executed on the first trading day of January 2016

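A rough sketch of how this setting maps onto a zipline backtest (the bundle name and exact dates are illustrative assumptions rather than the author's configuration):

import pandas as pd
from zipline import run_algorithm
from zipline.api import symbols, set_commission
from zipline.finance import commission

def initialize(context):
    context.assets = symbols('TSLA', 'MSFT', 'FB', 'TWTR')
    context.time = 0
    # zero transaction costs, as assumed in the problem setting
    set_commission(commission.PerShare(cost=0.0, min_trade_cost=0))

def handle_data(context, data):
    context.time += 1  # counter of trading days, used later for rebalancing

results = run_algorithm(
    start=pd.Timestamp('2015-12-31', tz='utc'),  # first decision made in late December
    end=pd.Timestamp('2017-12-31', tz='utc'),
    initialize=initialize,
    handle_data=handle_data,
    capital_base=50000,
    bundle='quandl',  # assumption: any bundle with daily US equities data works
)
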
Benchmark 1/n strategy

We start by creating a simple benchmark strategy: the 1/n portfolio. The idea is very simple: on the first day of the backtest, we allocate 1/n of our total capital to each of the n considered assets. To keep it simple, we do not do any rebalancing.

What often happens in practice is that the portfolio is rebalanced every X days to bring the allocation back to 1/n. Why? We can imagine that we hold a portfolio of two assets X and Y and at the beginning of the investment horizon the allocation is 50–50. Over a month, the price of X increased sharply while the price of Y decreased. As a result, asset X constitutes 65% of our portfolio’s worth, while Y has only 35%. We might want to rebalance back to 50–50 by selling some of X and buying more Y.
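
As a toy illustration of that drift (the price moves are made up to roughly reproduce the 65/35 split mentioned above):

# start with a 50-50 split of a $50,000 portfolio
value_x, value_y = 25000, 25000
value_x *= 1.40  # X rallies by 40% over the month
value_y *= 0.75  # Y drops by 25% over the month
total = value_x + value_y
print(round(value_x / total, 2), round(value_y / total, 2))  # ~0.65 and ~0.35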

The following plot shows the cumulative returns generated by the strategy.

[Figure: cumulative returns of the 1/n benchmark strategy]

We store some results to compare with the other strategies.

benchmark_perf = qf.get_performance_summary(returns)
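
get_performance_summary comes from the helper module developed earlier in this series; a rough stand-in based on empyrical (an assumption about what it computes, not the original helper) could look like this:

import empyrical as ep
import pandas as pd

def get_performance_summary(returns):
    # annualized statistics of a daily returns series
    return pd.Series({'annualized_returns': ep.annual_return(returns),
                      'annualized_volatility': ep.annual_volatility(returns),
                      'sharpe_ratio': ep.sharpe_ratio(returns),
                      'max_drawdown': ep.max_drawdown(returns)})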

Maximum Sharpe ratio portfolio — rebalancing every 30 days

In this strategy, the investor selects such weights that maximize the portfolio’s expected Sharpe ratio. The portfolio is rebalanced every 30 trading days.

We determine whether a given day is a rebalancing day by using the modulo operation (% in Python) on the current trading day's number (stored in context.time). We rebalance on days when the remainder after division by 30 is 0.
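
A condensed sketch of that logic inside handle_data (illustrative and building on the setup sketched earlier, not the author's exact code):

from zipline.api import order_target_percent
from pypfopt.expected_returns import mean_historical_return
from pypfopt.risk_models import sample_cov
from pypfopt.efficient_frontier import EfficientFrontier

def handle_data(context, data):
    context.time += 1
    # rebalance only every 30 trading days
    if context.time % 30 != 0:
        return
    # estimate the inputs from the past 252 trading days of prices
    prices = data.history(context.assets, 'price', 252, '1d')
    ef = EfficientFrontier(mean_historical_return(prices), sample_cov(prices))
    ef.max_sharpe()
    # adjust holdings so that they match the optimal weights
    for asset, weight in ef.clean_weights().items():
        order_target_percent(asset, weight)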

We will inspect the results of all the strategies at the end of the article. However, an interesting thing to see here is the weights allocation over time.

[Figure: portfolio weights over time for the maximum Sharpe ratio strategy rebalanced every 30 trading days]

Some of the insights from the plot:

  • there is virtually no investment in Twitter in this strategy
  • sometimes entire months are skipped, such as January 2016 or April 2016. That is because we are rebalancing every 30 trading days, while there are on average only 21 trading days in a month.

Maximum Sharpe ratio portfolio — rebalancing every month

This strategy is very similar to the previous one — we also select weights maximizing the portfolio's expected Sharpe ratio. The difference is in the rebalancing scheme. First, we define the rebalance method, which calculates the optimal weights and executes orders accordingly. Then, we schedule it using schedule_function. With the current setting, the rebalancing happens on the last trading day of the month (date_rules.month_end) after the market closes (time_rules.market_close).
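
A sketch of this scheduling (the body of rebalance mirrors the earlier sketch and is not the author's exact code):

from zipline.api import (schedule_function, date_rules, time_rules,
                         symbols, order_target_percent)
from pypfopt.expected_returns import mean_historical_return
from pypfopt.risk_models import sample_cov
from pypfopt.efficient_frontier import EfficientFrontier

def rebalance(context, data):
    prices = data.history(context.assets, 'price', 252, '1d')
    ef = EfficientFrontier(mean_historical_return(prices), sample_cov(prices))
    ef.max_sharpe()
    for asset, weight in ef.clean_weights().items():
        order_target_percent(asset, weight)

def initialize(context):
    context.assets = symbols('TSLA', 'MSFT', 'FB', 'TWTR')
    # rebalance on the last trading day of each month, around the market close
    schedule_function(rebalance, date_rules.month_end(), time_rules.market_close())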

We also inspect the weights over time:

[Figure: portfolio weights over time for the maximum Sharpe ratio strategy rebalanced monthly]

When rebalancing monthly, we do have entries for all the months. Also, in this case, there are some small investments in Twitter in mid-2017.

Minimum volatility portfolio — rebalancing every month

This time the investor selects the portfolio weights by minimizing the volatility. Thanks to PyPortfolioOpt, this is as easy as changing weights = ef.max_sharpe() to weights = ef.min_volatility() in the previous code snippet.
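
Restating the earlier snippet with the new objective:

ef = EfficientFrontier(avg_returns, cov_mat)
weights = ef.min_volatility()
cleaned_weights = ef.clean_weights()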

The weights generated by the minimum volatility strategy are clearly the most stable over time: there is much less turnover between consecutive rebalancing periods. That is certainly important once we account for transaction costs.

[Figure: portfolio weights over time for the minimum volatility strategy rebalanced monthly]

Comparing the performance

Comparing the performance summaries of the strategies, we see that over the duration of the backtest the strategy minimizing volatility achieved the best returns together with the lowest portfolio volatility. It also performed much better than the strategies maximizing the Sharpe ratio.

Another interesting observation is that all of the custom strategies created using optimization outperformed the naive 1/n allocation combined with buy-and-hold.

Conclusions

In this article, I showed how to combine zipline with pypfopt in order to backtest trading strategies based on mean-variance optimization. We only covered portfolios that either maximize the Sharpe ratio or minimize the overall volatility; however, there are many more possibilities.

Some of the possible future directions:

  • in the optimization scheme, account for the maximum potential changes in the allocation. With the zero-commission setup this is not a problem; however, in the presence of transaction costs we would like to avoid spending too much on fees if we completely rebalance every X days.
  • allow for short-selling
  • use custom objective functions in the optimization problem — optimizing using different evaluation metrics

It is crucial to remember that a strategy's good performance in the past is no guarantee that it will perform well in the future.

As always, any constructive feedback is welcome. You can reach out to me on Twitter or in the comments. You can find the code used for this article on my GitHub.

