The more we learned about the current crop of robo advisory firms, the more we realized we could do better. This brief blog post hits the high points of that thinking.
Not Just the Same Robo Advisory Technology
It appears that all major robo advisory companies use 50+ year-old MPT (modern portfolio theory). At Sigma1 we use so-called post-modern portfolio theory (PMPT), which is much more current. At the heart of PMPT is optimizing return versus semivariance. The details are not important to most people, but the takeaway is that PMPT, in theory, allows greater downside risk mitigation and does not penalize portfolios that have sharp upward jumps.
Robo advisors, we infer, must use some sort of Monte Carlo analysis to estimate “poor market condition” returns. We believe we have superior technology in this area too.
Finally, while most robo advisory firms offer tax loss harvesting, we believe we can 1) set up portfolios that do it better, 2) go beyond just tax loss harvesting to achieve greater portfolio tax efficiency.
The previous post showed after-tax results of a hypothetical 8% return portfolio. The primary weakness in this analysis was a missing bifurcation of return: dividends versus capital gains.
The analysis in this post adds the missing bifurcation. It is instructive to compare the two results. This new analysis accounts for the qualified dividends and assumes that these dividends are reinvested. It is an easy mistake to assume that, since the qualified dividend rate is identical to the capital gains rate, dividends are equivalent to capital gains on a post-tax basis. This assumption is demonstrably false.
Though both scenarios model a net 8% annual pre-tax return, the “6+2” model (6% capital appreciation, 2% dividend) shows a lower 6.98% after-tax return for the most tax-efficient scenario versus a 7.20% after-tax return for the capital-appreciation-only model. (The “6+2” model assumes that all dividends are re-invested post-tax.)
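The comparison above can be sketched in a few lines of Python. This is a minimal model under assumptions that are mine, not the original spreadsheet's: a $10,000 start with zero cost basis, dividends taxed at 20% in the year received and reinvested (which raises the basis), and a single liquidation after 30 years. The function name after_tax_growth and its parameters are hypothetical. With these conventions the capital-appreciation-only case lands on about 7.20%, while the "6+2" case lands a little below 7%, close to but not necessarily identical to the 6.98% quoted, since the exact spreadsheet conventions are not given.

```python
def after_tax_growth(principal=10_000.0, years=30, appreciation=0.06,
                     dividend_yield=0.02, tax_rate=0.20, basis=0.0):
    """Return (after-tax future value, annualized after-tax return).

    Assumptions (not from the source): dividends are paid once a year,
    taxed immediately at tax_rate, and reinvested; the whole position is
    sold once at the end and the remaining gain is taxed at tax_rate.
    """
    value = principal
    for _ in range(years):
        net_dividend = value * dividend_yield * (1.0 - tax_rate)
        value = value * (1.0 + appreciation) + net_dividend
        basis += net_dividend  # reinvested dividends raise the cost basis
    final_tax = tax_rate * max(value - basis, 0.0)
    net = value - final_tax
    return net, (net / principal) ** (1.0 / years) - 1.0


cap_only_value, cap_only_return = after_tax_growth(appreciation=0.08,
                                                   dividend_yield=0.0)
six_two_value, six_two_return = after_tax_growth()
print(f"8+0: ${cap_only_value:,.0f}  {cap_only_return:.2%}")
print(f"6+2: ${six_two_value:,.0f}  {six_two_return:.2%}")
```

The gap between the two rows comes entirely from the yearly tax drag on dividends, since both cases use the same 20% rate.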
This insight suggests an interesting strategy to potentially boost total after-tax returns. We can assume that our “6+2” model represents the expected 30-year average returns for a total US stock market index ETF like VTI. We can deconstruct VTI into a value half and a growth half. We then put the higher-dividend value half in a tax-sheltered account such as an IRA, while we leave the lower-dividend growth half in a taxable account.
This value/growth split only produces about 3% more return over 30 years, an additional future value of $2422 per $10,000 invested in this way.
While this value/growth split works, I suspect most investors would not find it to be worth the extra effort. The analysis above assumes that the growth half is a “7+1” model. In reality the split costs about 4 extra basis points of expense ratio: VTI has a 5 bps expense ratio, while the growth and value ETFs all have 9 bps expense ratios. This cuts the 10 bps per year after-tax boost to only 6 bps. Definitely not worth the hassle.
Now, consider the ETF Global X SuperDividend ETF (SDIV) which has a dividend yield of about 5.93%. Even if all of the dividends from this ETF receive qualified-dividend tax treatment, it is probably better to hold this ETF in a tax-sheltered account. All things equal it is better to hold higher yielding assets in a tax-sheltered account when possible.
Perhaps more important is to hold assets that you are likely to trade more frequently in a tax-sheltered account and assets that you are less likely to trade in a taxable account. The trick then is to be highly disciplined to not trade taxable assets that have appreciated (it is okay to sell taxable assets that have declined in value — tax loss harvesting).
The graph shows the benefits of long-term discipline on after-tax return, and the potential costs of a lack of trading discipline. Of course this whole analysis changes if capital gains tax rates are increased in the future; one hopes one will have sufficient advance notice to take “evasive” action. It is also possible that one could be blindsided by tax-raising surprises that give no advance notice or are even retroactive! Unfortunately there are many forms of tax risk, including the very real possibility of future tax increases.
When asked to consider tax deferral investment strategies, many people instinctively conclude that tax deferral benefits the investor at the expense of the government. Such a belief is half-right. Tax deferral ultimately benefits both the investor and the government’s tax revenues. While there are exceptions involving inheritance, in most other cases both parties benefit. Figure 1 summarizes the relationship between higher after-tax returns and higher nominal net cash flows to the government.
The reason I lead with the government’s side of the tax equation is for tax policy wonks in Washington D.C. I suspect many of them already know this information, and this is simply another data point to add to their arsenal of tax facts. For the others, I hope this is a wake-up call. The message:
When investors, investment advisors, and fund managers successfully defer long-term capital gains, investors and governments win in the long run.
The phrase “in the long run” is important. When taxes are deferred, the government’s share grows along with the investor’s. In the short term, taxes are reduced; in the long run taxes are increased. For the investor this long-run tax increase is more than offset by increased compounding of return.
Please note that all of these win/win outcomes occur under an assumption of fixed tax rates (20% in this example). It is also worth noting that these outcomes occur for funds that are spent at any point in the investor’s lifetime. This analysis does not necessarily apply to taxable assets that are passed on via inheritance.
Critical observers may acknowledge the government tax “win” holds for nominal tax dollars, but wonder whether it still holds in inflation-adjusted terms. The answer is “yes” so long as the investor’s (long-run) pre-tax returns exceed the (long-run) rate of inflation. In other words, so long as g > i (g is pre-tax return, i is inflation), the yellow line will be upward sloping; more effective tax-deferral strategies, with higher post-tax returns, will benefit both parties. As inflation increases the slope of the yellow line gets flatter, but it retains an upward slope so long as pre-tax return is greater than inflation.
Tax Advantages for Investors
Responsible investors face many challenges when trying to preserve and grow wealth. Among these challenges are taxes and inflation. I will start by addressing two important maxims in managing investment taxes:
Avoid net realized short-term (ST) gains
Defer net long-term gains as long as possible
It is okay to realize some ST gains, however it is important to offset those gains with capital losses. The simplest way of achieving this offset is to realize an equal or greater amount of ST capital losses within the same tax year. ST capital losses directly offset ST capital gains.
A workable, but more complex way of offsetting ST gains is with net LT capital losses. The term net is crucial here, as LT capital losses can only be used to offset ST capital gains once they have first been used to offset LT capital gains. It is only LT capital losses in excess of LT capital gains that offset ST gains.
If the above explanation makes your head spin, you are not alone. Managing capital gains is really an exercise in linear programming. In order to make this tax exercise less (mentally) taxing, here are some simple concepts to help:
ST capital losses are better than LT capital losses
ST capital gains are worse than LT capital gains
When possible offset ST losses with ST gains
Because ST capital losses are better than LT, it often makes sense to see how long you have held assets that have larger paper (unrealized) losses. All things equal it is better to “harvest” the losses from the ST losers than from the LT losers.
Managing net ST capital gains can potentially save you a large amount of taxes, resulting in higher post-tax returns.
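The netting order described in this section can be written down as a small function. This is a simplified sketch: it nets within each holding-period bucket first and only then lets a net loss in one bucket offset a net gain in the other. It deliberately ignores details the text does not cover, such as loss carryforwards and any offset against ordinary income. The function name net_capital_gains is hypothetical.

```python
def net_capital_gains(st_gains, st_losses, lt_gains, lt_losses):
    """Return (net_st, net_lt) after cross-netting.

    All inputs are non-negative dollar amounts. A negative result means an
    unused net loss of that character (simplified: no carryforward rules).
    """
    net_st = st_gains - st_losses  # ST losses offset ST gains first
    net_lt = lt_gains - lt_losses  # LT losses offset LT gains first
    if net_st * net_lt < 0:        # one bucket is a gain, the other a loss
        total = net_st + net_lt
        gain_is_st = net_st > 0
        # The remainder keeps the character of whichever side was larger.
        if (total >= 0) == gain_is_st:
            net_st, net_lt = total, 0.0
        else:
            net_st, net_lt = 0.0, total
    return net_st, net_lt


# $5,000 of ST gains, $2,000 of LT gains, $3,000 of LT losses:
# only the $1,000 of LT losses in excess of LT gains reduces the ST gain.
print(net_capital_gains(5_000, 0, 2_000, 3_000))
```

In the usage line, the LT losses first absorb the LT gains, and only the excess reaches the ST gain, which is the "net" behavior the text stresses.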
Tax Advantages for the Patient Investor
Deferring LT capital gains requires patience and discipline. Motivation can help reinforce patience. For motivation we go back to the example used to create Figure 1. The example starts today with $10,000 investment in a taxable account and a 30-year time horizon. The example assumes a starting cost basis of zero and an annual return of 8%.
This example was set up to help answer the question: “What is the impact of ‘tax events’ on after-tax returns?” To keep things as simple as possible a “tax event” is an event that triggers a long-term capital gains tax realization in a tax year. Also, in all cases, the investor liquidates the account at the start of year 31. (This year-31 sale is not counted in the tax event count.)
It turns out that it is not just the number of tax events that matters; it is also the timing. To capture some of this timing-dependent behavior, I set up my spreadsheets to model two different timing modes. The first is called “stacked” and it simply stacks all tax events in back-to-back years. The second mode is called “spaced” because the tax events are spaced uniformly. Thus 2 stacked tax events occur in years 1 and 2, while 2 spaced tax events occur in years 10 and 20. The results are interesting:
The most important thing to notice is that if an investor can completely avoid all “tax events” for 30 years the (compound) after-tax return is 7.2% per year, but if the investor triggers just one taxable event the after-tax return is significantly reduced. A single “stacked” tax event in year 1 reduces after-tax returns to 6.49% while a single “spaced” tax event in year 15 reduces returns to 6.67%. Thus for a single event the spaced tax event curve is higher, while for all other numbers of tax events (except 30 where they are identical) it is lower than the stacked-event curve.
The main take-away from this graph is that tax deferral discipline matters. The difference between 7.2% and 6.67% after-tax return, over thirty years is huge when framed in dollar terms. With zero (excess) tax events the after-tax result in year 31 is $80,501. With one excess tax event (with the least harmful timing!) that sum drops to $69,476.
In the worst case the future value drops to $51,444 with an annual compound after-tax return of only 5.61%.
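The tax-event example can be approximated with a short simulation. This is a sketch, not the original spreadsheet: the function name simulate and the timing convention are my assumptions. An event at time t means "realize all gains after t full years of growth, pay 20% tax, and reinvest the proceeds at the new basis"; the account is always liquidated after 30 years. Under this convention, no events should give about $80,501, one event at t=15 about $69,476, and an event at the start of every year (t=0 through 29) about $51,444. The exact year indexing the spreadsheet used for "stacked" versus "spaced" events is inferred, not stated in the text.

```python
def simulate(event_times=(), principal=10_000.0, basis=0.0,
             growth=0.08, tax_rate=0.20, years=30):
    """Return (after-tax value, total tax paid, annualized after-tax return)."""
    events = set(event_times)
    value, taxes = principal, 0.0
    for t in range(years):
        if t in events:  # tax event after t full years of growth
            tax = tax_rate * max(value - basis, 0.0)
            value -= tax
            taxes += tax
            basis = value  # proceeds are reinvested, so basis steps up
        value *= 1.0 + growth
    tax = tax_rate * max(value - basis, 0.0)  # final liquidation
    value -= tax
    taxes += tax
    return value, taxes, (value / principal) ** (1.0 / years) - 1.0


for label, times in [("no events", ()), ("one event, t=15", (15,)),
                     ("one event, t=0", (0,)), ("every year", range(30))]:
    net, tax_paid, annual = simulate(times)
    print(f"{label:>16}: net ${net:,.0f}  tax ${tax_paid:,.0f}  {annual:.2%}")
```

The second column is the government's nominal take. In this sketch it is largest when the investor defers everything, which is the win/win relationship described for Figure 1.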
Tax Complexity, Tax Modeling Complexity, and Other Factors
One of the challenges faced when bringing fresh perspectives to the tax-plus-investing dialog is in providing examples that paint the broad portfolio tax management themes in a concise way. The first challenge is that the tax code is constantly changing, so predicting future tax rates and tax rules is an imprecise game at best. The second challenge is that the tax code is so complex that any generalization will most likely have a counterexample buried somewhere in the tax code. The third complication is that, barring significant future tax code changes and obscure tax code counterexamples, creating a one-size-fits-all model for investors results in large oversimplifications.
I believe that tax indifference is the wrong answer to the question of portfolio tax optimization. The right answer is more closely aligned with the maxim:
All models are wrong. Some are useful.
This common saying in statistics gets to the heart of the problem and the opportunity of investment tax management. It is better to build a model that gives deeper insight into opportunities that exist in reconciling prudent tax planning with prudent investment management, than to build no model at all.
The simple tax model used in this blog post makes some broad assumptions. Among these is that the long-term capital gains rate will be the same for 30 years and that the investor will occupy the same tax bracket for 30 years. The pre-tax return model is also very simple: 8% pre-tax return each and every year.
I argue that models as simple as this are still useful. They illustrate investment tax-management principles in a manner that is clear and draws the same conclusions as analysis using more complex tax modelling. (Complex models also have their place.)
I would like to highlight the oversimplification I think is most problematic from a tax perspective. The model assumes all the returns (8% per year) are in the form of capital appreciation. A better “8%” model would be to assume a 2% dividend and 6% capital appreciation. Dividends, even though receiving qualified-dividend tax treatment, would bring down the after-tax returns, especially on the left side of the curve. I will likely remedy that oversimplification in a future blog post.
Investment Tax Management Summary
Tax deferral does not hurt government revenues; it helps in the long run.
Realized net short-term capital gains can crater post-tax investment returns and should be avoided.
Deferral of (net) long-term capital gains can dramatically improve after-tax returns.
Tax deferral strategies require serious investment discipline to achieve maximum benefit.
Even simple tax modelling is far better than no tax modelling at all. Simple tax models can be useful and powerful. Nonetheless, investment tax models can and should be improved over time.
In order to get close to bare-metal access to your compute hardware, use C. In order to utilize powerful, tested, convex optimization methods use CVXGEN. You can start with this CVXGEN code, but you’ll have to retool it…
Discard the (m,m) matrix for an (n,n) matrix. I prefer to still call it V, but Sigma is fine too. Just note that there is a major difference between Sigma (the covariance-variance matrix) and sigma (individual asset-return variances matrix; the diagonal of Sigma).
Go meta for the efficient frontier (EF). We’re going to iteratively generate/call CVXGEN with multiple scripts. The differences will be w.r.t the E(Rp).
Computing Max: E(Rp) is easy, given α. [I’d strongly recommend renaming this to something like expect_ret comprised of (r1, r2, … rn). Alpha has too much overloaded meaning in finance].
[Rmax] The first computation is simple. Maximize E(Rp) s.t constraints. This is trivial and can be done w/o CVXGEN.
[Rmin] The first CVXGEN call is the simplest. Minimize σp² s.t. constraints, but ignoring E(Rp).
Using Rmin and Rmax, iteratively call CVXGEN q times (i=1 to q) using the additional constraint s.t. Rp_i = Rmin + (i/(q+1))*(Rmax-Rmin). This will produce q+2 portfolios on the EF [including Rmin and Rmax]. [Think of each step (1/(q+1))*(Rmax-Rmin) as a quantization of intermediate returns.]
Present, as you see fit, the following data…
(w0, w1, …wq+1)
[ E(Rp_0), …E(Rp_(q+1)) ]
[ σ(Rp_0), …σ(Rp_(q+1)) ]
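The return quantization in the steps above is easy to get off by one, so here is a small sketch of just that piece. It only builds the q+2 target returns; the actual minimize-variance solve at each target (the CVXGEN call) is left out. The function name frontier_targets is hypothetical.

```python
def frontier_targets(r_min, r_max, q):
    """Return the q+2 target returns Rp_0 .. Rp_(q+1).

    Rp_i = r_min + (i / (q + 1)) * (r_max - r_min), so i = 0 gives r_min,
    i = q + 1 gives r_max, and i = 1..q are the intermediate solver calls.
    """
    if q < 0 or r_max < r_min:
        raise ValueError("need q >= 0 and r_max >= r_min")
    step = (r_max - r_min) / (q + 1)
    return [r_min + i * step for i in range(q + 2)]


# Example: Rmin = 4%, Rmax = 10%, q = 5 intermediate portfolios.
targets = frontier_targets(0.04, 0.10, 5)
print([f"{t:.2%}" for t in targets])
```

Each interior target would be passed to the solver as the extra equality constraint on E(Rp), and the returned weights, expected returns, and standard deviations collected for presentation.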
My point is that — in two short blog posts — I’ve hopefully shown how easily-accessible advanced MVO portfolio optimization has become. In essence, you can do it for “free”… and stop paying for simple MVO optimization… so long as you “roll your own” in house.
I do this for the following reasons:
To spread MVO to the “masses”
To prompt firms to consider, given that “anyone with a master’s in finance and a computer can do MVO for free,” what their quantitative portfolio-optimization differentiation (AKA portfolio risk management differentiation) is, if any
To emphasize that this and the previous blog will not greatly help with semi-variance portfolio optimization
I ask you to consider that you, as one of the few that read this blog, have a potential advantage. You know who to contact for advanced, relatively-inexpensive SVO software. Will you use that advantage?
The Equation Everyone in Finance Should Know, but Many Probably Don’t!
Here it is:
… With thanks to codecogs.com which makes it really easy to write equations for the web.
This simple matrix equation is extremely powerful. This is really two equations. The first is all you really need. The second is just merely there for illustrative purposes.
This formula says how the variance of a portfolio can be computed from the position weights wᵀ = [w1 w2 … wn] and the covariance matrix V.
σii ≡ σi² = Var(Ri)
σij ≡ Cov(Ri, Rj) for i ≠ j
The second equation is actually rather limiting. It represents the smallest possible example to clarify the first equation — a two-asset portfolio. Once you understand it for 2 assets, it is relatively easy to extrapolate to 3-asset portfolios, 4-asset portfolios, and before you know it, n-asset portfolios.
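For readers who prefer code to matrix notation, the computation described above can be sketched in plain Python. The standard form assumed here is σp² = wᵀVw, with the two-asset expansion w1²σ1² + w2²σ2² + 2·w1·w2·σ12. The function names and the sample numbers are illustrative only.

```python
def portfolio_variance(w, V):
    """Portfolio variance as the double sum of w[i] * V[i][j] * w[j]."""
    n = len(w)
    if any(len(row) != n for row in V) or len(V) != n:
        raise ValueError("V must be an n x n matrix matching len(w)")
    return sum(w[i] * V[i][j] * w[j] for i in range(n) for j in range(n))


def two_asset_variance(w1, w2, var1, var2, cov12):
    """The two-asset special case, written out term by term."""
    return w1 ** 2 * var1 + w2 ** 2 * var2 + 2 * w1 * w2 * cov12


# Illustrative inputs: 60/40 weights, variances 0.04 and 0.09, covariance 0.006.
w = [0.6, 0.4]
V = [[0.04, 0.006],
     [0.006, 0.09]]
general = portfolio_variance(w, V)
special = two_asset_variance(0.6, 0.4, 0.04, 0.09, 0.006)
print(general, special)
```

The general function does not care whether n is 2 or 1000, which is the point the text makes about extrapolating from the two-asset case.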
Now I show the truly powerful “naked” general form equation:
This is really all you need to know! It works for 50-asset portfolios. For 100 assets. For 1000. You get the point. It works in general. And it is exact. It is the E = mc² of Modern Portfolio Theory (MPT). It is at least about 55 years old (2014 – 1959), while E = mc² is about 109 years old (2014 – 1905). Harry Markowitz, the Father of (M)PT simply called it “Portfolio Theory” because:
There’s nothing modern about it.
Yes, I’m calling Markowitz the Einstein of Portfolio Theory AND of finance! (Now there are several other “post”-Einstein geniuses… Bohr, Heisenberg, Feynman… just as there are Sharpe, Scholes, Black, Merton, Fama, French, Shiller, [Graham?, Buffett?]…) I’m saying that a physicist who doesn’t know E = mc² is not much of a physicist. You can read between the lines for what I’m saying about those that dabble in portfolio theory… with other people’s money… without really knowing (or using) the financial analog.
Why Markowitz is Still “The Einstein” of Finance (Even if He was “Wrong”)
Markowitz said that “downside semi-variance” would be better. Sharpe said “In light of the formidable computational problems…[he] bases his analysis on the variance and standard deviation.”
Today we have no such excuse. We have more than sufficient computational power on our laptops to optimize for downside semi-variance, σd. There is no such tidy, efficient equation for downside semi-variance. (At least not that anyone can agree on… and none that is exact in any sense of any reasonable mathematical definition of the word ‘exact’.)
Fama and French improve upon Markowitz (M)PT [I say that if M is used in MPT, it should mean “Markowitz,” not “modern”, but I digress.] Shiller, however, decimates it. As does Buffett, in his own applied way. I use the word decimate in its strict sense… killing one in ten. (M)PT is not dead; it is still useful. Diversification still works; rational investors are still risk-averse; and certain low-beta investments (bonds, gold, commodities…) are still poor very-long-term (20+ year) investments in isolation and relative to stocks, though they still can serve a role as Markowitz Portfolio Theory suggests.
Wanna Build your Own Optimizer (for Mean-Return Variance)?
This blog post tells you most of the important bits. I don’t really need to write part 2, do I? Not if you can answer these relatively easy questions…
What is the matrix expression for computing E(Rp) based on w?
What simple constraint is w subject to?
How does the general σp² equation relate to the efficient frontier?
How might you adapt the general equation to efficiently compute the effects of a Δw event where wi increases and wj decreases? (Hint: “cache” the wx terms that don’t change.)
What other constraints may be imposed on w or subsets (asset categories within w)? How will you efficiently deal with these constraints?
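As one possible answer to the Δw question (a sketch, not necessarily what the author had in mind): if weight δ moves from asset j to asset i, the new variance follows from the old one plus two correction terms, provided the product Vw is cached and V is symmetric. The update is σp²(new) = σp²(old) + 2δ((Vw)i − (Vw)j) + δ²(Vii + Vjj − 2Vij). All function names below are hypothetical.

```python
def matvec(V, w):
    """Cacheable product Vw, computed once in O(n^2)."""
    return [sum(row[k] * w[k] for k in range(len(w))) for row in V]


def variance_from_cache(w, Vw):
    """w'Vw using the cached Vw, O(n)."""
    return sum(wi * vi for wi, vi in zip(w, Vw))


def shifted_variance(var, Vw, V, i, j, delta):
    """Variance after moving `delta` of weight from asset j to asset i, O(1).

    Assumes V is symmetric, as a covariance matrix is.
    """
    return (var
            + 2.0 * delta * (Vw[i] - Vw[j])
            + delta ** 2 * (V[i][i] + V[j][j] - 2.0 * V[i][j]))


V = [[0.04, 0.006, 0.010],
     [0.006, 0.09, 0.020],
     [0.010, 0.020, 0.16]]
w = [0.5, 0.3, 0.2]
Vw = matvec(V, w)
base = variance_from_cache(w, Vw)
fast = shifted_variance(base, Vw, V, i=0, j=2, delta=0.05)

# Cross-check against a full recomputation with the shifted weights.
w_new = [0.55, 0.3, 0.15]
slow = variance_from_cache(w_new, matvec(V, w_new))
print(fast, slow)
```

The constant-time update matters when an optimizer evaluates many small pairwise trades against the same starting portfolio.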
The red and green “clover” pattern illustrates how traditional risk can be modeled. The red “leaves” are triggered when both the portfolio and the “other asset” move together in concert. The green leaves are triggered when the portfolio and asset move in opposite directions.
Each event represents a moment in time, say the closing price for each asset (the portfolio or the new asset). A common time period is 3-years of total-return data [37 months of price and dividend data reduced to 36 monthly returns.]
When a portfolio manager considers adding a new asset to an existing portfolio, she may wish to see how that asset’s returns would have interacted with the rest of the portfolio. Would this new asset have made the portfolio more or less volatile? Risk can be measured by looking at the time-series return data. Each time the asset and the portfolio are in the red, risk is added. Each time they are in the green, risk is subtracted. When all the reds and greens are summed up there is a “mathy” term for this sum: covariance. “Variance” as in change, and “co” as in together. Covariance means the degree to which two items move together.
If there are mostly red events, the two assets move together most of the time. Another way of saying this is that the assets are highly correlated. Again, that is “co” as in together and “related” as in relationship between their movements. If, however, the portfolio and asset move in opposite directions most of the time, the green areas, then the covariance is lower, and can even be negative.
It is not only whether the two assets move together or apart; it is also the degree to which they move. Larger movements in the red region result in larger covariance than smaller movements. Similarly, larger movements in the green region reduce covariance. In fact it is the product of movements that affects how much the sum of covariance is moved up and down. Notice how the clover-leaf leaves move to the center, (0,0), if either the asset or the portfolio doesn’t move at all. This is because the product of zero times anything must be zero.
Getting Technical: The clover-leaf pattern relates to the angle between each pair of asset movements. It does not show the effect of the magnitude of their positions.
If the incremental covariance of the asset to the portfolio is less than the variance of the portfolio, a portfolio that adds the asset would have had lower overall variance (historically). Since there is a tendency (but no guarantee!) for assets’ correlations to remain somewhat similar over time, the portfolio manager might use the covariance analysis to decide whether or not to add the new asset to the portfolio.
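The "sum of products of movements" idea can be sketched directly. The code below computes a sample covariance from two return series (movements are measured from each series' mean, which is an assumption about the picture), and then the variance of a blend that puts a small weight in the new asset. The return numbers are made up for illustration, and the function names are hypothetical.

```python
def covariance(xs, ys):
    """Sample covariance: average product of deviations from the mean."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def blended_variance(var_p, var_a, cov_pa, weight_a):
    """Variance of (1 - weight_a) * portfolio + weight_a * asset."""
    wp = 1.0 - weight_a
    return wp ** 2 * var_p + weight_a ** 2 * var_a + 2 * wp * weight_a * cov_pa


# Hypothetical periodic returns for an existing portfolio and a candidate asset.
portfolio = [0.02, -0.01, 0.03, -0.02, 0.01]
asset = [-0.01, 0.02, -0.01, 0.01, 0.00]

var_p = covariance(portfolio, portfolio)  # variance is covariance with itself
var_a = covariance(asset, asset)
cov_pa = covariance(portfolio, asset)     # negative here: mostly "green" events
blend = blended_variance(var_p, var_a, cov_pa, weight_a=0.10)
print(var_p, cov_pa, blend)
```

Because the covariance in this made-up sample is below the portfolio's own variance, a 10% position in the asset lowers the historical variance of the blend, which is the decision rule described in the text.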
Semi-Variance: Another Way to Measure Risk
After staring at the covariance visualization, something may strike you as odd: when the portfolio and the asset move UP together, this increases the variance. Since variance is used as a measure of risk, that is like counting positive returns as risk.
Most ordinary investors would not consider the two assets going up together to be a bad thing. In general they would consider this to be a good thing.
So why do many (most?) risk measures use a risk model that resembles the red and green cloverleaf? Two reasons: 1) It makes the math easier, 2) history and inertia. Many (most?) textbooks today still define risk in terms of variance, or its related cousin standard deviation.
There is an alternative risk measure: semi-variance. The multi-colored cloverleaf, which I will call the yellow-grey cloverleaf, is a visualization of how semi-variance is computed. The grey leaf indicates that events that occur in that quadrant are ignored (multiplied by zero). So far this is where most academics agree on how to measure semi-variance.
Variants on the Semi-Variance Theme
However differences exist on how to weight the other three clover leaves. It is well-known that for measuring covariance each leaf is weighted equally, with a weight of 1. When it comes to quantifying semi-covariance, methods and opinions differ. Some favor a (0, 0.5, 0.5, 1) weighting scheme where the order is weights for quadrants 1, 2, 3, and 4 respectively. [As a decoder ring Q1 = grey leaf, Q2 = green leaf, Q3 = red leaf, Q4 = yellow leaf].
Personally, I favor weights (0, 3, 2, -1) for the asset versus portfolio semi-covariance calculation. For asset vs asset semi-covariance matrices, I favor a (0, 1, 2, 1) weighting. Notice that in both cases my weighting scheme results in an average weight per quadrant of 1.0, just like for regular covariance calculations.
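The quadrant weighting schemes above can be expressed as a single function with a four-element weight tuple. This is a sketch under several assumptions that the text does not settle: movements are measured from each series' mean, x is the portfolio and y is the asset, and quadrants are numbered in the usual counterclockwise way (Q1 both up, Q2 x down and y up, Q3 both down, Q4 x up and y down). If the original figures use a different axis assignment, Q2 and Q4 swap. The function names are hypothetical.

```python
def quadrant_index(dx, dy):
    """0-based index of the quadrant (Q1..Q4) containing the point (dx, dy)."""
    if dx >= 0 and dy >= 0:
        return 0  # Q1: both up (the ignored "grey" leaf)
    if dx < 0 and dy >= 0:
        return 1  # Q2
    if dx < 0 and dy < 0:
        return 2  # Q3: both down
    return 3      # Q4


def weighted_semicovariance(xs, ys, weights=(0.0, 0.5, 0.5, 1.0)):
    """Quadrant-weighted average of products of deviations from the mean.

    weights = (1, 1, 1, 1) reduces to the ordinary sample covariance.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    total = 0.0
    for x, y in zip(xs, ys):
        dx, dy = x - mx, y - my
        total += weights[quadrant_index(dx, dy)] * dx * dy
    return total / (n - 1)


xs = [0.03, -0.02, 0.01, -0.02]  # made-up portfolio returns
ys = [0.02, -0.01, -0.02, 0.01]  # made-up asset returns
plain = weighted_semicovariance(xs, ys, weights=(1, 1, 1, 1))
no_upside = weighted_semicovariance(xs, ys, weights=(0, 1, 1, 1))
print(plain, no_upside)
```

Passing (0, 3, 2, -1) or (0, 1, 2, 1) as weights gives the author's preferred schemes under the same quadrant assumptions.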
Financial Industry Moving toward Semi-Variance (Gradually)
Semi-variance more closely resembles how ordinary investors view risk. Moreover it also mirrors a concept economists call “utility.” In general, losing $10,000 is more painful than gaining $10,000 is pleasurable. Additionally, losing $10,000 is more likely to adversely affect a person’s lifestyle than gaining $10,000 is to help improve it. This is the concept of utility in a nutshell: losses and gains have an asymmetrical impact on investors. Losses have a bigger impact than gains of the same size.
Semi-variance optimization software is generally much more expensive than variance-based (MVO mean-variance optimization) software. This creates an environment where larger investment companies are better equipped to afford and use semi-variance optimization for their investment portfolios. This too is gradually changing as more competition enters the semi-variance optimization space. My guestimate is that currently about 20% of professionally-managed U.S. portfolios (as measured by total assets under management, AUM) are using some form of semi-variance in their risk management process. I predict that that percentage will exceed 50% by 2018.
I start with a hypothetical. You are choosing between three portfolios A, B, and C. If you could know with certainty one of the following annual risk measures, which would you choose:
For me the choice is obvious: max drawdown. Variance and semi-variance are deliberately decoupled from return. In fact, we often say variance as short-hand for mean-return variance. Similarly, semi-variance is short-hand for mean-return semi-variance. For each variance flavor, mean-returns — average returns — are subtracted from the risk formula. The mathematical bifurcation of risk and return is deliberate.
Max drawdown blends return and risk. This is mathematically untidy — max drawdown and return are non-orthogonal. However, the crystal ball of max drawdown allows choosing the “best” portfolio because it puts a floor on loss. Tautologically the annual loss cannot exceed the annual max drawdown.
My revised answer stretches the rules. If all three portfolios have future max drawdowns of less than 5 percent, then I’d like to know the semi-variances.
Of course there are no infallible crystal balls. Such choices are only hypothetical.
Past variance tends to be reasonably predictive of future variance; past semi-variance tends to predict future semi-variance to a similar degree. However, I have not seen data about the relationship between past and future drawdowns.
Research Opportunities Regarding Max Drawdown
It turns out that there are complications unique to max drawdown minimization that are not present with MVO or semi-variance optimization. However, at Sigma1, we have found some intriguing ways around those early obstacles.
That said, there are other interesting observations about max drawdown optimization:
1) Max drawdown only considers the worst drawdown period; all other risk data is ignored.
2) Unlike V or SV optimization, longer historical periods increase the max drawdown percentage.
3) There is a scarcity of evidence of the degree (or lack) of relationship between past max drawdowns and future.
(#1) can possibly be addressed by using hybrid risk measures such as combined semi-variance and max drawdown measures. (#2) can be addressed by standardizing max drawdowns… a simple standardization would be DDnorm = DD/num_years. Another possibility is DDnorm = DD/sqrt(num_years). (#3) Requires research. Research across different time periods, different countries, different market caps, etc.
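For concreteness, here is a minimal sketch of max drawdown on a series of portfolio values, together with the two normalizations suggested above. It uses the common peak-to-trough definition, which is an assumption since the text does not spell one out. The function names are hypothetical.

```python
import math


def max_drawdown(values):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    if not values:
        raise ValueError("need at least one value")
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)                      # highest value seen so far
        worst = max(worst, (peak - v) / peak)    # decline from that peak
    return worst


def normalized_drawdown(dd, num_years, use_sqrt=False):
    """DD / num_years, or DD / sqrt(num_years) when use_sqrt is True."""
    return dd / (math.sqrt(num_years) if use_sqrt else num_years)


values = [100, 120, 90, 110, 80, 130]  # made-up year-end portfolio values
dd = max_drawdown(values)              # peak 120, trough 80
print(dd, normalized_drawdown(dd, 4), normalized_drawdown(dd, 4, use_sqrt=True))
```

Note that only the single worst decline survives, which is observation (#1): every other dip in the series is ignored.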
Also note that drawdown has many alternative flavors — cumulative drawdown, weighted cumulative drawdown (WCDD), weighted cumulative drawdown over threshold — just to name three.
The bottom line is that early adopters have embraced semi-variance based optimization and the trend appears to be snowballing. For instance, Morningstar now calculates risk “with an emphasis on downward variation.” I believe that drawdown measures, either stand-alone or hybridized with semi-variance, are the future of post post modern portfolio theory.
Bye PMPT. Time for a Better Name! Contemporary Portfolio Theory?
I recommend starting with the acronym first. I propose CPT or CAPT. Either could be pronounced as “Capped”. However, CAPT could also be pronounced “Cap T” as distinct from CAPM (“Cap M”). “C” could stand for either Contemporary or Current. The “A” could stand for Advanced or Alternative, with the first being a bit pretentious, and the latter being more diplomatic. I put my two cents behind CAPT, pronounced “Cap T”; you can figure out what you want the letters to represent. What is your 2 cents? Please leave a comment!
Back to (Contemporary) Risk Measures
I see semi-variance beginning to transition from the early-adopter phase to the early-majority phase. However, my observations may be skewed by the types of interactions Sigma1 Financial invites. I believe that semi-variance optimization will be mainstream in 5 years or less. That is plenty of time for semi-variance optimization companies to flourish. However, we’re also looking for the next next big thing in finance.
Disclosure: The purpose of this post is to show how I, personally, use the HALO Portfolio Optimizer software to manage my personal portfolio. It is not investment advice! I feed my personal opinions about which assets to select, along with expected one-year returns, into the optimizer configuration. The optimizer then provides an efficient frontier (EF) based on historic total-return data and my personal expected-return estimates.
I use other software (User Tuner) to approach the EF, while limiting the number and size of trades (minimizing churn and trading costs). Getting exactly to the EF would require trading (buying or selling) every asset in my portfolio — which would cost approximately $159 in trading costs for 18 trades. Factoring in bid/ask spreads the cost would be even higher. However, by being frugal about trades, I was able to limit the number of trades to 6 while getting much closer to the EF.
Past performance is no guarantee of future performance, nor is past volatility necessarily indicative of future volatility. Nonetheless, I am making the personal decision to use past volatility information to possibly increase the empirical diversification of my retirement portfolio with the goal of increasing risk-adjusted return. Time will tell whether this approach was successful or not.
In my last post I blogged about reallocating my entire retirement portfolio closer to the MVO efficient frontier computed by the HALO Portfolio Optimizer. The zoomed in plot tells the story to date:
The “objective space” plot is zoomed in and only shows a small portion of the efficient frontier. As you can see the black X is closer to the efficient frontier than the blue diamond, but naturally the dimensions are not the same. Using a risk-free rate of 0.5% the predicted Sharpe ratio has improved from 0.68 to 0.75 – a marked increase of about 10.3%. [If you crunch the numbers yourself, don’t forget to annualize σ.]
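The arithmetic behind the Sharpe comparison can be sketched as follows. The 0.5% risk-free rate and the 0.68 and 0.75 ratios come from the text; the expected return and monthly σ in the example call are made up, and scaling monthly σ by the square root of 12 is the usual convention, which assumes roughly independent monthly returns. The function names are hypothetical.

```python
import math


def annualize_monthly_sigma(monthly_sigma):
    """Scale a monthly standard deviation to annual (sqrt-of-time rule)."""
    return monthly_sigma * math.sqrt(12)


def sharpe_ratio(expected_annual_return, monthly_sigma, risk_free=0.005):
    """(E[R] - Rf) / annualized sigma."""
    return (expected_annual_return - risk_free) / annualize_monthly_sigma(monthly_sigma)


def relative_improvement(old, new):
    """Fractional change from old to new."""
    return new / old - 1.0


improvement = relative_improvement(0.68, 0.75)
example = sharpe_ratio(expected_annual_return=0.08, monthly_sigma=0.03)
print(f"{improvement:.1%}", round(example, 3))
```

Forgetting the annualization step would overstate the ratio by a factor of about 3.46 (the square root of 12), which is why the text flags it.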
While a 10.3% Sharpe ratio expected improvement is very significant, there is obviously room for compelling additional improvement. An expected Sharpe ratio of just north of 0.8 is attainable.
The primary reason the portfolio has not yet moved even closer to the efficient frontier is that 18.6% of the retirement portfolio is tied up in red tape as a result of my recent voluntary severance or “buy-out” from Intel Corporation. [ Kudos to Intel for offering voluntary severance to all of my local coworkers and me. It is a much more compassionate method of workforce reduction than layoffs! I consider the package offered to me reasonably generous, and I gladly took the opportunity to depart and begin working full time building my start up.]
Time to Get Technical
I won’t finish without mentioning a few important technical details. The points in the objective space (of monthly σ on the horizontal and expected annual return on the vertical) can be viewed as dependent variables of the (largely) independent variables of asset weights. Such points include the blue diamond, the black X, and all the red triangles on the efficient frontier. I often call the (largely) independent domain of asset allocation weights the “search space”, and the weightings in the search space that result in points on the efficient frontier the “solution space.”
One way to measure the progress from the blue diamond to the X is via improvement in the Sharpe ratio, which implicitly factors in the CAL, or the CML for the tangent CAL. As “X” approaches the red line visually it also approaches the efficient frontier quantitatively and empirically. However, X can make significant progress towards the efficient frontier, say point EF#9 specifically, with little or no “progress” in the portfolio weights from the blue diamond to the black X.
“Progress” in the objective space is reasonably straightforward: just use Sharpe ratios, for instance. However, measuring “progress” in the asset allocation (weight) space is perhaps less clear. Generally, I prefer the use of the L1-norms of differences of the asset-weight vectors Wo (corresponding to the original portfolio weights, i.e. the blue diamond), Wx, and Wef_n. The distance from the blue diamond in search space to the red triangle #9 is denoted as |Wef_9 – Wo|1 while the distance from X in the search space is |Wef_9 – Wx|1. Interestingly, the respective values are 0.572 and 0.664. Wx is, by this measure, actually further from Wef_9 in search space, but closer in objective space!
I sometimes refer to these as the “Hamming distances” (even though “Hamming distance” is typically applied to differences in binary codes or character inequality counts of two strings of characters). It is simply easier to say the “Hamming distance from Wx to Wef_9” than the “ell-one norm of the difference of Wx and Wef_9.”
I have been working on a utility temporarily called “user tuner” that makes navigating in both the search space and the objective space quicker, easier and more productive. More details to follow in a future post.
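The search-space distance described above is simple to compute. The sketch below is illustrative only: the function name and the four-asset weight vectors are hypothetical stand-ins, not my actual Wo, Wx, or Wef_9.

```python
def l1_distance(w_a, w_b):
    """L1-norm of the difference between two asset-weight vectors."""
    if len(w_a) != len(w_b):
        raise ValueError("weight vectors must cover the same assets in the same order")
    return sum(abs(a - b) for a, b in zip(w_a, w_b))


# Hypothetical four-asset weightings (each sums to 1).
w_o = [0.40, 0.30, 0.20, 0.10]    # stand-in for the original portfolio
w_x = [0.10, 0.30, 0.40, 0.20]    # stand-in for the current portfolio
w_ef = [0.30, 0.20, 0.25, 0.25]   # stand-in for a point on the efficient frontier

print("distance from original to frontier point:", round(l1_distance(w_ef, w_o), 3))
print("distance from current to frontier point: ", round(l1_distance(w_ef, w_x), 3))
```

For two fully invested, long-only portfolios this distance ranges from 0 (identical weights) to 2 (no holdings in common), and half of it is the fraction of the portfolio that would have to be traded to move from one to the other.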
Why Not Semi-Variance Optimization?
Frequent readers will know that I believe that mean semi-variance optimization (MSVO or SVO) is superior to vanilla MVO. So why am I starting with MVO? Three reasons:
To many, MVO is less scary because it is somewhat familiar. So I’m starting with the familiar “basics.”
I wanted to talk about Sharpe ratios first, because again they are more familiar than, say, Sortino ratios.
I wanted to use “User Tuner”, and I originally coded it for MVO (though that is easily remedied).
However, asymptotically refining allocation of my entire portfolio to get extremely close to the MVO efficient frontier is only phase 1. It is highly likely I will compute the SVO efficient frontier next and use a slightly modified “User Tuner” to approach the mean semi-variance efficient frontier… Likely in the next month or two, once my 18.6% of assets are freed up.
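To make the difference between the two risk measures concrete, here is a small Python sketch of downside deviation (the square root of semi-variance) and the Sortino ratio. It is not HALO code. It assumes one common convention, in which shortfalls below a target return are squared and averaged over all observations; other conventions exist. The return series are made up.

```python
import math
import statistics


def downside_deviation(returns, target=0.0):
    """Square root of semi-variance: only returns below the target contribute."""
    shortfalls = [min(r - target, 0.0) for r in returns]
    return math.sqrt(sum(s * s for s in shortfalls) / len(returns))


def sortino_ratio(returns, target=0.0):
    """Mean excess return over the target, divided by downside deviation."""
    dd = downside_deviation(returns, target)
    if dd == 0.0:
        return math.inf  # no observed shortfall at all
    return (statistics.fmean(returns) - target) / dd


# Two made-up monthly return series that differ only by one sharp upward jump.
steady = [0.01, -0.02, 0.03, -0.01]
jumpy = [0.01, -0.02, 0.30, -0.01]

for name, series in (("steady", steady), ("jumpy", jumpy)):
    print(name,
          "stdev:", round(statistics.pstdev(series), 4),
          "downside deviation:", round(downside_deviation(series), 4))
```

The upward jump inflates the ordinary standard deviation but leaves the downside deviation unchanged, which is why a semi-variance optimizer does not penalize it.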
Explaining technical investment concepts in a non-technical way is critical to having a meaningful dialog with individual investors. Most individual investors (also called “retail investors”, or “small investors”) have neither the time nor the desire to learn the jargon and concepts behind building a solid investment portfolio. This is generally true regardless of the size of their investment portfolios. Individual investors expect investment professionals (also called “institutional investors”) to help manage their portfolios and explain the major investment decisions behind the management of their individual portfolios.
In the same way that a good doctor helps her patient make informed medical decisions, a good investment adviser helps her clients make informed investment decisions.
I am routinely asked how the HALO Portfolio Optimizer works. Every time I answer that question, I face two risks: 1) that I don’t provide enough information to convince investment professionals or their clients that HALO optimization provides significant value and risk-mitigation capability, and 2) that I share key intellectual property (IP) unique to the Sigma1 Financial HALO optimizer.
This post is my best effort to provide both investment advisers and their clients with enough information to evaluate and understand HALO optimization, while avoiding sharing key Sigma1 trade secrets and intellectual property. I would very much appreciate feedback, both positive and negative, as to whether I have achieved these goals.
First Principle of Portfolio Optimization Software
Once when J.P. Morgan was asked what the market would do, he answered “It will fluctuate.” While some might find this answer rather flippant, I find it extremely insightful. It turns out that so-called modern portfolio theory (MPT) is based on understanding (or quantifying) market fluctuations. MPT labels these fluctuations as “risk” and identifies “return” as the reward that a rational investor is willing to accept for a given amount of risk. MPT assumes that a rational investor, or his/her investment adviser, will diversify away most or all “diversifiable risk” by creating a suitable investment portfolio tailored to the investor’s current “risk tolerance.”
In other words, the primary job of the investment adviser (in a “fiduciary” role) is to maximize investment portfolio return for a client’s acceptable risk. Said yet another way, the job is to maximize the reward-to-risk ratio for the client, without incurring excess risk.
Now for the first principle: past asset “risk” tends to indicate future asset “risk”. In general, an asset that has been previously more volatile will tend to remain more volatile, and an asset that has been less volatile will tend to remain less volatile. Commonly, both academia and professional investors have equated volatility with risk.
Second Principle of Portfolio Optimization Software
The Second Principle is closely related to the first. The idea is that past portfolio volatility tends to indicate future portfolio volatility. This thesis is so prevalent that it is almost inherently assumed. This is evidenced by research results that reach beyond volatility and look at the hysteresis of return-versus-volatility ratios, in papers such as this.
Past Performance is Not Necessarily Indicative of Future Results.
Third Principle of Portfolio Optimization Software
The benefits of diversification are manifest in risk mitigation. If two assets are imperfectly correlated, then their combined volatility (risk) will be less than the weighted average of their individual volatilities. An in-depth mathematical description of two-asset portfolio volatilities can be found on William Sharpe’s web page. Two-asset mean-variance optimization is relatively simple, and can be performed with relatively few floating-point operations on a computer. This process creates the two-asset efficient frontier*. As more assets are added to the mix, the computational demand to find the optimal efficient frontier grows geometrically. If you don’t immediately see why, look at page 8 of this paper.
A much simpler explanation of the third principle is as follows. If asset A has an annual standard deviation of 10%, and asset B an annual standard deviation of 20%, and A and B are not perfectly correlated, then the portfolio of one half invested in A and the other half invested in B will have an annual standard deviation of less than 15%. (Non-perfectly correlated means a correlation of less than 1.0.) Some example correlations of assets can be found here.
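The A-and-B example can be checked with the standard two-asset volatility formula. The Python sketch below uses the 10% and 20% standard deviations and the half-and-half split from the paragraph above; the function name and the particular correlation values tried are my own choices for illustration.

```python
import math


def two_asset_sigma(w_a, sigma_a, sigma_b, rho):
    """Standard deviation of a two-asset portfolio with weight w_a in asset A."""
    w_b = 1.0 - w_a
    variance = ((w_a * sigma_a) ** 2
                + (w_b * sigma_b) ** 2
                + 2.0 * w_a * w_b * rho * sigma_a * sigma_b)
    return math.sqrt(variance)


# Half in A (10% sigma), half in B (20% sigma), at several correlations.
for rho in (1.0, 0.5, 0.0, -0.5):
    sigma = two_asset_sigma(0.5, 0.10, 0.20, rho)
    print(f"correlation {rho:+.1f}: portfolio sigma {sigma:.2%}")
```

Only at a correlation of exactly 1.0 does the portfolio sit at the 15% weighted average. At a correlation of 0.5, for instance, it drops to roughly 13.2%.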
In so-called plain English, the Third Principle of Portfolio Optimization can be stated: “For a given level of expected return, portfolio optimization software can reduce portfolio risk by utilizing the fact that different assets move somewhat independently from each other.”
Fourth Principle of Portfolio Optimization Software
The Fourth Principle of Portfolio Optimization establishes a relationship between risk and return. The classic assumption of modern portfolio theory (MPT) is that so-called systematic risk is rewarded (over a long-enough time horizon) with increased returns. Portfolio-optimization software seeks to reduce or eliminate unsystematic risk when creating an optimized set of portfolios. The portfolio manager can thus select, from the “best-in-breed” list created by the optimization software, the optimized portfolio that is best suited to his/her client’s needs.
Fifth Principle of Portfolio Optimization Software
The 5th Principle is that the portfolio manager and his team add value to the portfolio composition process by 1) selecting a robust mix of assets, 2) applying constraints to the weights of said assets and asset-groups, and 3) assigning expected returns to each asset. The 5th Principle focuses on the assignment of expected returns. This process can be grouped under the category of investment analysis or investment research. Investment firms pay good money for either in-house or contracted investment analysis of selected securities.
Applying the Portfolio Optimization Principles Together
Sigma1 Financial HALO Software applies these five principles together to help portfolio managers improve or fine-tune their proprietary-trading and/or client investment portfolios. HALO Portfolio Optimization software utilizes the assets, constraints, and expected returns from the 5th Principle as a starting point. It then uses the 4th Principle by optimizing away unsystematic risk from a set of portfolios by taking maximum advantage of varying degrees of non-correlation of the portfolio assets. The 3rd Principle alludes to the computational difficulty of solving the multi-asset optimization problem. Principles 1 and 2 form the bedrock of the concepts behind the use of historical correlation data to predict and estimate future correlations.
The Fine Print
Past volatility of most assets and most portfolios is historically well correlated with future volatility. However, not only are assets increasingly correlated, there is some evidence that asset correlations tend to increase during times of financial crisis. Even if assets are more correlated, there remains significant value in exploiting partial non-correlation.
(*) The two-asset model can be represented as two parametric functions of a single variable, “t”: ER(t) and var(t). Here t simply represents the proportion invested in asset 0 (aka asset A). For three assets, expected return becomes ER(t0,t1), and variance becomes var(t0,t1). And so on for increasing numbers of assets. The computational effort required to compute ER(t0…tn) scales linearly with the number of assets, but the effort for var(t0…tn) scales with the square of the number of assets, since every pair of assets contributes a covariance term.
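The contrast between ER and var in this footnote can be seen directly in code: expected return is a single weighted sum, while variance is a double sum over the covariance matrix. The sketch below is plain Python for illustration, not HALO code, and the three-asset inputs are invented.

```python
def expected_return(weights, exp_returns):
    """One term per asset: n multiplications for n assets."""
    return sum(w * r for w, r in zip(weights, exp_returns))


def portfolio_variance(weights, cov):
    """One term per pair of assets: n * n multiplications of w_i * w_j * cov[i][j]."""
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(n) for j in range(n))


# Invented three-asset example: annual expected returns and covariance matrix.
weights = [0.5, 0.3, 0.2]
exp_returns = [0.06, 0.08, 0.03]
cov = [
    [0.0100, 0.0060, 0.0005],
    [0.0060, 0.0400, 0.0010],
    [0.0005, 0.0010, 0.0025],
]

print("expected return:", round(expected_return(weights, exp_returns), 4))
print("variance:       ", round(portfolio_variance(weights, cov), 6))

# Term counts for growing asset universes.
for n in (2, 10, 100, 1000):
    print(f"{n:>5} assets: {n:>5} return terms, {n * n:>8} covariance terms")
```

This counts only the cost of evaluating one candidate portfolio. Searching over all possible weightings for the efficient frontier is a separate and much larger problem.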
Optimizing efficiently within this complex space benefits from creative algorithms and heuristics.
The basic heuristics and algorithms I envisioned a year and a half ago have stood up well to testing, both internal testing and testing with external beta-tester and client data. The lessons learned have largely been about what is important to institutional investors. Some of those lessons:
To date, every client has avoided long-short portfolios. All of my initial clients have asked for strictly long-only portfolio optimization. Based on their directions, I have temporarily incorporated a “zero-floor” for all securities, which modestly speeds up the optimization process. (Note: this constraint can easily be reversed.)
The initial portfolio-optimization code ran open-loop; that is to say that any asset could be assigned a weighting between 0 and 100%. Generally, extreme asset-weightings were mathematically avoided for the majority of the “optimization surface.” However, all Sigma1 clients have requested individual asset minimum and maximum asset-weighting constraints. While these constraints somewhat reduce the enormous search space, in practice, they tend to slow down the heuristic algorithms. Much of my optimization effort has been focused on efficiently enabling individual asset range constraints.
The third major lesson was that some clients want layered asset-class constraints. This capability has been incorporated into the base code.
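The three kinds of constraints in these lessons (a zero floor, per-asset minimum and maximum weights, and layered asset-class limits) can be expressed as a simple feasibility check. The Python sketch below is hypothetical throughout: the function, the asset and class names, and the limits are invented for illustration, and it says nothing about how HALO enforces constraints during optimization.

```python
def check_constraints(weights, asset_bounds, classes_of, class_bounds, tol=1e-9):
    """Return a list of violations; an empty list means the portfolio is feasible.

    weights:      {asset: weight}
    asset_bounds: {asset: (min, max)}; assets not listed default to (0, 1),
                  i.e. a long-only "zero floor"
    classes_of:   {asset: tuple of class names}; listing several classes per
                  asset is how layering is modeled here
    class_bounds: {class name: (min, max)} on the summed weight of the class
    """
    violations = []
    total = sum(weights.values())
    if abs(total - 1.0) > tol:
        violations.append(f"weights sum to {total:.4f}, not 1")

    class_totals = {}
    for asset, w in weights.items():
        lo, hi = asset_bounds.get(asset, (0.0, 1.0))
        if w < lo - tol or w > hi + tol:
            violations.append(f"asset {asset}: {w:.4f} outside [{lo}, {hi}]")
        for cls in classes_of.get(asset, ()):
            class_totals[cls] = class_totals.get(cls, 0.0) + w

    for cls, (lo, hi) in class_bounds.items():
        t = class_totals.get(cls, 0.0)
        if t < lo - tol or t > hi + tol:
            violations.append(f"class {cls}: {t:.4f} outside [{lo}, {hi}]")
    return violations


# Invented example: A is capped at 40%, all equity at 70%, US equity at 60%.
weights = {"A": 0.50, "B": 0.30, "C": 0.20}
asset_bounds = {"A": (0.05, 0.40)}
classes_of = {"A": ("equity", "equity/us"), "B": ("equity",), "C": ("bonds",)}
class_bounds = {"equity": (0.0, 0.70), "equity/us": (0.0, 0.60)}

for problem in check_constraints(weights, asset_bounds, classes_of, class_bounds):
    print(problem)
```

In this example asset A exceeds its own cap and the equity class exceeds its 70% limit, while the nested US-equity limit is satisfied.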
The primary thesis behind HALO Portfolio Optimization is that compute technology and algorithms have sufficiently progressed to optimize portfolios beyond simple mean-variance optimization (MVO). Moreover, HALO creates a set of portfolios (an efficient surface) optimized for three objectives (risk1, risk2, and expected return), rather than performing a simpler 2-D optimization.
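One way to picture an efficient surface is as the set of candidate portfolios that no other candidate beats on all three objectives at once. The Python sketch below shows a brute-force non-dominated filter. It is a generic textbook technique offered purely for illustration, not the HALO algorithm, and the candidate points are invented.

```python
def dominates(p, q):
    """True if p is at least as good as q on every objective and better on one.

    Each point is (risk1, risk2, expected_return): lower risk is better,
    higher expected return is better.
    """
    no_worse = p[0] <= q[0] and p[1] <= q[1] and p[2] >= q[2]
    strictly_better = p[0] < q[0] or p[1] < q[1] or p[2] > q[2]
    return no_worse and strictly_better


def efficient_surface(points):
    """Keep only the points that no other point dominates (O(n^2) comparisons)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Invented candidate portfolios: (risk1, risk2, expected_return).
candidates = [
    (0.10, 0.10, 0.05),
    (0.20, 0.20, 0.04),  # worse than the first on all three objectives
    (0.20, 0.10, 0.08),
    (0.10, 0.30, 0.07),
]

for point in efficient_surface(candidates):
    print(point)
```

With two objectives the surviving points trace a curve (the familiar efficient frontier). With three they form a surface, which is why the 3-D problem is considerably more demanding.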
Much more run-time optimization is on the Sigma1 road map. The primary speed-up will come from converting an increasing share of the key parts of the HALO Ruby code to C/C++. In the meantime, upgrading to arguably the fastest processor on the planet, the Intel i7-4700K, has shown a 2.95X speed-up over the i5-2647M CPU on benchmarks of the Ruby code that is currently the HALO Portfolio Optimization run-time bottleneck. The primary operations are (billions of) double-precision floating-point arithmetic computations.
The HALO portfolio-optimization algorithms/heuristics are already “fast enough” for every single institutional investor we have worked with to date. That does not measurably dampen my personal desire to push optimization speed to its limit. I intend to crush previous performance benchmarks, again and again.
Does a hard-working professional investment advisory team need optimization to run faster than 30 minutes? I’d argue “No.” But do they want it faster? Of course, “Yes!” If the crunch time is reduced to 5 minutes, the same logic applies: they want faster. I understand.
It could easily be argued that it would be better to apply my efforts to developing the web UI. I am doing so, in parallel, at my own pace. Currently, however, my passion is speed. Having achieved some speed, I crave more. When I follow my passion, my productivity is dramatically improved. Moreover, the skill set I am trying to master has intrinsic value beyond the field of portfolio optimization. Fun and profit, in a start-up, are often more important than maximum profit (or maximum revenue).
The HALO algorithms and heuristics are intrinsically fast and scalable. Since I am not planning on sharing their inner architecture (except for millions of dollars), the proof of their power is measured in raw performance. If that effort results in temporary loss of revenue for enhanced future revenue, then so be it.