The more we learned about the current crop of robo advisory firms, the more we realized we could do better. This brief blog post hits the high points of that thinking.

Not Just the Same Robo Advisory Technology

It appears that all major robo advisory companies use 50+ year-old MPT (modern portfolio theory). At Sigma1 we use so-called post-modern portfolio theory (PMPT), which is much more current. At the heart of PMPT is optimizing return versus semivariance. The details are not important to most people, but the takeaway is that PMPT, in theory, allows greater downside risk mitigation and does not penalize portfolios that have sharp upward jumps.
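To make the semivariance idea concrete, here is a minimal sketch (in Python, with illustrative numbers of my choosing) of how downside-only dispersion differs from ordinary variance:

```python
import numpy as np

def semivariance(returns):
    """Mean squared deviation below the mean: only observations under
    the mean contribute, so sharp upward jumps are never penalized."""
    r = np.asarray(returns, dtype=float)
    downside = np.minimum(r - r.mean(), 0.0)
    return float(np.mean(downside ** 2))

# A series with one sharp upward jump: variance penalizes the jump,
# semivariance does not.
r = [0.0, 0.0, 0.0, 0.5]
print(semivariance(r), float(np.var(r)))  # 0.01171875 vs 0.046875
```

The jumpy series looks "risky" by variance, but most of that dispersion is the upward spike that semivariance deliberately ignores.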

Robo advisors, we infer, must use some sort of Monte Carlo analysis to estimate “poor market condition” returns. We believe we have superior technology in this area too.

Finally, while most robo advisory firms offer tax loss harvesting, we believe we can 1) set up portfolios that do it better, and 2) go beyond just tax loss harvesting to achieve greater overall portfolio tax efficiency.

When asked to consider tax deferral investment strategies, many people instinctively conclude that tax deferral benefits the investor at the expense of the government. Such a belief is half-right. Tax deferral ultimately benefits both the investor and the government’s tax revenues. While there are exceptions involving inheritance, in most other cases both parties benefit. Figure 1 summarizes the relationship between higher after-tax returns and higher nominal net cash flows to the government.

The reason I lead with the government’s side of the tax equation is for tax policy wonks in Washington D.C. I suspect many of them already know this information, and this is simply another data point to add to their arsenal of tax facts. For the others, I hope this is a wake-up call. The message:

When investors, investment advisors, and fund managers successfully defer long-term capital gains, investors and governments win in the long run.

The phrase “in the long run” is important. When taxes are deferred, the government’s share grows along with the investor’s. In the short term, taxes are reduced; in the long run taxes are increased. For the investor this long-run tax increase is more than offset by increased compounding of return.

Please note that all of these win/win outcomes occur under an assumption of a fixed tax rate — 20% in this example. It is also worth noting that these outcomes occur for funds that are spent at any point in the investor’s lifetime. This analysis does not necessarily apply to taxable assets that are passed on via inheritance.

Critical observers may acknowledge that the government tax “win” holds for nominal tax dollars, but wonder whether it still holds in inflation-adjusted terms. The answer is “yes” so long as the investor’s (long-run) pre-tax returns exceed the (long-run) rate of inflation. In other words, so long as g > i (g is pre-tax return, i is inflation), the yellow line will be upward sloping; more effective tax-deferral strategies, with higher post-tax returns, will benefit both parties. As inflation increases, the slope of the yellow line gets flatter, but it retains an upward slope so long as pre-tax return is greater than inflation.

Tax Advantages for Investors

Responsible investors face many challenges when trying to preserve and grow wealth. Among these challenges are taxes and inflation. I will start by addressing two important maxims in managing investment taxes:

Avoid net realized short-term (ST) gains

Defer net long-term gains as long as possible

It is okay to realize some ST gains; however, it is important to offset those gains with capital losses. The simplest way of achieving this offset is to realize an equal or greater amount of ST capital losses within the same tax year. ST capital losses directly offset ST capital gains.

A workable, but more complex, way of offsetting ST gains is with net LT capital losses. The term net is crucial here, as LT capital losses can only be used to offset ST capital gains once they have first been used to offset LT capital gains. Only LT capital losses in excess of LT capital gains offset ST gains.
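A toy sketch of this netting order (a deliberately simplified model for illustration, not tax advice; the real rules have more wrinkles):

```python
def net_gains(st_gain, st_loss, lt_gain, lt_loss):
    """Simplified capital-gain netting: losses first offset gains of the
    same character; only a *net* loss in one bucket spills over to
    offset a net gain in the other."""
    net_st = st_gain - st_loss
    net_lt = lt_gain - lt_loss
    if net_st * net_lt < 0:  # one bucket nets a loss, the other a gain
        combined = net_st + net_lt
        if net_st > 0:       # net LT loss reduces the net ST gain
            net_st, net_lt = max(combined, 0), min(combined, 0)
        else:                # net ST loss reduces the net LT gain
            net_lt, net_st = max(combined, 0), min(combined, 0)
    return net_st, net_lt

# $4k of LT losses first absorbs the $2k LT gain; only the remaining
# $2k offsets the $5k ST gain, leaving $3k of taxable ST gain.
print(net_gains(st_gain=5, st_loss=0, lt_gain=2, lt_loss=4))  # (3, 0)
```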

If the above explanation makes your head spin, you are not alone. Managing capital gains is really an exercise in linear programming. In order to make this tax exercise less (mentally) taxing, here are some simple concepts to help:

ST capital losses are better than LT capital losses

ST capital gains are worse than LT capital gains

When possible, offset ST gains with ST losses

Because ST capital losses are better than LT, it often makes sense to see how long you have held assets that have larger paper (unrealized) losses. All things equal it is better to “harvest” the losses from the ST losers than from the LT losers.

Managing net ST capital gains can potentially save you a large amount of taxes, resulting in higher post-tax returns.

Tax Advantages for the Patient Investor

Deferring LT capital gains requires patience and discipline. Motivation can help reinforce patience. For motivation we go back to the example used to create Figure 1. The example starts today with a $10,000 investment in a taxable account and a 30-year time horizon. The example assumes a starting cost basis of zero and an annual return of 8%.

This example was set up to help answer the question: “What is the impact of ‘tax events’ on after-tax returns?” To keep things as simple as possible a “tax event” is an event that triggers a long-term capital gains tax realization in a tax year. Also, in all cases, the investor liquidates the account at the start of year 31. (This year-31 sale is not counted in the tax event count.)

It turns out that it is not just the number of tax events that matters — it is also the timing. To capture some of this timing-dependent behavior, I set up my spreadsheets to model two different timing modes. The first is called “stacked” and it simply stacks all tax events in back-to-back years. The second mode is called “spaced” because the tax events are spaced uniformly. Thus 2 stacked tax events occur in years 1 and 2, while 2 spaced tax events occur in years 10 and 20. The results are interesting:
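The model behind these spreadsheets can be sketched as follows (my reconstruction from the description above; the actual spreadsheet may differ slightly in rounding):

```python
def after_tax_return(event_years=(), years=30, pretax=0.08,
                     ltcg=0.20, principal=10_000.0, basis=0.0):
    """Grow at 8% pre-tax; each 'tax event' realizes all gains at the
    LTCG rate and resets the cost basis; liquidate at the start of
    year 31 and return the compound after-tax annual return."""
    value = principal
    for year in range(1, years + 1):
        value *= 1 + pretax
        if year in event_years:
            value -= ltcg * (value - basis)
            basis = value
    value -= ltcg * (value - basis)  # final (uncounted) year-31 sale
    return (value / principal) ** (1 / years) - 1

print(after_tax_return())        # no tax events: ~7.20% per year
print(after_tax_return((1,)))    # one "stacked" event in year 1: ~6.5%
print(after_tax_return((15,)))   # one "spaced" event in year 15: ~6.67%
```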

The most important thing to notice is that if an investor can completely avoid all “tax events” for 30 years the (compound) after-tax return is 7.2% per year, but if the investor triggers just one taxable event the after-tax return is significantly reduced. A single “stacked” tax event in year 1 reduces after-tax returns to 6.49%, while a single “spaced” tax event in year 15 reduces returns to 6.67%. Thus for a single event the spaced-tax-event curve is higher, while for all other numbers of tax events (except 30, where they are identical) it is lower than the stacked-event curve.

The main take-away from this graph is that tax deferral discipline matters. The difference between 7.2% and 6.67% after-tax return, over thirty years is huge when framed in dollar terms. With zero (excess) tax events the after-tax result in year 31 is $80,501. With one excess tax event (with the least harmful timing!) that sum drops to $69,476.

In the worst case the future value drops to $51,444 with an annual compound after-tax return of only 5.61%.

Tax Complexity, Tax Modeling Complexity, and Other Factors

One of the challenges faced when bringing fresh perspectives to the tax-plus-investing dialog is in providing examples that paint the broad portfolio tax management themes in a concise way. The first challenge is that the tax code is constantly changing, so predicting future tax rates and tax rules is an imprecise game at best. The second challenge is that the tax code is so complex that any generalization will most likely have a counterexample buried somewhere in the tax code. The third complication is that, barring significant future tax code changes and obscure tax code counterexamples, creating a one-size-fits-all model for investors results in large oversimplifications.

I believe that tax indifference is the wrong answer to the question of portfolio tax optimization. The right answer is more closely aligned with the maxim:

All models are wrong. Some are useful.

This common saying in statistics gets to the heart of the problem and the opportunity of investment tax management. It is better to build a model that gives deeper insight into opportunities that exist in reconciling prudent tax planning with prudent investment management, than to build no model at all.

The simple tax model used in this blog post makes some broad assumptions. Among these is that the long-term capital gains rate will be the same for 30 years and that the investor will occupy the same tax bracket for 30 years. The pre-tax return model is also very simple: 8% pre-tax return each and every year.

I argue that models as simple as this are still useful. They illustrate investment tax-management principles in a manner that is clear and draws the same conclusions as analysis using more complex tax modelling. (Complex models also have their place.)

I would like to highlight the oversimplification I think is most problematic from a tax perspective. The model assumes all the returns (8% per year) are in the form of capital appreciation. A better “8%” model would be to assume a 2% dividend and 6% capital appreciation. Dividends, even though receiving qualified-dividend tax treatment, would bring down the after-tax returns, especially on the left side of the curve. I will likely remedy that oversimplification in a future blog post.

Investment Tax Management Summary

Tax deferral does not hurt government revenues; it helps in the long run.

Realized net short-term capital gains can crater post-tax investment returns and should be avoided.

Deferral of (net) long-term capital gains can dramatically improve after-tax returns.

Tax deferral strategies require serious investment discipline to achieve maximum benefit.

Even simple tax modelling is far better than no tax modelling at all. Simple tax models can be useful and powerful. Nonetheless, investment tax models can and should be improved over time.

Parts 1 and 2 left a trail of breadcrumbs to follow. Now I provide a full-color map, a GPS, and a local guide. In other words, the complete solution in the R statistical language.

Recall that the fast way to compute portfolio variance is σ_{p}^{2} = w^{T}Σw, where w is a column vector of asset weights and Σ is the covariance matrix of asset returns.

The companion equation is r_{p} = w^{T}rtn, where rtn is a column vector of expected returns (or historic returns) for each asset. The first goal is to find w_{0} and w_{n}: w_{0} minimizes variance regardless of return, while w_{n} maximizes return regardless of variance. The goal is then to create the set of vectors {w_{0}, w_{1}, … w_{n}} that minimizes variance for each given level of expected return.
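The post I link below implements the optimizer in R; for readers who prefer Python, here is a minimal sketch of just the two equations above, with toy numbers of my own choosing:

```python
import numpy as np

def portfolio_stats(w, cov, rtn):
    """Portfolio variance w^T Σ w and expected return w^T rtn."""
    w = np.asarray(w, dtype=float)
    return float(w @ np.asarray(cov) @ w), float(w @ np.asarray(rtn))

cov = np.array([[0.04, 0.01],   # toy 2-asset covariance matrix
                [0.01, 0.09]])
rtn = np.array([0.06, 0.10])    # expected (or historic) returns
var_p, r_p = portfolio_stats([0.5, 0.5], cov, rtn)
print(var_p, r_p)  # 0.0375 0.08
```

The efficient frontier is then the family of weight vectors minimizing `var_p` for each target `r_p`, subject to the weights summing to 1.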

I just discovered that someone already wrote an excellent post that shows exactly how to write an MVO optimizer completely in R. Very convenient! Enjoy…

Suppose you have the tools to compute the mean-return efficient frontier to arbitrary (and sufficient) precision — given a set of total-return time-series data of asset/securities. What would you do with such potential?

I propose that the optimal solution is to “breach the frontier.” Current portfolios provide a historic reference. Every reference/starting-point portfolio provided so far has left sufficient room for meaningful further optimization, as gauged by, say, improved Sortino ratios.

Often, when the client proposes portfolio additions, some of these additions allow the optimizer to push beyond the original efficient frontier (EF), and provide improved Sortino ratios. Successful companies contact ∑1 in order to see how each of their portfolios:

1) Land on a risk-versus-reward (expected-return) plot
2) Compare to one or more benchmarks, e.g. the S&P 500 over the same time period
3) Compare to an EF comprised of assets in the baseline portfolio

Our company is not satisfied to provide marginal or incremental improvement. Our current goal is to provide our clients with more resilient portfolio solutions. Clients provide the raw materials: a list of vetted assets and expected returns. ∑1 software then provides a near-optimal mix of asset allocations that serves a variety of goals:

1) Improved projected risk-adjusted returns (based on semi-variance optimization)
2) Identification of under-performing assets (in the context of the “optimal” portfolio)
3) Identification of potential portfolio-enhancing assets and their asset weightings

We are obsessed with meaningful optimization. We wish to find the semi-variance (semi-deviation) efficient frontier and then breach it by including client-selected auxiliary assets. Our “mission” is as simple as that: better, more resilient portfolios.

Disclosure: The purpose of this post is to show how I, personally, use the HALO Portfolio Optimizer software to manage my personal portfolio. It is not investment advice! I feed my personal opinions about which assets to select, along with my expected one-year returns, into the optimizer configuration. The optimizer then provides an efficient frontier (EF) based on historic total-return data and my personal expected-return estimates.

I use other software (User Tuner) to approach the EF, while limiting the number and size of trades (minimizing churn and trading costs). Getting exactly to the EF would require trading (buying or selling) every asset in my portfolio — which would cost approximately $159 in trading costs for 18 trades. Factoring in bid/ask spreads the cost would be even higher. However, by being frugal about trades, I was able to limit the number of trades to 6 while getting much closer to the EF.

Past performance is no guarantee of future performance, nor is past volatility necessarily indicative of future volatility. Nonetheless, I am making the personal decision to use past volatility information to possibly increase the empirical diversification of my retirement portfolio with the goal of increasing risk-adjusted return. Time will tell whether this approach was successful or not.

In my last post I blogged about reallocating my entire retirement portfolio closer to the MVO efficient frontier computed by the HALO Portfolio Optimizer. The zoomed-in plot tells the story to date:

The “objective space” plot is zoomed in and only shows a small portion of the efficient frontier. As you can see the black X is closer to the efficient frontier than the blue diamond, but naturally the dimensions are not the same. Using a risk-free rate of 0.5% the predicted Sharpe ratio has improved from 0.68 to 0.75 – a marked increase of about 10.3%. [If you crunch the numbers yourself, don’t forget to annualize σ.]
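For readers crunching the numbers themselves, the annualization step looks like this (the return and σ inputs below are placeholders of my choosing, not my portfolio’s actual figures):

```python
import math

def sharpe(annual_return, monthly_sigma, rf=0.005):
    """Annualize monthly sigma by sqrt(12), then (r - rf) / sigma_annual."""
    return (annual_return - rf) / (monthly_sigma * math.sqrt(12))

print(round(sharpe(0.10, 0.04), 3))  # placeholder inputs -> ~0.686
```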

While a 10.3% Sharpe ratio expected improvement is very significant, there is obviously room for compelling additional improvement. An expected Sharpe ratio of just north of 0.8 is attainable.

The primary reason the portfolio has not yet moved even closer to the efficient frontier is due to 18.6% of the retirement portfolio being tied up in red tape as a result of my recent voluntary severance or “buy-out” from Intel Corporation. [ Kudos to Intel for offering voluntary severance to all of my local coworkers and me. It is a much more compassionate method of workforce reduction than layoffs! I consider the package offered to me reasonably generous, and I gladly took the opportunity to depart and begin working full time building my start up.]

Time to Get Technical

I won’t finish without mentioning a few important technical details. The points in the objective space (of monthly σ on the horizontal and expected annual return on the vertical) can be viewed as dependent variables of the (largely) independent variables of asset weights. Such points include the blue diamond, the black X, and all the red triangles on the efficient frontier. I often call the (largely) independent domain of asset allocation weights the “search space”, and the weightings in the search space that result in points on the efficient frontier the “solution space.”

One way to measure the progress from the blue diamond to the X is via improvement in the Sharpe ratio, which implicitly factors in the CAL, or the CML for the tangent CAL. As “X” approaches the red line visually it also approaches the efficient frontier quantitatively and empirically. However, X can make significant progress towards the efficient frontier, say point EF#9 specifically, with little or no “progress” in the portfolio weights from the blue diamond to the black X.

“Progress” in the objective space is reasonably straightforward — just use Sharpe ratios, for instance. However, measuring “progress” in the asset-allocation (weight) space is perhaps less clear. Generally, I prefer the L^{1}-norms of differences of the asset-weight vectors W_{o} (corresponding to the original portfolio weights; i.e. the blue diamond), W_{x}, and W_{ef_n}. The distance from the blue diamond to red triangle #9 in search space is denoted |W_{ef_9} – W_{o}|_{1}, while the distance from X in the search space is |W_{ef_9} – W_{x}|_{1}. Interestingly, the respective values are 0.572 and 0.664. W_{x} is, by this measure, actually further from W_{ef_9} in search space, but closer in objective space!
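Computing these L^1 distances is trivial; here is a sketch with hypothetical 4-asset weight vectors (not the actual portfolios above):

```python
import numpy as np

def l1_distance(w_a, w_b):
    """Sum of absolute differences between two weight vectors."""
    return float(np.sum(np.abs(np.asarray(w_a) - np.asarray(w_b))))

w_o  = [0.40, 0.30, 0.20, 0.10]   # hypothetical original weights
w_ef = [0.25, 0.25, 0.25, 0.25]   # hypothetical frontier weights
print(l1_distance(w_o, w_ef))     # 0.4
```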

I sometimes refer to these as the “Hamming distances” (even though “Hamming distance” is typically applied to differences in binary codes or character inequality counts of two strings of characters.) It is simply easier to say the “Hamming distance from W_{x} to W_{ef_9}” than the “ell-one norm of the difference of W_{x} and W_{ef_9}.”

I have been working on a utility temporarily called “user tuner” that makes navigating both the search space and the objective space quicker, easier, and more productive. More details to follow in a future post.

Why Not Semi-Variance Optimization?

Frequent readers will know that I believe that mean semi-variance optimization (MSVO or SVO) is superior to vanilla MVO. So why am I starting with MVO? Three reasons:

To many, MVO is less scary because it is somewhat familiar. So I’m starting with the familiar “basics.”

I wanted to talk about Sharpe ratios first, because again they are more familiar than, say, Sortino ratios.

I wanted to use “User Tuner”, and I originally coded it for MVO (though that is easily remedied).

However, asymptotically refining allocation of my entire portfolio to get extremely close to the MVO efficient frontier is only phase 1. It is highly likely I will compute the SVO efficient frontier next and use a slightly modified “User Tuner” to approach the mean semi-variance efficient frontier… Likely in the next month or two, once my 18.6% of assets are freed up.

Over 50 years of academic financial thinking is based on a kind of financial gravity: the notion that for a relatively diverse investment portfolio, higher risk translates into higher return given a sufficiently long time horizon. Stated simply: “Risk equals reward.” Stated less tersely, “Return for an optimized portfolio is proportional to portfolio risk.”

As I assimilated the CAPM doctrine in grad school, part of my brain rejected some CAPM concepts even as it embraced others. I remember seeing a graph of asset diversification that showed that randomly selected portfolios exhibited better risk/reward profiles up to 30 assets, at which point further improvement was minuscule and only asymptotically approached an “optimal” risk/reward asymptote. That resonated.

Conversely, strict CAPM thinking implied that a well-diversified portfolio of high-beta stocks will outperform a market-weighted portfolio of stocks over the long term, albeit in a zero-alpha fashion. That concept met with cognitive dissonance.

Now, dear reader, for staying with this post this far, I will reward you with some hard-won insights. After much risk/reward curve fitting on compute-intensive analyses, I found that the best-fit expected-return metric for assets was proportional to the square root of beta. In my analyses I defined an asset’s beta using 36 months of monthly returns relative to the benchmark index. Mostly, for US assets, my benchmark “index” was VTI total-return data.

Little did I know, at the time, that a brilliant financial maverick had been doing the heavy academic lifting around similar financial ideas. His name is Bob Haugen. I only learned of the work of this kindred spirit upon his passing.

My academic number crunching on data since 1980 suggested a positive, but decreasing, incremental total return for increasing volatility (or increasing beta). Bob Haugen suggested a negative incremental total return for high-volatility assets above an inflection point of volatility.

Mr. Haugen’s lifetime of published research dwarfs my to-date analyses. There is some consolation in the fact that I followed the data to conclusions that had more in common with Mr. Haugen’s than with the Academic Consensus.

An objective analysis of the investment approach of three investing greats will show that they have more in common with Mr. Haugen than with Mr. E.M. Hypothesis (aka Mr. Efficient Markets [Hypothesis], not to be confused with “Mr. Market”). Those great investors are 1) Benjamin Graham, 2) Warren Buffett, 3) Peter Lynch.

CAPM suggests that, with either “risk-free” or leveraged investments, a capital allocation line exists — tantamount to a linear risk-reward relationship. This line is set according to a unique tangent point on the efficient frontier curve of expected volatility versus expected return.

My research at Sigma1 suggests a modified curve with a tangent-point portfolio comprised, generally, of a greater proportion of low-volatility assets than CAPM would indicate. In other words, my back-testing at Sigma1 Financial suggests that a different mix, favoring lower-volatility assets, is optimal. The Sigma1 CAL (capital allocation line) is different and based on a different asset mix. Nonetheless, the slope (first derivative) of the Sigma1 efficient frontier is always upward sloping.

Mr. Haugen’s research indicates that, in theory, the efficient frontier curve begins sloping downward past a critical point as portfolio volatility increases. (Arguably the curve past the critical point ceases to be “efficient”, but from a parametric standpoint it can be calculated for academic or theoretical purposes.) An inverted risk/return curve can exist, just as an inverted Treasury yield curve can exist.

Academia routinely deletes the dominated bottom of the parabola-like portion of the complete “efficient frontier” curve (resembling a parabola of the form x = A + B*y^2) for allocations of two assets (commonly stocks (e.g. SPY) and bonds (e.g. AGG)).

Maybe a more thorough explanation is called for. In the two-asset model the complete “parabola” is a parametric equation where x = Vol(t*A, (1-t)*B) and y = ER(t*A, (1-t)*B) [Vol = volatility (standard deviation), ER = expected return]. The bottom part of the “parabola” is excluded because it has no potential utility to any rational investor. In the multi-weight model, x = minVol(W), y = maxER(W), and W is subject to the condition that the sum of weights in vector W = 1. In the multi-weight, multi-asset model the underside is automatically excluded. However, there is no guarantee that there is no point where dy/dx is negative. In fact, Bob Haugen’s research suggests that negative slopes (dy/dx) are possible, even likely, for many collections of assets.
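To see the two-asset “parabola” numerically, here is a sketch tracing the parametric curve (the return, volatility, and correlation numbers are illustrative only, not calibrated to SPY/AGG):

```python
import numpy as np

def two_asset_curve(er_a, er_b, sd_a, sd_b, corr, steps=101):
    """Trace x = Vol(t*A, (1-t)*B), y = ER(t*A, (1-t)*B) for t in [0, 1]."""
    t = np.linspace(0.0, 1.0, steps)
    y = t * er_a + (1 - t) * er_b
    var = ((t * sd_a) ** 2 + ((1 - t) * sd_b) ** 2
           + 2 * t * (1 - t) * corr * sd_a * sd_b)
    return np.sqrt(var), y

x, y = two_asset_curve(er_a=0.08, er_b=0.03, sd_a=0.15, sd_b=0.05, corr=0.1)
# The minimum-volatility point sits below both single-asset endpoints;
# the branch below it is dominated -- that is the part academia deletes.
print(bool(x.min() < min(x[0], x[-1])))  # True
```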

Time prevents me from following this financial rabbit hole to its end. However I will point out the increasing popularity and short-run success of low-volatility ETFs such as SPLV, USMV, and EEMV. I am invested in them, and so far am pleased with their high returns AND lower volatilities.

==============================================

NOTE: The part about W is oversimplified for flow of reading. The bulkier explanation: y is stepped from ER(W) at the minimum-volatility portfolio up to the maximum expected return of any single asset (all weight on that asset), and for each y, x = minVol(W) s.t. y = ER(W) and sum_of_weights(W) = 1. Clear as mud, right? That’s why I wrote it the other way first.

Let’s start with the idea that CAPM (Capital Asset Pricing Model) is incomplete. Let me prove it in a few sentences. Everyone knows that, for investors, “risk-free” rates are always less than borrowing (margin) rates. Thus the concept of the CAL (capital allocation line) is incomplete. If I had a sketch pad I’d supply a drawing showing that there are really three parts of the “CAL” curve…

1) The traditional CAL, extending from Rf to the tangent intercept with the efficient-frontier curve.

2) A stretch of the efficient-frontier curve itself, which I call the CAC.

3) The CAML, extending outward from its own tangent point, set by the borrowing (margin) rate.

Why? Because the CAML has its own tangent point based on the borrower’s marginal rate. Because the efficient frontier is monotonically increasing, the CAL and CAML tangent points will be separated by a section of the EF curve I call the CAC.

All of this is so obvious, it almost goes without saying. It is strange, then, that I haven’t seen it pointed out in graduate finance textbooks, or online. [If you know of a reference, please comment on this post!] In reality, the CAL only works for an unleveraged portfolio.

Higher risk, higher return, right? Maybe not… at least on a risk-adjusted basis. Empirical data suggests that high-beta stocks and portfolios do not receive commensurate return. Quite to the contrary, low-beta stocks and portfolios have received greater returns than CAPM predicts. In other words, low-beta portfolios (value portfolios in many cases) have had higher historical alphas. Add leverage, and folks like Warren Buffett have produced high long-term returns.

Black Swans and Grey Swans

On the fringe of modern-portfolio theory (MPT) and post-modern portfolio theory (PMPT), live black swans. Black swans are essentially the most potent of unknown unknowns, also known as “fat tails”.

At the heart of PMPT is what I call “grey swans.” This is also called “breakdown of covariance estimates” or, in some contexts, financial contagion. Grey-swan events are much more common, and somewhat more predictable… that is, if one is NOT fixated on variance.

Variance is close; semivariance is closer. I put forth the idea that PMPT overstates its own potential. Black swans exist, are underestimated, and are essentially impossible to predict. “Grey swans” are, however, within the realm of PMPT. They can be measured in retrospect and anticipated in part.

Assets are Incorrectly Priced

CAPM showed a better way to price assets and allocate capital. The principles of semivariance, commingled with CAPM, form a better model for asset valuation. Simply replacing variance with semivariance changes fifty years of stagnant theory.

Mean-return variance is positively correlated with semivariance (mean semivariance of asset return), but the correlation is far less than 1. Further, variance is most correlated with semivariance when it matters most: when asset prices drop. The primary function of diversification and of hedging is to efficiently reduce variance. Investors and pragmatists note that this principle matters most when assets crash together — when declines are correlated.

The first step in breaking this mold of contagion is examining what matters more: semivariance. Simply put, investors care much less about compressed upward variance than they do about compressed downward variance. They care more about semivariance. And, eventually, they vote with their remaining assets.

A factor in retaining and growing an AUM base is content clients. The old rules say that the correct answer to a key Wall Street interview question is to win big or lose it all (the client’s money). The new rules say that clients demand a value-add from their adviser/broker/hybrid. This value-add can be supplied, in part, by using the best parts of PMPT. Namely semivariance.

That is the end result of the success of semivariance. The invisible hand of Sigma1, and other forward-looking investment companies, is to guide investors to invest money in the way that best meets their needs. The eventual result is more efficient allocation of capital. In the beginning these investors win. In the end, both investors and the economy win. This win/win situation is the end goal of Sigma1.

In many situations good quick action beats slow brilliant action. This is especially true when the “best” answer arrives too late. The perfect pass is irrelevant after the QB is sacked, just as the perfect diagnosis is useless after the patient is dead. Let’s call this principle the temporal dominance threshold, or beat the buzzer.

Now imagine taking a multiple-choice test such as the SAT or GMAT. Let’s say you got every question right, but somehow managed to skip question 7. In the line for question #7 you put the answer to question #8, etc. When you answer the last question, #50, you finally realize your mistake when you see one empty space left on the answer sheet… just as the proctor announces “Time’s up!” Even though you’ve answered every question right (except question #7), you fail dramatically. I’ll call this principle query displacement, or right answer/wrong question.

The first scenario is similar to the problems of high-frequency trading (HFT). Good trades executed swiftly are much better than “great” trades executed (or not executed!) after the market has already moved. The second scenario is somewhat analogous to the problems of asset allocation and portfolio theory. For example, if a poor or “incomplete” set of assets is supplied to any portfolio optimizer, results will be mediocre at best. Just one example of right answer (portfolio optimization), wrong question (how to turn lead into gold).

I propose that the degree of fluctuation, or variance (or mean-return variance) is another partially-wrong question. Perhaps incomplete is a better term. Either way, not quite the right question.

Particularly if your portfolio is leveraged, what matters is portfolio semivariance. If you believe that “markets can remain irrational longer than you can remain solvent”, leverage is involved. Leverage via margin or leverage via derivatives matters not. Leverage is leverage. At market close, “basic” 4X leverage means complete liquidation at an underlying loss of only 25%. Downside matters.

Supposing a long-only position with leverage, modified semivariance is of key importance. Modified, in my notation, means using zero rather than μ. For one, solvency does not care about μ: a mean return measured over a long interval is irrelevant if insolvency arrives first.
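A sketch of the modified (zero-threshold) semivariance in the notation above:

```python
import numpy as np

def modified_semivariance(returns):
    """Semivariance measured below zero rather than below the mean:
    only outright losses count, which is what solvency cares about."""
    losses = np.minimum(np.asarray(returns, dtype=float), 0.0)
    return float(np.mean(losses ** 2))

# Only the -20% and -10% periods contribute; the gains are ignored.
print(modified_semivariance([0.10, -0.20, 0.05, -0.10]))  # 0.0125
```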

At the extreme, semivariance is the most important factor for solvency… far more important than basic variance. In terms of client risk tolerance, actual semivariance is arguably more important than variance — especially when financial utility is factored in.

Now, finally, to the crux of the issue. It is far better to predict future semivariance than to predict future variance. If it turns out that past (modified) semivariance is more predictive of future semivariance than past variance is, then I’d favor a near-optimal optimization of expected return versus semivariance over a perfectly-optimal optimization of expected return versus variance.

It turns out that directly optimizing semivariance is computationally many orders of magnitude more difficult than optimizing for variance. It also turns out that Sigma1’s HAL0 software provides a near-optimal solution to the right question: least semivariance for a given expected return.

At the end of the day, at market close, I favor near-perfect semivariance optimization over “perfect” variance optimization. Period. Can your software do that? Sigma1 financial software, code-named HAL0, can. And that is just the beginning of what it can do. HAL0 answers the right questions, with near-perfect precision. And more precisely each day.

Two mathematical equations have transformed the world of modern finance. The first was CAPM, the second Black-Scholes. CAPM gave a new perspective on portfolio construction. Black-Scholes gave insight into pricing options and other derivatives. There have been many other advancements in the field of financial optimization, such as Fama-French — but CAPM and Black-Scholes-Merton stand out as perhaps the two most influential.

Enter Semi-Variance

When CAPM (and MPT) were invented, computers existed, but were very limited. Though Harry Markowitz, the father of MPT, wanted to use semi-variance, the computers of 1959 were simply inadequate. So Markowitz used variance in his groundbreaking book “Portfolio Selection: Efficient Diversification of Investments”.

Choosing variance over semi-variance made the computations orders of magnitude easier, but they were still very taxing for the computers of 1959. Classic covariance-based optimizations remain reasonably compute-intensive when a large number of assets is considered. Classic optimization of a 2000-asset portfolio starts by creating a covariance matrix with 2,001,000 unique entries (which, mirrored about the shared diagonal, fill the full 4,000,000-entry matrix); that is the easy part. The hard part involves optimizing (minimizing) portfolio variance across a range of expected returns. This is often referred to as computing the efficient frontier.
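As a rough illustration of the scale involved (my own sketch, using randomly generated returns in place of market data, not HAL0 code), the covariance setup looks like this:

```python
import numpy as np

# Illustrative only: random monthly returns standing in for market data.
n_assets, n_periods = 2000, 36
rng = np.random.default_rng(0)
returns = rng.normal(0.01, 0.05, size=(n_periods, n_assets))

# Full covariance matrix: n x n = 4,000,000 entries, of which
# n*(n+1)/2 = 2,001,000 are unique (the rest mirror the diagonal).
cov = np.cov(returns, rowvar=False)
unique_entries = n_assets * (n_assets + 1) // 2

# The "easy part": portfolio variance for one candidate weighting.
w = np.full(n_assets, 1.0 / n_assets)
portfolio_variance = w @ cov @ w
```

The hard part, tracing the efficient frontier, means minimizing that quadratic form over the weight vector for each target expected return.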

The concept of semi-variance (SV) is very similar to the variance used in CAPM. The difference is in the computation. A quick internet search reveals very little about computing a “semi-covariance matrix.” Such a matrix, if it existed in the right form, could allow quick and precise computation of portfolio semi-variance in the same way a covariance matrix does for portfolio variance. Semi-covariance matrices (SVMs) exist, but none “in the right form.” Each form of SVM has strengths and weaknesses. One of the many problems with semi-covariance matrices is that there is no unique canonical form for a given data set; SVMs of different types each capture only an incomplete portion of the information needed for semi-variance optimization.

The beauty of SV is that it measures “downside risk”, exclusively. Variance includes the odd concept of “upside risk” and penalizes investments for it. While not going to the extreme of rewarding upside “risk”, the modified semi-variance formula presented in this blog post simply disregards it.
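The formula itself appeared as an image in the original post; reconstructed in LaTeX from the description that follows, it reads:

\[
SV \;=\; \frac{2}{n} \sum_{i=1}^{n} \bigl(\, r_i < 0 \;?\; r_i^2 \;:\; 0 \,\bigr)
\]

where $r_i$ is the total return over interval $i$, and the “question mark, colon” ternary contributes $r_i^2$ when $r_i < 0$ and zero otherwise.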

I’m sure most of the readers of this blog understand this modified semi-variance formula. Please indulge me while I touch on some of the finer points. First, the 2 may look a bit out of place. The 2 simply normalizes the value of SV relative to variance (V). Second, the “question mark, colon” notation simply means if the first statement is true use the squared value in summation, else use zero. Third, notice I use r_{i} rather than r_{i} – r_{avg}.

The last point above is intentional and another difference from “mean variance”, or rather “mean semi-variance”. If R is monotonically increasing across all samples (n intervals, n+1 data points), then SV is zero. I have many reasons for this choice. The primary reason is that, had I used r_{avg}, the SV of a steadily descending R would be zero. I don’t want a formula that rewards such performance with 0, the best possible SV score. [Others would substitute T, a usually positive number, as target return, sometimes called minimal acceptable return.]

Finally, a word about r_{i}: it is the total return over interval i. Intervals should be as uniform as possible. I tend to avoid daily intervals due to the non-uniformity introduced by weekends and holidays. Weekly (last closing price of the trading week), monthly (last closing price of the month), and quarterly intervals are significantly more uniform in duration.

Big Data and Heuristic Algorithms

Innovations in computing and algorithms are how semi-variance equations will change the world of finance. Common sense is why. I’ll explain why heuristic algorithms like Sigma1’s HAL0 can quickly find near-optimal SV solutions on a common desktop workstation, and even better solutions when leveraging a data center’s resources. And I’ll explain why SV is vastly superior to variance.

Computing SV for a single portfolio of 100 securities is easy on a modern desktop computer. For example, 3-year monthly semi-variance requires 3700 multiply-accumulate operations to compute the portfolio return series, R_{p}, followed by a mere 37 subtractions, 36 multiplies (for squaring), and 36 additions (plus a final multiply by 2/n). Any modern computer can perform this computation in the blink of an eye.
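For the curious, here is what that computation looks like in code. This is my own minimal numpy sketch of the modified SV formula, not HAL0’s implementation, and the return figures are randomly generated for illustration:

```python
import numpy as np

def modified_semivariance(r):
    """Modified SV: (2/n) * sum of r_i^2 over intervals where r_i < 0.
    Measured against zero rather than the mean, per the formula above."""
    r = np.asarray(r, dtype=float)
    downside = np.where(r < 0.0, r, 0.0)   # keep only negative-return intervals
    return (2.0 / r.size) * np.sum(downside ** 2)

# 36 monthly portfolio returns (hypothetical figures for illustration)
rng = np.random.default_rng(42)
monthly = rng.normal(0.008, 0.04, 36)
print(modified_semivariance(monthly))
```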

Now consider building a 100-security portfolio from scratch. Assume the portfolio is long-only and that any of these securities can have a weight between 0.1% and 90% in steps of 0.1%. Each security has 900 possible weightings. I’ll spare you the math — there are 6.385*10^{138} permutations. Needless to say, this problem cannot be solved by brute force. Further note that if the portfolio is turned into a long-short portfolio, where negative values down to -50% are allowed, the search space explodes to close to 10^{2000}.
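That permutation count can be sanity-checked in a couple of lines. This sketch assumes the 100 weights must sum to exactly 100% (1,000 units of 0.1%), which makes the count the number of compositions of 1000 into 100 positive parts; it ignores the 90% cap, which excludes only a negligible sliver of cases:

```python
from math import comb

# Compositions of 1000 into 100 positive parts: C(999, 99).
n_portfolios = comb(999, 99)
print(f"{n_portfolios:.3e}")  # ~6.4e138, consistent with the figure above
```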

I don’t care how big your data center is; a brute-force solution is never going to work. This is where heuristic algorithms come into play. Heuristic algorithms are a subset of metaheuristics. In essence, heuristic algorithms are algorithms that guide heuristics (or vice versa) to find approximate solutions to a complex problem. I prefer the term heuristic algorithm to describe HAL0 because in some cases it is hard to say whether a particular line of code is “algorithmic” or “heuristic”; sometimes the answer is both. For example, semi-variance is computed by an algorithm but is fundamentally a heuristic.

Heuristic algorithms (HAs) find practical solutions to problems that are too difficult to brute-force. They can be configured to look deeper or run faster, as the user desires. Smarter HAs take advantage of modern computing infrastructure by utilizing multiple threads, multiple cores, and multiple compute servers in parallel. Many, such as HAL0, can provide intermediate solutions as they run far and deep into the solution space.
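To make the idea concrete, here is a toy perturbation search of my own devising. It is not HAL0’s algorithm, just a minimal illustration of the approach: minimize portfolio semi-variance while penalizing any shortfall versus a target expected return, over long-only weights summing to 100%:

```python
import numpy as np

def semivariance(port_returns):
    downside = np.where(port_returns < 0.0, port_returns, 0.0)
    return (2.0 / port_returns.size) * np.sum(downside ** 2)

def heuristic_search(returns, target_return, iters=2000, seed=0):
    """Toy hill-climbing sketch (illustrative only, not HAL0):
    returns is an (intervals x assets) matrix of per-interval returns."""
    rng = np.random.default_rng(seed)
    n_assets = returns.shape[1]

    def cost(w):
        pr = returns @ w
        shortfall = max(0.0, target_return - pr.mean())
        return semivariance(pr) + 100.0 * shortfall  # stiff penalty term

    w = np.full(n_assets, 1.0 / n_assets)            # start from equal weights
    best = cost(w)
    for _ in range(iters):
        candidate = np.clip(w + rng.normal(0.0, 0.02, n_assets), 0.0, None)
        candidate /= candidate.sum()                 # re-normalize to 100%
        c = cost(candidate)
        if c < best:                                 # keep improvements only
            w, best = candidate, c
    return w, best
```

A production optimizer would be far smarter about how it explores the space; the point is only that approximate search makes an intractable enumeration problem tractable.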

Let me be blunt — If you’re using Microsoft Excel Solver for portfolio optimization, you’re missing out. Fly me out and let me bring my laptop loaded with HAL0 to crunch your data set — You’ll be glad you did.

Now For the Fun Part: Why switch to Semi-Variance?

Thanks for reading this far! Would you buy insurance that paid out only if your house didn’t burn down? Say you pay $500/year, and after 10 years, if your house is still standing, you get $6000; otherwise you get $0. Ludicrous, right? Or insurance that only “protects” your house from appreciation? Say it pays 50 cents for every dollar you make when you resell your house, but pays nothing if you lose money on the resale.

In essence, that is what you are doing when you buy (or create) a portfolio optimized for variance. Sure, variance analysis seeks to reduce the downs, but it also penalizes the ups (if they are too rapid). Run the numbers on any portfolio and you’ll see that SV ≠ V. All things being equal, portfolios with SV < V are the better bet. (Note that classic SV ≤ V, because it sums only a subset of the positive terms that make up V.)

Let me close with a real-world example. SPLV is an ETF I own. It is based on owning the 100 stocks out of the S&P 500 with the lowest 12-month volatility. It has performed well and been received well by the ETF marketplace, accumulating over $1.5 billion in AUM. A simple variant of SPLV (which could be called PLSV, for PowerShares Low Semi-Variance) would contain the 100 stocks with the least SV. An even better variant would contain the 100 stocks that, in aggregate, produced the lowest-SV portfolio over the preceding 12 months.

HAL0 has the power to construct such a portfolio. It could, while preserving the relative market-cap ratios of the 100 stocks, pick the 100 stocks that are collectively optimal. Or it could produce a re-weighted portfolio that further reduces overall semi-variance.

[Even more information on semi-variance (in its many related forms) can be found here.]

Almost every stock chart presents incomplete data on a security’s total return. Simply put, stock charts don’t reflect dividends and distributions; they show only price data. A handful of charts superimpose dividends over the price data. Such charts are an improvement, but they require mental gymnastics to correctly interpret total return.

At the end of the year, I suspect the vast majority of investors are much more interested in how much money they made than in whether their profits came from asset appreciation, dividends, interest, or other distributions. In the case of tax-deferred or tax-exempt accounts (such as IRAs, Roth IRAs, and 401k accounts) the source of returns is unimportant. Naturally, for other portfolios, some types of return are more tax-advantaged than others. In one case I tried to persuade a relative that MUB (iShares S&P National AMT-Free Muni Bd) was a good investment for them in spite of its chart, because the chart did not show the positive tax impact of tax-exempt income.

Our minds see what they want to see. When we compare two stocks (or ETFs), we often have a slight bias toward one. If we see what we want in a stock’s chart, we may look past the dividend annotations and make an incorrect decision.

This 1-year chart comparing two ETFs illustrates the point. The two ETFs track each other reasonably well until Dec 16th, when there is a sharp drop in PBP. That dip reflects a large distribution of roughly 10%. Judging strictly by price data, SPY at first appears to beat PBP by 7%. Factoring in PBP’s yield of about 10.1% versus SPY’s roughly 1.9%, however, shows a 1.2% 1-year outperformance by PBP. First appearances show SPY outperforming; a little math shows PBP outperforming.
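The arithmetic is easy to verify. The figures below are the ones quoted above, not freshly derived from market data:

```python
# Back-of-the-envelope check of the PBP vs SPY comparison.
spy_price_edge = 0.070            # SPY's apparent 7% lead on price alone
pbp_yield = 0.101                 # PBP distribution yield, as quoted
spy_yield = 0.019                 # SPY dividend yield, as quoted
pbp_total_edge = (pbp_yield - spy_yield) - spy_price_edge
print(round(pbp_total_edge, 3))   # ~0.012, i.e. PBP ahead by about 1.2%
```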

Yahoo! Finance provides raw data adjusted for dividends and distributions. Using the 1-year start and end data shows SPY returning a net 3.77% and PBP returning a net 4.96%; the delta is a 1.19% outperformance by PBP. Yahoo! Finance’s tables have all the right data; I would love to see Yahoo! add an option to display this adjusted-price data graphically.

Total return is not a new concept. Bill Gross was very insightful in naming PIMCO’s “Total Return” lineup of funds over 25 years ago. Many mutual funds provide total return charts. For instance, Vanguard provides total return charts for investments such as Vanguard Total Stock Market Index Fund Admiral Shares. I am pleased to see Fidelity offering similar charts for ETFs in research “performance” reports for its customers. Unfortunately, I have not found a convenient way to superimpose two total-return charts.

While traditional stock and ETF charts do not play a large role in my investment decisions, I do look at them when evaluating potential additions to my investment portfolio. When I do look at charts, I’d prefer to have the option of looking at total-return charts rather than “old fashioned” price charts.

That said, I prefer to use quantitative portfolio analysis as my primary asset allocation technology. For such analysis I compute total return data for each asset from price data and distribution data, assuming reinvestment. Reformatting asset data in this way allows HAL0 portfolio-optimization software to directly compare different asset classes (gold, commodities, stock ETFs, bond ETFs, leveraged ETFs, etc). Moreover, such pre-formatting allows faster computation of risk for various asset allocations within a portfolio.
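A minimal sketch of that pre-formatting step follows. The function name and conventions here are mine, for illustration only, not HAL0’s internals:

```python
import numpy as np

def total_return_series(prices, distributions):
    """Convert per-interval closing prices and per-share distributions into
    a growth-of-$1 total-return series, assuming each distribution is
    reinvested at that interval's closing price."""
    prices = np.asarray(prices, dtype=float)
    dist = np.asarray(distributions, dtype=float)
    shares = 1.0 / prices[0]                     # start with $1 worth of shares
    values = [1.0]
    for i in range(1, prices.size):
        shares += shares * dist[i] / prices[i]   # reinvest the distribution
        values.append(shares * prices[i])
    return np.array(values)
```

Feeding an optimizer these series instead of raw prices lets asset classes with very different payout profiles be compared on an equal footing.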

A large part of my vision for Sigma1 is revolutionizing how investors and money managers visualize and conceptualize portfolio construction. The key pieces of that conceptual revolution are:

Rethinking return to always mean total return.

Rethinking risk to mean something other than variance or standard deviation.

Many already think of total return as the key measure of raw portfolio performance. It is odd, then, that so many charts display something other than total return. And some would like to measure, manage, and model risk in more robust ways. A major obstacle to alternate risk measures is a dearth of financial portfolio optimization tools that work with PMPT models such as semi-variance.

HAL0 is designed from the ground up to address the goals of optimizing portfolios based on total return and a wide variety of advanced, more-robust risk models. (And, yes, total return can be defined in terms of after-tax total return, if desired.)

Disclosure: I have long positions in SPY, the Vanguard Total Stock Market Index, and PBP.