In order to get close to bare-metal access to your compute hardware, use C. In order to utilize powerful, tested convex-optimization methods, use CVXGEN. You can start with this CVXGEN code, but you’ll have to retool it…
Discard the (m,m) matrix for an (n,n) matrix. I prefer to still call it V, but Sigma is fine too. Just note that there is a major difference between Sigma (the variance-covariance matrix) and sigma (the individual asset-return variances; the diagonal of Sigma).
Go meta for the efficient frontier (EF). We’re going to iteratively generate/call CVXGEN with multiple scripts. The differences will be w.r.t. the E(Rp).
Computing Max E(Rp) is easy, given α. [I’d strongly recommend renaming this to something like expect_ret, comprised of (r1, r2, … rn). Alpha has too much overloaded meaning in finance.]
[Rmax] The first computation is simple. Maximize E(Rp) s.t. constraints. This is trivial and can be done w/o CVXGEN.
[Rmin] The first CVXGEN call is the simplest. Minimize σp² s.t. constraints, ignoring E(Rp).
Using Rmin and Rmax, iteratively call CVXGEN q times (i = 1 to q) using the additional constraint Rp_i = Rmin + (i/(q+1))*(Rmax − Rmin). This will produce q+2 portfolios on the EF [including Rmin and Rmax]. [Think of each step (1/(q+1))*(Rmax − Rmin) as a quantization of intermediate returns.]
Present, as you see fit, the following data…
(w0, w1, …wq+1)
[ E(Rp_0), …E(Rp_(q+1)) ]
[ σ(Rp_0), …σ(Rp_(q+1)) ]
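The whole sweep above can be sketched in a few lines. The sketch below stands in for the CVXGEN calls with a generic SLSQP solve (scipy) on made-up three-asset data, assuming a long-only, fully-invested constraint set; it is an illustration of the quantization idea, not the CVXGEN workflow itself:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical 3-asset inputs: expected returns and covariance matrix V.
expect_ret = np.array([0.06, 0.09, 0.12])
V = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])

def min_var_portfolio(target=None):
    """Minimize w'Vw s.t. sum(w)=1, w>=0, and optionally w'r = target."""
    cons = [{"type": "eq", "fun": lambda w: w.sum() - 1.0}]
    if target is not None:
        cons.append({"type": "eq", "fun": lambda w: w @ expect_ret - target})
    n = len(expect_ret)
    res = minimize(lambda w: w @ V @ w, np.full(n, 1.0 / n),
                   bounds=[(0.0, 1.0)] * n, constraints=cons, method="SLSQP")
    return res.x

# Rmax: trivially, all weight in the highest-return asset (long-only, fully invested).
Rmax = expect_ret.max()
# Rmin: the return of the global minimum-variance portfolio.
w_min = min_var_portfolio()
Rmin = w_min @ expect_ret

# q intermediate targets, quantized as Rmin + (i/(q+1))*(Rmax - Rmin).
q = 8
frontier = [w_min]
for i in range(1, q + 1):
    frontier.append(min_var_portfolio(Rmin + (i / (q + 1)) * (Rmax - Rmin)))
frontier.append(min_var_portfolio(Rmax))  # q+2 portfolios in total
```

From `frontier` you can read off the weight vectors, expected returns, and standard deviations listed above.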
My point is that — in two short blog posts — I’ve hopefully shown how easily-accessible advanced MVO portfolio optimization has become. In essence, you can do it for “free”… and stop paying for simple MVO optimization… so long as you “roll your own” in house.
I do this for the following reasons:
To spread MVO to the “masses”
To highlight that if “anyone with a master’s in finance and computer science can do MVO for free,” firms should consider their quantitative portfolio-optimization differentiation (AKA portfolio risk-management differentiation), if any
To emphasize that this and the previous blog will not greatly help with semi-variance portfolio optimization
I ask you to consider that you, as one of the few that read this blog, have a potential advantage. You know who to contact for advanced, relatively-inexpensive SVO software. Will you use that advantage?
The Equation Everyone in Finance Should Know, but Many Probably Don’t!
Here it is:
… With thanks to codecogs.com which makes it really easy to write equations for the web.
This simple matrix equation is extremely powerful. It is really two equations. The first is all you really need; the second is merely there for illustrative purposes.
This formula says how the variance of a portfolio can be computed from the position weights wT = [w1 w2 … wn] and the covariance matrix V.
σii ≡ σi2 = Var(Ri)
σij ≡ Cov(Ri, Rj) for i ≠ j
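The first equation is trivial to compute once V is in hand. Here is a minimal numpy sketch using hypothetical two-asset numbers (10% and 20% volatilities, 0.3 correlation), cross-checked against the expanded two-asset form:

```python
import numpy as np

# Hypothetical two-asset example: sigma_1 = 10%, sigma_2 = 20%, correlation 0.3.
s = np.array([0.10, 0.20])
rho = 0.3
V = np.array([[s[0]**2,         rho * s[0] * s[1]],
              [rho * s[0] * s[1], s[1]**2        ]])

w = np.array([0.6, 0.4])
port_var = w @ V @ w            # sigma_p^2 = w' V w
port_sigma = np.sqrt(port_var)

# Cross-check against the expanded two-asset form:
expanded = (w[0]**2) * s[0]**2 + (w[1]**2) * s[1]**2 \
         + 2 * w[0] * w[1] * rho * s[0] * s[1]
```

The `w @ V @ w` line is the whole story; it works unchanged for n assets.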
The second equation is actually rather limiting. It represents the smallest possible example to clarify the first equation — a two-asset portfolio. Once you understand it for 2 assets, it is relatively easy to extrapolate to 3-asset portfolios, 4-asset portfolios, and before you know it, n-asset portfolios.
Now I show the truly powerful “naked” general form equation:
This is really all you need to know! It works for 50-asset portfolios. For 100 assets. For 1000. You get the point. It works in general. And it is exact. It is the E = mc2 of Modern Portfolio Theory (MPT). It is at least about 55 years old (2014 – 1959), while E = mc2 is about 109 years old (2014 – 1905). Harry Markowitz, the Father of (M)PT, simply called it “Portfolio Theory” because:
There’s nothing modern about it.
Yes, I’m calling Markowitz the Einstein of Portfolio Theory AND of finance! (Now there are several other “post”-Einstein geniuses… Bohr, Heisenberg, Feynman… just as there are Sharpe, Scholes, Black, Merton, Fama, French, Shiller, [Graham?, Buffett?]…) I’m saying that a physicist who doesn’t know E = mc2 is not much of a physicist. You can read between the lines for what I’m saying about those that dabble in portfolio theory… with other people’s money… without really knowing (or using) the financial analog.
Why Markowitz is Still “The Einstein” of Finance (Even if He was “Wrong”)
Markowitz said that “downside semi-variance” would be better. Sharpe said “In light of the formidable computational problems…[he] bases his analysis on the variance and standard deviation.”
Today we have no such excuse. We have more than sufficient computational power on our laptops to optimize for downside semi-variance, σd. There is no such tidy, efficient equation for downside semi-variance. (At least not one that anyone can agree on… and none that is exact in any sense of any reasonable mathematical definition of the word ‘exact’.)
Fama and French improve upon Markowitz (M)PT. [I say that if M is used in MPT, it should mean “Markowitz,” not “modern”, but I digress.] Shiller, however, decimates it. As does Buffett, in his own applied way. I use the word decimate in its strict sense… killing one in ten. (M)PT is not dead; it is still useful. Diversification still works; rational investors are still risk-averse; and certain low-beta investments (bonds, gold, commodities…) are still poor very-long-term (20+ year) investments in isolation and relative to stocks, though they can still serve a role as Markowitz Portfolio Theory suggests.
Wanna Build your Own Optimizer (for Mean-Return Variance)?
This blog post tells you most of the important bits. I don’t really need to write part 2, do I? Not if you can answer these relatively easy questions…
What is the matrix expression for computing E(Rp) based on w?
What simple constraint is w subject to?
How does the general σp2 equation relate to the efficient frontier?
How might you adapt the general equation to efficiently compute the effects of a Δw event where wi increases and wj decreases? (Hint: “cache” the wx terms that don’t change.)
What other constraints may be imposed on w or subsets (asset categories within w)? How will you efficiently deal with these constraints?
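On the Δw question, here is one way the “caching” hint can play out (a sketch, not the only answer): precompute the product Vw once, and the variance change for shifting δ of weight from asset j to asset i becomes an O(1) update instead of an O(n²) recompute:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
A = rng.normal(size=(n, n))
V = A @ A.T / n               # a random positive-definite covariance matrix
w = np.full(n, 1.0 / n)       # start from equal weights

Vw = V @ w                    # cache this: the expensive O(n^2) product
var0 = w @ Vw                 # current portfolio variance

# Shift delta of weight from asset j to asset i:
i, j, delta = 3, 7, 0.005
var_new = (var0
           + 2 * delta * (Vw[i] - Vw[j])
           + delta**2 * (V[i, i] + V[j, j] - 2 * V[i, j]))  # O(1) update

# Full O(n^2) recompute, for verification only:
w2 = w.copy(); w2[i] += delta; w2[j] -= delta
assert abs(var_new - w2 @ V @ w2) < 1e-12
```

The identity follows from expanding (w + δ(eᵢ − eⱼ))ᵀV(w + δ(eᵢ − eⱼ)); only the cached vector Vw needs an O(n) touch-up after the move is accepted.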
The best models are not the models that fit past data the best, they are the models that predict new data the best. This seems obvious, but a surprising number of business and financial decisions are based on best-fit of past data, with no idea of how well they are expected to correctly model future data.
Instant Profit, or Too Good to be True?
For instance, a stock analyst reports to you that she has a secret recipe to make 70% annualized returns by simply trading KO (The Coca-Cola Company). The analyst’s model tells you what FOK limit price, y, to buy KO stock at each market open. The stock is then always sold with a market order at the end of each trading day.
The analyst tells you that her model is based on three years of trading data for KO, PEP, the S&P 500 index, and aluminum and corn spot prices. Specifically, the analyst’s model uses closing data for the two preceding days, thus the model has 10 inputs. Back testing of the model shows that it would have produced 70% annualized returns over the past three years, or a whopping 391% total return over that time period. Moreover, the analyst points out that over 756 trading days, 217 trades would have been executed, resulting in a profit 73% of the time (that the stock is bought).
The analyst, Debra, says that the trading algorithm is already coded, and U.S. markets open in 20 minutes. Instant profit is only moments away with a simple “yes.” What do you do with this information?
Choices, Chances, Risks and Rewards
You know this analyst, and she has made your firm’s clients and proprietary trading desks a lot of money. However, you also know that, while she is thorough and meticulous, she is also bold and aggressive. You decide that caution is called for, and allocate a modest $500,000 to the KO trading experiment. If after three months the KO experiment nets at least 7% profit, you’ll raise the risk pool to $2,000,000. If, after another three months, the KO-experiment generates at least 7% again, you’ll raise the risk pool to $10,000,000, as well as letting your firm’s best clients in on the action.
Three months pass, and the KO-experiment produces good results: 17 trades, 13 winners, and a 10.3% net profit. You OK raising the risk pool to $2,000,000. After only 2 months the KO-experiment has executed 13 trades, with 10 winners, and an 11.4% net profit. There is a buzz around the office about the “knock-out cola trade”, and brokers are itching to get in on it with client funds. You are considering giving the green light to the “Full Monty,” when Stan the Statistician walks into your office.
Stan’s title is “Risk Manager”, but people around the office call him Stan the Statistician, or Stan the Stats Man, or worse (e.g. “Who is the SS going to s*** on today?”) He’s actually a nice guy, but most folks consider him an interloper. And Stan seems to have clout with corporate, and he has been known to use it to shut down trades. You actually like Stan, but you already know why he is stopping by.
Stan begins probing about the KO-trade. He asks what you know. You respond that Debra told you that the model has an R-squared of 0.92 based on 756 days of back-tested data. “And now?” asks Stan. You answer, “a 76% success rate, and profits of around 21% in 5 months.” And then Stan asks, “What is the probability that that profit is essentially due to pure chance?”
You know that the S&P 500 historically has over 53% “up” days; call it 54% to be conservative. So stocks should follow suit. To get exactly 23 wins on KO out of 30 tries is C(30, 23)*0.54^23*(0.46)^7 = 0.62%. To get at least 23 wins (23 or more) brings the percentage up to about 0.91%. So you say 1/0.0091, or about one in 110.
Stan says, “Your math is right, but your conclusion is wrong. For one thing, KO is up 28% over the period, and has had 69% up days over that time.” You interject, “Okay, wait one second… so my math now says about 23%, or about a 1 in 4.3 chance.”
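Both numbers are straightforward to reproduce with the binomial distribution. A quick sketch in pure Python (no special libraries; `math.comb` needs Python 3.8+):

```python
from math import comb

def tail_prob(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of at least k wins in n tries."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

# 23 or more wins out of 30, assuming a 54% base rate of "up" days:
p_market = tail_prob(30, 23, 0.54)   # ~0.9%, i.e. about one in 110

# Same tail, but using KO's actual 69% up-day rate over the period:
p_ko = tail_prob(30, 23, 0.69)       # ~23%, i.e. roughly one in 4.3
```

Swapping in the right null hypothesis (69% instead of 54%) is exactly the adjustment Stan is pointing at.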
Stan smiles, “You are getting much closer to the heart of the matter. I’ve gone over Debra’s original analysis, and have made some adjustments. My revised analysis shows that there is a reasonable chance that her model captures some predictive insight that provides positive alpha.” Stan’s expression turns more neutral, “However, the confidence intervals against the simple null hypothesis are not as high as I’d like to see for a big risk allocation.”
Getting all Mathy? Feedback Requested!
Do you want to hear more from “Stan”? He is ready to talk about adjusted R-squared, block-wise cross-validation, and data over-fitting. And why Debra’s analysis, while correct, was also incomplete. Please let me know if you are interested in hearing more on this topic.
Please let me know if I have made any math errors yet (other than the overtly deliberate ones). I love to be corrected, because I want to make Sigma1 content as useful and accurate as possible.
Explaining technical investment concepts in a non-technical way is critical to having a meaningful dialog with individual investors. Most individual investors (also called “retail investors”, or “small investors”) do not have the time nor the desire to learn the jargon and concepts behind building a solid investment portfolio. This is generally true for most individual investors regardless of the size of their investment portfolios. Individual investors expect investment professionals (also called “institutional investors”) to help manage their portfolios and explain the major investment decisions behind the management of their individual portfolios.
In the same way that a good doctor helps her patient make informed medical decisions, a good investment adviser helps her clients make informed investment decisions.
I get routinely asked how the HALO Portfolio Optimizer works. Every time I answer that question, I face two risks: 1) that I don’t provide enough information to convince the investment professional or their clients that HALO optimization provides significant value and risk-mitigation capability, and 2) that I share key intellectual property (IP) unique to the Sigma1 Financial HALO optimizer.
This post is my best effort to provide both investment advisers and their clients with enough information to evaluate and understand HALO optimization, while avoiding sharing key Sigma1 trade secrets and intellectual property. I would very much appreciate feedback, both positive and negative, as to whether I have achieved these goals.
First Principle of Portfolio Optimization Software
Once when J.P. Morgan was asked what the market would do, he answered “It will fluctuate.” While some might find this answer rather flippant, I find it extremely insightful. It turns out that so-called modern portfolio theory (MPT) is based on understanding (or quantifying) market fluctuations. MPT labels these fluctuations as “risk” and identifies “return” as the reward that a rational investor is willing to accept for a given amount of risk. MPT assumes that a rational investor, or his/her investment adviser, will diversify away most or all “diversifiable risk” by creating a suitable investment portfolio tailored to the investor’s current “risk tolerance.”
In other words, the primary job of the investment adviser (in a “fiduciary” role) is to maximize investment portfolio return for a client’s acceptable risk. Said yet another way, the job is to maximize the reward/risk ratio for the client, without incurring excess risk.
Now for the first principle: past asset “risk” tends to indicate future asset “risk”. In general, an asset that has been previously more volatile will tend to remain more volatile, and an asset that has been less volatile will tend to remain less volatile. Commonly, both academia and professional investors have equated volatility with risk.
Second Principle of Portfolio Optimization Software
The Second Principle is closely related to the first. The idea is that past portfolio volatility tends to indicate future portfolio volatility. This thesis is so prevalent that it is almost inherently assumed. This is evidenced by search results that reach beyond volatility and look at the hysteresis of return-versus-volatility ratios, in papers such as this.
Past Performance is Not Necessarily Indicative of Future Results.
Third Principle of Portfolio Optimization Software
The benefits of diversification are manifest in risk mitigation. If two assets are imperfectly correlated, then their combined volatility (risk) will be less than the weighted average of their individual volatilities. An in-depth mathematical description of two-asset portfolio volatilities can be found on William Sharpe’s web page. Two-asset mean-variance optimization is relatively simple, and can be performed with relatively few floating-point operations on a computer. This process creates the two-asset efficient frontier*. As more assets are added to the mix, the computational demand to find the optimal efficient frontier grows geometrically; if you don’t immediately see why, look at page 8 of this paper.
A much simpler explanation of the third principle is as follows. If asset A has an annual standard deviation of 10%, asset B an annual standard deviation of 20%, and A and B are not perfectly correlated, then the portfolio of one half invested in A and the other half invested in B will have an annual standard deviation of less than 15%. (Non-perfectly correlated means a correlation of less than 1.0.) Some example correlations of assets can be found here.
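Plugging those numbers into the two-asset volatility formula shows the effect directly (the 0.5 correlation below is an arbitrary illustrative choice; any value below 1.0 produces the same qualitative result):

```python
import math

sigma_a, sigma_b = 0.10, 0.20   # 10% and 20% annual standard deviations
w_a, w_b = 0.5, 0.5             # half in A, half in B
rho = 0.5                       # assumed correlation, anything < 1.0 works

port_var = (w_a * sigma_a)**2 + (w_b * sigma_b)**2 \
         + 2 * w_a * w_b * rho * sigma_a * sigma_b
port_sigma = math.sqrt(port_var)        # ~13.2% here

weighted_avg = w_a * sigma_a + w_b * sigma_b   # 15%, the no-diversification figure
```

With ρ = 0.5 the portfolio volatility comes out near 13.2%, comfortably below the 15% weighted average; only at ρ = 1.0 do the two coincide.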
In so-called plain English, the Third Principle of Portfolio Optimization can be stated: “For a given level of expected return, portfolio optimization software can reduce portfolio risk by utilizing the fact that different assets move somewhat independently from each other.”
Fourth Principle of Portfolio Optimization Software
The Fourth Principle of Portfolio Optimization establishes a relationship between risk and return. The classic assumption of modern portfolio theory (MPT) is that so-called systematic risk is rewarded (over a long-enough time horizon) with increased returns. Portfolio-optimization software seeks to reduce or eliminate unsystematic risk when creating an optimized set of portfolios. From this “best-in-breed” list created by the optimization software, the portfolio manager can then select the optimized portfolio best suited to his/her client’s needs.
Fifth Principle of Portfolio Optimization Software
The 5th Principle is that the portfolio manager and his team add value to the portfolio composition process by 1) selecting a robust mix of assets, 2) applying constraints to the weights of said assets and asset-groups, and 3) assigning expected returns to each asset. The 5th Principle focuses on the assignment of expected returns. This process can be grouped under the category of investment analysis or investment research. Investment firms pay good money for either in-house or contracted investment analysis of selected securities.
Applying the Portfolio Optimization Principles Together
Sigma1 Financial HALO Software applies these five principles together to help portfolio managers improve or fine-tune their proprietary-trading and/or client investment portfolios. HALO Portfolio Optimization software utilizes the assets, constraints, and expected returns from the 5th Principle as a starting point. It then uses the 4th Principle by optimizing away unsystematic risk from a set of portfolios by taking maximum advantage of varying degrees of non-correlation of the portfolio assets. The 3rd Principle alludes to the computational difficulty of solving the multi-asset optimization problem. Principles 1 and 2 form the bedrock of the concepts behind the use of historical correlation data to predict and estimate future correlations.
The Fine Print
Past asset volatility of most assets and most portfolios is historically well correlated with future volatility. However, not only are assets increasingly correlated, there is some evidence that asset correlations tend to increase during times of financial crisis. Even if assets are more correlated, there remains significant value in exploiting partial-discorrelation.
(*) The two-asset model can be represented as two parametric functions of a single variable, “t”: ER(t) and var(t). t simply represents the investment proportion invested in asset 0 (aka asset A). For three assets, expected return becomes ER(t0,t1), as does var(t0,t1). And so on for increasing numbers of assets. The computational effort required to compute ER(t0…tn) scales linearly with the number of assets, but var(t0…tn) scales geometrically.
Optimizing efficiently within this complex space benefits from creative algorithms and heuristics.
Setting up a basic HALO optimization requires a list of asset tickers, their min and max constraints, and expected returns. Also, at least one user-specified category designation is required. Below is a short example:
Generally, it is advisable to keep the sum of the individual asset minimums below 50%, and the sum of maximums above 200%. This provides the HALO optimizer the freedom to create a wide range of optimized portfolios with different risk/reward trade offs.
The above example is a very basic configuration. In order for asset managers to specify asset-class constraints, it is necessary to tell the optimizer that the “string” is a user-defined category. Currently this is done with a leading asterisk (*):
The above config specifies that Equities must comprise a minimum of 25% of the investment portfolio and a maximum of 85%. As with the individual asset constraints, it is advised to provide reasonably wide latitude to the optimization algorithms to produce a diverse set of optimized portfolios.
By default, the HALO Optimizer will produce a set of portfolios optimized to:
1) minimize risk, as measured by either:
a) semi-variance, σd (the default)
b) –OR– annualized standard deviation of total return, σ
2) maximize expected return, E(R)
The default time series used for computing σ and σd is end-of-month total-return deltas for the previous 36 months. (This requires 37 months of total-return data for each security.) The time period can be customized to use, say, 60 months’ worth of data in the analysis. HALO also supports using weekly closing data or even daily closing data — however I generally recommend using monthly data for a variety of reasons. First, it speeds the computation. Second, monthly data captures multi-day and multi-week trends, correlations, and specifically low-correlation asset optimization. Third, monthly data is closer to the sampling period of a “typical” high-net-worth retail investor. [That said, a case could be made for using quarterly data — which is also supported.]
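For readers who want to see the mechanics, here is a sketch of that default computation on synthetic data. Note that HALO’s exact “modified” semivariance definition is not public; the downside measure below (root-mean-square of monthly returns below their mean, annualized) is just one common convention, used here for illustration:

```python
import numpy as np

# 37 months of hypothetical total-return price data.
rng = np.random.default_rng(42)
closes = 100 * np.cumprod(1 + rng.normal(0.006, 0.04, size=37))

monthly = closes[1:] / closes[:-1] - 1        # 36 end-of-month total-return deltas
sigma = monthly.std(ddof=1) * np.sqrt(12)     # annualized standard deviation

# One common semideviation convention: RMS of below-mean returns, annualized.
downside = np.minimum(monthly - monthly.mean(), 0.0)
sigma_d = np.sqrt((downside**2).mean()) * np.sqrt(12)
```

Because σd only counts the below-mean months, it is never larger than σ for the same series; the gap between the two is what a semivariance-aware optimizer exploits.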
Frequently HALO clients want to model newer securities that do not have 37 months of historical data. For example, min-volatility ETFs such as SPLV, USMV, and EEMV are popular ETFs that are less than 3 years old. The HALO software suite has utilities that can statistically backfill the missing data. The configuration of the statistical backfill process is beyond the scope of this blog post, however it is an important and popular HALO Optimization Suite capability that so far has been used by all of Sigma1’s clients and beta testers.
Occasionally, Sigma1 clients and beta testers have had in-house funds that do not externally report their price or total return data. For in-house funds, HALO can read client-supplied total-return data. Naturally, HALO can include stocks, bonds, commodities, futures, and other assets with historical data into the portfolio optimization mix.
In today’s near-zero interest rate economy, the reward versus risk of an investment portfolio can be measured using the Sharpe ratio. Like a batting average, higher numbers are better, and 0.400 is very good.
If portfolio Z has a forward-looking Sharpe ratio of 0.400, and an expected return of 8%, there is a 68% chance its 1-year return will be between -12% and +28%.
The math is surprisingly easy. Because the Sharpe ratio is a return/risk ratio it can be transformed into a risk/return ratio by finding its inverse (using the “1/x” button on a calculator). The inverse of 0.400 is 2.5. The return is 8%, so the “risk” is 2.5 times 8% which is 20%.
For the Sharpe ratio, the downside risk and the upside “risk” are the same. So the downside is 8% -20%, or -12%. The upside risk is 8%+20%, or 28%. Easy!
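The whole calculation fits in a few lines (taking the risk-free rate as roughly zero, as the near-zero-rate framing above does):

```python
expected_return = 0.08   # 8% expected return
sharpe = 0.400           # forward-looking Sharpe ratio

# Invert the return/risk ratio to get risk as a multiple of return:
risk = expected_return / sharpe          # 2.5 x 8% = 20%

# One-sigma (about 68% confidence) range of 1-year outcomes:
low = expected_return - risk             # -12%
high = expected_return + risk            # +28%
```

Doubling `risk` gives the two-sigma (95%) range of -32% to +48%, exactly as described below.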
Sharpe Ratios and Risk (more detail)
Where did the “68% chance” come from? The answer is a bit more complicated, but still fairly easy to understand.
It comes from the 3-sigma rule of statistics. The range of -12% to +28% comes from one standard deviation from the mean (plus or minus one sigma). The 3-sigma rule also says that 95% of outcomes will fall within two standard deviations. Double the deviation means two times the upside and downside risk, so the 95% confidence range becomes -32% to +48%. Finally, the 3-sigma rule says that outcomes within triple the upside and downside risk, from -52% to +68%, will occur 99.7 percent of the time.
Almost every investor will be pleased with a positive sigma event, where the return is above 8%. For example, a +1 sigma (+1σ) occurrence has a +28% return — quite nice.
A downside event is potentially quite troublesome. Even a -1σ event means a 12% loss. A -2σ is a much worse 32% loss.
Ex Ante and Ex Post Sharpe Ratios
Forward-looking (ex ante) Sharpe ratios are predictions “prior to the event(s)”. They are always positive, because no rational investor would invest in a negative expected return. The assumptions baked into ex ante Sharpe ratio predictions are 1) the expected standard deviation of total return, σ, and 2) the expected future return.
Backward-looking, or after the fact, (ex post) Sharpe ratios can be negative or positive. In fact, assuming “normal distributions of return”, there is a reasonable (but less than 50%) chance of a negative ex post Sharpe ratio.
Sigma1 HAL0 software optimizes for Sharpe ratios by optimizing for return and standard deviation. It also optimizes for semivariance. More “plain English” on that advantage later.
This marks the first month (30 days) of engagement with beta financial partners. The goal is to test Sigma1 HAL0 portfolio-optimization software on real investment portfolios and get feedback from financial professionals. The beta period is free. Beta users provide tickers and expected-returns estimates via email, and Sigma1 provides portfolio results back with the best Sharpe, Sortino, or Sharpe/Sortino hybrid ratio results.
HAL0 portfolio-optimization software provides a set of optimized portfolios, often 40 to 100 “optimal” portfolios, optimized for expected return, return-variance and return-semivariance. “Generic” portfolios containing a sufficiently-diverse set of ETFs produce similar-looking graphs. A portfolio set containing SPY, VTI, BND, EFA, and BWX is sufficient to produce a prototypical graph. The contour lines on the graph clearly show a tradeoff between semi-variance and variance.
Once the set of optimized portfolios has been generated the user can select the “best” portfolio based on their selection criteria.
So far I have learned that many financial advisers and fund managers are aware of post-modern portfolio theory (PMPT) measures such as semivariance, but also a bit wary of them. At the same time, some I have spoken with acknowledge that semivariance and parts of PMPT are the likely future of investing. Portfolio managers want to be equipped for the day when one of their big investors asks, “What is the Sortino ratio of my portfolio? Can you reduce the semi-variance of my portfolio?”
I was surprised to hear that all of Sigma1 beta partners are interested exclusively in a web-based interface. This preliminary finding is encouraging because it aligns with a business model that protects Sigma1 IP from unsanctioned copying and reverse-engineering.
Another surprise has been the sizes of the asset sets supplied, ranging from 30 to 50 assets. Prior to software beta, I put significant effort into ensuring that HAL0 optimization could handle 500+ asset portfolios. My goal, which I achieved, was high-quality optimization of 500 assets in one hour and overnight deep-dive optimization (adding 8-10 basis points of additional expected-return for a given variance/semi-variance). On the portfolio assets provided to-date, deep-dive runtimes have all been under 5 minutes.
The beta-testing phase has provided me with a prioritized list of software improvements. #1 is per-asset weighting limits. #2 is an easy-to-use web interface. #3 is focused optimization, such as the ability to set max variance. There have also been company-specific requests that I will strive to implement as time permits.
Financial professionals (financial advisers, wealth managers, fund managers, proprietary trade managers, risk managers, etc.) seem inclined to want to optimize and analyze risk in both old ways (mean-return variance) and new (historic worst-year loss, VAR measures, tail risk, portfolio stress tests, semivariance, etc.).
Some Sigma1 beta partners have been hesitant to provide proprietary risk measure algorithms. These partners prefer to use built-in Sigma1 optimizations, receive the resulting portfolios, and perform their own in-house analysis of risk. The downside of this is that I cannot optimize directly to proprietary risk measures. The upside is that I can further refine the HAL0 algos to solve more universal portfolio-optimization problems. Even indirect feedback is helpful.
Portfolio and fund managers are generally happy with mean-return variance optimization, but are concerned that semivariance-return measures are reasonably likely to change the financial industry in the coming years. Luckily the Sharpe ratio and Sortino ratio differ only by the denominator (σp versus σd). By normalizing the definitions of volatility (currently called modified-return variance and modified-return semivariance), HAL0 software optimizes simultaneously for both (modified) Sharpe and Sortino ratios, or any Sharpe/Sortino hybrid ratio in between. A variance-focused investor can use a 100% variance-optimized portfolio. An investor wanting to dabble with semi-variance can explore portfolios with, say, a 70%/30% Sharpe/Sortino ratio. And an investor, fairly bullish on semivariance minimization, could use a 20%/80% Sharpe/Sortino hybrid ratio.
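The actual HAL0 normalization is proprietary, but one plausible way to interpolate between the two ratios is to blend the denominators linearly. The function below is purely illustrative — the weighting scheme, the function name, and the sample numbers are all my own assumptions:

```python
def hybrid_ratio(excess_return, sigma_p, sigma_d, sharpe_weight):
    """Hypothetical Sharpe/Sortino hybrid: linearly blend the two risk denominators.

    sharpe_weight = 1.0 gives a pure Sharpe ratio; 0.0 gives a pure Sortino ratio.
    (This is NOT the HAL0 'modified' normalization, which is not public.)
    """
    denom = sharpe_weight * sigma_p + (1.0 - sharpe_weight) * sigma_d
    return excess_return / denom

# Illustrative numbers: 8% excess return, sigma_p = 20%, sigma_d = 14%.
sharpe  = hybrid_ratio(0.08, 0.20, 0.14, 1.0)   # pure Sharpe
sortino = hybrid_ratio(0.08, 0.20, 0.14, 0.0)   # pure Sortino
blended = hybrid_ratio(0.08, 0.20, 0.14, 0.7)   # a 70%/30% Sharpe/Sortino hybrid
```

Because σd ≤ σp, any blend lands between the pure Sharpe and pure Sortino values, which is what makes a smooth dial between the two regimes possible.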
I am very thankful to investment managers and other financial pros who are taking the time to explore the capabilities of HAL0 portfolio-optimization software. I am hopeful that, over time, I can persuade some beta partners to become clients as HAL0 software evolves and improves. In other cases I hope to provide Sigma1 partners with new ideas and perspectives on portfolio optimization and risk analysis. Even in one short month, every partner has helped HAL0 software become better in a variety of ways.
Sigma1 is interested in taking on 1 or 2 additional investment professionals as beta partners. If interested please submit a brief request for info on our contact page.
In my last post I showed that there are far more than a googol permutations of a portfolio of 100 assets with (positive, non-zero) weights in increments of 10 basis points, or 0.1%. That number can be expressed as C(999,99), or C(999,900), or 999!/(99!*900!), or ~6.385×10^138. Out of sheer audacity, I will call this number Balhiser’s first constant (Kβ1). [Wouldn’t it be ironic and embarrassing if my math was incorrect?]
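Python’s arbitrary-precision integers make the claim easy to check (`math.comb` requires Python 3.8+):

```python
from math import comb

# Positive weights in 0.1% increments summing to 100%: distribute 1000
# increments among 100 assets, each getting at least one (stars and bars),
# which gives C(999, 99) arrangements.
k_beta1 = comb(999, 99)

googol = 10**100
assert k_beta1 > googol          # far more than a googol
# k_beta1 is on the order of 6.4e138, matching the figure above.
```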
In the spirit of Alan Turing’s 100th birthday today and David Hilbert’s 23 unsolved problems of 1900, I propose the creation of an initial set of financial problems to rate the general effectiveness of various portfolio-optimization algorithms. These problems would be of a similar form: each having a search space of Kβ1. There would be 23 initial problems, P1…P23. Each would have a series of 37 monthly absolute returns. Each security would have an expected annualized 3-year return (some based on the historic 37-month returns, others independent). The challenge for any algorithm A is to score the best average score on these problems.
I propose the following scoring measures: 1) S″(A) (S double prime), which simply computes the least average semi-variance portfolio independent of expected return. 2) S′(A), which computes the best average semi-variance and expected-return efficient frontier versus a baseline frontier. 3) S(A), which computes the best average semi-variance, variance, and expected-return efficient frontier surface versus a baseline surface. Any algorithm would be disqualified if any single test took longer than 10 minutes. Similarly, any algorithm would be disqualified if it failed to produce a “sufficient solution density and breadth” for S′ and S″ on any test. Obviously, a standard benchmark computer would be required. Any OS, supporting software, etc. could be used for purposes of benchmarking.
The benchmark computer would likely be a well-equipped multi-core system such as a 32 GB Intel i7-3770 system. There could be separate benchmarks for parallel computing, where the algorithm + hardware was tested as holistic system.
I propose these initial portfolio benchmarks for a variety of reasons. 1) Similar standardized benchmarks have been very helpful in evaluating and improving algorithms in other fields such as electrical engineering. 2) Providing a standard that helps separate statistically significant from anecdotal inference. 3) Illustrate both the challenge and the opportunity for financial algorithms to solve important investing problems. 4) Lowering barriers to entry for financial algorithm developers (and thus lowering the cost of high-quality algorithms to financial businesses). 5) I believe HAL0 can provide superior results.
Two mathematical equations have transformed the world of modern finance. The first was CAPM, the second Black-Scholes. CAPM gave a new perspective on portfolio construction. Black-Scholes gave insight into pricing options and other derivatives. There have been many other advancements in the field of financial optimization, such as Fama-French — but CAPM and Black-Scholes-Merton stand out as perhaps the two most influential.
When CAPM (and MPT) were invented, computers existed, but were very limited. Though Harry Markowitz, the father of MPT (the foundation on which CAPM builds), wanted to use semi-variance, the computers of 1959 were simply inadequate. So Markowitz used variance in his groundbreaking book “Portfolio Selection: Efficient Diversification of Investments”.
Choosing variance over semi-variance made the computations orders of magnitude easier, but they were still very taxing to the computers of 1959. Classic covariance-based optimizations are still reasonably compute-intensive when a large number of assets are considered. Classic optimization of a 2000-asset portfolio starts by creating a covariance matrix with 2,001,000 unique entries (which, when mirrored about the shared diagonal, fill a 4,000,000-entry matrix); that is the easy part. The hard part involves optimizing (minimizing) portfolio variance for a range of expected returns. This is often referred to as computing the efficient frontier.
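To make the bookkeeping above concrete, here is a minimal pure-Python sketch (toy data, illustrative only): each covariance entry is a sum over the return history, and symmetry means only the diagonal plus one triangle, n(n+1)/2 entries, ever needs computing.

```python
# Toy return history: 3 assets over 4 intervals (rows = intervals).
R = [[0.01, 0.02, -0.01],
     [0.03, -0.02, 0.00],
     [-0.01, 0.01, 0.02],
     [0.02, 0.00, 0.01]]

n_periods, n_assets = len(R), len(R[0])
means = [sum(col) / n_periods for col in zip(*R)]

def cov(i, j):
    """Sample covariance of assets i and j."""
    return sum((R[t][i] - means[i]) * (R[t][j] - means[j])
               for t in range(n_periods)) / (n_periods - 1)

# Symmetry: cov(i, j) == cov(j, i), so only one triangle plus the
# diagonal needs to be computed and stored: n*(n+1)/2 unique entries.
assert cov(0, 1) == cov(1, 0)
print(2000 * 2001 // 2)  # 2001000 unique entries for 2000 assets
```

The same symmetry argument is why the hard part is not building the matrix but searching over weights with it.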
The concept of semi-variance (SV) is very similar to the variance used in MPT. The difference is in the computation. A quick internet search reveals very little data about computing a “semi-covariance matrix”. Such a matrix, if it existed in the right form, could possibly allow quick and precise computation of portfolio semi-variance in the same way that a covariance matrix does for portfolio variance. Semi-covariance matrices (SVMs) exist, but none “in the right form.” Each form of SVM has strengths and weaknesses. Thus, one of the many problems with semi-covariance matrices is that there is no unique canonical form for a given data set. SVMs of different types each capture only an incomplete portion of the information needed for semi-variance optimization.
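To make the “no canonical form” point concrete, here is a sketch of one published exogenous form (often attributed to Javier Estrada); the function name and threshold parameter are mine. Note why it is incomplete: it fixes which intervals count as “downside” per asset, whereas true portfolio semi-variance depends on which intervals the portfolio as a whole was down.

```python
def estrada_semicov(returns, threshold=0.0):
    """One exogenous semi-covariance form:
    S[i][j] = (1/T) * sum_t min(r_ti - B, 0) * min(r_tj - B, 0),
    where B is a threshold (benchmark) return.
    Symmetric and positive semi-definite, but only an approximation:
    'downside' is decided per asset rather than per portfolio."""
    T = len(returns)          # number of intervals (rows)
    n = len(returns[0])       # number of assets (columns)
    d = [[min(returns[t][i] - threshold, 0.0) for i in range(n)]
         for t in range(T)]
    return [[sum(d[t][i] * d[t][j] for t in range(T)) / T
             for j in range(n)] for i in range(n)]

# Portfolio semi-variance is then approximated by w' S w.
S = estrada_semicov([[0.02, -0.01], [-0.03, 0.01], [0.01, 0.02]])
assert S[0][1] == S[1][0]  # symmetric, like a covariance matrix
```

Other published forms make different trade-offs, which is exactly why no single SVM captures everything needed for optimization.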
The beauty of SV is that it measures “downside risk”, exclusively. Variance includes the odd concept of “upside risk” and penalizes investments for it. While not going to the extreme of rewarding upside “risk”, the modified semi-variance formula presented in this blog post simply disregards it.
I’m sure most of the readers of this blog understand this modified semi-variance formula: SV = (2/n) Σ (r_i < 0 ? r_i² : 0), summed over i = 1 to n. Please indulge me while I touch on some of the finer points. First, the 2 may look a bit out of place. The 2 simply normalizes the value of SV relative to variance (V). Second, the “question mark, colon” notation simply means: if the first statement (r_i < 0) is true, use the squared value in the summation, else use zero. Third, notice I use r_i rather than r_i − r_avg.
The last point above is intentional and another difference from “mean variance”, or rather “mean semi-variance”. If R is monotonically increasing across all samples (n intervals, n+1 data points), then SV is zero. I have many reasons for this choice. The primary reason is that with r_avg the SV for a steadily descending R would be zero. I don’t want a formula that rewards such a performance with 0, the best possible SV score. [Others would substitute T, a usually positive number, as the target return, sometimes called the minimal acceptable return.]
Finally, a word about r_i: it is the total return over interval i. Intervals should be as uniform as possible. I tend to avoid daily intervals due to the non-uniformity introduced by weekends and holidays. Weekly (last closing price of the trading week), monthly (last closing price of the month), and quarterly intervals are significantly more uniform in duration.
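The formula transcribes directly into a few lines of Python (the function name is mine; everything else follows the description above: the 2/n scaling, the question-mark-colon gate on negative returns, and raw r_i rather than r_i − r_avg):

```python
def modified_semivariance(returns):
    """SV = (2/n) * sum of r_i^2 over intervals where r_i < 0.
    Uses raw r_i, not r_i - r_avg, so a monotonically rising
    return series scores exactly 0, the best possible SV."""
    n = len(returns)
    return 2.0 / n * sum(r * r for r in returns if r < 0)

# A strictly rising series has zero downside risk under this measure:
print(modified_semivariance([0.01, 0.02, 0.005]))  # 0.0

# Only the two down intervals contribute here:
# (2/4) * (0.03^2 + 0.01^2) = 0.0005
print(modified_semivariance([0.05, -0.03, 0.02, -0.01]))
```

Note that variance, by contrast, would assign the first series a nonzero risk simply because its ups are uneven.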
Big Data and Heuristic Algorithms
Innovations in computing and algorithms are how semi-variance equations will change the world of finance. Common sense is why. I’ll explain why heuristic algorithms like Sigma1’s HAL0 can quickly find near-optimal SV solutions on a common desktop workstation, and even better solutions when leveraging a data center’s resources. And I’ll explain why SV is vastly superior to variance.
Computing SV for a single portfolio of 100 securities is easy on a modern desktop computer. For example, 3-year monthly semi-variance (36 intervals, 37 data points) requires 3700 multiply-accumulate operations to compute the portfolio return series, Rp, followed by a mere 36 subtractions, 36 multiplies (for squaring), and 36 additions (plus multiplying by 2/n). Any modern computer can perform this computation in the blink of an eye.
Now consider building a 100-security portfolio from scratch. Assume the portfolio is long-only and that any of these securities can have a weight between 0.1% and 90% in steps of 0.1%. Each security has 900 possible weightings. I’ll spare you the math: there are 6.385×10^138 possible allocations. Needless to say, this problem cannot be solved by brute force. Further note that if the portfolio is turned into a long-short portfolio, where negative weights down to -50% are allowed, the search space explodes to close to 10^2000.
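The long-only figure quoted above is consistent with a stars-and-bars tally of the weight grid (my reading of the constraints: 100 integer weights in 0.1% units, each at least 1 unit, summing to 1000 units; the 90% cap removes only a negligible handful of cases):

```python
import math

# Stars and bars: compositions of 1000 units into 100 positive parts.
count = math.comb(999, 99)

# 139 digits, i.e., on the order of 10^138, matching the text.
print(len(str(count)))
```

The long-short count is harder to pin down the same way, but the conclusion is identical: enumeration is hopeless at any scale.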
I don’t care how big your data center is, a brute-force solution is never going to work. This is where heuristic algorithms come into play. Heuristic algorithms are a subset of metaheuristics. In essence, heuristic algorithms are algorithms that guide heuristics (or vice versa) to find approximate solution(s) to a complex problem. I prefer the term heuristic algorithm to describe HAL0 because, in some cases, it is hard to say whether a particular line of code is “algorithmic” or “heuristic”; sometimes the answer is both. For example, semi-variance is computed by an algorithm but is fundamentally a heuristic.
Heuristic algorithms (HAs) find practical solutions for problems that are too difficult to brute-force. They can be configured to look deeper or run faster, as desired by the user. Smarter HAs can take advantage of modern computing infrastructure by utilizing multiple threads, multiple cores, and multiple compute servers in parallel. Many, such as HAL0, can provide intermediate solutions as they run farther and deeper into the solution space.
Let me be blunt — If you’re using Microsoft Excel Solver for portfolio optimization, you’re missing out. Fly me out and let me bring my laptop loaded with HAL0 to crunch your data set — You’ll be glad you did.
Now For the Fun Part: Why switch to Semi-Variance?
Thanks for reading this far! Would you buy insurance that paid you if your house didn’t burn down? Say you pay $500/year and, after 10 years, if your house is still standing, you get $6000. Otherwise you get $0. Ludicrous, right? Or insurance that only “protects” your house from appreciation? Say it pays 50 cents for every dollar you make when you resell your house, but if you lose money on the resale you get nothing.
In essence, that is what you are doing when you buy (or create) a portfolio optimized for variance. Sure, variance analysis seeks to reduce the downs, but it also penalizes the ups (if they are too rapid). Run the numbers on any portfolio and you’ll see that SV ≠ V. All things being equal, the portfolios with SV < V are the better bet. (Note that classic_SV ≤ V, because it sums only a subset of the positive terms that make up V.)
Let me close with a real-world example. SPLV is an ETF I own. It is based on owning the 100 stocks out of the S&P 500 with the lowest 12-month volatility. It has performed well, and been received well by the ETF marketplace, accumulating over $1.5 billion in AUM. A simple variant of SPLV (which could be called PLSV, for PowerShares Low Semi-Variance) would contain the 100 stocks with the least SV. An even better variant would contain the 100 stocks that in aggregate produced the lowest-SV portfolio over the preceding 12 months.
HAL0 has the power to construct such a portfolio. It could, while preserving the relative market-cap ratios of the 100 stocks, pick which 100 stocks are collectively optimal. Or it could produce a re-weighted portfolio that further reduces overall semi-variance.
[Even more information on semi-variance (in its many related forms) can be found here.]
Almost every stock chart presents incomplete data for a security’s total return. Simply put, stock charts don’t reflect dividends and distributions. Stock charts simply show price data. A handful of charts superimpose dividends over the price data. Such charts are an improvement, but require mental gymnastics to correctly interpret total return.
At the end of the year, I suspect the vast majority of investors are much more interested in how much money they made than in whether their profits came from asset appreciation, dividends, interest, or other distributions. In the case of tax-deferred or tax-exempt accounts (such as IRAs, Roth IRAs, 401k accounts, etc.) the source of returns is unimportant. Naturally, for other portfolios, some types of return are more tax-advantaged than others. In one case I tried to persuade a relative that MUB (iShares S&P National AMT-Free Muni Bd) was a good investment for them in spite of its chart, because the chart did not show the positive tax impact of tax-exempt income.
Our minds see what they want to see. When we compare two stocks (or ETFs) we often have a slight bias towards one. If we see what we want in a stock’s chart, we may look past the dividend annotations and make an incorrect decision.
This 1-year chart comparing two ETFs illustrates the point. These two ETFs track each other reasonably well until Dec 16th, when there is a sharp drop in PBP. This large dip reflects a large distribution of roughly 10%. Judging strictly by the price data, it at first appears that SPY beats PBP by 7%. Factoring in PBP’s yield of about 10.1% and SPY’s of roughly 1.9% instead shows a 1.2% 1-year outperformance by PBP. First appearances show SPY outperforming; a little math shows PBP outperforming.
Yahoo! Finance provides raw data adjusted for dividends and distributions. Using the 1-year start and end data shows SPY returning a net 3.77% and PBP returning a net 4.96%. The delta shows a 1.19% outperformance by PBP. Yahoo! Finance’s tables have all the right data; I would love to see Yahoo! add an option to display this adjusted-price data graphically.
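The arithmetic in the last two paragraphs can be checked in a few lines (figures copied from above, not re-derived from market data):

```python
# Figures from the chart comparison above.
price_gap = 0.07      # SPY ahead of PBP on price alone
pbp_yield = 0.101
spy_yield = 0.019

# Total-return gap (positive means PBP ahead):
# yield advantage minus the price gap.
total_gap = (pbp_yield - spy_yield) - price_gap
print(round(total_gap, 3))   # 0.012, i.e., PBP ahead by ~1.2%

# Cross-check with Yahoo! Finance's dividend-adjusted 1-year returns.
spy_total = 0.0377
pbp_total = 0.0496
print(round(pbp_total - spy_total, 4))  # 0.0119, i.e., 1.19%
```

The two routes agree to within a rounding error, which is exactly what a correct dividend adjustment should produce.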
Total return is not a new concept. Bill Gross was very insightful in naming PIMCO’s “Total Return” lineup of funds over 25 years ago. Many mutual funds provide total return charts. For instance, Vanguard provides total return charts for investments such as Vanguard Total Stock Market Index Fund Admiral Shares. I am pleased to see Fidelity offering similar charts for ETFs in research “performance” reports for its customers. Unfortunately, I have not found a convenient way to superimpose two total-return charts.
While traditional stock and ETF charts do not play a large role in my investment decisions, I do look at them when evaluating potential additions to my investment portfolio. When I do look at charts, I’d prefer to have the option of looking at total-return charts rather than “old fashioned” price charts.
That said, I prefer to use quantitative portfolio analysis as my primary asset allocation technology. For such analysis I compute total return data for each asset from price data and distribution data, assuming reinvestment. Reformatting asset data in this way allows HAL0 portfolio-optimization software to directly compare different asset classes (gold, commodities, stock ETFs, bond ETFs, leveraged ETFs, etc). Moreover, such pre-formatting allows faster computation of risk for various asset allocations within a portfolio.
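A sketch of that preprocessing step, assuming per-interval closing prices and cash distributions (the data layout and function name are mine, not HAL0's actual interface):

```python
def total_return_index(prices, distributions):
    """Convert a price series plus per-interval cash distributions into
    a total-return index, assuming each distribution is reinvested at
    the price on its pay date. Assets preprocessed this way can be
    compared on a single consistent basis."""
    units = 1.0
    index = []
    for price, dist in zip(prices, distributions):
        # Reinvest this interval's distribution at the current price.
        units += units * dist / price
        index.append(units * price)
    return index

# Example: a flat $100 asset paying $10 still has a 10% total return.
flat = total_return_index([100.0, 100.0], [0.0, 10.0])
print(round(flat[-1] / flat[0] - 1, 6))  # 0.1
```

Interval returns taken from such an index are the r_i values fed into the semi-variance calculations discussed earlier.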
A large part of my vision for Sigma1 is revolutionizing how investors and money managers visualize and conceptualize portfolio construction. The key pieces of that conceptual revolution are:
Rethinking return to always mean total return.
Rethinking risk to mean something other than variance or standard deviation.
Many already think of total return as the key measure of raw portfolio performance. It is odd, then, that so many charts display something other than total return. And some would like to measure, manage, and model risk in more robust ways. A major obstacle to alternate risk measures is a dearth of financial portfolio optimization tools that work with PMPT models such as semi-variance.
HAL0 is designed from the ground up to address the goals of optimizing portfolios based on total return and a wide variety of advanced, more-robust risk models. (And, yes, total return can be defined in terms of after-tax total return, if desired.)
Disclosure: I have long positions in SPY, the Vanguard Total Stock Market Index, and PBP.