Software Development Choices for Portfolio Optimization

The first phase of developing the HALO (Heuristic Algorithm Optimizer) Portfolio Optimizer was testing mathematical and heuristic concepts.  The second phase was teaming up with beta partners in the financial industry to exchange optimization work for feedback on the optimizer features and results.

For the first phase, my primary tool for software development was the Ruby language.  Because Ruby is a “high-level” extensible language I was able to quickly prototype and test many diverse and complex concepts.  This software development process is sometimes referred to as software prototyping.

For the second, beta phase of software development I kept most of the software in Ruby, but began re-implementing selected portions of the code in C/C++. The goal was to keep the high-change-rate code in Ruby, while coding the more stable portions in C/C++ for run-time improvement.  While a good idea in theory, it turned out that my ability to foresee beta-partner changes was mixed at best.  While many changes hit the Ruby code, and were easily implemented, a significant fraction hit deep into the C/C++ code, requiring significant development and debugging effort.  In some cases, the C/C++ effort was so high that I switched portions of the code back to Ruby for rapid development and ease of debugging.

Now that the limited-beta period is nearly complete, software development has entered a third phase: run-time-performance optimization.  This process involves converting the vast majority of Ruby code to C.  Notice, I specifically say C, not C/C++.   In phase 2, I was surprised at the vast increase in executable code size with C++ (and STL and Boost).  As an experiment I pruned test sections of code down to pure C and saw the binary (and in-memory) machine code size decrease by 10X and more.

By carefully coding in pure C, smaller binaries were produced, allowing more of the key code to reside in the L1 and L2 caches.  Moreover, because C allows very precise control over memory allocation, reallocation, and de-allocation, I was able to more-or-less ensure that key data resided primarily in the L1 and/or L2 caches as well.  When both data and instructions live close to the CPU in cache memory, performance skyrockets.

HALO code is very modular, meaning that it is carefully partitioned into independent functional pieces.  It is very difficult, and not worth the effort, to convert part of a module from Ruby to C — it is more of an all-or-nothing process.  So when I finished converting another entire module to C today, I was eager to see the result.  I was blown away.  The speed-up was 188X.  That’s right, almost 200 times faster.

A purely C implementation has its advantages.  C is extremely close to the hardware without being tied directly to any particular hardware implementation.   This enables C code (with the help of a good compiler) to benefit from specific hardware advantages on any particular platform.  Pure C code, if written carefully, is also very portable — meaning it can be ported to a variety of different OS and hardware platforms with relative ease.

A pure C implementation also has disadvantages, including susceptibility to pointer errors, buffer-overflow errors, and memory leaks.  Many of these drawbacks can be mitigated by software regression testing, particularly against a “golden” reference spec coded in a different software language.  In the case of HALO Portfolio-Optimization Software, the golden reference spec is the Ruby implementation.  Furthermore, unit testing can be combined with regression testing to provide even better software test coverage and “bug” isolation.  The latest 188X speedup was tested against a Ruby unit test regression suite and shown to match the Ruby implementation to five or more significant digits of precision.  Since the Ruby and C implementations were coded months apart, in different software languages, it is very unlikely that the same software “bug” was independently implemented in each.  Thus the C helps validate the “golden” Ruby spec, and vice versa.
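To make the golden-reference idea concrete, here is a minimal Python sketch of a regression check that compares a fast implementation's outputs against reference outputs to a chosen number of significant digits.  It is an illustration only, not the HALO test suite: the function names, the sample values, and the "half a unit in the last kept digit" tolerance convention are all assumptions.

```python
import math

def agree_to_sig_digits(reference, candidate, digits=5):
    """Return True if candidate matches reference to about `digits` significant digits."""
    # Assumed convention: relative tolerance of half a unit in the last kept digit.
    # With abs_tol=0.0 a reference value of exactly zero must be matched exactly.
    rel_tol = 0.5 * 10.0 ** (1 - digits)
    return math.isclose(reference, candidate, rel_tol=rel_tol, abs_tol=0.0)

def regression_mismatches(golden, candidate, digits=5):
    """List the output names where the fast implementation drifts from the golden reference."""
    return [
        name
        for name, expected in golden.items()
        if name not in candidate
        or not agree_to_sig_digits(expected, candidate[name], digits)
    ]

# Hypothetical outputs: "golden" from the reference code, "candidate" from the port.
golden = {"volatility": 0.123456, "expected_return": 0.0712345}
candidate = {"volatility": 0.1234561, "expected_return": 0.0712399}
mismatches = regression_mismatches(golden, candidate)
```

In this made-up data the volatility agrees to well beyond five digits, while the expected return differs in the fifth digit and is reported as a mismatch.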

I have written before about how faster software is greener software.  At the time HALO was primarily a Ruby implementation, and I expected about a 10X speed up for converting from Ruby to C/C++.  Now I am increasingly confident that an overall 100X speedup for an all C implementation is quite achievable.  For the SaaS (software as a service) implementation, I plan to continue to use Ruby (and possibly some PHP and/or Python) for the web-interface code.  However, I am hopeful I can create a pure C implementation of the entire number-crunch software stack.  The current plan is to use the right tool for the right job:  C for pure speed, Ruby for prototyping and as a golden regression reference, and Ruby/PHP/Python/etc for their web-integration capabilities.


Inverted Risk/Return Curves

Over 50 years of academic financial thinking is based on a kind of financial gravity:  the notion that for a relatively diverse investment portfolio, higher risk translates into higher return given a sufficiently long time horizon.  Stated simply: “Risk equals reward.”  Stated less tersely, “Return for an optimized portfolio is proportional to portfolio risk.”

As I assimilated the CAPM doctrine in grad school, part of my brain rejected some CAPM concepts even as it embraced others.  I remember seeing a graph of asset diversification that showed that randomly selected portfolios exhibited better risk/reward profiles up to 30 assets, at which point further improvement was minuscule and only asymptotically approached an “optimal” risk/reward asymptote.  That resonated.

Conversely, strict CAPM thinking implied that a well-diversified portfolio of high-beta stocks will outperform a market-weighted portfolio of stocks over the long term, albeit in a zero-alpha fashion.  That concept met with cognitive dissonance.

Now, dear reader, as a reward for staying with this post this far, I will share some hard-won insights.  After much risk/reward curve fitting on compute-intensive analyses, I found that the best-fit expected-return metric for assets was proportional to the square root of beta.  In my analyses I computed an asset’s beta from 36 months of monthly returns relative to the benchmark index.  Mostly, for US assets, my benchmark “index” was VTI total-return data.
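As a rough illustration of that calculation, the sketch below computes beta as the sample covariance of asset and benchmark returns divided by the sample variance of the benchmark, then takes its square root.  The function names are hypothetical, the proportionality constant is left out because it is not given here, and rejecting negative betas is my assumption for the sketch.

```python
import math

def beta(asset_returns, benchmark_returns):
    """Sample beta: cov(asset, benchmark) / var(benchmark), e.g. over 36 monthly returns."""
    n = len(asset_returns)
    if n != len(benchmark_returns) or n < 2:
        raise ValueError("need two equal-length return series with at least 2 points")
    mean_a = sum(asset_returns) / n
    mean_b = sum(benchmark_returns) / n
    covariance = sum(
        (a - mean_a) * (b - mean_b)
        for a, b in zip(asset_returns, benchmark_returns)
    ) / (n - 1)
    variance_b = sum((b - mean_b) ** 2 for b in benchmark_returns) / (n - 1)
    return covariance / variance_b

def sqrt_beta_metric(beta_value):
    """Expected-return metric up to an unspecified constant: sqrt(beta)."""
    if beta_value < 0:
        raise ValueError("this sketch does not define the metric for negative beta")
    return math.sqrt(beta_value)
```

Under this metric an asset with a beta of 4 would be expected to return about twice the benchmark premium rather than four times, which is the flattening the curve fit suggests.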

Little did I know, at the time, that a brilliant financial maverick had been doing the heavy academic lifting around similar financial ideas.  His name is Bob Haugen. I only learned of the work of this kindred spirit upon his passing.

My academic number crunching on data since 1980 suggested a positive, but decreasing incremental total return vs. increasing volatility (or for increasing beta).  Bob Haugen suggested a negative incremental total return for high-volatility assets above an inflection-point of volatility.

Mr. Haugen’s lifetime of published research dwarfs my analyses to date. There is some consolation in the fact that I followed the data to conclusions that had more in common with Mr. Haugen’s than with the Academic Consensus.

An objective analysis of the investment approach of three investing greats will show that they have more in common with Mr. Haugen than with Mr. E.M. Hypothesis (aka Mr. Efficient Markets Hypothesis, not to be confused with “Mr. Market”).  Those great investors are 1) Benjamin Graham, 2) Warren Buffett, 3) Peter Lynch.

CAPM suggests that, with either optimal “risk-free” or leveraged investments, a capital allocation line exists, tantamount to a linear risk-reward relationship. This line is set according to a unique tangent point to the efficient frontier curve of expected volatility to expected return.

My research at Sigma1 suggests a modified curve with a tangent point portfolio comprised, generally, of a greater proportion of low volatility assets than CAPM would indicate.  In other words, my back-testing at Sigma1 Financial suggests that a different mix, favoring lower-volatility assets, is optimal.  The Sigma1 CAL (capital allocation line) is different and based on a different asset mix.  Nonetheless, the slope (first derivative) of the Sigma1 efficient frontier is always positive.

Mr. Haugen’s research indicates that, in theory, the efficient frontier curve past a critical point begins sloping downward as portfolio volatility increases. (Arguably the curve past the critical point ceases to be “efficient”, but from a parametric point of view it can be calculated for academic or theoretical purposes.)  An inverted risk/return curve can exist, just as an inverted Treasury yield curve can exist.

Academia routinely deletes the dominated bottom of the parabola-like portion of the complete “efficient frontier” curve (resembling a parabola of the form x = A + B*y^2) for allocation of two assets (commonly stocks (e.g. SPY) and bonds (e.g. AGG)).

Maybe a more thorough explanation is called for.   In the two-asset model the complete “parabola” is a parametric equation where x = Vol(t*A, (1-t)*B) and y = ER(t*A, (1-t)*B), with t running from 0 to 1.  [Vol = volatility or standard deviation, ER = expected return.]   The bottom part of the “parabola” is excluded because it has no potential utility to any rational investor.  In the multi-weight model, x=minVol (W), y=maxER(W), and W is subject to the condition that the sum of weights in vector W = 1.  In the multi-weight, multi-asset model the underside is automatically excluded.  However there is no guarantee that there is no point where dy/dx is negative.  In fact, Bob Haugen’s research suggests that negative slopes (dy/dx) are possible, even likely, for many collections of assets.
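The two-asset case is simple enough to sketch in a few lines of Python.  The code below traces the full parametric curve and then drops the dominated underside, keeping only points whose expected return is at least that of the minimum-volatility point.  The function names, the correlation parameter rho, and the sample inputs in the usage note are assumptions for illustration; the standard two-asset volatility formula is used for Vol.

```python
import math

def portfolio_point(t, er_a, vol_a, er_b, vol_b, rho):
    """Return (volatility, expected return) for weight t in asset A and 1 - t in asset B."""
    variance = (
        (t * vol_a) ** 2
        + ((1 - t) * vol_b) ** 2
        + 2 * t * (1 - t) * rho * vol_a * vol_b
    )
    return math.sqrt(max(variance, 0.0)), t * er_a + (1 - t) * er_b

def two_asset_curve(er_a, vol_a, er_b, vol_b, rho, steps=100):
    """Trace the complete 'parabola' by stepping t from 0 to 1."""
    return [
        portfolio_point(i / steps, er_a, vol_a, er_b, vol_b, rho)
        for i in range(steps + 1)
    ]

def efficient_part(curve):
    """Drop the dominated underside: keep returns at or above the min-volatility point."""
    min_vol_point = min(curve, key=lambda point: point[0])
    return [point for point in curve if point[1] >= min_vol_point[1]]
```

For example, with a hypothetical stock-like asset (8% return, 20% volatility), a bond-like asset (3% return, 10% volatility), and zero correlation, `efficient_part(two_asset_curve(0.08, 0.2, 0.03, 0.1, 0.0, steps=10))` keeps the nine points from t = 0.2 upward and discards the two points below the minimum-volatility mix.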

Time prevents me from following this financial rabbit hole to its end.  However I will point out the increasing popularity and short-run success of low-volatility ETFs such as SPLV, USMV, and EEMV.  I am invested in them, and so far am pleased with their high returns AND lower volatilities.


NOTE: The part about W is oversimplified for flow of reading.  The bulkier explanation is that y is stepped from y = ER(W) for minVol(W) to the max expected return of all the assets (Wmax_ER_asset = 1, y = max_ER_asset_return), and each x = minVol(W) s.t. y = ER(W) and sum_of_weights(W) = 1.   Clear as mud, right?  That’s why I wrote it the other way first.


Variance, Semivariance Convergence

In running various assets through portfolio-optimization software, I noticed that for an undiversified set of assets there can be wide differences between portfolios with the highest Sharpe ratios versus portfolios with the highest Sortino ratios.  Further, if the efficient frontier of ten portfolios is constructed (based on mean-variance optimization) and sorted according to both Sharpe and Sortino ratios the ordering is very different.

If, however, the same analysis is performed on a globally-diversified set of assets the portfolios tend to converge.  The broad ribbon of the 3-D efficient surface seen with undiversified assets narrows until it begins to resemble a string arching smoothly through space.  The Sharpe/Sortino ordering becomes very similar with ranks seldom differing by more than 1 or 2 positions.  Portfolios E and F may rank 2 and 3 in the Sharpe ranking but rank 2 and 1 in the Sortino ranking, for example.
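The ranking comparison can be sketched as follows.  This is a simplified illustration with assumed conventions: Sharpe uses the sample standard deviation, Sortino uses downside deviation below a target of zero, and the two return series are made up to show how upside skew can flip the order.

```python
import math

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation."""
    n = len(returns)
    mean = sum(returns) / n
    variance = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return (mean - risk_free) / math.sqrt(variance)

def sortino_ratio(returns, target=0.0):
    """Mean return above target divided by downside deviation below target."""
    n = len(returns)
    mean = sum(returns) / n
    downside_variance = sum(min(r - target, 0.0) ** 2 for r in returns) / n
    if downside_variance == 0.0:
        return math.inf  # no observed downside in this sample
    return (mean - target) / math.sqrt(downside_variance)

def rank_by(portfolios, metric):
    """Portfolio names sorted from best to worst by the given ratio."""
    return sorted(portfolios, key=lambda name: metric(portfolios[name]), reverse=True)

# Hypothetical per-period returns: one steady portfolio, one with a large upside outlier.
portfolios = {
    "steady": [0.01, 0.01, 0.01, -0.01],
    "skewed": [0.10, 0.00, 0.00, -0.004],
}
sharpe_order = rank_by(portfolios, sharpe_ratio)
sortino_order = rank_by(portfolios, sortino_ratio)
```

Here the Sharpe ratio penalizes the large upside move in "skewed" as if it were risk and ranks "steady" first, while the Sortino ratio only counts the downside and ranks "skewed" first.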

Variance/Semivariance divergence is wider for optimized portfolios of individual stocks.  When sector-based stock ETFs are used instead of individual stocks, the divergence narrows.  When bond- and broad-based index ETFs are optimized, the divergence narrows to the point that it could be considered by many to be insignificant.

This simple convergence observation has interesting ramifications.  First, a fast first pass of variance optimization can be applied, followed by a slower semivariance-based refinement to more efficiently achieve a semivariance-optimized portfolio.  Second, semivariance distinctions can be very significant for non-ETF (stock-picking) and less-diversified portfolios.  Third, for globally-diversified, stock/bond, index-ETF-based portfolios, the differences between variance-optimized and semivariance-optimized portfolios are extremely subtle and minute.



Engineering Profit versus Theoretical Profit

Either there is a veil of silence covering the world of finance, or the obvious parallels between electrical engineering (EE) and finance have been overlooked.   I suspect the former.

Almost every EE worth their salt has been exposed to the concepts of signals and signal processing in undergrad.  From signal-to-noise ratios (SNR) to filters (dB/decade) to digital signal processors (DSPs), EE’s are trained to be experts at receiving the signal in spite of the noise.  More technobabble (but it’s not!) are the Fourier and Laplace transforms we routinely use to analyze the propagation of signals through circuits.  Not to mention wave-guides, complex-conjugate reflections, amplitude- and frequency-modulation, etc.   Then there are the concepts of signal error detection, error correction, and information content.

My point is that financial firms made a mistake in hiring more physicists than electrical engineers.  At the end of the day (or the project) the work of the EE has to stand up to more than just academic scrutiny; it has to stand up to the real world — real products, real testing, real use.

EE’s with years of experience have been there and done that.  Mind you, most are not interested in finance.  However, a handful of us are deeply interested in finance and investing.

These thoughts occurred to me as I was listening to speakers I built 15 years ago.  They still sound spectacular (unglaublich gut, for you Germans).  They are now my second-tier speakers relegated to computer audio.  Naturally, I have an amp fed by Toslink 48K/s 20-bit per channel audio data. My point is that these speakers have audio imaging that is achieved by a smooth first-order crossover with tweeters/speakers chosen to support phase-accurate performance over the frequencies at which the human ear can best make use of audio imaging.

My second point is that a lot of engineering went into these speakers.   This engineering goes beyond electrical.   Speakers are fundamentally in the grey region between mechanical and electrical engineering.  However the mechanical parameters can be “mapped” into the “domain” of electrical engineering concepts.  This positions EEs to pick the best designs and combine them in the most advantageous way on a maximum value-per-dollar basis.

This post is targeting a different audience than most.  Apologies.  An EE with a CS (computer science) background is an even better choice.

The analysis of financial data as concurrent, superimposed discrete waveforms is as natural to EEs as air is to mammals and water is to fish.  Audio is, perhaps, the simplest application.   Just Google “Nyquist-Shannon” if you want to know of what I speak.

I’m not for hire; I only do contract work.  I’m just telling hiring managers to both broaden and restrict their search criteria.  A well-qualified EE with financial expertise and a passion for finance is likely to be a better candidate than a Ph.D. in Physics.  Don’t hire Sheldon Cooper until you evaluate Howard Wolowitz (not an EE, but you get my point, I hope).

The #1 Question about Sigma1 Financial Software

The question that potential investors and clients inevitably ask me is, “With thousands of very smart people working in finance, what makes you think Sigma1 Portfolio-Optimization Software is better?”   Others are more blunt, asking, “Do you think you are smarter than all the quants working at, say, Goldman Sachs?”

Let me start by saying that there are people in the industry who could understand Sigma1’s HAL0 algorithms.  In order to do so, it would be very helpful to have an electrical engineering background, a computer science background, and an understanding of CAPM.  Because of this fact, it is very important that Sigma1 keep these proprietary algorithms and trade secrets secret.  Sigma1 has also started evaluating patent attorneys and exploring patent protection.

The seeds of HAL0 methods and algos were born out of my early experiences in electrical engineering, specifically the area of VLSI design.  In the late 90’s and early 00’s it was becoming apparent to leading-edge VLSI design companies that silicon physical design could not continue to keep up with Moore’s Law unless there was a paradigm shift in our industry.  You see, VLSI design had already been revolutionized by a technology called logic synthesis.  But synthesis was breaking down for large designs due to its reliance on statistical wireload models (WLMs).    It turned out that actual wireloads did not follow normal probability distributions.  They had long, fat tails on the side of bad variance.

In 2001, my colleague and I wrote an article that became the July cover story for an industry magazine.  One of the two main points of the article was that the statistical models for wire delay were sub-optimal to the point of being broken.   My co-author and I were suggesting that companies that continued to follow the statistical methods of the past would have to either adopt new methods or become obsolete.  More than ten years later, the VLSI industry has shown this to be a true prediction.  Today, statistical WLMs are virtually obsolete.

A couple years later, while still working in VLSI design, I started grad school, taking graduate-level classes in finance and electrical engineering.  I worked with some really smart folks working on algorithms to solve some difficult circuit optimization problems.  Meanwhile I was learning about CAPM and MPT, as well as Fama French factors.  As I applied myself to both disciplines, little did I suspect that, years later, I would connect them both together.  As I learned more about finance, I saw how the prices of fixed income investments made mathematical sense.  Meanwhile I was stuck with the nagging idea that the theories of stock portfolio optimization were incomplete, especially because variance was a sub-optimal risk model.

I was on sabbatical, driving across New Mexico, thinking about my favorite theoretical problems to pass the time.  I was mentally alternating between Fourier transforms, semivariance portfolio-optimization, and heuristic optimization algorithms.   I was thinking back to when my undergrad colleagues and I predicted that class-D audio amplifiers would replace class A, A/B, and B amplifiers, especially for sub-woofers.  [I now own a superb class-D subwoofer that I bought on the web.]   In a flash it occurred to me that the same principles of superposition that make Fourier transforms and class-D amplifiers work also apply to investment portfolios.  I suddenly knew how to solve semivariance and variance simultaneously and efficiently.  Within the next 5 minutes I knew what I was going to do for the rest of my sabbatical:  develop and test software to optimize for 3 variables: variance, semivariance, and total return.

A bit like Alexander Fleming, who “accidentally” discovered penicillin from observing a discarded Petri dish, I discovered something new by “accident” while driving on that New Mexico highway.  The Petri dish was a thought experiment in my mind.  Bits of knowledge of VLSI and computer science fell into the Petri dish of semivariance, and I saw it flourish.  All the previous semivariance dishes had languished, but suddenly I had found a glimmer of success.

In short, I have been thinking about semivariance for almost 10 years.  I have seen variance-based models fail in my line of work in VLSI, and seen how improved models transformed the industry, despite early doubters.  The financial industry has been happy to hire Ph.D.s in physics, a field in which variance-based models have proven extremely successful, from quantum physics, to thermodynamics, to PV=nRT.  Where I saw the breakdown of variance-based models, physicists have seen near universal success in applications tied to physics.

When your only tool is a hammer, nails appear everywhere.   I happened to have the right tools at the right time, and a spark of innovation on a stormy day driving across New Mexico.  I was lucky and sufficiently smart.  That is why I believe Sigma1’s portfolio-optimization solution is perhaps the best solution currently on the planet.  It is possible others have found similarly-effective solutions, but in my reasonably rigorous search I have found no evidence of such yet.

There are two primary factors that set Sigma1 portfolio-optimization software apart:  1) efficient solutions to semivariance-based portfolio optimization that scale to 1000+ assets, and 2) 3-objective (3-D) models that concurrently optimize for, say, variance, semivariance, and expected return.  I have seen a handful of software offerings that support semivariance and expected return in addition to variance and expected return, but independently (in a 2-D objective space).  Superimposing two 2-D models on the same chart is not the same as HAL0’s 3-D optimization.  For example, HAL0’s 3-D surface models (of the efficient frontier in 3 dimensions) allow exploration of the variance/semivariance trade-offs over any desired common expected-return contour.  In the same way that topo maps graph the topography of the Earth’s terrain, Sigma1 technology can map the objective space of portfolio optimization in 3-D.

In summary, I was in the right place at the right time with the right knowledge to create revolutionary portfolio-optimization software.





Capital Allocation

Let’s start with the idea that CAPM (Capital Asset Pricing Model) is incomplete.   Let me prove it in a few sentences.  Everyone knows that, for investors, “risk-free” rates are always less than borrowing (margin) rates.  Thus the concept of CAL (the capital allocation line) is incomplete.  If I had a sketch-pad I’d supply a drawing showing that there are really three parts of the “CAL” curve…

  1. The traditional CAL that extends from Rf to the tangent intercept with the efficient-frontier curve.
  2. CAC (capital-asset curve)
  3. CAML (capital-asset margin line, pronounced “camel”)

Why?  Because the CAML has its own tangent point based on the borrower’s marginal rate.  Because the efficient frontier is monotonically-increasing, the CAL and CAML points will be separated by a section of the EF curve I call the CAC.
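The three pieces can be located numerically.  The sketch below takes a discretized efficient frontier as (volatility, expected return) points, finds the tangent point for the risk-free rate and the tangent point for the margin rate by maximizing the slope from each rate, and returns the stretch of frontier between them as the CAC.  The function names and the sample frontier in the usage note are hypothetical, and a real frontier would be far more finely sampled.

```python
def tangent_point(frontier, rate):
    """Frontier point that maximizes the slope of a line drawn from (0, rate)."""
    return max(frontier, key=lambda point: (point[1] - rate) / point[0])

def split_frontier(frontier, risk_free_rate, margin_rate):
    """Return (CAL tangent point, CAC segment, CAML tangent point)."""
    if margin_rate < risk_free_rate:
        raise ValueError("margin rate is assumed to be at or above the risk-free rate")
    cal_tangent = tangent_point(frontier, risk_free_rate)
    caml_tangent = tangent_point(frontier, margin_rate)
    # The CAC is the unlevered stretch of frontier between the two tangent points.
    cac = [p for p in frontier if cal_tangent[0] <= p[0] <= caml_tangent[0]]
    return cal_tangent, cac, caml_tangent
```

With a made-up frontier of `[(0.05, 0.04), (0.10, 0.07), (0.15, 0.09), (0.20, 0.10), (0.25, 0.105)]`, a 2% risk-free rate, and a 5% margin rate, the CAL touches the frontier at (0.10, 0.07), the CAML touches it at (0.15, 0.09), and the CAC runs between those two points.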

All of this is so obvious, it almost goes without saying.  It is strange, then, that I haven’t seen it pointed out in graduate finance textbooks, or online.  [If you know of a reference, please comment on this post!]  In reality, the CAL only works for an unleveraged portfolio.

CAPM is Incomplete; Warren Buffett Shows How

Higher risk, higher return, right?  Maybe not… at least on a risk-adjusted basis.  Empirical data suggests that high-beta stocks and portfolios do not receive commensurate return.  Quite to the contrary, low-beta stocks and portfolios have received greater returns than CAPM predicts.   In other words, low-beta portfolios (value portfolios in many cases) have had higher historical alphas.  Add leverage, and folks like Warren Buffett have produced high long-term returns.

Black Swans and Grey Swans

On the fringe of modern-portfolio theory (MPT) and post-modern portfolio theory (PMPT), live black swans.   Black swans are essentially the most potent of unknown unknowns, also known as “fat tails”.

At the heart of PMPT is what I call “grey swans.”  This is also called “breakdown of covariance estimates” or, in some contexts, financial contagion.  Grey-swan events are much more common, and somewhat more predictable… That is if one is NOT fixated on variance.

Variance is close, semivariance is closer.  I put forth the idea that PMPT overstates its own potential.  Black swans exist, are underestimated, and are essentially impossible to predict.  “Grey swans” are, however, within the realm of PMPT.   They can be measured in retrospect and anticipated in part.

Assets are Incorrectly Priced

CAPM showed a better way to price assets and allocate capital.  The principles of semivariance, commingled with CAPM form a better model for asset valuation.  Simply replacing variance with semivariance changes fifty years of stagnant theory.

Mean-return variance is positively correlated with semivariance (mean semi-variance of asset return), but the correlation is far less than 1.   Further, mean variance is most correlated when it matters most; when asset prices drop.  The primary function of diversification and of hedging is to efficiently reduce variance.  Investors and pragmatists note that this principle matters more when assets crash together — when declines are correlated.

The first step in breaking this mold of contagion is examining what matters more: semivariance.   Simply put, investors care much less about compressed upward variance than they do about compressed downward variance.   They care more about semivariance.  And, eventually, they vote with their remaining assets.

A factor in retaining and growing an AUM base is content clients.  The old rules say that the correct answer to a key Wall Street interview question is win big or lose all (of the client’s money).  The new rules say that clients demand a value-add from their adviser/broker/hybrid.  This value-add can be supplied, in part, by using the best parts of PMPT.  Namely semivariance.

That is the end result of the success of semivariance.  The invisible hand of Sigma1, and other forward-looking investment companies, is to guide investors to invest money in the way that best meets their needs.  The eventual result is more efficient allocation of capital.  In the beginning these investors win.  In the end, both investors and the economy win.  This win/win situation is the end goal of Sigma1.



A Choice: Perfectly Wrong or Imperfectly Right?

In many situations good quick action beats slow brilliant action.   This is especially true when the “best” answer arrives too late.  The perfect pass is irrelevant after the QB is sacked, just as the perfect diagnosis is useless after the patient is dead.  Let’s call this principle the temporal dominance threshold, or beat the buzzer.

Now imagine taking a multiple-choice test such as the SAT or GMAT.   Let’s say you got every question right, but somehow managed to skip question 7.   In the line for question #7 you put the answer to question #8, etc.   When you answer the last question, #50, you finally realize your mistake when you see one empty space left on the answer sheet… just as the proctor announces “Time’s up!”   Even though you’ve answered every question right (except for question #7), you fail dramatically.   I’ll call this principle query displacement, or right answer/wrong question.

The first scenario is similar to the problems of high-frequency trading (HFT).  Good trades executed swiftly are much better than “great” trades executed (or not executed!) after the market has already moved.   The second scenario is somewhat analogous to the problems of asset allocation and portfolio theory.  For example, if a poor or “incomplete” set of assets is supplied to any portfolio optimizer, results will be mediocre at best.  Just one example of right answer (portfolio optimization), wrong question (how to turn lead into gold).

I propose that the degree of fluctuation, or variance (or mean-return variance) is another partially-wrong question.  Perhaps incomplete is a better term.  Either way, not quite the right question.

Particularly if your portfolio is leveraged, what matters is portfolio semivariance.  If you believe that “markets can remain irrational longer than you can remain solvent”, leverage is involved.  Leverage via margin, or leverage via derivatives matters not.  Leverage is leverage.  At market close, “basic” 4X leverage means complete liquidation at an underlying loss of only 25%.  Downside matters.
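The 4X arithmetic generalizes: with leverage L, an underlying loss of 1/L wipes out the investor's equity.  The short sketch below shows the calculation under simplifying assumptions that are mine: no margin interest, no maintenance-margin call before the wipeout, and hypothetical function names.

```python
def wipeout_loss(leverage):
    """Underlying loss (as a fraction) that reduces the investor's equity to zero."""
    if leverage < 1:
        raise ValueError("leverage is expected to be 1 or greater")
    return 1.0 / leverage

def equity_after(leverage, underlying_return):
    """Equity per unit of starting equity after one move, floored at zero (liquidation)."""
    return max(0.0, 1.0 + leverage * underlying_return)
```

For example, `wipeout_loss(4)` is 0.25, and `equity_after(4, -0.10)` leaves about 0.6 of the starting equity: a 10% drop in the underlying costs the 4X investor 40%.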

Supposing a long-only position with leverage, modified semivariance is of key importance.  Modified, in my notation, means using zero rather than μ.  For one reason, solvency does not care about μ, mean return over an interval greater than insolvency.
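The difference between the two definitions is easy to see in code.  This sketch computes semivariance as the average squared shortfall below a target; with no target given it uses the sample mean μ, and the "modified" form fixes the target at zero.  The function names and the choice to divide by the full sample size are assumptions.

```python
def semivariance(returns, target=None):
    """Average squared shortfall below target; target defaults to the sample mean."""
    if not returns:
        raise ValueError("returns must be non-empty")
    if target is None:
        target = sum(returns) / len(returns)
    return sum(min(r - target, 0.0) ** 2 for r in returns) / len(returns)

def modified_semivariance(returns):
    """Semivariance measured against zero rather than the mean."""
    return semivariance(returns, target=0.0)
```

For a series with no losing periods, such as `[0.02, 0.04, 0.06]`, the mean-based semivariance is still positive because 0.02 falls below the 0.04 mean, while the modified semivariance is exactly zero, since no period actually lost money.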

The question at hand is what is the best predictor of future semivariance — past variance or past semivariance?  These papers make the case for semivariance:  “Good Volatility, Bad Volatility: Signed Jumps and the Persistence of Volatility” and “Mean-Semivariance Optimization: A Heuristic Approach”.

At the extreme, semivariance is the most important factor for solvency… far more important than basic variance.  In terms of client risk-tolerance, actual semivariance is arguably more important than variance, especially when financial utility is factored in.

Now, finally, to the crux of the issue.   It is far better to predict future semivariance than to predict future variance.  If it turns out that past (modified) semivariance is more predictive of future semivariance than is past variance, then I’d favor a near-optimal optimization of expected return versus semivariance over a perfectly-optimal expected return versus variance asset allocation.

It turns out that optimizing for semivariance is computationally many orders of magnitude more difficult than optimizing for variance.  It also turns out that Sigma1’s HAL0 software provides a near-optimal solution to the right question: least semivariance for a given expected return.

At the end of the day, at market close, I favor near-perfect semivariance optimization over “perfect” variance optimization.  Period.  Can your software do that?  Sigma1 financial software, code-named HAL0, can.  And that is just the beginning of what it can do.  HAL0 answers the right questions, with near-perfect precision.  And more precisely each day.







Financial Software Tech

In order to create software that is appealing to the enterprise market today, Sigma1 must create software for five years from now.   In this post I will answer the questions of why and how Sigma1 software intends to achieve this goal.

The goal of Sigma1 HAL0 software is to solve financial asset allocation problems quickly and efficiently.  HAL0 is portfolio-optimization software that makes use of a variety of proprietary algorithms.  HAL0’s algorithms solve difficult portfolio problems quickly on a single-core computer, and much more rapidly with multi-core systems.

Savvy enterprise software buyers want to buy software that runs well on today’s hardware, but will also run on future generations of compute hardware.   I cannot predict all the trends for future hardware advances, but I can predict one:  more cores.  Cores per “socket” are increasing on a variety of architectures:  Intel x86, AMD x86, ARM, Itanium, and IBM Power7 to name a few.  Even if this trend slows, as some predict, the “many cores” concept is here to stay and progress.

Simply put — Big Iron applications like portfolio-optimization and portfolio-risk management and modelling are archaic and virtually DOA if they cannot benefit from multi-core compute solutions.   This is why HAL0 is designed from day 1 to utilize multi-core (as well as multi-socket) computing hardware.  Multiprocessing is not a bolt-on retrofit, but an intrinsic part of HAL0 portfolio-optimization software.

That’s the why, now the how.  Google likes to use the phrase “map reduce” while others like the phrase embarrassingly parallel.   I like both terms because it can be embarrassing when a programmer discovers that the problems his software was slogging through in series were being solved in parallel by another programmer who mapped them to parallel sub-problems.

The “how” for HAL0’s core algorithm is multi-layered.   Some of these layers are trade secrets, but I can disclose one.  Portfolio optimization involves creating an “efficient frontier” comprised of various portfolios along the frontier.  Each of these portfolios can be farmed out in parallel to evaluate its risk and reward values.   Depending on the parameters of a particular portfolio-optimization problem this first-order parallelism can provide roughly a 2-10x speedup: parallel, but not massively parallel.
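The shape of this first-order parallelism can be sketched with Python's standard concurrent.futures module.  This is a generic illustration, not HAL0 code: the function names are hypothetical and the risk measure is plain volatility from a covariance matrix.  A thread pool is used to keep the sketch simple; in CPython, CPU-bound work would normally need ProcessPoolExecutor (which offers the same map interface) to see a real speedup.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from functools import partial

def evaluate_portfolio(weights, expected_returns, covariance):
    """Return (volatility, expected return) for one candidate weight vector."""
    n = len(weights)
    expected = sum(w * r for w, r in zip(weights, expected_returns))
    variance = sum(
        weights[i] * weights[j] * covariance[i][j]
        for i in range(n)
        for j in range(n)
    )
    return math.sqrt(max(variance, 0.0)), expected

def evaluate_frontier(candidates, expected_returns, covariance, workers=4):
    """Evaluate every candidate independently; results come back in input order."""
    evaluate = partial(
        evaluate_portfolio,
        expected_returns=expected_returns,
        covariance=covariance,
    )
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map preserves the order of `candidates` regardless of finish order.
        return list(pool.map(evaluate, candidates))
```

Because each portfolio is scored without reference to the others, the map step needs no locking, and the speedup is bounded mainly by the number of portfolios on the frontier.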

HAL0 was developed under a paradigm I call CAP (congruent and parallel). Congruent in this context means that given the same starting configuration, HAL0 will always produce the same result. This is generally easy for single-threaded programs to accomplish, but often more difficult for programs running multiple threads on multiple cores. Maintaining congruence is extremely helpful in debugging parallel software, and is thus very important to Sigma1 software. [Coherent or Deterministic could be used in lieu of Congruent.]
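One common way to keep a randomized parallel search congruent is to give each task its own seeded random stream and to collect results in submission order. The sketch below shows that idea under assumed names (`explore`, `run`, `base_seed`); it is not a description of how HAL0 achieves congruence.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def explore(task_id, base_seed=12345, samples=1000):
    """Toy randomized search step with its own private random stream."""
    # Seeding from (base_seed, task_id) rather than sharing one global
    # generator means the result cannot depend on thread scheduling.
    rng = random.Random(base_seed * 1_000_003 + task_id)
    return min(rng.random() for _ in range(samples))

def run(task_ids, workers):
    """Run all tasks on a pool of the given size."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in submission order, not completion order.
        return list(pool.map(explore, task_ids))
```

With this structure, `run(range(8), 1)` and `run(range(8), 4)` return the same list, so a bug seen on a many-core run can be replayed on a single core.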

As HAL0 development continued, I expanded the CAP acronym to CHIRP (Congruent, Heterogeneous, Intrinsically Recursively Parallel). Not only does CHIRP have a more open, happier connotation than CAP, it adds two additional tenets: heterogeneity and recursion.

Heterogeneity, in the context of CHIRP, means being able to run, in parallel, on a variety of machines with different computing capabilities. On one end of the spectrum, rather than requiring all machines in the cloud or compute queue to have the exact same specs (CPU frequency, amount of RAM, etc.), the machines can be different. On the other end of the spectrum, heterogeneity means running in parallel on multiple machines with different architectures (say x86 and ARM, or x86 and GPGPUs). This is not to say that HAL0 has complete heterogeneous support; it does not. HAL0 is, however, architected with modest support for heterogeneous solutions and extensibility for future enhancements.

The recursive part of CHIRP is very important. Recursively parallel means that the same code can be run (forked) to solve sub-problems in parallel, and those sub-problems can be divided into sub-sub problems, etc. This means that the same tuned, tight, and tested code can be leveraged in a massively parallel fashion.
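As a rough illustration of recursive parallelism, the sketch below uses one function that either solves a small problem directly or splits it and calls itself on the pieces in parallel. The names (`best_of`, `leaf_size`, `fanout`) and the thread-based pools are assumptions for the example; HAL0's actual forking scheme is not described here.

```python
from concurrent.futures import ThreadPoolExecutor

def best_of(candidates, score, leaf_size=4, fanout=2):
    """Return the lowest-scoring candidate, splitting the work recursively.

    Assumes leaf_size >= 1 and fanout >= 2.
    """
    if len(candidates) <= leaf_size:
        return min(candidates, key=score)       # small enough: solve directly
    step = -(-len(candidates) // fanout)        # ceiling division
    parts = [candidates[i:i + step] for i in range(0, len(candidates), step)]
    # Each level opens its own small pool, so nested calls never wait on
    # workers that are already busy further up the recursion.
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        winners = list(pool.map(
            lambda part: best_of(part, score, leaf_size, fanout), parts))
    return min(winners, key=score)
```

For example, `best_of(list(range(-10, 27)), lambda x: (x - 3) ** 2)` splits 37 candidates into halves, then quarters, and so on, and the same function handles every level.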

By far the most performance-enhancing piece of HAL0 portfolio-optimization CHIRP is RP.  The RP optimizations are projected to produce speedups of 50 to 100X over single-threaded performance (in a compute environment with, for example, 20 servers with 10 cores each).  Moreover, the RP parts of HAL0 only require moderate bandwidth and are tolerant of relatively high latency (say, 100 ms).
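To put the projected figures in perspective, the arithmetic below converts a speedup on 20 x 10 = 200 cores into parallel efficiency, and into the parallel fraction implied by Amdahl's law. Applying Amdahl's law is an assumption made for this sanity check, not something HAL0's projections are stated to rely on.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / cores

def amdahl_parallel_fraction(speedup, cores):
    """Parallel fraction p such that 1 / ((1 - p) + p / cores) == speedup."""
    return (1 - 1 / speedup) / (1 - 1 / cores)

cores = 20 * 10
for speedup in (50, 100):
    print(speedup, parallel_efficiency(speedup, cores),
          round(amdahl_parallel_fraction(speedup, cores), 4))
```

Under that assumption, 50X and 100X on 200 cores correspond to 25% and 50% parallel efficiency, and to roughly 98.5% and 99.5% of the single-threaded run time being parallelizable.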

Bottom line:  HAL0 portfolio-optimization software is designed to be scalable and massively parallel.




Portfolio-Optimization Software: A Financial Software Suite for Power Users

When I’m not coding Sigma1 financial software, I’m often away from the keyboard thinking about it.  Lately I’ve been thinking about how to subdivide the HAL0 portfolio-optimization code into autonomous pieces.  There are many reasons to do this, but I will start by focusing on just one:  power users.

For this blog post, I’m going to consider Jessica, a successful 32-year-old proprietary trader. Jessica is responsible for managing just over $500 million in company assets. She has access to in-house research, and can call on the company’s analysts, quants, and researchers as needed and available. Jessica also has a dedicated $500,000 annual technology budget, and she’s in the process of deciding how to spend it most effectively. She is evaluating financial software from several vendors.

Jessica is my target market for B2B sales.

In my electrical engineering career I have been responsible for evaluating engineering software.  Often I would start my search with 3 to 5 software products.  Due to time constraints, I would quickly narrow my evaluation to just two products.   Key factors in the early software vetting process were 1) ease of integration into our existing infrastructure and 2) ease of turn-on.  After narrowing the search to two, the criteria switched to performance, performance, performance, and flexibility… essentially a bake-off between the two products.

Ease of use and integration was initially critical because I (or others on my team) needed a product we could test and evaluate in-house.  I refused to make a final selection or recommendation based on vendor-provided data.  We needed software that we could get up and running quickly… solving our problems on our systems.  We’d start with small problems to become familiar with the products and then ramp up to our most challenging problems to stress and evaluate the software.

Assuming Jessica (and others like her) follow a similar approach to software purchases, HAL0 portfolio software has to get through both phases of product evaluation.

HAL0 software is optimized to “solve” simultaneously for 3 goals, normally:

  1. A risk metric
  2. A total-return metric
  3. An “x-factor” metric (often an orthogonal risk metric)

While being great for power users, the x-factor metric is also non-standard.  To help get through the “ease of start up” phase of product evaluation, I will likely default the x-factor to off.

Once the preliminary vetting is complete and final evaluation begins in earnest, performance and flexibility will be put to the test.  If, for instance, Jessica’s team has access to a proprietary risk-analysis widget utilizing GPGPUs that speeds up risk computation 100X over CPUs, HAL0 software can be configured to support it.  Because HAL0 is increasingly modular and componentized, Jessica’s resident quants can plug in proprietary components using relatively simple APIs.  Should competing products lack this plug-in capability, HAL0 software will have a massive advantage out of the starting gate.

When it comes to flexibility, component-based design wins hands down.  Proprietary risk metrics can be readily plugged in to HAL0 optimization software.  Such risk models can replace the default semi-variance model, or be incorporated as an adjunct risk metric in “x-space.”  Users can seed portfolio optimization with existing real and/or hypothetical portfolios, and HAL0 software will explore alternatives using user-selected risk and performance metrics.

HAL0 Portfolio-Optimization Basic Features

Out-of-the-box HAL0 software comes pre-loaded with semi-variance and variance risk metrics. Historic or expected-return metrics are utilized for each security. By default, the total return of any portfolio is computed as the weighted average of expected (or historic) total return of the securities in that portfolio. Naturally, leveraged and short-position weightings are supported as desired.
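The default return calculation is simple enough to show in a few lines. The function name and the sample weights and returns below are made up for illustration; a negative weight stands for a short position.

```python
def portfolio_return(weights, expected_returns):
    """Weighted average of per-security expected (or historic) total returns."""
    return sum(w * r for w, r in zip(weights, expected_returns))

# Long-only: weights are positive and sum to 1.
long_only = portfolio_return([0.5, 0.3, 0.2], [0.08, 0.05, 0.12])

# Leveraged long-short: 130% long exposure funded by a 30% short position.
long_short = portfolio_return([0.8, 0.5, -0.3], [0.08, 0.05, 0.12])
```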

These basic features are provided for convenience and ease of setup. While robust and “battle-ready”, I do not consider them major value-add components. The key value-proposition for HAL0 portfolio-optimization is its efficient, multi-objective engine. HAL0 software is like a race car delivered ready to compete with a best-in-class engine. All the components are race-ready, but some world-class race teams will buy the car simply to acquire the engine and chassis, and retrofit the rest.

Because HAL0 software is designed from the ground up to efficiently optimize large-asset-count portfolios using 3 concurrent objectives, switching to conventional 2-D optimization is child’s play.  Part of the basic line up of “x-space” metrics includes:

  • Diversification as measured against root-mean-square deviation from market or benchmark in terms of sector-allocation, market-cap allocation, or a weighted hybrid of both.
  • Quarterly, 1-year, 3-year, or 5-year worst loss.
  • Semi-variance (SV) or variance (V) — meaning concurrent optimization for both V and SV is possible.
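Two of the x-space metrics listed above can be sketched in a few lines. The function names, the dictionary-based allocation format, and the rolling-window reading of “worst loss” are assumptions for the example, not HAL0's definitions.

```python
from math import sqrt

def rms_deviation(portfolio_alloc, benchmark_alloc):
    """Root-mean-square gap between two allocations keyed by sector (or cap bucket)."""
    keys = set(portfolio_alloc) | set(benchmark_alloc)
    gaps = [portfolio_alloc.get(k, 0.0) - benchmark_alloc.get(k, 0.0) for k in keys]
    return sqrt(sum(g * g for g in gaps) / len(gaps))

def worst_loss(values, window):
    """Most negative return over any `window` consecutive periods (0.0 if none lose).

    Requires len(values) > window.
    """
    returns = [values[i + window] / values[i] - 1
               for i in range(len(values) - window)]
    return min(min(returns), 0.0)
```

With monthly portfolio values, `worst_loss(values, 3)` would give a worst quarterly loss and `worst_loss(values, 12)` a worst 1-year loss.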

Don’t “Think Different”, Be Different!

My primary target audience is professional investors who simply refuse to run with the herd.  They don’t seek difference for its own sake, but because they wish to achieve more.  HAL0 is designed to help ease its own learning curve by enabling users to quickly achieve the portfolio-optimization equivalent of “Hello, World!”, while empowering the power user to configure, tweak, and augment to virtually any desired extreme.

The Equation that Will Change Finance

Two mathematical equations have transformed the world of modern finance.  The first was CAPM, the second Black-Scholes.  CAPM gave a new perspective on portfolio construction.  Black-Scholes gave insight into pricing options and other derivatives.  There have been many other advancements in the field of financial optimization, such as Fama-French — but CAPM and Black-Scholes-Merton stand out as perhaps the two most influential.

Enter Semi-Variance

[Figure: Modified Semi-Variance Equation, A Financial Game Changer]

When CAPM (and MPT) were invented, computers existed, but were very limited. Though the father of MPT, Harry Markowitz, wanted to use semi-variance, the computers of 1959 were simply inadequate. So Markowitz used variance in his groundbreaking book “Portfolio Selection: Efficient Diversification of Investments”.

Choosing variance over semi-variance made the computations orders of magnitude easier, but they were still very taxing to the computers of 1959. Classic covariance-based optimizations are still reasonably compute-intensive when a large number of assets are considered. Classic optimization of a 2000-asset portfolio starts by creating a covariance matrix with 2,001,000 unique entries (n(n+1)/2 for n = 2000; mirrored about the diagonal, the full matrix has 4,000,000 entries); that is the easy part. The hard part involves optimizing (minimizing) portfolio variance for a range of expected returns. This is often referred to as computing the efficient frontier.
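For readers who want the mechanics, here is a small sketch of the two pieces just mentioned: the count of unique entries in a symmetric covariance matrix, n(n+1)/2, and the portfolio variance computed from that matrix. The function names and the 2-asset matrix are invented for illustration.

```python
def unique_covariance_entries(n_assets):
    """Diagonal plus one triangle of a symmetric n x n matrix."""
    return n_assets * (n_assets + 1) // 2

def portfolio_variance(weights, cov):
    """Quadratic form w' * C * w over a full (mirrored) covariance matrix."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j]
               for i in range(n) for j in range(n))

cov = [[0.04, 0.01], [0.01, 0.09]]   # made-up 2-asset covariance matrix
variance = portfolio_variance([0.5, 0.5], cov)
```

Evaluating the variance of one portfolio is cheap. The expense comes from repeating it while searching for the minimum-variance weights at each level of expected return.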

The concept of semi-variance (SV) is very similar to variance used in CAPM. The difference is in the computation. A quick internet search reveals very little data about computing a “semi-covariance matrix”. Such a matrix, if it existed in the right form, could possibly allow quick and precise computation of portfolio semi-variance in the same way that a covariance matrix does for computing portfolio variance. Semi-covariance matrices (SVMs) exist, but none “in the right form.” Each form of SVM has strengths and weaknesses. Thus, one of the many problems with semi-covariance matrices is that there is no unique canonical form for a given data set. SVMs of different types only capture an incomplete portion of the information needed for semi-variance optimization.

The beauty of SV is that it measures “downside risk”, exclusively.  Variance includes the odd concept of “upside risk” and penalizes investments for it.  While not  going to the extreme of rewarding upside “risk”, the modified semi-variance formula presented in this blog post simply disregards it.

I’m sure most of the readers of this blog understand this modified semi-variance formula. Please indulge me while I touch on some of the finer points. First, the 2 may look a bit out of place. The 2 simply normalizes the value of SV relative to variance (V). Second, the “question mark, colon” notation simply means if the first statement is true use the squared value in summation, else use zero. Third, notice I use r_i rather than r_i - r_avg.

The last point above is intentional and another difference from “mean variance”, or rather “mean semi-variance”. If R is monotonically increasing during all samples (n intervals, n+1 data points), then SV is zero. I have many reasons for this choice. The primary reason is that with r_avg the SV for a straight descending R would be zero. I don’t want a formula that rewards such a performance with 0, the best possible SV score. [Others would substitute T, a usually positive number, as target return, sometimes called minimal acceptable return.]

Finally, a word about r: r_i is the total return over interval i. Intervals should be as uniform as possible. I tend to avoid daily intervals due to the non-uniformity introduced by weekends and holidays. Weekly (last closing price of the trading week), monthly (last closing price of the month), and quarterly are significantly more uniform in duration.
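Pieced together from the points above (a normalizing factor of 2, the conditional “question mark, colon” term, and r_i used without subtracting r_avg), the formula can be written out roughly as follows. This is a reconstruction from the prose: the division by n and the exact downside test r_i < 0 are inferred rather than quoted.

```latex
SV = \frac{2}{n} \sum_{i=1}^{n} \left( r_i < 0 \;?\; r_i^{2} : 0 \right)
```

Read this way, a series with no negative interval returns scores SV = 0, while a steadily falling series scores well above zero, which matches the behavior described above.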

Big Data and Heuristic Algorithms

Innovations in computing and algorithms are how semi-variance equations will change the world of finance.  Common sense is why. I’ll explain why heuristic algorithms like Sigma1’s HALO can quickly find near-optimal SV solutions on a common desktop workstation, and even better solutions when leveraging a data center’s resources.  And I’ll explain why SV is vastly superior to variance.

Computing SV for a single portfolio of 100 securities is easy on a modern desktop computer. For example, 3-year monthly semi-variance requires 3700 multiply-accumulate operations to compute portfolio return, R_p, followed by a mere 37 subtractions, 36 multiplies (for squaring), and 36 additions (plus multiplying by 2/n). Any modern computer can perform this computation in the blink of an eye.
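The computation just described can be sketched as three small steps. The names are hypothetical, holdings are treated as share counts applied to total-return prices, and the downside test `r < 0` is an assumption drawn from the description of the formula.

```python
def portfolio_values(holdings, prices):
    """prices[k][t] is the total-return price of security k at time t."""
    points = len(prices[0])
    # One multiply-accumulate per security per data point.
    return [sum(h * p[t] for h, p in zip(holdings, prices))
            for t in range(points)]

def interval_returns(values):
    """Total return over each interval between consecutive data points."""
    return [values[t + 1] / values[t] - 1 for t in range(len(values) - 1)]

def modified_semivariance(returns):
    """(2/n) * sum of squared negative returns; gains contribute zero."""
    return 2 * sum(r * r for r in returns if r < 0) / len(returns)

# Made-up example: 2 securities, 4 data points (3 intervals).
prices = [[100, 102, 99, 101], [50, 49, 50, 52]]
values = portfolio_values([1, 2], prices)
sv = modified_semivariance(interval_returns(values))
```

With 100 securities and 37 monthly data points, the first function accounts for the 3,700 multiply-accumulates; everything after that works on a single series of 36 returns.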

Now consider building a 100-security portfolio from scratch. Assume the portfolio is long-only and that any of these securities can have a weight between 0.1% and 90% in steps of 0.1%. Each security has 900 possible weightings. I’ll spare you the math: there are about 6.385 × 10^138 possible portfolios. Needless to say, this problem cannot be solved by brute force. Further note that if the portfolio is turned into a long-short portfolio, where negative values down to -50% are allowed, the search space explodes to roughly 10^300 (each weight then has 1,401 possible values, so the count is capped at 1,401^100, or just under 10^315).
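For anyone who does want the math, the long-only count can be checked with a standard “stars and bars” argument: 100% is 1,000 units of 0.1%, split among 100 securities that each hold at least one unit. The snippet below is a sketch of that reasoning and assumes the weights must sum to exactly 100%.

```python
from math import comb

UNITS = 1000        # 100% expressed in steps of 0.1%
SECURITIES = 100

# Compositions of 1000 units into 100 parts of at least 1 unit each.
count = comb(UNITS - 1, SECURITIES - 1)
# The 90% cap only rules out the 100 cases where one security holds 90.1%.
count -= SECURITIES

print(f"{count:.3e}")
```

The result is a 139-digit number, on the order of 10^138, which is why enumeration is off the table even before short positions are allowed.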

I don’t care how big your data center is, a brute force solution is never going to work. This is where heuristic algorithms come into play. Heuristic algorithms are a subset of metaheuristics. In essence heuristic algorithms are algorithms that guide heuristics (or vice versa) to find approximate solution(s) to a complex problem. I prefer the term heuristic algorithm to describe HALO, because in some cases it is hard to say whether a particular line of code is “algorithmic” or “heuristic”, because sometimes the answer is both. For example, semi-variance is computed by an algorithm but is fundamentally a heuristic.

Heuristic Algorithms, HAs, find practical solutions for problems that are too difficult to brute force.  They can be configured to look deeper or run faster as desired by the user.  Smarter HAs can take advantage of modern computer infrastructure by utilizing multiple threads, multiple cores, and multiple compute servers in parallel.  Many, such as HAL0, can provide intermediate solutions as they run far and deep into the solution space.

Let me be blunt — If you’re using Microsoft Excel Solver for portfolio optimization, you’re missing out.  Fly me out and let me bring my laptop loaded with HAL0 to crunch your data set — You’ll be glad you did.

Now For the Fun Part:  Why switch to Semi-Variance?

Thanks for reading this far! Would you buy insurance that paid you if your house didn’t burn down? Say you pay $500/year and after 10 years, if your house is still standing, you get $6000. Otherwise you get $0. Ludicrous, right? Or insurance that only “protects” your house from appreciation? Say it pays 50 cents for every dollar you make when you resell your house, but if you lose money on the resale you get nothing?

In essence that is what you are doing when you buy (or create) a portfolio optimized for variance.   Sure, variance analysis seeks to reduce the downs, but it also penalizes the ups (if they are too rapid).  Run the numbers on any portfolio and you’ll see that SV ≠ V.  All things equal, the portfolios with SV < V are the better bet. (Note that classic_SV ≤ V, because it has a subset of positive numbers added together compared to V).
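The parenthetical about classic semi-variance is easy to check numerically. In the sketch below, classic semi-variance is taken to be the mean-centered form that keeps only below-average terms; the function names and sample returns are made up.

```python
from statistics import mean

def variance(returns):
    """Population variance: mean squared deviation from the average return."""
    avg = mean(returns)
    return sum((r - avg) ** 2 for r in returns) / len(returns)

def classic_semivariance(returns):
    """Same sum as variance, but keeping only the below-average terms."""
    avg = mean(returns)
    return sum((r - avg) ** 2 for r in returns if r < avg) / len(returns)

returns = [0.04, -0.02, 0.10, -0.01, 0.03]   # made-up interval returns
```

Since every dropped term is a non-negative square, `classic_semivariance(returns)` can never exceed `variance(returns)`, whatever data is supplied.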

Let me close with a real-world example. SPLV is an ETF I own. It is based on owning the 100 stocks out of the S&P 500 with the lowest 12-month volatility. It has performed well, and been received well by the ETF marketplace, accumulating over $1.5 billion in AUM. A simple variant of SPLV (which could be called PLSV for PowerShares Low Semi-Variance) would contain the 100 stocks with the least SV. An even better variant would contain the 100 stocks that in aggregate produced the lowest SV portfolio over the preceding 12 months.

HALO has the power to construct such a portfolio. It could solve while preserving the relative market-cap ratios of the 100 stocks, so that the only decision is which 100 stocks are collectively optimal. Or it could produce a re-weighted portfolio that further reduced overall semi-variance.

[Even more information on semi-variance (in its many related forms) can be found here.]