In my last post I showed that there are far more than a googol permutations of a portfolio of 100 assets with (positive, non-zero) weights in increments of 10 basis points, or 0.1%. That number can be expressed as C(999,99), or equivalently C(999,900), or 999!/(99!·900!), or ~6.385×10^138. Out of sheer audacity, I will call this number Balhiser’s first constant (Kβ1). [Wouldn’t it be ironic and embarrassing if my math were incorrect?]
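The count is easy to sanity-check with Python’s standard library (a quick verification sketch, not part of any production code):

```python
# Kβ1 = C(999, 99): the number of ways to assign positive weights in
# 0.1% increments to 100 assets summing to 100% is the number of
# compositions of 1000 into 100 positive parts, i.e. C(999, 99).
import math

k_beta_1 = math.comb(999, 99)

# C(999, 99) == C(999, 900) by symmetry of the binomial coefficient.
assert k_beta_1 == math.comb(999, 900)

print(f"K_beta_1 ~ {k_beta_1:.3e}")  # on the order of 10^138
```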
In the spirit of Alan Turing’s 100th birthday today and David Hilbert’s 23 unsolved problems of 1900, I propose the creation of an initial set of financial problems to rate the general effectiveness of various portfolio-optimization algorithms. These problems would be of a similar form, each having a search space of Kβ1. There would be 23 initial problems, P1…P23. Each would have a series of 37 monthly absolute returns. Each security would have an expected annualized 3-year return (some based on the historic 37-month returns, others independent). The challenge for any algorithm A is to achieve the best average score on these problems.
I propose the following scoring measures: 1) S″(A) (S double prime), which simply computes the least average semi-variance portfolio independent of expected return. 2) S′(A), which computes the best average semi-variance and expected-return efficient frontier versus a baseline frontier. 3) S(A), which computes the best average semi-variance, variance, and expected-return efficient-frontier surface versus a baseline surface. Any algorithm would be disqualified if any single test took longer than 10 minutes. Similarly, any algorithm would be disqualified if it failed to produce a “sufficient solution density and breadth” for S′ and S″ on any test. Obviously, a standard benchmark computer would be required. Any OS, supporting software, etc. could be used for purposes of benchmarking.
The benchmark computer would likely be a well-equipped multi-core system such as a 32 GB Intel i7-3770 system. There could be separate benchmarks for parallel computing, where the algorithm plus hardware was tested as a holistic system.
I propose these initial portfolio benchmarks for a variety of reasons. 1) Similar standardized benchmarks have been very helpful in evaluating and improving algorithms in other fields such as electrical engineering. 2) A standard helps separate statistically significant inference from anecdotal inference. 3) Benchmarks illustrate both the challenge and the opportunity for financial algorithms to solve important investing problems. 4) They lower barriers to entry for financial algorithm developers (and thus lower the cost of high-quality algorithms to financial businesses). 5) I believe HAL0 can provide superior results.
Two mathematical equations have transformed the world of modern finance. The first was CAPM, the second Black-Scholes. CAPM gave a new perspective on portfolio construction. Black-Scholes gave insight into pricing options and other derivatives. There have been many other advancements in the field of financial optimization, such as Fama-French — but CAPM and Black-Scholes-Merton stand out as perhaps the two most influential.
When CAPM (and MPT) were invented, computers existed, but were very limited. Though the father of MPT, Harry Markowitz, wanted to use semi-variance, the computers of 1959 were simply inadequate. So Markowitz used variance in his groundbreaking book “Portfolio Selection: Efficient Diversification of Investments”.
Choosing variance over semi-variance made the computations orders of magnitude easier, but they were still very taxing to the computers of 1959. Classic covariance-based optimizations are still reasonably compute-intensive when a large number of assets are considered. Classic optimization of a 2000-asset portfolio starts by creating a covariance matrix with 2,001,000 unique entries (which, when mirrored about the shared diagonal, fill the full 4,000,000-entry matrix); that is the easy part. The hard part involves optimizing (minimizing) portfolio variance for a range of expected returns. This is often referred to as computing the efficient frontier.
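To make the matrix bookkeeping concrete, here is a small illustrative sketch (my own, not any particular optimizer’s code) of the entry counts for an n-asset covariance matrix:

```python
# Entry counts for an n-asset covariance matrix. The matrix is
# symmetric, so only the diagonal plus one triangle need be computed;
# the other triangle is mirrored.
def covariance_entry_counts(n_assets):
    total = n_assets * n_assets              # full (mirrored) matrix
    unique = n_assets * (n_assets + 1) // 2  # diagonal + one triangle
    return total, unique

total, unique = covariance_entry_counts(2000)
print(total)   # 4000000 entries in the full matrix
print(unique)  # 2001000 unique entries to actually compute
```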
The concept of semi-variance (SV) is very similar to the variance used in CAPM. The difference is in the computation. A quick internet search reveals very little data about computing a “semi-covariance matrix”. Such a matrix, if it existed in the right form, could possibly allow quick and precise computation of portfolio semi-variance in the same way that a covariance matrix does for computing portfolio variance. Semi-covariance matrices do exist, but none “in the right form.” Each form has strengths and weaknesses. Thus, one of the many problems with semi-covariance matrices is that there is no unique canonical form for a given data set. Semi-covariance matrices of different types each capture only an incomplete portion of the information needed for semi-variance optimization.
The beauty of SV is that it measures “downside risk”, exclusively. Variance includes the odd concept of “upside risk” and penalizes investments for it. While not going to the extreme of rewarding upside “risk”, the modified semi-variance formula presented in this blog post simply disregards it.
I’m sure most of the readers of this blog understand this modified semi-variance formula. Please indulge me while I touch on some of the finer points. First, the 2 may look a bit out of place. The 2 simply normalizes the value of SV relative to variance (V). Second, the “question mark, colon” notation simply means: if the first statement is true, use the squared value in the summation, else use zero. Third, notice I use r_i rather than r_i − r_avg.
The last point above is intentional, and another difference from “mean variance”, or rather “mean semi-variance”. If R is monotonically increasing for all samples (n intervals, n+1 data points), then SV is zero. I have many reasons for this choice. The primary reason is that with r_avg the SV for a straight descending R would be zero. I don’t want a formula that rewards such a performance with 0, the best possible SV score. [Others would substitute T, a usually positive number, as a target return, sometimes called the minimal acceptable return.]
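In code, the modified semi-variance described above might look like this (my own sketch of the formula as stated: square only the negative interval returns, sum, and scale by 2/n):

```python
def modified_semi_variance(returns):
    """Modified SV: (2/n) * sum of r_i^2 over intervals where r_i < 0.

    Uses raw interval returns r_i, not deviations from the mean, so a
    monotonically rising return series scores a perfect 0.
    """
    n = len(returns)
    return (2.0 / n) * sum(r * r for r in returns if r < 0)

# A steadily rising series has zero downside risk under this measure:
print(modified_semi_variance([0.01, 0.02, 0.015, 0.03]))   # 0.0

# Mixed returns: only the negative intervals contribute.
# (2/4) * (0.01^2 + 0.02^2) ≈ 0.00025
print(modified_semi_variance([0.02, -0.01, 0.03, -0.02]))
```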
Finally, a word about r_i: it is the total return over the interval i. Intervals should be as uniform as possible. I tend to avoid daily intervals due to the non-uniformity introduced by weekends and holidays. Weekly (last closing price of the trading week), monthly (last closing price of the month), and quarterly intervals are significantly more uniform in duration.
Big Data and Heuristic Algorithms
Innovations in computing and algorithms are how semi-variance equations will change the world of finance. Common sense is why. I’ll explain why heuristic algorithms like Sigma1’s HAL0 can quickly find near-optimal SV solutions on a common desktop workstation, and even better solutions when leveraging a data center’s resources. And I’ll explain why SV is vastly superior to variance.
Computing SV for a single portfolio of 100 securities is easy on a modern desktop computer. For example, 3-year monthly semi-variance requires 3700 multiply-accumulate operations to compute the portfolio return, R_p, followed by a mere 37 subtractions, 36 multiplies (for squaring), and 36 additions (plus multiplying by 2/n). Any modern computer can perform this computation in the blink of an eye.
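As a sketch of that computation (my own illustrative code, with made-up data, not HAL0 source):

```python
import random

random.seed(42)  # reproducible example data

n_sec, n_months = 100, 37
# Hypothetical monthly returns for each security (one row per month).
monthly = [[random.gauss(0.005, 0.04) for _ in range(n_sec)]
           for _ in range(n_months)]
weights = [1.0 / n_sec] * n_sec  # equal-weight portfolio for illustration

# 37 * 100 = 3700 multiply-accumulates: portfolio return per month.
r_p = [sum(w * r for w, r in zip(weights, row)) for row in monthly]

# Then the cheap tail: compare/square/sum the monthly portfolio
# returns and scale by 2/n.
sv = (2.0 / n_months) * sum(r * r for r in r_p if r < 0)
print(f"portfolio semi-variance: {sv:.6f}")
```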
Now consider building a 100-security portfolio from scratch. Assume the portfolio is long-only and that any of these securities can have a weight between 0.1% and 90% in steps of 0.1%. Each security has 900 possible weightings. I’ll spare you the math — there are 6.385×10^138 permutations. Needless to say, this problem cannot be solved by brute force. Further note that if the portfolio is turned into a long-short portfolio, where negative weights down to -50% are allowed, the search space explodes to close to 10^200.
I don’t care how big your data center is, a brute-force solution is never going to work. This is where heuristic algorithms come into play. Heuristic algorithms are a subset of metaheuristics. In essence, heuristic algorithms are algorithms that guide heuristics (or vice versa) to find approximate solutions to a complex problem. I prefer the term heuristic algorithm to describe HAL0, because in some cases it is hard to say whether a particular line of code is “algorithmic” or “heuristic”, because sometimes the answer is both. For example, semi-variance is computed by an algorithm but is fundamentally a heuristic.
Heuristic algorithms (HAs) find practical solutions for problems that are too difficult to brute-force. They can be configured to look deeper or run faster, as desired by the user. Smarter HAs can take advantage of modern computer infrastructure by utilizing multiple threads, multiple cores, and multiple compute servers in parallel. Many, such as HAL0, can provide intermediate solutions as they run far and deep into the solution space.
Let me be blunt — If you’re using Microsoft Excel Solver for portfolio optimization, you’re missing out. Fly me out and let me bring my laptop loaded with HAL0 to crunch your data set — You’ll be glad you did.
Now For the Fun Part: Why switch to Semi-Variance?
Thanks for reading this far! Would you buy insurance that paid you if your house didn’t burn down? Say you pay $500/year and after 10 years, if your house is still standing, you get $6000. Otherwise you get $0. Ludicrous, right? Or insurance that only “protects” your house from appreciation? Say it pays 50 cents for every dollar you make when you resell your house, but if you lose money on the resale you get nothing.
In essence that is what you are doing when you buy (or create) a portfolio optimized for variance. Sure, variance analysis seeks to reduce the downs, but it also penalizes the ups (if they are too rapid). Run the numbers on any portfolio and you’ll see that SV ≠ V. All things equal, the portfolios with SV < V are the better bet. (Note that classic_SV ≤ V, because it sums only a subset of the squared terms that make up V.)
Let me close with a real-world example. SPLV is an ETF I own. It is based on owning the 100 stocks out of the S&P 500 with the lowest 12-month volatility. It has performed well, and been received well by the ETF marketplace, accumulating over $1.5 billion in AUM. A simple variant of SPLV (which could be called PLSV for PowerShares Low Semi-Variance) would contain the 100 stocks with the least SV. An even better variant would contain the 100 stocks that in aggregate produced the lowest SV portfolio over the preceding 12 months.
HAL0 has the power to construct such a portfolio. It could, while preserving the relative market-cap ratios of the 100 stocks, pick which 100 stocks are collectively optimal. Or it could produce a re-weighted portfolio that further reduces overall semi-variance.
[Even more information on semi-variance (in its many related forms) can be found here.]
When I Google “green software”, the top results are about software that helps other activities become greener, such as utility power optimization or HVAC optimization. This makes sense because “smart power” has a bigger footprint than compute power consumption. Other search results focus on green IT (information technology).
I would like to contribute to the dialog about green software itself. My definition of green technologies (green software, green hardware) is technology that consumes less power while producing comparable or better results. My introduction to green technology began with power-efficient hardware design 7 years ago. I learned that power savings equals performance improvement due to thermodynamic and other considerations. This mantra (less power equals more performance) is gradually transforming semiconductor design. Technology companies that understand this best (Intel, ARM, Samsung, Google) stand to benefit in the years ahead.
In general, for every watt of power consumed by compute hardware, a watt or more of building cooling power is required. Software’s true power consumption is thus about 2 times the compute power it consumes. Compute power includes system power (CPU, RAM, drives, peripherals, and power-supply losses) plus power for networking (routers, switches, etc.) plus other power consumers like network-attached storage.
Some of the electrical engineering compute jobs I run take a week running 24×7 to complete, and often run on multiple CPUs and on multiple computers. Each compute job easily consumes an average of 1 kilowatt of compute power, hence an average of 2 kW of data center power. This works out to about 336 kWh per run. That is about $33 worth of power, and enough to power the average home’s electric needs (not counting heating and cooling) for about 8 days.
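The arithmetic is simple (the electricity rate here is my own illustrative assumption of roughly $0.10 per kWh):

```python
# Back-of-the-envelope job energy: 1 kW of compute power implies roughly
# 2 kW of total data-center draw (compute + cooling), run for one week.
compute_kw = 1.0
total_kw = 2.0 * compute_kw   # ~1 W of cooling per compute watt
hours = 7 * 24                # one week, running 24x7

energy_kwh = total_kw * hours       # 336 kWh per run
cost = energy_kwh * 0.10            # illustrative ~$0.10/kWh rate

print(energy_kwh)   # 336.0
print(round(cost))  # ~$34
```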
Right now the “greenness” of software is relative. Today the software development world doesn’t have the right models to compare whether, say, particular database software is more or less green than particular financial software. Software and IT professionals can, however, assess whether one specific portfolio-optimization solution is more or less green than another.
Creating green software begins with asking the right questions. The fundamental questions are “How much power does our software consume, and how can we reduce it?” I started to ask myself these questions early in the development stage of HAL0 financial software. I realized that the software was running fairly large computations on the same data over and over again: computations like the 3-year volatility of an asset. I created a simple software cache mechanism that first checked whether the exact complex computation had been performed before. If it had, the cache simply returned the previous result. If not, the computation was performed and the result was saved in the cache. The result was a 3x speedup, and an approximately 3x improvement in performance per watt, for the HAL0 portfolio-optimization software. The mantra that “power saved is performance gained” is even more true in the world of software.
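The cache mechanism can be sketched in a few lines (a simplified stand-in for the real implementation, using a hypothetical volatility function and Python’s standard memoization decorator):

```python
import functools

calls = 0  # count how often the expensive computation actually runs

@functools.lru_cache(maxsize=None)
def three_year_volatility(returns):
    """Hypothetical expensive computation; `returns` must be hashable (a tuple)."""
    global calls
    calls += 1
    n = len(returns)
    mean = sum(returns) / n
    return (sum((r - mean) ** 2 for r in returns) / n) ** 0.5

series = tuple(0.01 * ((-1) ** i) for i in range(36))
a = three_year_volatility(series)
b = three_year_volatility(series)  # cache hit: no recomputation

assert a == b and calls == 1
print("cache hits:", three_year_volatility.cache_info().hits)  # 1
```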
In other words, green software design practices and a focus on faster software often lead to the same types of software improvements. The thought processes used to arrive at those improvements, however, can be different enough to give software developers new perspectives on their algorithms and code. I found that some solutions that eluded me while looking for performance enhancements (speedups) were easily discovered by thinking about power and resource inefficiencies. Similarly, some software improvements that came quickly from profiling performance data would have been unlikely to occur to me when thinking about green software methods; it is only in retrospect that I saw their performance-per-watt benefit. The concept of “green software” is complementary with other software concepts such as lightweight software, rapid prototyping, and time-complexity analysis and optimization.
HAL0 portfolio-optimization software is designed to be green. It is designed to get more done with less power consumption. Like all other green software, greener means faster. Some of HAL0’s speedups have come from thinking green, and others have come from “thinking fast.” The speed and efficiency of HAL0’s core engine is high, but I already envision further improvements of 5 to 10x. It is simply a question of time to implement them.
Simply put, when one software product is more efficient than another, it runs faster and takes less time to solve the same problem. The less time software takes to run, the less power is consumed.
By way of illustration, consider the efficiency of a steam ship going from New York to San Francisco before and after the Panama Canal was built. The canal was a technological marvel of its time, and it cut the journey distance from 13,000 miles to 5,000. It cut travel time by (more than) half, and reduced the journey’s coal consumption by 50%. The same work was performed, with the same “hardware” (the steamer), but in just 30 days rather than 60, and using half the fuel.
Faster run time is the most significant and most visible component of green software, but it is not the only significant factor. Other factors affecting how much power software consumes include:

* CPU cache efficiency (cache hits versus cache misses)
* Software cruft and bloat
* Use of specialized CPU instructions (such as SIMD)
* Algorithm scalability (parallelism)
Without getting too technical, I’ll briefly touch on each bullet point. A cache hit is when a CPU finds the information it needs in its internal cache memory, while a cache miss is when the CPU must send an off-chip request to the computer’s RAM to get the required data. A cache miss is about 100x slower than a cache hit, in part because the data has to travel about 10 cm for a cache miss, versus about 5 mm for a cache hit. The difference in power consumption between a cache hit and a cache miss can easily be 20x to 100x, or more.
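The textbook average-memory-access-time model, with illustrative round numbers of my own choosing, shows why the hit rate dominates:

```python
def average_access_time(hit_time_ns, miss_penalty_ns, hit_rate):
    """Classic AMAT model: hit time plus the expected miss penalty."""
    return hit_time_ns + (1.0 - hit_rate) * miss_penalty_ns

# Illustrative numbers: ~1 ns for a cache hit, ~100 ns round trip to
# RAM on a miss. Moving the hit rate from 90% to 99% cuts the average
# access time by more than 5x.
for hit_rate in (0.90, 0.99):
    print(hit_rate, average_access_time(1.0, 100.0, hit_rate))
```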
Most software starts out reasonably streamlined. Later, if the software is popular, comes a time when enhancement requests and bug reports arrive faster than developers can implement them in a streamlined manner. Consequently, many developers implement quick but inefficient fixes. Often this behavior is encouraged by managers trying to hit aggressive schedule commitments. The developers intend to come back and improve the code, but frequently their workload doesn’t permit it. After a while, developers forget where the software “kludges” or hacks are. Even worse, the initial developers either get reassigned to other projects or leave for other jobs. The new software developers are challenged to learn the unfamiliar code and implement fixes and enhancements — adding their own cruft. This is how crufty, bloated software emerges: overworked developers, a focus on schedule over software efficiency, and developer turnover.
Modern CPUs have specialized instructions and hardware for different compute operations. One example is Intel SSE technology, which features a variety of Single-Instruction, Multiple-Data (SIMD) extensions. For example, SSE4 (and AVX) can add 4 or more pairs of numbers (two 4-number vectors) in one operation, rather than 4 separate ADD operations. This reduces CPU instruction traffic and saves power and time.
Finally algorithm scalability is increasingly important to modern computing and compute efficiency. Scalability has many meanings, but I will focus on the ability of software to use multiple compute resources in parallel. [Also known as parallel computing.] Unfortunately most software in use today has limited or no compute-resource scalability. This means that this software can only use 1 core of a modern 4-core CPU. In contrast, linearly-scalable software could run 3x faster by using 3 of the 4 cores at full speed. Even better, it could run 3x faster on 4 cores running at 75% speed, and consume about 30% less power. [I’ll spare you the math, but if you are curious this link will get you started.]
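The bracketed claim can be sketched with a simple model. Assume per-core dynamic power scales roughly with the square of clock frequency (a common first-order approximation when supply voltage is scaled down along with frequency; real voltage-frequency curves vary, so the exact savings differ by chip):

```python
def relative_power(n_cores, freq_fraction):
    """First-order model: per-core dynamic power ~ frequency squared."""
    return n_cores * freq_fraction ** 2

# Two ways to get the same 3x throughput of a single full-speed core:
p3 = relative_power(3, 1.00)  # 3 cores at full speed -> 3.0 power units
p4 = relative_power(4, 0.75)  # 4 cores at 75% speed  -> 2.25 power units

assert abs(3 * 1.00 - 4 * 0.75) < 1e-12  # identical aggregate throughput
print(f"power saved: {1 - p4 / p3:.0%}")  # prints "power saved: 25%"
```

Under this rough model, four slower cores deliver the same throughput as three full-speed cores for about 25% less power, in the same ballpark as the figure quoted above.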
“Distributed Software” is Greener
Distributed computing is technology that allows compute jobs to be distributed into the “cloud” or a data center queue. Rather than desktop workstations sitting idle much of the day, a data center is a room full of computers that directs compute jobs to the least busy machines. Jobs can be directed to the computers best-suited to a particular compute request. Intelligent data centers can even put unused computers into a “deep sleep” mode that uses very little power.
I use the term distributed software to mean software that is easily integrated with a job-submission or queuing software infrastructure. [Short for distributed-computing-capable software.] Clearly, distributed software benefits directly from the efficiencies of a given data center. Distributed software can also benefit from the ability to run in parallel on multiple machines. The more tightly it is coupled with the capabilities and status of the data center, the more efficiently distributed software can adapt to dynamic changes.
Sigma1 Software is Green
Sigma1 financial software (code-named HAL0) has been designed from the ground up to be lean and green. First and foremost, HAL0 (code-named in honor of Arthur C. Clarke’s HAL 9000 — “H-A-L is derived from Heuristic ALgorithmic (computer)”) is architected to scale near-linearly to tens or hundreds of cores, “sockets”, or distributed machines. Second, the central kernel or engine is designed to be as light-weight and streamlined as possible — helping to reduce expensive cache misses. Third, HAL0 uses Heuristic Algorithms and other “AI” features to efficiently navigate astronomically-large search spaces (10^18 and higher). Fourth, HAL0 uses an innovative computation cache system that allows repeated complex computations to be looked up in the cache, rather than recomputed. In alpha testing, this feature alone accounted for a 3X run-time improvement. Finally, HAL0 portfolio software incorporates a number of more modest run-time and power-saving features such as coding vector operations explicitly as vector operations, thus allowing easier use of SIMD and possibly GPGPU instructions and hardware.
Some financial planners still use Microsoft Excel to construct and optimize portfolios. This is slow and inefficient, to say the least. Other portfolio software I have read about is an improvement over Excel, but mentions neither scalability nor heuristic algorithms. It is possible, perhaps likely, that other financial software with some of the capabilities of HAL0 exists. I suspect, however, that if it does, it is proprietary, in-house software that is not for sale.
A Plea for Better, Greener Software
In closing, I’d like the software community to consider how the efficiency (or inefficiency) of their current software products contributes to world-wide power consumption. Computer hardware has made tremendous strides in improving performance per watt in the last ten years, and continues to do so. IT and data-center technology is also becoming more power-efficient. Unfortunately, most software has been trending in the opposite direction, becoming more bloated and less efficient. I urge software developers and software managers to consider the impact of the software they are developing. I challenge you to consider, probably for the first time, how many kilowatt- or megawatt-hours your current software is likely to consume. Then ask yourself, “How can I reduce that power?”
Most of the reading I have done regarding angel investing suggests that finding the right “match” is a critical part of the process. This process is not just about a business plan and a product, it is also about people and personalities.
Let me attempt to give some insight into my entrepreneurial personality. I have been working (and continue to work) in a corporate environment for 15 years. Over that time I have received a lot of feedback. Two common themes emerge from that feedback. 1) I tend to be a bit too “technical”. 2) I tend to invest more effort on work that I like.
Long Story about my Tech Career
Since I work in the tech industry, being too technical at first didn’t sound like something I should work on. I eventually came to understand that this wasn’t feedback from my peers, but from managers. Tech moves so fast that many managers simply do not keep up with these changes except in the most superficial ways. (Please note I say many, not most.) While being technical is my natural tendency, I have learned to adjust the technical content to suit the composition of the meeting room.
The second theme has been a harder personal challenge. Two general areas I love are technical challenges and collaboration. I love when there is no “smartest person in the room” because everybody is the best at at least one thing, if not many. When a team like that faces a new critical issue — never before seen — magic often occurs. To me this is not work; it is much closer to play.
I have seen my industry, VLSI and microprocessor design, evolve and mature. While everyone is still the “smartest person in the room” at something, the arrival of novel challenges is increasingly rare. We are increasingly challenged to become masters of execution rather than masters of innovation.
Backing up a bit, when I started at Hewlett-Packard, straight out of college, I had the best job in the world, or darn near. For 3-4 months I “drank from a fire hose” of knowledge from my mentor. After just 6 months I was given what, even in retrospect, were tremendous responsibilities (and a nice raise). I was put in charge of integrating “logic synthesis” software into the lab’s compute infrastructure. When I started, about 10% of the lab’s silicon area was created via synthesis; when I left 8 years later about 90% of the lab’s silicon was created via logic synthesis. I was part of that transformation, but I wasn’t the cause — logic synthesis was simply the next disruptive technology in the industry.
So why did I change companies? I was developing software to build advanced “ASICs”. First the company moved ASIC manufacturing overseas, then, increasingly, ASIC hardware design. The writing was on the wall… ASIC software development would eventually move too. So I made a very difficult choice and moved into microprocessor software development. Looking back now, this was likely the best career choice I have ever made.
Practically overnight I was again “drinking from a fire hose.” Rather than working with software my former teammates and I had built from scratch, I was knee-deep in poorly-commented code that had been abandoned by all but one of the original developers. In about 9 months my co-developer and I had transformed this code into something that resembled properly-architected software.
Again, I saw the winds of change transforming my career environment: this time, microprocessor design. Software development was moving from locally-integrated hardware/software design labs to a centralized software-design organization. Seeing this shift, I moved within the company, to microprocessor hardware design. Three and a half years later I see the pros and cons of this choice. The largest pro is having about 5 times more opportunities in the industry, both within the company and without. The largest con, for me, is dramatically less software development work. Hardware design still requires some software work, perhaps 20-25%. Much of this software design, however, is very task-specific. When the task is complete — perhaps after a week or a month — it is obsolete.
A Passion for Software and Finance
While I was working, I spent some time in grad school. I took all the EE classes that related to VLSI and microprocessor design. The most interesting class was an open-ended research project. The project I chose, while related directly to microprocessor design, had a 50/50 mix of software design and circuit/device-physics research. I took over the software design work, and my partner took on most of the other work. The resulting paper was shortened and revised (with the help of our professor and a third grad student) and accepted for presentation at the 2005 Society for Industrial and Applied Mathematics (SIAM) Conference in Stockholm, Sweden. Unfortunately, none of us were able to attend due to conflicting professional commitments.
Having exhausted all “interesting” EE/ECE courses, I started taking grad school courses in finance. CSU did not yet have a full-fledged MSBA in Financial Risk Management program, but it did offer a Graduate Certificate in Finance, which I earned. Some research papers of note include “Above Board Methods of Hedging Company Stock Option Grants” and “Building an ‘Optimal’ Bond Portfolio including TIPS.”
Software development has been an interest of mine since I took a LOGO summer class in 5th grade. It has been a passion of mine since I taught myself “C” in high school. During my undergrad in EE, I took enough CS electives to earn a Minor in Computer Science along with my BSEE. Almost all of my elective CS courses centered around algorithms and AI. Unlike EE, which at times I found very challenging, I found CS courses easy and fun. That said, I earned straight A’s in college, grad and undergrad, with one exception: I got a B- in International Marketing. Go figure.
My interest in finance started early as well. I had a paper route at the age of 12, and a bank account. I learned about compound interest and was hooked. With help from my Dad, and still 12 years old, I soon had a money market account and a long-maturity zero-coupon bond. My full-fledged passion for finance developed when I was issued my first big grant of company stock options. I realized I knew quite a bit about stocks, bonds, CDs and money market funds, but I knew practically nothing about options. Learning about options was the primary reason I started studying Finance in grad school. I was, however, soon to learn about CAPM and MPT, and portfolio construction and optimization. Since then, trying to build the “perfect” portfolio has been a lingering fascination.
Gradually, I began to see flaws in MPT and the efficient-markets hypothesis (EMH). Flaws that Markowitz acknowledged from the beginning! [Amazing what you can learn from going beyond textbooks, and back to original sources.] I read in some depth about the rise and demise of Long-Term Capital Management. I read about high-frequency trading methods and algorithms. I looked into how options can be integrated into long-term portfolio-building strategies. And finally, I started researching the ever-evolving field of Post-Modern Portfolio Theory (PMPT).
When I finally realized how I could integrate my software development skills, my computer science (AI) background, my graduate EE/ECE work and my financial background into a revolutionary software product, I was thunderstruck. I can and did build the alpha version of this product, HAL0, and it works even better than I expected. If I can turn this product into a robust business, I can work on what I like, even what I love. And that passion will be a strength rather than a “flaw”. Send me an angel!
The first business plan I wrote was a basic outline for a small residential real-estate venture. It detailed the property, the company equity in the property, recurring expenses, the estimated value of the property, competitive rental market data, and expected cash flow. This simple, one-page business plan helped secure a $10,000 private loan, which has since been repaid. This business is still operating successfully and profitably.
Putting together a business plan for a start-up is a different matter. The financials are not there yet, and financial forecasting is at best a guess. Instead, the business plan must focus on ideas and products that serve to fill a gap in the target market. It must also demonstrate why this company and this product are well-positioned to fill that market need. Next, I’ll strive to write a start-up business plan…
Sigma1 Financial: A Business Plan for Revolutionizing Financial Portfolio Software.
The Market — Sigma1’s market analysis reveals a stunning gap in the B2C financial software space. There exists plenty of portfolio-analysis software, but nothing that is truly portfolio-optimization software. I refer the reader to two prime examples: 1) Quicken Premier 2012 (R) and 2) Financial Engines (R). Both tools help investors manage and analyze investment portfolios. They help with tracking asset allocations. Financial Engines goes further by providing portfolio advice on increasing or decreasing risk level and changing allocations between the following asset classes: cash, bonds, large-cap stocks, mid/small-cap stocks, and international stocks. In some cases Financial Engines partners with other firms and recommended changes can be implemented automatically.
Competing products tend to focus on broad market sectors and have little to no support for individual stocks and non-traditional-asset-class ETFs (such as gold ETFs, sector ETFs, convertible securities ETFs).
Market analysis of the B2B space is more challenging because publicly available product data is scarce. Nonetheless, in online social media conversations with investment professionals, several features of Sigma1 software appear to be unique. For now, market analysis of the B2B space is an ongoing process.
Core Product(s) — The Sigma1 Financial Engine, presently code-named HAL0 (HAL zero), is based, in part, on heuristic modeling, machine learning, and evolutionary algorithms. HAL0 has gone through rigorous alpha testing and has proven itself to be very robust for alpha code. Surrounding the HAL0 core are both traditional and proprietary financial heuristic and quantitative investment models. These models have been transformed into utility functions that plug into the HAL0 optimization engine. Additionally, there are scripts and add-ons that enable 2-D and 3-D data visualization using standard Open Source tools such as gnuplot.
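The plug-in idea described above can be sketched in a few lines. This is a hypothetical illustration, not the actual HAL0 API: each model is wrapped as a "bigger is better" utility function, so the optimization engine can maximize any mix of them uniformly.

```python
# Hypothetical sketch of the plug-in utility-function idea (names and
# formulas are illustrative, not the actual HAL0 interfaces).

def make_return_utility(expected_returns):
    """Utility = portfolio expected return (higher is better)."""
    def utility(weights):
        return sum(w * r for w, r in zip(weights, expected_returns))
    return utility

def make_diversification_utility():
    """Utility that rewards diversification: 1 minus the Herfindahl
    concentration index of the weights."""
    def utility(weights):
        return 1.0 - sum(w * w for w in weights)
    return utility

def combined_utility(weights, utilities, coeffs):
    """The engine only ever sees a weighted sum of plug-in utilities,
    so new financial models can be added without touching the core."""
    return sum(c * u(weights) for c, u in zip(coeffs, utilities))

# Example: two assets with 8% and 12% expected returns, equal weights.
utils = [make_return_utility([0.08, 0.12]), make_diversification_utility()]
score = combined_utility([0.5, 0.5], utils, [1.0, 0.1])
```

The benefit of this structure is that the optimizer core never needs to know anything about finance; it just maximizes whatever utilities are plugged in.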
Beyond the Products:
The software developer: I have been coding and investing since I was ten years old. In college (undergrad) I earned a degree in Electrical Engineering with a minor in Computer Science, graduating with a 3.97 GPA. After my undergrad work I developed electrical engineering software for Hewlett Packard, Agilent Technologies, and Intel Corporation. I led software development on a 5-person team that created the “silicon construction” engine used by 200+ engineers in the R&D lab.
While working at HP and Intel, I took graduate-level coursework in both Finance and Electrical Engineering. During that coursework, my partner and I created software that used EA and heuristic methods to quickly solve difficult non-linear engineering problems. It was years later that I realized these methods could be adapted to optimize financial portfolios… using not only classical modern portfolio theory (MPT), but also other methods beyond MPT.
I also manage a proprietary trading fund within Balhiser LLC and have written over 150 investing articles posted at balhiser.com.
Software Infrastructure and Development Model: There is a crucial difference between undisciplined “coding” and real software development. Both can create software that works in the present moment. A structured software development model, however, creates software with a future.
Sigma1’s HAL0 software development environment (SDE) includes a revision control system, software regression tests, unit tests, and tailored software testing support and debug tools. Some of the regression tests required special effort to work with the (pseudo-random) algorithms used in the software. Careful use of srand() and rand() calls allows HAL0 to maintain robust regression-testing capability.
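The srand()/rand() point deserves a concrete illustration. The same principle in Python (using a toy stand-in optimizer, not HAL0 code): with a fixed seed, a stochastic run becomes fully reproducible, so a regression test can compare exact results instead of fuzzy ranges.

```python
import random

def noisy_search(seed, n_iters=100):
    """Toy stand-in for a stochastic optimizer. Seeding a local
    generator makes every run with the same seed bit-for-bit
    reproducible -- the key to regression-testing random algorithms."""
    rng = random.Random(seed)  # local generator: no global-state leaks
    best = float("-inf")
    for _ in range(n_iters):
        x = rng.uniform(-1.0, 1.0)
        best = max(best, -x * x)  # maximize -x^2 (peak at x = 0)
    return best
```

A regression test then simply asserts that `noisy_search(42)` returns the same value today as it did at the last known-good revision; different seeds exercise different search paths for broader coverage.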
Further, a revision history and log dates back to day one of software development. It charts what bugs were found and how they were fixed. It explains what tradeoffs were made and why. The revision history and comments in the software suggest possible improvements.
The HAL0 SDE makes it much easier to add and test both run-time improvements and new features. It would also allow other software developers to come up to speed on the code more quickly. This would allow new developers to collaborate on, or even take over, HAL0 software development.
Marketing, Branding, Company Structure: Currently Sigma1 and HAL0 IP and assets are held within Balhiser LLC. Among these assets are approximately 40 registered domain names, about 20 of which are suited to portfolio-optimization software. My goal is to secure US trademarks on one or more of these names. Obviously, overt disclosure of these names would be unwise at this time.
Having learned that building web and social-media awareness is not an overnight process, I have begun building that online presence through a variety of avenues, including sigma1.com. This process is in its early stages, but Googling “Sigma1 Financial Software” or “Financial Software Heuristics” already produces page-one results.
The Business Model — The Sigma1 Financial Software model includes both B2B and B2C components. The B2B model is centered around leasing the portfolio-optimization engine and software add-ons to money managers, institutional investors, and/or investment advisers. The optimization is not limited to portfolios alone; it can also optimize funds and proprietary-trading accounts. Along with software leasing fees, businesses are likely to require training in the use of the software. Limited training could be negotiated as part of the software lease agreement; additional training will also be a revenue source. Finally, consulting and custom-feature development may provide additional revenue to the business.
The B2C component of the business model is currently planned to be completely internet-centric. A very limited free online version can serve two purposes: 1) as a marketing tool to induce users to pay for a full-featured subscription, and 2) possibly as a source of ad revenue. A full-featured, paid-subscription B2C version would be ad-free and would offer larger portfolios and greater investment modeling, optimization, and visualization features.
* Product names, logos, brands, and other trademarks featured or referred to within this document are the property of their respective trademark holders.
Building superior investment portfolios is what money managers are paid to do. As a fund manager, I wanted software to help me build superior, positive-alpha portfolios.
Not finding software that did anything like I wanted, I decided to write my own.
When I build or modify a portfolio I start with investment ideas. Ideas like going short BWX (international government debt) and long JNK (US junk bonds). I want some US equity exposure with VTI and some modest buy-write protection through ETB. And I have a few stocks that I believe are likely to outperform the market. What I’d like is portfolio software that will take my list of stocks, ETFs, and other securities and show me the risk/reward trade off for a variety of portfolios comprised of these securities.
Before I get too far ahead of myself, let me explain the above graphic. It uses two measures of risk and a proprietary measure of expected return. The risk measures are 3-year portfolio beta (vs. the S&P 500) and sector diversification. These risk measures are transformed into “utility metrics”, which simply means bigger is better. By maximizing utility, risk is minimized.
The risk utility metrics (or heuristics) are set up as follows. 10 is the absolute best score and 0 the worst. In this graph a beta of 1.0 results in a beta “risk metric” of 10. A beta of infinity would result in a beta risk metric of 0. For this simulation, I don’t care about betas less than 1, though they are not excluded. The sector diversification metric measures how closely any portfolio matches sector market-cap weights in the S&P 500. A perfect match scores a 10. The black “X” surrounded by a white circle denotes such a perfectly balanced portfolio. In fact this portfolio is used to seed the construction of the wide range of investment portfolios depicted in the chart.
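The post gives the endpoints of the beta metric but not its formula. One simple function consistent with those endpoints (beta of 1.0 scores 10, the score decays toward 0 as beta grows, and betas below 1 are not penalized) is a hypothetical sketch like this:

```python
def beta_utility(beta):
    """One plausible beta-to-utility mapping with the endpoints stated
    in the text (the actual HAL0 formula is not disclosed):
      beta == 1.0       -> 10 (best score)
      beta -> infinity  -> 0  (worst score)
      beta <  1.0       -> 10 (sub-1 betas are not penalized)"""
    return 10.0 / max(beta, 1.0)
```

For example, `beta_utility(2.0)` scores 5.0 under this mapping, while both `beta_utility(1.0)` and `beta_utility(0.8)` score the full 10.0. Any monotone-decreasing function with these endpoints would serve the same "bigger is better" role.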
One thing is immediately clear. Moving away from the relative safety of the 10/10 corner, expected returns increase, from 7.8% up to 15%. Another observation is that the software doesn’t think there is much benefit in increased beta (decreased beta metric) unless sector diversification is also decreased. [This is the software “talking”, not my opinion, per se.]
The contour lines help visualize the risk tradeoffs (trading beta risk for non-diversification risk) for a particular expected rate of return. The pink 11% return contour looks almost linear — an outcome I find a bit surprising given the non-linear risk-estimation heuristics used in the modeling.
For all that the graphic shows, there is much it does not. It does not show the composition or weightings of securities used to build the 100 portfolios whose scores appear. That data appears in reports produced by the portfolio-tuner software. The riskiest, but highest expected-return portfolios are heavy in financials and, intriguingly, consumer goods. More centrally-located portfolios, with expected returns in the 11% range, are over-weighted in the basic materials, services (retail), consumer goods, financial, and technology sectors.
Back to the original theme: desirable features of financial software — particularly portfolio-optimization software. For discussion, let’s assign the codename HAL0 (HAL zero, in homage to HAL 9000) to this portfolio-optimizing software. I don’t want dime-a-dozen stock/ETF screeners, but I do want software that I can ask: “HAL0, help me build a complete portfolio by finding securities that optimally complement this 70% core of securities.” Or “HAL, let’s create a volatility-optimized portfolio based on this particular list of securities, using my expected rates of return.” Even, “HAL, forget volatility, standard deviation, etc., and use my measures of risk and return, and build a choice of portfolios tuned and optimized to these heuristics.”
These are things the alpha version of HAL0 can do today (except for understanding English… you have to speak HAL’s language to pose your requests). The plot you see was built from data computed in just under 3 hours on an inexpensive desktop running Linux. That run used 10,000 iterations of the optimization engine. However, 100 iterations, running in a mere 2 minutes, produce a nearly identical solution space.
HAL0 supports n-dimensional solution spaces (surfaces, frontiers), though I’ve only tested 2-D and 3-D so far. The fact that visualizing 4-D data would probably involve an animated 3-D video makes me hesitate. And preserving “granularity” requires an exponential scaling in time complexity. Ten data points provide acceptable granularity for a 2-D optimization, 100 data points are acceptable for 3-D, and 1000 data points for 4-D. Under such conditions the 4-D sim would be a bit more than 10x slower. If a granularity of 20 is desired, the 3-D sim would be slowed by 4X, and a 4-D optimization by an additional 8X. I have considered the idea that a 4-D optimization could be used for a short time, say 10 iterations and/or with low granularity (say 8), and then one of the utility heuristics could be discarded and 3-D optimization (with higher depth and granularity) could continue from there… nothing in the HAL0 software precludes this.
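The granularity arithmetic above can be captured in a few lines. The figures in the text are consistent with data points scaling as granularity raised to the (dimensions − 1) power:

```python
def solution_points(granularity, dims):
    """Data points needed for a solution space of `dims` dimensions at
    a given per-axis granularity. The scaling granularity**(dims - 1)
    matches the figures in the text: at granularity 10, a 2-D frontier
    needs 10 points, a 3-D surface 100, and a 4-D space 1000."""
    return granularity ** (dims - 1)

# Doubling granularity to 20 slows the 3-D sim by 20**2 / 10**2 = 4x,
# and the 4-D sim by an additional 20**3 / 10**3 = 8x, as stated above.
slowdown_3d = solution_points(20, 3) // solution_points(10, 3)
slowdown_4d = solution_points(20, 4) // solution_points(10, 4)
```

This is why the exponential cost of higher-dimensional frontiers motivates the staged approach described above: run the 4-D optimization briefly at coarse granularity, then drop one heuristic and refine in 3-D.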
HAL0 is software built to build portfolios. It uses algorithms from software my partner and I developed in grad school to solve engineering problems: algorithms that build upon evolutionary algorithms, AI, machine learning, and heuristic methods. HAL0 also incorporates ideas and insights that I have had in the intervening 8 years. Incorporated into its software DNA are features that I find extremely important: robustness, scalability, and extensibility.
Today HAL0 can construct portfolios comprised of stocks, ETFs, and highly liquid bonds and commodities. I have not yet figured out a satisfactory way to include options, futures, or assets such as non-negotiable CDs in the optimization engine. Nor have I implemented multi-threading or distributed computing, though the software is designed from the ground up to support these scalability features.
HAL0 is in the late alpha-testing phase. I plan to have a web-based beta-testing model ready by the end of 2012.
Disclaimer: Do not make adjustments to your investment portfolio without first consulting a registered investment adviser (RIA), CFP or other investment professional.
Personally, the easiest part of the financial software business is software development. I have been involved with sales before and feel reasonably confident about that aspect of the business. The primary challenge for me is marketing.
Sales is a face-to-face process. Software development is either a solo process or a collaborative process usually involving a small group of developers. Marketing is very different. It is a one-to-many (or few-to-many) situation. Striking a chord with the “many” is a perpetual challenge because the feedback is indirect and slow. With marketing, I miss the face-to-face feedback and real-time personal interaction.
Knowing that marketing is not my strongest point, I have put extra effort into SEO, SEM, social media, and web marketing. Over the past couple of weeks I have purchased about 20 new domains. Market and entrepreneurial research has shown me that a good idea, a good product, and a good domain name are not sufficient to achieve my business goals. I realize that solid branding and trademarks are also important.
As a holder of 4 U.S. patents, I understand the importance of IP protection. However, I am ideologically opposed to patents on software, algorithms, and “business processes.” Therefore I feel that I must focus on branding, trademark protection, trade-secret protection, and copyright protection.
My redoubled marketing efforts have been exhausting and I hope they will pay off. Next I plan to get back to software creation and refinement.