

Tuesday, 7 August 2012

Knight, Knight, Automated Trading Dreams

“The four most beautiful words in our common language: I told you so.” (Gore Vidal)
Glitch Driven Trading

The software glitch that pushed Knight Capital to the brink of extinction has simply reaffirmed two truths we’ve known for a long time (see: Rise of the Machines).  Firstly, the rise of machine-driven trading is exposing markets to risks that aren’t manageable.  Secondly, the people regulating the markets don't understand, or don't want to acknowledge, what those risks actually are.

Behind this lies a fundamental issue: you can regulate to punish people retrospectively for their failures, or you can limit innovation to reduce the probability of the issues happening in the first place.  It’s time to focus on the latter rather than the former because, if "this could happen to anyone," it could happen to you.


Each time some software problem causes markets to shudder we see a knee-jerk reaction by commentators and regulators.  Each time we see shock and horror expressed by people who don't seem to have any idea what software development really involves.  It’s like an attempt by a fine arts student to explain the Higgs boson; or a journalist, which usually amounts to the same thing.

The first thing that these technical neophytes need to get fixed in their heads is that software is never perfect, never without risk and will always, eventually, go wrong.  This isn’t negotiable: no software can be made 100% reliable, and programs used in securities markets will tend to be less reliable than software in other domains because of a combination of competitive pressures and the adaptive nature of the markets themselves.  This IEEE paper gives a summary of the things that can go wrong:
“If there's a theme that runs through the tortured history of bad software, it's a failure to confront reality.”

Take, for example, some of the most reliable software around, that used in fly-by-wire aircraft systems.  Typically these systems have in-built triple redundancy – basically you have three computers, running software from different sources, each of which is cross-checking the other computers in order to make a decision.  And even this sometimes goes wrong.
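The voting scheme behind that triple redundancy can be sketched in a few lines. This is a minimal illustration of the 2-out-of-3 idea only, not any real avionics code; all the names and values are invented:

```python
# Minimal sketch of 2-out-of-3 majority voting, the principle behind
# triple-redundant fly-by-wire channels: three independently implemented
# computers each produce an output, and the majority wins.
from collections import Counter

def majority_vote(outputs):
    """Return the value at least two of the three channels agree on, or None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= 2 else None

# Three hypothetical control channels; the third has a fault.
print(majority_vote([12.5, 12.5, 97.0]))  # 12.5: the faulty channel is outvoted

# If all three channels disagree there is no majority. That is itself a
# failure mode the designers must handle, e.g. by handing control to the pilot.
print(majority_vote([1.0, 2.0, 3.0]))     # None
```

Note that the voter only masks a single faulty channel; when the channels diverge entirely, the system still needs a fallback, which is exactly where the human pilot comes in.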

Usually when systems of this type fail it’s because they’ve been exposed to external conditions that the designers didn’t take into account.  Given that aircraft have to fly about in all sorts of weather conditions and that the weather isn’t entirely predictable (see: Whither Forecasting? The Butterfly Stirs) it’s not surprising that occasionally a glitch occurs.  This is why we have pilots as well as autopilots.  And sometimes this too goes wrong because of the way that the pilots interact with the software.

Computer on Computer

However, securities markets are if anything more unpredictable than global weather systems.  Firstly you have the problem that any system is going to be relentlessly probed by competitor systems looking for an edge.  It’s hard to get an accurate estimate, but there’s fair evidence that a high proportion of trades these days may be between computers, possibly as high as (or even higher than) 50%.  Leaving aside what the purpose of such trades is, it’s easy to imagine that two independently designed systems will expose different weaknesses in each other’s programming.  Now imagine that multiplied many times as dozens, if not hundreds, of these systems compete, and wonder not that we have glitches but that we have so few.

Secondly, the behavior of automated systems changes the market conditions – and may do so in such a way that the systems themselves are opened up to behaviors that they don’t expect.  Changing market conditions may change the behavior of human participants as well, so the whole system is continually in flux.

Easy Hard

Yet trading companies don’t have the luxury of aircraft designers, who usually design and test their systems over periods of years before declaring them stable.  Once tested and deployed, aircraft software doesn’t change very much, yet still has the odd bug in it.  Trading software, on the other hand, is typically produced rapidly and usually under time pressure.  What chance perfection?

Part of the problem with software systems is that the people managing the developers often don’t understand the nature of coding.  Because it’s easy to change a line of code – after all, it’s just typing – they assume it’s equally easy to change a line of code and get it right.  Even the best-designed software will contain unexpected coupling, where changing a bit in one place can cause problems somewhere else: this is why testing is such a crucial part of the software development process.
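That kind of hidden coupling is easy to demonstrate with a toy example (everything here is hypothetical, invented purely for illustration): two unrelated-looking routines share a helper, so a one-line "improvement" to the helper changes both, and only a regression test stands between the change and production.

```python
# Toy illustration of hidden coupling: a fee calculation and a display
# routine both depend on the same rounding helper, so editing that one
# line of code silently changes two behaviours. All names are invented.

def round_price(p):
    # Shared helper: round to the nearest cent. A one-line change here
    # (say, rounding to whole dollars) ripples into everything below.
    return round(p, 2)

def fee_for(notional):
    # One basis point of the notional, rounded like a price.
    return round_price(notional * 0.0001)

def display_price(p):
    return round_price(p)

# A regression test pins down behaviour the rest of the system relies on.
def test_fee_unchanged():
    assert fee_for(10_000) == 1.00

test_fee_unchanged()  # passes today; would fail the moment round_price
                      # is "improved" without anyone checking the fees
```

The point of the sketch is that the dangerous edit looks trivial in isolation; only a test suite that exercises the coupled paths makes the blast radius visible before deployment.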

Unfortunately, testing trading software in isolation from the market can only take you so far.  Sometimes the only true test is deployment into the live market and, as Knight Capital has discovered, that can uncover problems that no amount of prior testing reveals.


If this was an isolated and an isolatable problem then we could leave this to the retrospective actions of regulators and, no doubt, lawyers.  Unfortunately neither of these conditions applies.  It’s not an isolated problem – we’ve seen the 2010 flash crash (see: Fall of the Machines), and the Facebook IPO issues (see: Unfriend Those IPOs), including UBS’s spectacular losses therein, and there are no doubt lots of less obvious issues.  And it’s also not an isolatable problem, because these issues introduce systemic risk to the complex and globally integrated Heath Robinson contraption we call financial markets.

If coding cock-ups affected only the organizations directly implicated then so be it.  Knight Capital has lost $440 million when, as an intermediary, it wasn’t even supposed to have significant exposure to markets.  The firm's shareholders are paying the price for their company's failings.  This is how markets are supposed to work: if companies can’t manage their own risks then they will ultimately suffer the consequences.  The problem occurs when the failure has systemic implications beyond the companies directly involved.

Avoiding Swans

Investors, of course, have several lessons to learn from this.  The first one might be to avoid all companies exposed to software risk, because the exposure to negative Black Swans is all too obvious.  Better to buy something that doesn't require you to trust the ability of caffeine-addicted coders to get complex stuff right under tight deadlines.  The second is that institutional investors should be demanding detailed explanations of how such companies are managing their software risks; and taking the appropriate actions if they don’t like what they see.

Regulators need to take a different approach, too.  The systemic risk of organizations failing to manage their software demands a proactive rather than a retrospective process for overseeing this.  Insisting on the equivalent of a pilot, and ensuring that the pilot can actually take control of the plane when required, would be a small first step.  Introducing a software audit process, which could be handled by existing standards bodies, to ensure that firms are meeting the development and testing processes required to allow what is mission critical software to be deployed into live markets is another.
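The "pilot" being proposed here need not be sophisticated. A sketch of the idea, a kill switch driven by coarse risk metrics that halts order flow and hands control back to a human, might look like this (the class, thresholds and metric names are all invented for illustration):

```python
# Minimal sketch of an automated "pilot": a kill switch that halts trading
# when simple metrics go out of bounds. Once tripped, it stays tripped
# until a human investigates and resets it. Thresholds are illustrative.

class KillSwitch:
    def __init__(self, max_orders_per_sec=500, max_loss=1_000_000):
        self.max_orders_per_sec = max_orders_per_sec
        self.max_loss = max_loss
        self.halted = False

    def check(self, orders_per_sec, realised_pnl):
        """Return True if trading may continue; trip the switch otherwise."""
        if orders_per_sec > self.max_orders_per_sec or realised_pnl < -self.max_loss:
            self.halted = True
        return not self.halted

switch = KillSwitch()
print(switch.check(orders_per_sec=120, realised_pnl=-50_000))    # True: trade on
print(switch.check(orders_per_sec=9_000, realised_pnl=-50_000))  # False: halted
print(switch.check(orders_per_sec=120, realised_pnl=0))          # still False
```

The design choice that matters is the latch: the switch does not quietly resume when the metrics look healthy again, because a system that can trip a limit of this kind has, by definition, done something its designers didn't anticipate.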

A Greater Purpose

Of course, this still leaves open one question: should automated trading systems be allowed at all?  If these trades exist purely to generate profits for securities firms, but do so by exposing us all to greater risks, wouldn’t we all benefit from outlawing them?  After all, none of this seems to contribute to the total sum of human happiness, or at least to the main purpose of securities markets, which is to provide mechanisms for firms to raise capital and for investors to take partial and equitable ownership of companies.  Of course, the argument is that this type of speculation has happened since the dawn of trading; but never before has it threatened the integrity of global markets.  New times require new thinking.

Still, as we saw in When Muddled Modellers Muddle Models, there are all sorts of model risks.  Recognizing that software models are always going to fail and operating on the assumption that there’s a problem just around the corner is the only safe way to proceed.  Hopefully managements and regulators will catch up soon, but there’s no need for investors to wait for them. 

Comments

  1. A big part of automated trading is just automated market making, which does benefit regular market users: look at what the spreads and commissions were like in 1960. It's hard to remove the zero-sum overlay without losing the market's core function as well. You could slow it down a bit I guess.

    The software is hard thing may be a bit overdone in this case. Unlike a flight control system, in trading you can just pull the plug when some really simple metrics go funny, which is not that hard to do in a fail safe way. Some people just can't be bothered, and that's fine, as long as the system itself has its own circuit breakers. It seems to have worked well here: an incompetent operator was taken out of the game, at the expense of risk-taking shareholders who should have known they were exposed to incompetence, without much adverse effect for outsiders. It probably even had a positive systemic effect by reminding other dilettante operators to check their systems. Perhaps regulators should not mandate audits -- that careless people work around anyway -- but organise a flash crash every six months to check everyone is still awake.

  2. Good theory, truly. I really like your ideas here. However, in this case, it appears to have been random chance that the rogue order injector ("tester" in Nanex parlance) was stopped before depleting some multiple of Knight's capital. The excess loss (above its own total capital) would have been absorbed by some other entity in the system, not just the offender's employees and shareholders. The tester theory of what happened is quite interesting, especially the difficulty of detecting what was going on by those who were unaware that the program had been launched.

    Consider the irony: the RLP aspects of the Knight trading system were in all likelihood extremely well tested and robust, and suitable for release into the marketplace. Tested by the tester, which created arbitrary patterns of possible order flow against the RLP software in the lab. But inadvertently launching the tester to spew liquidity-taking orders into the real market (not the lab) caused the hidden mayhem. They were literally losing the spread (and influencing it to grow) 100 times a second, in about 150 stocks. That's all it takes to lose 440M in 30 minutes. The tester itself probably worked flawlessly; it's a simple program. One error, if this is all true, is that they should never have allowed the tester to spew legally formed order messages. But then again, that would not have tested the RLP program against the actual bit streams it was going to see in production.

    I have a friend, a former Naval pilot who flew radar prop planes off aircraft carriers in the Mediterranean during the Vietnam war (relatively good gig at the time). The Navy has a manual, NATOPS, the operational manual for all Naval pilots. He used to tell me, it was written in blood. It was basically a compendium of all the fatal mistakes that had been made flying off aircraft carriers and the operational procedures that prevented the mistakes. So it will be with the electronic trading environment, one fatal error at a time. But the Navy kept flying planes, and similarly, we will continue to trade electronically. There's nothing that unusual about the evolution of the electronic trading infrastructure.

  3. Sounds highly likely, I've seen multiple examples of test software and scripts being accidentally deployed into live environments. It's usually a mess. Still, as you say, it really still comes back to process: and, as you say, most process is developed one error at a time.

    However, I think there is a difference with electronic trading: if a Navy fighter crashes then that's usually an isolated incident and can be learned from. If an automated trading system goes haywire then you might have the equivalent of planes falling out of the sky all around us - it's more like the automated aircraft identification system going wrong than an individual and localised error. It's (sort of) OK to learn the lessons one aircraft at a time; I'm not sure we can afford to learn them one air force at a time :-/

  4. As a more conventional fund manager we welcome dumb, fast or automated money in the market. It offers an opportunity for our investors to pick up stock at prices that would otherwise be unavailable.

    The last thing any fund manager wants is a market that is 100% rational and perfectly priced.

  5. "none of this seems to contribute to the total sum of human happiness"

    Not sure about this. How much does the ability of brokers to offer a cheap, effective service depend on their ability to very rapidly hedge against individual investors?

    If I, as a broker, end up exposed, rather than stopping my customers from doing what they want it's much better for all parties if I can hedge. Obviously, if I've got a thousand customers, I'm placing and cancelling orders very rapidly, all the time, as my book swings one way and the other.
