Preventing Crashes

Lessons for the SEC from the Airline Industry

by Chris Clearfield, András Tilcsik, and Benjamin Berman

A small error on August 1, 2012, nearly bankrupted the Knight Capital Group. Code from a discontinued software component was accidentally reused after nine years, and in just 45 minutes Knight’s automated order router flooded the market with millions of unintended orders. Knight lost $460 million when it sold back the inadvertently traded stocks, roughly $10 million per minute.

Such disruptions, in which a single technical error can cripple broad swathes of the market in a matter of minutes, are the natural consequence of changes to the stock trading industry in the past decade. First, a tightly connected national securities market has replaced independently operated regional exchanges, following the implementation of the U.S. Securities and Exchange Commission’s (SEC’s) Regulation National Market System. Additionally, recent technological advances, including the growth of the internet and rapidly increasing computing power, have helped shift the majority of trades to high-speed, automated electronic transactions.

With individual exchanges linked to form a single national market, securities trading has become a complex system: the operation of each exchange involves a multitude of distinct functions that all need to be working properly to process incoming orders, disseminate market data, and execute trades. These functions include linking each exchange to broker-dealer participants and to other exchanges; the continuous matching of buy orders with sell orders; and the real-time reporting of quotes and volume data. Many of these functions are tightly coupled, meaning that the failure of one quickly exerts a significant effect on the rest of the system. Yale sociologist Charles Perrow has shown that in such systems, unexpected interactions between components can turn a trivial incident into a major meltdown before operators are able to understand and address the situation.

Regulators have been struggling to adjust to these fundamental shifts in the securities industry. Their traditional role has been to protect investors from bad actors, those who misrepresent information or mislead customers. However, the biggest risks to the stability of the market now stem from unpredictable and unpreventable accidents. As a result, the standard regulatory tools of examination and enforcement have proven to be woefully inadequate.

Bad actors are generally identified by examining records, conducting background checks, and investigating customer complaints. But infrequent rule-oriented examinations cannot effectively identify the potential pitfalls in a complex system. The intricacy of proprietary trading software means that examiners are unable to understand the detailed workings of each firm’s unique trading system. As a result, most errors identified in such investigations are self-reported and minor (otherwise they would have been uncovered because of their consequences, not during a routine examination).

Similarly, enforcement actions that are used to punish and eliminate bad actors are ineffective against errors caused by complexity. First of all, enforcement actions, like the SEC’s order issued after Knight’s failure, are perceived by the industry as punitive rather than instructive, and as such elicit only the minimum required level of cooperation. Firms are likely to focus on the minutiae of a particular order to avoid disciplinary action rather than step back and think critically about increasing the overall safety and reliability of their systems. Even more troubling is that enforcement actions can increase the likelihood of market disruptions. For instance, a recent rule requires broker-dealers to continuously quote prices for securities in which they make markets; such a requirement makes firms unlikely to stop trading when they suspect a system malfunction, even in the face of catastrophic failure, because they are loath to risk enforcement.

To mitigate the risk inherent in electronic trading, regulators should begin by recognizing that they share a key goal with broker-dealers: avoiding catastrophic failures that result in massive losses. The SEC needs to lead the charge to foster an industry-wide cultural emphasis on safety.  When firms suspect a technical glitch, they should be able to stop trading without fear of disciplinary action from regulators. And instead of waiting for problems to erupt, the SEC should proactively seek out bugs and potential adverse interactions between system components before they can disrupt the national market.  Regulators would benefit from a partnership with broker-dealers and from each firm’s expertise in its own electronic trading system.

The SEC should look to commercial aviation—another complex system—for a successful model of such a partnership between regulators and industry.

First, the aviation industry uses anonymous reporting to collect and share data on regulatory violations and near misses. Anyone, from maintenance technicians to flight crews and air traffic controllers, can self-report errors, and, as an additional incentive, proof of a submitted report will allow violators to avoid sanctions (assuming an absence of intent). By analogy, both securities regulators and trading firms can benefit from reports of errors in the deployment of critical software, even if those errors did not have direct consequences. These reports can shed light on the root causes of incorrect deployments and aid the development of procedures that can mitigate issues and prevent catastrophic failures.

Second, the Federal Aviation Administration (FAA) relies on airline operators to monitor the safety and reliability of their own systems. After ValuJet Flight 592 crashed in 1996, the FAA concluded that the complexity of modern airline operations made it difficult for regulators to reliably identify and correct systemic flaws that may lead to such catastrophes. Today, the FAA uses enforcement to eliminate bad actors and maintain oversight, but routine monitoring is carried out through operator-implemented Safety Management Systems. Airlines are expected to collect and analyze data from daily operations and to address any potential risks uncovered by the analysis. In the context of finance, an analogous system would instruct broker-dealers to collect and analyze data from routine operations and from reported errors in order to identify potential risks. Any changes in operations to mitigate these risks would be followed up with internal audits to verify their effectiveness. The results of each firm’s risk management process would provide regulators with insights into important operational concerns.

Finally, commercial aviation benefits from the independent investigations of the National Transportation Safety Board (“NTSB”), which has the primary authority to identify the causes of aircraft accidents in the United States. Following its investigations, the NTSB makes non-binding recommendations to regulators and operators in order to prevent the recurrence of similar accidents.  Since the NTSB’s chief mission is safety, rather than enforcement, it is also able to consider the contribution of regulatory procedures to the accidents.  A similar approach would greatly enhance the stability and robustness of electronic trading.  An investigation of market failures with no disciplinary consequences would be instructive for market participants and would lead firms to focus on overall safety instead of on implementing the ad hoc solutions contained in SEC Orders. As an added benefit, independent investigations could take into account the role of regulators in shaping the environment that made the failure possible.

Embracing these three lessons from commercial aviation may seem like a radical gamble for the SEC. Moving from an emphasis on rule-based enforcement to a focus on industry-wide safety would require a profound cultural shift as well as new capabilities, particularly technical knowledge of operational issues. These barriers to change are difficult to overcome. But maintaining the status quo is a sure bet for increasingly frequent catastrophic failures that will ultimately shake the world’s confidence in the vitally important U.S. securities market.

This piece was originally published on the blog of the Kennedy School Review, the Harvard Kennedy School’s public policy journal. We thank them for their editing feedback and their permission to repost it here.

This piece is based on a System Logic memorandum to the House of Representatives Committee on Financial Services Subcommittee on Capital Markets and Government-Sponsored Enterprises. The full memorandum can be found here and discusses each of these issues in more detail.