Move Slowly, and Don't Break Things

As we make the move from digital to cognitive, the tech industry should be thinking a lot harder about putting safety first.


“If you are not embarrassed by the first version
of your product, you’ve launched too late.”
~ Reid Hoffman
 

How’s that for a quote that hasn’t aged very gracefully?

For many years, 'move fast and break things' was gospel in Silicon Valley. Facebook's classic mantra represented a philosophy of trying out new ideas quickly so you could see if they survived in the marketplace. If they did, you refined them; if they didn't, you could throw them away without blowing more time and money on development.

That approach was ideal for a world where digital technologies were exploding into popular use. Software engineers could deploy fast, safe in the knowledge that it was just code and that any mistakes or bugs could be fixed on the fly. A generation of startups swallowed the principle whole, and an entire industry of consultants and soundbites grew up around it: the Lean Startup, fail fast, agile, Scrum, done is better than perfect.

Today, a new class of technologies is emerging that calls that mindset into question. We live in a world where 'digital' is no longer a glorified marketing department inside a company, or an economic sector of its own, but a layer over everything. More than half of humanity is now online, using the internet every day. And as digital evolves into 'cognitive,' design decisions that may have seemed inconsequential can turn out to have significant and irreversible consequences further down the line. 'Don't be evil' and 'move fast and break things' served their purpose for a while, but they're now too vague and too dangerous for services that are deeply interwoven into the daily lives of billions of people.

Maybe their replacement should be something like, "Is this algorithm toxic?" It doesn't have quite the same ring, but it's the right question to ask.


In March 2018, an Uber car being driven by code on the streets of Tempe, Arizona, hit and killed a woman named Elaine Herzberg, who was crossing the darkened road with her bicycle. It was the first pedestrian fatality caused by a self-driving car. As Slate reports:

Uber’s sensors first perceived Herzberg about six seconds before impact — more than twice the commonly accepted reaction time of 2.5 seconds. But the sensors struggled to classify Herzberg (first as an unknown object, then as a car, then as a bicycle) and determine her expected path across the road. At 1.3 seconds before impact, the system determined emergency braking was required, a function that was disabled under computer control “to reduce the potential for erratic vehicle behaviour.”

The way the algorithm was designed to operate, and to be used, failed to account for factors like human error, and that failure cost Elaine Herzberg her life.

View of the self-driving system data playback at about 1.3 seconds before impact, when the system determined an emergency braking maneuver would be needed to mitigate a collision. The emergency brakes were turned off because there was a human driver in the seat. Yellow bands are shown in meters ahead. Orange lines show the center of mapped travel lanes. The purple shaded area shows the path the vehicle traveled, with the green line showing the center of that path. (Source: NTSB)

 

Traditionally, the tech industry didn’t have to worry about stuff like this. Code was always relatively harmless. Compared to transportation or medicine, daily interaction with pain, physical harm or death was a lot less likely if you were writing software for apps. The industry was dominated by engineering culture. In order to fulfil the ethical warrant of the profession, all you had to do was make the product work. It was up to other people to figure out the applications or the social mission for your product.

That’s no longer the case.

Software can now propel giant hunks of steel into human flesh, or be turned into a propaganda channel by foreign agents.

The big tech companies have been struggling to handle the unintended consequences of their build-it-first mindset. They didn't expect algorithms designed to maximise advertising spend to end up prioritising lewd content for children watching videos, amplifying mental health problems among teenagers, or being manipulated into weaponised narratives that influence elections. Facial recognition software was supposed to help customers unlock their phones; now authoritarian regimes use it to identify ethnic minorities and to target political dissidents through social credit systems. Developers still seem a little baffled when told that the systems they've built (systems that are, by their own metrics, working very well) are corrupting the public sphere.

In part, this is due to the newness of the industry. Doctors, for example, are technical people too, but they've been trained in a code of ethics refined over a very long time: first, do no harm. The tech sector has had a different ethos: build first, ask forgiveness later. Now it's being called to account, having become too powerful and too dangerous, too quickly.

The geeks need to do more than acknowledge criticism; they need to start taking responsibility for their actions. They need to move beyond their obsession with customer centricity and start thinking about how to serve society in general. As Laura Norén, a lecturer on data science ethics at NYU, says, "We need to teach them that there's a dark side to the idea that you should move fast and break things. You can patch the software, but you can't patch a person's reputation, or their body."

Shipping scrappy software should no longer be an option. In the rush to get product out the door as fast as possible, time-consuming things like user testing, automation, analytics, monitoring and manual testing get skipped. That's fine if you're building a food delivery app, but if you're building a neural network that lives on millions of devices, or that runs a line turning out 5,000 widgets a minute, you can't afford bad design decisions or poor oversight. Yes, exponential technologies can radically alter business models, but they also amplify your mistakes.

Blockchain, another nascent foundational technology, is even less suited to Silicon Valley's traditional mindset. With machine learning, the code can at least be retracted and edited. Blockchain developers have no such luxury: once a mistake has been deployed, it's part of the permanent record. In a centralised system you can fix your bug; in a decentralised one that's immutable by design, you can't. There is no 'move fast and break things' on a blockchain. If you break things, you lose consistency, and the chain becomes corrupted and worthless.
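
To make the point about lost consistency concrete, here is a minimal Python sketch of the underlying idea (illustrative only, not any real blockchain's code): each block commits to the hash of the previous block, so quietly "fixing" an old record invalidates every link that comes after it.

```python
import hashlib
import json

def block_hash(block):
    # Hash the block's contents, including the hash of its parent.
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

# Build a tiny three-block chain.
chain = []
prev = "0" * 64  # the genesis block has no parent
for payload in ["alice pays bob 5", "bob pays carol 2", "carol pays dan 1"]:
    block = {"data": payload, "prev_hash": prev}
    chain.append(block)
    prev = block_hash(block)

def verify(chain):
    # Recompute every link; editing an earlier block breaks all later links.
    prev = "0" * 64
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        prev = block_hash(block)
    return True

print(verify(chain))                     # True: the record is consistent
chain[0]["data"] = "alice pays bob 500"  # rewrite history in place
print(verify(chain))                     # False: the chain is now corrupted
```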

The crypto geeks learned that lesson the hard way with the DAO hack on the Ethereum blockchain in 2016. Thrilled at the possibility of using complex smart contracts to run a decentralised corporation entirely on code, they rushed it into production without proper quality assurance. The code, however, turned out to have multiple bugs, including one that a hacker was able to exploit a few weeks later, draining around $70 million worth of funds in a few hours.

The Ethereum community was forced to implement a hard fork, creating an entirely new version of the ledger with the sole function of returning the Ether taken from the DAO to a refund smart contract. The fork solved the problem, but it was a rude awakening for a community forged on romantic ideas about the hacker mentality. Part of the reason blockchain development seems to move more slowly these days is that core programmers realise they can't afford to move that fast and repeat those kinds of mistakes.


So how should software development be approached?

Ironically, the answer might be found in industries that have always been criticised for being slow and old fashioned. Engineers can’t afford to build bridges with design flaws. Doctors can’t afford to prescribe the wrong medicines and then fix their mistake a few days later. Airlines can’t launch a product with 90% assurance. Software developers would do well to take a leaf from their book.

Aviation is a particularly good example. In 2017, commercial aviation flew over 4 billion passengers on 38 million flights without a single fatality on a scheduled passenger jet. It's a remarkable example of what can be accomplished when political will, resources and expertise are focused on reducing accidents and injury. It happened because the industry realised that individual efforts were never going to be good enough. Because real lives were at stake, ever-rising expectations forced airlines to constantly improve. 'Don't be evil' didn't cut it; only zero fatalities would do. The standard wasn't 'do better,' it was 'be perfect.'

Manufacturers had to build better, safer planes with improved design and performance. Pilots improved their skills. Regulators provided improved oversight, and accident investigators generated better analysis of the decreasing number of accidents. Flight attendants improved evacuations, and dispatchers built new tools to make better decisions. Maintenance technicians improved procedures to enhance reliability and safety.

At the core of this success has been a simple tool: the checklist. Technicians and pilots are forced to step through an exhaustive, boring and very predictable set of instructions before every flight. Checklists aren't sexy, and you certainly won't see them in the decks of thought leaders at expensive innovation conferences. But they are effective. As Zach Holman points out in his excellent 2014 talk, checklists remove ambiguity: all the debate happens before something gets added to the list, not at the end. That means when you're about to launch a product (or take off from a runway), you can worry less about implementation and more about process.

Good dev shops already do this as a matter of course. At Apple, there's an internal checklist that spells out the process of releasing a product from beginning to end, from who's responsible to who needs to be looped in before it goes live. Even before a team starts working on something, they make a checklist to prepare for it: do we have access to the development and staging servers? Do we have the right people on the team? When an item's done, you check it off. Easy to collaborate on, and easy to understand.
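
One way to give a checklist like that teeth is to encode it as a gate in the release pipeline, so nothing ships until every item passes. The sketch below is a hypothetical illustration, not Apple's actual process; the items and check functions are invented for the example.

```python
# A release checklist encoded as executable checks (illustrative items only).

def staging_server_reachable():
    # In a real pipeline this would actually probe the staging environment.
    return True

def owners_assigned():
    # e.g. confirm the release has a named owner and an on-call contact.
    return True

def rollback_plan_documented():
    return False  # deliberately failing item, to show the gate working

CHECKLIST = [
    ("Do we have access to development and staging servers?", staging_server_reachable),
    ("Are the right people looped in before go-live?", owners_assigned),
    ("Is there a documented rollback plan?", rollback_plan_documented),
]

def run_checklist(items):
    failures = [question for question, check in items if not check()]
    for question in failures:
        print(f"BLOCKED: {question}")
    return not failures

if __name__ == "__main__":
    if run_checklist(CHECKLIST):
        print("All items checked off. Clear to ship.")
    else:
        print("Release halted until every item passes.")
```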

At GitHub, they approach it slightly differently. Moving fast and breaking things is fine for some features; for others it isn't. The first step is identifying what they cannot afford to break: things like billing code, upgrades and data migrations. Once those areas are identified, the challenge becomes how to leave them untouched, or at least get 100% assurance on any changes, while still making fast, small edits everywhere else. It's like changing an engine while the car is running (and just as tricky). At deployment, they run both versions of the software simultaneously: the old and the new code side by side, switching over to the new code only once it has performed at least as well as the old version over a significant period of time.
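
That parallel-run pattern can be sketched in a few lines of Python. The billing example and function names below are hypothetical, not GitHub's code (GitHub has open-sourced a Ruby library, Scientist, built around the same idea): always return the old, trusted result, run the candidate alongside it, and log any disagreement before ever switching over.

```python
import logging

logging.basicConfig(level=logging.INFO)

# Hypothetical legacy and candidate implementations of billing logic,
# one of those areas a team "cannot afford to break".
def legacy_billing_total(order):
    return sum(item["price"] * item["qty"] for item in order)

def new_billing_total(order):
    return sum(item["price"] * item["qty"] for item in order)

def billing_total(order):
    """Serve the legacy result, but exercise the candidate and compare."""
    old_result = legacy_billing_total(order)
    try:
        new_result = new_billing_total(order)
        if new_result != old_result:
            logging.warning("billing mismatch: old=%s new=%s", old_result, new_result)
    except Exception:
        logging.exception("candidate billing code raised; callers are unaffected")
    return old_result  # users only ever see the old, trusted behaviour

# The switch to new_billing_total happens only after the logs show it matching
# the legacy path over a long enough period of real traffic.
print(billing_total([{"price": 10, "qty": 3}, {"price": 4, "qty": 2}]))
```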

Ultimately though, these solutions don't fix the underlying problem. Facebook, for example, changed its motto in 2014 to "Move Fast With Stable Infra" (catchy, right?), implementing more automated tests, better monitoring and extra infrastructure to catch bugs as early as possible. None of that helped when fake news, election interference and data privacy scandals blew up in its face. The problem wasn't technical or procedural; it was cultural. Engineers at the company simply couldn't conceive of use cases in which their product would be abused by people who didn't share their worldview.


That's why the next generation of technology engineers needs better training. Medical students spend much of their undergraduate years being taught to think critically about the ethical implications of their decisions. The same should be true of software engineers. And the good news is that this does seem to be slowly happening. Microsoft has proposed a Hippocratic Oath for artificial intelligence, and Stanford University, the academic heart of Silicon Valley, is developing a computer science ethics course to train the next generation of technologists and policymakers to consider the ramifications of innovations like autonomous weapons or self-driving cars before they go on sale. As Mehran Sahami, one of the course's conveners, says, "Technology is not neutral. The choices that get made have social ramifications."

Ultimately, the safest bulwark against baking bad ideas and flaws into code is diversity. Groupthink is a deadly enemy in a world of hyper-connectivity, exponential technologies and unintended consequences; diversity mitigates it. The ability to draw on contrasting worldviews, and to run ideas through teams that differ across gender, age, political affiliation, race, neurology, class, profession and cultural background, ends up building far more robust products that do less harm once unleashed upon the world.

We know these are aims shared by the vast majority of people working in tech. It’s an industry that’s united by a belief that digital technologies are a remarkable tool for improving human lives in a truly transformational manner. The question is: can tech people be better stewards of what they’ve built? Can they learn to move a little slower, and stop breaking as many things, in the interests of building a society that works for everyone?

We’re all going to find out, one way or another.