Teaching machines right from wrong

The trouble with computers: you get exactly what you ask for. Ask a robot waiter to serve table eight’s food immediately and it might just throw it. An AI that’s trying to earn you money might decide the best way is to sell your house. It’s not enough to tell machines what to do; you also have to tell them what not to do.

Learning what we should and shouldn’t do takes time and practice. A great deal of early years education is really moral education, a slow process of teaching children right from wrong: share your toys, don’t lie, no hitting. Can we do the same for our artificial offspring?

Science fiction to reality

Any science fiction fan will recognise the question of machine morality as a cornerstone of many sci-fi films and books. Think of HAL refusing to open the pod bay doors, or the rogue Skynet launching the world’s nukes. Thankfully, the real-world applications of machine ethics to date are less dystopian. We’ve not really needed to teach computers right from wrong in practice, since we’ve used them mostly for specific and harmless tasks.

But the issue of moral machinery is more pressing now that AIs are tackling more complex and ill-defined problems. Even if we’re not yet dealing with homicidal spacecraft, we still want to stop a chatbot from telling a curious user how to make napalm, and prevent generative AIs from, say, creating obscene images of children.

AIs will increasingly need to weigh up outcomes and choose the right ethical course where there are no obvious answers. In certain sectors the stakes are higher still. Medical, military, and security robots may need to make decisions that could profoundly harm others.

The problem with rules

One early thinker on robot behaviour was of course Isaac Asimov, who proposed the Three Laws of Robotics:

A robot may not harm a human being or, through inaction, allow a human being to come to harm.
A robot must obey orders given it by human beings except where such orders would conflict with the First Law.
A robot must protect its own existence, as long as such protection does not conflict with the First or Second Law.

The laws are appealing, but in reality things aren’t so simple. Indeed, many of Asimov’s stories describe how these straightforward laws fall apart in practice. If we insist a robot must never harm a human, we need to start by explaining exactly what we mean by ‘human’ and ‘harm’. There will be some awkward loopholes. Surgery, for example, can save a life, but the minute you make the incision you harm the patient. And what if harm is inevitable? Should a robot break someone’s arm to rescue them from a fiery crash or let them die through inaction? Asimov’s laws prohibit both.
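To make the crash dilemma concrete, here is a minimal sketch of what a literal-minded First Law check might look like. Everything in it (the `Option` class, the two flags, the `permitted` function) is a hypothetical illustration, not how any real system or Asimov himself encodes the laws. It simply shows that a rule forbidding harm by action and by inaction can leave a robot with nothing it is allowed to do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """A hypothetical course of action open to the robot."""
    name: str
    harms_by_action: bool    # the robot itself injures the person
    harms_by_inaction: bool  # the robot lets harm happen by not acting


def violates_first_law(option: Option) -> bool:
    # Naive reading of the First Law: any harm, by act or omission, is forbidden.
    return option.harms_by_action or option.harms_by_inaction


def permitted(options: list[Option]) -> list[Option]:
    # Keep only the options that the naive rule allows.
    return [o for o in options if not violates_first_law(o)]


# The fiery-crash example: both available options involve some harm.
crash = [
    Option("break arm to pull driver free", harms_by_action=True, harms_by_inaction=False),
    Option("stand by", harms_by_action=False, harms_by_inaction=True),
]

print([o.name for o in permitted(crash)])  # prints []
```

The empty list is the point: the rule as written gives no guidance at all in the one situation where guidance matters most.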

Perhaps we’d conclude that robots should harm humans in some cases. If so, we have to rewrite the rules. Herein lies the problem: rules can never fully cover the complexity of moral decisions. We see this with religious edicts too. Even if you choose to follow, say, the Ten Commandments, they don’t offer enough detail to guide every human decision. Instead we create thousands of laws to cover the gaps, and even then need lawyers, juries, and judges to interpret these texts.

*Photo by Beatriz Pérez Moya on Unsplash*

Alternative approaches

Instead of strict rules, perhaps we could give AIs looser principles that act as guiding stars. For example, we could ask robots to do whatever would bring about the most happiness in the world, or to take decisions that always improve the lot of the worst-off.
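As a rough sketch of how these two principles might be written down, the snippet below scores two made-up actions by the wellbeing of three people. The numbers, the action names, and the function names are all invented for illustration, and reading ‘improve the lot of the worst-off’ as picking the action with the best worst-case outcome (sometimes called maximin) is an assumption, not the only possible interpretation.

```python
# Hypothetical wellbeing scores for three people under each action.
outcomes = {
    "build stadium": [9, 8, 1],
    "build clinic": [5, 5, 4],
}


def utilitarian_choice(outcomes):
    """Pick the action with the greatest total happiness."""
    return max(outcomes, key=lambda action: sum(outcomes[action]))


def maximin_choice(outcomes):
    """Pick the action that leaves the worst-off person best off."""
    return max(outcomes, key=lambda action: min(outcomes[action]))


print(utilitarian_choice(outcomes))  # build stadium (total 18 vs 14)
print(maximin_choice(outcomes))      # build clinic (worst-off 4 vs 1)
```

Given the same facts, the two guiding stars point in different directions, so the choice of principle matters as much as the data.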

But which principles should we choose? Ethicists can’t agree upon an answer, so picking just one perspective over another seems arbitrary.

‘Perhaps there is no single answer to the question of whose morality or what morality should be implemented in AI. Just as people have different moral standards, there is no reason why all computational systems must conform to the same code of behaviour.’ – Wendell Wallach & Colin Allen.

Another option is to abandon the idea of direct instruction and instead let robots learn by example. After all, this is how AI systems learn today: we feed them massive amounts of data and train them to find patterns. But can anyone really say they’ve always been a good person? Humans aren’t perfect ethical examples. We cheat, lie, and steal. Do we really want a powerful AI to adopt our worst habits?

Ideal v real

In an ideal world, we’d probably take a cue from how we teach morality to children. We’d put AIs on a short metaphorical lead, keeping close watch over their behaviours. We’d try to instil some fundamental principles, then allow them to practise mundane moral decisions. Can I wear pyjamas to a funeral? Should I mow the lawn at 3am? (Even before ChatGPT, the Allen Institute’s Delphi AI was already doing a decent job of answering these simple questions.) Once our juvenile AI shows it knows the basics, we can slowly give it more licence to take more important decisions, stepping in to correct it where necessary. Perhaps at some point, we might be confident it can make sophisticated ethical choices alone.

However, the course of machine ethics won’t run this smoothly. Commercial pressures will push AI into non-mundane territory way before we can pursue this idealised training programme. In practice, we may have to cobble together ethical education however we can.

Some top-down instruction may still be useful, perhaps in the form of guiding principles or heuristics rather than hard-and-fast rules. And despite our moral failings, we might still offer up our own behaviours as ethical training data. But more than anything, robot morality will be a matter of trial and error. Even if we do eventually achieve some success, someone will always want to take off the handcuffs. Author Ted Chiang is pessimistic about the prospect for ethical AI: ‘It will always be possible to build AI that pursues shareholder value above all else, and most companies will prefer to use that AI instead of one constrained by your principles’. Regulation will dampen this temptation – the EU, for example, has just agreed its new AI Act – but regulation typically takes longer to settle than we’d like, and is always itself incomplete.

So the path to ethical AI will be steep and rocky, marked by small wins and occasionally terrible mistakes. Teaching machines to behave morally will also force us to ask difficult questions about our own ethical choices. Uncomfortable decisions lie ahead. But with AI’s heady velocity now seemingly irreversible, it’s work we can’t overlook. An advanced AI without moral guardrails or experience wouldn’t just be ineffective: it could be profoundly dangerous.
