Explainable AI and the Nature of Understanding

AUGUST 15, 2019

The most uncanny fact about modern AI may not be how smart it is, but how inscrutable. After centuries of scientific progress driven by a hand-in-hand relationship between explanation, prediction, and understanding, we’re now facing a world in which some of the most intelligent systems on earth (in certain narrow domains) are black boxes, illegible even to their programmers. This gap -- between what a system can successfully predict, and what it can explain to human users -- represents one of the major barriers to trust in AI. As Elizabeth Holm writes:

Both an engineer and an AI system may learn to predict whether a bridge will collapse. But only the engineer can explain that decision in terms of physical models that can be communicated to and evaluated by others. Whose bridge would you rather cross?

Driven in part by regulatory pressure, numerous initiatives have sprung up in recent years to make AI more transparent to human users. The emerging field of Explainable AI (XAI) aims to develop techniques that bridge the divide between human and machine reasoning. This is a substantial technical challenge because, as with removing bias, the goal of interpretability comes at the cost of accuracy: an XAI system either has to be very simple, or it has to constrain its reasoning to processes that humans can follow. Both approaches hobble the system’s predictive power.
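
To make the tradeoff concrete, here is a minimal sketch (not from the essay) comparing a “very simple” model with a higher-capacity black box. It assumes scikit-learn and its bundled breast-cancer dataset, both chosen purely for illustration: a depth-2 decision tree can print its entire decision procedure as a handful of rules, while a random forest typically predicts better but offers no comparably compact account of itself.

```python
# Illustrative sketch of the interpretability/accuracy tradeoff.
# Assumes scikit-learn; the dataset and models are arbitrary examples.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=0)

# "Very simple": a depth-2 tree whose full logic a person can read in seconds.
simple = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_train, y_train)

# Black box: a 300-tree ensemble with no comparably legible summary.
black_box = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_train, y_train)

print("shallow tree accuracy :", accuracy_score(y_test, simple.predict(X_test)))
print("random forest accuracy:", accuracy_score(y_test, black_box.predict(X_test)))

# The simple model's entire decision procedure, as plain-text rules:
print(export_text(simple, feature_names=list(data.feature_names)))
```

The point is not the particular numbers but the shape of the bargain: the readable model buys its legibility by throwing away capacity.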

This tradeoff matters more in some domains than in others. As Holm notes, black boxes are perfectly acceptable “when the cost of a wrong answer is low relative to the value of a correct answer,” as with targeted advertising. But when a decision has high stakes and a great deal of social accountability, as with where to launch a drone strike or whom to grant parole, explainability becomes paramount. Seen in this light, explanation is primarily a social tool: it is about justifying our behavior to one another, rather than about improving accuracy. Indeed, many human decisions are just as “black-boxed” as an AI’s, and many of our explanations are post-hoc fabrications.

But explanation also plays a key role in the process of understanding, which sits at the core of our best reasoning. In fact, many have argued that this is precisely why machine intelligence doesn’t yet measure up to our own. As computer scientist Melanie Mitchell writes, “Today’s A.I. systems sorely lack the essence of human intelligence: understanding the situations we experience, being able to grasp their meaning.” The concept of understanding is notoriously nebulous, but its “psychological core,” argues Philip Johnson-Laird, “consists in having a ‘working model’ of the phenomenon in your mind.” Explainable AI tools like Google’s What-If Tool try to foster this kind of understanding by letting users manipulate a model’s inputs and see how its predictions change.
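
The What-If Tool itself is an interactive interface, but the probe behind it is easy to sketch in a few lines: hold one example fixed, sweep a single feature across its observed range, and watch the model’s prediction move. The snippet below is a rough illustration of that idea, not the tool’s actual API; the dataset, model, and the example and feature indices are all arbitrary choices made for the sake of the sketch.

```python
# A rough "what-if" probe, illustrating the idea behind counterfactual
# interfaces like the What-If Tool (this is not that tool's API).
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
X, y = data.data, data.target
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

i_example, i_feature = 0, 7                    # arbitrary choices for illustration
grid = np.linspace(X[:, i_feature].min(), X[:, i_feature].max(), 5)

probe = np.tile(X[i_example], (len(grid), 1))  # identical copies of one example...
probe[:, i_feature] = grid                     # ...differing only in one feature

for value, prob in zip(grid, model.predict_proba(probe)[:, 1]):
    print(f"{data.feature_names[i_feature]} = {value:7.3f}  ->  P(benign) = {prob:.3f}")
```

Sweeping a feature and seeing where the prediction flips is a crude version of the “working model” Johnson-Laird describes: you begin to internalize how the system behaves even if you never see inside it.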

Still, some have worried that the sheer utility of predictive power will eventually make explanation and understanding irrelevant. And on an instrumentalist view of science, prediction is what really matters anyway. The physicist David Deutsch summarizes this view by reference to a hypothetical AI oracle that could predict the outcome of any experiment: “once we had that oracle [they argue] we should have no further use for scientific theories, except as a means of entertaining ourselves.” But Deutsch quickly nips this concern in the bud, pointing out that we’d still need to know what experiments to ask the oracle about, and in what terms. Though the oracle could tell us that a particular design of ours would fail, it could not tell us why, or how to improve it. In effect, Deutsch argues, we already have such an oracle: the physical world, which we query through experiment. Predictive AI may make this process cheaper or quicker, but it will not replace the need for explanations. For now, it seems, our role is secure.

Nathanael Fast