Are statistical models wrong?

All statistical models are inadequate representations, but different models are useful for different reasons.
David Spiegelhalter

Chair of the Winton Centre

01 Jun 2025
Key Points
  • All statistical models are inadequate representations, but different models are useful for different reasons.
  • Useful models often make predictions by simulating possible future scenarios and by indicating the uncertainty of what may happen.
  • Artificial intelligence needs to be treated like other statistical models and technology; we need to know its assumptions and evaluate its performance.

How statistical models work

A mathematical or statistical model is quite a tricky concept. It is, essentially, a mathematical representation of one aspect of the world, usually put into some sort of computer software: you put something in, and something comes out. Models can vary enormously in sophistication and complexity.

For example, if I’m getting an insurance quote for my car or my house or my life, I will put information in about myself on the website. There will be some model in the background doing calculations and coming up with a number. You can think of that as a simple algorithm, but actually it’s based on a model for how the risk factors that I’m putting in influence the chances of my house burning down, me having a crash or me living till 90.

Somebody has collected a lot of data. They’ve done statistical analyses. They’ve built some sort of formula, some sort of model which represents the relationships between these risk factors and the event of interest. That’s put into the software, which we then use routinely on a day-to-day basis. So that’s one of the simplest examples of a statistical model.
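To make the idea concrete, here is a minimal sketch of the kind of calculation that might sit behind such a quote. The risk factors, coefficients and costs are entirely invented for illustration; a real insurer’s model would be fitted to large amounts of claims data and would look rather different.

```python
import math

# Toy insurance model: risk factors go in, an estimated claim probability and
# a premium come out. All names and numbers below are invented for illustration.

def claim_probability(age: float, annual_mileage: float, years_no_claims: int) -> float:
    """Logistic-regression-style model of the chance of a claim this year."""
    score = (
        -2.0                        # baseline level of risk
        + 0.03 * (30 - age)         # younger drivers score as riskier
        + 0.00002 * annual_mileage  # more miles driven, more exposure
        - 0.10 * years_no_claims    # a claim-free history lowers the score
    )
    return 1 / (1 + math.exp(-score))  # squash the score into a probability

def quote(age, annual_mileage, years_no_claims, claim_cost=3000, loading=1.25):
    """Turn the modelled risk into a premium: expected loss plus a loading."""
    return claim_probability(age, annual_mileage, years_no_claims) * claim_cost * loading

print(f"Quoted premium: £{quote(25, 12000, 1):.2f}")
print(f"Quoted premium: £{quote(55, 6000, 15):.2f}")
```

The formula inside claim_probability is the ‘model’: a representation of how the risk factors are assumed to relate to the event of interest, estimated from whatever data was collected.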

All models are wrong

The crucial thing to remember is that this is not the truth. People have said this so often, but it needs to be repeated again and again. The usual phrase was coined by the British statistician George Box. I knew him. He worked in the chemical industry, went to the States, and he came up with this phrase: all models are wrong, but some are useful. The phrase has even got its own Wikipedia page, so it must be a good thing to know. It’s a very simple phrase, but one of enormous clarity and wisdom. All models are wrong.

Another phrase that’s quite useful to remember is that a model is the map, not the territory. Now, I love maps. I’ve done a lot of wilderness walking in rough areas on Dartmoor. I want a map that is going to give me every detail about where every bog is, where the ground is going to be wet and where it’s going to be dry. It’s not perfect; I’ll still stumble into the odd patch of wet ground, but that level of detail is going to really help me.

Other times, if I just want to know where Turkmenistan is, I don’t need that detail. I just want a map of the world with a vague blob where the countries are. These maps are not the world or Dartmoor, but they can be useful for various purposes. Different maps will be useful for different reasons. Different models may be useful for different reasons, but none of them are correct. They’re all wrong. They’re all inadequate representations. It just depends on how inadequate they are.

Why weather forecasting models are useful

Weather forecasting models are highly complex mathematical representations of the atmosphere. They are big three-dimensional models with differential equations. You put in some starting values, run the model forward and it’ll say whether it’s going to rain in a particular place at a particular time in three days’ time. Then you change the starting values a bit, run it again, and you can see whether this time it’s going to rain in that particular place, at that particular time.

These models are not correct. They are not the atmosphere. But people found that if you run this thing 100 times, and in 20 of those runs it rains in a given area at a certain time, then, because so much effort has gone into building and calibrating these models, it really will rain on about 20% of such occasions. These are good, reliable predictions, very useful indeed. It takes a lot of effort to achieve that, and the models have become better and better.

That illustrates a couple of points. First of all, it shows that models can often make predictions by simulating possible futures, possible ways in which things might turn out, none of which are correct. They’re just scenarios. They’re just possibilities. Also, it shows how valuable it is for a model to produce multiple possible futures, giving an idea of the uncertainty of what might happen: essentially, the probabilities.
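The logic of that kind of ensemble forecast can be sketched in a few lines of code. This is only a toy illustration, not a real forecasting model: the ‘dynamics’ below are a made-up random walk standing in for the differential equations of the atmosphere, and the humidity threshold and perturbation sizes are invented for the example.

```python
import random

# Toy ensemble forecast: perturb the starting values, run the model forward
# many times, and report the fraction of runs in which it rains.
# The "model" here is a stand-in random walk, not real atmospheric physics.

def toy_forecast(initial_humidity: float, days: int = 3) -> bool:
    """Run one scenario forward and say whether it ends in rain."""
    humidity = initial_humidity
    for _ in range(days):
        humidity += random.gauss(0.0, 0.05)   # stand-in for chaotic dynamics
    return humidity > 0.8                     # call it 'rain' if humidity ends high

def rain_probability(initial_humidity: float, n_runs: int = 100) -> float:
    """Perturb the starting conditions and count how often it rains."""
    rainy = sum(
        toy_forecast(initial_humidity + random.gauss(0.0, 0.02))
        for _ in range(n_runs)
    )
    return rainy / n_runs

print(f"Chance of rain in three days: {rain_probability(0.75):.0%}")
```

None of the simulated futures is ‘the’ forecast; the useful output is the proportion of runs in which it rains, which, for a well-calibrated model, can be read as a probability.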

Creating a statistical model

It can be challenging to understand how a model is constructed. Let me take a simple example: a coin. I’m going to flip it. Is it going to come up heads or tails? I should look at the coin and check it’s not double-headed, and then a model might be that it’s about 50-50. Therefore, I’d predict that if I kept on flipping it, about half the time it would come up heads and half the time tails. I can check that. So that’s a simple model; it’s a mathematical representation of this coin.

But if this was not a coin, if it was something else that wasn’t so nicely symmetric, I wouldn’t have a theoretical model. I’d have to just flip it lots of times and find out how many times it came up heads or tails. Then I’d have a purely empirical model, based on data, but with enough data I could go on to say the same kind of thing. If it came up heads 60% of the time, and it was the same object and the conditions were similar, I’d say that if I do it another 100 times, it should come up heads about 60 times. It’s a purely empirical model, but still a simple one. People have also built models that take into account how hard you flip a coin: coin-flipping machines that use the dynamics and the physics of the coin to try to predict individual flips more accurately.
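Both kinds of coin model, the theoretical 50-50 one and the purely empirical one, are easy to write down and check by simulation. In this sketch the 60% figure is just the worked example from the text, not data from any real object.

```python
import random

# Theoretical model of a fair coin versus an empirical model of an
# asymmetric object, estimated purely from observed flips.

def simulate_flips(p_heads: float, n: int) -> int:
    """Flip an object with the given heads probability n times; count the heads."""
    return sum(random.random() < p_heads for _ in range(n))

# Theoretical model: a symmetric coin should come up heads about half the time.
print("Fair coin, 1000 flips:", simulate_flips(0.5, 1000), "heads")

# Empirical model: suppose 600 heads were observed in 1000 past flips.
observed_heads, observed_flips = 600, 1000
p_estimate = observed_heads / observed_flips   # 0.6, estimated from data alone

# Prediction from the empirical model: about 60 heads in the next 100 flips,
# assuming the same object and similar conditions.
print("Predicted heads in next 100 flips:", round(p_estimate * 100))
print("Simulated heads in next 100 flips:", simulate_flips(p_estimate, 100))
```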

Defining good statistical modelling

What this means is that getting inside a model can be really quite challenging, although we should be able to do it, to work out where a model comes from and what it’s based on. But there are a couple of things that should be publicly available about any model that’s making any sort of prediction. First, what assumptions are being made? And second, how good are its predictions? How good have they been in the past? With what confidence are the conclusions being drawn?

This is what defines good statistical analysis and good modelling. It not only contains the claim being made (I think this is what’s going to happen); it also contains an acknowledgement of the uncertainty about that claim (I think the temperature tomorrow at this time will be between 30 and 40 degrees). Alongside the prediction, it carries some judgement about its inevitable error and uncertainty.
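As a small illustration of what such an acknowledgement of uncertainty can look like in practice, the sketch below reports a temperature forecast not as a single number but as a central estimate with an interval derived from simulated scenarios. The numbers are invented; only the form of the output matters.

```python
import random
import statistics

# Report a prediction together with its uncertainty: a central estimate plus
# an interval taken from a set of simulated possible futures (invented here).

scenarios = sorted(random.gauss(35.0, 3.0) for _ in range(1000))

central = statistics.median(scenarios)
lower = scenarios[int(0.05 * len(scenarios))]   # 5th percentile
upper = scenarios[int(0.95 * len(scenarios))]   # 95th percentile

print(f"Tomorrow's temperature: about {central:.0f} degrees "
      f"(90% interval: {lower:.0f} to {upper:.0f})")
```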

Acknowledging uncertainty

Before I flip the coin, if there’s an audience, I’d say: what’s the probability this will be heads? Everyone will say 50%. Then I flip it and say: what’s the probability this is heads? And everyone’s really unhappy about it. Eventually someone will grudgingly say, 50-50. Then I look at it without showing them, and say: what’s your probability this is heads? Then they’re really unhappy, but eventually they might say 50-50, because those are still reasonable betting odds for them. I know what it is, but they don’t.

What I’ve done here is actually a deep philosophical trick. Before I flip the coin, there is the kind of uncertainty sometimes called aleatory uncertainty, or chance, or randomness: you cannot know what is going to happen. But once I’ve flipped the coin, the uncertainty has changed. It’s now a lack of knowledge; it’s our ignorance. It is either heads or tails; I just don’t know which. That’s called epistemic uncertainty: uncertainty due to lack of knowledge. It’s still 50-50 if I don’t show it to you, if you don’t see it, but it’s a different form of uncertainty.

When we’re doing modelling, we have to acknowledge uncertainty about the future: the chance, the randomness, the fact that we can’t know what’s going to happen. We also have to acknowledge what we don’t know: our ignorance. This is absolutely important because all models are based on assumptions. Every assumption is wrong.

AI and algorithms

I worked in AI 30 years ago, introducing probability ideas which have, in fact, been very successful. But I’m a deep sceptic about the sort of mystical powers AI is supposed to have to do things that humans can’t. It can do some things fantastically well: the algorithms behind cameras that spot faces, for example. And, of course, it can be misled. But it can do all sorts of clever things. It can recommend what film I might like to watch next on Netflix. That’s fine, but it’s not actually magical; anyone who knew me a little would be able to tell me that. We have to be very cautious about ascribing great powers to these ideas, because basically AI and machine learning produce algorithms.

An algorithm is essentially a computer program: you put something in, something comes out, and that’s it. What goes on inside might be based on some fantastically complicated deep neural network, or it might be a simple regression analysis that adds up points: you get two points if you’ve got this symptom and three points if you’ve got that one.
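To show how unmysterious that simplest kind of algorithm is, here is a sketch of an additive points score. The symptoms, point values and threshold are invented for illustration; the only point being made is that ‘two points for this symptom, three for that one’ is itself an algorithm, just a very transparent one.

```python
# A points-based scoring algorithm: add up points for the symptoms present
# and flag the case if the total reaches a threshold. All values are invented.

POINTS = {
    "fever": 2,
    "persistent_cough": 3,
    "shortness_of_breath": 3,
}

def risk_score(symptoms: set[str]) -> int:
    """Add up the points for whichever symptoms are present."""
    return sum(pts for symptom, pts in POINTS.items() if symptom in symptoms)

def needs_follow_up(symptoms: set[str], threshold: int = 4) -> bool:
    """Flag the case if the total score reaches the threshold."""
    return risk_score(symptoms) >= threshold

print(needs_follow_up({"fever"}))                      # False: 2 points
print(needs_follow_up({"fever", "persistent_cough"}))  # True: 5 points
```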

Questioning the AI hype

Time and time again, those very simple statistical methods are shown to be just about as good as these enormously complicated methods. One of the first things I always ask about any claim about what AI can do is: could it have been done with something really simple, to just about the same level of accuracy? There’s so much hype and so much misleading information around this, so many unjustified claims, that I would be deeply cautious.

At the same time, these are fantastically powerful tools that are incredibly valuable at harnessing lots of information and coming up with good judgements, which will often be as good as what experts can do themselves, and entirely reproducible. But AI needs to be treated just like any other piece of statistical modelling and technology. We need to know about the assumptions, and we need to evaluate its performance very carefully, particularly when we move it into new areas. We need to know what it does on outlying, unusual cases. We need to know whether it’s fair and just, or whether it encapsulates the prejudices and biases that we know exist in our society. It needs very careful scrutiny all the time.

Discover more about statistical modelling and uncertainty

Spiegelhalter, D. (2020). Should We Trust Algorithms? Harvard Data Science Review, 2(1).

van der Bles, A. M., van der Linden, S., Freeman, A. L. J., et al. (2019). Communicating uncertainty about facts, numbers and science. Royal Society Open Science, 6(5), Article 181870.
