I’m using this article on ambulance waiting times from the BBC to illustrate some of the frustrations encountered with certain data stories. There is also a lesson to learn, I hope.

Picture: lydia_shiningbrightly, Flickr Creative Commons

This is a short critical analysis. I like the BBC, but they are evidently susceptible to making news out of what many may consider to be non-stories.

Let’s start with the title:

Wales’ ambulance transfer times worst in UK

Now, regarding waiting or transfer times, longer probably equals worse. So this is essentially true.

However, does the longest wait (at six hours 22 minutes) equate to the worst service? What if all their other times were less than 30 minutes? They weren’t but anyway…

The fact is, the numbers tell us nothing about the *distribution* of the data: they give no idea of proportion.
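To make that concrete, here is a minimal Python sketch. The two services and all of their handover times are invented for illustration (only the 382-minute figure echoes the six hours 22 minutes in the story), and `summarise` is a hypothetical helper, not anything the BBC or the ambulance trusts use.

```python
from statistics import median

# Hypothetical handover times in minutes. These are NOT figures from the
# BBC article; only the 382-minute maximum (6 h 22 min) echoes the story.
service_a = [12, 15, 18, 20, 22, 25, 28, 382]        # one extreme outlier
service_b = [150, 180, 200, 240, 260, 300, 350, 382]  # long waits are typical

def summarise(waits):
    """Return the longest wait alongside two measures of what is typical."""
    return {
        "longest": max(waits),
        "median": median(waits),
        # Proportion of handovers that took longer than an hour.
        "share_over_60": sum(w > 60 for w in waits) / len(waits),
    }

summary_a = summarise(service_a)
summary_b = summarise(service_b)

print("Service A:", summary_a)
print("Service B:", summary_b)
```

Both made-up services share the same "longest wait", yet the median and the share of waits over an hour describe two very different experiences for patients. A headline built on the maximum alone cannot tell them apart.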

The normal distribution, or Gaussian distribution as it is sometimes called, is an important concept in mathematical probability and statistics.

The classic symmetrical bell-shaped curve, shown below, is used to represent “normally” distributed data. When dealing with continuous data, the area under the curve between two limits indicates the proportion of data points (observations) that fall between those limits.

In many areas of scientific study, physical measurements are often approximately normally distributed. This is very useful in science because the Gaussian distribution is frequently used to model random values whose actual distribution is unknown.

Furthermore, analysis of results becomes more straightforward when the relevant variables are normally distributed.

**The normal “Gaussian” distribution. Picture by Namal Perera**
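As a rough sketch of what "proportion between two limits" means, the snippet below uses `NormalDist` from Python's standard `statistics` module. The mean of 20 minutes and standard deviation of 5 minutes are assumptions picked purely for illustration, and `share_between` is a hypothetical helper. Real waiting times are unlikely to be exactly normal.

```python
from statistics import NormalDist

# Assumed parameters for illustration only: mean 20 min, standard deviation 5 min.
waits = NormalDist(mu=20, sigma=5)

def share_between(dist, low, high):
    """Proportion of observations expected to fall between two limits."""
    return dist.cdf(high) - dist.cdf(low)

# Share within one standard deviation of the mean (15 to 25 minutes).
within_one_sd = share_between(waits, 15, 25)

# Share more than three standard deviations above the mean (over 35 minutes).
beyond_three_sd = 1 - waits.cdf(35)

print(f"Within one SD of the mean: {within_one_sd:.1%}")
print(f"More than three SDs above the mean: {beyond_three_sd:.2%}")
```

Under these assumptions roughly two thirds of observations sit within one standard deviation of the mean, while only a fraction of a percent lie more than three standard deviations above it. That thin slice is the part of the curve an "extreme value" story is drawn from.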

Applying the bell-curve principle to the story – is it really news if you take one extreme of the curve and make it seem like the norm? Excuse the minor pun.

It’s a point I’ve raised again and again. As journalists, it’s fine to report on the extremes. And often fun – consider the world’s largest hotdog. However, as a serious data journalist, it’s worth taking the time to consider how relevant your data point is in the context of the whole distribution story.

And just btw, the lesson still stands – actually it’s worse – if we consider asymmetrically distributed data.

**Positively skewed data. Picture by Namal Perera**

Here, the end of the curve is even less representative of the data distribution.
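A small simulation can show the same thing. This is only a sketch: it assumes, hypothetically, that waits follow a log-normal shape (a common stand-in for positively skewed data), and the parameters and seed are arbitrary rather than fitted to any real ambulance figures.

```python
import random
from statistics import mean, median

random.seed(0)  # fixed seed so the simulated sample is reproducible

# Simulated, positively skewed waits in minutes. The log-normal shape and
# its parameters (3, 0.8) are assumptions for illustration, not real data.
simulated_waits = [random.lognormvariate(3, 0.8) for _ in range(1000)]

typical = median(simulated_waits)   # the middle of the data
average = mean(simulated_waits)     # pulled upwards by the long right tail
longest = max(simulated_waits)      # the single most extreme observation

print(f"Median wait:  {typical:.0f} min")
print(f"Mean wait:    {average:.0f} min")
print(f"Longest wait: {longest:.0f} min")
```

With skewed data the mean drifts above the median, and the longest value sits far beyond both. Reporting the extreme on its own says very little about what most of the simulated "patients" experienced.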

In fairness, the reporter(s), who are not named on the webpage, do give some points of reference. For example, they state “no service saw its longest wait dip under an hour, with many around the two-hour mark”.

And for comparison, it mentions an ambulance service in the east of England whose longest single wait was 5 hours 51 minutes.

The report isn’t likely to be the full story (it rarely is from what I’ve learned during the Science Journalism MA). However, if you ask for the “*longest waits*” you will probably only be given the longest waits.

Yes, it is unacceptable for patients to wait around for lengthy hand-overs, but if it doesn’t lead to harm and the majority of people are treated promptly, how big an issue is this?

The article does quote a Welsh government spokesperson saying, “most people were waiting for an average of 20 minutes.”

One A&E consultant I spoke to said, “It’s no surprise. You see this a lot. It’s an easy target highlighting the weak areas. It’s a shame they don’t mention how quickly the sickest [patients] were transferred.”

So the next time you read a story about the longest, shortest, fastest, or healthiest, please consider what data you’re looking at, and the context in which it is presented. Consider whether you have the full picture before forming your opinion.