Many people distrust statistics because they can be used to mislead and to lend false support to lies. Statistics can show a correlation between the number of storks and the number of babies in a country, and one could conclude that storks deliver babies, but a correlation does not mean that storks actually deliver babies. Examples like this have been used to cast doubt on good statistics, the kind that can do real good, such as establishing the link between smoking and lung cancer and thereby saving lives.
This is why Tim Harford wrote “The Data Detective”, to educate people like you and me about how to think critically and use logic and evidence to better detect statistical truths rather than fall victim to statistical falsehoods. To do so, he lays out ten rules with illuminating examples to make sense of statistics; let’s explore them:
Rule One: Search Your Feelings
It’s easy to be fooled by our feelings and to buy into statistical claims that are not true because when we have an emotional reaction or belief, it is harder to question it. Wishful thinking plays a part in our judgment as well, because naturally when we believe something, we want it to be true.
Sometimes we think through a topic with the aim of reaching a specific kind of conclusion, something called “motivated reasoning,” to which even experts are not immune. Abraham Bredius, an expert on Vermeer, was convinced that a painting called “Christ at Emmaus” was a Vermeer because of the elements and conditions presented in it. However, it was a forgery. His desire for it to be a Vermeer got in the way of his judgment; this bias towards our preconceptions is quite common.
Our emotional response to statistical and scientific evidence has significant influence over our judgment, but we can control our emotions. So, take note of your reaction when looking at a statistical claim and pause and reflect. You may want to believe or disbelieve because of your feelings, but the facts matter too, so take a moment to ask, “is this true?”
Rule Two: Ponder Your Personal Experience
Sometimes, statistics tell us one story while our own daily personal lives tell us something else. So, we have to be wise about what to believe, which requires questioning statistical sources as well as our own experiences.
The truth is that we need both sides. For example, a study showed that China consumed more cement in 3 years than the U.S. did in the entire 20th century, which seems excessive. However, Harford took a trip on a high-speed train through China’s densely populated city of Yangshuo, and the sight of one concrete building after another shed light on the statistic he had read in the study.
If we don’t pay attention to statistics, we’ll be mistaken about the world, but if we pay attention only to the statistics, we’ll understand too little. This is why we also need our personal experience, and why we need to ask smart questions so that we are not misled or deceived.
Rule Three: Avoid Premature Enumeration
The whole discipline of statistics is based on counting and measuring things, yet when people approach statistics they often forget to ask what is being counted and what definition is being used. Not understanding how the data is recorded makes it hard to understand the answer. Rushing to the numbers before understanding what they measure is called “premature enumeration”.
For example, a study showed that children who play violent video games are more likely to be violent in reality, but it gave no definition of “play,” no indication of how long the children played, and no explanation of what counted as a violent video game. No one can really understand the results of the study unless they know the facts about the data that was collected.
As thoughtful readers of statistics, we must not rush to judgment, but look for clarity first. Ask what’s being counted and ask for definitions. Understanding definitions is vital if we want to understand what is happening, and to make better decisions.
Rule Four: Step Back And Enjoy The View
The news, and breaking news in particular, tends to shine a light on only a narrow slice of information and to favor headlines that are shocking or negative, because that’s what people engage with.
For example, in April 2018, a newspaper ran the headline: “London’s murder rate was higher than New York’s for the first time ever!” However, if you took a step back and looked at the murder rate for both of these cities over time, you would see that they were on a downward trend. London was actually getting safer.
This is why, when you are presented with information, you should step back and look at the big picture. Taking a step back and enjoying the view gives you context and lets you see the real progress that has been made.
Rule Five: Get The Backstory
Academic publishing, and the journalism that relies on it, suffers greatly from publication bias: studies with interesting findings get published even when they turn out to be false, while studies that find nothing, or that fail to replicate an earlier result, often do not, even when they are true. For example, a paper claiming to show that people could see the future made it into the Journal of Personality and Social Psychology, but later studies that found no such effect did not, because the journal did not publish replications. Of course, few people want to read about a study that failed to find precognition; that’s not interesting.
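The mechanism can be made concrete with a small simulation. What follows is only a rough sketch under made-up assumptions, not anything taken from Harford’s book: we imagine 2,000 small studies of an effect that is truly zero, and we “publish” only the ones that happen to find a statistically significant positive result. The function name simulate_studies, the sample sizes, and the publication filter are all hypothetical choices for illustration.

```python
import random
import statistics


def simulate_studies(n_studies=2000, n_per_group=30, true_effect=0.0, seed=0):
    """Simulate many two-group studies of the same hypothetical effect.

    Returns a list of (observed_effect, is_significant) tuples.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_studies):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        diff = statistics.mean(treated) - statistics.mean(control)
        # Standard error of the difference between two independent means.
        se = (statistics.variance(control) / n_per_group
              + statistics.variance(treated) / n_per_group) ** 0.5
        # Rough two-sided test at about the 5% level (|z| > 1.96).
        results.append((diff, abs(diff / se) > 1.96))
    return results


studies = simulate_studies()
all_effects = [diff for diff, _ in studies]
# Assumed publication filter: only significant, positive ("exciting") findings.
published = [diff for diff, significant in studies if significant and diff > 0]

print(f"studies run:            {len(studies)}")
print(f"studies published:      {len(published)}")
print(f"mean effect, all:       {statistics.mean(all_effects):+.3f}")
print(f"mean effect, published: {statistics.mean(published):+.3f}")
```

Across all the simulated studies the average effect sits close to zero, as it should. Among the few that pass the filter it looks substantial, even though nothing real is going on. A reader who sees only the published studies sees only the flukes.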
Publication bias does little harm if it only slightly distorts our view of the world, but it can be bad for our health. If treatment trials that indicate a drug is ineffective go unpublished, real lives are at stake. This is why it’s important to be transparent about the data, the information, and the clinical trials that go unpublished.
So what should you do when consuming research through science journalism or the media? Ask how the study fits into the broader picture, and ask questions about the study itself, such as who it was conducted on and how large the effect was. Good journalism will explain this and help you understand.
Rule Six: Ask Who Is Missing
A lot of studies exclude important groups of people or do not collect data in a way that allows us to analyze it for specific groups. Examples range from medication studies that do not collect sex-disaggregated data to polling errors where election pollsters failed to collect representative data, leading to incorrect predictions of election outcomes.
Big data and digital data, such as what people say on Twitter, have been used to gauge public sentiment, and the sheer volume can seem like a way to remove sample bias. But such data still represents only the people who use Twitter, and they do not represent everyone.
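To see why sheer size does not cure sample bias, here is a minimal sketch with an entirely invented population. The numbers are assumptions chosen for illustration, not real survey data: 20% of people use the platform, and platform users support some policy more often (70%) than everyone else (40%). The helper share_supporting is hypothetical.

```python
import random

rng = random.Random(42)

# Invented population of 100,000 people, each stored as (on_platform, supports).
# Assumed rates: 20% are on the platform; 70% of users and 40% of non-users
# support the policy.
population = []
for _ in range(100_000):
    on_platform = rng.random() < 0.20
    supports = rng.random() < (0.70 if on_platform else 0.40)
    population.append((on_platform, supports))


def share_supporting(people):
    """Fraction of the given people who support the policy."""
    return sum(supports for _, supports in people) / len(people)


true_share = share_supporting(population)
big_biased = [person for person in population if person[0]]  # every platform user
small_random = rng.sample(population, 1_000)                 # small random sample

print(f"whole population:        {true_share:.3f}")
print(f"all platform users:      {share_supporting(big_biased):.3f} (n={len(big_biased)})")
print(f"random sample of 1,000:  {share_supporting(small_random):.3f}")
```

Under these assumptions the enormous platform sample lands far from the true figure, while the small random sample lands close to it. The question to ask is not how many people are in the data, but who is missing from it.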
Data has to be collected by someone, and what is or isn’t collected is the result of human assumptions, preconceptions and oversights. All data will hint at some biases, and there is not much we can do to eliminate missing data entirely. We can, however, question who and what is missing.
Rule Seven: Demand Transparency When The Computer Says No
The big data that each of us creates every day is used to direct ads at us, find us love, and even decide whether we go to jail. Big data brings with it all the problems of small data, and it becomes a big problem when people in power who don’t understand it use it to make life-changing decisions. When companies and organizations feed bad or incomplete data into an algorithm, it can only spit out bad results.
For example, the IMPACT algorithm assessed teachers’ performance in Washington, DC, judging the quality of teaching by whether students fell behind on test scores. This is a problem because every student is different, and because teachers could simply cheat to raise test scores. These issues didn’t stop 206 teachers from being fired for failing to meet the algorithm’s standards.
So, what to do? The problem isn’t just big data or algorithms, but the lack of transparency, scrutiny, and debate. When big companies like Google and Amazon keep their data and algorithms secret, how can we possibly scrutinize them? Making information available allows debate and discussion about the limitations of data and algorithms. Additionally, a lack of access to data holds back innovation: it makes it harder for other companies to build better products and services, and it reduces competition.
Rule Eight: Don’t Take Statistical Bedrock For Granted
Countries have organizations that count people and compile economic and social statistics, which form an important bedrock for journalists and academics who want to check facts and understand what is going on in a country.
Given the importance of this kind of statistical bedrock, it’s surprising what statisticians have had to go through for the truth to prevail. Andreas Georgiou moved to Greece to run the statistical agency ELSTAT, and after uncovering the truth about the country’s budget deficit, he was almost jailed for it.
Governments should be allowed to collect data to make better and more informed policies, but they need more resources to provide better statistical bedrock. They also cannot decide that those statistics are nobody else’s business. With reliable statistics, citizens can hold governments to account and governments can make better decisions. Official statistics are generally well produced; they will never be perfect, but they are the closest thing we have to statistical bedrock.
Rule Nine: Remember That Misinformation Can Be Beautiful, Too
Data visualization can be powerful when used clearly and honestly. However, its effectiveness depends on the intent of the creator and the wisdom of the reader. A lot of images are designed to grab our attention but end up misleading us, which is why misinformation can be beautiful.
One famous data visualization is Florence Nightingale’s rose diagram. It contrasted the death toll at Scutari hospital in Turkey during the Crimean War before and after sanitary improvements. Nightingale chose the rose diagram because it was persuasive. A bar chart could have shown the same results clearly, but it would also have linked the deaths to the winter months, which was not part of Nightingale’s agenda. Nonetheless, the rose diagram proved the need for better sanitation in other hospitals and barracks facing similar death tolls, which led the U.K. to pass several public health acts, resulting in an increase in life expectancy.
Nightingale’s graphic was catchy and her intentions were good, but not all catchy graphics are made with good intentions. This is why, when looking at a data visualization, you should pause and make sure that you understand your feelings and the basics of the graph: what is being counted, is there context, and do you understand what is being done? It’s important to recognize when you’re being persuaded and to consider whether or not you should change your mind.
Rule Ten: Keep An Open Mind
As discussed in Rule One, pre-existing beliefs tend to have an influence on how we perceive information, but it is important to keep an open mind.
Irving Fisher, one of the greatest economists who ever lived, was led to financial ruin in the Great Depression, not just because of the crash, but because of his stubbornness. He could have stepped back from the market and would have been fine, but he doubled down on his initial belief that the market would turn up again, and even borrowed more money to invest. So was his failure an inability to adjust, or a failure of forecasting?
A project on forecasting found that the people with the best forecasts 1) compared their estimates to base rates, 2) kept score of whether earlier forecasts had been right or wrong, 3) updated their forecasts as new information became available, and 4) had an open-minded personality.
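Keeping score can be as simple as writing down a probability for each prediction and checking it against what happened. One standard scoring rule for this is the Brier score, the average squared gap between the forecast probability and the outcome (1 if the event happened, 0 if not); lower is better. The sketch below uses invented forecasts purely for illustration, and this summary does not say which scoring method the project itself used.

```python
def brier_score(forecasts, outcomes):
    """Mean squared gap between forecast probabilities and outcomes (0 or 1).

    Lower is better; a forecaster who always says 50% scores 0.25.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


# Invented example: did the market rise (1) or fall (0) in five periods?
outcomes = [1, 0, 0, 1, 0]

# A forecaster who never changes their mind: always 90% sure of a rise.
stubborn = [0.9, 0.9, 0.9, 0.9, 0.9]

# A forecaster who revises their probability as new information arrives.
updating = [0.9, 0.6, 0.3, 0.7, 0.2]

print(f"stubborn forecaster: {brier_score(stubborn, outcomes):.3f}")
print(f"updating forecaster: {brier_score(updating, outcomes):.3f}")
```

In this made-up record the forecaster who keeps adjusting scores much better than the one who stays confident no matter what happens. The useful part is the habit, not the exact formula: a written score makes it harder to remember your past forecasts as better than they were.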
It seems simple: if things are going badly, adapt. But it was hard for Fisher to change his mind, since he had a good track record and a high public profile. Being a man of logic and reason, he believed the future was knowable; he refused to change his mind because he refused to accept that the world had changed.
These ten rules of thumb may be a lot to remember, so Harford ends the book with his Golden Rule: Be Curious. Don’t trust anyone or anything from the start. Look deeper and ask questions. This will allow us to fill knowledge gaps and understand the world better.