Reviews

The Art of Statistics: How to Learn from Data by David Spiegelhalter

hmandsager's review against another edition

Was too simplistic

thomasnel's review

3.0

At the time, this book went over my head. To give it an honest review I would need to go back and read it again; I've since learned a bit more about statistics from other sources and would now give it the effort it deserves. As it stands, however, this book was just a total slog to get through, which is why I give it 3/5 rather than make any hasty judgments.

jpmaguire2's review

4.0

This was a solid summary of the field of statistics. It helps the reader understand when to use various types of stat methods and charts. I did find parts of it went over my head. But for the most part, I think this book helped me understand some terms and ideas in statistics that I'd never fully understood in the past.

elctrc's review

4.0

Solid overview of (and occasional deep dive into) statistics and probability

jwsg's review

3.0

The goal of The Art of Statistics is to focus on "using statistical science to answer the kinds of questions that arise when we want to better understand the world". Even as analysis software becomes more sophisticated, making it easier for people to run complex calculations, it is more important than ever that people are data literate and have the skills to understand and critique conclusions drawn by others on the basis of statistics.

I loathed statistics in high school and later in university. I managed to pass my exams through rote memorization but never developed a proper understanding of the principles to problem-solve independently. I picked up The Art of Statistics after a friend raved about it, and while I think Spiegelhalter did help to clear up some things for me, everything from Chapter 9 onwards is still a little iffy for me, including the Bayesian approach. Still, it's a helpful primer on statistics that also reminds the reader to be on the watch for the inappropriate use of standard statistical methods that limits reproducibility or replication of results, and for fake discoveries arising from the intensive analysis of data sets derived from routine data, whether due to systematic bias inherent in the data sources or to carrying out many analyses and only reporting whatever looks most interesting (i.e. "data dredging").

INTERPRETING DATA

On framing and presentation: Chapter 1 offered a useful reminder to consider how the presentation of numbers can change their emotional impact, e.g. using a negative framing vs a positive framing, using percentages vs absolute numbers, or changing the order in which information is presented. Spiegelhalter reminds us that using expectancies - i.e. asking what this means for 100 (or 1,000) people - instead of percentages or probabilities can promote understanding of the data and an appropriate sense of importance. Relative risks tend to convey an exaggerated importance, and absolute risks should be provided for clarity.
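
To make this concrete, here's a minimal Python sketch of my own (the numbers are made up, not from the book) contrasting a relative-risk headline with the absolute and expected-frequency framings:

    # Hypothetical figures: a baseline risk of 6 in 100 rising to 7 in 100 with some exposure
    baseline_risk = 0.06
    exposed_risk = 0.07

    relative_increase = (exposed_risk - baseline_risk) / baseline_risk
    absolute_increase = exposed_risk - baseline_risk
    print(f"Relative risk increase: {relative_increase:.0%}")   # sounds alarming
    print(f"Absolute risk increase: {absolute_increase:.1%}")   # one percentage point

    # Expected-frequency framing: what does this mean for 1,000 people?
    n = 1000
    print(f"Out of {n} people, about {baseline_risk * n:.0f} would be affected anyway;")
    print(f"the exposure adds roughly {absolute_increase * n:.0f} more.")

The same change in risk reads as "17% higher" or as "10 extra cases per 1,000 people", and the latter is far easier to weigh.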

On precision in terminology: Spiegelhalter notes that when an "average" is reported in the media, it is often unclear whether this should be interpreted as the mean or median. For instance, if the media reports on how average incomes have changed, we must be clear whether they are talking about "average income" (mean) or "the income of the average person" (median). Likewise, when the media reports on average house prices, are they referring to the average-house price (median) or the average house-price (mean)? The latter can be skewed by the long tail of high end properties.
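
A quick illustration of the difference (my own toy numbers, not the book's): the mean of a skewed set of house prices is dragged up by the expensive tail, while the median is not.

    import statistics

    # Hypothetical house prices in £1,000s: mostly modest, one very expensive property
    prices = [180, 195, 210, 220, 230, 240, 260, 275, 300, 2500]

    print("mean  :", statistics.mean(prices))    # 461 - pulled up by the outlier
    print("median:", statistics.median(prices))  # 235 - the 'price of the average house'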

We need to make a distinction between administrative data and survey data. If we want to know how many people passed through A&E last year, administrative data can tell us the answer. But often our questions go beyond simple description of the data, and we want to learn something bigger than just the observations in front of us - whether that is making predictions (how many will come next year?) or saying something more basic (why are the numbers increasing?). Spiegelhalter notes that causation is difficult to establish statistically, but well-designed randomised trials are the best framework. However, randomised trials aren't always feasible, and often what we have is observational data (e.g. lung cancer cases). To conclude that there is causation, we must have sufficient direct, mechanistic and parallel evidence:
- The size of the effect is so large that it cannot be explained by plausible confounding
- There is approximate temporal and/or spatial proximity, in that cause precedes effect and effect occurs after a plausible interval, and/or cause occurs at the same site as the effect
- The effect increases as the exposure increases, and the evidence is even stronger if the effect reduces upon reduction of the dose
- There is plausible mechanism of action, which could be biological, chemical, or mechanical, with external evidence for a causal chain
- The effect fits in with what is known already
- The effect is found when the study is replicated
- The effect is found in similar, but not identical, studies.

Once we want to start generalizing from the data, we enter into the realm of inductive inference, taking particular instances and trying to work out general conclusions. The example Spiegelhalter uses is if we don't know the customs in a community about kissing female friends on the cheek, we have to try to work it out by observing whether people kiss once, twice, three times or not at all. While deduction is logically certain, induction is generally uncertain.

Drawing conclusions from survey data is an example of inductive inference. Spiegelhalter reminds us that "it is no trivial matter to go from the actual responses collected in a survey to conclusions about the whole of Britain….[it may be] incredibly easy to just claim that what these responses say accurately represents what is really going on in the country", but this is clearly not the case. Spiegelhalter breaks down the process of going from raw responses in a survey to claims about the behaviour of the whole country into the following stages:
- What the recorded raw data from survey participants tells us
- What we can conclude about the true number in the sample: this requires us to make assumptions about how accurate respondents are
- What we can conclude about the study population - the ones who could have potentially been included in the survey: Spiegelhalter describes this as perhaps the most challenging step, as you need to be confident that the people asked to take part in the survey are a random sample from those who are eligible. You also need to assume that the people who agree to take part are representative, which may not always be the case.
- What we can infer about our target population: Going from the study population to the target population (e.g. the adult population of the country) should be relatively straightforward, if the sampling is done properly, but it would exclude people in institutions e.g. prisons, nursing homes, etc. Where this is tricky is if, say, our target population comprises people but we were only able to study mice. Or if clinical trials were only conducted on men.

On survey data, Spiegelhalter notes that acknowledging uncertainty is important. Anyone can make an estimate, but being able to realistically assess its possible error is a crucial element of statistical science. In a well-conducted study, we expect the sample mean to be close to the population mean, the sample inter-quartile range to be close to the population inter-quartile range etc. But we must not confuse the sample mean with the population mean and assume, say, that if a survey finds that 7% of the sample are unemployed, it necessarily means that 7% of the whole population are unemployed. The sample size should affect one's confidence in the estimate, and knowing exactly how much difference it makes is essential for proper statistical inference.
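
To get a feel for how sample size changes that confidence, here's a small Python sketch (my own illustration, using the standard formula for the standard error of a proportion, not code from the book):

    import math

    # If 7% of a simple random sample are unemployed, how precise is that estimate?
    p_hat = 0.07
    for n in (100, 1000, 10000):
        se = math.sqrt(p_hat * (1 - p_hat) / n)      # standard error of a proportion
        low, high = p_hat - 2 * se, p_hat + 2 * se   # rough 95% interval (+/- 2 SE)
        print(f"n = {n:>6}: roughly {low:.1%} to {high:.1%}")

With 100 respondents the plausible range runs from about 2% to 12%; with 10,000 it narrows to roughly 6.5% to 7.5%.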

Much of the data used today is not based on random sampling or any sampling at all. Rather, what we have is data collected on, say, online purchasing, or administrative data, which we then re-purpose to help us understand what is going on in the world. There is no gap between the sample and the study population. Spiegelhalter cautions that we have to be careful about systematic biases in the data that can jeopardise the reliability of any claims.

THINGS TO BEAR IN MIND

Approaching data collection with the PPDAC Problem Solving Cycle
1. Problem: Understanding and defining the problem
2. Plan: Planning what to measure and how, designing the study, planning how data will be collected and recorded
3. Data: Collecting, managing and cleaning the data. This requires good organizational and coding skills
4. Analysis: Sorting the data, looking for patterns, hypothesis generation and finding appropriate ways to present the data (e.g. graphs, tables).
a. On data visualisation specifically: How does the design of the data visualisation allow relevant patterns to be noticeable? Is the presentation attractive, without getting in the way of honesty, clarity and depth?
5. Conclusion: Interpreting the data, drawing conclusions and communicating them
a. On communicating data, Spiegelhalter reminds us to "shut up and listen" so that you can get to know about the audience for your communication, whether it might be politicians, professionals or the general public. We have to "understand their inevitable limitations and any misunderstandings, and fight the temptation to be too sophisticated and clever, or put in too much detail"
b. Know what you want to achieve with the story you are telling with your data. It is inevitable that people will make comparisons and judgements, even if we only want to inform and not persuade.

Questions to ask when confronted by a claim based on statistical evidence
1. How rigorously has the study been done? Check for internal validity, appropriate design and wording of questions, taking a representative sample, using randomisation, and making a fair comparison with a control group, for example
2. What is the statistical uncertainty/confidence in the findings? Check margins of error, confidence intervals, statistical significance, sample size, multiple comparisons, systematic bias
3. Is the summary appropriate? Check appropriate use of averages, variability, relative and absolute risks
4. How reliable is the source of the story?
5. Is the story being spun? Be aware of the use of framing, emotional appeal through quoting anecdotes about extreme cases, misleading graphs, exaggerated headlines, big sounding numbers
6. What am I not being told? Consider cherry-picked results, missing info that would conflict with the story, and lack of independent comment
7. How does the claim fit with what else is known?
8. What is the claimed explanation for whatever has been seen? Be conscious of correlation vs causation, regression to the mean, inappropriate claims that a non-significant result means "no effect", confounding, attribution and prosecutor's fallacy.
9. How relevant is the story to the audience? Think about generalizability, whether the people being studied are a special case, or has there been an extrapolation from mice to people
10. Is the claimed effect important? Check whether the magnitude of the effect is practically significant and be especially wary of claims of "increased risk".

STATISTICAL CONCEPTS AND TERMS

Big data: Data can be "big" in two different ways. First, in the number of examples in the database (large n). Second, by measuring many characteristics, or features, on each example (large p) - for instance, having access to millions of an individual's genes. Today, we can have large n, large p problems, like data on the activities and behaviors of Facebook users.

Inter-quartile range: The distance between the 25th and 75th percentiles of the data; this concept helps describe the spread of a data distribution and is not affected by extremes.
Standard deviation: Measures the spread of data and is "only really appropriate for well-behaved symmetric data since it is…unduly influenced by outlying values".
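
A small Python sketch of my own (made-up numbers) showing why the IQR is robust to extremes while the standard deviation is not:

    import statistics

    data = [10, 12, 13, 14, 15, 16, 18, 20]
    with_outlier = data + [200]   # add one extreme value

    def iqr(xs):
        q1, _, q3 = statistics.quantiles(xs, n=4)   # quartile cut points
        return q3 - q1

    print("SD  without / with outlier:",
          round(statistics.stdev(data), 1), "/", round(statistics.stdev(with_outlier), 1))
    print("IQR without / with outlier:",
          round(iqr(data), 1), "/", round(iqr(with_outlier), 1))

The single outlier sends the standard deviation from about 3 to over 60, while the inter-quartile range changes only slightly.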

Margin of error: After a survey, a plausible range in which a true characteristic of a population may lie. These are generally 95% confidence intervals, which are approximately +/- 2 standard errors, but sometimes error-bars are used to represent +/- 1 standard error

Normal distribution: Theory shows that the normal distribution can be expected to occur for phenomena that are driven by large numbers of small influences, for example, height and cognitive skills, which are complex physical traits that are not influenced by just a few genes. In a normal distribution, roughly 95% of the population will be contained in the interval given by the mean +/- two standard deviations. Other, less natural phenomena may have population distributions that are distinctly non-normal and often feature a long right-hand tail, income being a classic example.
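
The '95% within two standard deviations' rule is easy to check by simulation; here's a quick Python sketch of my own (using made-up 'heights'):

    import random

    random.seed(0)
    mean, sd, n = 170, 10, 100_000              # hypothetical heights in cm
    samples = [random.gauss(mean, sd) for _ in range(n)]
    within = sum(mean - 2 * sd <= x <= mean + 2 * sd for x in samples)
    print(f"{within / n:.1%} fall within mean +/- 2 SD")   # about 95.4%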

INTERESTING SNIPPETS

Spiegelhalter adds some interesting background info along the way, like how scientist Francis Galton observed the phenomenon of "regression to mediocrity" (or what we now call regression to the mean), where more extreme responses revert to nearer the long-term average, since a contribution to their previous extremeness was pure chance. For instance, tall fathers tended to have sons slightly shorter than them and shorter fathers tended to have slightly taller sons. Any process of fitting lines or curves to data came to be called "regression".
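
Regression to the mean is easy to see in a toy simulation (my own sketch, not Galton's data): give everyone a stable underlying 'ability' plus independent chance on each of two attempts, then look at how the top scorers on the first attempt do on the second.

    import random

    random.seed(1)
    abilities = [random.gauss(100, 10) for _ in range(10_000)]
    first = [a + random.gauss(0, 10) for a in abilities]    # ability + luck
    second = [a + random.gauss(0, 10) for a in abilities]   # fresh luck

    cutoff = sorted(first)[int(0.95 * len(first))]          # top ~5% on attempt one
    top = [(f, s) for f, s in zip(first, second) if f >= cutoff]
    print("first attempt avg :", round(sum(f for f, _ in top) / len(top), 1))
    print("second attempt avg:", round(sum(s for _, s in top) / len(top), 1))

The group's second-attempt average falls back toward 100, because part of what made their first attempts extreme was pure chance.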


elliotmjones's review

5.0

An excellent (re)introduction to statistical thinking, without getting bogged down in formulae. It has given me better intuitions for concepts like the Central Limit Theorem than I got at A level or in an Economics degree.

jherta's review

5.0

I think I understood like 70% of this, but it’s definitely an accessible primer. Really glad to have read it considering how important statistical models have become even in the past few months.

davidgilani's review

5.0

Brilliant book! Not going to be everyone's cup of tea... but as an aspiring 'researcher' it was perfect for me. I think it's great for anyone who's interested in research, maths, or just how to better understand the claims that are made about the world.

It has a good amount of humour. The graphics are brilliant at helping with accessibility. The language is clear and concise. The scope is well-defined. The chapters flow well and build on (or deconstruct) what you've learned before. It's practical in its conclusions.

Loved it!

benrogerswpg's review

3.0

The Art of Statistics: How to Learn from Data

I have always been a fan of mathematics and data, algorithms and programming. This book was a good combination of the first 3.

I didn’t find it as compelling as the [b:The Art of Computer Programming, Volume 1: Fundamental Algorithms|112247|The Art of Computer Programming, Volume 1 Fundamental Algorithms|Donald Ervin Knuth|https://i.gr-assets.com/images/S/compressed.photo.goodreads.com/books/1388242904l/112247._SX50_.jpg|108080] book, but I still got a lot out of it!

Would recommend for people who love data like me!

3.7/5

snay's review against another edition

informative inspiring medium-paced

5.0