Sunday, June 17, 2012

How should statistics be taught? Some thoughts.

Inspired by Timothy Gowers' recent post on how mathematics should be taught to non-mathematicians, I thought it might be prudent to ask how statistics should be taught.  If you need convincing of how important the teaching of statistics is, watch Arthur Benjamin's short TED talk.

My own experience of statistics up to GCSE was generally one of boredom: in the main it seemed to involve drawing histograms and line graphs, with a peculiar level of attention paid to getting the axes perfectly right and choosing a suitable title.  This is not to say that labelling axes is unimportant, but it's not exciting.  More useful, and also more fun, is to explain why bad graphics are such a problem.  Here is a pretty bog-standard example of a horrific 3D pie chart.  It wouldn't be hard to explain to school children why not to use this sort of presentation, and perhaps the lesson would eventually filter up into the upper echelons of corporate management (I live in hope).

School probability, perhaps understandably, involved simply counting events (there are 3 blue socks and 4 red socks in a drawer, I pick 2 at random, what's the probability they're both blue?), with the examples tending to be rather implausible and dry.  This sort of thing is again important, but the examples can easily be jazzed up a bit; much like socks.
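For instance, the sock question can be settled both by counting and by simulation, and the simulation is often the more convincing of the two for students.  A minimal sketch in Python, using the numbers from the example above:

```python
from itertools import combinations
import random

socks = ["blue"] * 3 + ["red"] * 4

# Exact answer by counting: how many of the C(7, 2) = 21 pairs are both blue?
pairs = list(combinations(socks, 2))
exact = sum(pair == ("blue", "blue") for pair in pairs) / len(pairs)

# The same answer by simulation.
trials = 100_000
hits = sum(random.sample(socks, 2) == ["blue", "blue"] for _ in range(trials))

print(f"exact: {exact:.4f}")              # 3/21 = 1/7, about 0.1429
print(f"simulated: {hits / trials:.4f}")
```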

At A-level things perked up a bit: hypothesis tests actually made me feel that I could do useful things with statistics - I even did a bit of coursework on UK election results.  Box and whisker plots and finding the median of a bunch of numbers were somewhat less thrilling.

The view I've come to form recently is that applied statistics should, at all levels, teach two essential things; let's call them the cornerstones.

  1. 'Basic' probabilistic and statistical intuition.  People just aren't innately very good at dealing with probability and risk, even those of us who are trained to do so.  Many statistics courses do cover these issues, but they have to be drilled into people again and again.  Here I'm thinking of the base rate fallacy, the prosecutor's fallacy, regression to the mean, and so on; don't worry if you don't know what these terms mean yet, but see some of the examples below.
  2. Construction of statistical methods.  The thing I don't like about most applied statistics courses (including ones I have led myself) is that they involve teaching a long list of hypothesis tests or other methods which are useful in certain scenarios, and then providing some contrived examples to which we can apply them.  But what if the situation you actually find yourself in doesn't lend itself to a t-test, an ANOVA test, or a generalised linear model?
For example, to my mind it isn't so important that a student remembers what Cook's distance or leverage is, as long as they understand what statistical inference is, and therefore why it might not be a good idea for your estimate to be very unstable when a few data points are removed (there's a small sketch of this below).  After all, that lesson applies in almost any statistical model, whilst Cook's distances do not.
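Here's a small sketch of the sort of instability I mean, with made-up data: a toy regression through the origin where a single rogue observation drags the estimate around.  This is exactly the phenomenon that diagnostics like Cook's distance try to quantify.

```python
import random

random.seed(0)

# Ten points roughly on the line y = 2x, plus one gross outlier.
x = [float(i) for i in range(1, 11)]
y = [2 * xi + random.gauss(0, 1) for xi in x]
y[-1] += 30

def slope(xs, ys):
    # Least-squares slope for a line through the origin.
    return sum(a * b for a, b in zip(xs, ys)) / sum(a * a for a in xs)

full = slope(x, y)
for i in range(len(x)):
    loo = slope(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
    print(f"without point {i}: slope {loo:.3f} (full fit: {full:.3f})")
```

Deleting the outlier moves the estimate far more than deleting any other point, and a student who understands why has learned something that outlives any particular diagnostic.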

A Case Study Approach

I think that an approach based on the study of real (or at least believable) examples could help to ensure that the cornerstones are the main message of the course, rather than the less important technical details. Of course, the suitable level of rigour in a statistics course need not be constrained in any way by the use of these sorts of examples, and will depend strongly upon the intended audience.  Horses for courses, as they say.

Below are some interesting questions that could be answered with probability and statistics, and a short explanation of how they relate to the cornerstones.  Please do let me know what you think!

1. Are men taller on average than women?

Leaving aside questions of gender identity, the point here is that we can't measure every man and woman to get a definitive answer.  We might therefore think about sampling some men and some women, in the hope that the average of the sample is related to the average of the population.  If, for example, men are taller than women in our sample, we would also need to consider whether this could have happened by chance.  All this motivates the law of large numbers and the central limit theorem; these are weighty theorems, and it can be hard to communicate both their content and their importance.  For less advanced students, the example can be used to give an intuitive idea of statistical significance.  The key point is not just to teach students how to use a two-sample t-test.
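One way to convey the idea without reaching straight for the t-test is a permutation test: if sex really had nothing to do with height, then shuffling the labels at random should produce differences as large as the observed one reasonably often.  A rough sketch, with entirely invented heights:

```python
import random

random.seed(1)

# Made-up heights in cm, standing in for a small survey sample.
men = [178, 183, 171, 185, 176, 180, 174, 182]
women = [165, 170, 162, 168, 172, 160, 166, 169]

observed = sum(men) / len(men) - sum(women) / len(women)

# How often does random relabelling give a difference at least this big?
pooled = men + women
trials, count = 10_000, 0
for _ in range(trials):
    random.shuffle(pooled)
    diff = sum(pooled[:8]) / 8 - sum(pooled[8:]) / 8
    if diff >= observed:
        count += 1

print(f"observed difference: {observed:.1f} cm, p ~ {count / trials:.4f}")
```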

2. How would you go about determining the number of people in Cambridge who like to watch basketball?  How about in China?

This problem also uses the idea that sampling is somehow relevant to the population.  A bigger question is how on Earth one could obtain a random sample of people from China, and what a polling company might do to try to compensate for this difficulty.  The binomial distribution is obviously relevant here, but again we don't just want to do 'inference by formula'.
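Assuming (heroically) that we really could take a simple random sample, the estimation step itself is straightforward; the numbers below are hypothetical:

```python
import math

# Suppose we poll 400 randomly chosen people and 52 like watching basketball.
n, k = 400, 52
p_hat = k / n

# A 95% confidence interval from the normal approximation to the binomial;
# the hard part in practice is the sampling, not this formula.
se = math.sqrt(p_hat * (1 - p_hat) / n)
print(f"estimate {p_hat:.3f}, "
      f"95% CI ({p_hat - 1.96 * se:.3f}, {p_hat + 1.96 * se:.3f})")
```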

3. A doctor gives you a test for a rare disease: approximately 1 in 1000 people have the disease.  The test is very accurate; it is correct 99% of the time.  If the test is positive, what is the probability you have the disease?

This is the simplest sort of base rate fallacy; the test is so accurate that it seems almost certain you have the disease.  In reality, assuming that you have no particular reason to think you have the disease before the test (e.g. symptoms), it is still much more likely that the test result is a false positive (probability about 91%) than that you actually have the disease (about 9%).
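The arithmetic is a one-liner with Bayes' theorem, which makes it all the more striking that intuition gets it so badly wrong:

```python
# Prevalence 1 in 1000, and a test that is right 99% of the time
# whether or not you have the disease.
prevalence = 1 / 1000
sensitivity = 0.99   # P(positive | disease)
specificity = 0.99   # P(negative | no disease)

p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
p_disease = sensitivity * prevalence / p_positive

print(f"P(disease | positive) = {p_disease:.3f}")  # about 0.09
```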

4. A council collates the number of road accidents on 500 similar stretches of road.  It finds that 40 of these have had at least 7 accidents in the last 5 years, whilst the average is just 4.  It decides to put a speed camera at these 40 locations.  5 years later, the average number of accidents at the camera locations is just 4.  Did the speed cameras do their job?

This is a dressed-up 'regression to the mean' example.  Imagine that really there's no difference between the roads; some will have more accidents in the first 5 year period by chance, and then because they're not actually more dangerous, the number of accidents will probably be lower in the second period.  So we can't tell whether the speed cameras work or not from this sort of data.  The important thing is to consider what sort of experiment the council could use to answer the question.  [I don't have anything against speed cameras, by the way, but it's a realistic sort of example.]
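This one is easy to demonstrate by simulation: build 500 genuinely identical roads, pick out the apparent blackspots, and watch them 'improve' with no intervention at all.

```python
import math
import random

random.seed(2)

def poisson(lam):
    # Knuth's method for Poisson samples, to keep the sketch dependency-free.
    l, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= l:
            return k
        k += 1

# 500 identical stretches of road, accidents ~ Poisson(4) per 5-year period;
# by construction, none is more dangerous than any other.
first = [poisson(4) for _ in range(500)]
bad = [i for i, n in enumerate(first) if n >= 7]
second = [poisson(4) for _ in bad]  # second period: nothing has changed

print(f"{len(bad)} 'blackspots', first-period mean "
      f"{sum(first[i] for i in bad) / len(bad):.2f}")
print(f"second-period mean at the same sites: {sum(second) / len(second):.2f}")
```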

5. A PSA test is used to screen for prostate cancer in men.  The measured PSA level is generally higher if the volume of a tumour is higher.  How should the test be used?

This is much more complicated: it involves the relationship between PSA level and tumour size (linear regression?), what PSA levels look like in healthy patients, the relative costs of false positives and false negatives, and so on.
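Even a crude sketch makes the central trade-off visible: wherever the diagnostic threshold is placed, false negatives are being exchanged for false positives, and which exchange is acceptable is a medical question rather than a statistical one.  The distributions below are completely made up, purely for illustration:

```python
import random

random.seed(3)

# Hypothetical PSA levels (ng/ml) for healthy men and men with a tumour.
healthy = [random.lognormvariate(0.5, 0.7) for _ in range(10_000)]
ill = [random.lognormvariate(1.8, 0.7) for _ in range(10_000)]

for threshold in (2.0, 4.0, 6.0, 8.0):
    fpr = sum(h > threshold for h in healthy) / len(healthy)
    fnr = sum(i <= threshold for i in ill) / len(ill)
    print(f"threshold {threshold}: false positive rate {fpr:.2f}, "
          f"false negative rate {fnr:.2f}")
```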

The last two examples are taken from Professor Sir Timothy's post:

6. In September 2009 the same six numbers were chosen in two consecutive draws of the Bulgarian State Lottery. Was this conclusive evidence that the draws were manipulated?

We see silly newspaper stories about this kind of 'coincidence' all the time, and it's not always obvious how big a coincidence it is.  These problems are quite challenging (see also this).
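A back-of-the-envelope calculation shows why: any one specific repeat is fantastically unlikely, but there are an awful lot of opportunities for some repeat somewhere.  (I'm assuming the Bulgarian lottery draws 6 numbers from 42; the draw counts and the number of comparable lotteries below are guesses.)

```python
from math import comb

n_combinations = comb(42, 6)  # 5,245,786 possible draws
print(f"one specific repeat: 1 in {n_combinations:,}")

# Two draws a week for 50 years gives thousands of consecutive pairs in
# which a repeat could occur, in this lottery alone...
pairs = 2 * 52 * 50
p_one_lottery = 1 - (1 - 1 / n_combinations) ** pairs
print(f"some repeat in this lottery: {p_one_lottery:.4f}")

# ...and across (say) 500 similar lotteries worldwide, a repeat somewhere
# stops looking like a miracle.
p_worldwide = 1 - (1 - 1 / n_combinations) ** (pairs * 500)
print(f"some repeat in some lottery: {p_worldwide:.2f}")
```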

7. In 1999 a solicitor named Sally Clark was convicted for the murder of her two sons... Roy Meadow, a paediatrician, argued for the prosecution as follows. The probability of a cot death is approximately 1 in 8500. So the probability of two cot deaths is roughly the square of this, or 1 in about 73 million. Therefore it was overwhelmingly likely that the deaths were not due to natural causes. Is this argument valid?

A sad and infamous example of the prosecutor's fallacy, the Texas sharpshooter fallacy, and various other statistical crimes, with real and terrible consequences.  Much has been said about this already, so I won't add more here.
