Friday, November 22, 2013

Just Instrument It

(With apologies to macro-economists.)

The amount of time you spend in education predicts your earnings quite strongly, and it's generally agreed (Simon Cowell aside) that if you want to do well in life, staying in education for longer is a good idea.

But how much effect does it have?  We could look at a survey of people's incomes and group them by education level, but this doesn't give a causal effect.  It might tell us that people who have a master's degree earn more during their lifetime, on average, than those who don't.  This could be because people from wealthy backgrounds can afford the tuition for a master's degree, and also have pals in the city who can help them get a big salary afterwards.  Or perhaps people who do master's degrees work harder than the rest.  We can't easily tell the difference: this problem is called confounding.
We can't tell whether time spent in education causes earnings to increase, or whether there's a third factor which affects both.
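
To make the confounding story concrete, here is a minimal simulation sketch.  Everything in it is invented for illustration: the proportions, the earnings figures and the names `simulate` and `gap` are assumptions, not estimates from any survey.  Earnings are generated from family background alone, so the true effect of the degree is exactly zero, and yet degree holders still earn more on average.

```python
import random

def simulate(n=50_000, seed=1):
    """Generate (wealthy, masters, earnings) rows from a made-up model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        wealthy = rng.random() < 0.3                      # hypothetical confounder
        masters = rng.random() < (0.6 if wealthy else 0.2)
        # Earnings depend on background only: the degree has no effect here.
        earnings = rng.gauss(40.0 if wealthy else 25.0, 5.0)
        rows.append((wealthy, masters, earnings))
    return rows

def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)

def gap(rows):
    """Mean earnings of degree holders minus mean earnings of everyone else."""
    with_degree = mean(e for _, m, e in rows if m)
    without_degree = mean(e for _, m, e in rows if not m)
    return with_degree - without_degree

rows = simulate()
naive_gap = gap(rows)                                # ignores background
gap_wealthy = gap([r for r in rows if r[0]])         # wealthy backgrounds only
gap_other = gap([r for r in rows if not r[0]])       # everyone else

print(f"naive gap: {naive_gap:.2f}")
print(f"gap within wealthy group: {gap_wealthy:.2f}")
print(f"gap within other group: {gap_other:.2f}")
```

In this toy world the naive comparison shows a clear earnings gap, while the gap within each background group is close to zero.  Of course, that check only works because the confounder was recorded; in a real survey it often isn't.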

Thursday, November 14, 2013

But is it causal? Defining causality

As I alluded to in my last post, defining what it means for $X$ to cause $Y$ is no simple task.  It is not an idea that can be defined in purely probabilistic terms, because it says something about the mechanisms underlying the system we are studying, and what will happen if we interfere with that system in some way.

Consider the example given at the end of the last post.  The headline was:
How a short nap can raise the risk of diabetes
The implication of this is that the risk of diabetes increases because of the nap.  But what does this mean?

Friday, November 8, 2013

If correlation isn't causation, then what is?

I've started to get a little tired of writing entirely about media shenanigans, and in all likelihood so, dear readers, have you tired of reading about them.  So today I'm going to provide one of the 'educational pieces' alluded to in this blog's description; specifically I'm going to start talking about causal inference, which is the driving force behind the research I'm lucky enough to be paid to do.

We hear a lot in statistics classes (if you're inclined to go to them) and places like this blog, that "correlation is not causation."  Packaged within this pithy but stern warning is some very sound advice: just because you observe a relationship between two things doesn't tell you anything about the mechanism (if any) which creates that relationship.

Friday, September 13, 2013

The Monty Hall Problem

It's great to see pieces on familiar mathematics puzzles in the mass media, so I was pleased to see this article about the Monty Hall problem on the BBC News website.  However, I'm moved to write a little piece about this, because I think it uses a rather careless analogy.

A brief summary of the problem.  You're in a game show, and there are three doors; behind one of the doors is a prize, behind the other two is nothing (or possibly worse, a goat).  The procedure is as follows every time the game is played:
  1. you choose a door, say number 1 (but don't see what's behind it);
  2. the game show host, who knows where the prize is, opens one of the other two doors, say number 2, and reveals nothing;
  3. the host gives you the opportunity to either stick with your choice of door 1, or move to door 3;
  4. your choice is revealed.

The question is - should you stick, or switch, or does it make no difference?

The BBC video (with Marcus du Sautoy and Alan Davies) is very clear, and so is most of the article.  However the beginning includes a reference to Deal or No Deal: the reason I don't like this is that in Deal or No Deal the banker doesn't know where the money is.  As we will see, this point is absolutely critical to getting the right answer.
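
For anyone who prefers to check by experiment, here is a rough simulation sketch of the two situations.  The function names `play` and `win_rate` are my own, and the way the uninformed host is modelled (he opens one of the other two doors at random, and rounds where he happens to reveal the prize are thrown away) is an assumption about how best to mimic the Deal or No Deal setting.

```python
import random

def play(switch, host_knows, rng):
    """Play one round; return True/False for win/loss, or None if the round is void."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    others = [d for d in doors if d != pick]
    if host_knows:
        # The host deliberately opens an empty door among the other two.
        opened = rng.choice([d for d in others if d != prize])
    else:
        # The host opens one of the other two doors at random.
        opened = rng.choice(others)
        if opened == prize:
            return None  # prize revealed by accident: discard this round
    if switch:
        # Move to the one door that is neither the first pick nor the opened one.
        pick = next(d for d in doors if d not in (pick, opened))
    return pick == prize

def win_rate(switch, host_knows, n=100_000, seed=0):
    """Fraction of non-void rounds won under the given strategy."""
    rng = random.Random(seed)
    results = [play(switch, host_knows, rng) for _ in range(n)]
    valid = [r for r in results if r is not None]
    return sum(valid) / len(valid)

for host_knows in (True, False):
    for switch in (True, False):
        rate = win_rate(switch, host_knows)
        print(f"host_knows={host_knows}, switch={switch}: {rate:.3f}")
```

When the host knows where the prize is, switching wins about two thirds of the time.  When he doesn't, and we only count the rounds where he happened to open an empty door, sticking and switching both win about half the time.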


Wednesday, June 5, 2013

Twins, or In which newspapers mess up odds calculations (again)

This BBC piece interviews a couple who've been 'blessed' with three sets of twins (rather you than me).   The caption states that
Doctors told them the chances of having three sets of twins was 500,000-1.
Doctors that don't know much about genetics, perhaps.

Friday, April 19, 2013

Extreme Comparisons

This is just a fairly short note on something that has been covered in other cases elsewhere.  The BBC reported today that Tameside in Greater Manchester is the "UK's heart disease capital".  This is on the basis that the rate of deaths from heart disease between 2009 and 2011 was higher than anywhere else, at 132 per 100,000 people.  The data come from the British Heart Foundation, and I couldn't immediately see how to get at them.

The article makes a particular point of noting that this rate is three times higher than that of Kensington and Chelsea.  This sounds very dramatic, but it really isn't, so don't go moving just yet (even if you can afford a house in Kensington).

Friday, March 22, 2013

Smoking rates: even the 'good guys' shouldn't be trusted

The Guardian reports today that the number of children under 16 (more precisely aged 11-15) taking up smoking has risen by 50,000 in a single year.  This is a slightly irritating headline simply because the number is without context, but the sub-headline is more helpful: from 157,000 to 207,000, which is quite dramatic: a 32% increase.
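
As a quick check of that percentage, using only the two figures quoted in the sub-headline:

$$\frac{207{,}000 - 157{,}000}{157{,}000} = \frac{50{,}000}{157{,}000} \approx 0.318,$$

so the rise is roughly 32% of the earlier figure.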

First obvious question - is this statistically significant? Well, let's have a look at the research quoted by the Guardian, which comes from Cancer Research UK.  Their Figure 6.10 immediately arouses suspicion.  There does appear to be an uptick in the figures from 2010 to 2011, but they remain below the 2009 figure!  The overall trend in the numbers over the past 10 years is clearly downwards.  So either we believe that the number of children taking up smoking fell and then rose dramatically in consecutive years, or else we might just be witnessing a bit of noise in our data.