An RCT debate: Abhijit Banerjee versus Angus Deaton

11 June 2012

I have taken some time on this blog to address the value (or lack of value) of RCTs. I am not against them in principle, but I believe they must be used with care, and only when necessary, and that some of the claims made on their behalf (that they can tell us what works, for instance) therefore need to be taken with a pinch of salt. Or not at all.

This very interesting debate between Abhijit Banerjee and Angus Deaton for NYU’s Development Research Institute is worth reading (and viewing): Deaton v Banerjee on RCTs.

Banerjee’s main argument is that RCTs force researchers to be more rigorous:

Just thinking about designing an RCT forces researchers to grapple with causality, responded Banerjee. And Angry Birds-style trial and error isn’t a realistic way to create policy

However, trial and error is in fact a realistic way to create policy. In the real world, where information is never complete, trial and error is the only way forward. And what is an RCT but a test of a trial, one which may very well end in error? The fact is that most RCTs are developed on the basis of a theory or a hunch; otherwise there would be no need to test it with an RCT. Here, think tanks play a critical role: they can be quick to make suggestions that may go beyond what the evidence allows, point out errors, and recommend adjustments to the policy, helping to steer it in the right direction (if there is such a thing).

Deaton goes further and argues that RCTs could tell us what worked but not what will work:

Angus Deaton responded that RCTs are of limited value since they focus on very small interventions that by definition only work in certain contexts. It’s like designing a better lawnmower—and who wouldn’t want that? —unless you’re in a country with no grass, or where the government dumps waste on your lawn. RCTs can help to design a perfect program for a specific context, but there’s no guarantee it will work in any other context.

I liked this particular analogy. But an even better one is this:

RCTs may identify a causal connection in one situation, but the cause might be specific to that trial and not a general principle. In a Rube Goldberg machine, flying a kite sharpens a pencil, but kite flying does not normally cause pencil sharpening.

In other words, an RCT may be a useful method for improving a particular intervention, but not for taking that intervention beyond its original scope.

It is true that RCTs can lead to more rigorous analysis. However, this should not lead to claims that they are therefore better than any other kind of analysis. When I worked at the Universidad del Pacifico in Peru in the early 2000s, we calculated that certain food and nutrition programmes had leakage levels of over 80%. We did this by simply comparing the people who received the food from these programmes against those who were supposed to, according to the programmes’ design. This took about 30 minutes to calculate using the national household survey. It did not take much longer to find out, by means of quantitative research, that this leakage was due to the way the programmes were targeted and implemented (relying on grassroots organisations in many cases). This, coupled with the fact that under-five malnutrition had barely been reduced by a single percentage point in over four years (and US$1 billion spent), led to a very convincing argument in favour of reforming food and nutrition programmes. (We also calculated the overlap between programmes, which led to suggestions of which programmes to cut.)

This is not to say that an RCT would not be useful in designing food and nutrition programmes; rather, that to establish what was and was not working we did not need to spend hundreds of thousands of dollars. A few hours on STATA and some good old-fashioned research (interviews, focus groups, site visits, and even testing and tasting the food provided) were sufficient.
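For what it is worth, the arithmetic behind that kind of leakage figure is trivial. Below is a minimal sketch in Python rather than Stata, with made-up variable names, weights, and numbers (the real calculation used the Peruvian national household survey, whose variables differ): it simply computes the weighted share of programme beneficiaries who fall outside the intended target population, plus the mirror figure of how much of the target population the programme misses.

```python
import pandas as pd

# Hypothetical household survey extract: one row per household, with a
# survey expansion weight, a flag for receiving the programme, and a
# flag for belonging to its intended target population.
survey = pd.DataFrame({
    "weight":        [120, 95, 200, 150, 80, 110],
    "receives_food": [1,   1,  0,   1,   0,  1],
    "in_target_pop": [0,   1,  1,   0,   1,  0],
})

recipients = survey[survey["receives_food"] == 1]

# Leakage: weighted share of beneficiaries who are outside the target population.
leakage = (
    recipients.loc[recipients["in_target_pop"] == 0, "weight"].sum()
    / recipients["weight"].sum()
)

# Undercoverage: weighted share of the target population the programme fails to reach.
target = survey[survey["in_target_pop"] == 1]
undercoverage = (
    target.loc[target["receives_food"] == 0, "weight"].sum()
    / target["weight"].sum()
)

print(f"Leakage: {leakage:.0%}, undercoverage: {undercoverage:.0%}")
```

The point is not the tool but the simplicity: the whole exercise is a single pass over the survey data, which is why it took minutes rather than months.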

While Banerjee is right that RCTs force researchers to ask themselves lots of questions about causality, so does any good research.

Of late, I have come across some funders requesting an RCT approach to assessing the impact or influence of the think tanks they fund (or the impact of their support on those think tanks). Some organisations, keen to win the contracts to undertake these evaluations, appear eager to please the funders rather than reflect on what is and is not possible. I think this is misleading and an unnecessary waste of resources. Just as Deaton suggests, it is impossible in this context to establish randomness or causality: two crucial components of any RCT. There are suggestions that quasi-experimental designs could be developed instead. Quasi-experiments do away with the randomisation aspect of the RCT, but they cannot do away with the need to establish causality. In fact, dropping randomisation makes reaching conclusions about causality even harder.

Furthermore, there aren’t sufficient cases to study. A few think tanks per country, all different from each other, targeting different policy communities, by different means, and in different circumstances cannot be appropriately pooled together nor controlled.

But most importantly, a full-on experimental or quasi-experimental RCT won’t offer each think tank anything of use to it and its unique circumstances. And it seems ironic that an effort to support think tanks would be side-tracked by an evaluation that offers them so little of value.