Should researchers get to pick who wins and who loses?

8 August 2019

This article is a challenge statement. It is meant to elicit a reaction. Please do join the discussion.

Evidence Action recently announced that it has decided to close down a project, No Lean Season, in Bangladesh. The project was not delivering its expected outcomes and was fraught with implementation challenges; it was not working. “This is a good thing,” some of their funders have said. In fact, Evidence Action has been praised for shutting it down and, supposedly, accepting its mistakes: if only other NGOs did the same.

But I think there are two problems with this praise. First, this isn’t what the story is about – or what it should be about. Second, Evidence Action do not seem to have accepted responsibility for what went wrong – and I think they (and others that follow their approach) have failed to see the deeply troubling ethical dilemma that their approach implies.

I should say that this article is not about Evidence Action. They are one of many initiatives pursuing a research-as-decision-maker approach that I find troubling and in need of greater discussion. Theirs is simply a recent case and one that has been relatively well documented – by them.

The story really is about the roles of researchers and their accountability

Who is responsible for the outcomes of policy mistakes? What happens when researchers get involved in the implementation of policy ideas that go wrong?

This is a conversation that I have had with thinktankers and their funders across the world who, for many different reasons, are either motivated by the opportunity or concerned by the pressures they face for greater involvement in the implementation of policy ideas. Smart think tanks do not take this lightly: CIPPEC has struggled with this in relation to political reform in Argentina.

This discussion is linked to the often-abused “think and do” label, to the rise of the “we know what works because an RCT says so” mantra and to funding practices that increasingly bundle up research with implementation.

Back in the late 2000s researchers at ODI, where I worked, engaged in a discussion about whether the think tank should get involved in the delivery of projects funded by Aid agencies or limit its work to undertaking research, carrying out evaluations and providing independent advice. Those against the move into implementation argued that ODI would lose its intellectual autonomy. Those in favour argued that it would help ODI learn more about implementation and that it was also a way of saying: we put our money where our mouth is (only, it was not our money). For ODI this meant delivering capacity building projects, managing research programmes in country and, maybe, actively advocating for a policy recommendation beyond traditional research communications.

This was the direction that most funding had been taking – for think tanks in developing countries or for policy researchers in the development field – and continues to take. Funders are not so interested in the generation of knowledge alone. They want the knowledge they fund to inform, influence and even shape policy design and its delivery. They want to measure the income raised or count the children who graduate from high school. Simply suggesting how governments might achieve that is not enough.

As a consequence, they increasingly design, or are driven towards, programmes that include research as a means of or support for delivery – but rarely research as the main purpose.

I had conflicting thoughts about this. At the time, I sided with the implementers at ODI. I have not entirely changed my mind but I have learned that stepping over the line ought to come with greater responsibility and accountability.

This is having an important effect on the nature of researchers’ work. In my opinion, when researchers get involved in these efforts they risk losing the autonomy to study and discuss them critically and openly. Their policy ideas are no longer “just ideas”; they become inputs into a narrative that supports the interventions. They are accessories to their potential success or failure. They are complicit in their welfare gains and losses.

This has never been more relevant than in the context of the experimentation agenda that has been rapidly adopted by funders, governments and researchers alike (well, by some researchers).

Experimentation is a great tactic to convince doubtful policymakers that a policy idea is in fact a good idea – one that is worth putting a lot of resources behind. When advising think tanks on their communication strategies we recommend that their policy arguments should use the evidence generated by others – including evidence of successful pilots or from the full implementation of the policy idea elsewhere.

This is a significant source of power in any policy argument. Peruvian policymakers were happy to copy Chile’s private pension provider model: if the Chileans, who we secretly envy and aspire to emulate, think this is a good idea, then we should think so, too. (Only, maybe, it wasn’t such a good idea.) Discussions about education reform in the UK are peppered with references to the successes of the US or Scandinavian models. (Brits are secretly in love with Americans and openly in love with Scandinavians.) Only to their successes, of course. Failures are never included in the op-eds or TED talks used to argue for reform.

But having positive experiments to refer to requires someone to try them out first. Someone has to develop the idea, test it, learn from it, scale it, record it and promote it elsewhere. Traditionally, this has been the role of government, with researchers and think tanks playing minor, supporting roles – developing the ideas and concepts, nudging, informing, responding, challenging, pointing out mistakes, sharing successes and lots of other small interventions that, collectively, may make a small difference but never fundamentally resolve an issue.

This is because generating positive experiments involves taking on a big political risk – by their nature, you might end up with many failures before arriving at a success. The costs could be political, social and economic: incorporating changes to the school curriculum could alienate a party’s traditional constituency; a mistake in the delivery of new protocols for water safety could lead to unintended illness from poisoning and, rightly, to criminal charges and the figurative death of a few political careers – if not to the death of individuals; and the introduction of a new system to simplify SME registration could lead to unexpected backlogs and costs – which could cost jobs.

So encouraging governments to take on a policy innovation, test it, scale it and test it again has always been hard. And it should be!

RCTs offer funders a very powerful tool to generate these positive experiments and avoid many of the barriers that governments face. An RCT can be done by researchers, with some or no collaboration from government, at a manageable scale. Positive results could be scaled beyond the country through the funders’ and the researchers’ networks – even if the governments involved in the pilots did not scale up the experiences themselves.

Governments can remain free from the backlash of failed experiments. And funders and researchers can claim that their interventions are based on evidence – even if they lack political legitimacy.

In essence it is a partial privatisation of policymaking.

But this new context raises two fundamental moral questions: who is responsible for a private intervention gone wrong, and how can they be held accountable?

This is no longer the traditional relationship in which a “researcher provided a policy idea and the policymaker was free to act on it”. In this old relationship, the researcher could remain (and was right to remain) at arm’s length from the implementation of the idea. Researchers could even steer clear of the welfare implications of a political decision. Sure, their recommendation might negatively affect low-income families, but the decision to actually do so was not the researcher’s to make. It was the policymaker who chose, freely, to adopt the recommendations.

This is a new relationship in which the researcher adopts part of the role of the policymaker. Critically, not as a consultant, who merely delivers what the policymaker requests via well-defined terms of reference and through a contractual relationship that would normally involve the consultant securing liability insurance for when things go wrong, and who has no autonomy to decide whether the intervention goes ahead, is halted or is stopped.

What we now have is a new relationship in which the researchers retain a significant level of agency and have been empowered, by their funders or by willing policymakers, to make choices about the public and their welfare.

If things go well, researchers will be quick to claim success. Equally, when things go wrong they should be quick to accept responsibility.

They may think twice about it, though. When things go wrong in public interventions people suffer. They lose power. They may lose income, or see their health affected, or they may experience a disruption to a service they depend on.

The recent case of Evidence Action brings this discussion to life. Evidence Action had an idea: seasonal migration could increase the income of rural families in Bangladesh. A small grant could help families take the first step and send a family member to a nearby town or city. That member would help raise the family’s income and, after a positive experience, the process would continue in the future without the need for a grant. This might also help discourage permanent rural-to-urban migration. The idea was backed by some evidence but it had to be tested on the ground.

There were expected risks. When this idea was presented to the Peruvian Ministry of Development in 2018 as a successful intervention (although it was still being tested in Bangladesh) which the government could apply in the rural parts of the Amazon region, local researchers were quick to shoot it down. It would certainly involve, among other things, the risk of family and community breakup and of women and children (who would no doubt be among the migrants) falling victim to human trafficking. The most vulnerable in their communities could very well be put at risk of death. In their opinion this was a risk too high to accept. The government agreed.

In their explanation of why the project was closed, Evidence Action acknowledges that one of the risks of the intervention was that families would choose minors to migrate in search of work. According to a recent statement, they considered that the risk could be minimised – yet not eliminated. I am sure they did their best to avoid encouraging the migration of minors, but they knew that this was impossible to guarantee. They calculated, though, that the benefits would outweigh the costs.

Unfortunately,

in January 2019, one of the known risks of the program came to bear in a tragic manner: an overloaded truck fell onto a temporary shelter in which several seasonal migrants who had migrated to work at a brick kiln were sleeping, killing 13 individuals, five of whom were affiliated with households that participated in No Lean Season. Moreover, four of the five were underage males aged 15-17. We were deeply saddened by this incident and the implications for these five No Lean Season households.

And this is the issue. They, researchers, not elected or appointed policymakers, made that calculation: a few lives (maybe only just one) against the potential income increase for many, many others.

Minors are logically exposed to greater risk when they travel on their own. And a politician would know that he or she would be held accountable for the deaths of underage migrants if his or her department was running the programme. Politicians may avoid criminal charges but they would certainly face political costs. Policymakers running the programme could very well be charged with negligence. And if there was any negligence on the part of a front-end civil servant or private contractor, they could face criminal charges.

But, in the case of Evidence Action, this kind of accountability is not being considered. Are they responsible for the deaths of the underage migrants who used their project grants to migrate? How can they be held accountable?

Whose fault was it?

Evidence Action argue that the idea works but that it was poorly implemented; this, they say, is why the results of their intervention have not been positive (have they been negative, then?), and the deaths of underage migrants are a tragedy, of course, but something they worked really hard to avoid.

However, unlike the traditional researcher who offers advice for others to consider, they cannot claim that this is beyond their responsibility.

We could draw a parallel with clinical trials – which is fitting given the origin of RCTs. Test cases assume a certain degree of risk. A person with an illness agrees to participate in a trial aware of the risks – which are explained to them, and they have to sign lots of release forms that leave very little doubt about the process. A patient may react poorly to a new medicine, get worse or even die. If this happened the researchers would certainly feel sorry but could not be held responsible for the death. This is why the trial exists and, surely, before the medicine was tried on humans it would have gone through other research phases, including lab and animal testing. So, really, they would have done everything possible to avoid this outcome and everyone would have been aware of the risks.

Furthermore, testing on humans would not just happen in a policy vacuum. This is a highly regulated field.

But what would happen if the dose provided to a test case was incorrect or if the treatment was botched by the researcher, medical doctor or nurse charged with administering the medicine? Could they claim that the subject had waived their rights? No. They (and the hospital or medical facility involved, even) would be responsible. And family members could certainly sue.

Evidence Action appear to argue that they are not responsible for the poor implementation of the intervention (which they claim to be the reason for the poor results – although it is not clear if this is a reason to halt the project). But, in my view, they are. They were in charge (sufficiently so to decide to shut down the project): the partner they refer to was only a local contractor, contracted by them to deliver on their behalf, not a partner with its own agency over the project.

This isn’t an arm’s-length relationship. This is a formal contractual relationship with one side calling the shots.

Since they would have been quick to claim ownership for a successful outcome, in my opinion, they are responsible for what happened. At the very least “politically” responsible.

But how could they be held accountable?

In business, people say that they put their money where their mouth is. This makes sense when you risk your own money. If your business model is wrong, the market will tell you by driving you out of business.

Too-large-to-fail businesses have gotten away with putting other people’s money (via bailouts) where their mouths are. To some degree, this happens in the Aid sector, too.

Failure is either brushed under the carpet and hidden, or accepted as something to be expected given the complex nature of development, or even celebrated as a learning opportunity (and, by the way, we are all for learning). This is partly because some funders have been too quick to celebrate any idea that promises too-good-to-be-true results.

Rarely does one hear of a researcher, a think tank, an NGO or a consultancy that has been effectively punished by the sector – the market – or directly by their funders.

A very delicate balance must be struck, of course. We do not want to stifle innovation in research and research funding, but nor do we want to foster a culture of impunity.

In this context, funders play (or should play) a critical role. If they are going to support an intervention in the lives of the public (and whether they should is another issue) they must demand that their grantees acknowledge the level of risk and responsibility that they are willing to take, and their funding conditions should reflect that. For example: if you are comfortable making a call on the lives of others (especially vulnerable people), then accept that failure should come with significant business, career and/or reputational costs.

Furthermore, experiments should not be presented as successes until they have been proven, beyond doubt, to work – and have been replicated many times over. Researchers and their funders have been too quick to celebrate an interesting idea before it has been proven right: this was the case with No Lean Season, which was being advocated for across the world – in Peru, for instance – while the trial was still going on.

Failures should be quickly reported, too, and not used to promote the greater scale of an idea. Peru’s MINEDULab has been hailed in conferences all over the world as a successful model to scale interventions piloted at the local level. Since its inception it has piloted several interventions through a partnership with JPAL and IPA. The RCTs have been successful – we trust. But according to a senior member of the team, none have been scaled. There has been zero demand from the Ministry of Education or any other public body to scale any of the successful pilots.

Since this is the whole purpose of the MINEDULab, it ought to be safe to say that it has not delivered what it was designed to do – for whatever reason: poor design, poor delivery, poor strategy, etc. But this is not how it is described by its managers, funders and champions. And in the meantime more public funds are being allocated to it and other countries are encouraged to copy the model.

“It did not work, but it works”, seems to be the message from Peru and Bangladesh.

Only, it hasn’t. How much longer does it have to not work as expected before it is brought to an end? How many more underage migrants need to die before researchers finally judge that their innovation should not be allowed to fail again – and that the idea should be put to rest? How many more lives or careers need to be interrupted by experimentation before researchers recognise that this is a risk too big to take – or, if they wish to take it, that they must accept responsibility when they fail?

Have they accepted their mistakes? Have they learned from them?

As I wrote above, I do not want to pick on Evidence Action. Theirs is not the only case (and the world of social entrepreneurs is full of similar situations that merit greater scrutiny) but it is one that has been slightly better documented and advertised in recent months. So I hope it serves as a source of important lessons – as I am sure they would want.

It is easy to hide behind jargon and a narrative of good intentions and learning. Here are some excerpts from Evidence Action’s statement that I wish to unpack to address whether there have been any lessons learned so far.

When I first read it I was left with an uneasy feeling, and I shared it with others to make sure I was not exaggerating. The language was highly technical, über-economic, legalistic, turning individuals, families and communities into test subjects. I was worried further when I noticed that several of their supporters had joined their argument, focusing on the act of accepting that the experiment had not worked and closing down the project, and missed, altogether, the people at the centre of the story.

In the first excerpt, below, they weigh up the purpose of the project and their organisational vision. The organisation wins. It isn’t about you, they say, it is about us and what we want to achieve.

Importantly, we are not saying that seasonal migration subsidies do not work or that they lack impact; rather, No Lean Season is unlikely to be among the best strategic opportunities for Evidence Action to realise our vision.

I would have expected, instead: “…rather, No Lean Season is unlikely to be the best strategic opportunity to help poor rural communities”; but these poor rural communities in Bangladesh were never the focus of their intervention. They were chosen for their statistical significance, not their moral desert.

In the following two excerpts they go on to explain how the death of underage migrants was “a known risk”, how the underage migrants who died were part of the project – and travelled with grants provided by the project – but how “there is no evidence to suggest such a causal relationship.” But the causal relationship they refer to is a technical trick to fool the reader (and a morally dubious use of a research method). They argue that they cannot say that their grants led to more underage migration. More is the key word. However, the fact remains that their grant funded the migration of the young people who died.

Statistically speaking, a certain number of deaths will be caused by dangerous driving every year, all things being equal. A single driver driving dangerously (for whatever reason) and killing one person may not change the overall number of deaths, but would still be responsible for that death. Running over someone is a known risk of driving. We all accept this and assume the consequences.

Separately, in January 2019, one of the known risks of the program came to bear in a tragic manner: an overloaded truck fell onto a temporary shelter in which several seasonal migrants who had migrated to work at a brick kiln were sleeping, killing 13 individuals, five of whom were affiliated with households that participated in No Lean Season. Moreover, four of the five were underage males aged 15-17. We were deeply saddened by this incident and the implications for these five No Lean Season households. Although this incident did not drive our decision to end the program, we want to share an update on it because the investigation we launched into the incident has recently concluded.

In parallel, we analyzed both program administrative data and data from past and ongoing RCTs of the program to understand the scope of underage migration in the program and, importantly, the causal impact of the program on underage migration. We recognize that underage migration occurs in Bangladesh, where severely limited options may lead a deeply impoverished family to believe that sending a teenage son to the city to find work is their best available option. As such, we analyzed the RCT data to understand whether the program may have inadvertently led to additional underage migration beyond that which would have occurred in the absence of the program. We have concluded that, across RCTs conducted in 2014, 2017, and 2018, there is no evidence to suggest such a causal relationship.

Finally, they appear to whitewash the seriousness of what happened and use two old excuses: “oh, but we mean well” and “it is for the greater good”. Well-meaning people cannot get a free pass to take other people’s lives in their hands and not face up to any serious consequences, and, I think we can all agree by now, the end does not justify the means.

We also want to ensure that the takeaway is not that if a program faces challenges, an NGO should walk away from doing work that measurably improves the lives of tens or hundreds of thousands of people.

We are not talking about a training programme that failed to deliver more pass grades at a local school or a text message scheme to get more teachers to attend class. Those can be disruptive to people’s lives but they are certainly not life-threatening. This was a grant that was meant to induce a behaviour which involved expected risks to the health, safety – and lives – of minors.

This is not just a challenge researchers face on a regular day. In this experiment people could die, and some did. An NGO should walk away from this. And its funders should walk away, too. And they should think hard about what they do and why they do it. These decisions, about who wins and who loses, should not be for private agents to make without serious consequences.

So what?

I am not advocating for putting an end to experimentation. It can serve very good purposes – but not all. Neither am I suggesting that the Evidence Action team should be locked up, by the way.

What I am advocating for is greater clarity in the roles and responsibilities of private funders and private agents (research centres, think tanks, consultancies, etc.) and for an adequate level of accountability for their choices.

I am also advocating for a greater demand for evidence before supporting and celebrating the potential of an idea. Funders can play a key role here. Don’t celebrate grantees because they are your grantees. Rather, challenge them and hold them to account.

But governments, local academia, think tanks and civil society need to step in, too. Most of these cases are being led by international (Northern-based) teams. They need to be better vetted, monitored and evaluated.

I think that think tanks should pursue the implementation of their ideas. But they must do this by working with and strengthening other institutions to do their jobs and not by taking over from them. Deciding who wins and who loses, especially among the most vulnerable, is certainly not their job.