Blog: Let’s stop chasing unicorns

By Patty Lozano-Casal, Evidence into Action Manager, Evaluation Support Scotland

I’ve always had an active imagination. I remember as a child believing that fairies spoke to me through the sound of the wind. Children believe in things that they’ve never seen, like fairies, superheroes, dragons and unicorns. They don’t have any evidence of their existence, yet they think they’re real. As adults we no longer search for the philosopher’s stone; instead, we look for robust evidence before we form a view or make a decision. So I find myself wondering: at what age do we lose the ability to believe in what we can’t see? And is that a good or a bad thing?

In the ‘evaluation world’ some people speak of Standards of Evidence, which are frameworks to help us determine how confident we can be that an intervention is having a positive impact.  The first Standards of Evidence for the UK were developed by Dartington Social Research Unit (DSRU); others like Project Oracle and NESTA use adaptations of these.  

So there are different sets of standards out there, but do they work in practice or are they mythological creatures like unicorns?

Let’s look at some evidence. Project Oracle offers its funded projects the opportunity to have their evaluations validated against its own set of standards for £400–£750, depending on which standard they choose.

So far, Project Oracle has completed 296 validations, none of which found robust evidence at the ‘model ready’ or ‘system ready’ levels. Most evidence provided by the 296 projects fell under standards 1 and 2 (i.e. ‘project model and evaluation plan’ and ‘indication of impact’). So what does this tell us?

I was recently invited to share ESS’s experience on this topic at a panel debate on the Standards of Evidence at the Realising Ambition conference on 1 June 2016. What I heard that afternoon resonated with what ESS has learned from supporting third sector organisations and funders in Scotland for over 10 years.

Realising Ambition funded projects made very clear that existing standards of evidence are too theoretical and not always applicable in practice, particularly for small third sector organisations with limited capacity, expertise and/or resources. So could the Standards of Evidence as we know them really be dead? Louise Morpeth, DSRU’s Chief Executive, was open and honest about the limitations of the Standards, one being their linearity, which does not translate to reality – the generation of evidence is actually an iterative process. Jonathan Breckon, Head of the Alliance for Useful Evidence, noted that “not all evidence is born equal”, and Tim Crabbe, using holiday planning as an analogy for how decisions are made in practice, said that we sometimes rely on sources other than the standards of the market (e.g. TripAdvisor) to decide where and how to go on holiday (read Tim’s entertaining blog for more details!).

So if the Standards of Evidence are indeed dead, what could we use to assess the quality and robustness of our evidence, and so make informed decisions about service delivery and policy-making?

ESS suggests the following:

  1. Rather than an absolute standard, consider:
     - Why do we need evidence?
     - What will we use it for?
     - What do we already know?
     - What evidence is good enough (for the action we need to take)?
  2. Don’t look just for evidence of whether something works, but for how it works, how long it takes and how much it costs.
  3. Triangulate your evidence – good evidence-informed decision-making needs evidence from a mix of sources (so not a hierarchy!), including:
     - what people on the front line tell us (i.e. tacit knowledge)
     - what self-evaluation evidence tells us
     - what formal research tells us.
  4. Co-produce and involve the service user when generating evidence.
  5. Use the following framework, adapted from Levitt et al. (2010), to judge whether your evidence is ‘good enough’:
     - Transparent: clear methods, acknowledged limitations
     - Relevant: up-to-date and appropriate
     - Enough: strength of evidence vs proportionality
     - Believable: accurate, representative and reliable
     - Legitimate: coming from the right sources.

If, after reading this, you still think you and/or your organisation are chasing unicorns, why not read the Evidence for Success guide and the Realising Ambition programme insights on what works in replication and evidence use?

I’d better go now. Hold on, where’s my dragon…?

Patty Lozano-Casal, Evidence into Action Manager, Evaluation Support Scotland


Jonathan Breckon, The Alliance for Useful Evidence

I don’t think we can even begin to suggest that standards are dead. It’s just empirically not the case! Indeed, we have the opposite problem: there are too many, and the number keeps growing. If they are unicorns, then they are everywhere and breeding fast!

For instance, we have the DFID guidance on evidence standards, Nesta, the Early Intervention Foundation, NPC, UK Active, Project Oracle, the Education Endowment Foundation, the Cabinet Office Centre for Social Action, the BIG Lottery pilot, etc.

What’s really interesting is that they are starting to get more nuanced and sophisticated. For example, the College of Policing standards use context as one criterion and give more of a nod to realist evaluation (the Public Policy Institute for Wales / What Works Centre for Wales is also likely to take up a version of these standards). BOND, the UK umbrella body for international development NGOs, has a set of Evidence Principles that are very popular with the evaluation community (and include ethical and inclusion categories too).

These are being used right now on the ground. And that’s just in the UK! GRADE, the Maryland Scale and others continue to dominate – and not just in the medical field. I can see a criticism of those as too narrow, but I can’t say they are dying or illusions.

The Dartington standards have also, for me, evolved in a fascinating way that makes them even more useful and influential, as seen in this webinar that Louise Morpeth gave. There seems to me, however, to be a clear consensus amongst us all that we need to reform what’s currently out there and bring more alignment to the growing number of ‘standards’. As somebody at RAND Europe told me, it feels like an ‘arms race’ of new standards launching.

The issues these ‘standards’ confront are also as vital as ever, and growing (perhaps because there has been a culture change towards more arguments being made with evidence, though nobody has measured that).

Even without the frameworks (the British Standards Institution has reminded us that we don’t have ‘standards’ in any official or legal sense, only private frameworks), there is still a vital principle that ‘not all evidence is born equal’: some designs, methods and data-collection approaches are more appropriate, proportionate and relevant than others, and judgements have to be made about which evidence is better.

For instance, it was fascinating to hear ASH Scotland show how important the quality of evidence was in the debate over plain packaging of cigarettes. The tobacco lobby was trying to fight it with its own version of evidence, so making the case for plain packaging needed the breadth of longitudinal studies, cross-country comparisons, RCTs, quasi-experimental designs and qualitative interviews – rather than one monolith of ‘evidence-based policy’, as if all evidence were of the same strength and relevance.

I hope that is helpful in some way, and that there can be some recognition that the standards are very much the opposite of dead!

Patty’s response to Jonathan:

First of all, thank you for taking the time to write your response. I agree that people are using different versions of The Standards, and I guess what I’m saying is that one size does not fit all – a ‘standard’ approach is a bit like a unicorn (most third sector organisations will never generate the ‘gold standard’ of evidence, and neither should they, as it wouldn’t be proportionate at all).

So, I agree that people use sets of standards, but I often wonder whether these are created based on a hierarchy of ‘quality of evidence’ or on the robustness of the methodology. Don’t get me wrong: I understand that the chosen method has a lot to do with the quality of the evidence generated, but I also know that there are multiple parameters that can affect the efficacy of a method (particularly when it comes to sampling), and that the higher levels of evidence require specific sets of skills, experience and quite a lot of money and time.

I think The Standards give people a framework to think their evidence generation through, and that’s great, but ultimately we need to generate evidence that is proportionate, relevant and useful, and that answers the questions we ask.

We held an event for funders in Scotland yesterday, and it was very clear from the discussions that ‘trust’ is a big element when it comes to the evidence funders are willing to accept from funded projects. Most funders involved in these discussions around evidence didn’t know about the existence of The Standards, so perhaps what I mentioned in my blog only applies to Scotland.