You may not be able to.
A rather extreme statement, particularly from an organization that does program evaluation. Nonetheless, we stand by it. It’s hard to get far into a discussion of the notion of proof without talking at least a bit about Karl Popper’s philosophy of science and his key idea of falsifiability. For Popper, it is always easy to find evidence (legitimate evidence, we will add) to support a general idea, in this case the idea that your program works. If a single instance can be found where the idea doesn’t hold up, however, the idea is disproven. The problem is that it is impossible, even in principle, to test every possible instance; there is an infinity of them. We sometimes try to get around this through inferential statistics, all those t-tests and ANOVAs you remember from your statistics course, which tell you that results like yours would have arisen by chance less than five percent of the time. But what about that remaining five percent? You’re out of luck. For Popper, science, and knowledge generally, advances only as ideas are repeatedly tested by scientists (or evaluators, though the field was in its infancy when Popper wrote The Logic of Scientific Discovery) and remains, in any event, always provisional. A recent article in the New Yorker in fact shows that a few scientists are beginning to question even this. And then there’s this one from the New York Times.
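That remaining five percent is not a quibble. A quick simulation makes it concrete (a purely illustrative sketch in Python; the numbers here are ours, not drawn from any actual evaluation): when a “program” has no real effect at all, a conventional two-sample t-test will still declare it significant about one time in twenty.

```python
import random
import statistics

def welch_t(a, b):
    """Welch's two-sample t statistic (unequal variances)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    se = (va / len(a) + vb / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se

random.seed(0)
trials = 2000
false_positives = 0
for _ in range(trials):
    # Treatment and control groups drawn from the SAME distribution:
    # any "effect" the test detects is pure chance.
    treatment = [random.gauss(0, 1) for _ in range(50)]
    control = [random.gauss(0, 1) for _ in range(50)]
    if abs(welch_t(treatment, control)) > 1.96:  # roughly p < .05
        false_positives += 1

rate = false_positives / trials
print(f"false positive rate: {rate:.3f}")  # hovers around 0.05
```

Run enough studies of programs that do nothing and a handful will still come back “proven effective,” which is exactly why Popper insisted that findings remain provisional until they survive repeated testing.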
So how does policy advance? We were involved recently as the local evaluator on a federal project designed to assess the effectiveness of providing certain kinds of support to low-income married couples to help them keep their marriages together. There were a little over a dozen sites scattered throughout the country, each of which had a local evaluator. A national evaluation organization was tasked with integrating the findings of the local evaluations and producing a report on the overall efficacy of the idea. We believe that’s how funders should conceive of evaluation, but they almost never do.
Funders certainly have a legitimate responsibility to fund organizations that do good work (as opposed to those that do good works). The problem is that they have somehow come to believe that good work and statistically significant outcomes based on randomized controlled trials are the same thing. How this came about would make for a fascinating study. Such a study aside, the fact is that most well-managed social programs have positive outcomes. But they have them only for some of those they are designed to serve, and then only under very specific circumstances. Evaluation is useful when it can help nonprofits determine for whom, and under what circumstances, their programs work.
Let’s put it another way. Which organization would you rather fund: the one that can document good average results, or the one that recognizes that successes are always provisional, that there is a great deal to learn from occasional failure, and that ongoing data collection and assessment are the best path toward continuous improvement?