Development programs and policies are intended to improve people’s quality of life. Evidence-based policy making aims to identify the most appropriate intervention, track a program’s implementation, and measure whether the intended impacts were actually achieved. Generating rigorous evidence, in turn, can enhance accountability, inform budget allocations, and guide policy decisions. To facilitate the implementation of the development effectiveness agenda, the IDB has institutionalized the practice of evaluating IDB-financed operations through a variety of evaluation methodologies.
Most of the project evaluations carried out by the IDB fall into one of three categories: cost-benefit analysis, cost-effectiveness analysis, or impact evaluation. A cost-benefit analysis quantifies the costs and benefits of a program in monetary terms, while a cost-effectiveness analysis compares the costs of similar interventions for achieving a desired outcome. Impact evaluations assess changes in outcomes that are directly attributable to a program; this emphasis on causality and attribution is their hallmark.5
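The distinction between the two cost-based approaches can be made concrete with a small numerical sketch. All figures and intervention names below are invented for illustration; they do not come from any IDB evaluation.

```python
# Hypothetical illustration: two interventions that raise school attendance,
# compared first in cost-benefit terms (everything monetized) and then in
# cost-effectiveness terms (dollars per unit of outcome). Figures are invented.

def net_benefit(costs, monetized_benefits):
    """Cost-benefit analysis: monetized benefits minus costs, in dollars."""
    return monetized_benefits - costs

def cost_effectiveness(costs, outcome_units):
    """Cost-effectiveness analysis: dollars spent per unit of outcome."""
    return costs / outcome_units

# Intervention A: school feeding; Intervention B: conditional cash transfer.
a_cost, a_extra_attendance_days = 100_000, 40_000
b_cost, b_extra_attendance_days = 150_000, 50_000

print(cost_effectiveness(a_cost, a_extra_attendance_days))  # 2.5 ($ per day)
print(cost_effectiveness(b_cost, b_extra_attendance_days))  # 3.0 ($ per day)

# Cost-benefit instead requires putting a dollar value on the outcome,
# e.g. valuing each extra attendance day at $4:
print(net_benefit(a_cost, a_extra_attendance_days * 4))     # 60000
```

Cost-effectiveness analysis avoids the hardest step, monetizing the outcome, which is why it is often preferred when comparing similar interventions.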
Some 44.5 percent of IDB public sector projects approved by the IDB’s Board of Directors in 2015 planned to use impact evaluations (experimental or quasi-experimental methodologies) to evaluate some component of their interventions, while 47 percent proposed cost-benefit and cost-effectiveness analysis (Figure 3.1).6
As mentioned, impact evaluations are distinct from other forms of program evaluation because they can provide empirical evidence of the causal effects of programs or policies on important outcomes. Only impact evaluations allow policy makers to verify whether specific outcomes can be attributed to their programs or projects: in other words, whether programs are fulfilling their objectives. The portfolio of impact evaluations to date consists of a total of 478 evaluations, which includes not only evaluations of IDB-financed operations, but also evaluations of non-IDB supported programs. The IDB is often approached for its expertise to evaluate interventions carried out by its strategic partners.
As of December 2015, 46 percent (221 evaluations) of all evaluations were in the design stage and 16 percent (76 evaluations) had been concluded (Figure 3.2). About 2 percent of impact evaluations have been cancelled. As discussed in the DEO 2014, difficulties and unforeseen circumstances throughout program implementation have been the main reason for cancellation.
In terms of sector priorities, the majority of evaluations have been carried out by the Social Policy for Equity and Productivity sector (50 percent), followed by the Institutions for Growth and Social Welfare sector (29 percent) (Figure 3.3).
Since the IDB established its Development Effectiveness Framework (DEF) in 2008, the percentage of approved public sector projects that have been or will be subject to an impact evaluation has increased considerably (Figure 3.4).
To date, 351 impact evaluations of IDB-funded projects have been designed, completed, or are under way.7 Of the 83 sovereign-guaranteed loan operations approved by the IDB in 2015, 36 (43 percent) planned an impact evaluation, compared to only nine projects (9 percent) approved during the first year of the DEF in 2008. In the past two years, the percentage (and number) of approved projects that are subject to an impact evaluation has decreased.
Since impact evaluations are costly, it is important to direct those resources toward projects for which they are most beneficial. Projects where substantial knowledge gaps have been identified, pilot projects that could eventually be scaled up, and large projects where accountability is critical, merit evaluations. The design of an impact evaluation needs to take into account not only the program’s desired impact, but also factors such as data availability, timing, logistics, and cost.
Since the DEF was established, both the Bank and its country counterparts have systematically considered the possibility of carrying out an impact evaluation early on, during the design of each project. This has increased awareness of the benefits of impact evaluation, and, by ensuring an appropriate evaluation methodology, has strengthened the commitment of all parties involved. For example, some evaluations need to identify the comparison groups before a project even begins. Moreover, the decision on how, when, and what to evaluate can vary from project to project. For example, not all the evaluations supported by the Bank and its borrowing member countries can use randomized controlled trials, considered the most rigorous evaluation methodology.
Randomization involves assigning potential beneficiaries to a treatment or comparison (control) group. This ensures that the groups are on average the same, and that effects attributable to factors beyond the scope of the project can be separated from effects attributable specifically to the project. In some of the projects featured in the following pages, excess demand for the services being offered facilitated the use of randomization. The evaluation in Chile, for example, randomly offered admission to an after-school program for children ages 6 to 13, while the evaluation in Venezuela randomly offered early admission to the country’s most renowned youth music program.
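The mechanics of a randomized evaluation can be sketched in a few lines. This is a minimal simulation with invented data, not the Chile or Venezuela evaluation: applicants are randomly split into treatment and control groups, and the impact estimate is simply the difference in mean outcomes.

```python
import random
import statistics

random.seed(42)

# 1,000 hypothetical applicants; excess demand lets us randomize who is admitted.
applicants = list(range(1000))
random.shuffle(applicants)
treatment = set(applicants[:500])   # randomly offered the program
control = applicants[500:]          # not offered (comparison group)

# Simulated outcomes: a noisy baseline score, plus a true effect of +2
# for treated applicants (the effect the evaluation should recover).
outcome = {
    i: random.gauss(10, 3) + (2 if i in treatment else 0)
    for i in applicants
}

effect = (statistics.mean(outcome[i] for i in treatment)
          - statistics.mean(outcome[i] for i in control))
print(round(effect, 2))  # close to the true effect of 2
```

Because assignment is random, the two groups are balanced in expectation on both observed and unobserved characteristics, so the simple difference in means is an unbiased estimate of the program’s effect.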
Quasi-experimental methodologies, including regression discontinuity design and propensity score matching, provide alternatives to randomization for identifying a control group.
A regression discontinuity design employs a specific program eligibility requirement to distinguish the treatment group from the counterfactual: that is, a comparison group that allows evaluators to estimate what would have happened, whether positive or negative, had the program not been implemented. For example, an agricultural program aiming to eradicate a fruit fly plague was implemented only in select geographical areas of Peru because of funding limitations. A control group was identified from among farmers in neighboring areas who were confronted by the same plague and grew the same crops, but were ineligible for the program because they did not reside in the area where the program was being implemented.
In the case of propensity score matching, the control group is constructed by searching among a sample of non-beneficiaries whose observable characteristics are similar to those of program beneficiaries. In other words, based on the available information, a non-beneficiary “clone” is matched to each beneficiary of the intervention. For example, firm characteristics such as size, employment, and sector helped identify “clones” in an evaluation of a productive development program in Argentina, but firms differed in characteristics such as their motivation to participate in the program.
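The matching logic can be sketched as nearest-neighbor matching on an observable score. In this simulated example (invented data, not the Argentina evaluation), a single index stands in for the estimated propensity score; each beneficiary is paired with the non-beneficiary “clone” whose score is closest, and outcome differences are averaged across pairs.

```python
import random

random.seed(1)

def simulate_firm(treated):
    score = random.random()   # index summarizing observables (size, sector, ...)
    outcome = 5 * score + (3 if treated else 0) + random.gauss(0, 1)
    return score, outcome

beneficiaries = [simulate_firm(True) for _ in range(200)]
pool = [simulate_firm(False) for _ in range(2000)]   # potential clones

# Match each beneficiary to its nearest non-beneficiary clone (with replacement)
# and average the outcome differences across matched pairs.
diffs = []
for score, outcome in beneficiaries:
    clone = min(pool, key=lambda f: abs(f[0] - score))
    diffs.append(outcome - clone[1])

att = sum(diffs) / len(diffs)   # average treatment effect on the treated
print(round(att, 2))  # close to the true effect of 3
```

The sketch also makes the method’s limitation visible: matching only balances *observable* characteristics, so unobserved differences, such as a firm’s motivation to participate, can still bias the estimate.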
All the knowledge the IDB has acquired on program evaluation since 2008 has been centralized on a web platform known as the IDB’s evaluation portal (www.iadb.org/evaluationportal). This site contains a variety of resources, tools, and guidelines which users can consult and adapt to their specific needs. In addition, in 2015 the IDB developed a module within the Bank’s platform to monitor sovereign guaranteed (SG) operations to track all inputs of impact evaluations. All working papers related to impact evaluations are published online on the IDB’s publication site. The IDB has also developed a series of seminars and courses, known as the Development Effectiveness Series, to foster knowledge sharing on program evaluation among IDB employees and government authorities. IDB specialists incorporate all this knowledge into IDB operations when designing future projects.
After years of implementing program evaluations and documenting and sharing the findings, the IDB has identified recurring obstacles.
Teams are often faced with serious challenges when conducting an impact evaluation, and different factors influence the success of an evaluation. Box 3.1 presents five lessons that have been learned about how to best evaluate the impact of a development project.
This chapter showcases a representative selection of 12 recently concluded project evaluations on a wide variety of topics that use different methodologies in specific contexts. They highlight the shared interest of both country authorities and the IDB in learning what works and what does not work in order to improve the effectiveness of the development interventions they support as well as related policymaking. The first evaluation is from Haiti, where the project team faced several challenges in performing a rigorous cost-benefit analysis of an infrastructure project. It is followed by accounts of 10 experimental and quasi-experimental impact evaluations, starting with an evaluation of an intervention to support Argentine firms and ending with an evaluation of a widely recognized music program in Venezuela. Chapter 3 closes with an evaluation that uses a different type of methodology – a meta-analysis – to assess the use of technology in the classroom. This example shows how analyzing rigorous impact evaluations on a particular topic, but in different contexts, can yield recommendations for best practices for designing and implementing a program. In other words, these stories illustrate not only whether a program has an impact, but also how to make programs more effective.