Impact/outcome evaluation designs and techniques illustrated with a simple example

Introduction

[Note: This article is still being developed. Please post any comments for improving it at the end of the article].
This article works through a simple illustrative example to show the range of possible impact/outcome evaluation designs, and the techniques that can be used to improve the similarity between comparison and intervention groups in such designs. Impact/outcome evaluation is one of three types of evaluation. It helps work out whether or not changes in high-level outcomes can be attributed to a particular program or intervention; in contrast, formative/developmental evaluation helps to optimize program implementation, and process evaluation describes the course and context of a program. Whether an impact/outcome evaluation is appropriate, feasible or affordable for a particular program needs to be carefully considered in light of the program's characteristics and context (see: Impact evaluation: When it should and should not be used). For how to select which type of impact/outcome evaluation design to use, see: Selecting impact/outcome evaluation designs: A decision-making table and checklist approach; for descriptions of the seven possible impact/outcome evaluation design types, see: Seven possible impact/outcome design types; and for techniques for increasing the similarity between comparison and intervention groups, see: Techniques to improve matched comparison group impact/outcome evaluation designs.
For each impact/outcome evaluation design, a checklist is provided which sets out some of the important aspects that need to be considered in order to assess the appropriateness, feasibility and affordability of the design. These checklists are from the article: Selecting impact/outcome evaluation designs: A decision-making table and checklist approach.
It should be noted that not all of these designs (particularly the last two) are accepted by all stakeholders as providing robust information for attributing changes in high-level outcomes to particular interventions; however, some stakeholders in some circumstances accept them as providing useful information.
 

The example

The example used here is deliberately very simple, to make it easy to clarify the different types of impact/outcome evaluation design and the techniques which can be used within some of those designs. The restaurant theme originated because the example was developed by the author while out at dinner during an evaluation clinic at which he was an expert evaluation adviser. The fictitious example is:

A number of restaurant patrons are banging their heads as they leave particular restaurants which have low door arches. This is obviously a problem for restaurant owners because patrons are going away from their restaurants with headaches. An educational program is being introduced into a restaurant to decrease the number of patrons who bang their heads. The educational program consists of staff showing a group of patrons a video of the correct procedure for bending their heads when going through the restaurant door. It has been decided that an impact/outcome evaluation should be undertaken. For the purposes of this example it helps to assume that the same patrons dine at the restaurant every night.

Each of the seven impact/outcome evaluation designs, and each of the four techniques for making comparison groups more similar to intervention groups, has been applied to the example and is set out below.

True randomized experimental designs 

Setting up a true experiment for the example would consist of taking the following steps:

  1. Select a group of patrons from the total number of patrons in the restaurant using a random selection method (e.g. by putting the names of all of the patrons in the restaurant in a hat and drawing out the number needed). The random selection method is used to make sure that the patrons in the overall sample are likely to be similar to the average patron. This overall sample would then be divided in two, again using a random selection method. There would then be two groups which should be similar to each other and also similar to the average patron. (A minimal sketch of this random selection and assignment is given after this list.)
  2. Give the intervention group the intervention (e.g. take them into a separate room and have a staff member show them the video on avoiding head-banging). Ideally the intervention should be observed by a process evaluator to make sure that the patrons actually receive the intervention (e.g. does the staff member bother to turn the video on? Do the patrons actually watch it, or do they just talk amongst themselves?).
  3. When patrons leave, have someone at the door observe how many of the control and intervention groups bang their heads. Compare the results for the two groups. Use statistical testing to work out if the results are likely to have occurred by chance or because there actually is a real effect from the intervention.
  4. A variation on the true experiment is to use what is called a ‘waiting list’ or ‘pipeline’ design. This is where exactly the same procedure is used as in 1-3 above. However the intervention is only withheld from those in the control group for a limited period of time (the time they spend on the ‘waiting list’). This is in contrast to a true experiment where the control group would never get the intervention. This design is often regarded as more appropriate (because it is more ethical in that the control group does not miss out on the intervention, and because the control group is less likely to suffer from effects of being in the control group e.g. getting depressed at missing out on the intervention, which could potentially influence their outcomes) and more feasible (because participants and stakeholders are more likely to accept it) than true experiments. The problem with the design is that you need to be able to measure the effect of the intervention in the time between the intervention group getting the intervention and the control group getting the intervention. This would be possible in this example, however in the case of other types of interventions (and interventions which take a relatively long time to improve outcomes) there can be major problems with a waiting list/pipeline experimental design.
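To make steps 1 to 3 concrete, here is a minimal sketch in Python of the random selection, random assignment and end-of-night comparison described above. The patron names, group sizes and head-banging probabilities are all invented purely for illustration.

```python
import random

# Hypothetical list of the patrons dining at the restaurant tonight.
patrons = [f"patron_{i}" for i in range(1, 101)]

# Step 1: randomly select the overall sample ("names out of a hat"),
# then randomly split it into an intervention group and a control group.
random.seed(42)                      # fixed seed so the sketch is reproducible
sample = random.sample(patrons, 40)  # overall randomly selected sample
random.shuffle(sample)
intervention_group = set(sample[:20])
control_group = set(sample[20:])

# Step 3: at the door, record who bangs their head on the way out.
# The outcomes below are invented; in a real evaluation they would be observed.
banged_head = {
    name: random.random() < (0.15 if name in intervention_group else 0.35)
    for name in sample
}

intervention_rate = sum(banged_head[p] for p in intervention_group) / len(intervention_group)
control_rate = sum(banged_head[p] for p in control_group) / len(control_group)
print(f"Intervention group head-bang rate: {intervention_rate:.2f}")
print(f"Control group head-bang rate:      {control_rate:.2f}")
```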
A number of factors need to be taken into account in an experimental design, and in several of the other designs described below. These are:

Treatment fidelity

Treatment fidelity is the extent to which the program has been faithfully implemented in a particular evaluation. Imagine that, instead of showing the video on avoiding head-banging, the staff member just discussed with the patrons something irrelevant to avoiding head-banging, or the patrons did not bother looking at the video but talked with each other instead. The experiment would then not be a true measure of the effectiveness of the intervention, because the intervention had not been implemented properly. Treatment fidelity is assessed by having someone observe the intervention as it is being given. Sometimes this is undertaken as part of what is called the process evaluation component of an overall evaluation.

Placebo effect

In the case of some programs (not so much this particular example) the mere fact that a person has received an intervention of any type can improve their outcomes. The best example of this is when patients are given drugs as part of a drug trial. It has been shown that merely being given a drug (even if it is a sugar-coated pill) can, in a number of cases, result in the patient getting better, or at least feeling better. If this were thought to be an issue for the example program being used here, the control group could be given a placebo. For instance, this could take the form of a staff member showing them a general video which did not mention avoiding head-banging but which would be perceived by the patrons as an intervention. Alternatively, a three-group experimental design could be used, with an intervention group given the intervention, a placebo group given just the placebo, and a control group given nothing. This would mean that the size of the placebo effect could be worked out (by comparing the results for the placebo group with those for the control group).

Control group compensatory treatment

This occurs where, because the experiment is being conducted, either the control group itself, or someone else, gives them a version of the intervention because they think that it is unfair for the control group to miss out on what the intervention group is getting. In the case of this example, it could be that a staff member thinks that the intervention works well and feels sorry for the patrons in the control group – because they are more likely to bang their heads. So the staff member quietly offers to show the patrons in the control group the video so that they also will avoid head-banging. It is the purpose of a process evaluation to help work out whether this is happening in the case of a particular intervention.

Statistical significance testing

Just finding that there is a difference in the number of head-bangs between the intervention group and the control group does not, in itself, mean that it can be concluded that the intervention reduced the number of head-bangs. This is because of the role of chance in determining whether or not someone will bang their head. In many situations such as this example, it may just be a chance occurrence that the intervention group, on the occasion when they were being measured for the evaluation, happened to bang their heads less. This issue is dealt with using statistical significance testing: statistics are used to work out the probability that the observed results occurred by chance alone. The smaller this probability, the more reasonable it is to conclude that the head-banging results observed on the particular occasion are actually a result of the effect of the intervention, and not merely of chance. An alternative way to deal with this is to repeat the experiment many times, and where that is feasible and affordable it can be done. Often, however, it is not appropriate, feasible or affordable to repeat the experiment many times. In these cases (usually the majority) there is a convention that if statistical significance testing shows that the probability of the observed results having occurred by chance is less than 5% (or, being more rigorous, less than 1%), then it is accepted that the results have been caused by the effect of the intervention. (These probability levels are usually described as 0.05 and 0.01 respectively, rather than as percentages.)
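As a minimal sketch of what such a test might look like in practice, the following Python fragment applies Fisher's exact test (one of several tests that could be used) to invented counts of patrons who did and did not bang their heads in each group. The counts are hypothetical and the scipy library is assumed to be available.

```python
from scipy.stats import fisher_exact

# Invented counts: rows are intervention / control,
# columns are banged head / did not bang head.
table = [[3, 17],   # intervention: 3 of 20 patrons banged their heads
         [9, 11]]   # control:      9 of 20 patrons banged their heads

odds_ratio, p_value = fisher_exact(table)
print(f"p-value: {p_value:.3f}")

# By the convention described above, a p-value below 0.05 (or 0.01 for a
# more rigorous threshold) is taken as grounds for concluding that the
# difference is unlikely to be due to chance alone.
if p_value < 0.05:
    print("Difference unlikely to be due to chance alone (0.05 level)")
else:
    print("Difference could plausibly have occurred by chance")
```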
In general, the larger the sample size the easier it is to work out the extent to which the observed result may have occurred by chance. A technique called statistical power analysis is used to work out what would be an appropriate sample size to allow a reasonable assessment of how much chance was involved. Statistical power analysis provides a figure for the size of the sample which should be used in an evaluation using an experimental (or some of the other) designs.
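A minimal sketch of such a power calculation is shown below, assuming the statsmodels library and hypothetical head-banging rates of 35% without the intervention and 15% with it; these figures are invented for illustration.

```python
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

# Hypothetical assumption: the intervention cuts the head-banging rate
# from 35% to 15%. Cohen's h turns this into a standardized effect size.
effect_size = proportion_effectsize(0.35, 0.15)

# Patrons needed per group to detect that effect with 80% power at the
# conventional 0.05 significance level.
n_per_group = NormalIndPower().solve_power(
    effect_size=effect_size, alpha=0.05, power=0.8, alternative="two-sided"
)
print(f"Approximate patrons needed per group: {n_per_group:.0f}")
```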

Points about these particular designs

Points about the appropriateness, feasibility and affordability of this design are captured in the following checklist.

Regression discontinuity designs

Setting up a regression discontinuity design for this example would consist of the following:

  1. The patrons would be ranked according to a measure on which they could be put in a descending or ascending order. For instance, they could be ranked on their physical height.
  2. Measurements could be taken, before the evaluation started, of the number of times over a week that patrons of different heights banged their heads. These measurements could be graphed and would probably show that the taller a patron is, the more likely they are to bang their head. This could be represented by a line graph showing the likelihood of a patron banging their head increasing with height.
  3. The patrons could then all be lined up against a wall from the shortest to the tallest. The intervention group would be those above a certain height (the cut-off) and the intervention would only be given to them. (This is a good design where there is a limited budget for the intervention because, unlike in a normal randomized experimental design, those most in need of the intervention, in this case tall people, are the ones who get it.)
  4. The number of times that people banged their heads over the next week as they left the restaurant would be recorded.
  5. If the intervention worked, it would be expected that those in the intervention group (the tallest) would have a reduced likelihood of banging their heads.
  6. When the results were graphed, the results for the tallest patrons (i.e. those who received the intervention) would have shifted down below where they would have been expected to be without the intervention. This is the ‘discontinuity’ referred to in the name of the design. (A minimal analysis sketch is given after this list.)
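A minimal sketch of how the discontinuity could be estimated is given below, assuming the numpy and statsmodels libraries and an entirely invented dataset in which patrons at or above a hypothetical 180 cm cutoff received the intervention.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Invented data: patron heights (cm) and weekly head-bang counts.
height = rng.uniform(150, 200, size=200)
treated = (height >= 180).astype(float)   # intervention given at or above the cutoff
head_bangs = (0.2 * (height - 150)        # taller patrons bang their heads more
              - 4.0 * treated             # hypothetical intervention effect
              + rng.normal(0, 1.5, size=200))

# Regress head-bangs on height (centred at the cutoff) and a treatment
# indicator; the coefficient on the indicator estimates the jump (the
# 'discontinuity') at the cutoff.
X = sm.add_constant(np.column_stack([height - 180, treated]))
model = sm.OLS(head_bangs, X).fit()
print(model.params)   # last coefficient is the estimated discontinuity
```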

Points about these particular designs

Points about the appropriateness, feasibility and affordability of this design are captured in the following checklist.

Time series analysis designs

Setting up a time series analysis for the example would consist of the following step:

  1. The number of times that patrons banged their heads could be recorded each night for a period of time (say three months). The whole restaurant would then be given the intervention, and the number of times that patrons banged their heads over the following three months would also be recorded. If the intervention had an effect, there would be a clear drop in the graph at the point in time when the intervention occurred. (A minimal analysis sketch is given below.)
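A minimal sketch of an interrupted (segmented) time series analysis of this kind is given below, assuming the numpy and statsmodels libraries and invented nightly head-bang counts.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Invented data: nightly head-bang counts for 90 nights before and
# 90 nights after the whole restaurant received the intervention.
counts = np.concatenate([rng.poisson(8, size=90),   # before the intervention
                         rng.poisson(4, size=90)])  # after the intervention

night = np.arange(len(counts))
post = (night >= 90).astype(float)   # 1 from the night the intervention started

# Segmented regression: the coefficient on `post` estimates the drop in
# nightly head-banging at the point the intervention was introduced.
X = sm.add_constant(np.column_stack([night, post]))
model = sm.OLS(counts, X).fit()
print(model.params)
```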

Points about these particular designs

Points about the appropriateness, feasibility and affordability of this design are captured in the following checklist.

 

Constructed matched comparison group designs

Setting up a constructed matched comparison group design for the example would consist of the following steps:

  1. A ‘comparison group’ of some sort would need to be constructed. There is a wide variety of ways in which this could be done. For instance, if the intervention is taking place in one restaurant, another restaurant where the intervention is not taking place could be used as a comparison group. Alternatively, patrons within the same restaurant could simply be asked to volunteer to be in the intervention group (as opposed to a true experiment, where the evaluator would have control over who went into the intervention group and who went into the control group). Another approach would be to select a group of potential participants for the intervention: one week half of them would be put through the intervention, and several weeks later the other half would be put through it. The number of times patrons banged their heads would be compared for the two groups, the intervention group and the ‘waiting list’ group. This is called a ‘waiting list’ or ‘pipeline’ design.
  2. Where random selection has not been involved, it is often the case that the comparison group differs from the intervention group in important ways. For example, the two restaurants being compared above may serve people of different heights; or those who volunteer to be in the intervention group within the one restaurant may be taller, and hence more likely to bang their heads; or, because they are more cautious people, they may decide to volunteer for the intervention but also be generally less likely to bang their heads. If any of the ways in which the comparison group differs from the intervention group is related to patrons banging their heads, this causes a problem: any difference between the intervention group and the comparison group in the number of times patrons bang their heads may simply be a result of the differences between the two groups, not of the intervention. In technical terms, the intervention and comparison groups are said to differ on key variables, and if these variables are also related to outcomes they are described as ‘confounding variables’.

Techniques for dealing with differences between intervention and comparison groups

There is a range of techniques for dealing with differences between intervention and comparison groups (see: Techniques to improve constructed matched comparison group impact/outcome evaluation designs). Applied to this example, these techniques could work in the following ways:

Difference-in-Difference

Using this technique, imagine that patrons normally learn to reduce their head-banging over a period of weeks from the time they first come to the restaurant, and that the head-banging of all patrons is tracked over time. Say that volunteers had been called for to enter the intervention group, and it was shown that these people started off with higher rates of head-banging (which is one of the reasons they were more likely to volunteer to be in the intervention group). The reduction in head-banging of the intervention group (even though it started off higher) could be tracked and compared, using statistical techniques, with the reduction in head-banging in the comparison group. This could be used to establish whether the intervention group improved more than would have been expected given the lesser trend of improvement in the comparison group.
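A minimal sketch of the difference-in-difference calculation itself is given below; the before and after head-banging averages are invented purely to show the arithmetic.

```python
# Invented weekly average head-bangs per patron (before, after).
intervention_before, intervention_after = 5.0, 2.0   # volunteers start off higher
comparison_before, comparison_after = 3.0, 2.5       # non-volunteers improve a little anyway

# Each group's change over time.
change_intervention = intervention_after - intervention_before   # -3.0
change_comparison = comparison_after - comparison_before         # -0.5

# The difference in those two differences is the estimated intervention
# effect: how much more the intervention group improved than the
# comparison group's trend alone would have led one to expect.
did_estimate = change_intervention - change_comparison
print(f"Difference-in-difference estimate: {did_estimate:+.1f} head-bangs per week")
```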

Instrumental variable

Using this technique, an attempt is made to find a sub-group of people within the potential comparison group who are likely to be similar to the intervention group. The problem the technique is trying to solve is, for instance, self-selection: the patrons who self-select into the intervention may be just those who have banged their heads recently. Such patrons will obviously be more concerned about banging their heads and so be more likely to want to participate in the intervention. The self-selection problem with this group is that if a patron has banged their head recently, they may also be less likely to bang their head in the near future, because they are now aware of the hazard. If the outcomes for the intervention group (made up of those who have recently banged their heads) are better than those for the comparison group, this may simply be because the patrons in the intervention group have become aware of the hazard through recently banging their heads, rather than because the intervention actually had an effect on the rate of head-banging.
An instrumental variable technique can be used in a case like this to attempt to find a group of patrons in the potential comparison group who are not affected by the self-selection problem. One way of doing this, where there are enough patrons available to go into the potential comparison group, is to try to find a sub-group within the comparison group who are likely to be similar to the patrons in the intervention group. This is done by finding patrons who have not gone into the intervention group for a reason unrelated to the outcome (in this case, unrelated to whether they have recently banged their heads). Instrumental variables are often practical reasons of this kind for why people have not gone into the intervention group. An example in this case could be patrons who want a quick meal because they are going to a movie after they have eaten. Such patrons are not going to be interested in being in the intervention group because it requires them to go into another room and watch a video which will take up some of their precious dining time. Comparing the amount of head-banging for this group of movie-goers with that of the intervention group may be a way of avoiding a situation where the comparison and intervention groups differ with regard to whether they have banged their heads recently (and hence where outcome differences could be caused by this self-selection variable).
Of course, the evaluator needs to be confident that there is not some other difference between the new comparison group (those just wanting short meals because they are going to a movie) and the intervention group which may affect the result. For instance, those just wanting short meals may rush out of the restaurant when they have finished and be more likely to bang their heads on the way out; or, alternatively, they may drink less and hence be more careful on their way out. Finding instrumental variables is often difficult and can be controversial, in that the evaluation results can be challenged by anyone who believes that the instrumental variable is itself related to outcomes.
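The following is a minimal sketch of the simplest instrumental variable calculation (the Wald estimator), recasting the movie-goer idea above into standard form: whether a patron is rushing off to a movie is treated as an instrument that affects take-up of the intervention but is assumed (an assumption that could be challenged) to have no direct effect on head-banging. All of the data is invented and numpy is assumed.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500

# Invented data. Instrument: whether a patron is rushing off to a movie.
movie = rng.integers(0, 2, size=n)
# Movie-goers are much less likely to take up the intervention.
took_intervention = ((1 - movie) * (rng.random(n) < 0.7)
                     + movie * (rng.random(n) < 0.2)).astype(float)
# Invented outcome: the intervention lowers the head-bang probability.
banged_head = (rng.random(n) < 0.35 - 0.2 * took_intervention).astype(float)

# Wald estimator: the instrument's effect on the outcome divided by the
# instrument's effect on take-up of the intervention.
effect_on_outcome = banged_head[movie == 0].mean() - banged_head[movie == 1].mean()
effect_on_takeup = (took_intervention[movie == 0].mean()
                    - took_intervention[movie == 1].mean())
iv_estimate = effect_on_outcome / effect_on_takeup
print(f"Estimated change in head-bang probability: {iv_estimate:+.2f}")
```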

Propensity score matching

Another technique, often used in areas such as labor market evaluations (because of the availability of data on the labor market outcomes of groups in the population), is called propensity score matching. The ‘propensity score’ is the estimated probability that a person (or unit) with particular observed characteristics would end up in the intervention group. People in the intervention group are then matched with people in the comparison group who have similar propensity scores, so that the two groups being compared are as similar as possible on those characteristics.
Propensity score matching, as with all matching, relies on the ability to capture, in the information used for matching, all of the important characteristics related to the outcomes. The characteristics which are used for matching are called observables. A problem arises when there are unobservable differences between the intervention and comparison groups which are not picked up by the observables and hence are not reflected in the propensity score.
In the case of this example, propensity score matching would work in this way. Data would be collected on head-banging and a wide range of other information about patrons (e.g. age, gender, amount of alcohol drunk, height, risk-taking behavior (ideally), whether they were talking to other patrons as they left the restaurant, etc.). Ideally this data would cover the things most likely to be related to head-banging (in reality, the data on ‘observables’ collected in such evaluations is often not that comprehensive). Statistical techniques would be used to work out, from these characteristics, how likely it was that any particular type of patron would end up in the intervention group; this provides a propensity score for anyone about whom the same data can be collected. Each patron who received the intervention would then be matched with a non-intervention patron who has a similar propensity score, and the level of head-banging following the intervention would be compared between the intervention patrons and their matched comparisons to see if the intervention had an effect.
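A minimal sketch of propensity score matching is given below, assuming the numpy and scikit-learn libraries. The patron characteristics, take-up pattern and outcomes are all invented; the propensity score here is the estimated probability of being in the intervention group given the observables.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 400

# Invented observables: height (cm) and drinks consumed.
height = rng.normal(172, 8, size=n)
drinks = rng.poisson(2, size=n)
X = np.column_stack([height, drinks])

# Taller patrons are (in this invented data) more likely to volunteer.
treated = (rng.random(n) < 1 / (1 + np.exp(-(height - 172) / 5))).astype(int)
# Invented outcome: weekly head-bangs, driven by height and drinks,
# and reduced by the intervention.
head_bangs = 0.15 * (height - 150) + 0.3 * drinks - 1.5 * treated + rng.normal(0, 1, n)

# 1. Estimate the propensity score: probability of treatment given observables.
propensity = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# 2. Match each intervention patron to the comparison patron with the
#    nearest propensity score (with replacement, for simplicity).
treated_idx = np.where(treated == 1)[0]
control_idx = np.where(treated == 0)[0]
nearest = np.argmin(np.abs(propensity[control_idx][None, :]
                           - propensity[treated_idx][:, None]), axis=1)
matched_idx = control_idx[nearest]

# 3. Compare outcomes for intervention patrons and their matched comparisons.
effect = head_bangs[treated_idx].mean() - head_bangs[matched_idx].mean()
print(f"Estimated effect of the intervention: {effect:+.2f} head-bangs per week")
```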

Case matching

In this design, data would be collected on every person in the intervention group (e.g. age, gender, amount of alcohol drunk, height, risk-taking behavior (ideally), whether they were talking to other patrons as they left the restaurant etc.). The waiting staff would then go around the restaurant and collect data on other patrons. When they found someone who was very similar to a person in the intervention group in terms of the data described above, they would then ask them to go into the comparison group. The head-banging results for the intervention group would then be compared with the head-banging results for their ‘matched cases’ in the comparison group.
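A minimal sketch of this kind of one-to-one case matching is given below, assuming numpy. Each intervention patron is paired with the most similar comparison patron, here defined as the nearest on standardized height and drinks consumed; all of the data is invented simply to show the mechanics.

```python
import numpy as np

rng = np.random.default_rng(4)

# Invented characteristics (height in cm, drinks consumed) for the
# intervention group and for the pool of other patrons in the restaurant.
intervention = np.column_stack([rng.normal(180, 6, 30), rng.poisson(2, 30)])
candidates = np.column_stack([rng.normal(172, 8, 200), rng.poisson(2, 200)])
# Invented head-bang counts (unrelated to the characteristics; purely illustrative).
intervention_head_bangs = rng.poisson(3, 30)
candidate_head_bangs = rng.poisson(5, 200)

# Standardize so height and drinks are on comparable scales, then pair each
# intervention patron with the nearest candidate (their 'matched case').
mean, std = candidates.mean(axis=0), candidates.std(axis=0)
z_int = (intervention - mean) / std
z_cand = (candidates - mean) / std
dists = np.linalg.norm(z_int[:, None, :] - z_cand[None, :, :], axis=2)
matched = dists.argmin(axis=1)   # nearest candidate for each intervention patron

# Compare average head-bangs for the intervention group and their matched cases.
print("Intervention group mean head-bangs:", intervention_head_bangs.mean())
print("Matched comparison mean head-bangs:", candidate_head_bangs[matched].mean())
```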

Points about these designs

Points about the appropriateness, feasibility and affordability of this design are captured in the following checklist.

 

Exhaustive causal identification and elimination designs

Setting up an exhaustive causal identification and elimination design for this example would consist of the following:

  1. Whether or not outcomes have improved for those in the intervention group would be measured. However, the mere fact that outcomes have improved does not, in itself, necessarily mean that the improvement has been caused by the intervention [1]. In this case the measure would be a reduction in the number of times that patrons banged their heads (say in the week before the intervention and in the week after the intervention). The causal issue is whether it is the intervention, or some other factor, which caused the head-banging to reduce.
  2. The evaluator would identify all of the possible reasons as to why head-banging could decrease. E.g. raising the height of the restaurant door arch; publicity in the local newspaper about the risks of head-banging; some other restaurant running programs about reducing head-banging.
  3. The evaluator would then interview a sample of those who underwent the intervention and in such in-depth interviews they would explore with the patron whether or not they were aware of any of these other potential causes. In some designs they may also collect additional information about the causes (e.g. in cases where the patrons may not have full information about the possible cause – the raising of the door arch may be such a case). They may also, in some cases, interview others who did not receive the intervention.
  4. This work should enable the evaluator to rule out alternative causal explanations and hence, by a process of elimination, allow them to conclude fairly robustly that it is likely that the intervention led to a reduction in head-banging.

Points about these designs

Points about the appropriateness, feasibility and affordability of this design are captured in the following checklist.

Expert judgment designs

Setting up an expert judgment design for this example would consist of the following:
  1. Identifying an international or national expert on programs for reducing head-banging in restaurants.
  2. Contracting the expert to visit the restaurant, undertake what investigations they wished (they may do elements of some of the other designs above, particularly the exhaustive causal identification and elimination design) and make a judgment on whether or not they believe that the intervention led to a reduction in patron head-banging.

Points about these designs

Points about the appropriateness, feasibility and affordability of this design are captured in the following checklist.
 

Key informants judgment designs

Setting up a key informants judgment design for this example would consist of the following:
  1. Identifying a group of people who are close to the program and would have information which could lead them to make a coherent judgment as to the effectiveness of the intervention.
  2. Interviewing the key informants and asking them to make a judgment on whether or not they believe that the intervention has led to a reduction in patron head-banging.

Points about these designs

Points about the appropriateness, feasibility and affordability of this design are captured in the following checklist.
 

Intervention logic (program theory/theory of change) based designs

In intervention logic designs, an attempt is first made to establish a credible ‘intervention logic’ for the program or organization. This logic sets out the way in which it is believed that lower-level program activities will logically lead on to cause higher-level outcomes (this can be done in the form of a DoView Outcomes Model or similar visual logic model). The logic is then endorsed either by showing that previous evidence indicates it works in cases similar to the one being evaluated, or by having experts in the topic endorse it as credible. It is then established that the lower-level activities have actually occurred (relatively easy to do because they tend to be controllable), and it is then assumed (but not proven) that, in this particular instance, they did in fact cause the higher-level outcomes to occur.

Conclusion

This article has set out a simple example to illustrate different designs and techniques in impact/outcome evaluation.

Please comment on this article

This article is based on the developing area of outcomes theory which is still in a relatively early stage of development. Please critique any of the arguments laid out in this article so that they can be improved through critical examination and reflection.

Acknowledgment

This article was written after the author’s involvement as an expert evaluation adviser in the Youth Employment Network (YEN) (a partnership between the International Labor Organization, the World Bank and the United Nations) Evaluation Clinic held in Damascus, Syria on 19-20 July 2009. In particular he benefited from discussions with other expert advisers on techniques to improve constructed matched comparison group designs: Rita Almeida (Economist, World Bank), David Newhouse (Economist, World Bank), Mattias Lundberg (Senior Economist, World Bank), Alexandre Kolev (ILO/International Training Centre (ITC)); YEN staff Suzanna Puerto (Technical Officer), Marcus Pilgrim (Manager) and Drew Gardiner (Technical Officer); Nader Kabbani (Director of Research, Syria Trust for Development); and from working with clinic participants on possible impact/outcome evaluation designs. This particular article originated from an attempt to explain in the simplest way possible the different designs and techniques used in impact/outcome evaluation.

Citing this article

 
Duignan, P. (2009). Impact/outcome evaluation designs and techniques illustrated with a simple example. Outcomes Theory Knowledge Base Article No. 257. (https://outcomestheory.wordpress.com/2013/02/02/257a)
[If you are reading this in a PDF or printed copy, the web page version may have been updated].
[Outcome Theory Article #257]

 
