CHAPTER 6
Today, many approaches to evaluation begin by learning more about key features of the program to be evaluated. These features then help the evaluator decide which questions the evaluation should address. The most prominent program-oriented approaches are the objectives-oriented approach and approaches that make use of logic models or program theory. In fact, theory-based evaluation is one of the most rapidly growing areas of evaluation (Weiss, 1995; Donaldson, 2007). Many government funding agencies and foundations require logic models, a variant of program theory, for program planning, evaluation, and research. Both logic models and program theory have evolved to help evaluators gain a better understanding of the rationale or reasoning behind a program’s intended effects; this represents a great improvement over the more traditional objectives-oriented evaluation, which focused only on stated program outcomes.
In this chapter, we will cover the original program-oriented evaluation approach—objectives-oriented evaluation—which continues to be used frequently today. We will then describe theory-oriented approaches and their cousin, logic models, and their applications today in helping evaluators make the critical choices of what to evaluate.

The Objectives-Oriented Evaluation Approach

The distinguishing feature of an objectives-oriented evaluation approach is that the purposes of some activity are specified, and then the evaluation focuses on the extent to which those purposes, or objectives, are achieved. In many cases, programs already have specified objectives. In other cases, the evaluator may work with stakeholders to articulate the program objectives, sometimes called goals or standards. The key role for the evaluator in an objectives-oriented evaluation is to determine whether some or all of the program objectives are achieved and, if so, how well they are achieved. In education, the objectives may be concerned with the purposes of a single lesson or training program or with the knowledge students should attain during an entire year. In public health programs, the objectives may concern the effects of prevention efforts, community health interventions, or patient education. Objectives in environmental programs might include quantitative outcomes, such as reductions in air pollutants, or outcomes that are more difficult to measure, such as citizens’ beliefs and behaviors about energy use. The information gained from an objectives-oriented evaluation could be used to determine whether to continue funding the program, change significant portions of it, or discard it and consider other approaches.

Many people have contributed to the evolution and refinement of the objectives-oriented approach to evaluation since its inception in the 1930s, but the individual most credited with conceptualizing and popularizing the focus on objectives in education is Ralph W. Tyler (1942, 1950).

The Tylerian Evaluation Approach

Tyler had a tremendous influence on both evaluation and education. His work influenced the Elementary and Secondary Education Act (ESEA) of 1965, the first federal act to require evaluation of educational programs. At the end of his career, he chaired the committee that started the National Assessment of Educational Progress (NAEP), which remains the only means in the United States of examining educational achievement across all 50 states, given the different standards in each state. In the 1920s and 1930s, while working closely with teachers and schools, Tyler began to formulate his views on education and evaluation. His writings and work foreshadowed today’s concepts of continuous improvement and multiple means of assessment. He saw objectives as a way for teachers to
define what they wanted students to learn. By stating objectives in terms of what students should be able to do, Tyler believed that teachers could more effectively plan their curricula and lessons to achieve those objectives. Unlike later versions of behavioral objectives, however, Tyler believed that objectives should concern principles, not minute behaviors. He worked closely and cooperatively as an evaluator with teachers to make evaluation and education cooperative endeavors (Goodlad, 1979; Madaus, 2004; Madaus & Stufflebeam, 1989).

Tyler considered evaluation to be the process of determining the extent to which the objectives of a program are actually being achieved. His approach to evaluation followed these steps:

1. Establish broad goals or objectives.
2. Classify the goals or objectives.
3. Define objectives in behavioral terms.
4. Find situations in which achievement of objectives can be shown.
5. Develop or select measurement techniques.
6. Collect performance data.
7. Compare performance data with behaviorally stated objectives.

Discrepancies between performance and objectives would lead to modifications intended to correct the deficiency, and the evaluation cycle would be repeated. (A minimal sketch of this comparison step appears at the end of this section.) Tyler’s rationale was logical, scientifically acceptable, and readily adoptable by evaluators (most of whose methodological training was very compatible with the pretest-posttest measurement of behaviors stressed by Tyler), and it had great influence on subsequent evaluation theorists. Tyler advocated multiple measures of different types and considered many elements of a program during an evaluation. However, the objectives-oriented approaches that evolved from Tyler’s work in the 1960s and 1970s, and that continue to be used in some settings today, focused on a basic formula: articulate program objectives; identify the means, typically tests, to measure them; administer the tests; analyze the data in reference to previously stated objectives; and determine program success. This basic, objectives-oriented approach is largely discredited by professional evaluators today. However, many funding sources have not caught up with present-day evaluation approaches and require evaluations to make use of this traditional approach. Its strengths and limitations are discussed in the conclusion of the chapter.
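To make the comparison step concrete, the following sketch shows one way an evaluator might check collected performance data against behaviorally stated objectives. The objectives, criterion scores, and performance data are hypothetical, and the pass/fail logic is only one of many defensible ways to define achievement; this is a minimal sketch of the idea, not Tyler’s own procedure.

# A minimal sketch (hypothetical data): comparing posttest performance
# with behaviorally stated objectives, as in step 7 of Tyler's approach.

# Each objective pairs a behavioral statement with a criterion (% correct).
objectives = {
    "Reads a bar graph and states the trend": 80,
    "Writes a paragraph with a topic sentence": 75,
}

# Mean posttest scores collected in step 6 (hypothetical).
performance = {
    "Reads a bar graph and states the trend": 84,
    "Writes a paragraph with a topic sentence": 68,
}

# Step 7: compare performance data with the stated criterion for each objective.
for objective, criterion in objectives.items():
    score = performance[objective]
    status = "achieved" if score >= criterion else "discrepancy"
    print(f"{objective}: {score} vs. criterion {criterion} -> {status}")

A discrepancy flagged in this way would, in Tyler’s cycle, prompt modifications to the curriculum or the measures, after which the cycle repeats.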
Provus’s Discrepancy Evaluation Model

Another approach to evaluation in the Tylerian tradition was developed by Malcolm Provus, who based his approach on his evaluation assignments in the Pittsburgh public schools (Provus, 1971, 1973). Provus viewed evaluation as a continuous information-management process designed to serve as “the watchdog of program management” and the “handmaiden of administration in the management of program development through sound decision making” (Provus, 1973, p. 186). Although his was, in some ways, a management-oriented
evaluation approach, the key characteristic of his proposals stemmed from the Tylerian tradition. Provus viewed evaluation as a process of (1) agreeing on standards (another term used in place of objectives),¹ (2) determining whether a discrepancy exists between the performance of some aspect of a program and the standards set for performance, and (3) using information about discrepancies to decide whether to improve, maintain, or terminate the program or some aspect of it. He called his approach, not surprisingly, the Discrepancy Evaluation Model (DEM).

Provus determined that, as a program is being developed, it goes through four developmental stages, to which he added a fifth, optional stage:

1. Definition
2. Installation
3. Process (interim products)
4. Product
5. Cost-benefit analysis (optional)

During the definition, or design, stage, the focus of work is on defining goals and processes or activities and delineating the resources and participants necessary to carry out the activities and accomplish the goals. Provus considered programs to be dynamic systems involving inputs (antecedents), processes, and outputs (outcomes). Standards or expectations were established for each stage. These standards were the objectives on which all further evaluation work was based. The evaluator’s job at the design stage is to see that a complete set of design specifications is produced and that they meet certain criteria: theoretical and structural soundness.

At the installation stage, the program design or definition is used as the standard against which to judge program operation. The evaluator performs a series of congruency tests to identify any discrepancies between expected and actual implementation of the program or activity (a minimal sketch follows below). The intent is to make certain that the program has been installed as it has been designed. This is important because studies have found that staff vary as much in implementing a single program as they do in implementing several different ones. The degree to which program specifications are followed is best determined through firsthand observation.
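The congruency test at the heart of the DEM can be pictured as a side-by-side comparison of design standards and observed operation. The program elements and observed values below are hypothetical; this is a minimal sketch of the idea, not Provus’s instrumentation.

# A minimal sketch (hypothetical data): a congruency test comparing the
# program design (the standard) with what observers recorded in the field.
design_standard = {
    "weekly_sessions": 3,     # sessions per week specified in the design
    "session_minutes": 90,    # planned length of each session
    "staff_trained": True,    # all delivering staff completed training
}

observed_operation = {
    "weekly_sessions": 2,
    "session_minutes": 90,
    "staff_trained": False,
}

# Report each element where actual operation departs from the standard.
discrepancies = {
    element: (expected, observed_operation[element])
    for element, expected in design_standard.items()
    if observed_operation[element] != expected
}

for element, (expected, actual) in discrepancies.items():
    print(f"Discrepancy in {element}: expected {expected}, observed {actual}")

Each discrepancy identified this way feeds the problem-solving questions Provus posed: why the discrepancy exists, what corrective actions are possible, and which action is best.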
If discrepancies are found at this stage, Provus proposed several solutions to be considered: (a) changing the program definition to conform to the way in which the program is actually being delivered, if the actual delivery seems more appropriate; (b) making adjustments in the delivery of the program to better conform to the program definition (through providing more resources or training);
or (c) terminating the activity if it appears that further development would be futile in achieving program goals.

During the process stage, evaluation focuses on gathering data on the progress of participants to determine whether their behaviors changed as expected. Provus used the term “enabling objective” to refer to those gains that participants should be making if longer-term program goals are to be reached. If certain enabling objectives are not achieved, the activities leading to those objectives are revised or redefined. The validity of the evaluation data would also be questioned. If the evaluator finds that enabling objectives are not being achieved, another option is to terminate the program if it appears that the discrepancy cannot be eliminated.

At the product stage, the purpose of evaluation is to determine whether the terminal objectives for the program have been achieved. Provus distinguished between immediate outcomes, or terminal objectives, and long-term outcomes, or ultimate objectives. He encouraged the evaluator to go beyond the traditional emphasis on end-of-program performance and to make follow-up studies, based on ultimate objectives, a part of all program evaluations.

Provus also suggested an optional fifth stage that called for a cost-benefit analysis and a comparison of the results with similar cost analyses of comparable programs. In recent times, with funds for human services becoming scarcer, cost-benefit analyses have become a part of many program evaluations.

The Discrepancy Evaluation Model was designed to facilitate the development of programs in large public school systems and was later applied to statewide evaluations by a federal bureau. A complex approach that works best in larger systems with adequate staff resources, its central focus is on identifying discrepancies to help managers determine the extent to which program development is proceeding toward attainment of stated objectives. It attempts to assure effective program development by preventing the activity from proceeding to the next stage until all identified discrepancies have been removed. Whenever a discrepancy is found, Provus suggested a cooperative problem-solving process for program staff and evaluators. The process called for asking the following questions: (1) Why is there a discrepancy? (2) What corrective actions are possible? (3) Which corrective action is best? This process usually required that additional information be gathered and criteria developed to allow rational, justifiable decisions about corrective actions (or terminations). This particular problem-solving activity was a new addition to the traditional objectives-oriented evaluation approach.

Though the Discrepancy Evaluation Model was one of the earliest approaches to evaluation, elements of it can still be found in many evaluations. For example, in Fitzpatrick’s interview with David Fetterman, a developer of empowerment evaluation, on his evaluation of the Stanford Teacher Education Program (STEP), Fetterman uses the discrepancy model to identify program areas (Fitzpatrick & Fetterman, 2000). The fact that the model continues to influence evaluation studies 30 years later is evidence of how these seminal

¹ Although standards and objectives are not synonymous, they were used by Provus interchangeably. Stake (1970) also stated that “standards are another form of objective: those seen by outside authority figures who know little or nothing about the specific program being evaluated but whose advice is relevant to
The three dimensions of the cube are as follows:

1. Needs of youth (the client): categories developed by Stufflebeam (1977) and expanded by Nowakowski et al. (1985) are
   • intellectual
   • physical recreation
   • vocational
   • social
   • moral
   • aesthetic/cultural
   • emotional

2. Age of youth (this dimension could be any relevant characteristic of the client): prenatal through young adult

3. Source of service to youth, such as
   • housing
   • social services
   • health services
   • economic/business
   • public works
   • justice
   • education
   • religious organizations

In any category along any of the three dimensions, those planning a community-based youth program may choose to establish relevant objectives. Few, if any, stakeholders in community-based programs will be interested in every cell of the cube, but the categories contained in each of the three dimensions will provide a good checklist for making certain that important areas or categories of objectives are not overlooked. Obviously, use of the cube is not limited to community-based programs but could extend to other types of programs as well.
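Because each cell of the cube is simply a combination of one category from each dimension, the checklist use described above can be illustrated with a short sketch. The needs and service categories below are abbreviated from the dimensions above, the age categories are hypothetical subdivisions of the prenatal-through-young-adult range, and treating the cube as a cross-product is our illustration, not a procedure prescribed by the cube’s authors.

# A minimal sketch: enumerating cells of the cube as a planning checklist.
# Each cell pairs one category from each of the three dimensions.
from itertools import product

needs = ["intellectual", "physical recreation", "vocational", "social"]
ages = ["prenatal", "infant", "school-age", "adolescent", "young adult"]
sources = ["housing", "social services", "health services", "education"]

# Planners can scan the cells and mark those where objectives are needed.
for cell in product(needs, ages, sources):
    print(cell)

Even this abbreviated cube has 4 x 5 x 4 = 80 cells, which is why the cube works better as a checklist for spotting overlooked objectives than as a list of required ones.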
Logic Models and Theory-Based Evaluation Approaches

Logic Models

One of the criticisms of objectives-oriented evaluation is that it tells us little about how the program achieves its objectives. This can be a particular problem when programs fail to achieve their objectives, because the evaluation can provide little advice on how to do so. Logic models developed as an extension of objectives-oriented evaluation and are designed to fill in the steps between the program and its objectives. Typically, logic models require program planners or evaluators to identify program inputs, activities, outputs, and outcomes, with outcomes reflecting longer-term objectives or goals of the program and outputs representing immediate program impacts. The model, typically presented in diagram form, illustrates the logic of the program. A typical logic model may include the following:

Inputs—annual budgets, staffing, facilities, equipment, and materials needed to run the program

Activities—weekly sessions, curriculum, workshops, conferences, recruitment, clinical services, newsletters, staff training, and all the other key components of the program

Outputs—numbers of participants or clients served each week, number of class meetings, hours of direct service to each participant, number of newsletters, and other immediate program products

Immediate, intermediate, long-term, and ultimate outcomes—the longitudinal goals for participant change (development)

Logic models are widely used in program planning and evaluation today. They have influenced evaluation by filling in the “black box” between the program and its objectives. Evaluators can use logic models to help program staff articulate and discuss their assumptions about how their program might achieve its goals and what elements are important to evaluate at any given time, and generally to build internal evaluation capacity, or the ability to think in an evaluative way. (See Taylor-Powell & Boyd [2008] for an example of the use of logic models in cooperative extension to build organizational capacity. Knowlton and Phillips [2009] also provide guidance for building logic models.) The United Way of America was one of the major organizations to bring logic models to evaluation through the logic-model-based approach it requires for the organizations it funds (United Way, 1996). Other foundations, such as the W.K. Kellogg Foundation and the Annie E. Casey Foundation, have also been instrumental in training organizations in the use of logic models to improve program planning and evaluation.
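The four-part structure just described can be captured in a simple data structure, which is often all a first-draft logic model amounts to. The program and all of its entries below are hypothetical; this is a minimal sketch of how the inputs-activities-outputs-outcomes chain might be recorded, not a required format.

# A minimal sketch (hypothetical program): a logic model as a plain
# data structure linking inputs to activities, outputs, and outcomes.
logic_model = {
    "inputs": ["annual budget", "two trained facilitators", "meeting space"],
    "activities": ["weekly parenting sessions", "home visits", "newsletter"],
    "outputs": ["20 families served per week", "8 sessions delivered"],
    "outcomes": {
        "immediate": ["parents report new discipline strategies"],
        "intermediate": ["observed use of strategies at home"],
        "long_term": ["fewer disciplinary incidents at school"],
    },
}

# Reading the model stage by stage surfaces the program's assumptions:
# each stage is expected to make the next one plausible.
for stage, entries in logic_model.items():
    print(stage, "->", entries)

Writing the chain down this explicitly is what lets program staff debate whether each link is actually plausible, which is the conversational value of logic models noted above.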
Theory-Based or Theory-Driven Evaluation

Carol Weiss first discussed basing evaluation on a program’s theory in her classic 1972 book, building on earlier writings by Suchman (1967) on the reasons that programs fail (Weiss, 1997; Worthen, 1996a). She has remained an effective and long-term advocate for theory-based evaluations (Weiss, 1995, 1997; Weiss & Mark, 2006). In the 1980s and 1990s, Huey Chen, Peter Rossi, and Leonard Bickman began writing about theory-based approaches to evaluation (Bickman, 1987, 1990; Chen & Rossi, 1980, 1983; Chen, 1990). Stewart Donaldson (2007) is one of the principal evaluators practicing and writing about the theory-driven evaluation approach today.²

Edward Suchman (1967) had first made the point that programs can fail to achieve their goals for two distinctly different reasons: (a) the program is not delivered as planned and, therefore, is not really tested (implementation failure); and (b) the program is delivered as planned and the results, then, clearly indicate that the program theory is incorrect (theory failure). He and Weiss recognized that, if an evaluation were examining whether a program achieved its goals and that program failed, it was important to know whether the failure was an implementation failure or a theory failure. With this information, the evaluator could then reach appropriate conclusions about the program and make useful recommendations for the decision maker. To distinguish between implementation failure and theory failure, the evaluator had to know two things in addition to simply measuring outcomes: (a) the essentials of the program theory, and (b) how the program was implemented. With this information, the evaluator could then determine whether the program implementation matched the theory. This was the beginning of program theory and the recognition of its importance to evaluation practice.
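Suchman’s distinction can be summarized as a small piece of decision logic. The binary inputs below are hypothetical simplifications (real judgments of implementation fidelity and goal achievement are rarely yes-or-no); this is a minimal sketch of the reasoning, not a formal procedure from Suchman or Weiss.

# A minimal sketch: classifying a failed program using Suchman's
# distinction between implementation failure and theory failure.
def classify_failure(delivered_as_planned: bool, goals_achieved: bool) -> str:
    if goals_achieved:
        return "no failure: goals were achieved"
    if not delivered_as_planned:
        # The program was never really tested.
        return "implementation failure"
    # Delivered as planned, yet goals were not achieved.
    return "theory failure: the program theory appears incorrect"

print(classify_failure(delivered_as_planned=False, goals_achieved=False))
print(classify_failure(delivered_as_planned=True, goals_achieved=False))

The classification matters because it drives different recommendations: an implementation failure suggests fixing delivery, whereas a theory failure suggests rethinking the program itself.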
Chen’s and Bickman’s approaches to theory-based evaluation arose for these reasons, but also from their desire for evaluations to contribute more directly to social science research knowledge. Chen, for example, argued that evaluators of the time erred in focusing solely on methodology and failing to consider the theory or tenets of the program. For many of those writing about theory-based evaluation as it first emerged in the late 1980s and 1990s, theory meant connecting evaluation to social science research theories. Chen (1990), for example, encouraged evaluators to search the scientific research literature to identify social science theories that were relevant to the program and to use those theories in planning the evaluation. Evaluation results could then contribute to social science knowledge and theory as well as to program decisions (Bickman, 1987). Thus, theory-based evaluation arose from a science-based perspective and was often considered to be a strictly quantitative approach by others during the debates on qualitative and quantitative methods in the 1990s. However, today, theory-based evaluation is used by evaluators in many settings to gain a better understanding of the program. (See Rogers, 2000, 2001.) They can then use that understanding, the program theory, to better define the evaluation questions the study should address, to aid their choices of what concepts to measure and when to measure them, and to improve their interpretation of results and their feedback to stakeholders to enhance use.

But what is program theory? And what do evaluators using theory-based evaluation approaches do? Bickman defines program theory as “the construction of a plausible and sensible model of how a program is supposed to work” (Bickman, 1987, p. 5). More recently, Donaldson defines program theory as “the process through which program components are presumed to affect outcomes and the conditions under which these processes are believed to operate” (2007, p. 22). In both cases, and in other definitions, program theory explains the logic of the program. How does it differ from a logic model? In fact, they are quite similar. A logic model may depict the program theory if its articulation of program inputs, activities, outputs, and outcomes is sufficient to describe why the program is intended to achieve its outcomes. Logic models are sometimes used as tools to develop program theory.
In other words, a program theory may look like a logic model. In our experience, because the emphasis in logic models is on the stages of