Program Evaluation
Alternative Approaches and Practical Guidelines
FOURTH EDITION
Jody L. Fitzpatrick University of Colorado Denver
James R. Sanders Western Michigan University
Blaine R. Worthen Utah State University
Boston Columbus Indianapolis New York San Francisco Upper Saddle River Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montreal Toronto
Delhi Mexico City São Paulo Sydney Hong Kong Seoul Singapore Taipei Tokyo
Vice President and Editor in Chief: Jeffery W. Johnston Senior Acquisitions Editor: Meredith D. Fossel Editorial Assistant: Nancy Holstein Vice President, Director of Marketing: Margaret Waples Senior Marketing Manager: Christopher D. Barry Senior Managing Editor: Pamela D. Bennett Senior Project Manager: Linda Hillis Bayma Senior Operations Supervisor: Matthew Ottenweller Senior Art Director: Diane Lorenzo Cover Designer: Jeff Vanik Cover Image: istock Full-Service Project Management: Ashley Schneider, S4Carlisle Publishing Services Composition: S4Carlisle Publishing Services Printer/Binder: Courier/Westford Cover Printer: Lehigh-Phoenix Color/Hagerstown Text Font: Meridien
Credits and acknowledgments borrowed from other sources and reproduced, with permission, in this textbook appear on appropriate page within text.
Every effort has been made to provide accurate and current Internet information in this book. However, the Internet and information posted on it are constantly changing, so it is inevitable that some of the Internet addresses listed in this textbook will change.
Copyright © 2011, 2004, 1997 Pearson Education, Inc., Upper Saddle River, New Jersey 07458. All rights reserved. Manufactured in the United States of America. This publication is protected by Copyright, and permission should be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. To obtain permission(s) to use material from this work, please submit a written request to Pearson Education, Inc., Permissions Department, 501 Boylston Street, Suite 900, Boston, MA 02116, fax: (617) 671-2290, email: permissionsus@pearson.com.
Library of Congress Cataloging-in-Publication Data
Fitzpatrick, Jody L. Program evaluation: alternative approaches and practical guidelines / Jody L. Fitzpatrick, James R.
Sanders, Blaine R. Worthen. p. cm.
ISBN 978-0-205-57935-8 1. Educational evaluation—United States. 2. Evaluation research (Social action programs)—
United States. 3. Evaluation—Study and teaching—United States. I. Sanders, James R. II. Worthen, Blaine R. III. Worthen, Blaine R. Program evaluation. IV. Title.
LB2822.75.W67 2011 379.1’54—dc22
2010025390 10 9 8 7 6 5 4 3 2
ISBN 10: 0-205-57935-3 ISBN 13: 978-0-205-57935-8
About the Authors
Jody Fitzpatrick has been a faculty member in public administration at the University of Colorado Denver since 1985. She teaches courses in research methods and evaluation, conducts evaluations in many schools and human service settings, and writes extensively about the successful practice of evaluation. She has served on the Board of the American Evaluation Association and on the editorial boards of the American Journal of Evaluation and New Directions for Evaluation. She has also served as Chair of the Teaching of Evaluation Topical Interest Group at the American Evaluation Association and has won a university-wide teaching award at her university. In one of her recent publications, Evaluation in Action: Interviews with Expert Evaluators, she uses interviews with expert evaluators on one evaluation to talk about the decisions that evaluators face as they plan and conduct evaluations and the factors that influence their choices. She is currently evaluating the changing roles of counselors in middle schools and high schools and a program to help immigrant middle-school girls to achieve and stay in school. Her international work includes research on evaluation in Spain and Europe and, recently, she has spoken on evaluation issues to policymakers and evaluators in France, Spain, Denmark, Mexico, and Chile.
James Sanders is Professor Emeritus of Educational Studies and the Evaluation Center at Western Michigan University where he has taught, published, consulted, and conducted evaluations since 1975. A graduate of Bucknell University and the University of Colorado, he has served on the Board and as President of the American Evaluation Association (AEA) and has served as Chair of the Steering Committee that created the Evaluation Network, a predecessor to AEA. His publications include books on school, student, and program evaluation. He has worked extensively with schools, foundations, and government and nonprofit agencies to develop their evaluation practices. As Chair of the Joint Committee on Standards for Educational Evaluation, he led the development of the second edition of The Program Evaluation Standards. He was also involved in developing the concepts of applied performance testing for student assessments, cluster evaluation for program evaluations by foundations and government agencies, and mainstreaming evaluation for organizational development. His international work in evaluation has been concentrated in Canada, Europe, and Latin America. He received distinguished service awards from Western Michigan University, where he helped to establish a PhD program in evaluation, and from the Michigan Association for Evaluation.
Blaine Worthen is Psychology Professor Emeritus at Utah State University, where he founded and directed the Evaluation Methodology PhD program and the Western Institute for Research and Evaluation, conducting more than 350 evaluations for local and national clients in the United States and Canada. He received his PhD from The Ohio State University. He is a former editor of Evaluation Practice and founding editor of the American Journal of Evaluation. He served on the American Evaluation Association Board of Directors and received AEA’s Myrdal Award for Outstanding Evaluation Practitioner and AERA’s Best Evaluation Study Award. He has taught university evaluation courses (1969–1999), managed federally mandated evaluations in 17 states (1973–1978), advised numerous government and private agencies, and given more than 150 keynote addresses and evaluation workshops in the United States, England, Australia, Israel, Greece, Ecuador, and other countries. He has written extensively in evaluation, measurement, and assessment and is the author of 135 articles and six books. His Phi Delta Kappan article, “Critical Issues That Will Determine the Future of Alternative Assessment,” was distributed to 500 distinguished invitees at the White House’s Goals 2000 Conference. He is recognized as a national and international leader in the field.
Preface
The twenty-first century is an exciting time for evaluation. The field is growing. People—schools, organizations, policymakers, the public at large—are interested in learning more about how programs work: how they succeed and how they fail. Given the tumult experienced in the first decade of this century, many people are interested in accountability from corporations, government, schools, and nonprofit organizations. The fourth edition of our best-selling textbook is designed to help readers consider how evaluation can achieve these purposes. As in previous editions, our book is one of the few to introduce readers to both the different approaches to evaluation and practical methods for conducting it.
New to This Edition
The fourth edition includes many changes:
• A new chapter on the role of politics in evaluation and ethical considerations.
• A new and reorganized Part Two that presents and discusses the most current approaches and theories of evaluation.
• An increased focus on mixed methods in design, data collection, and analysis.
• Links to interviews with evaluators who conducted an evaluation that illustrates the concepts reviewed in that chapter, as they discuss the choices and challenges they faced.
• A discussion of how today’s focus on performance measurement, outcomes, impacts, and standards has influenced evaluation.
• New sections on organizational learning, evaluation capacity building, mainstreaming evaluation, and cultural competence––trends in evaluation and organizations.
Evaluation, today, is changing in a variety of ways. Policymakers, managers, citizens, and consumers want better tracking of activities and outcomes. More importantly, many want a better understanding of social problems and the programs and policies being undertaken to reduce these problems. Evaluation in many forms, including performance measurement and outcome or impact assessments, is expanding around the globe. People who work in organizations are also interested in evaluation as a way to enhance organizational learning. They want to know how well they’re doing, how to tackle the tough problems their organizations address, and how to improve their performance and better serve their clients and their
community. Many different methods are being developed and used: mixed methods for design and data collection, increased involvement of new and different stakeholders in the evaluation process, expanded consideration of the potential uses and impacts of evaluation, and more effective and diverse ways to communicate findings. As evaluation expands around the world, the experiences of adapting evaluation to different settings and different cultures are enriching the field.
In this new edition, we hope to convey to you the dynamism and creativity involved in conducting evaluation. Each of us has many years of experience in conducting evaluations in a variety of settings, including schools, public welfare agencies, mental health organizations, environmental programs, nonprofit organizations, and corporations. We also have years of experience teaching students how to use evaluation in their own organizations or communities. Our goal is, and always has been, to present information that readers can use either to conduct or to be a participant in evaluations that make a difference to their workplace, their clients, and their community. Let us tell you a bit more about how we hope to do that in this new edition.
Organization of This Text
The book is organized in four parts. Part One introduces the reader to key concepts in evaluation; its history and current trends; and ethical, political, and interpersonal factors that permeate and transcend all phases of evaluation. Evaluation differs from research in that it occurs in the real world with the goal of being used by non-researchers to improve decisions, governance, and society. As a result, evaluators develop relationships with their users and stakeholders and work in a political environment in which evaluation results compete with other demands on decision makers. Evaluators must know how to work in such environments to get their results used. In addition, ethical challenges often present themselves. We find the ways in which evaluation differs from research to be both challenging and interesting. It is why we chose evaluation as our life’s work. In Part One, we introduce you to these differences and to the ways evaluators work in this public, political context.
In Part Two, we present several different approaches to evaluation, often called models or theories. (Determining whether objectives or outcomes have been achieved isn’t the only way to approach evaluation!) Approaches influence how evaluators determine what to study and how they involve others in what they study. We have expanded our discussions of theory-based, decision-oriented, and participatory approaches. In doing so, we describe new ways in which evaluators use logic models and program theories to understand the workings of a program. Participatory and transformative approaches to empowering stakeholders and creating different ways of learning are described and contrasted. Evaluators must know methodology, but they also must know about different approaches to evaluation to consciously and intelligently choose the approach or mix of approaches that is most appropriate for the program, clients, stakeholders, and context of their evaluation.
In Parts Three and Four, the core of the book, we describe how to plan and carry out an evaluation study. Part Three is concerned with the planning stage: learning about the program, conversing with stakeholders to learn purposes and consider future uses of the study, and identifying and finalizing evaluation questions to guide the study. Part Three teaches the reader how to develop an evaluation plan and a management plan, including timelines and budgets for conducting the study. In Part Four, we discuss the methodological choices and decisions evaluators make: selecting and developing designs; sampling, data collection, and analysis strategies; interpreting results; and communicating results to others. The chapters in each of these sections are sequential, representing the order in which decisions are made or actions are taken in the evaluation study. We make use of extensive graphics, lists, and examples to illustrate practice to the reader.
This Revision
Each chapter has been revised by considering the most current books, articles, and reports. Many new references and contemporary examples have been added. Thus, readers are introduced to current controversies about randomized control groups and appropriate designs for outcome evaluations, current discussions of political influences on evaluation policies and practices, research on participative approaches, discussions of cultural competency and capacity building in organizations, and new models of evaluation use and views on interpreting and disseminating results.
We are unabashedly eclectic in our approach to evaluation. We use many different approaches and methods––whatever is appropriate for the setting––and encourage you to do the same. We don’t advocate one approach, but instruct you in many. You will learn about different approaches or theories in Part Two and different methods of collecting data in Parts Three and Four.
To facilitate learning, we have continued with much the same pedagogical structure that we have used in past editions. Each chapter presents information on current and foundational issues in a practical, accessible manner. Tables and figures are used frequently to summarize or illustrate key points. Each chapter begins with Orienting Questions to introduce the reader to some of the issues that will be covered in the chapter and concludes with a list of the Major Concepts and Theories reviewed in the chapter, Discussion Questions, Application Exercises, and a list of Suggested Readings on the topics discussed.
Rather than using the case study method from previous editions, we thought it was time to introduce readers to some real evaluations. Fortunately, while Blaine Worthen was editor of American Journal of Evaluation, Jody Fitzpatrick wrote a column in which she interviewed evaluators about a single evaluation they had conducted. These interviews are now widely used in teaching about evaluation. We have incorporated them into this new edition by recommending the ones that illustrate the themes introduced in each chapter. Readers and instructors can choose either to purchase the book, Evaluation in Action (Fitzpatrick, Christie, & Mark, 2009), as a case companion to this text or to access many of the interviews
through their original publication in the American Journal of Evaluation. At the end of each chapter, we describe one to three relevant interviews, citing the chapter in the book and the original source in the journal.
We hope this book will inspire you to think in a new way about issues—in a questioning, exploring, evaluative way—and about programs, policy, and organizational change. For those readers who are already evaluators, this book will provide you with new perspectives and tools for your practice. For those who are new to evaluation, this book will make you a more informed consumer of or participant in evaluation studies or, perhaps, guide you to undertake your own evaluation.
Acknowledgments
We would like to thank our colleagues in evaluation for continuing to make this such an exciting and dynamic field! Our work in each revision of our text has reminded us of the progress being made in evaluation and the wonderful insights of our colleagues about evaluation theory and practice. We would also like to thank Sophia Le, our research assistant, who has worked tirelessly, creatively, and diligently to bring this manuscript to fruition. We all are grateful to our families for the interest and pride they have shown in our work and the patience and love they have demonstrated as we have taken the time to devote to it.
Contents
PART ONE • Introduction to Evaluation 1
1 Evaluation’s Basic Purpose, Uses, and Conceptual Distinctions 3
Informal versus Formal Evaluation 5
A Brief Definition of Evaluation and Other Key Terms 6
Differences in Evaluation and Research 9
The Purposes of Evaluation 13
Roles and Activities of Professional Evaluators 16
Uses and Objects of Evaluation 18
Some Basic Types of Evaluation 20
Evaluation’s Importance—and Its Limitations 32
2 Origins and Current Trends in Modern Program Evaluation 38
The History and Influence of Evaluation in Society 38
1990–The Present: History and Current Trends 49
3 Political, Interpersonal, and Ethical Issues in Evaluation 64
Evaluation and Its Political Context 65
Maintaining Ethical Standards: Considerations, Issues, and Responsibilities for Evaluators 78
PART TWO • Alternative Approaches to Program Evaluation 109
4 Alternative Views of Evaluation 111
Diverse Conceptions of Program Evaluation 113
Origins of Alternative Views of Evaluation 114
Classifications of Evaluation Theories or Approaches 120
5 First Approaches: Expertise and Consumer-Oriented Approaches 126
The Expertise-Oriented Approach 127
The Consumer-Oriented Evaluation Approach 143
6 Program-Oriented Evaluation Approaches 153
The Objectives-Oriented Evaluation Approach 154
Logic Models and Theory-Based Evaluation Approaches 159
How Program-Oriented Evaluation Approaches Have Been Used 164
Strengths and Limitations of Program-Oriented Evaluation Approaches 166
Goal-Free Evaluation 168
7 Decision-Oriented Evaluation Approaches 172
Developers of Decision-Oriented Evaluation Approaches and Their Contributions 173
The Decision-Oriented Approaches 173
How the Decision-Oriented Evaluation Approaches Have Been Used 184
Strengths and Limitations of Decision-Oriented Evaluation Approaches 184
8 Participant-Oriented Evaluation Approaches 189
Evolution of Participatory Approaches 190
Developers of Participant-Oriented Evaluation Approaches and Their Contributions 191
Participatory Evaluation Today: Two Streams and Many Approaches 199
Some Specific Contemporary Approaches 205
How Participant-Oriented Evaluation Approaches Have Been Used 220
Strengths and Limitations of Participant-Oriented Evaluation Approaches 223
9 Other Current Considerations: Cultural Competence and Capacity Building 231
The Role of Culture and Context in Evaluation Practice and Developing Cultural Competence 232
Evaluation’s Roles in Organizations: Evaluation Capacity Building and Mainstreaming Evaluation 235
10 A Comparative Analysis of Approaches 243
A Summary and Comparative Analysis of Evaluation Approaches 243
Cautions About the Alternative Evaluation Approaches 244
Contributions of the Alternative Evaluation Approaches 248
Comparative Analysis of Characteristics of Alternative Evaluation Approaches 249
Eclectic Uses of the Alternative Evaluation Approaches 251
PART THREE • Practical Guidelines for Planning Evaluations 257
11 Clarifying the Evaluation Request and Responsibilities 259
Understanding the Reasons for Initiating the Evaluation 260
Conditions Under Which Evaluation Studies Are Inappropriate 265
Determining When an Evaluation Is Appropriate: Evaluability Assessment 268
Using an Internal or External Evaluator 271
Hiring an Evaluator 277
How Different Evaluation Approaches Clarify the Evaluation Request and Responsibilities 281
12 Setting Boundaries and Analyzing the Evaluation Context 286
Identifying Stakeholders and Intended Audiences for an Evaluation 287
Describing What Is to Be Evaluated: Setting the Boundaries 290
Analyzing the Resources and Capabilities That Can Be Committed to the Evaluation 304
Analyzing the Political Context for the Evaluation 307
Variations Caused by the Evaluation Approach Used 309
Determining Whether to Proceed with the Evaluation 310
13 Identifying and Selecting the Evaluation Questions and Criteria 314
Identifying Useful Sources for Evaluation Questions: The Divergent Phase 315
Selecting the Questions, Criteria, and Issues to Be Addressed: The Convergent Phase 328
Specifying the Evaluation Criteria and Standards 332
Remaining Flexible during the Evaluation: Allowing New Questions, Criteria, and Standards to Emerge 336
14 Planning How to Conduct the Evaluation 340
Developing the Evaluation Plan 342
Specifying How the Evaluation Will Be Conducted: The Management Plan 358
Establishing Evaluation Agreements and Contracts 367
Planning and Conducting the Metaevaluation 368
PART FOUR • Practical Guidelines for Conducting and Using Evaluations 379
15 Collecting Evaluative Information: Design, Sampling, and Cost Choices 381
Using Mixed Methods 383
Designs for Collecting Descriptive and Causal Information 387
Sampling 407
Cost Analysis 411
16 Collecting Evaluative Information: Data Sources and Methods, Analysis, and Interpretation 418
Common Sources and Methods for Collecting Information 419
Planning and Organizing the Collection of Information 443
Analysis of Data and Interpretation of Findings 444
17 Reporting Evaluation Results: Maximizing Use and Understanding 453
Purposes of Evaluation Reporting and Reports 454
Different Ways of Reporting 455
Important Factors in Planning Evaluation Reporting 456
Key Components of a Written Report 469
Suggestions for Effective Oral Reporting 476
A Checklist for Good Evaluation Reports 479
How Evaluation Information Is Used 479
18 The Future of Evaluation 490
The Future of Evaluation 490
Predictions Concerning the Profession of Evaluation 491
Predictions Concerning the Practice of Evaluation 493
A Vision for Evaluation 496
Conclusion 497
Appendix A The Program Evaluation Standards and Guiding Principles for Evaluators 499
References 505
Author Index 526
Subject Index 530
Part I: Introduction to Evaluation
This initial section of our text provides the background necessary for the beginning student to understand the chapters that follow. In it, we attempt to accomplish three things: to explore the concept of evaluation and its various meanings, to review the history of program evaluation and its development as a discipline, and to introduce the reader to some of the factors that influence the practice of evaluation. We also acquaint the reader with some of the current controversies and trends in the field.
In Chapter 1, we discuss the basic purposes of evaluation and the varying roles evaluators play. We define evaluation specifically, and we introduce the reader to several different concepts and distinctions that are important to evaluation. In Chapter 2, we summarize the origins of today’s evaluation tenets and practices and the historical evolution of evaluation as a growing force in improving our society’s public, nonprofit, and corporate programs. In Chapter 3, we discuss the political, ethical, and interpersonal factors that underlie any evaluation and emphasize its distinction from research.
Our intent in Part One is to provide the reader with information essential to understanding not only the content of the sections that follow but also the wealth of material that exists in the literature on program evaluation. Although the content in the remainder of this book is intended to apply primarily to the evaluation of programs, most of it also applies to the evaluation of policies, products, and processes used in those areas and, indeed, to any object of an evaluation. In Part Two we will introduce you to different approaches to evaluation to enlarge your understanding of the diversity of choices that evaluators and stakeholders make in undertaking evaluation.
1 Evaluation’s Basic Purpose, Uses, and Conceptual Distinctions
Orienting Questions
1. What is evaluation? Why is it important?
2. What is the difference between formal and informal evaluation?
3. What are some purposes of evaluation? What roles can the evaluator play?
4. What are the major differences between formative and summative evaluations?
5. What questions might an evaluator address in a needs assessment, a process evaluation, and an outcome evaluation?
6. What are the advantages and disadvantages of an internal evaluator? An external evaluator?
The challenges confronting our society in the twenty-first century are enormous. Few of them are really new. In the United States and many other countries, the public and nonprofit sectors are grappling with complex issues: educating children for the new century; reducing functional illiteracy; strengthening families; training people to enter or return to the workforce; training employees who currently work in an organization; combating disease and mental illness; fighting discrimination; and reducing crime, drug abuse, and child and spouse abuse. More recently, pursuing and balancing environmental and economic goals and working to ensure peace and economic growth in developing countries have become prominent concerns. As this book is written, the United States and many countries around
the world are facing challenging economic problems that touch every aspect of society. The policies and programs created to address these problems will require evaluation to determine which solutions to pursue and which programs and policies are working and which are not. Each new decade seems to add to the list of challenges, as society and the problems it confronts become increasingly complex.
As society’s concern over these pervasive and perplexing problems has intensified, so have its efforts to resolve them. Collectively, local, regional, national, and international agencies have initiated many programs aimed at eliminating these problems or their underlying causes. In some cases, specific programs judged to have been ineffective have been “mothballed” or sunk outright, often to be replaced by a new program designed to attack the problem in a different—and, hopefully, more effective—manner.
In more recent years, scarce resources and budget deficits have posed still more challenges as administrators and program managers have had to struggle to keep their most promising programs afloat. Increasingly, policymakers and managers have been faced with tough choices, being forced to cancel some programs or program components to provide sufficient funds to start new programs, to continue others, or simply to keep within current budgetary limits.
To make such choices intelligently, policy makers need good information about the relative effectiveness of programs. Which programs are working well? Which are failing? What are the programs’ relative costs and benefits? Similarly, each program manager needs to know how well different parts of programs are working. What can be done to improve those parts of the program that are not working as well as they should? Have all aspects of the program been thought through carefully at the planning stage, or is more planning needed? What is the theory or logic model for the program’s effectiveness? What adaptations would make the program more effective?
Answering such questions is the major task of program evaluation. The major task of this book is to introduce you to evaluation and the vital role it plays in virtually every sector of modern society. However, before we can hope to convince you that good evaluation is an essential part of good programs, we must help you understand at least the basic concepts in each of the following areas:
• How we—and others—define evaluation
• How formal and informal evaluation differ
• The basic purposes—and various uses—of formal evaluation
• The distinction between basic types of evaluation
• The distinction between internal and external evaluators
• Evaluation’s importance and its limitations
Covering all of those areas thoroughly could fill a whole book, not just one chapter of an introductory text. In this chapter, we provide only brief coverage of each of these topics to orient you to concepts and distinctions necessary to understand the content of later chapters.
Informal versus Formal Evaluation
Evaluation is not a new concept. In fact, people have been evaluating, or examining and judging things, since the beginning of human history. Neanderthals practiced it when determining which types of saplings made the best spears, as did Persian patriarchs in selecting the most suitable suitors for their daughters, and English yeomen who abandoned their own crossbows in favor of the Welsh longbow. They had observed that the longbow could send an arrow through the stoutest armor and was capable of launching three arrows while the crossbow sent only one. Although no formal evaluation reports on bow comparisons have been unearthed in English archives, it is clear that the English evaluated the longbow’s value for their purposes, deciding that its use would strengthen them in their struggles with the French. So the English armies relinquished their crossbows, perfected and improved on the Welsh longbow, and proved invincible during most of the Hundred Years’ War.
By contrast, French archers experimented briefly with the longbow, then went back to the crossbow—and continued to lose battles. Such are the perils of poor evaluation! Unfortunately, the faulty judgment that led the French to persist in using an inferior weapon represents an informal evaluation pattern that has been repeated too often throughout history.
As human beings, we evaluate every day. Practitioners, managers, and policymakers make judgments about students, clients, personnel, programs, and policies. These judgments lead to choices and decisions. They are a natural part of life. A school principal observes a teacher working in the classroom and forms some judgments about that teacher’s effectiveness. A program officer of a foundation visits a substance abuse program and forms a judgment about the program’s quality and effectiveness. A policymaker hears a speech about a new method for delivering health care to uninsured children and draws some conclusions about whether that method would work in his state. Such judgments are made every day in our work. These judgments, however, are based on informal, or unsystematic, evaluations.
Informal evaluations can result in faulty or wise judgments. But, they are characterized by an absence of breadth and depth because they lack systematic procedures and formally collected evidence. As humans, we are limited in making judgments both by the lack of opportunity to observe many different settings, clients, or students and by our own past experience, which both informs and biases our judgments. Informal evaluation does not occur in a vacuum. Experience, instinct, generalization, and reasoning can all influence the outcome of informal evaluations, and any or all of these may be the basis for sound, or faulty, judgments. Did we see the teacher on a good day or a bad one? How did our past experience with similar students, course content, and methods influence our judgment? When we conduct informal evaluations, we are less cognizant of these limitations. However, when formal evaluations are not possible, informal evaluation carried out by knowledgeable, experienced, and fair people can be very useful indeed. It would be unrealistic to think any individual, group, or organization could formally evaluate everything it does. Often informal evaluation is the only
practical approach. (In choosing an entrée from a dinner menu, only the most compulsive individual would conduct exit interviews with restaurant patrons to gather data to guide that choice.)
Informal and formal evaluation, however, form a continuum. Schwandt (2001a) acknowledges the importance and value of everyday judgments and argues that evaluation is not simply about methods and rules. He sees the evaluator as helping practitioners to “cultivate critical intelligence.” Evaluation, he notes, forms a middle ground “between overreliance on and over-application of method, general principles, and rules to making sense of ordinary life on one hand, and advocating trust in personal inspiration and sheer intuition on the other” (p. 86). Mark, Henry, and Julnes (2000) echo this concept when they describe evaluation as a form of assisted sense-making. Evaluation, they observe, “has been developed to assist and extend natural human abilities to observe, understand, and make judgments about policies, programs, and other objects in evaluation” (p. 179).
Evaluation, then, is a basic form of human behavior. Sometimes it is thorough, structured, and formal. More often it is impressionistic and private. Our focus is on the more formal, structured, and public evaluation. We want to inform readers of various approaches and methods for developing criteria and collecting information about alternatives. For those readers who aspire to become professional evaluators, we will be introducing you to the approaches and methods used in these formal studies. For all readers, practitioners and evaluators, we hope to cultivate that critical intelligence, to make you cognizant of the factors influencing your more informal judgments and decisions.
A Brief Definition of Evaluation and Other Key Terms
In the previous section, the perceptive reader will have noticed that the term “evaluation” has been used rather broadly without definition beyond what was implicit in context. But the rest of this chapter could be rather confusing if we did not stop briefly to define the term more precisely. Intuitively, it may not seem difficult to define evaluation. For example, one typical dictionary definition of evaluation is “to determine or fix the value of: to examine and judge.” Seems quite straightforward, doesn’t it? Yet among professional evaluators, there is no uniformly agreed-upon definition of precisely what the term “evaluation” means. In fact, Michael Scriven, one of the founders of evaluation, recently noted in an essay on the use of language in evaluation that there are nearly 60 different terms for evaluation that apply to one context or another. These include adjudge, appraise, analyze, assess, critique, examine, grade, inspect, judge, rate, rank, review, score, study, test, and so on (cited in Patton, 2000, p. 7). While all these terms may appear confusing, Scriven notes that the variety of uses of the term evaluation “reflects not only the immense importance of the process of evaluation in practical life, but the explosion of a new area of study” (cited in Patton, 2000, p. 7). This chapter will introduce the reader
to the array of variations in application, but, at this point, we will focus on one definition that encompasses many others.
Early in the development of the field, Scriven (1967) defined evaluation as judging the worth or merit of something. Many recent definitions encompass this original definition of the term (Mark, Henry, & Julnes, 2000; Schwandt, 2008; Scriven, 1991a; Stake, 2000a; Stufflebeam, 2001b). We concur that evaluation is determining the worth or merit of an evaluation object (whatever is evaluated). More broadly, we define evaluation as the identification, clarification, and application of defensible criteria to determine an evaluation object’s value (worth or merit) in relation to those criteria. Note that this definition requires identifying and clarifying defensible criteria. Often, in practice, our judgments of evaluation objects differ because we have failed to identify and clarify the means that we, as individuals, use to judge an object. One educator may value a reading curriculum because of the love it instills for reading; another may disparage the program because it does not move the child along as rapidly as other curricula in helping the student to recognize and interpret letters, words, or meaning. These educators differ in the value they assign to the curriculum because their criteria differ. One important role of an evaluator is to help stakeholders articulate their criteria and to stimulate dialogue about them. Our definition, then, emphasizes using those criteria to judge the merit or worth of the product.
Evaluation uses inquiry and judgment methods, including: (1) determining the criteria and standards for judging quality and deciding whether those standards should be relative or absolute, (2) collecting relevant information, and (3) applying the standards to determine value, quality, utility, effectiveness, or significance. It leads to recommendations intended to optimize the evaluation object in relation to its intended purpose(s) or to help stakeholders determine whether the evaluation object is worthy of adoption, continuation, or expansion.
Programs, Policies, and Products
In the United States, we often use the term “program evaluation.” In Europe and some other countries, however, evaluators often use the term “policy evaluation.” This book is concerned with the evaluation of programs, policies, and products. We are not, however, concerned with evaluating personnel or the performance of individual people or employees. That is a different area, one more concerned with management and personnel.¹ (See Joint Committee, 1988.) But, at this point, it would be useful to briefly discuss what we mean by programs, policies, and products. “Program” is a term that can be defined in many ways. In its simplest sense, a program is a “standing arrangement that provides for a . . . service” (Cronbach et al., 1980, p. 14). The Joint Committee on Standards for Educational Evaluation (1994) defined program simply as “activities that are provided on a continuing basis” (p. 3). In their
¹ The Joint Committee on Standards for Educational Evaluation has developed some standards for personnel evaluation that may be of interest to readers involved in evaluating the performance of teachers or other employees working in educational settings. These can be found at http://www.eval.org/evaluationdocuments/perseval.html.
new edition of the Standards (2010), the Joint Committee noted that a program is much more than a set of activities. They write:
Defined completely, a program is
• A set of planned systematic activities
• Using managed resources
• To achieve specified goals
• Related to specific needs
• Of specific, identified, participating human individuals or groups
• In specific contexts
• Resulting in documentable outputs, outcomes and impacts
• Following assumed (explicit or implicit) systems of beliefs (diagnostic, causal, intervention, and implementation theories about how the program works)
• With specific, investigable costs and benefits. (Joint Committee, 2010, in press)
Note that their newer definition emphasizes programs achieving goals related to particular needs and the fact that programs are based on certain theories or assumptions. We will talk more about this later when we discuss program theory. We will simply summarize by saying that a program is an ongoing, planned intervention that seeks to achieve some particular outcome(s), in response to some perceived educational, social, or commercial problem. It typically includes a complex of people, organization, management, and resources to deliver the intervention or services.
In contrast, the word “policy” generally refers to a broader act of a public organization or a branch of government. Organizations have policies—policies about recruiting and hiring employees, policies about compensation, policies concerning interactions with media and the clients or customers served by the organization. But, government bodies—legislatures, departments, executives, and others—also pass or develop policies. It might be a law or a regulation. Evaluators often conduct studies to judge the effectiveness of those policies just as they conduct studies to evaluate programs. Sometimes, the line between a program and a policy is quite blurred. Like a program, a policy is designed to achieve some outcome or change, but, unlike a program, a policy does not provide a service or activity. Instead, it provides guidelines, regulations, or the like to achieve a change. Those who study public policy define policy even more broadly: “public policy is the sum of government activities, whether acting directly or through agents, as it has an influence on the life of citizens” (Peters, 1999, p. 4). Policy analysts study the effectiveness of public policies just as evaluators study the effectiveness of government programs. Sometimes, their work overlaps. What one person calls a policy, another might call a program. In practice, in the United States, policy analysts tend to be trained in political science and economics, and evaluators tend to be trained in psychology, sociology, education, and public administration. As the field of evaluation expands and clients want more information on government programs, evaluators study the effectiveness of programs and policies.
Finally, a “product” is a more concrete entity than either a policy or a program. It may be a textbook such as the one you are reading. It may be a piece of software. Scriven defines a product very broadly to refer to the output of something. Thus, a product could be a student or a person who received training, the
work of a student, or a curriculum, which is “the product of a research and development effort” (1991a, p. 280).
Stakeholders
Another term used frequently in evaluation is “stakeholders.” Stakeholders are various individuals and groups who have a direct interest in and may be affected by the program being evaluated or the evaluation’s results. In the Encyclopedia of Evaluation, Greene (2005) identifies four types of stakeholders:
(a) People who have authority over the program including funders, policy makers, advisory boards;
(b) People who have direct responsibility for the program including program developers, administrators, managers, and staff delivering the program;
(c) People who are the intended beneficiaries of the program, their families, and their communities; and
(d) People who are damaged or disadvantaged by the program (those who lose fund- ing or are not served because of the program). (pp. 397–398)
Scriven (2007) has grouped stakeholders based on how they are impacted by the program, and he includes more groups, often political groups, than does Greene. Thus, “upstream impactees” refer to taxpayers, political supporters, funders, and those who make policies that affect the program. “Midstream impactees,” also called primary stakeholders by Alkin (1991), are program managers and staff. “Downstream impactees” are those who receive the services or products of the program.
All of these groups hold a stake in the future direction of that program even though they are sometimes unaware of their stake. Evaluators typically involve at least some stakeholders in the planning and conduct of the evaluation. Their participation can help the evaluator to better understand the program and the information needs of those who will use it.
Differences in Evaluation and Research
It is important to distinguish between evaluation and research, because these differences help us to understand the distinctive nature of evaluation. While some methods of evaluation emerged from social science research traditions, there are important distinctions between evaluation and research. One of those distinctions is purpose. Research and evaluation seek different ends. The primary purpose of research is to add to knowledge in a field, to contribute to the growth of theory. A good research study is intended to advance knowledge. While the results of an evaluation study may contribute to knowledge development (Mark, Henry, & Julnes, 2000), that is a secondary concern in evaluation. Evaluation’s primary purpose is to provide useful information to those who hold a stake in whatever is being evaluated (stakeholders), often helping them to make a judgment or decision.
Research seeks conclusions; evaluation leads to judgments. Valuing is the sine qua non of evaluation. A touchstone for discriminating between an evaluator and a researcher is to ask whether the inquiry being conducted would be regarded as a failure if it produced no data on the value of the thing being studied. A researcher answering strictly as a researcher will probably say no.
These differing purposes have implications for the approaches one takes. Research is the quest for laws and the development of theory—statements of relationships among two or more variables. Thus, the purpose of research is typically to explore and establish causal relationships. Evaluation, instead, seeks to examine and describe a particular thing and, ultimately, to consider its value. Sometimes, describing that thing involves examining causal relationships; often, it does not. Whether the evaluation focuses on a causal issue depends on the information needs of the stakeholders.
This highlights another difference in evaluation and research—who sets the agenda. In research, the hypotheses to be investigated are chosen by the researcher based on the researcher’s assessment of the appropriate next steps in developing theory in the discipline or field of knowledge. In evaluation, the questions to be answered are not those of the evaluator, but rather come from many sources, including those of significant stakeholders. An evaluator might suggest questions, but would never determine the focus of the study without consultation with stakeholders. Such actions, in fact, would be unethical in evaluation. Unlike research, good evaluation always involves the inclusion of stakeholders—often a wide variety of stakeholders—in the planning and conduct of the evaluation for many reasons: to ensure that the evaluation addresses the needs of stakeholders, to improve the validity of results, and to enhance use.
Another difference between evaluation and research concerns generalizability of results. Given evaluation’s purpose of making judgments about a particular thing, good evaluation is quite specific to the context in which the evaluation object rests. Stakeholders are making judgments about a particular evaluation object, a program or a policy, and are not as concerned with generalizing to other settings as researchers would be. In fact, the evaluator should be concerned with the particulars of that setting, with noting them and attending to the factors that are relevant to program success or failure in that setting. (Note that the setting or context may be a large, national program with many sites, or a small program in one school.) In contrast, because the purpose of research is to add to general knowledge, the methods are often designed to maximize generalizability to many different settings.
As suggested previously, another difference between research and evaluation concerns the intended use of their results. Later in the book, we will discuss the many different types of use that may occur in evaluation, but, ultimately, evaluation is intended to have some relatively immediate impact. That impact may be on immediate decisions, on decisions in the not-too-distant future, or on perspectives that one or more stakeholder groups or stakeholders have about the object of the evaluation or evaluation itself. Whatever the impact, the evaluation is designed to be used. Good research may or may not be used right away. In fact, research that adds in important ways to some theory may not be immediately noticed, and
connections to a theory may not be made until some years after the research is conducted.² Nevertheless, the research stands alone as good research if it meets the standards for research in that discipline or field. If one’s findings are to add to knowledge in a field, ideally, the results should transcend the particulars of time and setting.
Thus, research and evaluation differ in the standards used to judge their adequacy (Mathison, 2007). Two important criteria for judging the adequacy of research are internal validity, the study’s success at establishing causality, and external validity, the study’s generalizability to other settings and other times. These criteria, however, are not sufficient, or appropriate, for judging the quality of an evaluation. As noted previously, generalizability, or external validity, is less important for an evaluation because the focus is on the specific characteristics of the program or policy being evaluated. Instead, evaluations are typically judged by their accuracy (the extent to which the information obtained is an accurate reflection of—a one-to-one correspondence with—reality), utility (the extent to which the results serve the practical information needs of intended users), feasibility (the extent to which the evaluation is realistic, prudent, diplomatic, and frugal), and propriety (the extent to which the evaluation is done legally and ethically, protecting the rights of those involved). These standards and a new standard concerning evaluation accountability were developed by the Joint Committee on Standards for Educational Evaluation to help both users of evaluation and evaluators themselves to understand what evaluations should do (Joint Committee, 2010). (See Chapter 3 for more on the Standards.)
Researchers and evaluators also differ in the knowledge and skills required to perform their work. Researchers are trained in depth in a single discipline—their field of inquiry. This approach is appropriate because a researcher’s work, in almost all cases, will remain within a single discipline or field. The methods he or she uses will remain relatively constant, as compared with the methods that evaluators use, because a researcher’s focus remains on similar problems that lend themselves to certain methods of study. Evaluators, by contrast, are evaluating many different types of programs or policies and are responding to the needs of clients and stakeholders with many different information needs. Therefore, evaluators’ methodological training must be broad and their focus may transcend several disciplines. Their education must help them to become sensitive to the wide range of phenomena to which they must attend if they are to properly assess the worth of a program or policy. Evaluators must be broadly familiar with a wide variety of methods and techniques so they can choose those most appropriate for the particular program and the needs of its stakeholders. In addition, evaluation has developed some of its own specific methods, such as using logic models to understand program theory and metaevaluation. Mathison writes that “evaluation as a practice shamelessly borrows from all disciplines and ways of thinking to get at both facts and values” (2007, p. 20). Her statement illustrates both the methodological breadth required of an evaluator and
² A notable example concerns Darwin’s work on evolution. Elements of his book, On the Origin of Species, were rejected by scientists some years ago and are only recently being reconsidered as new research suggests that some of these elements were correct. Thus, research conducted more than 100 years ago emerges as useful because new techniques and discoveries prompt scientists to reconsider the findings.
the fact that evaluators’ methods must serve the purpose of valuing or establishing merit and worth, as well as establishing facts.
Finally, evaluators differ from researchers in that they must establish personal working relationships with clients. As a result, studies of the competencies required of evaluators often cite the need for training in interpersonal and communication skills (Fitzpatrick, 1994; King, Stevahn, Ghere, & Minnema, 2001; Stufflebeam & Wingate, 2005).
In summary, research and evaluation differ in their purposes and, as a result, in the roles of the evaluator and researcher in their work, their preparation, and the criteria used to judge the work. (See Table 1.1 for a summary of these differences.) These distinctions lead to many differences in the manner in which research and evaluation are conducted.
Of course, evaluation and research sometimes overlap. An evaluation study may add to our knowledge of laws or theories in a discipline. Research can inform our judgments and decisions regarding a program or policy. Yet, fundamental distinctions remain. Our earlier discussion highlights these differences to help those who are new to evaluation to see the ways in which evaluators behave differently than researchers. Evaluations may add to knowledge in a field, contribute to theory development, establish causal relationships, and provide explanations for the relationship between phenomena, but that is not evaluation’s primary purpose. Its primary purpose is to assist stakeholders in making value judgments and decisions about whatever is being evaluated.
TABLE 1.1 Differences in Research and Evaluation

Factor | Research | Evaluation
Purpose | Add to knowledge in a field, develop laws and theories | Make judgments, provide information for decision making
Who sets the agenda or focus? | Researchers | Stakeholders and evaluator jointly
Generalizability of results | Important to add to theory | Less important, focus is on particulars of program or policy and context
Intended use of results | Not important | An important standard
Criteria to judge adequacy | Internal and external validity | Accuracy, utility, feasibility, propriety, evaluation accountability
Preparation of those who work in area | Depth in subject matter, fewer methodological tools and approaches | Interdisciplinary, many methodological tools, interpersonal skills

Action Research

A different type of research altogether is action research. Action research, originally conceptualized by Kurt Lewin (1946) and more recently developed by Emily Calhoun (1994, 2002), is research conducted collaboratively by professionals to improve their practice. Such professionals might be social workers, teachers, or accountants who are using research methods and means of thinking to develop their practice. As Elliott (2005) notes, action research always has a developmental aim. Calhoun, who writes of action research in the context of education, gives examples of teachers working together to conceptualize their focus; to collect, analyze, and interpret data on the issue; and to make decisions about how to improve their practice as teachers and/or a program or curriculum they are implementing. The data collection processes may overlap with program evaluation activities, but there are key differences: Action research is conducted by professionals about their own work with a goal of improving their practice. Action research is also considered to be a strategy to change the culture of organizations to one in which professionals work collaboratively to learn, examine, and research their own practices. Thus, action research produces information akin to that in formative evaluations—information to be used for program improvement. The research is conducted by those delivering the program and, in addition to improving the element under study, has major goals concerning professional development and organizational change.
The Purposes of Evaluation
Consistent with our earlier definition of evaluation, we believe that the primary purpose of evaluation is to render judgments about the value of whatever is being evaluated. This view parallels that of Scriven (1967), who was one of the earliest to outline the purpose of formal evaluation. In his seminal paper, "The Methodology of Evaluation," he argued that evaluation has a single goal or purpose: to determine the worth or merit of whatever is evaluated. In more recent writings, Scriven has continued to emphasize that the primary purpose of evaluation is to judge the merit or worth of an object (Scriven, 1996).
Yet, as evaluation has grown and evolved, other purposes have emerged. A discussion of these purposes sheds light on the practice of evaluation in today's world. For the reader new to evaluation, they illustrate the many facets of evaluation and its uses. Although we agree with Scriven's historical emphasis on evaluation's purpose of judging the merit or worth of a program, policy, process, or product, we see other purposes at play as well.
Some years ago, Talmage (1982) argued that an important purpose of evaluation was "to assist decision makers responsible for making policy" (p. 594). And, in fact, providing information that will improve the quality of decisions made by policymakers continues to be a major purpose of program evaluation. Indeed, the rationale given for collecting much evaluation data today—by schools, by state and local governments, by the federal government, and by nonprofit organizations—is to help policymakers in these organizations make decisions about whether to continue programs, to initiate new programs, or to change the funding or structure of a program in other major ways. In addition to decisions made by policymakers, evaluation is intended to inform the decisions of many others, including program managers (principals, department heads), program staff (teachers, counselors,
health care providers, and others delivering the services offered by a program), and program consumers (clients, parents, citizens). A group of teachers may use evaluations of student performance to make decisions on program curricula or materials. Parents make decisions concerning where to send their children to school based on information on school performance. Students choose institutions of higher education based on evaluative information. The evaluative information or data provided may or may not be the most useful for making a particular decision, but, nevertheless, evaluation clearly serves this purpose.
For many years, evaluation has been used for program improvement. As we will discuss later in this chapter, Michael Scriven long ago identified program improvement as one of the roles of evaluation, though he saw that role being achieved through the initial purpose of judging merit and worth. Today, many see organizational and program improvement as a major, direct purpose of evaluation (Mark, Henry, & Julnes, 2000; Patton, 2008a; Preskill & Torres, 1998).
Program managers or those who deliver a program can make changes to improve the program based on the evaluation results. In fact, this is one of the most frequent uses of evaluation. There are many such examples: teachers using the results of student assessments to revise their curricula or pedagogical methods, health care providers using evaluations of patients' use of medication to revise their means of communicating with patients about dosage and use, and trainers using feedback from trainees to change training to improve its application on the job. These are all ways that evaluation serves the purpose of program improvement.
Today, many evaluators see evaluation being used for program and organizational improvement in new ways. As we will describe in later chapters, Michael Patton often works today in what he calls "developmental evaluation," assisting organizations that do not have specific, measurable goals but, instead, need evaluation to help them with ongoing progress, adaptation, and learning (Patton, 1994, 2005b). Hallie Preskill (Preskill, 2008; Preskill & Torres, 2000) and others (King, 2002; Baker & Bruner, 2006) have written about the role of evaluation in improving overall organizational performance by instilling new ways of thinking. In itself, the process of participating in an evaluation can begin to influence the ways that those who work in the organization approach problems. For example, an evaluation that involves employees in developing a logic model for the program to be evaluated, or in examining data to draw conclusions about program progress, may prompt those employees to use such procedures and ways of approaching problems in the future and, thus, lead to organizational improvement.
The purpose of program or organizational improvement, of course, overlaps with others. When an evaluation is designed for program improvement, the evaluator must consider the decisions that those managing and delivering the program will make in using the study's results. So the evaluation serves both decision making and program improvement. We will not split hairs to distinguish between the two purposes, but will simply acknowledge that evaluation can serve both. Our goal is to expand your view of the various purposes for evaluation and to help you consider the purpose in your own situation or organization.
Some recent discussions of the purposes of evaluation move beyond these more immediate purposes to evaluation's ultimate impact on society. Some evaluators point out that one important purpose of evaluation is helping give voice to groups who are not often heard in policy making or program planning. Thus, House and Howe (1999) argue that the goal of evaluation is to foster deliberative democracy. They encourage the evaluator to work to help less powerful stakeholders gain a voice and to stimulate dialogue among stakeholders in a democratic fashion. Others highlight the role of the evaluator in helping bring about greater social justice and equality. Greene, for example, notes that values inevitably influence the practice of evaluation and, therefore, evaluators can never remain neutral. Instead, they should recognize the diversity of values that arise in an evaluation and work to achieve the desirable values of social justice and equity (Greene, 2006).
Carol Weiss (1998b) and Gary Henry (2000) have argued that the purpose of evaluation is to bring about social betterment. Mark, Henry, and Julnes (2000) define achieving social betterment as "the alleviation of social problems, meeting of human needs" (p. 190). And, in fact, evaluation's purpose of social betterment is at least partly reflected in the Guiding Principles, or ethical code, adopted by the American Evaluation Association. One of those principles concerns the evaluator's responsibilities for the general and public welfare. Specifically, Principle E5 states the following:
Evaluators have obligations that encompass the public interest and good. Because the public interest and good are rarely the same as the interests of any particular group (including those of the client or funder) evaluators will usually have to go beyond analysis of particular stakeholder interests and consider the welfare of society as a whole. (American Evaluation Association, 2004)
This principle has been the subject of more discussion among evaluators than other principles, and deservedly so. Nevertheless, it illustrates one important purpose of evaluation. Evaluations are concerned with programs and policies that are intended to improve society. Their results provide information on the choices that policymakers, program managers, and others make in regard to these programs. As a result, evaluators must be concerned with their purposes in achieving social betterment. Writing in 1997 about the coming twenty-first century, Chelimsky and Shadish emphasized the global perspective of evaluation in achieving social betterment, extending evaluation's context in the new century to worldwide challenges. These include new technologies, demographic imbalances across nations, environmental protection, sustainable development, terrorism, human rights, and other issues that extend beyond one program or even one country (Chelimsky & Shadish, 1997).
Finally, many evaluators continue to acknowledge the purpose of evaluation in extending knowledge (Donaldson, 2007; Mark, Henry, & Julnes, 2000). Although adding to knowledge is the primary purpose of research, evaluation studies can add to our knowledge of social science theories and laws. They provide an opportunity to test existing theories or laws in real-world settings, examining whether those theories hold true in new
settings and with different groups. Programs or policies are often, though certainly not always, based on some theory or social science principles.3 Evaluations provide the opportunity to test those theories. Evaluations collect many kinds of information that can add to our knowledge: information describing client groups or problems, information on the causes or consequences of problems, and tests of theories concerning impact. For example, Debra Rog conducted an evaluation of a large intervention program to help homeless families in the early 1990s (Rog, 1994; Rog, Holupka, McCombs-Thornton, Brito, & Hambrick, 1997). At the time, not much was known about homeless families, and some of the initial assumptions made in planning were incorrect. Rog adapted her evaluation design to learn more about the circumstances of homeless families. Her results helped to better plan the program, but also added to our knowledge about homeless families, their health needs, and their circumstances. In our discussion of the differences between research and evaluation, we emphasized that the primary purpose of research is to add to knowledge in a field and that this is not the primary purpose of evaluation. We continue to maintain that distinction. However, the results of some evaluations can add to our knowledge of social science theories and laws. This is not a primary purpose, but simply one purpose that an evaluation may serve.

3. The term "evidence-based practice" emerges from the view that programs should be designed around social science research findings when basic research, applied research, or evaluation studies have found that a given program practice or action leads to the desired, intended outcomes.
In closing, we see that evaluation serves many different purposes. Its primary purpose is to determine merit or worth, but it serves many other valuable purposes as well. These include assisting in decision making; improving programs, organizations, and society as a whole; enhancing democracy by giving voice to those with less power; and adding to our base of knowledge.
Roles and Activities of Professional Evaluators
Evaluators as practitioners play numerous roles and conduct multiple activities in performing evaluation. Just as discussions of the purposes of evaluation help us to better understand what we mean by determining merit and worth, a brief discussion of the roles and activities that evaluators pursue will acquaint the reader with the full scope of work that professionals in the field undertake.
A major role of the evaluator, one that many in the field emphasize, is encouraging the use of evaluation results (Patton, 2008a; Shadish, 1994). While the means for encouraging use and the anticipated type of use may differ, considering how results will be used is central to the evaluator's work. In Chapter 17, we will discuss the different types of use that have been identified for evaluation and various means for increasing that use. Henry (2000), however, has cautioned that focusing primarily on use can lead to evaluations focused solely on program and organizational improvement and, ultimately, avoiding final decisions about merit and worth. His concern is appropriate; however, if the audience for the evaluation
is one that is making decisions about the program’s merit and worth, this problem may be avoided. (See discussion of formative and summative evaluation in this chapter.) Use is certainly central to evaluation, as demonstrated by the prominent role it plays in the professional standards and codes of evaluation. (See Chapter 3.)
Others' discussions of the role of the evaluator illuminate the ways in which evaluators might interact with stakeholders and other users. Rallis and Rossman (2000) see the role of the evaluator as that of a critical friend. They view the primary purpose of evaluation as learning and argue that, for learning to occur, the evaluator has to be a trusted person, "someone the emperor knows and can listen to. She is more friend than judge, although she is not afraid to offer judgments" (p. 83). Schwandt (2001a) describes the evaluator in the role of a teacher, helping practitioners develop critical judgment. Patton (2008a) envisions evaluators in many different roles, including facilitator, collaborator, teacher, management consultant, organizational development (OD) specialist, and social-change agent. These roles reflect his approach to working with organizations to bring about developmental change. Preskill and Torres (1998) stress the role of the evaluator in bringing about organizational learning and instilling a learning environment. Mertens (1999), Chelimsky (1998), and Greene (1997) emphasize the importance of including stakeholders who have often been ignored in evaluations. House and Howe (1999) argue that a critical role of the evaluator is stimulating dialogue among various groups. The evaluator does not merely report information, or provide it to a limited or designated key stakeholder who may be most likely to use it, but instead stimulates dialogue, often bringing in disenfranchised groups to encourage democratic decision making.
Evaluators also have a role in program planning. Bickman (2002), Chen (1990), and Donaldson (2007) emphasize the important role that evaluators play in helping articulate program theories or logic models. Wholey (1996) argues that a critical role for evaluators in performance measurement is helping policymakers and managers select the performance dimensions to be measured as well as the tools to use in measuring those dimensions.
Certainly, too, evaluators can play the role of the scientific expert. As Lipsey (2000) notes, practitioners want and often need evaluators with the "expertise to track things down, systematically observe and measure them, and compare, analyze, and interpret with a good faith attempt at objectivity" (p. 222). Evaluation emerged from social science research. While we will describe the growth and emergence of new approaches and paradigms, and the role of evaluators in educating users about our purposes, stakeholders typically contract with evaluators to provide technical or "scientific" expertise and/or an outside "objective" opinion. Evaluators can occasionally play an important role in making program stakeholders aware of research on other, similar programs. Sometimes, the people managing or operating programs, or the people making legislative or policy decisions about programs, are so busy fulfilling their primary responsibilities that they are not aware of other programs or agencies doing similar things and the research conducted on those activities. Evaluators, who typically explore existing research on similar programs to identify potential designs and measures, can play the role
of scientific expert in making stakeholders aware of research. (See, for example, Fitzpatrick and Bledsoe [2007] for a discussion of Bledsoe’s role in informing stakeholders of existing research on other programs.)
Thus, the evaluator takes on many roles. In noting the tension between advocacy and neutrality, Weiss (1998b) writes that the role(s) evaluators play will depend heavily on the context of the evaluation. The evaluator may serve as a teacher or critical friend in an evaluation designed to improve the early stages of a new reading program. The evaluator may act as a facilitator or collaborator with a community group appointed to explore solutions to problems of unemployment in the region. In conducting an evaluation on the employability of new immigrant groups in a state, the evaluator may act to stimulate dialogue among immigrants, policymakers, and nonimmigrant groups competing for employment. Finally, the evaluator may serve as an outside expert in designing and conducting a study for Congress on the effectiveness of annual testing in improving student learning.
In carrying out these roles, evaluators undertake many activities. These include negotiating with stakeholder groups to define the purpose of the evaluation, developing contracts, hiring and overseeing staff, managing budgets, identifying disenfranchised or underrepresented groups, working with advisory panels, collecting, analyzing, and interpreting qualitative and quantitative information, communicating frequently with various stakeholders to seek input into the evaluation and to report results, writing reports, considering effective ways to disseminate information, meeting with the press and other representatives to report on progress and results, and recruiting others to evaluate the evaluation (metaevaluation). These, and many other activities, constitute the work of evaluators. Today, in many organizations, that work might be conducted by people who are formally trained and educated as evaluators, attend professional conferences, read widely in the field, and identify their professional role as that of an evaluator, or by staff who have many other responsibilities—some managerial, some working directly with students or clients—but with some evaluation tasks thrown into the mix. Each of these will assume some of the roles described previously and will conduct many of the tasks listed.
Uses and Objects of Evaluation
At this point, it might be useful to describe some of the ways in which evaluation can be used. An exhaustive list would be prohibitively long, filling the rest of this book and more. Here we provide only a few representative examples of the uses made of evaluation in selected sectors of society.
Examples of Evaluation Use in Education

1. To empower teachers to have more say in how school budgets are allocated
2. To judge the quality of school curricula in specific content areas
3. To accredit schools that meet or exceed minimum accreditation standards
4. To determine the value of a middle school's block scheduling
5. To satisfy an external funding agency's demands for reports on effectiveness of school programs it supports
6. To assist parents and students in selecting schools in a district with school choice
7. To help teachers improve their reading program to encourage more voluntary reading
Examples of Evaluation Use in Other Public and Nonprofit Sectors

1. To decide whether to expand an urban transit program and where it should be expanded
2. To establish the value of a job training program
3. To decide whether to modify a low-cost housing project's rental policies
4. To improve a recruitment program for blood donors
5. To determine the impact of a prison's early-release program on recidivism
6. To gauge community reaction to proposed fire-burning restrictions to improve air quality
7. To determine the effect of an outreach program on the immunization of infants and children
Examples of Evaluation Use in Business and Industry

1. To improve a commercial product
2. To judge the effectiveness of a corporate training program on teamwork
3. To determine the effect of a new flextime policy on productivity, recruitment, and retention
4. To identify the contributions of specific programs to corporate profits
5. To determine the public's perception of a corporation's environmental image
6. To recommend ways to improve retention among younger employees
7. To study the quality of performance appraisal feedback
One additional comment about the use of evaluation in business and industry may be warranted. Evaluators unfamiliar with the private sector are sometimes unaware that personnel evaluation is not the only use made of evaluation in business and industry settings. Perhaps that is because the term "evaluation" has been absent from the descriptors for many corporate activities and programs that, when examined, are decidedly evaluative. Activities labeled as quality assurance, quality control, research and development, Total Quality Management (TQM), or Continuous Quality Improvement (CQI) turn out, on closer inspection, to possess many characteristics of program evaluation.
Uses of Evaluation Are Generally Applicable
As should be obvious by now, evaluation methods are clearly portable from one arena to another. The use of evaluation may remain constant, but the entity it is applied to—that is, the object of the evaluation—may vary widely. Thus, evaluation
may be used to improve a commercial product, a community training program, or a school district’s student assessment system. It could be used to build organizational capacity in the Xerox Corporation, the E. F. Lilly Foundation, the Minnesota Department of Education, or the Utah Division of Family Services. Evaluation can be used to empower parents in the San Juan County Migrant Education Program, workers in the U.S. Postal Service, employees of Barclays Bank of England, or residents in east Los Angeles. Evaluation can be used to provide information for decisions about programs in vocational education centers, community mental health clinics, university medical schools, or county cooperative extension offices. Such examples could be multiplied ad infinitum, but these should suffice to make our point.