blame, for the fact that, by my count, over 200 sessions in the 1973 AERA Annual Meeting program were directly related to the methods and results of program evaluation studies.
There were two primary models for program evaluation in 1965, and there are two today. One is the informal study, perhaps a self-study, usually using information already available and relying on the insights of professional persons and respected authorities. It is the approach of regional accrediting associations for secondary schools and colleges in the United States, and is exemplified by the Flexner report (1910) on medical education in the USA and by the Coleman report (1966) on equality of educational opportunity. On the sheet you received with your background reading materials, the one entitled Prototypes of Curriculum Evaluation, I have ever so briefly described this and other models; this one is referred to there as the School Accreditation Model. Most educators are partial to this evaluation model, more so if they can specify who the panel members or examiners are. Researchers do not like it because it relies so much on second-hand information. But there is much good about the model.
Most researchers have preferred the other model, the pretest/posttest model, what I have referred to on the prototype sheet as Ralph Tyler's model. It often uses prespecified statements of behavioral objectives, such as are available from Jim Popham's Instructional Objectives Exchange, and is nicely represented by Tyler's Eight-Year Study, Husén's International Education Study, and the National Assessment of Educational Progress. The focus of attention with this model is primarily on student performance.
Several of us have proposed other models. In a 1963 article, Cronbach expressed his preference for having evaluation studies considered applied research on instruction, to learn what could be learned in general about curriculum development, as was done in Hilda Taba's Social Studies Curriculum Project. Mike Scriven strongly criticized Cronbach's choice in AERA Monograph No. 1, stating that it was time to give consumers (purchasing agents, taxpayers, and parents) information on how good each existing curriculum is. To this end, Kenneth Komoski established in New York City an Educational Products Information Exchange, which has reviewed equipment, books, and teaching aids but has to this day still not caught the buyer's eye.
Dan Stufflebeam was one who recognized that the designs preferred by researchers did not focus on the variables that educational administrators have control over. With support from Egon Guba, Dave Clark, Bill Gephart, and others, he proposed a model for evaluation that emphasized the particular decisions that a program manager will face. Data-gathering would include data on Context, Input, Process, and Product; but analysis would relate those things to the immediate management of the program. Though Mike Scriven criticized this design too, saying that it had too much bias toward the concerns and the values of the education establishment, this Stufflebeam CIPP model was popular in the U.S. Office of Education for several years. Gradually it fell into disfavor because it was not generating the information, or the protection, that program sponsors and directors needed. But that occurred, I think, not because it was a bad model, but partly because managers were unable or unwilling to examine their own operations as part of the evaluation. Actually, no evaluation model could have succeeded. A major obstacle was a federal directive which said that no federal office could spend its funds to evaluate its own work; that could only be done by an office higher up. Perhaps the best examples of evaluation reports following this approach are those done in the Pittsburgh schools by Mal Provus and Esther Kresh.
Before I describe the approach that I have been working on, which I hope will someday challenge the two major models, I will mention several relatively recent developments in the evaluation business.
It is recognized, particularly by Mike Scriven and Ernie House, that co-option is a problem, that the rewards to an evaluator for producing a favorable evaluation report often greatly outweigh the rewards for producing an unfavorable report. I do not know of any evaluators who falsify their reports, but I do know many who consciously or unconsciously choose to