AN HONEST APPRAISAL OF PEER REVIEW

Bert Gold bgold at itsa.ucsf.edu
Sat Jun 15 12:36:25 EST 1996




PAPER in FASEB Journal (1993) volume 7  pages 619-621

On giraffes and peer review

D. R. FORSDYKE


Department of Biochemistry, Queen's University, Kingston,
Ontario, Canada. K7L3N6


SUMMARY
For several decades grant applications in the biomedical sciences
have been assessed by peer review. However, the design of the
peer review system was based on past precedent rather than on
recognition that a novel approach was needed. Flaws in system
design have been exposed by funding cut-backs. As a result the
research community is being torn apart.


VACANT: one ecological niche. WANTED: an animal that can run like
a horse, but can also nibble the most juicy leaves at the tops of
trees. If you had to design such a beast from scratch, you would
probably end up drawing a horse-like quadruped with a long neck.
You would figure that the animal should be able to hear predators
and alarm calls and you would equip it with well-hooded ears.
Since it would receive alarm calls, it should also be able to
send them. So you would equip it with a larynx. You would then
pencil in a nerve running from the brain to the larynx, a
distance of perhaps 20 cm.  When checking your design against the
real world, you would find a great similarity to the giraffe.
However, the nerve to the larynx is actually several meters in
length! From the brain, it runs down the neck to the chest where
it loops round a major blood vessel and then returns up the neck
to the larynx.

                 Design by Revolution
  The reason for this strange peregrination is quite well
understood. In the course of evolution, tissues began moving
around taking their nerve and blood supplies with them. Some
tissues migrated forward to form structures in the neck; adjacent
tissue migrated into the chest. When this happened the "wires got
crossed".  A nerve got caught round a blood vessel. To solve the
problem either the blood vessel had to loop up into the neck and
then back to the chest, or the nerve had to loop down to the
chest and then back to the neck. The giraffe has not gone the way
of the dinosaurs because the length of its laryngeal nerve was
not critical for its survival. But millions of equally outrageous
evolutionary design flaws have resulted in early extinction for
the species concerned. Design by evolution is often very
inefficient. Design by evolution is always constrained by the
past. Sometimes, in human affairs, past intellectual baggage
hinders our ability to forge novel approaches. Problems which
require solution by revolution, rather than by evolution, are not
seen as such. The bold line drawn from the brain to the larynx of
your prototypic giraffe would be an example of "design by
revolution".

                   Present at the Creation
  The origins of the modern peer review system are murky (1, 2). It
seems that no one ever sat down and tried to design the system
from scratch. Rather, it evolved in a piecemeal fashion. Peer
review has been with us for several decades. Yet, as currently
practiced, it threatens the renaissance in the biological
sciences that began with Darwin and Mendel and gained fresh
impetus with the discovery of the structure of our genetic
material in the 1950s (1). Although historians may one day tell us
which committees and which individuals were responsible for
introducing the various aspects of the peer review process (3), it
is doubtful whether we will ever know and fully understand the
factors, conscious or unconscious, which guided their
deliberations. I here offer an explanation of how the peer review
system arose in the hope that any insight provided may hasten
reform.
  The system as we know it today was clearly discernible in the
late 1940s when the benefits to be derived from a large public
investment in biomedical research became readily apparent.
Briefly defined, the task was to devise a system for allocating
public funds so as to harness optimally the energy, enthusiasm
and expertise of a nation's biomedical workforce to the goal of
attaining solutions to problems such as cancer and heart
disease.
  The design of the system appears to have been evolutionary; it
was based conceptually on other systems with which the designers
were familiar and of which they approved. Prominent among these
would have been the education system. We may assume that the
designers had all been through the education system and that the
system had been kind to them. One feature of the education system
is that a limited resource, such as access to university, is
rationed out based on one's ability to pass examinations. The
designers were all very good at examinations.  A teacher had
taught them the dates of the Battle of Hastings and of the
American War of Independence. Subsequently there was a test. The
test was marked by the teacher who knew the correct dates. Then,
there was a ranking of the students based on the marks they had
received.  A comforting feature of the test was that, when
repeated with different sets of questions, the previous ranking
was closely approximated. Thus it was perceived as objective and
just. Personal attributes needed to fare well in the examination
system, such as the possession of a good memory and the ability
to work hard in an organized manner, are attributes required for
many complex tasks in modern society. The examination system
worked well in allocating rewards to those who could best benefit
from the further educational opportunities needed to prepare them
for such complex tasks.
  In gaining the approval of the education system, the designers
had come to accept a variety of its premises, which included: (i)
that if you want to select people with some attribute you make
them take a test, (ii) that all attributes are testable and (iii)
that tests are accurate predictors.
  So, in the late 1940s, there were a number of biomedical
researchers who, by surmounting various academic obstacles, had
won positions at universities and research institutes. It was
very natural to think of asking them to write a "test" (grant
application) stating what they wanted to do and why they wanted
to do it. They had all been very good at writing tests, so did
not demur. Then there was a stumbling block. Where was the
teacher who, knowing the right answers, would mark the papers?
Thus, peer review was born. The researchers would mark each
other's papers. The loss of the authority figure (teacher) gave
the process a democratic air, which may have made it easier to
sell to the politicians. Another selling point was the notion
that the researchers would be competing with each other. Perhaps
the "spur of competition" would drive the biomedical research
system as effectively as it appeared to drive the capitalist
economic system (4). Thus the designers would have drawn heavily on
analogies, not only with the educational system, but also with
the political and economic systems.

                 Cutbacks Reveal Flaws
  And so the process began. The grant applications were written
and duly marked. Funds were awarded to those who scored highly.
For many years, as long as adequate funds chased the pool of
talent, there were few complaints from the research community.
Progress was hailed by system administrators as a sign that all
was well. But a train moving ahead at 20 miles/hour sounds great
only if you do not know that trains are capable of much
greater speeds. Since the same peer review system, with minor
modifications, was adopted throughout the western world, there
were no adequate controls to allow one to determine whether the
system was better than any alternative.
  Then in the early 1970s came the crunch. For the first time (at
least in North America), there were insufficient funds to sustain
all the talented researchers (5-7). The administrators, muttering
among themselves about the invigorating effects of heightened
competition, responded by elevating the cut-off point below which
funds would not be given. Suddenly, a new selective gate had been
imposed. Being able at research was no longer a guarantee of
getting through. A new breed of scientist began to emerge: the
grantsmen, people whose skills lay not so much in doing good
science as in tuning into the perceptions of the peer group. (I
am generalising here. Fortunately a few precious individuals, we
all know who they are, escape such facile classification.)
  The new selective gate also influenced the choice of the peers
who would act as gate-keepers for the rest. There had always been
a tendency to choose the "best", as defined by being successful
at doing research (and hence getting funded), to act as peer-
reviewers. The grantsmen, by definition, were now the best and
these came to dominate the peer review process. So grantsmen were
being judged by grantsmen and their expertise lay, not in being
creative scientists, but in being able to tune in to the
perceptions of other grantsmen.
  In response to mounting unrest, in the mid 1970s the US
National Institutes of Health launched a national enquiry into
the peer review system under the chairmanship of Ruth
Kirschstein. Much was said by all interested constituencies. Of
course the grantsmen were delighted with the system. We are
excellent; the system judges us as excellent; therefore the
system must be excellent. In time a multivolume report appeared (8).
But the resulting changes were largely cosmetic. The
administrators shrugged. Sure, like democracy it's a terrible
system, but it's the best we have.
   The reasons why no change was forthcoming are not hard to
discern. By choosing to use all four limbs for locomotion the
ancestors of the giraffe had foreclosed the options of handling
tools or climbing trees. Likewise, three decades of nurturing the
development of procedures and forms (with such evocative titles
as PHS398 and MRCC11), had generated an entrenched bureaucracy.
Maintaining public confidence, and hence the flow of public
funds, was seen as critical. The virtues of peer review were
loudly proclaimed. The words "excellence" and "peer review" were
repeated together so often that mention of one came to imply the
other. To admit the possibility that the peer review process was
flawed might suggest to government the possibility of replacing
it with an alternative of its own design, which might be far
worse.
  And so through the 80s, as cut-backs deepened, the
administrators responded by raising the cut-off point higher and
higher. At competition after competition the guillotine came
down. Our universities and research institutes were awash with
academic blood (9). Reports of cases of scientific plagiarism and
fraud increased. The peer review system was described by Joshua
Lederberg as having become "viscous beyond imagination" (10) and by
Phillip Sharp as having taken on a "mask of madness" (11). Lewis
Thomas bewailed the fact that the increased competition was
decreasing collaboration and communication between researchers (12).
The administrators wrung their hands and mumbled that things
would be just perfect if there were just more money. The public
and the politicians responded as best they could, but the new
dollars went straight into the pockets of the grantsmen. The
administrators tried to improve collaboration by trumpeting new
forms of competition to encourage researchers to collaborate. The
grantsmen moved in. Grant applications arrived festooned with
appendices containing letters from prospective collaborators
(other grantsmen) all eulogising the qualities of the applicant
and swearing eternal collaboration.
   And so to the 90s. The incidence of cancer increases. An AIDS
pandemic spreads relentlessly into new sectors of the population.
The halls and corridors of our hospitals and mental institutions
echo with the cries of the unfortunate losers in genetic
roulette. This is a deadly serious business.

               Recognition of Error-Proneness
   The problem, as I see it, is to break out of the mould created
by the evolutionary mind-set of the system designers. One should
consider that what we are really trying to do with peer review is
to predict the future. Which of a set of researchers is most
likely to make a contribution which, with hindsight, will be
recognized by future generations as having been the most logical
at this point in the development of biomedical knowledge? One
should then arrive at the conclusion that the task is either
impossible or, at least, highly error-prone.
  Daniel Osmond has pointed out that in a valid competition, be
it for research funds or anything else, there must be appropriate
conditions, such as a starting line and a goal. "But research
cannot be reduced to such terms. The runners are at different
starting points on different tracks going in different
directions". He concludes that "those who conduct competitions
must be more humble and realistic about the validity of what they
do" (9). Similarly, an analysis by Stephen Cole and his colleagues
concluded that "the fate of a particular grant application is
roughly half determined by the characteristics of the proposal
and the principal investigator, and about half by apparently
random elements which might be characterized as the luck of the
draw" (13).
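  The "roughly half" estimate can be pictured with a toy simulation. The
sketch below is only an illustration under stated assumptions, not the
method used in the cited analysis: merit and luck are drawn as
independent normal variables, luck_weight is the share of score variance
assigned to luck, and the function name and all parameters are
hypothetical.

```python
import random


def funded_overlap(n_applicants=1000, funded_fraction=0.2,
                   luck_weight=0.5, seed=0):
    """Return the fraction of funded applicants who are truly in the top group.

    Assumption: observed score = scaled merit + scaled luck, where
    luck_weight is the share of score variance due to luck.
    """
    rng = random.Random(seed)
    merit = [rng.gauss(0, 1) for _ in range(n_applicants)]
    luck = [rng.gauss(0, 1) for _ in range(n_applicants)]
    merit_scale = (1 - luck_weight) ** 0.5
    luck_scale = luck_weight ** 0.5
    score = [merit_scale * m + luck_scale * l for m, l in zip(merit, luck)]
    n_funded = max(1, round(n_applicants * funded_fraction))

    def top(values):
        # Indices of the n_funded highest values.
        order = sorted(range(n_applicants), key=lambda i: values[i],
                       reverse=True)
        return set(order[:n_funded])

    # Share of the funded set that also belongs to the truly best set.
    return len(top(merit) & top(score)) / n_funded


if __name__ == "__main__":
    for weight in (0.0, 0.5, 1.0):
        print(weight, funded_overlap(luck_weight=weight))
```

With luck_weight set to 0 the funded set and the truly best set coincide;
as the weight rises, a growing share of awards goes to applicants outside
the truly best group, which is the sense in which a raised cut-off
amplifies the luck of the draw.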
  The peer review process is also error-prone because the
creative thinking which one is trying to assess tends to become
less communicable as it becomes more creative. The less obvious
an idea is, the more difficult it is to communicate. Something
which is readily perceived by a group of peers may sometimes be
the result of a brilliant insight, but more often it will
represent a more modest advance which will readily be assimilated
into existing knowledge. Peer review is like a race where the
real leaders are invisible to the judges. Stories of the
fallibility of peer review abound (14). David Prescott has recently
related how sceptical reviewers were of his claim in the early
1970s to have discovered a novel form of DNA. This led to
outright rejection of his grant application (15). Most immunologists
are now familiar with the "two signal" concept and the role of
"positive selection" in the education of lymphocytes. Yet it
would have been professional suicide to have proposed experiments
to test these ideas when they were introduced in the 60s and
70s (16, 17).
  Another error in conception is the notion that it is valid to
draw a parallel between the creativity of an entrepreneur in the
world of finance and that of a biomedical researcher. The case
against this has been argued elsewhere (4).
  If an evaluation process is error-prone it does not follow that
evaluation is impossible. It simply means that one has to design
the system taking error-proneness into account. This is what the
designers of the peer-review system failed to do. Two principles
of decision-making in uncertain environments are: (i) place most
weight on parameters which can be assessed objectively, and (ii)
hedge your bets. A design based on these principles, named
bicameral review, has been presented elsewhere (18, 19). Grant
applications are divided into a major retrospective part and a
minor prospective part, which are routed separately. The
retrospective part (track record) is subjected to peer review.
The prospective part (proposed work) is subjected to in-house
review by the agency, solely with respect to budget
justification. Funding is allocated on a sliding scale. Although
bicameral review is much less revolutionary than the bold stroke
from the brain to the larynx of our prototypic giraffe, it does
offer an alternative to a status quo which is becoming
increasingly unacceptable.
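  The contrast between a sharp cut-off and a sliding scale can be
sketched in a few lines. This is a minimal illustration only: the text
above says that funding is allocated on a sliding scale but gives no
formula, so the proportional rule, the function names and the example
figures below are all assumptions made for the sketch, not the published
bicameral scheme.

```python
def cutoff_allocation(scores, requests, pool):
    """All-or-nothing: fund full requests in rank order until funds run out."""
    awards = [0.0] * len(scores)
    remaining = pool
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    for i in ranked:
        if requests[i] > remaining:
            break  # the guillotine falls here
        awards[i] = requests[i]
        remaining -= requests[i]
    return awards


def sliding_scale_allocation(scores, requests, pool):
    """Hedge the bets: every applicant gets a share that rises with score.

    Hypothetical rule: entitlement = request * (score / top score), then all
    entitlements are shrunk by one common factor to fit the pool.
    Scores are assumed to be positive.
    """
    top = max(scores)
    entitlement = [r * s / top for r, s in zip(requests, scores)]
    factor = min(1.0, pool / sum(entitlement))
    return [e * factor for e in entitlement]


if __name__ == "__main__":
    scores = [9, 8, 7, 6]      # track-record ratings (illustrative)
    requests = [100] * 4       # approved budgets (illustrative)
    print(cutoff_allocation(scores, requests, pool=200))
    print(sliding_scale_allocation(scores, requests, pool=200))
```

Under the cut-off rule a small rating error near the threshold moves an
applicant from full funding to none; under the sliding rule the same
error changes the award only slightly, which is what hedging against an
error-prone evaluation amounts to.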

REFERENCES
 1. Chubin, D. E. and Hackett, E. J. (1990). Peerless Science:
Peer Review and US Science Policy. State University of New York
Press, Albany
 2. Harden, V. A. (1986). Inventing the NIH. Johns Hopkins
University Press, Baltimore
 3. Strickland, S. P. (1988) An interview with Kenneth Endicott.
FASEB J. 2, 2439-2444
 4. Forsdyke, D. R. (1989) A systems analyst asks about AIDS
research funding. Lancet 2, 1382-1384.
 5. Apirion, D. (1979) Research funding and the peer review
system. Federation Proc. 38, 2649-2650
 6. Mandel, H. G. (1983) Funding more NIH grants. Science 221,
338-340
 7. Forsdyke, D. R. (1983) Canadian medical research strategy for
the 80s. I. Damage limitation of superelitism? Medical Hypotheses
11, 141-156
 8. Kirschstein, R. L. et al. (1976) Grants Peer Review: Report
to the Director, NIH. Phase I. NIH, Washington
 9. Osmond, D. (1983) Malice's wonderland. Research funding and
peer review. J. Neurobiol. 14, 95-112
10. Lederberg, J. (1989) Does scientific progress come from
projects or people? Current Contents, Life Sciences 32, No. 48,
5-12
11. Sharp, P. A. (1990) The crisis in funding: a time for
decision. Cell 62, 839-840
12. Angier, N. (1988) Introduction. In Natural Obsessions: The
Search for the Oncogene, pp. 1-4. Houghton-Mifflin, Boston.
13. Cole, S., Cole, J. & Simon, G. (1981) Chance and consensus in
peer review. Science 214, 881-886
14. Garfield, E. (1987) Refereeing and peer review. Part 3. How
the peer review of research grant proposals works and what
scientists say about it. Current Contents, Life Sciences
30, No. 4, 3-8
15. Prescott, D. M. (1992) Cutting, splicing, reordering and
elimination of DNA sequences in hypotrichous ciliates. Bioessays
14, 317-324
16. Forsdyke, D. R. (1968) The liquid scintillation counter as an
analogy for the distinction between self and not-self in
immunological systems. Lancet 1, 281-283
17. Forsdyke, D. R. (1975) Further implications of a theory of
immunity. J. Theoret. Biol. 52, 187-198
18. Forsdyke, D. R. (1991) Bicameral grant review: an alternative
to conventional peer review. FASEB J. 5, 2312-2314
19. Forsdyke, D. R. (1992) Bicameral grant review: how a systems
analyst with AIDS would reform research funding. Accountability
in Research 3, 1-5






