Self-Archiving the Refereed Journal Literature

Stevan Harnad harnad at cogito.ecs.soton.ac.uk
Thu Nov 25 14:37:17 EST 1999

On Thu, 25 Nov 1999, Donald King wrote:

> I believe that, in the next 5+ years, article databases (centralized or
> decentralized) will be a welcome complement to (not replacement of)
> current personal subscriptions and library collections. Initially, they
> will replace the 100 million or so copies of articles currently
> distributed (in the U.S.) through ILL/document delivery, reprints and
> colleague distributed preprints and photocopies. However, over the long
> run, these databases may be the basis for real change in the
> communications processes.

It is not clear whether you are referring here to self-archived papers
in Open Archives <http://vole.lanl.gov/ups/ups.htm> such as Los Alamos
<http://xxx.lanl.gov/cgi-bin/show_monthly_submissions> and CogPrints
<http://cogprints.soton.ac.uk>, in which papers are available for free,
or to proprietary publishers' online archives, which can be accessed
only by paying (Subscription, Site-License, or Pay-Per-View: S/L/P).

There is a world of difference. Assuming that you are indeed referring
to the free Open Archives as the replacement for the "ILL/document
delivery, reprints and colleague distributed preprints and photocopies,"
then I agree (and hope only that the speed of the change will be
considerably faster than your prediction).

> I'm still convinced that publishers add value to the communication
> process, although no one has, as yet, quantified exactly how much they
> have done so. 

I agree completely, and they should continue to be compensated for it.
Our difference might be in WHAT we view as that added value. I think it
is the quality-control and certification (QC/C) of the refereed journal
literature -- the SERVICE of implementing refereeing -- and not
providing the PRODUCT of a document to be paid for by the
reader-institution.

In the online era it is now possible to pay for that add-on QC/C
service at the author end, by the author-institution, out of its S/L/P
annual savings from cancelling the product at the reader end.

So the service is the added value, and it is what should be paid for.
(The author did the rest: doing the research and writing the paper,
without seeking payment for any of it.) That all-important QC/C service
should therefore no longer be a basis for holding the paper itself
hostage as a proprietary product, for the paper can now be made
available through the Open Archives, free for all.

> To add value, some resources/costs must be incurred and,
> as long as this holds, there must be some way to recover these costs;
> either through some combination of the "troika" you mention
> (subscriptions, site-licence, pay-per-view) or some new "paradigm" such
> as your suggestion that it be paid for by the author-institution up
> front. In fact, in our book we suggest that [the latter] approach ought to
> be re-examined, although I'm not optimistic.

It is not clear why one should not be optimistic, as the outcome is not
only optimal, but obtainable now. All it calls for is author
self-archiving in Open Archives:


> By the way, even if a royalty is charged for access to the current
> distribution of article separates, the revenue to publishers will not
> be appreciably increased.

No one was discussing a royalty! (That would just be the "P" again, in
S/L/P, and that would continue to hold this anomalous literature
hostage -- anomalous because the authors want, and have always wanted,
to GIVE it away; the royalty model is precisely what does not fit this
literature, and never has.) Rumours of offering the authors of refereed
journal articles a royalty in order to get them to collaborate in
continuing to hold this literature hostage have surfaced recently, but
I doubt that the strategy would succeed.

Journal Article Royalties: Reanimating the "Faustian Bargain" 

Conversely, page-charges to authors in exchange for limitless free
e-prints that they can give away online have been proposed by Thomas
Walker as a transitional strategy (and argued against by me) in the
above forum.


> We make a big point that pricing (or, if you will, cost recovery) may
> be the biggest challenge in the future of scholarly publishing.
> Unfortunately, we could find no "magic bullet."  We do make a strong
> point, however, that any system changes must take into account the
> effect on scientists' time which has been negatively affected by
> replacing personal subscriptions with library use. This time dominates
> the "system" costs and is by far the biggest component of the "price"
> paid by scientists for the information. With sufficient reading of a
> journal, for example, it can take less of their time to "receive"
> journal issues (even in paper) than to sort through or search a large
> database.

I don't think that accurately describes the preferences or practices of
those who have actually tasted Web access to Open Archives and are in
a position to compare the old way with the new, such as the daily users
of Los Alamos:


Nor did any personal paper collection ever open up the world of online
navigation of the entire literature via citation linking:


> The disturbing aspect of spiraling prices is that all participants are
> losing - scientists spend more of their valuable time obtaining
> articles, libraries are providing less information at a greater cost to
> them (and their funders are disillusioned), and publishers have lower
> circulation (and are getting hammered by detractors). Yet, it appears
> that the total amount of system resources (and their costs) have not
> changed much over a 20 year period (on a cost per scientist or cost per
> reading basis). Note that the "true" system costs must exclude the
> exchanges of money (i.e., subscription payment); otherwise the system
> "total costs" would be distorted through duplication.

I am interested in cutting through the baroque complexities of the
current legacy and status quo to ask: How much would it really cost if
all that journals provided was the QC/C service, and left it to Open
Archives and authors to provide access to the product?

A lot less money, I think, and a lot more research would change hands
as a result.

> You mention that new costs would be less than 20% of what they are now
> per article. Such a value (20%) does not hold in all circumstances. For
> example, it varies substantially by circulation.

You are right. But the 20% was really just a ceiling estimate for total
S/L/P revenues from the total corpus of journals (based on calculations
like those of Andrew Odlyzko).

    Odlyzko, A.M. (1998) The economics of electronic journals. In:
    Ekman, R. and Quandt, R. (Eds.) Technology and Scholarly
    Communication. Univ. Calif. Press.

The actual QC/C cost per paper will of course be an ABSOLUTE figure,
varying a little from field to field, but certainly not coupled in any
way to the journal's circulation size. The latter is a spurious figure,
based on the current PRODUCT model (based on blocking and selling
access to the paper to the reader-institution) rather than the SERVICE
model (to the author-institution, freeing access to the literature for
all).

The total 100% revenues for the entire corpus are no doubt distorted by
the vagaries of the product model. The new online-only journals, such as
the spectacularly successful Journal of High Energy Physics, are
reporting their costs (after start-up) as about $300 per paper. That is
the right ball-park.


> We have tried to estimate the amount of resources used by publishers
> (with costs attached to the labor, space, equipment, supplies, etc.).
> Evidence seems to suggest that per article costs vary in unanticipated
> ways. The size of the journal  (in number of articles or pages) is one
> variable in which unit costs appear to be low with small journals, rises
> up to an average sized journal (i.e., dis-economies of scale) and then
> levels off and, perhaps, drops.

To the extent that any of these costs are based on a PRODUCT (the
paper) to the reader-institution, rather than a SERVICE
(quality-control and certification) to the author-institution, they are
simply based on an obsolescent and unsustainable access-blockage model,
one that does not fit the give-away research literature that is
at issue here, and one that is at odds with what is in the best
interests of research and researchers.

All figures must be recalculated for a down-sized operation providing
QC/C only. All the rest is merely barricading in place a paper cardhouse
that should at last be allowed to collapse of natural causes.

> A comment by Andrew Odlyzko at a meeting triggered a memory I had of
> looking into this in the late 1970s. Fritz Machlup sent me some raw
> data (from his publisher survey) so that I could see if there were
> economies of scale based on the size of journals. The opposite was
> observed with small journals having low unit costs and large journals
> high unit costs (on the average). I was going to pursue it more to find
> out why, but never did.

Those are all paper/product-era figures. It's time to re-do them for a
networked online-only corpus in the new millennium.

(I remember Fritz well from Princeton, Austro-Hungarian heel-clicking,
"Kuess die Hand," and all!)

> Another variable related to cost/price is the size of the publisher
> (i.e., number of journals published). There also seems to be a
> correlation between price and size of publishers (McCabe) which some
> attribute to monopolistic pricing (and large profits). Some of the
> difference may be attributable to the low circulation of journals
> published by large publishers. This has yet to be determined. However,
> I suspect a more likely culprit is that overhead tends to rise with an
> increase in size of labor-intensive service organizations; which is the
> case in scholarly publishing.

Let's downsize all that overhead to that of a modern-era journal like
Journal of High Energy Physics <http://jhep.cern.ch/>, spawned by the
field that is the most advanced on the planet, on the road to the optimal
and inevitable for all refereed research publication: Physics. 

> One final thought. Any "global archive" database(s) must be accompanied
> with sound search and retrieval capabilities. Much of the valuable
> reading takes place outside the author community and much of the
> reading is of older articles. Furthermore, reading by individuals is
> extending across disciplines. Much of the reading is of "new"
> information, not previously known to the reader.

I agree completely, and those navigational capabilities will indeed be
there, waiting for the users of the Open Archives.


> I'm not sure if these comments reveal anything to you. However, I think
> your attempts to deal with "publishing" cost recovery should be
> welcomed and explored further.
> The title of our book is: Towards Electronic Journals: Realities for
> Scientists, Librarians, and Publishers. Carol Tenopir, Professor at the
> University of Tennessee is the co-author. The text has been put to bed,
> with the bibliography (over 600 citations), author index and subject
> index nearly complete.

Looking forward to seeing it.

Best wishes,

Stevan Harnad                     harnad at cogsci.soton.ac.uk
Professor of Cognitive Science    harnad at princeton.edu
Department of Electronics and     phone: +44 23-80 592-582
Computer Science                  fax:   +44 23-80 592-865
University of Southampton         http://www.cogsci.soton.ac.uk/~harnad/
Highfield, Southampton            http://www.princeton.edu/~harnad/
SO17 1BJ UNITED KINGDOM           

NOTE: A complete archive of this ongoing discussion of "Freeing the
Refereed Journal Literature Through Online Self-Archiving" is available
at the American Scientist September Forum (98 & 99):


