Need for systematic scientometric analyses of open-access data

Stevan Harnad harnad at ecs.soton.ac.uk
Thu Dec 19 07:34:59 EST 2002


>sh>      about 200 open-access journals out
>sh>      of a total of 20,000 toll-access peer-reviewed journals in all. If
>sh>      you have more exact data, please post it for all of us. S.H.]
>
>sh> http://dmoz.org/Science/Publications/Journals_and_Magazines/Free_Online_Journals/
>sh> http://www.freemedicaljournals.com/
>sh> http://www.lcls.lib.il.us/ste/ejournals.htm
>sh> http://highwire.stanford.edu/lists/largest.dtl
>sh> http://www.ulrichsweb.com/ulrichsweb/

On Thu, 19 Dec 2002, Eberhard R. Hilf wrote:

> in Physics we know of 55 Physics refereed ejournals alone.
> see http://www.physnet.de/PhysNet/journals.html

That is certainly good news (and perhaps less surprising, considering that
Physics is the most advanced in Open Access)!

How many refereed Physics journals are there worldwide? That would make
it possible to know the proportion of the total that is open-access so
far. 

The excellent (truly remarkable!) Regensburg resource Ebs cites below:
http://rzblx1.uni-regensburg.de/ezeit/index.phtml?bibid=AAAAA&colors=7&lang=en
lists 759 Physics journals, of which 103 (14%) are open access. (Is this
complete?)

A 3-year series of the corresponding data would also give us a
chance of extrapolating the trend, to estimate when it will approach 100%
open-access at the present rate.
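
To make the extrapolation concrete, here is a minimal sketch in Python.
It assumes the simplest possible model, a straight-line fit to the yearly
open-access share, which real growth need not follow. The function names
and the three sample shares are hypothetical placeholders, not measured
data.

```python
# Hypothetical sketch: linear extrapolation of an open-access share.
# The sample figures below are placeholders, NOT real measurements.

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def year_reaching(target, years, shares):
    """Year at which the fitted line reaches `target`.

    Returns None if the fitted share is flat or shrinking.
    """
    slope, intercept = fit_line(years, shares)
    if slope <= 0:
        return None
    return (target - intercept) / slope


# Placeholder 3-year series: fraction of journals that are open access.
years = [2000, 2001, 2002]
shares = [0.10, 0.12, 0.14]
print(year_reaching(1.0, years, shares))
```

With only three points the fit is very sensitive to noise, so any date it
yields is a rough indication at best; a longer series, or a growth model
other than a straight line, could change the answer considerably.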

Another useful set of data would be the corresponding figures for
self-archived papers in Physics. We need to know how many refereed
papers appear annually in Physics, and then what proportion of them are
made openly accessible by their authors through self-archiving. Again,
a 3-year series would allow us to extrapolate.

ArXiv's 11-year data provide part of the answer:
http://arxiv.org/show_monthly_submissions
but they of course need to be supplemented by (1) estimates of the total
annual Physics refereed publication count as well as (2) the Physics
self-archiving figures for archives other than ArXiv (but not counting
the archives of the open-access journals, of course).

A 3-year series of those data (total vs. open-access subset, for the
open-access journal corpus and for the author/institution self-archived
corpus) would give a good picture of where things stand, and how fast
they are going, in Physics.

Even better would be a series of such estimates in all the other
disciplines too, as well as for the entire peer-reviewed corpus, across
all disciplines.

And (subtle, but critically important!), we need to know the *quality
level* of the current open-access journals -- as well as of the current
self-archived refereed articles -- within the hierarchy of journals
(and articles). An estimate of this would come from the journal impact
factors (and perhaps also the rejection rates) of the open-access journals
(as estimated, for example, by ISI's http://wos.mimas.ac.uk/),
compared to the rest, and from the author and article citation (and
perhaps usage) impacts for the self-archived articles (as estimated, for
example, by http://citebase.eprints.org/ and http://citeseer.nj.nec.com/cs).

Here is a prediction: With a few prominent exceptions, most of the
current open-access journals will come from the lower range of the journal
hierarchy. These are the small journals with small readerships, low
visibility and low impact, struggling to make ends meet. These journals are
of course very welcome among the ranks of the open-access journals,
but they are, in a sense, not the primary targets, which are the journals
toward the top of the hierarchy.

So the 3-year time-series data for the growth of open-access journals
will be very useful and informative, but we will also have to see where
in the journal hierarchy the growth is occurring. The bottom is, of
course, the easier part to convince of the benefits of open access!

Here is another prediction: In the corresponding three-year time-series
data for the growth of open access through author self-archiving, there
will *not* be the (negative) correlation with journal-quality that
(I predict) there will be in the case of open-access journals, because
individual researchers will be self-archiving on their own initiative
rather than on the basis of the hierarchical level of the journal in
which their articles appear.

For self-archiving, OAIster
http://oaister.umdl.umich.edu/o/oaister/viewcolls.html 
with its 1,042,164 records from 119 institutions might be one of the
sources of the data for comparing the progress of open access through
self-archiving of toll-access articles (BOAI-1) with open access through
open-access journals (BOAI-2).

(It is one of the important features of the BOAI-2 strategy, as
practised by PLoS, that they are targeting the very top of the
quality-hierarchy. For whereas the bottom-up BOAI-2 path to open access
may be a long one -- with the weaker journals spontaneously converting
only out of necessity -- PLoS's top-down strategy may prove much faster.)

> In total, across all fields, the number of free full text ejournals should
> be far beyond 200,
> http://rzblx1.uni-regensburg.de/ezeit/index.phtml?bibid=AAAAA&colors=7&lang=en
> "the Electronic Journals Library" of Regensburg
> gives 3,309 in total!! in contrast to 12,831 not freely accessible.
> That is a ratio of 26 %.
> so, do not be that pessimistic.

That is certainly very encouraging! But to know what it means (especially
for the rate of progress towards universal open access to the entire
peer-reviewed literature), we will need the finer-grained time-series
analyses and comparisons I outlined above, including the hierarchical
level data. I hope someone will do those analyses! The rich data are there,
to be mined (and there are refereed articles to be written on the
findings!).

I close with a few percentages I quickly scratched out from the superb
Regensburg data:

Field total::open (%)

Agriculture, Forestry etc. 416::129
Archaeology 50::15
Architecture, Civil Engineering 191::43
Art History 57::25
*Biology 1456::197 (14%)
*Chemistry and Pharmacology 1021::117 (11%)
Classical studies 45::23
*Computer Science 697::165 (24%)
*Economics 1252::303 (24%)
Education 624::244
Electrical Engineering, Measurement and Control Technology 428::72
Energy, environment protection, Nuclear Power Engineering 395::70
English, American Studies 131::44
Ethnic Sciences 173::35
General, Reference works, Communication, Environment 476::243
Geography 227::45
Geology and Paleontology 317::44
German, Dutch and Scandinavian Studies 31::22
*History 317::101 (32%)
History of Education 54::45
Law 416::184
Linguistics and Literary Studies 313::77
*Mathematics 605::128 (21%)
Mechanical Engineering 545::61
Medicine 3556::828
Musicology 53::26
Philosophy 250::74
*Physics 759::103 (14%)
Political Science 569::162
Process Engineering, Biotechnology 276::31
*Psychology 712::109 (15%) 
Romance Studies 28::10 
Sciences 174::64 
Slavonic Studies 16::6 
Sociology 633::135 
Sports Science 86::34 
Technology 304::112
Theology and Religious Studies 129::46
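
The percentages for the unstarred fields can be worked out the same way
as the starred ones. Here is a minimal sketch in Python, using a few rows
copied from the list above and assuming (as with the Physics figures)
that the first number is the total listed and the second is the
open-access subset of it. The function name open_share is just an
illustrative choice.

```python
# Sketch: open-access share per field, from (total, open) pairs.
# The four rows are copied from the list above; the remaining rows
# can be added in the same form.

rows = {
    "Physics": (759, 103),
    "Medicine": (3556, 828),
    "Education": (624, 244),
    "Law": (416, 184),
}


def open_share(total, open_count):
    """Open-access journals as a percentage of all journals listed."""
    return 100.0 * open_count / total


# Print the fields from the highest open-access share to the lowest.
ranked = sorted(rows.items(), key=lambda item: open_share(*item[1]), reverse=True)
for field, (total, open_count) in ranked:
    print(f"{field}: {open_count}/{total} = {open_share(total, open_count):.0f}%")
```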

Stevan Harnad

NOTE: A complete archive of the ongoing discussion of providing open
access to the peer-reviewed research literature online is available at
the American Scientist September Forum (98 & 99 & 00 & 01 & 02):

    http://amsci-forum.amsci.org/archives/september98-forum.html
                            or
    http://www.cogsci.soton.ac.uk/~harnad/Hypermail/Amsci/index.html

Discussion can be posted to: september98-forum at amsci-forum.amsci.org 

See also the Budapest Open Access Initiative:
    http://www.soros.org/openaccess

the Free Online Scholarship Movement:
    http://www.earlham.edu/~peters/fos/timeline.htm

the SPARC position paper on institutional repositories:
    http://www.unites.uqam.ca/src/sante.htm

the OAI site:
    http://www.openarchives.org

and the free OAI institutional archiving software site:
    http://www.eprints.org/




