On Mon, 13 Jun 2005, Tim Gray wrote:
> Thank you for your full and illuminating reply to my query about how much
> material in OA archives is available as full text. I am surprised at how
> low you estimate the figure to be and that it is not, yet, possible to
> produce a definitive number.
The number of full texts in OA archives is so low because the
number of institutions with OA self-archiving mandates (as opposed to
the number of institutions with OA Archives) is so low. Cf.:
The remedy is quite obvious (and will come, but is taking rather
longer than it might).
Swan, Alma and Brown, Sheridan (2005) Open access self-archiving:
An author study. Technical Report, Joint Information
Systems Committee (JISC), UK FE and HE funding councils.
> I am wondering if the Open DOAR (Directory of Open Access Repositories -
> the 'sister project' to the Directory of Open Access Journals, DOAJ) will
> set strictly 'full text only' rules for inclusion in its directory?
Archives with mixed contents, some of it other than OA full-texts, should
not be excluded, but an algorithm must be devised to recognise and record
the number of full-texts separately. Tim Brody and co-workers at Southampton
are working on this now for the Southampton OA Archives Registry.
"Newly enhanced Registry of Open Access Repositories (ROAR)"
> how will it relate to the archives.eprints directory you are involved with?
That remains to be clarified, but my understanding is that there will be a
collaboration and DOAR will be built on the Southampton OA Archives Registry.
(Others will have to confirm whether that is indeed the case.)
> It gets confusing to me because there are so many lists of repositories around
> on the web.
That was why the Southampton OA Archives Registry was created, two years ago.
Moreover, because all the other registries rely only on voluntary
self-registration, and archives have not been rigorous about self-registering,
the Southampton OA Archives Registry has been hand-trawling the Web and other
registries to find and register new OA Archives as they are created.
Perhaps a recognizable, consistent self-identifier tag will evolve, so that
OA Archives can be automatically harvested and registered, but this has
not yet happened. Indeed, some of the ostensibly OAI-compliant
OA Archives may not even be OAI-compliant!
This too will improve, as more institutions adopt institutional self-archiving
policies. Germany's DINI certificate will help.
"Goettingen/DINI/SPARC-Europe Open Access Meeting"
> How does the celestial harvesting list you mention relate to
> the archives.eprints list (are they the same list?)
Celestial, written by Tim Brody, from the University of Southampton,
is an OAI aggregator/cache application that imports OAI metadata from
version 1.0, 1.1, and 2.0 OAI-compliant repositories, and re-exposes that metadata
through either an aggregated or a per-repository OAI-compliant 2.0 interface.
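
To make the harvesting step concrete, here is a minimal Python sketch of how
an aggregator might page through a repository's records. It is not Celestial's
own code. It assumes an OAI-PMH 2.0 repository only (versions 1.0 and 1.1 use
different XML namespaces), and the names harvest_identifiers, fetch_url and
the example.org base URL are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# XML namespace of OAI-PMH 2.0 responses.
OAI = "{http://www.openarchives.org/OAI/2.0/}"


def fetch_url(url: str) -> str:
    """Default fetcher: download one OAI-PMH response as text."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def harvest_identifiers(base_url, fetch=fetch_url, metadata_prefix="oai_dc"):
    """Collect record identifiers via ListRecords, following resumption tokens.

    `fetch` is injectable so the loop can be exercised without a network.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    identifiers = []
    while True:
        root = ET.fromstring(fetch(base_url + "?" + urlencode(params)))
        for header in root.iter(OAI + "header"):
            ident = header.find(OAI + "identifier")
            if ident is not None and ident.text:
                identifiers.append(ident.text.strip())
        token = root.find(f".//{OAI}resumptionToken")
        if token is None or not (token.text or "").strip():
            # No token, or an empty one: the list is complete.
            return identifiers
        # In OAI-PMH 2.0 the resumptionToken replaces all other arguments.
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}
```

A real aggregator would also need error handling, polite request rates and
storage of the harvested metadata before re-exposing it.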
Tim is also the creator and maintainer of the Southampton OA Archives Registry
archives.eprints.org where it is explained that:
What does Not in Celestial mean?
This means the archive has not been listed/harvested by Celestial
(celestial.eprints.org). This may be because the archive doesn't
have a functioning OAI-PMH interface.
What does OAI Interface Unknown mean?
Either the archive doesn't have a functioning Open Archives interface,
or we couldn't track down where it is. Site admins should say on
their 'about' or 'help' page where their OAI interface is and use a
common URL for it (e.g. /perl/oai or /cgi-bin/oai). Submitting your
site to the OAI registry/Hussein Suleman's Repository Explorer will
also help to get your site noticed.
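
As a rough illustration of the advice above, the following Python sketch
builds Identify requests for the two common paths the FAQ mentions and checks
whether a response looks like a valid OAI-PMH 2.0 Identify reply. The function
names and the eprints.example.org host are hypothetical, and the Registry's
own checks may well work differently.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# XML namespace of OAI-PMH 2.0 responses.
OAI = "{http://www.openarchives.org/OAI/2.0/}"

# The two example paths given in the FAQ; other sites may use other paths.
COMMON_OAI_PATHS = ("/perl/oai", "/cgi-bin/oai")


def candidate_identify_urls(site_url):
    """Return Identify request URLs to try for a site."""
    return [urljoin(site_url, path) + "?verb=Identify" for path in COMMON_OAI_PATHS]


def parse_identify(response_text):
    """Return (repositoryName, baseURL), or None if this is not a usable
    OAI-PMH 2.0 Identify response."""
    try:
        root = ET.fromstring(response_text)
    except ET.ParseError:
        return None  # not XML at all, e.g. an HTML error page
    identify = root.find(OAI + "Identify")
    if identify is None:
        return None
    name = identify.findtext(OAI + "repositoryName")
    base = identify.findtext(OAI + "baseURL")
    if not name or not base:
        return None
    return name.strip(), base.strip()
```

Each candidate URL would be fetched in turn; the first one for which
parse_identify returns a result is a plausible OAI base URL to register.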
> or the large list kept
> by the University of Illinois at Urbana-Champaign (UIUC) at
That is one of the registries from which the Southampton OA Archives Registry
hand-harvests. The Registry also regularly harvests from OAIster.
It can also import lists from OAI list-friends automatically:
> I take the archives.eprints to be the closest to a definitive list of the
> OA Institutional Repositories which we are concerned with here - although I
> notice that our 'DSpace at Cambridge' repository
> <http://www.lib.cam.ac.uk/dspace/index.htm> is not included.
DSpace at Cambridge is in the Registry: See
But it is "not in Celestial" because http://www.dspace.cam.ac.uk/dspace-oai/
is either not the correct OAI base URL or does not work.
In contrast, Cambridge's other OA Archive *is* in Celestial:
All OA archive managers are encouraged to register their Archives, including
their OAI Base URL, and to contact Tim to make sure it works:
(I have emailed this posting to Cambridge's Tom de Mulder and Peter Morgan
in the hope that they will work with Tim to make sure Cambridge is
> I see the distinction between OA Archives and the Open Archives Initiative.
Yes, the OAI protocol is for all digital contents, whether OA or non-OA.
It concerns metadata interoperability.
> Maybe this is not strictly relevant to this forum and a basic
> misunderstanding of the purposes of archiving, but I still cannot
> understand why people are archiving *just* the metadata and not the full
> text. It makes OA search engines like OAIster more like any other
> standard bibliographic database with mostly subscription-only access.
You are quite right about the latter. And the main reason they are only
archiving metadata is what I have already pointed out: The low number of
institutional (full-text) OA self-archiving requirements to date.
But a second reason is that for some kinds of objects (non-OA objects,
i.e., not preprints, postprints or dissertations, e.g., library or
institutional records) the institution may not *want* to archive the
object, only its OAI metadata. The solution, as noted, is automatic
distinction between OA full-text and other kinds of OAI records.
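
One crude way such an automatic distinction might work is sketched below in
Python. It is only a heuristic under stated assumptions, not the algorithm
being developed at Southampton: it treats an oai_dc record as "full text" if
any dc:identifier is an http(s) URL ending in one of a few assumed document
suffixes. The suffix list, function names and sample records are all
hypothetical.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace used inside oai_dc records.
DC = "{http://purl.org/dc/elements/1.1/}"

# Assumed suffixes that suggest a link to the document itself.
FULLTEXT_SUFFIXES = (".pdf", ".ps", ".ps.gz", ".doc")


def has_full_text(record_xml: str) -> bool:
    """Guess whether an oai_dc record links directly to a full-text file."""
    root = ET.fromstring(record_xml)
    for el in root.iter(DC + "identifier"):
        url = (el.text or "").strip().lower()
        if url.startswith("http") and url.endswith(FULLTEXT_SUFFIXES):
            return True
    return False


def count_full_texts(records):
    """Return (full_text_count, metadata_only_count) for a list of records."""
    full = sum(1 for r in records if has_full_text(r))
    return full, len(records) - full


NS = ('xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" '
      'xmlns:dc="http://purl.org/dc/elements/1.1/"')
SAMPLE_RECORDS = [
    f"<oai_dc:dc {NS}><dc:title>A preprint</dc:title>"
    "<dc:identifier>http://example.org/1/paper.pdf</dc:identifier></oai_dc:dc>",
    f"<oai_dc:dc {NS}><dc:title>A catalogue record</dc:title>"
    "<dc:identifier>http://example.org/record/2</dc:identifier></oai_dc:dc>",
]
print(count_full_texts(SAMPLE_RECORDS))
```

A heuristic like this will miscount archives that link to a landing page
rather than to the file, so a production version would probably have to
follow links or inspect content types as well.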
> I am interested in the whole area of Open Access and keeping up with
> developments. This forum is excellent for that purpose.