[Journal-notes] Re: Do PrePrints and PostPrints Need a Copyright
harnad at ecs.soton.ac.uk
Sun Oct 23 07:28:33 EST 2005
On Sat, 22 Oct 2005, Imre Simon wrote:
> I believe that self-archiving in institutional repositories is a very
> important part of the Open Access movement, but I am afraid that just
> the availability of the papers in these institutional repositories is
> not a solid enough solution.
Not a solid enough solution to what? The problem that OA is intended to solve is
research access-denial and resulting research impact loss. Self-archiving
immediately solves both problems, but 85% of articles are still not being
self-archived, so the access/impact problem becomes the problem of inducing the
remaining 85% of articles to be self-archived. Is Imre Simon saying that his "not
solid enough" problem trumps the access/impact problem? That optimisation schemes
should precede 100% content-provision? And, most important, is he saying providing
a "solid enough" solution should take precedence even if it is in fact at odds with
inducing authors to provide the missing 85% of OA content?
This is the right way to weigh these questions, not from the standpoint of
abstract optimisation schemes with no regard for the practical problem of getting
the content in the first place:
> Why? Because what the researcher needs are
> focused disciplinary or thematic digital libraries where a researcher
> can find a lot of papers in the covered theme or discipline.
What the researchers need is the 85% of content, freely accessible for all, online.
*Then* we can worry about whether they are missing anything else.
> The more
> papers in the covered area he finds there the better it is.
Before a paper can be found, free online, it must be made free online.
The reason 85% of papers are not findable is that they have not been made OA.
> In a one
> stop search he can find the paper he is looking for, instead of having
> to go to dozens of institutional repositories, each one with his own
> user interface.
Imre, please look at the OAI interoperability protocol, and the OAI IR harvesters,
such as OAIster, or even Scirus or Google Scholar. You are breaking down open doors.
> Even more important and useful would be if the full text of the papers
> could be digested (indexed) by computer programs and one could navigate
> in the disciplinary library through search engines using the full text
> of the papers making use of the text of all other papers as well to
> determine the ranking of a given paper (number of citations, for
That problem will find its place in the queue once we have 100% OA. At the present
15% OA, it is a joke compared to the denial of access to 85% of the content for those
who cannot afford access to the journal version.
> Other navigations could be made available: through forward or
> backward references, through hubs and authorities, through text
> similarity or through cited bibliography similarity. A living example,
> with over 700.000 papers with full text in Computer Science is CiteSeer,
> <http://citeseer.ist.psu.edu/> a very useful digital library, a true
> research outlet in Computer Science.
See also Citebase.
> The one condition that makes
> CiteSeer less powerful is the fact that it is still far from complete.
Correct. And providing that missing 85% content is right now the 1st, 2nd
and Nth priority. All else depends on it.
> Theoretically, at least, these disciplinary digital libraries could be
> realized through the OAI protocols, each of them would be a "service
> provider" in the OAI jargon. That is to say, the service provider would
> harvest the papers in the institutional repositories, copy the full text
> of the papers, index them conveniently and make its services available
> to its users.
Not just theoretically, but in actual practice. What is missing is not
functionality, but content.
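
For readers unfamiliar with how such a service provider harvests an institutional repository, here is a minimal sketch in Python, not any particular harvester's implementation. It builds an OAI-PMH ListRecords request and pulls identifiers and titles out of a response. The base URL, the sample response and the helper names (list_records_url, parse_records) are hypothetical illustrations; only the verb, the metadataPrefix and resumptionToken parameters, and the XML namespaces come from the OAI-PMH 2.0 protocol.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 responses and by Dublin Core metadata.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical, hand-written response for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repo.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A self-archived postprint</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL (hypothetical helper).

    When continuing a partial list, the protocol expects the resumption
    token to be sent without the metadataPrefix argument.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return ([(identifier, title), ...], resumption_token_or_None)."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(OAI_NS + "record"):
        ident = rec.findtext(f"{OAI_NS}header/{OAI_NS}identifier")
        # Take the first Dublin Core title, if the record has one.
        title = next((e.text for e in rec.iter(DC_NS + "title")), None)
        records.append((ident, title))
    token = root.findtext(f".//{OAI_NS}resumptionToken")
    # An empty or absent token means the list is complete.
    return records, (token or None)
```

A harvester would fetch list_records_url(...) for each repository, call parse_records on the reply, and repeat with the returned token until none is left. The code is the easy part; it can only index what authors have deposited.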
> Given this scenario, I would like to pose two questions to specialists
> in copyright law, which I am most certainly not.
> Considering the existing permissions to self-archive, given by green
> publishers, do they allow the electronic copy (by a robot) of the full
> text of the self-archived papers, so that they can be indexed by an
> interested service provider and allow him to deliver the services of the
> type described? I think that they probably do not allow for this, but
> would like to hear a more informed opinion.
Why is this question even being asked now, when the 15% of self-archived content
*is* being harvested by all these indexers, whereas the 85% is not being provided?
Why talk about copyright when the problem is missing content?
> The second question is this: assuming that the author would have
> retained the right to distribute his paper under a Creative Commons
> Attribution-NonCommercial license (or even freer than that), would that
> license allow the copy and the operations described in the paragraph
> above? I think that with a CC license this operation would be perfectly
> legal, even by a robot, but again, I would like to hear a more informed
As I have suggested repeatedly, self-archiving just requires self-archiving, not
the CC license, which could be in conflict even with a green publisher's copyright
agreement, and hence in conflict with the author's inclination to self-archive (at
a time when 85% of authors don't yet self-archive).
If I sound a little shrill, it is because we have already needlessly lost at least
10 years of access and impact because of fretting about or getting distracted by
irrelevancies. It would be good if we could keep our eyes on the ball just long
enough to reach 100% OA. After that, it can be a free-for-all for the meliorists.
Till then, please let's focus on solving the real immediate problem, at long last.
> If my reasoning is correct, this would be another definitive and very
> important difference between having or not having a CC license available
> to the author to distribute his paper.
And if my reasoning is correct, this is a completely irrelevant distraction at this