On Wed, 8 Oct 2003, Michael Eisen wrote:
> > Stevan Harnad wrote:
> > Whether the digital text (including data) of an article is made openly
> > accessible by being published in an open-access journal or by being
> > published in a toll-access journal but being self-archived in an
> > open-access archive is irrelevant: Either way, the data reported in it
> > are available to be used computationally. Don't confuse the use and
> > re-use of data with the use of the *text* to generate other text (other
> > than by quoting it): Any other re-use of text is plagiarism (i.e., if it
> > is not quotation). Text, unlike data and software code, cannot be
> > reprocessed and made one's own: It can only be cited and quoted.
> > http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2967.html
>
> What do you mean "don't confuse the re-use of data with the use of text"?
> These are exactly the same thing. There are two goals for open access
> publishing. The first is free access for all to the scientific publications.
> The second is the ability to treat text as data - something that you deride.
Let me explain as clearly as I can: If I put a data-table into my text:
"1 3 7 2 4 6 5" and publish the text as a (copyrighted) article of mine on
paper (only), the reader/user is free to copy down my data (by pencil,
or typing it on a computer) and then do some computations to re-order
it and publish the results in an article (on paper) to the effect that
"the (cited) author generated the wrong data-ordering. It should have
been 1 2 3 4 5 6 7" or even that "the extension of the (cited) author's
finding is: 8 10 14 9 11 13 12" etc.
Now if you can do that on paper, you can do it even more easily if the
full-text, including the data, is freely accessible online -- either as
the result of having been published in an open-access journal or as the
result of having been published in a toll-access journal and also
self-archived by the author in an open-access archive. It saves the user
the trouble of having to re-pencil or retype it.
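To make the data-table example above concrete, here is a minimal sketch in
Python. The function names are hypothetical, and reading the "extension" as
every value shifted up by 7 is an assumption; it is simply one rule that
reproduces the numbers quoted above.

```python
def parse_table(text):
    """Turn the data-table as it appears in the article text into integers."""
    return [int(token) for token in text.split()]


def reorder(values):
    """Re-order the reported data into ascending order."""
    return sorted(values)


def extend(values, offset):
    """Assumed reading of the 'extension': shift every value by a fixed offset."""
    return [value + offset for value in values]


published = "1 3 7 2 4 6 5"          # the table, copied from the article
data = parse_table(published)

print(reorder(data))                 # [1, 2, 3, 4, 5, 6, 7]
print(extend(data, len(data)))       # [8, 10, 14, 9, 11, 13, 12]
```

Nothing here re-uses the author's words: only the reported numbers are
copied, computed on, and (in a real paper) cited.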
The only thing the user may not do (if copyright protection is retained,
as it *should* be, rather than putting the text into the public domain,
as recommended by the Sabo Bill) is to republish my own words of text
(rather than the data I report), on-paper or online. That protection means
(1) you may not print or distribute paper copies of my text other than
for your own use, (2) you may not include my text in your own text as
your own published text (though you may quote it in your own published
text if you attribute authorship), and (3) you may not include my text
in your own text, even with authorship attribution, if you alter or
corrupt my text.
But -- and this is absolutely critical if we are to understand free
online full-text access in the PostGutenberg Age correctly: My making
my own full-text freely accessible online means *anyone* worldwide who has
access to the web may (i) retrieve my full-text online, (ii) read it
on-screen, (iii) download it, (iv) save it, (v) print it off, (vi) do
online or offline computations on it. In addition, the software agents
(e.g. google) that I choose to empower to do so (and, by default, this
could be all of them) can, like individuals, (vii) harvest my text,
invert it, index it, perhaps perform further computations on it.
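What "invert it, index it" amounts to can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the two sample
texts and the function names are invented here, and a real harvester does
far more than this.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(documents):
    """Map each word to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of the documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return hits


# Hypothetical harvested full-texts, keyed by an invented identifier.
documents = {
    "eprint-1": "Open access to the data reported in the article",
    "eprint-2": "Toll access journal article, self-archived by the author",
}
index = build_inverted_index(documents)
print(search(index, "access article"))
print(search(index, "self archived"))
```

The index stores only which words occur where; it does not republish the
text, which is why harvesting and searching fall under computation rather
than re-use of the author's words.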
This is not even a legal fact; it is a practical, technological and
inevitable fact about free, full-text web access and the nature of
computers and the internet (and of files that are not fire-walled by a
password or encryption or agent-blocker). If a law allows us to walk
inside a building, a separate law is not needed to say we may breathe the
air in the building; nor is a law that says we may walk but not breathe
enforceable -- except in a sci-fi scenario that is not worth our wasting
our time even contemplating, either in the case of walk-but-don't-breathe
If I choose to make my full-text open access, all of the above goes
with the territory. If I instead choose to put my text behind a
password-protected firewall, or to encrypt it to block certain uses,
then I am not making it open-access.
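One common form of "agent-blocker" on the web is a robots.txt file, which
the message above does not name; treating it as the mechanism is an
assumption made only for illustration. The sketch below uses Python's
standard urllib.robotparser; the agent names, rules and URL are
hypothetical.

```python
from urllib.robotparser import RobotFileParser


def agent_may_harvest(robots_txt, agent, url):
    """Report whether the given robots.txt rules let this agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical rules: block one named agent, empower all the others by default.
ROBOTS_TXT = """\
User-agent: BlockedBot
Disallow: /

User-agent: *
Disallow:
"""

PAPER_URL = "http://example.org/eprints/paper.html"
print(agent_may_harvest(ROBOTS_TXT, "BlockedBot", PAPER_URL))    # False
print(agent_may_harvest(ROBOTS_TXT, "HarvesterBot", PAPER_URL))  # True
```

Such rules are advisory: they express which agents the author chooses to
empower, whereas a password or encryption actually withholds the file.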
The crucial thing to understand, though, is that I can make my text
open-access in two ways: either by publishing it in an open-access
journal (which will then presumably go on to make my full-text openly
accessible on my behalf) or by publishing it in a toll-access journal,
and self-archiving it in my own institutional open-access archive,
thereby making it open-access on my own behalf.
There is no need to put the text in the public domain, in either case
(open-access publishing or open-access self-archiving). That would be a
completely unnecessary sacrifice and risk to both my authorship and the
integrity of my text (as discussed extensively on the thread
"Public Access to Science Act (Sabo Bill, H.R. 2613)").
And the copyright issue is much simpler and more straightforward than
most parties have suggested (usually because they are preoccupied
with their own interests or because they have misconceptualized what is
(1) There is no need for open-access papers to be put in the
public domain: copyright can and should still be asserted for
(2) There is no need for open-access authors to refuse to transfer
copyright to their publishers if the publishers wish it (as long as the
publisher is (Romeo) "blue" or "green", i.e., recognizes the author's
right to make his own full-text open-access by self-archiving it,
as 55% of publishers already are, and many others will agree if asked):
(3) Even with non-blue/green publishers, the full-texts can be made
open-access using the preprint + corrigenda strategy. The result is
a bit less convenient, but otherwise functionally identical to (2):
(4) Open-access authors can itemize their rights and preferences, as per
http://www.lboro.ac.uk/departments/ls/disresearch/romeo/ or they can
simply assert standard copyright, self-archive their full-text in
their institutional open-access eprint archive, and allow the natural
constraints of the online medium to ensure that all users have the
capabilities ((i) - (vii) above) that were the reason they made them
open-access in the first place!
See the Southampton ECS open-access eprint archive for a
sample of all that can be done with self-archived full-texts:
http://eprints.ecs.soton.ac.uk/ Or see the other open-access archives
(You'll see that what's missing there is not functionality but only
articles! And one of the reasons I am at pains to lay to rest the notion
that free full-text online access is somehow not access enough
is to increase the number of articles, by reassuring those who are
hesitating to self-archive because they have gotten the idea that free
full-text online access is somehow not functionality enough!)
> If you think that the only possible use of text is to cite and quote then
> you have completely missed the potential of open access publishing. If all
> we accomplish is to let people read any paper they want, we will have
> accomplished a lot, but we will still have failed. The true potential of
> open access publishing is the ways in which we can go beyond simply being
> able to read papers for free online.
As I have tried to show above, the rest of the ways already come with
the territory, if one makes one's full-text freely accessible online.
> Searching the full-text of articles is an obvious example of a use of text
> that goes beyond citing and quoting. And searching is only a trivial example
> of a use of the content of scientific publications. Scientific publications
> are not just words - they contain knowledge, and the type of use Richard is
> referring to deals not with the data described in a paper, but with the
> knowledge contained in the paper itself- ideas, methods, results and
Please see my response to Richard and the prior threads cited therein,
concerning data-archiving and data-sharing.
See above for my response about harvesting/searching/indexing and
computations on data and text.
A good example of a dedicated harvester for open-access research alone
is OAIster http://oaister.umdl.umich.edu/o/oaister/ but I expect you
have already heard of google, which harvests all full-texts made freely
accessible on the web. http://paracite.eprints.org/
> The open archives movement is focussed on making it possible for people to
> read individual works for free. Open access publishing is focussed on this
> task, as well as the more important goal of ensuring that the contents -
> data as well as text - of all scientific publication are available not only
> for people to access, but for them to use. So long as self-archiving
> focusses only on access, it will not realize the full potential of
> electronic publishing to transform how we use the scientific literature.
Mike, I am afraid the access/use distinction (for full-texts made freely
accessible on the web, as in OAIster) is as completely lost on me
as it was when you first stressed it in Budapest. Unless there is
something I have missed or misunderstood, it seems to me, for the
reasons cited above, to be a non-issue.
I agree, though, that the functionality that is being enhanced by the
OAI's splendid work on interoperability should be extended to
data-tagging and -sharing as the amount of data both in published
articles and in direct data-archives increases:
This, however, has less to do with the nature of open-access than with
the resourcefulness with which we apply our computational powers.
NOTE: A complete archive of the ongoing discussion of providing open
access to the peer-reviewed research literature online is available at
the American Scientist September Forum (98 & 99 & 00 & 01 & 02 & 03):
Discussion can be posted to: september98-forum at amsci-forum.amsci.org