Dealing with biosci.src - reply to a comment from Fote Macrides
macrides at sci.wfeb.edu
Sat Jun 11 11:53:33 EST 1994
In article <1994Jun10.060539.13225 at comp.bioz.unibas.ch>, doelz at comp.bioz.unibas.ch (Reinhard Doelz) writes:
> David Kristofferson (kristoff at net.bio.net) wrote:
> : >I hate it when I do WAIS searches of biosci.src, and the hits are all old,
> : >obsolete messages, and I can't think of a query that'll generate hits for the
> : >latest, current, correct information...
> : This is definitely a problem, and I've been contemplating establishing
> : a separate index that only includes "recent" messages. I'd like some
> : feedback on how we should define "recent."
> 'recent' would be a year back in my view.
> I tried various methods with my private mailbox (>80 messages/day) which
> I WAIS'ed to find things more quickly. Possible solutions were:
> (1) Make annual sources to be searched individually
> (2) make 'threaded' indices so that only one is shown
> - sorry, I don't know how to do this
> (3) Keep everything as is and introduce a postprocessor
> As a result, only (expensive and cumbersome) manual filtering helped.
> Unless we make bionet.software.www moderated the same fate is envisaged
> there :-)
The newsgroup archives and many of the textual biology archives are
WAIS databases. The latter typically use header info as potential queries,
to return that text and/or the associated sequences. The news articles are
archived with their headers, which include Date:, From:, Subject:, Newsgroups:,
etc., lines, which could be used equivalently.
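As a rough illustration of how those header lines could be pulled out of an archived article and used as query fields, here is a minimal sketch using Python's standard email package. The sample article, its address, and the function name header_fields are made up for the example; nothing here reflects how the biosci.src index is actually built.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

# Made-up sample article; the address and newsgroup list are illustrative only.
SAMPLE = """\
From: someone@example.org (Some One)
Newsgroups: bionet.general,bionet.software
Subject: Re: Dealing with biosci.src
Date: Sat, 11 Jun 1994 11:53:33 -0500

Body text of the article.
"""

def header_fields(raw_article):
    """Return the headers of one archived article as searchable fields."""
    msg = message_from_string(raw_article)
    _, address = parseaddr(msg.get("From", ""))
    groups = [g.strip() for g in msg.get("Newsgroups", "").split(",") if g.strip()]
    return {
        "from_addr": address,
        "newsgroups": groups,
        "subject": msg.get("Subject", ""),
        "date": parsedate_to_datetime(msg["Date"]).date(),
    }

fields = header_fields(SAMPLE)
```

Each value in fields could then be matched against what the user typed, in the same way the textual biology archives match on their own header info.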
Instead of breaking up the databases in relation to personal concepts
of what is "current", or applying an archiver's personal concepts of what is
"relevant", it would be better (IMHO) to "introduce a postprocessor" for the
queries, via FORM (http) and/or ASK (gopher) interfaces.
One can usually track down what one is seeking in the newsgroup
archives via the general WAIS searches and/or "hunting and pecking" through
the net.bio.net gopher server, but it can take a while.
With FORM or ASK block interfaces, the user could enter a date range,
or an adequate portion of a subject line, or a person or set of people from
whom the messages are likely to have been sent, or a set of newsgroups likely
to contain the relevant messages, etc, as well as a query with terms to be
sought in the bodies of the messages. The script for the FORM or ASK block
could then concoct an appropriate WAIS query (with appropriate booleans, etc.,
assuming Don's WAIS enhancements have been implemented), which makes it likely
that the right hits for what that user is seeking will be returned on the
first, or first few, searches. That, in effect, is what a user is doing when
submitting repeated, progressively more elaborate queries, and/or hunting here
and there in the net.bio.net gopher's data tree.
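To make the idea concrete, here is a minimal sketch of what the script behind such a FORM or ASK block might do. Everything in it is an assumption for illustration: the field names, the FormInput class, and especially the "and"/"or" with parentheses query syntax, since the boolean syntax actually accepted depends on which WAIS enhancements the server has. The date range is applied as a filter on the returned hits, because I am not assuming the index itself can search on dates.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

@dataclass
class FormInput:
    """Hypothetical fields a FORM or ASK block might collect from the user."""
    terms: List[str]                                   # words sought in message bodies
    subject: str = ""                                  # portion of a subject line
    senders: List[str] = field(default_factory=list)   # likely authors
    newsgroups: List[str] = field(default_factory=list)
    date_from: Optional[date] = None
    date_to: Optional[date] = None

def _any_of(words):
    """Join alternatives with 'or' (assumed syntax), parenthesized if several."""
    words = [w for w in words if w]
    if len(words) <= 1:
        return "".join(words)
    return "(" + " or ".join(words) + ")"

def build_query(form):
    """Concoct one boolean query string from the form fields."""
    clauses = list(form.terms)
    clauses.append(_any_of(form.senders))
    clauses.append(_any_of(form.newsgroups))
    clauses.extend(form.subject.split())
    return " and ".join(c for c in clauses if c)

def in_date_range(hit_date, form):
    """Post-filter: keep a hit only if its Date: falls inside the range."""
    if form.date_from and hit_date < form.date_from:
        return False
    if form.date_to and hit_date > form.date_to:
        return False
    return True

form = FormInput(terms=["gopher", "ask"],
                 senders=["macrides", "kristoff"],
                 newsgroups=["bionet.general"],
                 date_from=date(1994, 1, 1))
query = build_query(form)
```

The user fills in the fields once, and the script does the elaboration that would otherwise take several rounds of hand-written queries.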
In general it is better to rely on the powerful search software, and
develop progressively optimized, user friendly interfaces to it, rather than
further breaking up or "filtering" the databases themselves. The latter is
just going further in the wrong direction, often making it more difficult
to retrieve what's wanted, if it doesn't conform well enough to what the
archiver "anticipated" that the users would want. This is particularly
the case for archives of human verbiage.
Foteos Macrides Worcester Foundation for Experimental Biology
MACRIDES at SCI.WFEB.EDU 222 Maple Avenue, Shrewsbury, MA 01545