BioBit 25 ( Filter Service )
harper at ebi.ac.uk
Thu Jan 26 14:10:29 EST 1995
BIO-NAUT NEWSLETTER 25-1-95
<< EDITED BY ROBERT HARPER >>
Here is a little bit of history for you. When I first began to get
messages from BioSci/Bionet I got them once a week from Martin Bishop
in the UK. All of the messages in the whole of Bionet fitted comfortably
onto one very long E-mail message. I stored these messages on an old
VMS machine and people were quite happy to browse through them using
the type command. Ahhh... those were the days... Simplicity.
I then graduated from these large files to individual Email messages
which came from Listserv at Irlearn. As more messages were sent to
Bionet my mail-box was always being filled up. I was always running
out of disk quota, and I hated deleting all of those mail messages. The
repetitiveness of any task can bore the pants off you. Therefore I was
very happy to be introduced to newsreaders, whereby I was able to
be more selective about what I wanted to read. However, over the years
the number of newsgroups in Bionet has increased and I found that it
was no longer possible to read all of the Bionet newsgroups, so I had
to be more selective about what I wanted to read. Ahhh... the perils
Quite recently I had a discussion to the effect that perhaps it would
be good to have some grad student scan the Bionet newsgroups and
filter out the most important news and pass it on via email as "The
best of Bionet", and from there developed the idea of the EBI Netnews
Filtering Service. Simplicity and sophistication.
Selecting a Netnews Profile.
The aim of the EBI Netnews filter service is to help researchers to
find those articles published in Usenet News that are related to their
fields of interest.
Usenet News provides a complex hierarchy of groups in which people can
freely discuss ideas, exchange relevant information and announce
events of interest to people who follow a particular newsgroup. Many
of the Usenet groups are also gatewayed to electronic mailing lists
which means that they are also available to people who do not have
access to Usenet but only have access to minimal e-mail facilities.
A scientist will usually subscribe to several newsgroups (or mailing
lists if they only have e-mail) that reflect their interests and are
related to their work. From then on they continuously receive any news
posted in these newsgroups or mailing lists. If they subscribe to too
many newsgroups, they immediately experience the problems of information
overload, and following them all takes up a significant amount of their
energy and time. If they subscribe to only a small set of newsgroups,
they will probably miss many significant contributions that were posted
elsewhere.
Now wouldn't it be fine if there was a service that allowed you to
read only those articles that you were interested in? If you could
leave instructions somewhere to say send me every article that Mike
Cherry writes to Bionet, or inform me when Roger Sayle releases a new
version of Rasmol? Here is where the EBI filtering service comes in,
since it allows you to define a profile, or many profiles to reflect
your particular interests.
Your profile is then scanned against the Bionet/EMBnet/Sci newsgroups
and if the filter finds any matches then the article will be posted to
you. It is like having your own personal grad student scanning the
newsgroups and feeding relevant information back to you. Hopefully the
benefits will be a great saving of time and effort. Instead of
subscribing to many groups, you specify the keywords that completely
define all your areas of interest and let the server scan all the
available newsgroups for any article that matches your profile. Thus,
by using the EBI filtering service, you can follow most newsgroups,
extracting only relevant information and ignoring all the articles
that are not related to any of your interests.
The EBI Netnews Filtering Service provides access to all the
newsgroups of the Bionet, EMBNet and Sci hierarchies, thus covering
practically all the groups of interest for scientific and academic
professionals. Once you subscribe to this service, every article
posted in any group of these hierarchies is scanned and filtered, and
if it matches your profile a message is sent to you automatically.
How to use a News Filtering Server
OK! The theory sounds great, but how does it work, and does it work
well? It may sound useful but the best thing to do is connect to the
server and make a few tests. This way you can see how it works and
verify for yourself that the filtering service may have advantages for
you.
There are two interfaces:
The easiest way to use the service is by means of the World Wide Web
using a browser like Mosaic or Netscape. Just point your browser to
the URL <http://www.ebi.ac.uk/sift>
If you don't have access to the World Wide Web, or if you just prefer to
use e-mail because you are more familiar with it, you can do the same by
sending a message to <netnews at ebi.ac.uk> with the word help in the main
body of the message.
Using the World Wide Web
Once you are connected to the URL http://www.ebi.ac.uk/sift there are
several options available that will offer you more information on the
service, and how it works.
The "Form for profile submission" is the core of the service. By using
it you can make test runs, define your profiles, subscribe, and
administer and review your subscriptions.
The first step in using the service is to identify yourself. This is very
important, since all notifications from the server are sent by
electronic mail. If you have not given your e-mail address correctly,
none of the reports will be able to reach you.
Next you need to specify a password. This password will ensure that
only you (or any people to whom you have told your password) will be
able to modify or cancel your subscriptions. This password is *not*
your local account password, and we recommend you don't use your local
account password. Invent a new password especially for your sift
sessions. You will need this password later on to modify your profiles or
cancel your subscriptions, so please, try not to forget it.
Now that we have dealt with the serious administrivia, let's see what
this service has to offer:
Making test runs
The next part of the form lets you define your profiles and make tests
to verify that they behave as you expect. Just type in the
keywords that best reflect your interests. Any article that contains
*all* these keywords will be considered as potentially interesting.
For instance, the line "Molecular Structure" will select all the
articles containing the words "Molecular" and "Structure".
You can further narrow the search by indicating words that you don't
want to appear. For instance, say that you are not interested in NMR
structural analysis, you could use "Molecular Structure Not NMR". The
*NOT* before the word NMR specifies that you don't want that word to
appear in the articles.
This is like using the *AND* operator in database searches. But what about
*OR* operations, that is, filtering articles that contain any of several
alternative keywords? For instance, you might be looking for articles that
deal with molecular structure refined by either NMR or X-ray methods. The
solution is to make a different profile for each alternative: something
like "Molecular Structure NMR" and "Molecular Structure X-Ray".
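The matching rules described above can be sketched in a few lines of
Python. This is only an illustration of the AND / NOT / one-profile-per-
alternative logic, not the code that SIFT actually runs. The function
names (tokenize, parse_profile, matches, matches_any) are made up for
this example, and splitting text into words on anything that is not a
letter or digit is an assumption.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words (assumed word rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def parse_profile(profile):
    """Split a profile such as "Molecular Structure Not NMR" into
    (required, excluded) sets of words."""
    required, excluded = set(), set()
    negate = False
    for word in tokenize(profile):
        if word == "not":
            negate = True  # the next word is one we do NOT want
            continue
        (excluded if negate else required).add(word)
        negate = False
    return required, excluded


def matches(profile, article_text):
    """True if the article has every required word and no excluded word."""
    words = set(tokenize(article_text))
    required, excluded = parse_profile(profile)
    return required <= words and not (excluded & words)


def matches_any(profiles, article_text):
    """OR is expressed as several profiles: one match is enough."""
    return any(matches(p, article_text) for p in profiles)


if __name__ == "__main__":
    article = "Molecular structure refined by NMR"
    print(matches("Molecular Structure", article))
    print(matches("Molecular Structure Not NMR", article))
    print(matches_any(["Molecular Structure NMR",
                       "Molecular Structure X-Ray"], article))
```

Note that in this sketch "X-Ray" is treated as the two words "x" and
"ray", both of which must appear in the article.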
You can now make a test run and see what happens with the keywords you
have chosen: select the "Test run" button and then press the "Submit
Request" button at the bottom of the form. You will then get a
listing of matching articles and thus will be able to see if the
profile you defined is too narrow (it only gets a few matches or none
at all) or too broad (you get a huge number of articles). The best
way of finding your profile is probably trying this interactive method
until you are satisfied with the results. When you do a test run the
listing is marked up in HTML, so clicking on the Message-ID
for an entry will retrieve the *whole* message from the newsreader.
You can make up as many profiles as you want. Each profile is given a
SID number and you can use this SID number if you want to delete a
profile that you think is no longer useful. If you want to read
articles by Mike Cherry or Reinhard Doelz then you would use "Cherry"
or "Doelz" in separate profiles, and each would get its own SID number.
Subscribing to the service
Now we come to the subscription: once you decide that the filtering
service is a useful tool for you, and that you want to use it, you
must select the "Subscribe" button. But before submitting your
request, you can further modify the way the server will handle your
subscription.
You can specify the period (e.g. 3 or 7 days), which denotes how often
you want to receive the reports, and how long your subscription is
going to last (i.e. the number of days before your subscription
expires). You can also choose between "boolean" and "weighted"
profiles. A boolean profile will select any article that contains your
keywords. A weighted one will estimate how close an article is
to the profile you specified, and only mail you articles that surpass
a given threshold. In addition you can specify how many lines from the
original article you want to read. Usually 10 or 15 is sufficient
to give you an idea of the content of an article, though you are free
to choose any number you want.
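To make the difference between the two profile types concrete, here is a
small Python sketch. Please treat it as a guess at the idea rather than a
description of the server: this newsletter does not say how the weighted
score is calculated, so the sketch simply assumes the score is the
fraction of your keywords that appear in the article. The names
(boolean_match, weighted_score, weighted_match, excerpt) and the default
threshold are invented for the example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words (assumed word rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def boolean_match(keywords, article_text):
    """Boolean profile: every keyword must appear in the article."""
    words = set(tokenize(article_text))
    return all(k.lower() in words for k in keywords)


def weighted_score(keywords, article_text):
    """Assumed score: the fraction of keywords found in the article."""
    if not keywords:
        return 0.0
    words = set(tokenize(article_text))
    hits = sum(1 for k in keywords if k.lower() in words)
    return hits / len(keywords)


def weighted_match(keywords, article_text, threshold=0.5):
    """Weighted profile: keep articles whose score surpasses the threshold."""
    return weighted_score(keywords, article_text) > threshold


def excerpt(article_text, max_lines=10):
    """Return only the first max_lines lines of an article for the report."""
    return "\n".join(article_text.splitlines()[:max_lines])


if __name__ == "__main__":
    keywords = ["molecular", "structure", "nmr"]
    article = "Molecular structure of a kinase\nSolved last month.\nMore soon."
    print(boolean_match(keywords, article))
    print(weighted_score(keywords, article))
    print(weighted_match(keywords, article))
    print(excerpt(article, max_lines=2))
```

With a scheme like this, an article that misses one of three keywords is
rejected by the boolean profile but can still reach you through the
weighted one, depending on the threshold.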
Once you are subscribed, you will receive an e-mail message from the
server with the periodicity you requested, for each profile that you
submitted. The E-mail will contain all matching articles (looking
exactly like the example you obtained from the test run). If you find
any article interesting, all you need to do is ask the
server to send you the full contents by e-mail, or boot up your
newsreader and get the full details. However if you are doing "test
runs" then it is easier to just click on the article you want and it
will be displayed.
Using Electronic Mail
The e-mail address of the EBI server is <netnews at ebi.ac.uk>.
Sending a message with a line containing the word "help" will get you a
description of the mail interface. It is useful to do this even if you
are subscribing using the WWW, since you may be requesting articles
later by e-mail.
It is very easy, though: you have the same functionality, except that you
must type the commands instead of clicking on an electronic form.
You can make test runs with the command "search", which will look up your
keywords in recent articles in the database.
To subscribe you send a message stating your profile, and the desired
characteristics of the subscription. At minimum you must specify the
keywords. All other parameters have default values and you only need to
specify them if you do not agree with the default. For instance, the
default periodicity is daily; if you want a longer one you must say so.
An example message could look like this:
From: <Joe.Random.Luser at bioresearch.bionet.org>
To: netnews at ebi.ac.uk
Subject: This line is ignored and you can leave it blank
subscribe Molecular Structure X Ray Not NMR
end
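If you would rather prepare such a message from a script, the following
Python sketch builds one with the standard email library. It only
constructs the message and does not send anything. The server address is
the one given in this newsletter with the "@" restored; the function name
build_subscribe_message is made up for this example, and the body uses
nothing beyond the "subscribe" line and the closing "end" line described
in the text.

```python
from email.message import EmailMessage

# Address from the newsletter, shown there as "netnews at ebi.ac.uk".
SERVER_ADDRESS = "netnews@ebi.ac.uk"


def build_subscribe_message(sender, keywords):
    """Build (but do not send) a subscription request for one profile."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = SERVER_ADDRESS
    # The Subject line is ignored by the server, so any text will do.
    msg["Subject"] = "ignored by the server"
    # One command line, then "end" so that a signature is not read as input.
    msg.set_content(f"subscribe {keywords}\nend\n")
    return msg


if __name__ == "__main__":
    message = build_subscribe_message(
        "Joe.Random.Luser@bioresearch.bionet.org",
        "Molecular Structure X Ray Not NMR",
    )
    print(message)
```

Handing the finished message to your usual mail program or to an SMTP
library is left out on purpose, since that part depends on your site.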
Once you begin to receive reports, and decide that some articles are of
interest and deserve further reading, you can request them by sending
"get" commands in which you identify the article to be retrieved:
mail netnews at ebi.ac.uk
Later on, as you get more confident with the service, you can ask the
server to give you some feedback about your subscriptions and even to
guess the appropriate profile by itself, given sample articles. As with
the WWW, you can review, update or cancel your profiles at any moment.
Read the help file for a detailed description.
You will have noticed too that our examples end with a line containing
the word "end". This is recommended to keep the server from being
confused by signature lines that may be appended to your message. It
won't hurt, and may save you more than one headache.
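To see why the "end" line helps, here is a small Python sketch of how a
mail server might read the body of a message. It is a guess for
illustration only, not the real server code, and the function name
extract_commands is invented: the sketch collects non-empty lines until
it meets a line that says "end", and ignores everything after it.

```python
def extract_commands(body):
    """Return the command lines of a message body, stopping at "end"."""
    commands = []
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.lower() == "end":
            break  # anything after this, e.g. a signature, is ignored
        commands.append(line)
    return commands


if __name__ == "__main__":
    with_end = "subscribe Molecular Structure\nend\n-- \nJoe Luser\nget a life\n"
    without_end = "search NMR\n-- \nJoe Luser\n"
    print(extract_commands(with_end))
    print(extract_commands(without_end))
```

In the second message there is no "end" line, so the signature lines are
handed on as if they were commands, which is exactly the confusion the
advice above is meant to prevent.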
The EBI version of the Netnews Server
The EBI netnews server is based on the Stanford Netnews Service and,
like that one, it uses SIFT, a program by Tak W. Yan of the Department
of Computer Science at Stanford University, to do the work. You can get
a look at the Stanford Server by pointing your WWW browser to the URL
We are using an in-house port of this program to DEC AXP machines
using OSF/1. We have tried our best to ensure that the new version of
the program works correctly on our machines, but as with any new
version of a program some bugs could still lurk deep inside.
Therefore, if you find any error in the way the server works, we will
greatly appreciate it if you tell us so that we can correct it. We
endeavour to provide a good and reliable service, but as with any free
service, we can only provide limited warranties on it (see the
Disclaimers on the Web pages).
Getting More Information
The primary sources of information are the web pages at
the server address <http://www.ebi.ac.uk/sift> and the help file
retrieved by electronic mail from <netnews at ebi.ac.uk>. If you want more
specific information, you can contact our human support staff
by sending e-mail to the address <nethelp at ebi.ac.uk>.
Also, if you encounter problems or unexpected behaviour in the system,
then contact Jose-Ramon Valverde (Jose.Valverde at ebi.ac.uk), who has
been responsible for setting up the server.
R. Andrew Harper E-mail: harper at ebi.ac.uk
European Bioinformatics Institute URL: http://www.ebi.ac.uk
Hinxton Hall, Hinxton, Telephone: +44 (0)223 494 437
Cambridge CB10 1RQ U.K. Fax: +44 (0)223 494 468