BioBit 25 ( Filter Service )

Robert Harper harper at
Thu Jan 26 14:10:29 EST 1995

     2525252525                       2525252525
     2525252525                       2525252525
     2525  2525   2525   2525252525   2525  2525   2525   252525252525
     25252525            2525252525   25252525            252525252525
     25252525     2525   2525  2525   25252525     2525       2525
     2525  2525   2525   2525  2525   2525  2525   2525       2525
     2525252525   2525   2525252525   2525252525   2525       2525
     2525252525   2525   2525252525   2525252525   2525       2525

                                 No 25
                        BIO-NAUT NEWSLETTER 25-1-95
                       << EDITED BY ROBERT HARPER >>

Information overload.

Here is a little bit of history for you. When I first began to get
messages from BioSci/Bionet I got them once a week from Martin Bishop
in the UK. All of the messages in the whole of bionet fitted comfortably
onto one very long E-mail message. I stored these messages on an old
VMS machine and people were quite happy to browse through them using
the type command. Ahhh... those were the days... Simplicity.

I then graduated from these large files to individual e-mail messages
which came from Listserv at Irlearn. As more messages were sent to
Bionet my mailbox was always filling up. I was always running out of
disk quota, and I hated deleting all of those mail messages. The
repetitiveness of any task can bore the pants off you. Therefore I was
very happy to be introduced to newsreaders, whereby I was able to
be more selective about what I wanted to read. However, over the years
the number of newsgroups in Bionet has increased and I found that it
was no longer possible to read all of the Bionet newsgroups, so I had
to become more selective still. Ahhh... the perils of sophistication.

Quite recently I had a discussion to the effect that perhaps it would
be good to have some grad student scan the Bionet newsgroups and
filter out the most important news and pass it on via email as "The
best of Bionet", and from there developed the idea of the EBI Netnews
Filtering Service. Simplicity and sophistication.

Selecting a Netnews Profile.

The aim of the EBI Netnews filter service is to help researchers to
find those articles published in Usenet News that are related to their
fields of interest.

Usenet News provides a complex hierarchy of groups in which people can
freely discuss ideas, exchange relevant information and announce
events of interest to people who follow a particular newsgroup. Many
of the Usenet groups are also gatewayed to electronic mailing lists
which means that they are also available to people that do not have
access to Usenet but only have access to minimal e-mail facilities.

A scientist will usually subscribe to several newsgroups (or mailing
lists if they only have e-mail) that reflect their interests and are
related to their work. From that moment onwards they will then
continuously receive any news posted in these newsgroups or mailing
lists.  If they subscribe to too many newsgroups then they immediately
experience the problems of information overload. Whereas if they
subscribe to only a small set of newsgroups then it is probable that
they will lose many significant contributions that were posted
elsewhere.  In addition if they try to follow too many groups this can
take up a significant amount of their energy and time.

Now wouldn't it be fine if there were a service that allowed you to
read only those articles that you were interested in? If you could
leave instructions somewhere to say "send me every article that Mike
Cherry writes to Bionet", or "inform me when Roger Sayle releases a new
version of Rasmol"? Here is where the EBI filtering service comes in,
since it allows you to define a profile, or many profiles, to reflect
your particular interests.

Your profile is then scanned against the Bionet/EMBnet/Sci newsgroups
and if the filter finds any matches then the article will be posted to
you. It is like having your own personal grad student scanning the
newsgroups and feeding relevant information back to you. Hopefully the
benefits will be a great saving of time and effort. Instead of
subscribing to many groups, you specify the keywords that completely
define all your areas of interest and let the server scan all the
available newsgroups for any article that matches your profile. Thus,
by using the EBI filtering service, you can follow most newsgroups,
extracting only relevant information and ignoring all the articles
that are not related to any of your interests.

The EBI Netnews Filtering Service provides access to all the
newsgroups of the Bionet, EMBNet and Sci hierarchies, thus covering
practically all the groups of interest for scientific and academic
professionals. Once you subscribe to this service, every article
posted in any group of these hierarchies will be scanned and
filtered, and if there is a match to your profile then a message will
be sent to you automatically.

How to use a News Filtering Server

OK! The theory sounds great, but how does it work, and does it work
well? It may sound useful but the best thing to do is connect to the
server and make a few tests. This way you can see how it works and
verify for yourself that the filtering service may have advantages for
you.

There are two interfaces:

The easiest way to use the service is by means of the World Wide Web
using a browser like Mosaic or Netscape.  Just point your browser to
the URL <>

If you don't have access to the World Wide Web, or if you just prefer to
use e-mail because you are more familiar with it, you can do the same by
sending a message to <netnews at> with the word help in the main
body of the message.

Using the World Wide Web

Once you are connected to the URL there are
several options available that will offer you more information on the
service, and how it works.

The "Form for profile submission" is the core of the service. By using
it you can make test runs, define your profiles, subscribe, and
administer and review your subscriptions.

Identifying yourself

The first step in using the service is to identify yourself. This is very
important, since all the notifications from the server will take place by
electronic mail. If you haven't identified your e-mail address
correctly, none of the reports will be able to reach you.

Next you need to specify a password.  This password will ensure that
only you (or any people to whom you have told your password) will be
able to modify or cancel your subscriptions.  This password is *not*
your local account password, and we recommend you don't use your local
account password. Invent a new password especially for your sift
sessions. You will need this password later on to modify your profiles or
cancel your subscriptions, so please, try not to forget it.

Now that we have dealt with the serious administrivia, let's see what
this service has to offer:

Making test runs

The next part of the form lets you define your profiles and make tests
to verify that they behave as you expect. Just type in the
keywords that best reflect your interest.  Any article that contains
*all* these keywords will be considered as potentially interesting.
For instance, the line "Molecular Structure" will select all the
articles containing the words "Molecular" and "Structure".

You can further narrow the search by indicating words that you don't
want to appear.  For instance, if you are not interested in NMR
structural analysis, you could use "Molecular Structure Not NMR". The
*NOT* before the word NMR specifies that you don't want that word to
appear in the articles.
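
For the curious, the matching rule just described can be sketched in a
few lines of Python. This is only an illustration of the behaviour as
explained above, not the server's actual code: the function names and
the way text is split into words are assumptions made for the example.

```python
import re


def tokens(text):
    """Lower-case the text and split it into alphanumeric words (an assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def parse_profile(profile):
    """Split a profile line into required and excluded keywords.

    A word that follows "Not" is excluded; every other word is required.
    """
    required, excluded = [], []
    negate_next = False
    for word in tokens(profile):
        if word == "not":
            negate_next = True
        elif negate_next:
            excluded.append(word)
            negate_next = False
        else:
            required.append(word)
    return required, excluded


def matches(profile, article_text):
    """True if the article holds all required words and none of the excluded ones."""
    words = set(tokens(article_text))
    required, excluded = parse_profile(profile)
    has_all = all(word in words for word in required)
    has_unwanted = any(word in words for word in excluded)
    return has_all and not has_unwanted


if __name__ == "__main__":
    profile = "Molecular Structure Not NMR"
    print(matches(profile, "The molecular structure was solved by X-ray diffraction."))
    print(matches(profile, "NMR reveals the molecular structure of the peptide."))
```

The first article is kept because it has both required words and no
mention of NMR; the second is rejected because of the excluded word.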

This is like using *AND* operators in database searches. But what about
*OR* operations, for filtering articles that contain any of several
alternative keywords? For instance, you might be looking for articles
that speak only of molecular structure refined by NMR or X-rays. The
solution is to make a different profile for each alternative: something
like "Molecular Structure NMR" and "Molecular Structure X-Ray".
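
Put another way, a set of profiles behaves like an *OR* of *AND*s: an
article is reported if it satisfies any one profile in full. A small
Python sketch of that idea follows. The function names and the word
splitting are assumptions for illustration only, and say nothing about
how the server itself is written.

```python
import re


def tokens(text):
    """Lower-case the text and split it into alphanumeric words (an assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_profile(profile, article_text):
    """AND: every keyword of the profile must appear in the article."""
    words = set(tokens(article_text))
    return all(keyword in words for keyword in tokens(profile))


def matches_any_profile(profiles, article_text):
    """OR: the article is selected if at least one profile matches it."""
    return any(matches_profile(profile, article_text) for profile in profiles)


if __name__ == "__main__":
    my_profiles = ["Molecular Structure NMR", "Molecular Structure X-Ray"]
    article = "Molecular structure refined by X-ray crystallography"
    print(matches_any_profile(my_profiles, article))
```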

You can now make a test run and see what happens with the keywords you
have chosen: select the "Test run" button and then press the "Submit
Request" button at the bottom of the form.  You will then get a
listing of matching articles and thus will be able to see if the
profile you defined is too narrow (it only gets a few matches or none
at all) or too broad (you get a huge number of articles).  The best
way of finding your profile is probably trying this interactive method
until you are satisfied with the results.  When doing a test run the
listing is marked up in HTML, so clicking on the Message-ID for an
entry will retrieve the *whole* message from the newsreader.
You can make up as many profiles as you want. Each profile is given a
SID number and you can use this SID number if you want to delete a
profile that you think is no longer useful. If you want to read
articles by Mike Cherry or Reinhard Doelz then you would use "Cherry"
or "Doelz" in separate profiles, and each would get its own SID number.

Subscribing to the service

Now we come to the subscription: once you decide that the filtering
service is a useful tool for you, and that you want to use it, you
must select the "Subscribe" button.  But before submitting your
request, you can further modify the way the server will handle your
subscription.

You can specify the period (e.g. 3 or 7 days) which denotes how often
you want to receive the reports, and how long your subscription is
going to last (i.e. the number of days before your subscription
expires).  You can also choose between "boolean" and "weighted"
profiles. A boolean profile will select any article that contains your
keywords.  A weighted one will estimate how close an article is
to the profile you specified, and only mail you articles that surpass
a given threshold. In addition you can specify how many lines from the
original article you want to read. Usually 10 or 15 is sufficient
to give you an idea of the content of an article, though you are free
to choose any number you want.
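
To make the difference between the two profile types concrete, here is
a rough Python sketch. Please treat the scoring rule as an assumption
made purely for illustration: it scores an article by the percentage of
profile keywords it contains, which is almost certainly simpler than
whatever weighting the server really applies. The function names are
hypothetical too.

```python
import re


def tokens(text):
    """Lower-case the text and split it into alphanumeric words (an assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def boolean_match(keywords, article_text):
    """Boolean profile: the article must contain every keyword."""
    words = set(tokens(article_text))
    return all(keyword.lower() in words for keyword in keywords)


def weighted_score(keywords, article_text):
    """Assumed score: percentage (0-100) of the keywords found in the article."""
    if not keywords:
        return 0.0
    words = set(tokens(article_text))
    hits = sum(1 for keyword in keywords if keyword.lower() in words)
    return 100.0 * hits / len(keywords)


def weighted_match(keywords, article_text, threshold):
    """Weighted profile: keep the article if its score reaches the threshold."""
    return weighted_score(keywords, article_text) >= threshold


def excerpt(article_text, lines=10):
    """Return only the first `lines` lines of the article for the report."""
    return "\n".join(article_text.splitlines()[:lines])


if __name__ == "__main__":
    keywords = ["Molecular", "Structure", "NMR"]
    article = "Molecular structure from crystallography\nSecond line\nThird line"
    print(boolean_match(keywords, article))
    print(weighted_match(keywords, article, threshold=60))
    print(excerpt(article, lines=2))
```

Under this assumed rule an article with two of three keywords fails the
boolean profile but passes a weighted one with a threshold of 60.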

Once you are subscribed, you will receive an e-mail message from the
server with the periodicity you requested, for each profile that you
submitted. The E-mail will contain all matching articles (looking
exactly like the example you obtained from the test run).  If you find
any article interesting, all you need to do now is to request the
server to send you the full contents by e-mail, or boot up your
newsreader and get the full details. However if you are doing "test
runs" then it is easier to just click on the article you want and it
will be displayed.

Using Electronic Mail

The e-mail address of the EBI server is <netnews at>.

Sending a message with a line containing the word "help" will get you a
description of the mail interface. It is useful to do this even if you
are subscribing using the WWW, since you may be requesting articles
later by e-mail.

It is very easy though: you have the same functionality, except that you
must type the commands instead of clicking on an electronic form.

You can make test runs with the command "search", which will look up your
keywords in recent articles in the database.

To subscribe you send a message stating your profile and the desired
characteristics of the subscription. At minimum you must specify the
keywords. All other parameters have default values and you only need to
specify them if you do not agree with the default. For instance, the
default periodicity is daily; if you want a longer one you must say so.

An example message could look like this:
  From:    <Joe.Random.Luser at>
  To:      netnews at
  Subject: This line is ignored and you can leave it blank

  subscribe Molecular Structure X Ray Not NMR
  period 7
  type weighted
  threshold 60
  lines 10
  end
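
If you prefer to generate such messages from a script, the body can be
assembled along these lines. This is a minimal Python sketch: the
function name and its parameters are made up for the example, and only
the commands shown in this newsletter (subscribe, period, type,
threshold, lines, end) are used. Options left out are simply not
written, so the server's defaults apply.

```python
def build_subscription(keywords, period=None, profile_type=None,
                       threshold=None, lines=None):
    """Build the body of a subscription message, one command per line.

    Only `keywords` is mandatory; any option left as None is omitted so
    that the server falls back on its default value.
    """
    body = ["subscribe " + " ".join(keywords)]
    if period is not None:
        body.append(f"period {period}")
    if profile_type is not None:
        body.append(f"type {profile_type}")
    if threshold is not None:
        body.append(f"threshold {threshold}")
    if lines is not None:
        body.append(f"lines {lines}")
    # A closing "end" line guards against signature text being read as commands.
    body.append("end")
    return "\n".join(body) + "\n"


if __name__ == "__main__":
    print(build_subscription(
        ["Molecular", "Structure", "X", "Ray", "Not", "NMR"],
        period=7, profile_type="weighted", threshold=60, lines=10,
    ))
```

The resulting text can then be pasted into, or handed to, whatever mail
program you normally use.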

Once you begin to receive reports, and decide that some articles are of
interest and deserve further reading, you can request them by sending
"get" commands in which you identify the article to be retrieved:

  mail netnews at
  get  bionet.molbio.methds-reagnts.2555
  get  sci.techniques.xtallography.4323
  get  bionet.structural-nmr.657
  end
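
A request like this is also easy to build from a list of article
identifiers. The Python sketch below assumes, going only by the
examples above, that an identifier is a newsgroup name followed by a
dot and an article number. Both function names are hypothetical.

```python
def split_article_id(article_id):
    """Split an identifier into (newsgroup, article number).

    Assumes the last dot-separated field is the number, as in the
    examples shown in this newsletter.
    """
    group, _, number = article_id.rpartition(".")
    return group, int(number)


def build_get_request(article_ids):
    """Build a message body with one "get" command per article."""
    body = [f"get {article_id}" for article_id in article_ids]
    # Finish with "end" so that a mail signature is not read as a command.
    body.append("end")
    return "\n".join(body) + "\n"


if __name__ == "__main__":
    wanted = [
        "bionet.molbio.methds-reagnts.2555",
        "sci.techniques.xtallography.4323",
        "bionet.structural-nmr.657",
    ]
    for article_id in wanted:
        print(split_article_id(article_id))
    print(build_get_request(wanted))
```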

Later on, as you get more confident with the service, you can ask the
server to give you some feedback about your subscriptions and even to
guess the appropriate profile by itself, given sample articles. As with
the WWW, you can review, update or cancel your profiles at any moment.
Read the help file for a detailed description.

You will have noticed too that our examples end with a line containing
the word "end". This is recommended to keep the server from being
confused by signature lines that could be appended to your message. It
won't hurt, and may save you more than one headache.

The EBI version of the Netnews Server

The EBI netnews server is based on the Stanford Netnews Service and,
like that one, it uses SIFT, a program by Tak W. Yan of the Department
of Computer Science at Stanford University, to do the work. You can get
a look at the Stanford Server by pointing your WWW browser to the URL

We are using an in-house port of this program to DEC AXP machines
using OSF/1.  We have tried our best to ensure that the new version of
the program works correctly on our machines, but as with any new
version of a program some bugs could still lurk deep inside.

Therefore, if you find any error in the way the server works, we would
greatly appreciate it if you told us so that we can correct it. We
endeavour to provide a good and reliable service, but as with any free
service, we can only provide limited warranties on it (see the
Disclaimers on the Web pages).

Getting More Information

The primary sources of information are the web pages at
the server address <> and the help file
retrieved by electronic mail from <netnews at>. If you want more
specific information, you can contact our personal, human support
people by sending e-mail to the address <nethelp at>.

Also, if you encounter problems or unexpected behaviour in the system,
then contact Jose-Ramon Valverde (Jose.Valverde at), who has
been responsible for setting up the server.

Happy sifting.

      R. Andrew Harper                   E-mail:    harper at    
      European Bioinformatics Institute  URL:  
      Hinxton Hall, Hinxton,             Telephone: +44 (0)223 494 437
      Cambridge CB10 1RQ U.K.            Fax:       +44 (0)223 494 468
