ADDITION to: remainders of vector sequences in nucleotide databases

Kay Hofmann KHOFMANN at cipvax.biolan.Uni-Koeln.DE
Mon Oct 21 05:30:24 EST 1991


Hello,
  Let me add a few comments on my previous posting about remnants of
vector sequence in non-vector entries.

I just noticed the flaming discussion about the correctness of GenBank that
seems to be going on in bionet.general, and I am afraid that someone could
mistake my posting for part of the witch-hunt against database maintainers.
As a matter of fact, I LIKE the way GenBank, EMBL and especially SWISS-PROT
are maintained and annotated, at least if one takes the time needed for
annotation into account.
So the reasons for my posting were the following:

- point the users of DNA databases towards the fact that there are at the
  moment vector sequences at unusual positions that might be mistaken
  for repeat sequences, because they are not ruled out by the standard
  ways of discarding known repeats/sequence families in the statistical
  analysis of repeats

- ask the database users if they agree that it is undesirable to have those
  sequences present. In my personal opinion it is desirable to remove not
  only the vector sequences themselves but also linkers and restriction
  sites that do not occur in the sequence as it is found in nature. An
  alternative submission strategy would be to submit clones as they are
  (including vectors, linkers, etc.), but for 'normal' (not 'syn') entries
  I would prefer the first approach.

- ask data-submitters if they could avoid submitting vector-containing
  sequences

- ask the database maintainers if it is possible to remove those from existing
  entries.
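
As an illustration of the first point, here is a minimal sketch of how an
entry could be screened for stretches shared with a known vector before it
goes into a repeat analysis. It assumes exact matches of a fixed word length;
the function names and the default word length of 12 are hypothetical choices
for this example, not part of any database tool. A real screen would use a
proper alignment program and a full collection of vector sequences.

```python
def kmers(seq, k):
    """Return the set of all words of length k found in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def vector_hits(entry, vector, k=12):
    """Return merged (start, end) spans of entry that share exact
    k-letter words with vector. Positions are 0-based, end exclusive."""
    entry = entry.upper()
    vector_words = kmers(vector, k)
    spans = []
    for i in range(len(entry) - k + 1):
        if entry[i:i + k] in vector_words:
            if spans and i <= spans[-1][1]:
                # Overlaps or touches the previous hit: extend that span.
                spans[-1] = (spans[-1][0], i + k)
            else:
                spans.append((i, i + k))
    return spans
```

Spans reported by vector_hits could then be masked or excluded before the
statistical analysis, so that they are not counted as candidate repeats.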

Of course, the vector sequences originate from the submitters, and this
problem is NOT restricted to the EMBL database. In my last posting I mentioned
only EMBL entries because this is the database I usually query first.

Best regards,
               Kay Hofmann

------------------------------------------------------------------------
Kay Oliver Hofmann                        Tel. ++49 201 478 6980
Institut fuer Biochemie (med. Fak.)       FAX  ++49 201 478 6979
Universitaet Koeln
Joseph Stelzmann Str. 52            INTERNET:
D-5000 Koeln 41                     KHOFMANN at cipvax.biolan.uni-koeln.de
------------------------------------------------------------------------


