Hello,
Let me add a few comments on my previous posting about remnants of
vector sequence in non-vector sequences.
I just noticed the heated discussion about the correctness of GenBank that
seems to be going on in bionet.general, and I am afraid that someone could
mistake my posting for part of a witch-hunt against the database maintainers.
As a matter of fact, I LIKE the way GenBank, EMBL and especially SWISS-PROT
are maintained and annotated, at least if one takes the annotation time into
account.
So the reasons for my posting were the following:
- point the users of DNA databases to the fact that there are at the
moment vector sequences at unusual positions that might be mistaken
for repeat sequences, because they are not excluded by the standard
ways of discarding known repeats/sequence families in the statistical
analysis of repeats
- ask the database users if they agree that it is undesirable to have those
sequences present. In my personal opinion it is desirable to remove not only
the vector sequences themselves but also linkers and restriction sites
that do not occur in the sequence as it occurs in nature. An alternative
submission strategy would be to submit clones as they are (including vectors,
linkers, etc.), but for 'normal' (not 'syn') entries I would prefer the
first approach.
- ask data-submitters if they could avoid submitting vector-containing
sequences
- ask the database maintainers if it is possible to remove those from existing
entries.
Of course, the vector sequences originate from the submitters, and this
problem is NOT restricted to the EMBL database. In my last posting I mentioned
only EMBL entries because this is the database I usually query first.
Best regards,
Kay Hofmann
------------------------------------------------------------------------
Kay Oliver Hofmann Tel. ++49 201 478 6980
Institut fuer Biochemie (med. Fak.) FAX ++49 201 478 6979
Universitaet Koeln
Joseph Stelzmann Str. 52 INTERNET:
D-5000 Koeln 41 KHOFMANN at cipvax.biolan.uni-koeln.de
------------------------------------------------------------------------