Let me add a few comments on my previous posting about remnants of
vector sequences in non-vector sequences.

I just noticed the heated discussion about the correctness of GenBank that
seems to be going on in bionet.general, and I am afraid that someone could
mistake my posting for part of a witch-hunt against the database maintainers.
As a matter of fact I LIKE the way GenBank, EMBL and especially SWISS-PROT
are maintained and annotated, at least if one takes the annotation time into
account.
So the reasons for my posting were the following:

- point out to users of DNA databases that there are at the moment
  vector sequences at unusual positions that might be mistaken for
  repeat sequences, because they are not excluded by the standard
  ways of discarding known repeats/sequence families in the statistical
  analysis of repeats

- ask the database users whether they agree that it is undesirable to have
  those sequences present. In my personal opinion it is desirable to remove
  not only the vector sequences themselves but also linkers and restriction
  sites that do not occur in the sequence as it is found in nature. Another
  submission strategy would be to submit clones as they are (including
  vectors, linkers, etc.), but for 'normal' (not 'syn') entries I would
  prefer the first way.

- ask data-submitters if they could avoid submitting vector-containing

- ask the database maintainers if it is possible to remove those from existing

Of course, the vector sequences originate from the submitters, and this
problem is NOT restricted to the EMBL database. In my last posting I mentioned
only EMBL entries because this is the database I usually query first.

