Vector contamination in seq. data (Self-righteous tirade)
danj at welchdev.welch.jhu.edu
Sat Jul 17 08:44:45 EST 1993
In article <CAA1Jq.9t8 at ccu.umanitoba.ca> frist at ccu.umanitoba.ca () writes:
>I just tried to retrieve the Lambda phage sequence from
>retrieve at ncbi.nlm.nih.gov. I got back everything BUT Lambda!
>The reason is that vector contamination is now annotated in GenBank,
>and the email server performs a search on all annotation, you get any
>entry that in any way refers to the sequence you requested. So when I
>tried retrieving with Lambda's unique accession number (J02459) I got
>so many vector-contaminated sequences that it exceeded the line limit that
>I had set at what I thought was a conservatively high figure, before it
>even got to Lambda itself!
>The purpose of this posting is not to gripe at NCBI. I have already sent
>suggestions to them on how I think this problem might be overcome. What
>I want to do here is vent my spleen about sloppy sequencing!
>How can you not know that your sequence contains vector? You know who you
>are; yes I'm talking to YOU! Yes you over there with the 32P all over
>your hands -- are you so sloppy that you don't even bother to look
>for the BEGINNING of your insert on each and every sequencing run?
I agree with Brian that vector contamination in the databases is a problem,
and that researchers should be more careful about the data that they submit
(and that some serious spleen venting is in order).
For retrieval, though, it has gotten easier over the past year and a half or so
to pull sequences out of the databases. A gopher search of GenBank takes a
matter of seconds, and if your first search was too broad you just
refine it a little and search again.
In this case a simple search like:
lambda and phage and genome
pulls the entry of interest right out. Or if you do have the accession
number - a search for
also pulls the entry of interest right out in a matter of seconds.
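To show why the two kinds of search behave so differently, here is a minimal
sketch in Python. It is only an illustration, not how the NCBI server or the
gopher index actually works: the record layout, the field names, the
definition text, and the second entry are all made up, and the AND search is
assumed to look only at the definition line. The one real value is the
Lambda accession number quoted above.

```python
# Toy records. Everything except the accession J02459 is hypothetical.
RECORDS = [
    {"accession": "J02459",
     "definition": "Bacteriophage lambda, complete genome",
     "annotation": ""},
    {"accession": "HYPOTHETICAL1",
     "definition": "Example cDNA clone, partial sequence",
     "annotation": "vector contamination: lambda (J02459)"},
]

def search_all_text(records, term):
    """Match the term anywhere in an entry, annotation included."""
    term = term.lower()
    return [r["accession"] for r in records
            if any(term in value.lower() for value in r.values())]

def search_and(records, *terms):
    """Boolean AND: every term must appear in the definition line."""
    return [r["accession"] for r in records
            if all(t.lower() in r["definition"].lower() for t in terms)]

def search_accession(records, accession):
    """Exact match on the accession field only."""
    return [r["accession"] for r in records if r["accession"] == accession]

if __name__ == "__main__":
    # Searching all text also returns the entry that merely mentions Lambda.
    print(search_all_text(RECORDS, "J02459"))
    # The AND search and the exact accession match return Lambda alone.
    print(search_and(RECORDS, "lambda", "phage", "genome"))
    print(search_accession(RECORDS, "J02459"))
```

The point is that a search over every annotation field returns any entry
that mentions the sequence, while a search restricted to a specific field
returns only the entry itself.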
Best of luck,
danj at welchgate.welch.jhu.edu