Peer review and self review of sequence data

Brian Foley btf at t10.lanl.gov
Fri Oct 11 00:09:21 EST 1996

Dear Molecular Biologists and Virologists,

        With the advent of PCR, it is now much easier and
quicker to obtain enough DNA for sequencing.  While PCR
is a very powerful tool, it must also be treated with
respect and caution.  It does not take sloppy technique
to generate a PCR artifact sequence; it can happen to even
the most careful researcher.
        In many cases, there is no economical way to determine
if a PCR-generated sequence is correct.  In the case of
sequences from Human Immunodeficiency Virus (especially of
the env gene), there are thousands of sequences already in
the database which can be used to judge the newly determined
sequence.
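
One simple check of this kind can be sketched as follows.  This is
only an illustration under stated assumptions, not the database's
actual procedure: the sequences are assumed to be already aligned to
equal length, and the function names, the toy sequences and the 99%
threshold are all hypothetical.  The idea is to compare a new sequence
against known ones and flag any that are nearly identical, since such a
match may deserve a closer look before publication.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip gap columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def flag_suspicious(new_seq, references, threshold=99.0):
    """Return (name, identity) pairs at or above the threshold, highest first.

    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    hits = [(name, percent_identity(new_seq, ref))
            for name, ref in references.items()]
    flagged = [h for h in hits if h[1] >= threshold]
    return sorted(flagged, key=lambda h: -h[1])


# Toy data, invented for illustration only.
references = {
    "lab_strain": "ACGTACGTAC",
    "field_isolate": "ACGAACTTAC",
}
new_sequence = "ACGTACGTAC"
for name, identity in flag_suspicious(new_sequence, references):
    print(f"{name}: {identity:.1f}% identical")
```

A high identity score is not proof of an error by itself; it only
marks a sequence for closer inspection.  In practice the comparison
would be run against a real alignment drawn from the database.
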
        In the 14 months I have been working for the HIV
genetic sequence database at LANL, I have discovered a great
many sequences, published in peer-reviewed journals, which
were clearly not what they were reported to be.  In some cases
only a careful analysis of the sequence can detect the error,
but in other cases the reviewers and the authors should have
been able to see from the published figures that something
was wrong.
        A recent publication:
Learn, G.H. Jr. et al., J. Virol. 70(8): 5720-5730 (1996),
discusses problems with HIV sequences, and how to detect
such problems prior to publication.  Many of the methods used
to detect problems in HIV sequences cannot be used for other
organisms from which fewer samples are available, or for which
the mutation rate is much lower.  However, I urge all researchers
who generate DNA sequence data to do as much as possible to
check their sequences for error prior to publication and/or
submission to the databases.  It is much easier, and less
embarrassing, to re-sequence a sample than to print an
erratum or re-write a paper.
        In some cases, erroneous sequence data makes little
difference to the scientific conclusions in a single
publication.  However, if the data are submitted to GenBank,
they may be re-used in another publication or used in
a court of law, to great disadvantage.

|Brian T. Foley                btf at t10.lanl.gov                      |
|HIV Database                  (505) 665-1970                        |
|Los Alamos National Lab       http://hiv-web.lanl.gov/index.html    |
|Los Alamos, NM 87544  U.S.A.  http://hiv-web.lanl.gov/~btf/home.html|
