Gaps in DNA sequence analysis

Stephen A. Karl karl at CHUMA.CAS.USF.EDU
Mon Dec 4 14:34:28 EST 1995

A quick but potentially important point:

Doug Eernisse writes:

  >>>>> lots of good information removed <<<<<<

> In other words, as the alignment deteriorates, 
> one should become increasingly wary of treating gaps as anything other 
> than missing data.
  It seems that this is actually a very general problem.  Why focus only on
the difficulty of determining the homology of the in/dels?  When the
alignment gets bad, confidence in the _general_ homology should also be of
concern.  If many gaps are needed to align a sequence then it would seem
that the degree of divergence between the sequences is such that little
"true" phylogenetic information may be available.  The converse is also
probably true.  If the sequences are easily aligned and there are "clear"
gaps necessary for the alignment then including them in the data set is
probably warranted.  Weighting is still a problem, but considering each gap
as a single step, whatever its length, is generally conservative.
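  To make the "single step" idea concrete, here is a rough Python sketch
(my own illustration, not taken from any particular program).  It assumes
the alignment is a dict of equal-length strings with "-" as the gap symbol,
and it scores each distinct gap interval as one presence/absence character
regardless of its length.  The names gap_runs and code_gaps, the taxon
labels, and the use of "?" where a different, overlapping gap makes the
state ambiguous are all assumptions made for the example.

```python
import re


def gap_runs(seq, gap="-"):
    """Return (start, end) index pairs for each contiguous run of gap symbols."""
    return [(m.start(), m.end()) for m in re.finditer(re.escape(gap) + "+", seq)]


def code_gaps(alignment):
    """Code each distinct gap interval as one presence/absence character.

    alignment: dict of name -> aligned sequence (equal lengths assumed).
    Returns (intervals, matrix), where matrix[name] holds one symbol per
    interval: '1' if the sequence has exactly that gap, '0' if it has no
    gap there, '?' if a different gap overlaps the interval.
    """
    runs = {name: set(gap_runs(seq)) for name, seq in alignment.items()}
    intervals = sorted(set().union(*runs.values()))
    matrix = {}
    for name, own in runs.items():
        states = []
        for start, end in intervals:
            if (start, end) in own:
                states.append("1")
            elif any(s < end and start < e for s, e in own):
                states.append("?")  # overlapping but non-identical gap
            else:
                states.append("0")
        matrix[name] = "".join(states)
    return intervals, matrix


# Hypothetical four-taxon alignment, for illustration only.
alignment = {
    "taxonA": "ACGT---ACGTAC",
    "taxonB": "ACGT---ACGTAC",
    "taxonC": "ACGTTTGACGTAC",
    "taxonD": "ACG-----CGTAC",
}
intervals, matrix = code_gaps(alignment)
for name in sorted(matrix):
    print(name, matrix[name])
```

  Here the three-base gap shared by taxonA and taxonB counts as one
character, not three, which is the conservative choice.  How to weight
that character against substitutions is still left to the analyst.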
  I believe that this is an important point because many people work with 
sequences where gaps are less of a problem.  People working with ribosomal 
DNA sequences are probably not in this group and have legitimate 
concerns.  For others where in/dels are not as frequent, it really would 
be wasteful to throw out this information.  Since many different types of 
people read net groups like this, it is important to stress that the 
in/dels themselves are _not_ necessarily the problem.  Yes, deletions 
hide information (e.g., sequence substitutions or other smaller 
deletions which were contained within the larger deleted region) and 
may be the result of parallel evolutionary changes (stem-loop 
structures will have a common underlying _mechanism_ for their deletion 
with the pattern of removal being independent of the lineage relationships).
However, this is not the case for all in/dels and many in/dels may be 
well behaved characters amenable to evolutionary analysis.

Remember -- it is only worth what you paid for it!

Steve Karl

Stephen Karl
Department of Biology
University of South Florida
4202 East Fowler Ave, LIF 169
Tampa, Florida 33620-5150
Voice (813) 974-1592
Fax   (813) 974-3263
EMail Karl at chuma.cas.usf.edu
