Gaps in DNA sequence analysis
Stephen A. Karl
karl at CHUMA.CAS.USF.EDU
Mon Dec 4 14:34:28 EST 1995
A quick but potentially important point:
Doug Eernissee writes:
>>>>> lots of good information removed <<<<<<
> In other words, as the alignment deteriorates,
> one should become increasingly wary of treating gaps as anything other
> than missing data.
It seems that this is actually a very general problem. Why focus on the
difficulties of determining the homology of the in/dels only? When the
alignment gets bad, confidence in the _general_ homology should also be of
concern. If many gaps are needed to align a sequence, then it would seem
that the degree of divergence between the sequences is such that little
"true" phylogenetic information may be available. The converse is also
probably true. If the sequences are easily aligned and "clear" gaps are
necessary for the alignment, then including them in the data set is
probably warranted. Weighting is still a problem, but counting each gap
as a single step, whatever its length, is generally conservative.
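To make the "single step" idea concrete, here is a minimal sketch, assuming
gaps are written as "-" in an aligned sequence. The function names
(gap_runs, gap_steps) and the example sequence are hypothetical and do not
come from any particular phylogenetics package. It only contrasts counting
each contiguous gap once with counting every gapped position separately.

```python
import re

def gap_runs(aligned_seq, gap_char="-"):
    """Return (start, end) index pairs, end exclusive, for each contiguous gap run."""
    pattern = re.escape(gap_char) + "+"
    return [(m.start(), m.end()) for m in re.finditer(pattern, aligned_seq)]

def gap_steps(aligned_seq, per_position=False, gap_char="-"):
    """Count gap steps in one aligned sequence.

    per_position=False: each contiguous gap is one step (the conservative choice).
    per_position=True:  every gapped position is its own step.
    """
    runs = gap_runs(aligned_seq, gap_char)
    if per_position:
        return sum(end - start for start, end in runs)
    return len(runs)

# Hypothetical aligned sequence with a 3-base gap and a 1-base gap.
seq = "ACG---TTA-GC"
print(gap_runs(seq))                      # [(3, 6), (9, 10)]
print(gap_steps(seq))                     # 2
print(gap_steps(seq, per_position=True))  # 4
```

Counting per position weights the 3-base gap three times as heavily as the
1-base gap, even though it may reflect a single deletion event. Counting
runs avoids that, at the cost of ignoring gap length entirely.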
I believe that this is an important point because many people work with
sequences where gaps are less of a problem. People working with ribosomal
DNA sequences are probably not in this group and have legitimate
concerns. For others, where in/dels are not as frequent, it really would
be wasteful to throw out this information. Since many different types of
people read net groups like this, it is important to stress that the
in/dels themselves are _not_ necessarily the problem. Yes, deletions
hide information (e.g., sequence substitutions or other smaller
deletions which were contained within the larger deleted region) and
may be the result of parallel evolutionary changes (stem-loop
structures will have a common underlying _mechanism_ for their deletion
with the pattern of removal being independent of the lineage relationships).
However, this is not the case for all in/dels, and many in/dels may be
well-behaved characters amenable to evolutionary analysis.
Remember -- it is only worth what you paid for it!
Department of Biology
University of South Florida
4202 East Fowler Ave, LIF 169
Tampa, Florida 33620-5150
Voice (813) 974-1592
Fax (813) 974-3263
EMail Karl at chuma.cas.usf.edu