DNA substitutions saturated?

Mark Siddall mes at zoo.toronto.edu
Wed Dec 6 00:40:52 EST 1995


In article <bengt.oxelman-0412951325470001 at mac38.systbot.gu.se> bengt.oxelman at systbot.gu.se (Bengt Oxelman) writes:

             (actually I wrote this following bit)
>> 1) indel events are not observed data (that is one does not 
>> observe gaps in a sequence), they are matter of inference, thus, should
>> not be treated as observed data points (i.e., code them as "missing").
>
>Length differences are as 'observed' as polymorphisms at 'inferred'
>nucleotide positions.

I disagree to a point.
Polymorphisms are not observed either.  No one sequence has more than one
base at any given position.  Multiple sequences from multiple isolates
may have different bases that are observed, but no one "observes"
a polymorphism.

Regardless, this differs from my point.
My point is: change the alignment parameters and you change the
homology statement about gaps (in many circumstances).

Take for example:
Taxon I   AACCGTACT
Taxon II  AACT

In so far as one could get:
AACCGTACT
AAC-----T
or 
AACCGTACT
AA-----CT
or 
AACCGTACT
A-----ACT

under the same alignment parameters, these are obviously not "observations"
but inferences.
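
To make the tie concrete, here is a minimal Python sketch (an illustration only, not taken from any alignment program). It slides a single five-base gap along Taxon II and counts mismatches against Taxon I. The function name single_gap_placements is hypothetical, and the sketch assumes only one contiguous gap is allowed, so every placement pays the same gap cost whatever the gap penalty.

```python
def single_gap_placements(long_seq, short_seq):
    """Return (aligned_short_seq, mismatches) for every position of one contiguous gap."""
    gap_len = len(long_seq) - len(short_seq)
    results = []
    for start in range(len(short_seq) + 1):
        # Insert the whole gap before position `start` of the short sequence.
        aligned = short_seq[:start] + "-" * gap_len + short_seq[start:]
        # Count only base-vs-base columns that disagree; gap columns are skipped.
        mismatches = sum(
            1 for a, b in zip(long_seq, aligned) if b != "-" and a != b
        )
        results.append((aligned, mismatches))
    return results


placements = single_gap_placements("AACCGTACT", "AACT")
best = min(m for _, m in placements)
ties = [aligned for aligned, m in placements if m == best]

for aligned in ties:
    print(aligned, best)
```

The three placements listed above all come out with zero mismatches, so nothing in the score prefers one over the others.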

I do not think this is all that trivial, and it makes me wonder about the
validity of coding gaps as a "fifth state".

The alternative is to treat them as uninformative, but this does not
really treat them as "nothing"; it treats them as whichever of the four
observed states (ACGT) is most parsimonious, notwithstanding that
none of the four observed states was observed or could rationally
be placed in that position.
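
As a rough sketch of what "whatever is most parsimonious" means in practice, the Python below runs a Fitch-style down-pass on a tiny tree. Fitch optimization is my choice for the illustration; the function name fitch, the nested-tuple tree format, and the toy trees are all assumptions, not anything from a particular program. A missing cell ("?" or "-") is given the full set {A, C, G, T}, so it takes on whatever its neighbours have.

```python
STATES = frozenset("ACGT")


def fitch(node):
    """Fitch down-pass on a nested-tuple binary tree.

    Leaves are one-character strings; returns (state_set, number_of_changes).
    """
    if isinstance(node, str):
        # A missing or gap cell is compatible with every observed state.
        if node in ("?", "-"):
            return STATES, 0
        return frozenset(node), 0
    left_set, left_cost = fitch(node[0])
    right_set, right_cost = fitch(node[1])
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    # No shared state: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# The "?" taxon is silently reconstructed as A in the first tree, G in the second.
among_a = fitch((("A", "?"), ("A", "A")))
among_g = fitch((("G", "?"), ("G", "G")))
print(sorted(among_a[0]), among_a[1])
print(sorted(among_g[0]), among_g[1])
```

In both toy trees the missing cell costs nothing and is assigned a base, even though no base was observed there.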

I like Doug's idea of coding gaps separately, like:
Taxon 1  AACCGTCAGTCAGT-----CGACGTACGTACGTAC 0
Taxon 2  AACCGTCAGTCAGT-----CGACGTACGTACGTAC 0
Taxon 3  AACCGTCAGTCAGTGGACTCGACGTACGTACGTAC 1
Taxon 4  AACCGTCAGTCAGTGGACTCGACGTACGTACGTAC 1

But it has only limited utility and is still a matter of inference, since
if we add a Taxon 5 and a Taxon 6 we could get:

Taxon 1  AACCGTCAGTCAGT-----CGACGTACGTACGTAC 
Taxon 2  AACCGTCAGTCAGT-----CGACGTACGTACGTAC 
Taxon 3  AACCGTCAGTCAGTGGACTCGACGTACGTACGTAC 
Taxon 4  AACCGTCAGTCAGTGGACTCGACGTACGTACGTAC 
Taxon 5  AACCGTCAGTCAGT---CTCGACGTACGTACGTAC 
Taxon 6  AACCGTCAGTCAGT-GACTCGACGTACGTACGTAC
and now what?
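
One way to see the difficulty is to list the gap intervals in each row of the six-taxon alignment. The Python sketch below does that; the function name gap_intervals and the dictionary layout are hypothetical, and I am assuming the odd character in Taxon 6 is an ordinary one-base gap. Intervals are half-open (start, end) column indices counted from zero.

```python
import re


def gap_intervals(row):
    """Return half-open (start, end) column intervals for each run of '-' in a row."""
    return [(m.start(), m.end()) for m in re.finditer(r"-+", row)]


alignment = {
    "Taxon 1": "AACCGTCAGTCAGT-----CGACGTACGTACGTAC",
    "Taxon 2": "AACCGTCAGTCAGT-----CGACGTACGTACGTAC",
    "Taxon 3": "AACCGTCAGTCAGTGGACTCGACGTACGTACGTAC",
    "Taxon 4": "AACCGTCAGTCAGTGGACTCGACGTACGTACGTAC",
    "Taxon 5": "AACCGTCAGTCAGT---CTCGACGTACGTACGTAC",
    "Taxon 6": "AACCGTCAGTCAGT-GACTCGACGTACGTACGTAC",  # assumed: '_' read as '-'
}

# Gap pattern per taxon, then the set of distinct gaps across the alignment.
patterns = {name: tuple(gap_intervals(row)) for name, row in alignment.items()}
distinct = sorted({iv for ivs in patterns.values() for iv in ivs})

for name, ivs in patterns.items():
    print(name, ivs)
print(distinct)
```

With four taxa there is one gap and a single 0/1 column covers it. With six there are three overlapping gaps of different lengths, and deciding whether they are one event, nested events, or independent events is again an inference.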

Mark

-- 
Mark E. Siddall                "I don't mind a parasite...
mes at vims.edu                    I object to a cut-rate one" 
Virginia Inst. Marine Sci.                     - Rick
Gloucester Point, VA, 23062



More information about the Mol-evol mailing list
