If anyone wants an objective procedure for treating uncertain regions
in alignments (in the absence of a badly needed statistical
framework for alignment+phylogeny uncertainty), here is one.
First, get a multiple alignment by some objective procedure (see
below *).
Second, use the SOAP (stability of aligned positions) method of
Ari Löytynoja & Michel Milinkovitch
(http://evol-linux1.ulb.ac.be/~aril/SOAP/) to get reliability scores
for each alignment column. This requires having a set of alternative
alignments, because SOAP takes 1 reference alignment and a set of
N alignments (typically the reference alignment plus a set of
alternatives), and computes, for each column in the reference
alignment, the frequency with which its juxtaposition of
sequences/positions occurs in the set of alignments. Each column
thus gets a score ranging from 1/N (unique to the reference alignment)
to 1 (found in all alignments).
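
The scoring idea can be sketched in a few lines of Python. This is only an illustration of the column-frequency calculation described above, not SOAP's own code. The function names are made up, and it assumes that every alignment lists the same sequences in the same order, that gaps are written as "-", and that a gap is recorded as "no residue" in a column's signature (SOAP's actual treatment of gaps may differ).

```python
def column_signatures(alignment):
    """Describe each column by which residue of each sequence it contains.

    `alignment` is a list of equal-length gapped strings. Each column becomes
    a tuple holding, per sequence, the index of the residue in the ungapped
    sequence, or None where that sequence has a gap.
    """
    counters = [0] * len(alignment)
    signatures = []
    for column in zip(*alignment):
        signature = []
        for i, char in enumerate(column):
            if char == "-":
                signature.append(None)
            else:
                signature.append(counters[i])
                counters[i] += 1
        signatures.append(tuple(signature))
    return signatures


def column_stability(reference, alignments):
    """For each reference column, the fraction of alignments containing it."""
    column_sets = [set(column_signatures(a)) for a in alignments]
    n = len(column_sets)
    return [
        sum(signature in columns for columns in column_sets) / n
        for signature in column_signatures(reference)
    ]


# Toy data: two alignments of the sequences ACGT and AGT.
aln_a = ["ACGT", "A-GT"]
aln_b = ["ACGT", "-AGT"]
scores = column_stability(aln_a, [aln_a, aln_b])
print(scores)
```

With the reference included in the set, a column found nowhere else scores 1/N and a column found everywhere scores 1, matching the range given above.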
Third, use the reliability scores in your analysis. For instance,
if you are doing parsimony, you can use the alignment reliability
as a character weight. This is easy with PAUP because Ari Löytynoja
has modified a command-line version of SOAP to produce NEXUS output
with a matrix of character weights that can be read directly into
PAUP. If you are using some other method of analysis, you could
at least apply a threshold value to exclude unreliable alignment
columns. This would address the kind of problem that Jerry Learn
mentioned, by distinguishing reliable gaps from unreliable gaps.
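
For programs that cannot read the SOAP weights directly, the two uses of the scores mentioned above can be sketched as follows. The function names, the 0.8 threshold and the scale factor of 10 are arbitrary choices for illustration, not values recommended by SOAP or required by PAUP.

```python
def reliable_columns(scores, threshold=0.8):
    """Indices of columns whose reliability score meets the threshold."""
    return [i for i, score in enumerate(scores) if score >= threshold]


def filter_alignment(alignment, keep):
    """Keep only the listed columns of each gapped sequence string."""
    return ["".join(row[i] for i in keep) for row in alignment]


def integer_weights(scores, scale=10):
    """Turn scores in (0, 1] into whole-number character weights."""
    return [round(score * scale) for score in scores]


# Toy data: one score per alignment column.
alignment = ["ACGT", "A-GT"]
scores = [0.5, 0.5, 1.0, 1.0]

keep = reliable_columns(scores)
print(filter_alignment(alignment, keep))
print(integer_weights(scores))
```

Weighting keeps every column in the analysis but discounts the doubtful ones, whereas thresholding discards them outright, so the threshold itself becomes one more parameter to justify.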
* To choose the best alignment, we combine the first two steps
in a somewhat tedious process that is made manageable by Perl scripts.
We produce a set of a few dozen multiple alignments with a range
of gap parameters. Then we subject *each one* to SOAP scoring.
Then we simply choose the alignment with the most reliable columns.
In effect, this is an objective method for choosing gap parameters.
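
The selection step might look something like the sketch below. It is not our Perl code. It assumes that each candidate alignment has already been scored, and it reads "most reliable columns" as "the largest number of columns at or above a threshold", which is only one possible reading. The alignment labels are hypothetical.

```python
def count_reliable(scores, threshold=1.0):
    """Number of columns whose score is at or above the threshold."""
    return sum(1 for score in scores if score >= threshold)


def choose_alignment(scores_by_alignment, threshold=1.0):
    """Label of the alignment with the most reliable columns.

    `scores_by_alignment` maps a label (for example, the gap parameters
    used) to that alignment's list of column scores. Ties go to the
    first label encountered.
    """
    return max(
        scores_by_alignment,
        key=lambda label: count_reliable(scores_by_alignment[label], threshold),
    )


# Hypothetical labels and toy scores for two candidate alignments.
candidates = {
    "gapopen10_gapext0.2": [0.5, 0.5, 1.0, 1.0],
    "gapopen5_gapext0.1": [1.0, 1.0, 1.0, 0.25],
}
print(choose_alignment(candidates))
```

Other criteria, such as the mean column score, could be swapped in by changing the key function.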
Arlin
-- 
Arlin Stoltzfus (arlin at carb.nist.gov)
Research Biologist, NIST; Adj. Asst. Prof., UMBI
CARB, 9600 Gudelsky Dr., Rockville, Md 20850
ph. 301 738-6208; fax 301 738-6255; http://www.molevol.org/camel