In <4aej39$553 at nntp3.u.washington.edu> joe at evolution.genetics.washington.edu
(Joe Felsenstein) writes:
>In article <4acsqs$bhu at news.nstn.ca>, Andrew Roger <aroger at ac.dal.ca> wrote:
>>It also turns out, annoyingly, that many distance corrections,
>>both simple and complex, do not adequately correct for
>>saturation (you still see the saturation pattern).
>>I can see how saturated
>>distances confound distance tree estimation. But it's
>>not clear that parsimony and likelihood are going to
>>be affected in the same way. Can anyone help me
>>with this?
>If the distances are saturated, this means that there is no sign that any
>two of the sequences are related at all -- they look completely random
>relative to each other. That will lead any method -- distance, likelihood,
>or parsimony
>-- to have long and wildly variable branch lengths, and random tree
>topology. The noise has overwhelmed the signal. So methods will give
>purely spurious results.
>(That is why many of us are highly suspicious of the method -- which
>fortunately seems to have died out -- of using a totally random sequence as
>an outgroup when no real outgroup is available. It would give a result based
>only on noise, and connect in at an unpredictable place in the tree.)
Do people very often start with a given sequence and randomize it in plausible
ways, to observe how well the various methods represent the actual
relationships?
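
To make the question concrete, here is a minimal sketch of the kind of test
I have in mind. It is only an illustration under stated assumptions, not
anyone's published procedure: it assumes the Jukes-Cantor model (all
substitutions equally likely), and the names random_sequence, evolve,
p_distance and jc_distance are made up for this example. It mutates a copy
of a starting sequence by a known amount, then compares the observed
fraction of differing sites and the Jukes-Cantor corrected distance with
the true value.

```python
import math
import random

BASES = "ACGT"


def random_sequence(length, rng):
    """Return a random DNA sequence with equal base frequencies."""
    return "".join(rng.choice(BASES) for _ in range(length))


def evolve(seq, subs_per_site, rng):
    """Mutate seq under the Jukes-Cantor model (an assumption of this sketch).

    subs_per_site is the true expected number of substitutions per site.
    Under Jukes-Cantor, a site ends up different from its starting base
    with probability 3/4 * (1 - exp(-4d/3)), and each of the three other
    bases is then equally likely.
    """
    p_change = 0.75 * (1.0 - math.exp(-4.0 * subs_per_site / 3.0))
    out = []
    for base in seq:
        if rng.random() < p_change:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def p_distance(a, b):
    """Observed fraction of sites at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jc_distance(p):
    """Jukes-Cantor corrected distance; infinite once p reaches 3/4."""
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    rng = random.Random(1)
    ancestor = random_sequence(1000, rng)
    # Compare the true amount of change with what each distance recovers.
    for true_d in (0.1, 0.5, 1.0, 2.0, 5.0):
        descendant = evolve(ancestor, true_d, rng)
        p = p_distance(ancestor, descendant)
        print(f"true={true_d:4.1f}  observed p={p:.3f}  JC={jc_distance(p):.3f}")
```

As the true distance grows, the observed fraction of differences should
level off near 3/4, and the corrected distance becomes very unstable or
undefined there, which is the saturation pattern discussed above. To test
tree methods rather than single distances, one would presumably evolve
several descendants along a known tree in the same way and check whether
each method recovers that tree.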