mkkuhner at phylo.genetics.washington.edu (Mary K. Kuhner) writes:
>In article <1993Mar8.190355.21445 at husc3.harvard.edu>
>robison1 at husc10.harvard.edu (Keith Robison) writes:
>>I would like to make a tree for a gene family, but the problem is that
>>the known members of the family are a mix of complete sequences,
>>N-termini, and C-termini. I don't need things to be perfect, just
>>a decent guess at the phylogeny. What is the best way to go about this?
>Two possible ideas (I certainly won't claim to know the best way):
>1. Use a distance method such as neighbor-joining, and calculate
>distances between sequences as percent difference--you can calculate
>this even with sequences of different lengths, though it will not be as
>accurate as it would with full sequences, especially if variability is
>not randomly distributed across your gene.
If, say, the C-terminal fragments do not overlap the N-terminal ones,
some elements of the distance matrix will be undefined, and
any distance method will fail to place such fragments correctly in the
tree unless it includes special treatment for the undefined elements.
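
To make the problem concrete, here is a rough sketch (not anyone's published method) of a percent-difference calculation restricted to the columns where both sequences have a residue. It assumes the sequences are already aligned to equal length with "-" marking missing positions; the toy sequences and the names percent_difference and distance_matrix are made up for illustration. Pairs with no shared columns come out as NaN, which is exactly the undefined entry a tree-building program would have to handle.

```python
import math

GAP = "-"  # assumed marker for positions a fragment does not cover


def percent_difference(a, b):
    """Fraction of differing sites among columns where both sequences
    have a residue. Returns NaN when the two sequences share no columns."""
    compared = 0
    differing = 0
    for x, y in zip(a, b):
        if x == GAP or y == GAP:
            continue  # skip columns missing from either sequence
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        return math.nan  # no overlap: the distance is undefined
    return differing / compared


def distance_matrix(aligned):
    """Pairwise distances for a dict mapping name -> aligned sequence."""
    names = list(aligned)
    return {(p, q): percent_difference(aligned[p], aligned[q])
            for p in names for q in names}


# Hypothetical toy alignment: one full sequence and two fragments.
aligned = {
    "full":   "MKTAYIAKQR",
    "n_term": "MKTAY-----",
    "c_term": "-----IAKQK",
}

d = distance_matrix(aligned)
for (p, q), value in d.items():
    if p < q:
        print(p, q, value)
```

Here the full sequence has a defined distance to each fragment, but the n_term/c_term entry is NaN because the two fragments share no aligned columns. A program that silently treats that entry as zero or as some large number will misplace the fragments.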