Treeing partial sequences
smith at nucleus.harvard.edu
Wed Mar 10 14:36:04 EST 1993
robison1 at husc10.harvard.edu (Keith Robison) writes:
>I would like to make a tree for a gene family, but the problem is that
>the known members of the family are a mix of complete sequences,
>N-termini, and C-termini. I don't need things to be perfect, just
>a decent guess at the phylogeny. What is the best way to go about this?
>Department of Cellular and Developmental Biology
>Department of Genetics / HHMI
>robison at biosun.harvard.edu
One common way is to calculate a distance matrix from the
overlapping regions, and use Neighbor Joining or maximum likelihood
to resolve the phylogeny. PHYLIP 3.5c has a nice function
for getting distances out of amino acid sequences.
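
To make "distances from the overlapping regions" concrete, here is a rough Python sketch. It is only an illustration, not what PHYLIP does: it computes an uncorrected p-distance (fraction of differing residues) over the columns where both sequences are present, rather than a model-based distance. The function names, the `min_overlap` cutoff, and the use of "-" and "?" to mark missing positions are my assumptions; the sequences are assumed to be already aligned to the same length.

```python
from itertools import combinations

# Assumption: missing or unsequenced positions are marked with "-" or "?".
MISSING = set("-?")


def p_distance(a, b, min_overlap=20):
    """Uncorrected p-distance over the columns present in both sequences.

    Returns None when fewer than min_overlap columns are shared, e.g. an
    N-terminal fragment compared with a C-terminal fragment.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    shared = [
        (x, y)
        for x, y in zip(a.upper(), b.upper())
        if x not in MISSING and y not in MISSING
    ]
    if len(shared) < min_overlap:
        return None  # too little overlap to estimate a distance
    diffs = sum(1 for x, y in shared if x != y)
    return diffs / len(shared)


def distance_matrix(aligned, min_overlap=20):
    """Build a symmetric dict-of-dicts distance matrix from {name: sequence}."""
    names = list(aligned)
    matrix = {name: {name: 0.0} for name in names}
    for i, j in combinations(names, 2):
        d = p_distance(aligned[i], aligned[j], min_overlap)
        matrix[i][j] = d
        matrix[j][i] = d
    return matrix


if __name__ == "__main__":
    # Toy alignment: one complete sequence plus two partial ones.
    toy = {
        "full": "MKTAYIAKQR",
        "nterm": "MKTAYI----",
        "cterm": "----YLAKQR",
    }
    for name, row in distance_matrix(toy, min_overlap=3).items():
        print(name, row)
```

Pairs that come back as None are the ones to worry about: a tree-building program needs a value for every pair, so fragments that share little or nothing with each other have to be dropped or handled some other way before the matrix goes in.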
Parsimony is a problem when you don't have positional overlap across
all members, but the distance methods should give you a good approximation.
Just don't try to over-interpret your results. UPGMA clustering makes
the least speculative statement about your data, and may be most