Treeing partial sequences

Steven Smith smith at nucleus.harvard.edu
Wed Mar 10 14:36:04 EST 1993


robison1 at husc10.harvard.edu (Keith Robison) writes:


>I would like to make a tree for a gene family, but the problem is that
>the known members of the family are a mix of complete sequences,
>N-termini, and C-termini.  I don't need things to be perfect, just
>a decent guess at the phylogeny.  What is the best way to go about this?


>Keith Robison
>Harvard University
>Department of Cellular and Developmental Biology
>Department of Genetics / HHMI

>robison at biosun.harvard.edu 

Keith,
  One common way is to calculate a distance matrix from the
overlapping regions, and use Neighbor Joining or maximum likelihood
to resolve the phylogeny.  Phylip 3.5c has a nice function
for getting distances out of AA sequences.
  Parsimony is a problem when you don't have positional overlap across
all members, but the distance methods should give you a good approximation.
Just don't try to over-interpret your results.  UPGMA clustering makes
the least speculative statement about your data, and may be most
appropriate.
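
  As a rough illustration of the distance approach, here is a minimal
Python sketch.  It assumes the sequences are already aligned, with '-'
marking positions a fragment does not cover.  It uses a plain
p-distance (fraction of mismatches over the shared columns) rather
than Phylip's protein distance model, and the function names and toy
sequences are made up for the example.  Note that two fragments with
no shared columns have no defined distance, so the sketch refuses to
build a tree in that case.

```python
from itertools import combinations

GAP = "-"  # assumed marker for positions a fragment does not cover


def p_distance(a, b):
    """Fraction of mismatches over columns where both sequences have a residue."""
    shared = [(x, y) for x, y in zip(a, b) if x != GAP and y != GAP]
    if not shared:
        return None  # the two sequences do not overlap at all
    return sum(x != y for x, y in shared) / len(shared)


def distance_matrix(alignment):
    """Pairwise distances keyed by frozenset({name1, name2})."""
    dist = {}
    for i, j in combinations(sorted(alignment), 2):
        d = p_distance(alignment[i], alignment[j])
        if d is None:
            raise ValueError(f"no overlapping columns between {i} and {j}")
        dist[frozenset((i, j))] = d
    return dist


def upgma(alignment):
    """Cluster by UPGMA and return the topology as a Newick string."""
    dist = distance_matrix(alignment)
    clusters = {name: 1 for name in alignment}  # cluster label -> member count
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(sorted(clusters), 2),
                   key=lambda pair: dist[frozenset(pair)])
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        merged = f"({a},{b})"
        # Distance to the merged cluster is the size-weighted average.
        for c in clusters:
            dist[frozenset((merged, c))] = (
                size_a * dist[frozenset((a, c))]
                + size_b * dist[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters)) + ";"


# Toy alignment: two complete sequences, one N-terminus, one C-terminus.
alignment = {
    "full1": "MKTAYIAKQRQI",
    "full2": "MKTAYLAKQRQL",
    "nterm": "MKSAYIAK----",
    "cterm": "----YIGKQKQI",
}
print(upgma(alignment))  # (((full1,nterm),full2),cterm);
```

  The N-terminal and C-terminal fragments here share only four
columns, so their distance rests on very little data, which is one
more reason not to read too much into the resulting tree.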

Sincerely,
Steve Smith
-- 



More information about the Mol-evol mailing list