
phylip package question

Des Higgins fatherdes at eircom.net
Wed Oct 2 16:03:24 EST 2002

"Chris Hoffman" <choffman at lucas.cis.temple.edu> wrote in message
news:and3qk$pm7$1 at mercury.hgmp.mrc.ac.uk...
> Hello guys,
> I have a question regarding the DNADIST program from Phylip Package.
> I run SEQBOOT with my seqs and get my new data sets produced using
> bootstrap and so far so good. but when i use these new data sets to run
> DNADIST, the program can't run it because it finds one or more sequences
> that are supposedly too different to allow the computation to proceed.
> I tried all the methods available in the program and all give similar
> results.
> btw:  I haven't found any similar msgs running DNAPARS or DNAML
> Thanks a lot,
> Chris. ---

I have to guess here because I have not run DNADIST in a while but, when you
apply a "correction for multiple hits" (as you do when you calculate a
distance using the Jukes and Cantor, Kimura or some other model), the
program tries to estimate the true number of substitutions that have
happened between each pair of sequences, based on the observed number of
differences.  If the observed proportion of differences reaches or exceeds
75% (0.75 differences per site), then some methods get upset (they get asked
to take the log of zero or of a negative number and complain).  This level of divergence is
what you expect two completely random sequences to have (given equal numbers
of A,C,G,T) so this is very divergent.  Even if your real sequences do not
have greater divergence than this, when you bootstrap, some
pseudo-replicates can easily contain pairs of sequences that do.
Ultimately, you have some very divergent sequences and the distances between
these are hard to estimate properly.  At least that is what my hazy
recollection suggests.
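
To make the point concrete, here is a rough Python sketch of the Jukes and
Cantor correction and of how resampling columns can push a pair over the
limit.  It is only an illustration and not how DNADIST is actually coded; the
function names are made up for this example and gaps or ambiguity codes are
ignored.

```python
import math
import random


def observed_proportion(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance: d = -(3/4) * ln(1 - (4/3) * p)."""
    inner = 1.0 - (4.0 / 3.0) * p
    if inner <= 0.0:
        # p >= 0.75: the log argument is zero or negative, so no finite
        # distance exists. This is the "too different" situation.
        raise ValueError("p = %.3f is too large to correct (need p < 0.75)" % p)
    return -0.75 * math.log(inner)


def bootstrap_proportions(seq_a, seq_b, replicates, rng):
    """Resample alignment columns with replacement and return the observed
    proportion of differences in each pseudo-replicate."""
    n = len(seq_a)
    props = []
    for _ in range(replicates):
        cols = [rng.randrange(n) for _ in range(n)]
        diffs = sum(1 for i in cols if seq_a[i] != seq_b[i])
        props.append(diffs / n)
    return props


if __name__ == "__main__":
    # Hypothetical short pair that differs at 7 of 10 sites (p = 0.70).
    a = "ACGTACGTAC"
    b = "CATGCAGTAC"
    print("observed p:", observed_proportion(a, b))
    props = bootstrap_proportions(a, b, 1000, random.Random(1))
    too_far = sum(1 for p in props if p >= 0.75)
    print("pseudo-replicates with p >= 0.75:", too_far, "of", len(props))
```

The original pair is under the limit, but any pseudo-replicate that happens
to draw mostly the differing columns ends up at or above 0.75 and the
correction then fails for that replicate.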

Des Higgins
Dept. Biochemistry
UC, Cork, Ireland


More information about the Mol-evol mailing list
