[Computational-biology] PHYLIP and DNADIST

Chris Hoffmann via comp-bio@net.bio.net (by hoffmanc from mail.med.upenn.edu)
Wed Jun 27 10:44:37 EST 2007


Hi everybody,
I have a question about DNADIST, from the PHYLIP package.
I am conducting a large sequencing project that will run in several phases. I 
would like to construct a distance matrix using DNADIST with an initial 
dataset and later add more sequences to the set, but I would rather not 
have to re-run the program on all the sequences again. Is there a way 
to insert only the new data into the matrix?
For example:
initially I want to calculate the distances between the sequences in 
group A;
then, when I get group B, calculate the distances between the 
sequences within group B;
and finally calculate the distances between the sequences in groups A and B, 
without having to re-calculate the distances within group A.
This is a simple example; I am actually likely to have 5 or more sets of 
sequences, ranging from 5000 to 20000 sequences per group (perhaps more).
I realize I may have to adapt the code (another issue entirely), but what I 
am concerned about is whether the methods used by DNADIST give reliable 
results if I calculate the distances in this fashion.
I would like to use the F84 model, the default, but I am open to suggestions.
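To make the A/B example above concrete, here is a minimal Python sketch of the block-wise update I have in mind. It is not DNADIST code: the function names (jc69, extend_matrix) are made up for illustration, and it uses the Jukes-Cantor distance as a stand-in for F84 because that distance depends only on the two sequences being compared. It also assumes the sequences are already aligned and that names are unique across groups. For a model with parameters shared across the whole data set (base frequencies, for example), I assume those would have to be held fixed between batches for the old and new entries to stay comparable.

```python
import math
from itertools import combinations


def jc69(s1, s2):
    """Jukes-Cantor distance between two aligned sequences (stand-in, not F84)."""
    # Compare only positions where both sequences have an unambiguous base.
    pairs = [(a, b) for a, b in zip(s1, s2) if a in "ACGT" and b in "ACGT"]
    if not pairs:
        return float("nan")
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:
        return float("inf")  # saturated: the correction is undefined
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def extend_matrix(dist, old, new, metric=jc69):
    """Add the sequences in `new` to `dist` without touching old-old pairs.

    `dist` maps frozenset({name1, name2}) to a distance;
    `old` and `new` map sequence names to aligned sequences.
    """
    # Distances within the new group (B vs B).
    for (n1, s1), (n2, s2) in combinations(new.items(), 2):
        dist[frozenset((n1, n2))] = metric(s1, s2)
    # Distances between the new group and the existing one (B vs A).
    for n1, s1 in new.items():
        for n2, s2 in old.items():
            dist[frozenset((n1, n2))] = metric(s1, s2)
    return dist


group_a = {"a1": "ACGTACGTAC", "a2": "ACGTACGTAA"}
group_b = {"b1": "ACGTTCGTAC", "b2": "ACGAACGTAC"}

dist = extend_matrix({}, {}, group_a)       # phase 1: within A
dist = extend_matrix(dist, group_a, group_b)  # phase 2: within B, and A vs B
```

With n sequences already in the matrix and m new ones, this computes m(m-1)/2 + n*m distances instead of the (n+m)(n+m-1)/2 needed to redo everything.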
Any help on this would be great.
Thanks
Chris




