In article <cp7vat$oge$1 at mercury.rfcgr.mrc.ac.uk>,
Thomas Isenbarger <isen at plantpath.wisc.edu> wrote:
>If the sequences were related and could be trimmed to the same length, I
>would do an alignment and then use phylip to create a distance matrix,
>but since my sequences are unrelated and cannot really be trimmed to the
>same length, I am at a loss for what to do.
>
>For a set with so many unrelated sequences of different lengths, the
>only thing I have been able to think of is an all-against-all BLAST to
>create a score matrix using the normalised bits score, but this gives
>high scores for similarities. From there, the only thought I had was to
>use the reciprocal of the BLAST score as some perverse measure of
>distance.
BLAST assumes that alignment means something. What does "similarity"
mean if there is no alignment? I think you have to answer that
first before getting into specific algorithms.
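
For concreteness, the reciprocal-score idea in the quoted message could
be sketched roughly as below. This is only an illustration, not a
recommendation. The function name, the `scores` dictionary layout, the
averaging of the two search directions, and the `no_hit` default are all
assumptions made for the example; none of them come from BLAST or PHYLIP.
Nothing here guarantees that the result behaves like a distance (for
instance, the triangle inequality may fail), which is the point of the
question above.

```python
import math

def reciprocal_distance_matrix(names, scores, no_hit=math.inf):
    """Turn pairwise similarity scores into a symmetric matrix of 1/score.

    names  -- list of sequence identifiers (hypothetical labels)
    scores -- dict mapping (query, subject) -> positive score,
              e.g. a normalised bit score parsed elsewhere
    no_hit -- value used when neither direction reported a hit (assumption)
    """
    n = len(names)
    dist = [[0.0] * n for _ in range(n)]  # diagonal stays 0.0
    for i in range(n):
        for j in range(i + 1, n):
            a, b = names[i], names[j]
            # Collect whichever of the two directions produced a usable score.
            found = [s for s in (scores.get((a, b)), scores.get((b, a)))
                     if s is not None and s > 0]
            if found:
                # Average the available directions, then take the reciprocal.
                d = 1.0 / (sum(found) / len(found))
            else:
                d = no_hit
            dist[i][j] = dist[j][i] = d
    return dist

# Made-up scores for three sequences; s2 and s3 share no hit.
names = ["s1", "s2", "s3"]
scores = {("s1", "s2"): 50.0, ("s2", "s1"): 30.0, ("s1", "s3"): 20.0}
dist = reciprocal_distance_matrix(names, scores)
```

Pairs with no hit in either direction end up infinitely far apart, so
for a set of mostly unrelated sequences most of the matrix would carry
no information at all.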
--
Joe Felsenstein joe at removethispart.gs.washington.edu
Department of Genome Sciences and Department of Biology,
University of Washington, Box 357730, Seattle, WA 98195-7730 USA