In article <7gnhmg$bm1 at net.bio.net>, s b <biolinux at yahoo.com> writes:
>I'm wondering if there is any software or algorithms, given a set of
>DNA sequences to find a maximally divergent segment (of length N) in
>each sequence so when aligned to all other sequences it is of the
>highest divergence. Most people are looking for similarities but I'd
>like to do this!! I guess that I could use blast, but that would give
>me similarities. I tried flipping scores for matches and mismatches
>but it did not quite work!!
I have never seen a tool that does this specifically.
You could get an approximate result by chopping up each test sequence into
small fragments, FASTA or BLAST each fragment against all the other
(intact) sequences, and keep track of the best scores for each search. The
fragment with the lowest "best" score is roughly the piece you're after.
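
The fragment search above can be sketched in a few lines of Python. This is
only a rough illustration, not a tested tool: it assumes fixed-length windows
of length n, and it swaps the FASTA/BLAST step for a plain ungapped identity
count on the forward strand only. The function names are made up for this
example.

```python
def best_ungapped_identity(fragment, target):
    """Highest fraction of matching bases over every ungapped placement
    of fragment along target (stand-in for a FASTA/BLAST best score).
    Returns 0.0 if target is shorter than fragment."""
    n = len(fragment)
    best = 0.0
    for start in range(len(target) - n + 1):
        window = target[start:start + n]
        matches = sum(a == b for a, b in zip(fragment, window))
        best = max(best, matches / n)
    return best


def most_divergent_fragment(query, others, n):
    """Return (offset, fragment, score) for the length-n window of query
    whose best match against all the other sequences is the lowest.
    Ties go to the leftmost window."""
    result = None
    for offset in range(len(query) - n + 1):
        fragment = query[offset:offset + n]
        # "Best" score of this fragment against any of the intact sequences.
        best = max(best_ungapped_identity(fragment, t) for t in others)
        if result is None or best < result[2]:
            result = (offset, fragment, best)
    return result


if __name__ == "__main__":
    print(most_divergent_fragment("ACGTACGTTTTT", ["ACGTACGTACGT"], 4))
```

For real data you would replace best_ungapped_identity with the score parsed
from an actual FASTA or BLAST run, which also handles gaps and the reverse
strand.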
"Most divergent" is a very broad target. In the simplest case, assuming gaps
are not allowed, a sequence of length N has 3^N "most divergent" partners.
(That is, every base is replaced by one of the three other bases.)
Conversely, there is only one "most similar" target - itself.
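
To make the 3^N count concrete, here is a small sketch that enumerates every
ungapped sequence differing from a given one at all positions. It assumes a
plain four-letter alphabet (no ambiguity codes), and the function name is
invented for the example.

```python
from itertools import product


def fully_mismatched(seq, alphabet="ACGT"):
    """List every sequence of the same length that differs from seq at
    every position (three alternative bases per position)."""
    choices = [[b for b in alphabet if b != base] for base in seq]
    return ["".join(p) for p in product(*choices)]


if __name__ == "__main__":
    partners = fully_mismatched("AC")
    print(len(partners), partners)
```

The list grows as 3^N, so this is only practical for very short N; it is
meant to show the size of the target, not to be used as a search.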
mathog at seqaxp.bio.caltech.edu
Manager, sequence analysis facility, biology division, Caltech