Janine Graves wrote:
> I'm a mathematician just starting to look at multiple sequence alignment
> of DNA. I'm not very familiar with the field and seek references on
> algorithms or survey articles on this subject.
I am a magazine editor, not an expert in this field, but I can point you
to a few references. I did a Web search on Alta Vista
(http://www.altavista.digital.com) on the words <"multiple sequence
alignment"> (_include_ the quotes) and came up with many hits. An
interesting one is at
http://www.krl.caltech.edu/~brown/alife/news/biocomp-html/0140.html
which is an old archived post to this group summarizing some responses
to a query similar to yours. Two of the papers mentioned are:

TI: Comparative analysis of multiple protein-sequence alignment methods.
AU: McClure-MA; Vasi-TK; Fitch-WM
SO: Mol-Biol-Evol. 1994 Jul; 11(4): 571-92

TI: Comprehensive study on iterative algorithms of multiple sequence
alignment.
AU: Hirosawa-M; Totoki-Y; Hoshida-M; Ishikawa-M
SO: Comput-Appl-Biosci. 1995 Feb; 11(1): 13

Another useful site might be
http://geoff.biop.ox.ac.uk/papers/rev93_1/section3_6.html
I believe DIMACS at Rutgers held some interesting workshops on the
mathematics of sequence analysis last year; a brief mention of a
relevant one is at
http://dimacs.rutgers.edu/Events/Titles/Schaffer.html
and you might want to contact DIMACS for further information and
proceedings. Look through your search results and you'll find much more.
As for speed, there is a very fast algorithm for sequence alignment
that (as far as I know) is not yet widely known in the sequence
analysis community. It is described in
I. Rigoutsos and A. Califano, "Searching in Parallel for Similar
Strings," IEEE Computational Science & Engineering, Vol. 1, No. 2,
Summer 1994, pp. 60-75.
This describes Flash and dFlash, algorithms based on fast searching of a
_hashed_ pre-prepared database of sequences for imperfect matches with
definable levels of imperfection. I am not sure whether it has been
extended to multiple alignments. It's further described at
http://bioneer.kaist.ac.kr/info/dflash.srv among other places.
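
To give a feel for the general idea of searching a hashed, pre-prepared
database for imperfect matches, here is a minimal Python sketch. It is
not the Flash or dFlash algorithm from the paper; it is a simple
stand-in that indexes every length-k substring of each database
sequence and lets the substrings of a query vote for a sequence and
alignment offset. The names build_index, search, k and min_votes, and
the toy sequences, are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict


def build_index(database, k=4):
    """Hash every length-k substring to the (sequence id, offset) pairs where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in database.items():
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((seq_id, pos))
    return index


def search(index, query, k=4, min_votes=2):
    """Return (sequence id, diagonal, votes) candidates, best first.

    The diagonal is the database offset minus the query offset, so
    substrings from one consistent alignment vote for the same diagonal.
    A mismatch in the query only removes the votes of the substrings
    that overlap it, which is what tolerates imperfect matches.
    """
    votes = Counter()
    for qpos in range(len(query) - k + 1):
        for seq_id, dpos in index.get(query[qpos:qpos + k], ()):
            votes[(seq_id, dpos - qpos)] += 1
    return [(seq_id, diag, n)
            for (seq_id, diag), n in votes.most_common()
            if n >= min_votes]


# Toy database (hypothetical sequences, built once and reused per query).
database = {
    "seqA": "ACGTACGTTAGCCGATAGGC",
    "seqB": "TTTTGGGGCCCCAAAATTTT",
}
index = build_index(database)

# The query is seqA[8:20] with one base changed (A -> T at query offset 6).
hits = search(index, "TAGCCGTTAGGC")
for seq_id, diag, n in hits:
    print(seq_id, diag, n)
```

Lowering min_votes or k admits more imperfect matches at the cost of
more spurious candidates, so the best candidates would normally be
checked afterwards with a proper alignment.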
--
Miles Ehrlich
mehrlich at computer.org, ph 714-821-8380, fax 714-821-4010
Staff Editor, IEEE Computational Science & Engineering
10662 Los Vaqueros Circle, Los Alamitos, CA 90720 USA
http://www.computer.org/pubs/cs&e/cs&e.htm