ClustalW: number of sequences

Julie THOMPSON-MAALOUM julie at titus.u-strasbg.fr
Mon Feb 28 04:34:51 EST 2000


Jerry Learn wrote:

> Hello,
>
> Does anyone out there know if there is a maximum number of sequences that
> ClustalW can align? One of our users is trying to align several thousand
> 100 nucleotide sequences. It appears to be using about 750 meg of RAM.
>
> Jerry Learn
>
> Research Associate
>
> Health Sci. Ctr., Rm. K443-C      |
> Dept. of Microbiology             | Learn at u.washington.edu
> University of Washington          | Phone: (206) 616-4286
> Box 357740                        |   FAX: (206) 616-1575
> Seattle, WA  98195-7740  USA      |

Hello,

If you're using version 1.7(x) or 1.8 of ClustalW, then the maximum number of
sequences that can be aligned should depend only on the memory available in
your computer. Older versions used static memory arrays, and the limits
were hard-coded into the program.

The main problem with using ClustalW to align thousands of sequences is
the time taken to build the guide tree. As the Neighbour-Joining algorithm
takes O(N^3) time, this will be the longest step. In that case, as Kevin
Karplus points out, an alternative program may well be better.
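
To give a feel for what cubic growth means in practice, here is a rough
Python sketch. It is not taken from ClustalW; the function names and the
baseline of 100 sequences are made up for illustration, and it only shows
how the cost of a step that scales with the cube of N grows relative to a
small run, along with the number of sequence pairs a guide tree is built
from.

```python
def nj_relative_cost(n, baseline=100):
    """Cost of a cubic-time step for n sequences, relative to `baseline` sequences.

    Assumption: cost grows exactly as N cubed; constant factors are ignored.
    """
    return (n / baseline) ** 3


def pairwise_distances(n):
    """Number of distinct sequence pairs among n sequences."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    # Hypothetical data set sizes, chosen only to show the trend.
    for n in (100, 1000, 5000, 10000):
        print(f"N={n:>6}  pairs={pairwise_distances(n):>10}  "
              f"relative tree cost={nj_relative_cost(n):>12.0f}x")
```

So if a run with 100 sequences spends one second on the tree, the same
step for several thousand sequences would be expected to take on the
order of a day, under the assumptions above.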

(Although what his alignment of 10,000 proteins looks like, I cannot
imagine! The whole of swissprot only contains about 80,000 sequences!)

Julie Thompson





