[Computational-biology] Re: Point Specific Mutation Matrix vs. profile HMM ?

Kevin Karplus karplus at cheep.cse.ucsc.edu
Sun Feb 26 16:54:26 EST 2006


On 2006-02-26, harald <please_noSpam at gmx.de> wrote:
> thanks a lot for your quick and detailed answers.
> The papers comparing PSI-Blast and HMM profiles and the one about the 
> statistical theory were pretty interesting.
>
> But since the database that I want to search for homologs is very big
> (~3 million sequences), I think that a tool like hmmsearch would be too slow.

If you are doing 3 million vs. 3 million, then HMM-based methods are
probably too slow for you.  If you are doing a few hundred vs. 3
million, then HMM-based methods are OK.  It takes a while to do the
iterative search and alignment needed to build a decent HMM, but
scoring sequences with it is not too terrible.  I routinely score all
of PDB (about 22,000 sequences), and it usually takes a couple of
minutes for a 140-long HMM.  Since running time for a given model is
proportional to the total number of residues scored, scoring 3 million
sequences of similar average length would take about 5 hours (less on
a more modern computer).  This is feasible for hundreds of models, but
not for millions of models.
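
The scaling estimate above can be checked with a few lines of Python.
This is only a rough sketch: it assumes the 3-million-sequence database
has about the same average sequence length as PDB, and it takes "a
couple of minutes" to mean 2 minutes.  The function and parameter names
are made up for illustration.

```python
def estimate_scoring_hours(ref_sequences, ref_minutes, target_sequences, n_models=1):
    """Scale a measured scoring time linearly with database size.

    Assumes (hypothetically) that both databases have a similar average
    sequence length, so that sequence count is a stand-in for the total
    number of residues scored, and that all models cost about the same.
    """
    minutes_per_sequence = ref_minutes / ref_sequences
    total_minutes = minutes_per_sequence * target_sequences * n_models
    return total_minutes / 60.0


# Figures from the message: ~22,000 PDB sequences in ~2 minutes for one HMM.
one_model = estimate_scoring_hours(22_000, 2.0, 3_000_000)
print(f"one model vs. 3 million sequences: {one_model:.1f} hours")
```

With these inputs the estimate comes out to roughly 4.5 hours per model,
which matches the "about 5 hours" figure.  Passing n_models multiplies
the total linearly.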

------------------------------------------------------------
Kevin Karplus 	karplus at soe.ucsc.edu	http://www.soe.ucsc.edu/~karplus
Professor of Biomolecular Engineering, University of California, Santa Cruz
Undergraduate and Graduate Director, Bioinformatics
(Senior member, IEEE)	(Board of Directors & Chair of Education Committee, ISCB)
life member (LAB, Adventure Cycling, American Youth Hostels)
Effective Cycling Instructor #218-ck (lapsed)
Affiliations for identification only.


