Retrieving sequences from large local databases

Malay mbasu at mail.nih.gov
Fri Apr 1 09:57:33 EST 2005


Kevin Karplus wrote:
> In article <491dde49.0503080219.17666dac at posting.google.com>, Rolf wrote:
> 
>>I am running local blast on different databases. This works perfectly.
>>The problem comes when I want to retrieve sequences from the large
>>(and practically not possible to open) databases.
>>Can this be done through the BioEdit program?
>>Other programs?
> 
> 
> If you are using NCBI blast, then the "fastacmd" program from NCBI can
> retrieve your hits for you.  
> 
> fastacmd -d nr -i idfile > foo.fa
> 
> To get the ids, you might want to use the "-m 9" format for the output
> of the blast query and -I T to turn on the full names. With this
> format, you still have to extract the ids, but it is a simpler script
> than with the usual nearly unparseable blast output.
> 
> ------------------------------------------------------------
> Kevin Karplus 	karplus at soe.ucsc.edu	http://www.soe.ucsc.edu/~karplus
> Professor of Biomolecular Engineering, University of California, Santa Cruz
> Undergraduate and Graduate Director, Bioinformatics
> (Senior member, IEEE)	(Board of Directors, ISCB)
> life member (LAB, Adventure Cycling, American Youth Hostels)
> Effective Cycling Instructor #218-ck (lapsed)
> Affiliations for identification only.

Actually, any string can be used to retrieve a sequence from the BLAST 
database, so long as the string is present in the original FASTA file used 
to create the BLAST database with formatdb.

fastacmd -d "database" -s "string"
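
For the id-extraction step Kevin mentions, here is a minimal sketch in
Python. It assumes the "-m 9" output is tab-separated, with comment lines
starting with "#" and the subject id in the second column. The function
names and the file names "blast.out" and "idfile" are only illustrative.
Depending on how the database was formatted, you may still need to trim
the ids down to a bare GI or accession before fastacmd accepts them.

```python
def subject_ids(blast_tabular_lines):
    """Return the unique subject ids (column 2), in first-seen order."""
    seen = set()
    ids = []
    for line in blast_tabular_lines:
        line = line.rstrip("\n")
        # Skip blank lines and the "#" comment lines that -m 9 adds.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        sid = fields[1]
        if sid not in seen:
            seen.add(sid)
            ids.append(sid)
    return ids


def write_idfile(blast_path, id_path):
    """Write one subject id per line, for use with fastacmd -i."""
    with open(blast_path) as src, open(id_path, "w") as dst:
        for sid in subject_ids(src):
            dst.write(sid + "\n")
```

Calling write_idfile("blast.out", "idfile") and then running
"fastacmd -d nr -i idfile > foo.fa" should pull each hit once, even when
a subject appears in several alignments.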

-Malay



