Urbigene Software Distribution

Pierre plindenbaum at yahoo.fr
Mon Jun 2 04:00:20 EST 2003


Hello all,

The Urbigene package contains modest C++ tools for molecular biology
that I wrote at the INTEGRAGEN company. As a subset of those tools
does not present any commercial interest, I have been allowed to
release it to the scientific community as an open source package under
the GNU General Public License (GPL). You'll find sources for parsing
BLAST results in XML format, the new versions of the CloneIt program,
filters for FASTA sequences and for PRIMER3 output, etc.

(There are also programs that are not dedicated to biology but may be
of general interest. For example, PIVOT creates cross tables from
delimited files, GeneticProg tries to find an equation that fits
experimental values, etc.)
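
To give an idea of what "cross table from a delimited file" means, here
is a minimal Python sketch. It is not the PIVOT source and does not
reflect its options; the function name cross_table, the column names and
the sample data are all made up for the illustration.

```python
import csv
import io
from collections import Counter


def cross_table(text, row_field, col_field, delimiter="\t"):
    """Count how often each (row value, column value) pair occurs.

    Hypothetical helper for illustration only; not part of Urbigene.
    Returns a list of rows, the first one being the header.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    counts = Counter((rec[row_field], rec[col_field]) for rec in reader)
    rows = sorted({r for r, _ in counts})
    cols = sorted({c for _, c in counts})
    table = [[""] + cols]
    for r in rows:
        # Counter returns 0 for pairs that never occurred
        table.append([r] + [counts[(r, c)] for c in cols])
    return table


# Made-up sample: one line per observation, tab-delimited with a header
sample = "enzyme\tchrom\nNotI\t22\nNotI\t22\nEcoRI\t22\nNotI\t21\n"

for line in cross_table(sample, "enzyme", "chrom"):
    print("\t".join(str(cell) for cell in line))
```

Each distinct value of the first column becomes a row, each distinct
value of the second becomes a column, and the cells hold the counts.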

The package is available at:

                  http://www.urbigene.com


Usage Example
Consider the following script:


# This script takes chromosome 22 from the goldenpath as input,
# digests the whole chromosome with NotI,
# crops 6 bases from each end of every fragment,
# keeps fragments between 100 bases and 100 kb,
# keeps fragments containing a CA repeat,
# keeps fragments whose %GC is between 40 and 60%,
# keeps only the first 10 sequences,
# converts the sequences to primer3 input,
# runs primer3,
# converts the amplified fragments to FASTA,
# BLASTs those fragments against the whole goldenpath,
# retains BLAST HSPs whose score is lower than 10 or greater than 50
#   (this blastlisp step is commented out in the pipeline below),
# converts the output to text,
# transforms this text to XML,
# keeps the first 50 lines.
#

BIN=./bin
${BIN}/fastaretrieve -chr 22 \
    -entry /env/ig/pubdb/mirror/golden_path/14nov2002/chromosomes/entry_points.csv |
${BIN}/fastadigest -e NotI |
${BIN}/fastacrop -5 6 -3 6 |
${BIN}/fastasize -m 100 -M 100000 |
${BIN}/fastaslice -e 5000 -n 10000 |
${BIN}/fastafind -s CACACACACACA -print T |
${BIN}/fastagc -min 40 -max 60 -sort T |
${BIN}/fastahead |
${BIN}/fasta2primer3 -max-stgy 1 -gc-min 20 -gc-max 80 -max-size 2000 |
primer3 |
${BIN}/primer3tofasta |
blastall -e 10 -p blastn \
    -d /env/ig/pubdb/blastdb/GP10apr2003/gp10apr2003 -m 7 |
#${BIN}/blastlisp -e 'or(lt(hsp.score(),10),gt(hsp.score(),50))' |
${BIN}/blast2txt |
${BIN}/text2xml |
head -n 50 > demo.txt
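
For readers who want to see the logic of the first filtering steps
spelled out, here is a rough Python sketch of digest, crop, size filter,
repeat search and %GC filter. It is only an illustration and not the
Urbigene C++ code: the function names are invented, the digestion simply
cuts at the start of each NotI recognition sequence (GCGGCCGC), the
default of 10 kept sequences is taken from the script comments, and the
fastaslice step is left out because its exact behaviour is not described
here.

```python
NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence


def digest(seq, site=NOTI_SITE):
    """Split seq in front of every occurrence of site (simplified cut position)."""
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos])
        start = pos
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def crop(seq, five=6, three=6):
    """Remove `five` bases at the 5' end and `three` bases at the 3' end."""
    return seq[five:len(seq) - three]


def gc_percent(seq):
    """Percentage of G and C in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def pipeline(chromosome, min_len=100, max_len=100000,
             repeat="CACACACACACA", gc_min=40.0, gc_max=60.0, keep=10):
    """Chain the filters in the same order as the shell pipeline."""
    kept = []
    for fragment in digest(chromosome):
        fragment = crop(fragment)
        if not min_len <= len(fragment) <= max_len:
            continue
        if repeat not in fragment:
            continue
        if not gc_min <= gc_percent(fragment) <= gc_max:
            continue
        kept.append(fragment)
    return kept[:keep]


# Tiny made-up sequence; min_len is lowered so the toy fragment passes
toy = "TTTT" + NOTI_SITE + "CACACACACACA" + "ATATATAT" + NOTI_SITE + "TTTT"
print(pipeline(toy, min_len=10))
```

In the real pipeline each of these steps is a separate program reading
FASTA on stdin and writing FASTA on stdout, which is what lets them be
chained with pipes.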


The script writes the following to demo.txt:



//iteration
######################################################################	22:0-47748584(+)|restriction_fragment[NotI(35516844)-NotI(35580318):63482]|crop_5(6)crop_3(6)|size_filter(100-100000)|slice(50000-59999)|gc(41.53%)|pcr_0(34-884:851 pb) primer_left(TTCCAAAGTGCTGGGATTATAG) primer_right(TCTGGGATTTTCCAGAGGTATAG)	len:851
 ####################################################################>	build33|chr22|slice(37101000-37250999)	len:150000	Object:94394-95244	Query:1-851	830	0
                                                         .....>      
	build33|chr22|slice(37101000-37250999)	len:150000	Object:47208-47286	Query:682-761	32	1.48067e-07
 ..>                                                                 
	build33|chr22|slice(37101000-37250999)	len:150000	Object:138640-138677	Query:3-40	30	2.31183e-06
 <..                                                                 
	build33|chr22|slice(37101000-37250999)	len:150000	Object:88694-88734	Query:43-3	29	9.13492e-06
 <..                                                                 
	build33|chr22|slice(37101000-37250999)	len:150000	Object:113943-113983	Query:43-3	29	9.13492e-06
 <.                                                                  
	build33|chr22|slice(37101000-37250999)	len:150000	Object:16320-16349	Query:32-3	26	0.000563575
 <..                                                                 
	build33|chr22|slice(37101000-37250999)	len:150000	Object:76801-76838	Query:40-3	26	0.000563575
 <..                                                                 
	build33|chr22|slice(37101000-37250999)	len:150000	Object:104040-104080	Query:43-3	25	0.0022269
 <.                                                                  
	build33|chr22|slice(37101000-37250999)	len:150000	Object:132737-132766	Query:32-3	22	0.137388
 <.                                                                  
	build33|chr22|slice(37101000-37250999)	len:150000	Object:71167-71196	Query:32-3	22	0.137388
 <.                                                                  
	build33|chr22|slice(37101000-37250999)	len:150000	Object:142101-142130	Query:32-3	22	0.137388
 <.                                                                  
	build33|chr22|slice(37101000-37250999)	len:150000	Object:127879-127904	Query:28-3	22	0.137388
 .>                                                                  
	build33|chr22|slice(37101000-37250999)	len:150000	Object:82934-82963	Query:3-32	22	0.137388
 <.                                                                  
	build33|chr22|slice(37101000-37250999)	len:150000	Object:53573-53602	Query:32-3	22	0.137388
 <.                                                                  
	build33|chr22|slice(37101000-37250999)	len:150000	Object:62190-62219	Query:32-3	22	0.137388
 .>                                                                  
	build33|chr22|slice(37101000-37250999)	len:150000	Object:33547-33576	Query:3-32	22	0.137388
 ..>                                                                 
	build33|chr22|slice(37101000-37250999)	len:150000	Object:94134-94171	Query:3-40	22	0.137388
 <..                                                                 
	build33|chr22|slice(37101000-37250999)	len:150000	Object:95472-95509	Query:40-3	22	0.137388
 .>                                                                  
	build33|chr22|slice(37101000-37250999)	len:150000	Object:121477-121506	Query:3-32	22	0.137388
(...
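
If you want to post-process these hit lines yourself, the following
Python sketch shows one way to read the tab-separated columns that
follow the ASCII drawing and to apply the same test as the (disabled)
blastlisp expression. The meaning of the columns is inferred from the
sample above and is an assumption: subject name, subject length, subject
range ("Object"), query range, score, e-value. The names Hit, parse_hit
and keep_hsp are invented for the example.

```python
from typing import NamedTuple, Tuple


class Hit(NamedTuple):
    subject: str
    subject_len: int
    subject_range: Tuple[int, int]
    query_range: Tuple[int, int]
    score: float
    evalue: float


def parse_range(field):
    """Turn 'Object:94394-95244' into (94394, 95244)."""
    start, end = field.split(":", 1)[1].split("-")
    return int(start), int(end)


def parse_hit(fields):
    """Build a Hit from the six columns that follow the drawing (assumed layout)."""
    subject, length, obj, query, score, evalue = fields
    return Hit(
        subject=subject,
        subject_len=int(length.split(":", 1)[1]),
        subject_range=parse_range(obj),
        query_range=parse_range(query),
        score=float(score),
        evalue=float(evalue),
    )


def keep_hsp(hit, low=10, high=50):
    """Same condition as or(lt(hsp.score(),10),gt(hsp.score(),50))."""
    return hit.score < low or hit.score > high


# One hit copied from the output above, with the drawing column removed
line = ("build33|chr22|slice(37101000-37250999)\tlen:150000\t"
        "Object:47208-47286\tQuery:682-761\t32\t1.48067e-07")
hit = parse_hit(line.split("\t"))
print(hit.subject_range, hit.score, keep_hsp(hit))
```

A query range whose start is greater than its end (for example
Query:43-3) is kept as is; the sketch does not try to interpret strand.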


Enjoy
Pierre Lindenbaum PhD




