NEW (RNA_MAP) RNA/EST mapping program/EST-SET map and OLIGO_MAP

webmaster webmaster at softberry.com
Fri Apr 27 05:02:20 EST 2001


The new program RNA_MAP belongs to a group of programs (including 
ESTS_MAP, OLIGO_MAP and DBSAN) devoted to comparisons with long genomic 
sequences of 20-300 MB. 

It is available at: 

http://www.softberry.com/scan.html 

RNA_MAP is a fast algorithm that accurately maps an mRNA sequence to a 
genomic sequence, taking into account the splice sites flanking intron 
sequences. Mapping a 300 bp mRNA to 52 MB of unmasked Y chromosome takes 
about 19 sec; for a 7300 bp mRNA the time is about 47 sec (both strands 
checked, one 500 MHz DEC Alpha processor).
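
The splice-site condition can be illustrated with a short Python sketch. 
It is not the RNA_MAP algorithm itself, only a check under the assumption 
that introns follow the canonical GT...AG rule; the function names 
revcomp and has_gt_ag are hypothetical. The intron flanks used in the 
comments are taken from the example output in this message.

```python
# Sketch only: assumes canonical GT...AG introns; not RNA_MAP's own code.
_COMP = str.maketrans("acgtn", "tgcan")


def revcomp(seq):
    """Reverse complement of a DNA string (result is lowercase)."""
    return seq.lower().translate(_COMP)[::-1]


def has_gt_ag(left_flank, right_flank, strand="+"):
    """Check whether an intron starts with GT and ends with AG.

    left_flank  -- first bases of the intron as written in the genomic sequence
    right_flank -- last bases of the intron as written in the genomic sequence
    strand      -- "+" for a direct match, "-" for a reverse-strand match
    """
    left, right = left_flank.lower(), right_flank.lower()
    if strand == "-":
        # On the reverse strand the intron reads in the opposite direction,
        # so the flanks swap roles after reverse complementing.
        left, right = revcomp(right), revcomp(left)
    return left.startswith("gt") and right.endswith("ag")


# Flanks from the [D] alignment: gtattacaac(..)ttcaatttag
direct_ok = has_gt_ag("gtattacaac", "ttcaatttag")
# Flanks from the [R] alignment: ctgcagggga(..)ggacccttac
reverse_ok = has_gt_ag("ctgcagggga", "ggacccttac", strand="-")
```

Both introns pass the check once the strand is taken into account, which 
is why the reverse-strand flanks read ct...ac in the genomic sequence.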

EST_SMAP maps a whole set of sequences to a chromosome sequence. For 
example, 11000 full mRNA sequences from the NCBI reference set are 
mapped to 52 MB of unmasked Y chromosome in 18-25 min (depending on 
computer memory size).

OLIGO_MAP is designed to map a set of oligonucleotides used for 
microarray production. The program maps 300000 oligos 25-30 bp long to 
49 MB of unmasked chromosome 22 in 8 min. The program is useful for 
checking the location of oligos and their uniqueness in the genome.
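
What checking location and uniqueness means can be shown with a small 
Python sketch. It assumes exact matching only and uses a naive string 
search, so it says nothing about how OLIGO_MAP works internally or how it 
reaches the speed quoted above; map_oligos and is_unique are hypothetical 
names, and the toy sequence is made up for illustration.

```python
# Sketch only: naive exact-match search on both strands, for illustration.
_COMP = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of a DNA string (result is uppercase)."""
    return seq.upper().translate(_COMP)[::-1]


def map_oligos(genome, oligos):
    """Return {oligo: [(1-based position, strand), ...]} for exact matches."""
    genome = genome.upper()
    hits = {}
    for oligo in oligos:
        forward = oligo.upper()
        queries = [("+", forward)]
        reverse = revcomp(forward)
        if reverse != forward:
            # A palindromic oligo would otherwise be reported twice.
            queries.append(("-", reverse))
        found = []
        for strand, query in queries:
            start = genome.find(query)
            while start != -1:
                found.append((start + 1, strand))
                start = genome.find(query, start + 1)
        hits[oligo] = found
    return hits


def is_unique(hit_list):
    """An oligo is unique if it matches exactly one place in the genome."""
    return len(hit_list) == 1


toy_genome = "GATTACAGGCTTGATTACA"  # made-up sequence
toy_hits = map_oligos(toy_genome, ["GATTACA", "AGGCT", "AAGCC"])
```

In the toy example GATTACA occurs twice and is therefore not unique, 
while AGGCT and AAGCC each map to a single location (the latter on the 
reverse strand).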
  

Example of RNA_MAP output:

 Sequence hsNM_005405 RefSeq      human
[D] Sequence:       0, S:        1040, chrY
        1 ----------(..)----------AAATCATCCACTTTCCCGAGAATCTAGGGATTATGC
        1 ggtagctcag(..)atgcccacagAAATCATCCACTTTCCCGAGAATCTAGGGATTATGC

       37 TCCACTGTCTAGAGACTATGCATACCATGATTATGGTCCTTCTAGTTGGGATCAACATTT
 26343643 TCCACTGTCTAGAGACTATGCATACCATGATTATGGTCATTCTAGTTGGGATGAACATTT

       97 CTCTAGAGGATATAG----------(..)----------TGATTGTGATGGCTGTGGTGA
 26343703 CTCTAGAGGATATAGgtattacaac(..)ttcaatttagTGATTGTGATGGCTGTGGTGA

      133 GGTGATGTTAGAGATCATTCTGAACGTCCAAGTGGAAGTTCTTATAGAGATGCATTTCAG
 26343821 GGTGATGTTAGAGATCATTCTGAACGTCCAAGTGGAAGTTCTTATAGAGATGCATTTCAG

      193 AGATAGG----------(..)----------GAACCTCTCATGGTGCACCATCTGCAGGA
 26343881 AGATAGGgtaagggtcc(..)tcccctgcagGGACCTCTCATGGTGCACCATCTGCAGGA

      229 GTGCCTCTGTTGTCTTATGGNGGAAGCAGCCACCATGATTATAGCAATAAATGAGATAGA
 26344341 GTGCCTCTGTTGTCTTATGGTGGAAGCAGCCACCATGATTATAGCAATAAATGAGATAGA

      289 TATGGCAT----------(..)----------
 26344401 TATGGCATaagtcgggag(..)nnnnnnnnnn
[R] Sequence:       0, S:        1040, chrY
        1 ----------(..)----------ATGCCATATCTATCTCATTTATTGCTATAATCATGG
        1 ggtagctcag(..)ctcccgacttATGCCATATCTATCTCATTTATTGCTATAATCATGG

       37 TGGCTGCTTCCNCCATAAGACAACAGAGGCACTCCTGCAGATGGTGCACCATGAGAGGTT
 21018059 TGGCTGCTTCCACCATAAGACAACAGAGGCACTCCTGCAGATGGTGCACCATGAGAGGTC

       97 C----------(..)----------CCTATCTCTGAAATGCATCTCTATAAGAACTTCCA
 21018119 Cctgcagggga(..)ggacccttacCCTATCTCTGAAATGCATCTCTATAAGAACTTCCA

      133 CTTGGACGTTCAGAATGATCTCTAACATCACCTCACCACAGCCATCACAATCA-------
 21018576 CTTGGACGTTCAGAATGATCTCTAACATCACCTCACCACAGCCATCACAATCActaaatt

      186 ---(..)----------CTATATCCTCTAGAGAAATGTTGATCCCAACTAGAAGGACCAT
 21018636 gaa(..)gttgtaatacCTATATCCTCTAGAGAAATGTTCATCCCAACTAGAATGACCAT

      229 AATCATGGTATGCATAGTCTCTAGACAGTGGAGCATAATCCCTAGATTCTCGGGAAAGTG
 21018754 AATCATGGTATGCATAGTCTCTAGACAGTGGAGCATAATCCCTAGATTCTCGGGAAAGTG

      289 GATGATTT----------(..)----------
 21018814 GATGATTTctgtgggcat(..)nnnnnnnnnn
		

