[SE] New WWW, Email service: POLYAH - Recognition of 3'-end cleavage and polyadenylation

"Victor V. Solovyev" solovyev at cmb.bcm.tmc.edu
Wed Aug 30 20:48:40 EST 1995

New BCM Gene-Finder service:
	POLYAH - Recognition of the 3'-end cleavage and polyadenylation region
of human mRNA precursors
   Department of Cell Biology, Baylor College of Medicine

Analysis of uncharacterized human sequences is available through WWW: 


or by sending a file containing your sequence (in the format described
below) to the Email service at the University of Houston and, soon, at the
Weizmann Institute of Science:

service at bchs.uh.edu   or services at bioinformatics.weizmann.ac.il
     with the subject line "polyah". 

Examples: mail -s polyah service at bchs.uh.edu < test.seq

mail -s polyah services at bioinformatics.weizmann.ac.il < test.seq

where test.seq is a file containing the sequence.

   The algorithm predicts the potential positions of poly-A regions using
   linear discriminant functions (LDF) that combine characteristics
   describing various contextual features of these sites. The default LDF
   threshold on the server is 0.
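   As a rough illustration of thresholding a linear discriminant score, the
   sketch below scans a sequence for AATAAA candidates and applies a generic
   weighted sum to a feature vector. It is not the POLYAH implementation:
   the function names, the feature vectors and the weights are hypothetical
   placeholders, and whether the server uses a strict or non-strict
   comparison against the threshold is an assumption.

```python
from typing import List, Sequence

SIGNAL = "AATAAA"  # consensus named in the announcement


def candidate_positions(seq: str) -> List[int]:
    """Return the 1-based start of every AATAAA occurrence (overlaps included)."""
    seq = seq.upper()
    positions = []
    start = seq.find(SIGNAL)
    while start != -1:
        positions.append(start + 1)          # convert 0-based index to 1-based
        start = seq.find(SIGNAL, start + 1)  # step by one to catch overlaps
    return positions


def ldf_score(features: Sequence[float], weights: Sequence[float]) -> float:
    """Generic linear discriminant: weighted sum of feature values.

    Both arguments are hypothetical; the real features and weights
    used by POLYAH are not given in the announcement.
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * x for w, x in zip(weights, features))


def is_predicted(score: float, threshold: float = 0.0) -> bool:
    """Accept a candidate when its score exceeds the threshold (assumed strict)."""
    return score > threshold
```

   Raising the threshold trades missed sites for fewer false predictions.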

   The accuracy was estimated on a set of 131 poly-A regions and 1466
   non-poly-A regions of human genes, all containing the AATAAA sequence.
   When 86% of the poly-A regions are predicted correctly, 8% of the
   non-poly-A regions are predicted falsely (Sp=50%; C=0.62). For example,
   with a threshold of 0.7 the algorithm predicts 8 of the 9 poly-A sites of
   the AD2 genome (35937 bp) and overpredicts 4 false sites. By comparison,
   the poly-A site prediction method of CABIOS 1994, 10, 597-603 gives 968
   false positive sites for 8 correctly predicted sites.
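   The quoted figures can be cross-checked with a little arithmetic. The
   sketch below assumes that Sp means the fraction of predicted sites that
   are true, TP/(TP+FP), and that C is the Matthews correlation coefficient;
   neither definition is spelled out in the announcement, and the function
   names are illustrative only.

```python
import math


def confusion_counts(n_pos, n_neg, sensitivity, false_rate):
    """Expected TP, FP, TN, FN from set sizes and the two quoted rates."""
    tp = sensitivity * n_pos
    fn = n_pos - tp
    fp = false_rate * n_neg
    tn = n_neg - fp
    return tp, fp, tn, fn


def specificity(tp, fp):
    """Sp taken here as TP / (TP + FP) (an assumed definition)."""
    return tp / (tp + fp)


def correlation(tp, fp, tn, fn):
    """Matthews correlation coefficient (assumed to be the quoted C)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# 131 poly-A regions, 1466 non-poly-A regions, 86% found, 8% false
tp, fp, tn, fn = confusion_counts(131, 1466, 0.86, 0.08)
print(f"Sp = {specificity(tp, fp):.2f}, C = {correlation(tp, fp, tn, fn):.2f}")
```

   Under these assumptions both values should land close to the rounded
   Sp=50% and C=0.62 given above.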


   For email submission the sequences must have the following format:  

Name of your  sequence

   (Each line must be shorter than 80 characters.)
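   A small helper can produce a file that respects the line-length limit.
   This is only a sketch: it assumes the sequence lines follow directly
   after the name line, which the format description above does not state
   explicitly, and the function name and the default width of 60 are
   arbitrary choices.

```python
def format_submission(name: str, sequence: str, width: int = 60) -> str:
    """Build submission text: a name line, then sequence lines under 80 chars.

    The name-then-sequence layout is an assumption; only the name line and
    the line-length limit appear in the announcement.
    """
    if not 0 < width < 80:
        raise ValueError("width must be between 1 and 79")
    if len(name) >= 80:
        raise ValueError("name line must be shorter than 80 characters")
    cleaned = "".join(sequence.split()).upper()  # drop whitespace and newlines
    chunks = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return "\n".join([name] + chunks) + "\n"


if __name__ == "__main__":
    # Write test.seq, the file name used in the mail examples
    with open("test.seq", "w") as handle:
        handle.write(format_submission("my_sequence", "acgt" * 50))
```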

   Send the file containing the sequence to:
   service at theory.bchs.uh.edu
   The subject line must be: polyah

   Example: mail -s polyah service at bchs.uh.edu < test.seq

POLYAH output:		

   1st line - name of your sequence; 2nd line - length of your sequence.
   Next lines - positions of the predicted sites and their 'weights'.
   The position is that of the first nucleotide of the AATAAA consensus in
   the predicted region.


 HSG11C4A     1741 bp    DNA             PRI       21-FEB
 Length of sequence-      1741
     1 potential polyA site was predicted
 Pos.:    988 LDF-  4.06
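   Replies in this layout are easy to post-process. The sketch below pulls
   the position and LDF value out of each "Pos.:" line and keeps the sites
   above a chosen threshold. The regular expression is inferred from the
   single example above and may need adjusting for other replies; the
   function names are illustrative.

```python
import re
from typing import List, Tuple

# Pattern inferred from the one sample line: "Pos.:    988 LDF-  4.06"
SITE_RE = re.compile(r"Pos\.:\s*(\d+)\s+LDF-\s*(-?\d+(?:\.\d+)?)")

SAMPLE = """\
 HSG11C4A     1741 bp    DNA             PRI       21-FEB
 Length of sequence-      1741
     1 potential polyA site was predicted
 Pos.:    988 LDF-  4.06
"""


def parse_sites(report: str) -> List[Tuple[int, float]]:
    """Return (position, LDF value) pairs found in a POLYAH report."""
    return [(int(pos), float(ldf)) for pos, ldf in SITE_RE.findall(report)]


def sites_above(sites: List[Tuple[int, float]], threshold: float):
    """Keep only the sites whose LDF value exceeds the threshold."""
    return [(pos, ldf) for pos, ldf in sites if ldf > threshold]


if __name__ == "__main__":
    for pos, ldf in sites_above(parse_sites(SAMPLE), 0.7):
        print(pos, ldf)
```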

Salamov A.A., Lawrence C.B., Solovyev V.V. Recognition of 
	3'-end cleavage and polyadenilation region of human mRNA precursors.
	(1995) (in preparation). 

Questions: solovyev at cmb.bcm.tmc.edu

The other services are:
FGENEH - prediction of gene structure, with exons assembled by dynamic programming
FEXH   - search for 5'-, internal and 3'-exons
HEXON  - search for internal exons
HSPL   - search for splice sites
RNASPL - prediction of exon-exon junctions in cDNA sequences
CDSB   - prediction of bacterial coding regions
HBR    - recognition of human and bacterial sequences, to test a library
         for E. coli contamination by sequencing example clones

SSP    - prediction of a-helix and b-strand in globular proteins
         by a segment-oriented approach.
NSSP   - prediction of a-helix and b-strand segments in globular proteins
         by a nearest-neighbor algorithm.
