[SE] New WWW, Email service: POLYAH - Recognition of 3'-end cleavage and polyadenylation

"Victor V. Solovyev" solovyev at cmb.bcm.tmc.edu
Wed Aug 30 20:48:40 EST 1995


New BCM Gene-Finder service:
===========================================================================
	POLYAH - Recognition of 3'-end cleavage and polyadenylation region
of human mRNA precursors
===========================================================================
	    
   Department of Cell Biology, Baylor College of Medicine

Uncharacterized human sequences can be analyzed through the WWW:

     http://dot.imgen.bcm.tmc.edu:9331/gene-finder/gf.html

or by emailing a file containing a sequence (the sequence format is
described below) to the University of Houston email service or, soon,
to the Weizmann Institute of Science email service:

service at bchs.uh.edu   or services at bioinformatics.weizmann.ac.il
 
     with the subject line "polyah". 

Examples: mail -s polyah service at bchs.uh.edu < test.seq

mail -s polyah services at bioinformatics.weizmann.ac.il < test.seq

where test.seq is a file containing the sequence.
 
 METHOD DESCRIPTION:

   The algorithm predicts the positions of potential poly-A regions using
   linear discriminant functions (LDF) that combine characteristics
   describing various contextual features of these sites. The default LDF
   threshold on the server is 0.
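
   As a rough illustration of the thresholding step, the following Python
   sketch scores a candidate site with a linear discriminant function and
   compares the score with a threshold. The feature names, weights and
   constant term are invented for this example and are not the ones used by
   POLYAH; whether the comparison is strict is also an assumption. Only the
   default threshold of 0 comes from this announcement.

```python
# Hypothetical LDF scoring sketch; the features and weights are NOT POLYAH's.
from typing import Dict

# Assumed illustrative weights for made-up contextual features.
WEIGHTS: Dict[str, float] = {
    "hexamer_score": 1.5,
    "downstream_gu_content": 2.0,
    "upstream_triplet_score": 0.8,
}
OFFSET = -2.0            # assumed constant term
DEFAULT_THRESHOLD = 0.0  # default LDF threshold stated for the server


def ldf_score(features: Dict[str, float]) -> float:
    """Weighted sum of the feature values plus a constant term."""
    return OFFSET + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)


def is_predicted_site(features: Dict[str, float],
                      threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Report a candidate when its LDF score exceeds the threshold."""
    return ldf_score(features) > threshold


candidate = {
    "hexamer_score": 1.0,
    "downstream_gu_content": 0.5,
    "upstream_triplet_score": 0.25,
}
print(ldf_score(candidate), is_predicted_site(candidate))
```

   Raising the threshold trades missed sites for fewer false predictions.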
   
Accuracy:

   The accuracy has been estimated on a set of 131 poly-A regions and
   1466 non-poly-A regions of human genes, all containing the AATAAA sequence.

   When 86% of poly-A regions are predicted correctly, the algorithm makes 8%
   false predictions (Sp=50%; C=0.62). For example, with a threshold of 0.7 it
   predicts 8 of the 9 poly-A sites of the AD2 genome (35937 bp) and
   overpredicts 4 false sites. (Compare with the poly-A site prediction
   method of CABIOS 1994,10,597-603, which gives 968 false positive sites
   for 8 correctly predicted sites.)
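
   The quoted figures can be cross-checked with a short calculation. The
   sketch below assumes that 86% is the fraction of the 131 true regions that
   are found, that 8% is the fraction of the 1466 non-poly-A regions wrongly
   predicted, that Sp means TP/(TP+FP), and that C is the Matthews correlation
   coefficient. None of these definitions is spelled out in this announcement.

```python
import math

N_TRUE, N_FALSE = 131, 1466  # set sizes quoted in the announcement

# Confusion-matrix counts under the assumptions stated above.
tp = round(0.86 * N_TRUE)    # true poly-A regions that are found
fn = N_TRUE - tp             # true poly-A regions that are missed
fp = round(0.08 * N_FALSE)   # non-poly-A regions predicted as sites
tn = N_FALSE - fp            # non-poly-A regions correctly rejected

# Assumed definitions: Sp = TP / (TP + FP); C = Matthews correlation.
sp = tp / (tp + fp)
c = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
print(f"Sp={sp:.2f} C={c:.2f}")
```

   Under these assumptions Sp comes out at about 0.49 and C at about 0.61,
   close to the quoted Sp=50% and C=0.62; the small differences are consistent
   with rounding of the 86% and 8% figures.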

SUBMITTING SEQUENCES VIA EMAIL:

   For email submission the sequences must have the following format:  

Name of your  sequence
ccatctctgtcttgcaggacaatgccgtcttctgtctcgtggggcatcctcctgctggca
ggcctgtgctgcctggtccctgtctccctggctgaggatccccagggagatgctgcccag
aagacagatacatcccaccatgatcaggatcacccaaccttcaacaagatcacccccaac
ctggctgagttcgccttcagcctataccgccagctggcacaccagtccaacagcaccaat
atcttcttctccccagtgagcatcg...............

   (Each line must be shorter than 80 characters.)
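
   A minimal Python sketch for producing text in this layout: a name line
   followed by the sequence split into short lines. The function name and the
   60-letter line width (taken from the example above) are choices made for
   this illustration.

```python
def format_submission(name: str, sequence: str, width: int = 60) -> str:
    """Return the name line followed by the sequence in lines of `width` letters."""
    if not 0 < width < 80:
        raise ValueError("sequence lines must be shorter than 80 letters")
    if len(name) >= 80:
        raise ValueError("the name line must be shorter than 80 letters")
    # Remove existing whitespace so the sequence can be re-split cleanly.
    cleaned = "".join(sequence.split())
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return "\n".join([name] + lines) + "\n"


seq = (
    "ccatctctgtcttgcaggacaatgccgtcttctgtctcgtggggcatcctcctgctggca"
    "ggcctgtgctgcctggtccctgtctccctggctgaggatccccagggagatgctgcccag"
)
print(format_submission("Name of your sequence", seq), end="")
```

   The returned text can be saved to a file such as test.seq and mailed as
   described in this message.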



   Send the file containing the sequence to:
   service at theory.bchs.uh.edu
   The subject line must be: polyah

   Example: mail -s polyah service at bchs.uh.edu < test.seq

POLYAH output:		

   1st line - name of your sequence; 2nd line - length of your sequence.
   Next lines - positions of the predicted sites and their 'weights' (LDF
   scores). The position is that of the first nucleotide of the AATAAA
   consensus in the predicted region.

   FOR EXAMPLE:	

 HSG11C4A     1741 bp    DNA             PRI       21-FEB
 Length of sequence-      1741
     1 potential polyA site was predicted
 Pos.:    988 LDF-  4.06
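
   For downstream processing, a report like the one above can be parsed with
   a few lines of Python. This sketch assumes that every prediction line has
   the form "Pos.: <integer> LDF- <number>" as in the example; the function
   name and the optional score filter are illustrative.

```python
import re
from typing import List, Tuple

# Matches lines such as " Pos.:    988 LDF-  4.06" (layout assumed from the example).
SITE_RE = re.compile(r"Pos\.:\s*(\d+)\s+LDF-\s*(-?\d+(?:\.\d+)?)")


def parse_polyah(report: str, min_ldf: float = 0.0) -> List[Tuple[int, float]]:
    """Return (position, LDF score) pairs whose score is at least `min_ldf`."""
    sites = []
    for line in report.splitlines():
        match = SITE_RE.search(line)
        if match:
            pos, score = int(match.group(1)), float(match.group(2))
            if score >= min_ldf:
                sites.append((pos, score))
    return sites


example = """\
 HSG11C4A     1741 bp    DNA             PRI       21-FEB
 Length of sequence-      1741
     1 potential polyA site was predicted
 Pos.:    988 LDF-  4.06
"""
print(parse_polyah(example))  # [(988, 4.06)]
```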
 
REFERENCE:

Salamov A.A., Lawrence C.B., Solovyev V.V. Recognition of 
	3'-end cleavage and polyadenilation region of human mRNA precursors.
	(1995) (in preparation). 


Questions: solovyev at cmb.bcm.tmc.edu

===============================================================
Other available services:
===============================================================
FGENEH - search for gene structure with exon assembly by dynamic programming
FEXH   - search for 5'-, internal and 3'-exons
HEXON  - search for internal exons
HSPL   - search for splice sites
RNASPL - prediction of exon-exon junctions in cDNA sequences
CDSB   - prediction of bacterial coding regions
HBR    - recognition of human and bacterial sequences to test a library
         for E. coli contamination by sequencing example clones

SSP    - prediction of alpha-helix and beta-strand segments in globular proteins
         by a segment-oriented approach.
NSSP   - prediction of alpha-helix and beta-strand segments in globular proteins
         by a nearest-neighbor algorithm.


