New WWW, Email service: POLYAH - Recognition of 3'-end cleavage and polyadenylation region

Victor V. Solovyev solovyev at
Wed Aug 30 20:51:15 EST 1995

New BCM Gene-Finder service:
	POLYAH - Recognition of 3'-end cleavage and polyadenylation region 
of human mRNA precursors
   Department of Cell Biology, Baylor College of Medicine

	Analysis of uncharacterized human sequences is available 
through WWW:

 or by sending your file containing a sequence 
(the sequence format is described below) to University of Houston 

and soon to Weizmann Institute of Science Email services:

service at   or services at
     with the subject line "polyah". 

Examples: mail -s polyah service at < test.seq

mail -s polyah services at < test.seq

where test.seq is a file containing the sequence.

   The algorithm predicts potential positions of poly-A regions using linear
   discriminant functions (LDF) that combine characteristics describing various
   contextual features of these sites. The default LDF threshold on the server
   is 0.
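
   The thresholding step can be illustrated with a minimal sketch. The
   feature values, weights and constant below are hypothetical placeholders,
   not the characteristics or coefficients actually used by POLYAH; the
   sketch only shows how a linear discriminant score is compared with a
   threshold (0 by default, as on the server).

```python
def ldf_score(features, weights, constant=0.0):
    """Linear discriminant score: weighted sum of features minus a constant.

    All inputs here are hypothetical; POLYAH's real features and
    coefficients are not given in this announcement.
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * x for w, x in zip(weights, features)) - constant


def predict_sites(candidates, weights, constant=0.0, threshold=0.0):
    """Keep candidate positions whose LDF score exceeds the threshold.

    candidates: list of (position, feature_vector) pairs.
    Returns a list of (position, score) pairs.
    """
    hits = []
    for position, features in candidates:
        score = ldf_score(features, weights, constant)
        if score > threshold:
            hits.append((position, score))
    return hits


# Two made-up candidate sites with two made-up features each.
example_candidates = [(988, [1.0, 2.0]), (1200, [0.0, 0.5])]
example_hits = predict_sites(example_candidates, weights=[2.0, 1.0], constant=1.5)
```

   Raising the threshold keeps fewer, higher-scoring candidates.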

The accuracy has been estimated on a set of 131 poly-A regions
     and 1466 non-poly-A regions of human genes, having AATAAA.
     When 86% of poly-A regions are predicted correctly, the algorithm gives
     8% false predictions (Sp=50%; C=0.62). For example, with threshold 0.7 it
     predicts 8 of 9 poly-A sites of the AD2 genome (35937 bp) and overpredicts
     4 false sites. (Compare with the method of poly-A site prediction in
     CABIOS 1994, 10, 597-603, which for
     8 true predicted sites gives 968 false positive sites.)
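
The figures above can be cross-checked with a short calculation. This is a
sketch under stated assumptions: the 8% is taken as the fraction of the 1466
non-poly-A regions predicted as sites, Sp is taken as TP/(TP+FP), and C as
the usual correlation coefficient computed from the four counts. The counts
are rounded from the percentages, so the result only approximately
reproduces Sp=50% and C=0.62.

```python
import math


def accuracy_summary(n_true, n_false, fraction_found, fraction_false):
    """Derive Sp and the correlation coefficient from set sizes and rates.

    Assumes Sp = TP / (TP + FP) and C = Matthews correlation coefficient;
    these definitions are assumptions, not stated in the announcement.
    """
    tp = round(n_true * fraction_found)    # true sites predicted
    fn = n_true - tp                       # true sites missed
    fp = round(n_false * fraction_false)   # false sites predicted
    tn = n_false - fp                      # false sites rejected
    sp = tp / (tp + fp)
    c = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"tp": tp, "fp": fp, "sp": sp, "c": c}


summary = accuracy_summary(131, 1466, 0.86, 0.08)
print("Sp = %.2f, C = %.2f" % (summary["sp"], summary["c"]))
```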


  For email submission the sequences must have the following format:  

Name of your  sequence

   (The line length must be less than 80 letters).
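
   A file in this format could be prepared with a short script such as the
   sketch below. It assumes that the sequence follows the name line and may
   be split over several lines; the function name and the 60-letter default
   width are illustrative choices, and only the requirement that lines be
   shorter than 80 letters comes from the description above.

```python
def format_for_polyah(name, sequence, width=60):
    """Return file text: a name line followed by wrapped sequence lines.

    Assumption: the sequence lines follow the name line. The width must
    stay below 80 to respect the stated line-length limit.
    """
    if not 0 < width < 80:
        raise ValueError("width must be between 1 and 79")
    if len(name) >= 80:
        raise ValueError("name line must be shorter than 80 letters")
    # Drop whitespace and line breaks that may be present in the input.
    cleaned = "".join(sequence.split()).upper()
    lines = [name]
    for start in range(0, len(cleaned), width):
        lines.append(cleaned[start:start + width])
    return "\n".join(lines) + "\n"


example_text = format_for_polyah("my_sequence", "aataaa" * 25)
```

   The resulting text can be written to test.seq and mailed as shown in the
   examples.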

   You have to send the file containing the sequence to: 
   service at
   Subject line must be: polyah

   Example: mail -s polyah service at < test.seq

POLYAH output:		

   1st line - name of your sequence; 2nd line - length of your sequence.
   Next lines - positions of predicted sites and their 'weights'.
   The position is that of the first nucleotide of the AATAAA consensus in the
   predicted region.


 HSG11C4A     1741 bp    DNA             PRI       21-FEB
 Length of sequence-      1741
     1 potential polyA site was predicted
 Pos.:    988 LDF-  4.06
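
A reply in this layout can be read back with a small parser. The sketch
below is based only on the single example shown above, so the patterns it
matches ("Length of sequence-", "Pos.:", "LDF-") are assumptions about the
general output; the function name is illustrative.

```python
import re


def parse_polyah_output(text):
    """Extract the sequence name, length and predicted sites from a reply.

    Patterns are inferred from one example output and may need adjusting.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    result = {"name": lines[0].strip(), "length": None, "sites": []}
    for line in lines[1:]:
        match = re.search(r"Length of sequence-\s*(\d+)", line)
        if match:
            result["length"] = int(match.group(1))
            continue
        match = re.search(r"Pos\.:\s*(\d+)\s+LDF-\s*(-?\d+(?:\.\d+)?)", line)
        if match:
            # Position of the first nucleotide of AATAAA, and its LDF weight.
            result["sites"].append((int(match.group(1)), float(match.group(2))))
    return result


example_reply = """\
 HSG11C4A     1741 bp    DNA             PRI       21-FEB
 Length of sequence-      1741
     1 potential polyA site was predicted
 Pos.:    988 LDF-  4.06
"""
parsed = parse_polyah_output(example_reply)
```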

Salamov A.A., Lawrence C.B., Solovyev V.V. (1995) Recognition of 
	3'-end cleavage and polyadenylation region of human mRNA precursors
	(in preparation). 

Questions: solovyev at

The other services are:
FGENEH - search for gene structure with exon assembly by dynamic programming 
FEXH   - search for 5'-, internal and 3'-exons
HEXON  - search for internal exons 
HSPL   - search for splice sites
RNASPL - prediction of exon-exon junctions in cDNA sequences
CDSB   - prediction of bacterial coding regions
HBR    - recognition of human and bacterial sequences to test a library 
         for E. coli contamination by sequencing example clones

SSP    - prediction of a-helix and b-strand segments in globular proteins
         by a segment-oriented approach.
NSSP   - prediction of a-helix and b-strand segments in globular proteins
         by a nearest-neighbor algorithm.
