FEXH: find exons (human)
Dan Davison
dbd at THEORY.BCHS.UH.EDU
Tue Jul 26 17:37:33 EST 1994
The Baylor College Of Medicine Computational Biology Group
Houston, TX
announces a new service
FEXH
(find exons)
Prediction of internal, 5'- and 3'- exons in Human DNA sequences
NOTE: This service is temporarily being provided through the
University of Houston Gene-Server. Only two jobs will be run at a
time.
Analysis of uncharacterized human sequences is available by sending a
file containing a sequence name as the first line and a sequence (with
no more than 80 characters/line) to
service at theory.bchs.uh.edu
with the subject line "FEXH".
Example: mail -s FEXH service at theory.bchs.uh.edu < test.seq
where test.seq is a file containing the sequence.
Method description:
**********************
The algorithm first predicts all internal exons in a given sequence
using a linear discriminant function that combines characteristics
describing the donor and acceptor splice sites, the 5'- and 3'-intron
regions, and the coding region for each open reading frame flanked
by GT and AG dinucleotides. Potential 5'- and 3'-exons are then predicted
by the corresponding discriminant functions to the left of the
first internal exon and to the right of the last internal exon,
respectively.
Accuracy:
***************
The accuracy of exon recognition has been estimated on a set
of 1016 exons from 181 complete genes.
Each sequence in the set extends from 150 bp upstream of the first coding
region to 150 bp downstream of the last coding region.
Test                     Fexh          Grail-2
Exact exon prediction    70%           40%
Exon nucleotides         85% (0.84)    77% (0.76)
The numbers in () are the correlation coefficients.
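The announcement does not define how these measures were computed. As a rough sketch only, the Python below assumes that "exact exon prediction" means the fraction of true exons whose two boundaries both match a prediction, that the nucleotide percentage is the fraction of true exon nucleotides covered by predictions, and that the correlation coefficient is the usual two-class (Matthews-style) coefficient over nucleotides. All function names are hypothetical.

```python
import math


def covered_positions(exons):
    """Set of 1-based positions covered by a list of inclusive (start, end) exons."""
    return {pos for start, end in exons for pos in range(start, end + 1)}


def exact_exon_fraction(true_exons, predicted_exons):
    """Fraction of true exons reproduced with both boundaries exact."""
    predicted = set(predicted_exons)
    return sum(1 for exon in true_exons if exon in predicted) / len(true_exons)


def nucleotide_accuracy(true_exons, predicted_exons, length):
    """Return (fraction of true exon nucleotides predicted, correlation coefficient)."""
    true_pos = covered_positions(true_exons)
    pred_pos = covered_positions(predicted_exons)
    tp = len(true_pos & pred_pos)
    fp = len(pred_pos - true_pos)
    fn = len(true_pos - pred_pos)
    tn = length - tp - fp - fn
    sensitivity = tp / (tp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption for this sketch: report 0.0 when the coefficient is undefined
    cc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, cc
```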
Note that this program does not assemble the predicted exons into a gene
model, which makes it more reliable when an exon is missing (for example,
because of sequence errors). For gene model prediction you can use the "fgeneh"
program from the Gene-Server (it is more accurate for complete gene structure
prediction); if you have only an internal part of a gene sequence, internal
exons can be predicted with the server's "hexon" program.
Submitting sequences via email:
********************************
For email submission the sequences must have the following format:
Name of the sequence
ccatctctgtcttgcaggacaatgccgtcttctgtctcgtggggcatcctcctgctggca
ggcctgtgctgcctggtccctgtctccctggctgaggatccccagggagatgctgcccag
aagacagatacatcccaccatgatcaggatcacccaaccttcaacaagatcacccccaac
ctggctgagttcgccttcagcctataccgccagctggcacaccagtccaacagcaccaat
atcttcttctccccagtgagcatcg...............
(Restrict the line length to 80 characters or less).
You have to send the file containing the sequence to:
service at theory.bchs.uh.edu
Subject line must be:
fexh
Example: mail -s fexh service at theory.bchs.uh.edu < test.seq
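A file in the format described above can be prepared with a short script. The following Python is a minimal sketch under stated assumptions: the function name and the default width of 60 are illustrative choices, not part of the service.

```python
def format_submission(name, sequence, width=60):
    """Return the text of a submission file: a name line, then wrapped sequence lines.

    width is the number of sequence characters per line; the service
    requires no more than 80.
    """
    if not 1 <= width <= 80:
        raise ValueError("width must be between 1 and 80")
    # Drop any whitespace or line breaks already present in the sequence
    cleaned = "".join(sequence.split()).lower()
    lines = [name]
    lines.extend(cleaned[i:i + width] for i in range(0, len(cleaned), width))
    return "\n".join(lines) + "\n"
```

Writing the returned text to test.seq gives a file that can be mailed as in the example above.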
Fexh output:
*******************************
1st line - name of your sequence
2nd line - length of your sequence
3rd line - number of potential exons
4th line and next - positions of predicted exons and their weights
For example:
HUMALPHA 4556 bp ds-DNA PRI 15-SEP-1
length of sequence - 4556
number of potential exon: 10
380 - 516 w= 9.10
611 - 727 w=11.10
839 - 954 w=12.33
1147 - 1321 w= 7.70
1819 - 1953 w= 7.90
2053 - 2125 w=12.51
2254 - 2388 w= 6.66
2470 - 2661 w=10.11
2881 - 2997 w= 8.87
3120 - 3562 w= 9.92
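Output in this layout is easy to read back into a program. The Python below is a hedged sketch that assumes exon lines always look like "start - end w= weight", as in the example; the function name is hypothetical, and the pattern tolerates variable spacing around "w" and "=".

```python
import re

# Matches lines such as "380 - 516 w= 9.10" or "2254 - 2388 w =6.66"
EXON_LINE = re.compile(r"^\s*(\d+)\s*-\s*(\d+)\s+w\s*=\s*(-?\d+(?:\.\d+)?)\s*$")


def parse_fexh_exons(text):
    """Return a list of (start, end, weight) tuples from Fexh output text."""
    exons = []
    for line in text.splitlines():
        match = EXON_LINE.match(line)
        if match:
            start, end, weight = match.groups()
            exons.append((int(start), int(end), float(weight)))
    return exons
```

Header lines (sequence name, length, exon count) do not match the pattern and are skipped.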
References:
1. Solovyev V.V., Salamov A.A., Lawrence C.B.
Predicting internal exons by oligonucleotide composition and
discriminant analysis of spliceable open reading frames.
Nucl. Acids Res., 1994 (in press).
2. Solovyev V.V., Salamov A.A., Lawrence C.B.
The prediction of human exons by oligonucleotide composition and
discriminant analysis of spliceable open reading frames.
In: The Second International Conference on Intelligent Systems
for Molecular Biology (eds. Altman R., Brutlag D., Karp R., Lathrop R.
and Searls D.), AAAI Press, Menlo Park, CA, 1994 (in press).
Problems, comments, and suggestions
can be mailed to solovyev at cmb.bcm.tmc.edu.