Genes/Operons in pathogenic organisms:
Victor Solovyev
softberry at softberry.com
Tue Nov 19 15:32:37 EST 2002
Genes/Operons in pathogenic organisms: Mycobacterium tuberculosis,
Yersinia pestis and others
Using the Softberry FgenesB-annotator script, which predicts genes and
finds similar proteins in public databases, we present annotations for
several pathogenic organisms at:
http://www.softberry.com/berry.phtml?topic=fgenesb_ann
Mycobacterium tuberculosis H37Rv, complete genome
Mycobacterium tuberculosis CDC1551, complete genome
Yersinia pestis strain CO92, complete genome
Yersinia pestis KIM, complete genome
Bacillus anthracis A2012 main chromosome
Example of annotation of Yersinia pestis KIM:
Prediction of potential operons and genes in microbial genomes
Time: Mon Nov 18 11:07:36 2002
Seq name: gi|22123922|ref|NC_004088.1| Yersinia pestis KIM, complete genome
Length of sequence - 4600755 bp
Number of predicted genes - 4011, with homology - 3927
Number of transcription units - 2364, operons 799
   N   Tu/Op   Conserved     S         Start -   End   Score
               pairs(N/Pv)
   1   1 Op 1  2/0.311       -   CDS      21 -   461     375  ## COG0716 Flavodoxins
   2   1 Op 2  .             -   CDS     554 -  1015     362  ## COG1522 Transcriptional regulators
   3   2 Tu 1  .             +   CDS    1185 -  2177    1148  ## COG2502 Asparagine synthetase A
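
For readers who want to post-process such output, here is a minimal Python
sketch of a record parser. It assumes each gene record sits on a single
line in the column order shown in the example above; the names
GeneRecord and parse_record are hypothetical and are not part of any
Softberry tool, and the real output may contain record types this
pattern does not cover.

```python
import re
from typing import NamedTuple, Optional


class GeneRecord(NamedTuple):
    """One gene line of the example output (field names are illustrative)."""
    index: int                 # N: running gene number
    unit_id: int               # number of the transcription unit / operon
    unit_type: str             # "Op" (operon) or "Tu" (single transcription unit)
    pos_in_unit: int           # position of the gene inside its unit
    conserved: Optional[str]   # conserved pairs (N/Pv), None when shown as "."
    strand: str                # "+" or "-"
    feature: str               # e.g. "CDS"
    start: int
    end: int
    score: int
    annotation: str            # text after "##", e.g. COG id and description


# Assumed layout: N  unit Op|Tu pos  conserved  strand  feature  start - end  score  ## text
RECORD_RE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(Op|Tu)\s+(\d+)\s+(\S+)\s+([+-])\s+(\S+)\s+"
    r"(\d+)\s+-\s+(\d+)\s+(\d+)\s*(?:##\s*(.*))?$"
)


def parse_record(line: str) -> Optional[GeneRecord]:
    """Return a GeneRecord, or None if the line is not a gene record."""
    m = RECORD_RE.match(line)
    if m is None:
        return None
    n, unit, kind, pos, cons, strand, feat, start, end, score, ann = m.groups()
    return GeneRecord(
        int(n), int(unit), kind, int(pos),
        None if cons == "." else cons,
        strand, feat, int(start), int(end), int(score),
        (ann or "").strip(),
    )


if __name__ == "__main__":
    rec = parse_record("1 1 Op 1 2/0.311 - CDS 21 - 461 375 ## COG0716 Flavodoxins")
    print(rec)
```

Header and summary lines simply return None, so the function can be
mapped over every line of a report and the None results dropped.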
The new FgenesB is the fastest (E. coli genome analyzed in ~14 sec) and
most accurate ab initio bacterial gene prediction program available:
http://www.softberry.com/berry.phtml?topic=fgenesb
It uses parameters learned for different bacteria by the FgenesB-train
script, whose only input is the new bacterial sequence. The script
automatically creates a file with gene prediction parameters for the
analyzed organism. It takes only ~10 minutes to create such a file for
a genome such as E. coli from its sequence. If you need parameters for
your new bacterium, please contact Softberry Inc.; we can include them
in the Web list.
The algorithm is based on pattern recognition of different types of
signals and Markov chain models of coding regions. The optimal
combination of these features is then found by dynamic programming, and
a set of gene models is constructed along the given sequences.
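
To illustrate the dynamic programming step in general terms, here is a
small Python sketch. It is not the FgenesB algorithm, whose details are
not given here: it assumes each candidate gene already carries a single
combined score (standing in for the signal and Markov chain terms),
ignores strand, and forbids any overlap between chosen genes. Under
those assumptions, picking the best-scoring set of candidates is the
classic weighted interval scheduling problem. The function name
best_gene_set and the example scores are made up.

```python
from bisect import bisect_right


def best_gene_set(candidates):
    """Pick non-overlapping candidates with maximal total score.

    candidates: list of (start, end, score) with inclusive coordinates.
    Returns (total_score, chosen_candidates_in_sequence_order).
    """
    cands = sorted(candidates, key=lambda c: c[1])   # sort by end coordinate
    ends = [c[1] for c in cands]
    n = len(cands)
    best = [0.0] * (n + 1)    # best[i]: optimum using the first i candidates
    take = [False] * (n + 1)  # take[i]: is candidate i part of that optimum?
    prev = [0] * (n + 1)      # prev[i]: how many earlier candidates end before i starts

    for i, (start, _end, score) in enumerate(cands, start=1):
        # Count earlier candidates whose end lies strictly before this start.
        p = bisect_right(ends, start - 1, 0, i - 1)
        prev[i] = p
        with_i = best[p] + score
        if with_i > best[i - 1]:
            best[i], take[i] = with_i, True
        else:
            best[i] = best[i - 1]

    # Walk back through the table to recover the chosen candidates.
    chosen = []
    i = n
    while i > 0:
        if take[i]:
            chosen.append(cands[i - 1])
            i = prev[i]
        else:
            i -= 1
    chosen.reverse()
    return best[n], chosen


if __name__ == "__main__":
    # Hypothetical candidates: (start, end, combined score)
    demo = [(1, 100, 5.0), (50, 150, 8.0), (120, 300, 4.0), (160, 400, 6.0)]
    print(best_gene_set(demo))
```

A real gene finder would also have to handle both strands and the short
overlaps between neighbouring genes that occur in bacterial genomes.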
In the current FgenesB version, the operon prediction model is based on
distances between genes. It accurately recognizes 70% of single
transcription units and exactly defines about 43% of operons (~92%
partially).
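
A distance-based grouping of this kind can be sketched in a few lines of
Python. This is only an illustration of the idea, not the FgenesB
model: the same-strand requirement and the 100 bp cutoff (max_gap) are
assumptions chosen for the example, and the function name
group_into_units is hypothetical. The coordinates are the three genes
from the Yersinia pestis KIM example.

```python
def group_into_units(genes, max_gap=100):
    """Group genes into transcription units by intergenic distance.

    genes: list of (start, end, strand) with inclusive coordinates.
    Adjacent genes join the same unit when they lie on the same strand
    and are separated by at most max_gap bases (an assumed cutoff).
    Returns a list of units, each a list of genes in sequence order.
    """
    units = []
    for gene in sorted(genes):
        start, _end, strand = gene
        if units:
            _prev_start, prev_end, prev_strand = units[-1][-1]
            gap = start - prev_end - 1   # bases between the two genes
            if strand == prev_strand and gap <= max_gap:
                units[-1].append(gene)
                continue
        units.append([gene])
    return units


if __name__ == "__main__":
    genes = [(21, 461, "-"), (554, 1015, "-"), (1185, 2177, "+")]
    for unit in group_into_units(genes):
        kind = "Op" if len(unit) > 1 else "Tu"
        print(kind, unit)
```

With this cutoff the first two genes (92 bp apart, both on the minus
strand) fall into one operon and the third gene forms its own
transcription unit, which agrees with the Op/Tu labels in the example
output.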
---
=====
Moderated
bionet.genome.gene-structure