Protcomp 6.0 Finding sub-cellular localization of Eukaryotic proteins: Animal- Plants
softberry at softberry.com
Thu Oct 21 13:31:37 EST 2004
A new version of ProtComp, version 6, is available to run at
Softberry has released ProtComp ver. 6. The new version of the popular program for prediction
of protein subcellular localization, ProtComp, has overall prediction accuracy
Prediction accuracy of the prokaryotic version, ProtCompB ver. 2, is 95%.
ProtComp combines several methods of protein localization prediction: neural
network-based prediction, direct comparison with an updated database of homologous
proteins of known localization, prediction of certain functional peptide sequences,
such as signal peptides, GPI-anchors, and transit peptides of mitochondria and
chloroplasts, and a search for certain localization-specific motifs. The program
includes trained recognizers for animal/fungal and plant proteins, which dramatically
improve recognition accuracy. The following table provides approximate prediction
accuracy for each compartment of animal and fungal proteins.
Testing was performed on a sample of 1128 proteins of known
localization that were NOT included in the program's training sample.
Percent predicted correctly (example for Animal)

Compartment             Sample size   ver. 5   ver. 6
Nucleus                         200       88       91
Plasma Membrane                 200       87      100
Extracellular                   200       83       86
Cytoplasm                       199       63       88
Mitochondria                    129       82       89
Endoplasmic Reticulum           107       83       82
Peroxisome                       34       97       91
Lysosome                         12       91      100
Golgi                            47       77       91
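
As a rough cross-check, the per-compartment figures in the table can be combined into a sample-weighted average. The sketch below is plain Python arithmetic over the table's numbers. The names ROWS and weighted_accuracy are illustrative, and the result is not necessarily the overall accuracy figure Softberry quotes, which may have been computed differently.

```python
# (compartment, sample size, % correct ver. 5, % correct ver. 6),
# copied from the table above.
ROWS = [
    ("Nucleus", 200, 88, 91),
    ("Plasma Membrane", 200, 87, 100),
    ("Extracellular", 200, 83, 86),
    ("Cytoplasm", 199, 63, 88),
    ("Mitochondria", 129, 82, 89),
    ("Endoplasmic Reticulum", 107, 83, 82),
    ("Peroxisome", 34, 97, 91),
    ("Lysosome", 12, 91, 100),
    ("Golgi", 47, 77, 91),
]


def weighted_accuracy(rows, column):
    """Average the percentages in `column`, weighting each row by its sample size."""
    total = sum(row[1] for row in rows)
    return sum(row[1] * row[column] for row in rows) / total


total_proteins = sum(size for _, size, _, _ in ROWS)
acc_v5 = weighted_accuracy(ROWS, 2)
acc_v6 = weighted_accuracy(ROWS, 3)

print(f"proteins in table: {total_proteins}")
print(f"weighted accuracy ver. 5: {acc_v5:.1f}%")
print(f"weighted accuracy ver. 6: {acc_v6:.1f}%")
```

The sample sizes sum to 1128, which matches the size of the test sample, and the weighted averages come out at roughly 81% for ver. 5 and 90% for ver. 6. Weighting by sample size keeps the small Lysosome and Peroxisome samples from dominating the average.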
ProtComp Version 6. Identifying sub-cellular location (Animals&Fungi)
Seq name: QUERY, Length=376
Significant similarity in Location DB - Location:Cytoplasmic
Database sequence: AC=P08319 Location:Cytoplasmic DE Alcohol dehydrogenase
class II pi chain precurs
Score=14845, Sequence length=391, Alignment length=365
Predicted by Neural Nets - Extracellular (Secreted) with score 2.4
Integral Prediction of protein location: Cytoplasmic with score 14.7
Location weights: LocDB / PotLocDB / Neural Nets / Tetramers / Integral
Nuclear 0.0 / 0.0 / 0.71 / 0.00 / 0.71
Plasma membrane 0.0 / 0.0 / 0.73 / 0.00 / 0.73
Extracellular 0.0 / 0.0 / 2.42 / 0.00 / 2.42
Cytoplasmic 14845.0 / 18465.0 / 0.83 / 8.50 / 14.68
Mitochondrial 0.0 / 0.0 / 0.70 / 0.00 / 0.70
Endoplasm. retic. 0.0 / 0.0 / 0.70 / 0.50 / 1.21
Peroxisomal 0.0 / 0.0 / 0.49 / 0.00 / 0.49
Lysosomal 0.0 / 0.0 / 0.33 / 0.00 / 0.33
Golgi 0.0 / 0.0 / 0.40 / 0.00 / 0.40
LocDB are scores based on the query protein's homologies with proteins of known
location.
PotLocDB are scores based on homologies with proteins whose locations are not
experimentally known but are assumed based on strong theoretical evidence.
Neural Nets are scores assigned by neural networks.
Tetramers are scores based on comparisons of tetramer distributions calculated
for QUERY and DB sequences.
Integral are the final scores, computed as combinations of the previous four scores.
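
For anyone post-processing results, the "Location weights" block can be read into a small table keyed by compartment and score type. The following is a minimal sketch that assumes the layout shown in the sample output (a compartment name followed by five numbers separated by " / "). The function name parse_location_weights is hypothetical and not part of ProtComp, and other versions of the program may format their output differently.

```python
COLUMNS = ("LocDB", "PotLocDB", "Neural Nets", "Tetramers", "Integral")

# Rows copied from the sample output above.
SAMPLE = """\
Location weights: LocDB / PotLocDB / Neural Nets / Tetramers / Integral
Nuclear 0.0 / 0.0 / 0.71 / 0.00 / 0.71
Plasma membrane 0.0 / 0.0 / 0.73 / 0.00 / 0.73
Extracellular 0.0 / 0.0 / 2.42 / 0.00 / 2.42
Cytoplasmic 14845.0 / 18465.0 / 0.83 / 8.50 / 14.68
Mitochondrial 0.0 / 0.0 / 0.70 / 0.00 / 0.70
Endoplasm. retic. 0.0 / 0.0 / 0.70 / 0.50 / 1.21
Peroxisomal 0.0 / 0.0 / 0.49 / 0.00 / 0.49
Lysosomal 0.0 / 0.0 / 0.33 / 0.00 / 0.33
Golgi 0.0 / 0.0 / 0.40 / 0.00 / 0.40
"""


def parse_location_weights(text):
    """Return {compartment: {score type: value}} for every well-formed row."""
    table = {}
    for line in text.splitlines():
        parts = [part.strip() for part in line.split("/")]
        if len(parts) != len(COLUMNS):
            continue  # not a five-column row
        try:
            # The first field holds the compartment name plus the LocDB score.
            name, first = parts[0].rsplit(None, 1)
            values = [float(first)] + [float(part) for part in parts[1:]]
        except ValueError:
            continue  # header line or malformed row
        table[name] = dict(zip(COLUMNS, values))
    return table


weights = parse_location_weights(SAMPLE)
top_integral = max(weights, key=lambda name: weights[name]["Integral"])
top_neural = max(weights, key=lambda name: weights[name]["Neural Nets"])
print("Highest Integral score:", top_integral)
print("Highest Neural Nets score:", top_neural)
```

On the sample output this picks Cytoplasmic by Integral score and Extracellular by Neural Nets score, in line with the "Integral Prediction" and "Predicted by Neural Nets" lines of the report.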
When interpreting the output, keep in mind that:
1. ProtComp's scores per se, being weights of complex neural networks, do not
represent probabilities of the protein's location in a particular compartment.
2. Significant homology with a protein of known location is a very strong
indicator of the query protein's location.
3. For neural network scores, the relative values for different
compartments are more important than the absolute values, i.e. if the second best
score is much lower than the best one, the prediction is more reliable, regardless
of absolute values.
4. If both the neural network and homology predictions point to the same
compartment, this is a very reliable prediction.
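
Points 3 and 4 can be checked mechanically. The sketch below uses the Neural Nets column from the sample output and compares the best score with the runner-up, then tests whether the neural network and homology predictions name the same compartment. The function neural_net_margin is illustrative only. ProtComp does not publish a threshold for what counts as "much lower", so none is assumed here.

```python
# Neural Nets scores copied from the sample output above.
NN_SCORES = {
    "Nuclear": 0.71,
    "Plasma membrane": 0.73,
    "Extracellular": 2.42,
    "Cytoplasmic": 0.83,
    "Mitochondrial": 0.70,
    "Endoplasm. retic.": 0.70,
    "Peroxisomal": 0.49,
    "Lysosomal": 0.33,
    "Golgi": 0.40,
}

# Location reported under "Significant similarity in Location DB".
HOMOLOGY_LOCATION = "Cytoplasmic"


def neural_net_margin(scores):
    """Return (best compartment, runner-up compartment, score gap between them)."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best, best_score), (second, second_score) = ranked[0], ranked[1]
    return best, second, best_score - second_score


best, second, gap = neural_net_margin(NN_SCORES)
agrees = best == HOMOLOGY_LOCATION

print(f"Neural Nets: {best} leads {second} by {gap:.2f}")
print("Neural Nets and homology agree:", agrees)
```

In the sample report the neural networks favour Extracellular by a clear margin, while the homology search points to Cytoplasmic, so the two methods disagree. The Integral prediction follows the homology result, consistent with point 2.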