pI distribution of E.coli cell extracts
> If you want a comprehensive but more theoretical answer
> and you have the computing resources you could predict
> the pI for each of the E.coli ORF protein sequences
> from the total K12 genome and work out the proportions
> from there. The pI prediction (and it is just a prediction)
> is fairly trivial so all you need is the disk space.
> If I have room on my Linux box I'll give it a try over
> the weekend.
My apologies! I underestimated the zeal of the E. coli people
in analysing their data. They have already calculated pI
values for all 4290 predicted protein-encoding ORFs.
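For anyone wondering what such a prediction involves, here is a
minimal Python sketch. It is not necessarily the method the Wisconsin
group used. The pKa values are approximate and assumed for
illustration only (different programs use different sets), and the
function names are my own. It sums the Henderson-Hasselbalch charge of
each ionisable group and bisects for the pH at which the net charge
is zero.

```python
# Approximate pKa values, assumed for illustration; real tools use
# different sets and will give somewhat different pI values.
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}              # basic side chains
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}     # acidic side chains
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def net_charge(seq, ph):
    """Net charge of an unmodified peptide at the given pH."""
    # Free amino terminus contributes positive charge.
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    # Free carboxyl terminus contributes negative charge.
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for aa in seq:
        if aa in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return charge


def predict_pi(seq, lo=0.0, hi=14.0, tol=0.001):
    """Bisect for the pH where the net charge crosses zero."""
    seq = seq.upper()
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # Net charge falls as pH rises, so a positive charge means
        # the pI lies above mid.
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Calling predict_pi() on each translated ORF and binning the results
would give a distribution like the one tabulated in this post, subject
to the same caveats about predicted values.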
I went to the Wisconsin K12 page at:
http://www.genetics.wisc.edu/html/k12.html
and grabbed the Excel spreadsheet from:
ftp://ftp.genetics.wisc.edu/pub/analysis/m52orfs.zip
Just to make things easier for you the pI values are grouped
as follows:
pI          Number of ORFs
==          ==============
<4                      27
4 - 5                  534
5 - 6                 1057
6 - 7                  875
7 - 8                  358
8 - 9                  573
9 - 10                 638
10 - 11                178
11 - 12                 44
>12                      6
So *theoretically* about 33.5% of E. coli proteins (1439 of
4290) will have pI values greater than 8. You should therefore
be reasonably correct in your assumption that most pI values
are 4 - 10, as only about 5.9% (255 of 4290) lie outside this
range.
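If you want to re-derive the proportions yourself, a short Python
sketch along these lines should do. The counts are copied from the
table above; the variable names are just my own choice.

```python
# (lower bound, upper bound, number of ORFs); None marks an open-ended bin.
PI_BINS = [
    (None, 4, 27),
    (4, 5, 534),
    (5, 6, 1057),
    (6, 7, 875),
    (7, 8, 358),
    (8, 9, 573),
    (9, 10, 638),
    (10, 11, 178),
    (11, 12, 44),
    (12, None, 6),
]

total = sum(count for _, _, count in PI_BINS)

# Bins whose lower bound is 8 or more, i.e. pI greater than 8.
above_8 = sum(count for low, _, count in PI_BINS
              if low is not None and low >= 8)

# Bins below 4 (no lower bound) or starting at 10 and above.
outside_4_10 = sum(count for low, _, count in PI_BINS
                   if low is None or low >= 10)

print(f"Total ORFs:       {total}")
print(f"pI > 8:           {above_8} ({100 * above_8 / total:.1f}%)")
print(f"pI outside 4-10:  {outside_4_10} ({100 * outside_4_10 / total:.1f}%)")
```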
Bear in mind that many ORFs are only hypothetical,
post-translational modification is not accounted for, and
the calculated pI values do not always correspond well with
the actual values.
I hope that this helps (I know I learnt something),
