protein sequence generation

plantscigroup at my-dejanews.com
Sat Mar 27 10:08:37 EST 1999


In article <36FA980A.143E4866 at fuerst.de>,
  "Frank Fürst" <frank at fuerst.de> wrote:
> Hi,
>
> Andy Phillips wrote:
> >
> > Cornelius Krasel wrote:
> > >
> > > Be aware that for a protein with n
> > > residues there are approximately n! different sequences (somewhat
> > > less because of repetitions).
> > >
> >
> > Err..shouldn't that be 20^n (20 to the nth power) different sequences??
>
> You're right. But on the other hand, if we come back to the original
> question (which I snipped out above...), even with a molecular weight
> tolerance of +- zero, one gets n! different sequences. Plus the
> sequences with multiple mutations that compensate each other, each of
> them again n! times. So n! gets somewhat important, because for every
> sequence a man would find calculating with his pencil, the computer
> would find n!

No, sorry, it isn't even close to n! - each amino acid occurs on average
n/20 times, and those repetitions make the number much smaller. Consider a
protein of length 100 - there are 20^100 (~= 10^130) possible sequences, but
100! (~= 10^157) permutations, so n! would count more sequences than exist.
Consider a sequence of length Z, in which there are A amino acids of type 1,
B of type 2, and so on up to T of type 20.
The number of distinct sequences with that composition would be:

       Z!
---------------
A! B! C! ... T!
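
For anyone who wants to try the formula on a real sequence, here is a rough
Python sketch. The function name distinct_arrangements and the one-letter
sequence strings are just my own illustration, not anything from the original
question; it simply counts each residue and divides Z! by the factorials of
those counts.

```python
from collections import Counter
from math import factorial


def distinct_arrangements(sequence: str) -> int:
    """Count distinct orderings of the residues in `sequence`.

    Implements Z! / (A! B! ... T!), where Z is the sequence length and
    A, B, ... are the number of occurrences of each residue type.
    (Hypothetical helper, named here for illustration only.)
    """
    counts = Counter(sequence)           # occurrences of each residue type
    result = factorial(len(sequence))    # Z!
    for count in counts.values():
        result //= factorial(count)      # each division is exact at every step
    return result


# Small sanity checks that can be verified by hand.
print(distinct_arrangements("AAB"))   # AAB, ABA, BAA
print(distinct_arrangements("AABB"))

# A 100-residue sequence with each of the 20 amino acids occurring 5 times.
uniform = "ACDEFGHIKLMNPQRSTVWY" * 5
print(distinct_arrangements(uniform) < 20**100 < factorial(100))
```

For the evenly composed 100-residue case the result is on the order of
10^116, well below both 20^100 and 100!.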


Gary M.





More information about the Bio-soft mailing list