Thorsten,
What you propose is very reasonable. It has already been implemented in the
fasta package of programs. There are a number of programs in that package
that perform randomisation tests of one kind or another. You can get the
package from:
ftp://ftp.virginia.edu/pub/fasta
ftp://ftp.virginia.edu/pub/fasta/fasta20u41.shar.Z (unix)
ftp://ftp.virginia.edu/pub/fasta/mac/fasta20u4.cpe.bin (mac)
ftp://ftp.virginia.edu/pub/fasta/dos/fa20u416.zip (dos 16-bit)
ftp://ftp.virginia.edu/pub/fasta/dos/fa20u432.zip (win95 32-bit)
ftp://ftp.virginia.edu/pub/fasta/dos/fa20u4sr.zip (sources)
ftp://ftp.virginia.edu/pub/fasta/dos/fa2u4doc.zip (docs)
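To make the idea concrete, here is a rough Python sketch of the shuffling test you describe. It is only an illustration and not the code used in the fasta programs. The function names, the simple match/mismatch scoring, the linear gap penalty and the number of shuffles are all my own assumptions; a real test would use a proper amino acid scoring matrix and the same gap creation and gap length weights as the original alignment.

```python
import random
import statistics

def shuffle_sequence(seq, rng):
    """Return a random permutation of seq (composition is preserved exactly)."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)

def local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score.

    Assumption: identity scoring and a linear gap penalty, chosen
    only to keep the sketch short.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best

def shuffle_test(a, b, n_shuffles=200, seed=0):
    """Compare the observed score with scores against shuffled copies of b.

    Returns (observed, mean, sd, z, p). z is None if all shuffled
    scores are identical. p is the fraction of shuffles scoring at
    least as high as the observed alignment, with a +1 correction.
    """
    rng = random.Random(seed)
    observed = local_score(a, b)
    shuffled = [local_score(a, shuffle_sequence(b, rng))
                for _ in range(n_shuffles)]
    mean = statistics.mean(shuffled)
    sd = statistics.pstdev(shuffled)
    z = (observed - mean) / sd if sd > 0 else None
    p = (1 + sum(s >= observed for s in shuffled)) / (n_shuffles + 1)
    return observed, mean, sd, z, p
```

Calling shuffle_test(seq1, seq2) on two protein sequences gives the observed score together with the mean and standard deviation of the shuffled scores. If the observed score is no higher than the shuffled ones, the similarity is probably explained by composition alone.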
Hope this helps.
James
thorsten burmester wrote:
> Dear all,
>
> I would like to have your comments on the following idea:
>
> One often reads in the literature speculations about possible
> relationships of proteins with only some 15 to 20% identity scores.
> Recently, I thought that a possible method to evaluate the
> significance of such low similarity scores would be to randomise the
> sequences of these proteins by keeping the relative amino acid
> composition. If one does this several times (with one or both of the
> sequences), and re-align these randomised sequences with the same gap
> creation and gap length weights, in case this original alignment was
> significant, the new similarity/identity scores should be
> significantly lower. However, if the observed identity is just due to
> similar amino acid compositions, the scores should be similar.
>
> My questions:
>
> 1. Does this sound reasonable, and has anybody ever tried a similar
> approach before?
> 2. Do you know any program that can randomise an amino acid sequence
> as described above?
>
> Thanks for your help.
>
> Thorsten
>
> --
> Thorsten Burmester - thorsten at erfurt.thur.de
--
=========
James O. McInerney email: J.mcinerney at nhm.ac.uk
Molec. Biol. Comput. Officer, phone: +44 171 938 9247
Department of Zoology, Fax: +44 171 938 9158
The Natural History Museum,
Cromwell Road,
London SW7 5BD.
=========