Sequences Similar or Chance?

Geoff Barton gjb at bioch.ox.ac.uk
Mon Feb 6 10:38:30 EST 1995


In article <Pine.A32.3.91c.950123112943.26162A-100000 at homer22.u.washington.edu>, Peter Myler <mylerpj at u.washington.edu> writes:
|> On 22 Jan 1995, Mary Berbee wrote:
|> 
|> > We are looking for a program to tell us if two sequences are likely to be
|> > similar by chance or by homology.  I assume there would be a method by
|> > randomly shuffling the bases (ie, keeping the base ratio the same but
|> > randomizing their order, then comparing the percent remaining the same
|> > when aligned).
|> > 
|> > Any suggestions on this will be greatly appreciated!
|> > 
|> > Dave Carmean  carmean at sfu.ca
|> > Mary Berbee  berbee at unixg.ubc.ca
|> > 
|> > 
|> There is a DOS program called RDF2 which does pretty much what you have 
|> described.  It used a FASTA type alignment, I believe.  I don't know 
|> where to get it on the NET, but I have a copy which I can send if you 
|> want to e-mail me.
|> 
|> Peter
|> 
|> ===============================================================================
|> Peter J. Myler                                 phone: (206) 284-8846x332
|> Seattle Biomedical Research Institute          FAX: (206) 284-0313
|> 4 Nickerson Street                             e-mail: MYLERPJ at U.WASHINGTON.EDU
|> Seattle, WA  98109-1651
|> ===============================================================================
|> 
|> 

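If you do use RDF2 or any other shuffling method to compare protein sequences,
bear in mind that the mean SD score expected for UNRELATED proteins
is around 3.0.  (The SD score is the number of standard deviations by which
the score for the real pair exceeds the mean score for the shuffled
sequences.)  A score near 3.0 is therefore not, on its own, evidence of
homology.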
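
For anyone unsure what a shuffling test computes, here is a rough sketch in
Python.  It is NOT the RDF2 code and does not use a FASTA-type alignment: the
simple Smith-Waterman scorer, the match/mismatch/gap values, and the function
names local_score and shuffle_sd_score are all assumptions made for
illustration.  Only the outline (shuffle one sequence, rescore, express the
real score in SD units) follows the procedure described in this thread.

```python
import random
import statistics


def local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman, linear gap penalty).

    The scoring values are illustrative assumptions, not RDF2's.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


def shuffle_sd_score(a, b, n_shuffles=200, seed=0):
    """Return (sd_score, observed, shuffled_mean, shuffled_sd).

    Sequence b is shuffled n_shuffles times, which keeps its composition
    but randomizes the order, and each shuffled copy is scored against a.
    """
    rng = random.Random(seed)
    observed = local_score(a, b)
    letters = list(b)
    shuffled = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        shuffled.append(local_score(a, "".join(letters)))
    mean = statistics.mean(shuffled)
    sd = statistics.stdev(shuffled)
    if sd == 0:
        raise ValueError("shuffled scores do not vary; sequences too short?")
    return (observed - mean) / sd, observed, mean, sd
```

Calling shuffle_sd_score(seq1, seq2) returns the SD score first, followed by
the raw numbers it was computed from.  With a toy scoring scheme like this one
the absolute values will differ from what RDF2 reports, so treat it only as a
picture of the method.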

---
Geoff Barton



