all against all matching of some 1000 proteins

Gaston Gonnet gonnet at inf.ethz.ch
Mon May 17 11:23:23 EST 1993


In article <1t82vvINN26f0 at rs1.rrz.Uni-Koeln.DE> khofmann at biomed.biolan.uni-koeln.de writes:
>Dear Netpeople,
>
>In the course of the construction of a specialized sub-database of 
>protein informations (based on swiss-prot), I have the following problem:
>
>It is necessary to group about 1000-1500 of proteins (all contained in
>swissprot) into families of related sequences. The criteria for the
>similarity grouping can be fairly simple, no sophisticated treatment of
>borderline cases is necessary.
>
>Two possible solutions for this task come to mind:
>
>- there have been several attempts to do an all-against-all matching of
>  SWISSPROT or other protein databases.
====================================
We have done it for many versions of SP (SWISS-PROT).  The newest one
we have is for version 24.

>  Is there any possibility to access a subset of the matching scores?
================================================
We are willing to share all the data we have on the all-against-all.
You just need to specify a criterion for selecting the interesting
matches (beware that there are about 2M interesting matches!)
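
For illustration only, here is a minimal Python sketch of how a simple
selection criterion could turn a list of pairwise matches into families.
Everything in it is an assumption made for the example: the
(id_a, id_b, score) tuple layout, the score cutoff, the function name
group_families, and the choice of single-linkage grouping (two proteins
end up in one family if a chain of accepted matches connects them). It is
not the format of our data, nor the method used for the all-against-all.

```python
def group_families(ids, matches, min_score):
    """Group protein ids into families by single linkage.

    ids       : iterable of protein identifiers of interest
    matches   : iterable of (id_a, id_b, score) tuples (hypothetical layout)
    min_score : matches scoring below this cutoff are ignored
    """
    # Union-find forest: every protein starts as its own family.
    parent = {i: i for i in ids}

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in matches:
        # Keep only matches between proteins of interest that pass the cutoff.
        if score >= min_score and a in parent and b in parent:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    families = {}
    for i in parent:
        families.setdefault(find(i), []).append(i)
    return sorted(sorted(members) for members in families.values())


# Made-up identifiers and scores, just to show the call.
example_ids = ["P1", "P2", "P3", "P4"]
example_matches = [("P1", "P2", 120.0), ("P2", "P3", 95.0), ("P3", "P4", 40.0)]
example_families = group_families(example_ids, example_matches, min_score=80.0)
```

With single linkage, one spurious match is enough to merge two families,
so the cutoff should be chosen conservatively.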
>
>- I could do an all-against-all matching of the proteins of interest
>  myself, although I have the impression that with 1500 sequences this
>  could take some time..
=================================================
Yes, it will take you quite a bit of time, unless you have a MasPar
machine of a good size.
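
To give a rough idea of the size of the job: n sequences give n(n-1)/2
pairwise comparisons. The short Python sketch below does this arithmetic.
The function names and the seconds_per_pair argument are hypothetical,
and any per-alignment cost you plug in is your own assumption, not a
measurement of any program or machine.

```python
def pair_count(n):
    """Number of distinct pairs among n sequences: n(n-1)/2."""
    return n * (n - 1) // 2


def estimated_hours(n, seconds_per_pair):
    """Total serial time in hours, given an assumed cost per alignment."""
    return pair_count(n) * seconds_per_pair / 3600.0


pairs_1000 = pair_count(1000)  # 499500
pairs_1500 = pair_count(1500)  # 1124250
```

So 1000 to 1500 sequences mean roughly half a million to a bit over a
million alignments, and the total time scales directly with whatever a
single alignment costs on your machine.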

Gaston H. Gonnet, ETH Zurich.



