all against all matching of some 1000 proteins
gonnet at inf.ethz.ch
Mon May 17 11:23:23 EST 1993
In article <1t82vvINN26f0 at rs1.rrz.Uni-Koeln.DE> khofmann at biomed.biolan.uni-koeln.de writes:
>In the course of the construction of a specialized sub-database of
>protein information (based on swiss-prot), I have the following problem:
>It is necessary to group about 1000-1500 proteins (all contained in
>swissprot) into families of related sequences. The criteria for the
>similarity grouping can be fairly simple, no sophisticated treatment of
>borderline cases is necessary.
>Two possible solutions for this task come to mind:
>- there have been several attempts to do an all-against-all matching of
> SWISSPROT or other protein databases.
We have done this for many versions of SWISS-PROT (SP). The newest one
we have is for version 24.
> Is there any possibility to access a subset of the matching scores?
We are willing to share all the data we have on the all-against-all.
You just need to specify a criterion for selecting the interesting
matches (beware that there are about 2M interesting matches!)
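As a rough illustration of what such a selection followed by a simple family grouping could look like, here is a minimal sketch. It is not the method used for the all-against-all itself. It assumes the matches are available as (id_a, id_b, score) tuples, keeps only matches between sequences of interest that reach a score cutoff, and merges connected sequences into families with union-find (single linkage). The function name, the record layout, the ids, and the cutoff value are all hypothetical.

```python
from collections import defaultdict

def group_families(ids, matches, min_score):
    """Single-linkage grouping: two sequences land in the same family if a
    chain of matches scoring >= min_score connects them."""
    parent = {i: i for i in ids}

    def find(x):
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in matches:
        # Ignore matches outside the subset of interest or below the cutoff.
        if a in parent and b in parent and score >= min_score:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    families = defaultdict(list)
    for i in ids:
        families[find(i)].append(i)
    return sorted(sorted(members) for members in families.values())

# Hypothetical ids and scores, for illustration only.
ids = ["P1", "P2", "P3", "P4", "P5"]
matches = [
    ("P1", "P2", 120.0),
    ("P2", "P3", 95.0),
    ("P4", "P5", 40.0),   # below the cutoff used here
    ("P3", "X9", 300.0),  # X9 is not in the subset of interest
]
families = group_families(ids, matches, min_score=80.0)
```

Single linkage is the simplest possible criterion and can chain loosely related sequences together through intermediates, so it only fits the case where borderline families do not need careful treatment.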
>- I could do an all-against-all matching of the proteins of interest
> myself, although I have the impression that with 1500 sequences this
> could take some time..
Yes, it will take you either quite a bit of time or a MasPar machine
of a good size.
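To give a feel for the size of the job, the number of pairwise comparisons in an all-against-all grows quadratically: n sequences need n(n-1)/2 alignments. The small sketch below only counts pairs; it makes no assumption about the time per alignment, which depends on the sequence lengths and the algorithm used. The function name is hypothetical.

```python
def pair_count(n):
    """Number of unordered pairs among n sequences: n * (n - 1) / 2."""
    return n * (n - 1) // 2

# The range of subset sizes mentioned in the question.
for n in (1000, 1500):
    print(n, "sequences ->", pair_count(n), "pairwise comparisons")
```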
Gaston H. Gonnet, ETH Zurich.