Database entry cross references

Tom Schneider toms at fcs260c2.ncifcrf.gov
Thu Oct 31 17:02:51 EST 1991


In article <1991Oct30.000312.2646 at ccu.umanitoba.ca> frist at ccu.umanitoba.ca writes:

>A problem arises, however, when you merge two entries, each of which
>contains features with the same label. If both of the Iron-sulfur
>ferredoxins with the label frxA were parts of the same plastid genome, and
>were later merged with other sequences to form the complete genome, then
>both features would have the same name.
>
>While it may not be the ideal solution, I can suggest one way out of this
>dilemma. Perhaps someone else can do better. Anyway, one answer is to
>incorporate the accession number into the label field, such as
>
>/label=M22345:frxA
>
>This has the advantages of keeping a mnemonic label, while at the same
>time requiring no real change in the existing syntax. Furthermore, it's guaranteed
>to be unique within the database.
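
To make the quoted proposal concrete, here is a minimal Python sketch of
accession-qualified labels. The helper name qualify_label and the second
accession number M99999 are made up for illustration; only M22345 and frxA
come from the post above.

```python
def qualify_label(accession: str, label: str) -> str:
    """Prefix a feature label with the accession number of its entry."""
    return f"{accession}:{label}"


# Two entries that each carry a feature labelled frxA.
# M22345 appears in the post; M99999 is a hypothetical second entry.
entries = {"M22345": ["frxA"], "M99999": ["frxA"]}

# After qualification the two frxA features no longer share a name.
merged = [
    qualify_label(accession, label)
    for accession, labels in entries.items()
    for label in labels
]
print(merged)
```

The labels stay unique only as long as accession numbers themselves are
unique, which is the property the proposal relies on.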

A simpler solution is to push the idea of labels to its limits and ELIMINATE
accession numbers!  Thus the names WITHIN a species become unique, which is
what geneticists are trying to work out anyway (it's part of their job).

A complete label consists of:
   species          E. coli
   strain           K12
   genetic region   lacZ
   object           ribosome binding site
(am I missing anything?)

So there would never be a name conflict when merges are made (or a merged view is
constructed).  In contrast, many introns and exons in the current database all
have the same name, and merging is impossible.
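
A rough Python sketch of the idea, assuming the four-part label listed above
is used as the key of each feature. The names Label and merge, the lacY
entry, and all coordinates are invented for illustration; they are not part
of any existing database format.

```python
from typing import NamedTuple


class Label(NamedTuple):
    """Four-part label: species, strain, genetic region, object."""
    species: str
    strain: str
    region: str
    obj: str


def merge(*entries):
    """Merge feature tables; raise if one label maps to two locations."""
    merged = {}
    for entry in entries:
        for label, location in entry.items():
            if label in merged and merged[label] != location:
                raise ValueError(f"label conflict: {label}")
            merged[label] = location
    return merged


# Full labels: the same kind of object in different regions never collides.
entry_a = {Label("E. coli", "K12", "lacZ", "ribosome binding site"): (1, 20)}
entry_b = {Label("E. coli", "K12", "lacY", "ribosome binding site"): (3100, 3120)}
genome = merge(entry_a, entry_b)

# Bare labels such as "exon" collide as soon as two entries are merged.
try:
    merge({"exon": (1, 20)}, {"exon": (3100, 3120)})
    bare_conflict = False
except ValueError:
    bare_conflict = True
```

Here merge accepts the two fully labelled entries, while the two bare "exon"
features are rejected because nothing distinguishes them.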

>My final point is this. While there may be some problems inherent in using
>labels in the database, they are solvable, and their advantages far
>outweigh these problems.

Agreed!

  Tom Schneider
  National Cancer Institute
  Laboratory of Mathematical Biology
  Frederick, Maryland  21702-1201
  toms at ncifcrf.gov


