Software to extract annotation fields from EMBL/GenBank entries.
frist at cc.umanitoba.ca
Fri Jun 7 12:35:34 EST 1996
Brian Robertson wrote:
> The amount of bacterial genome data available as sequenced cosmids of
> 30-40 kb is increasing rapidly. Our problem is that we need to keep track
> of newly discovered genes as they appear, so they can be incorporated into
> our research program as appropriate. For this we need to create lists of
> probable genes identified in the annotations for each cosmid. This can
> then be circulated to laboratory workers.
> An example of this kind of annotation is shown below. We would like to
> extract the "/note" field, which contains the probable function of the
> gene, and create a list of these for each cosmid.
> FT   CDS_pept        complement(3043..4155)
> FT                   /note="MTCY190.03c, probable anthranilate
> FT                   phosphoribosyltransferase, trpD, len: 370, similar to eg
> FT                   SW:TRPD_LACCA P17170, (43.2% identity in 308 aa overlap),
> FT                   initiation codon uncertain, gtg at 4086 favoured by
> FT                   homology but this has no clear ribosome binding site"
> Does anyone know of a way of extracting this information from database
> entries and creating a list? Is there any software available that has
> this as one of its options, or would a shell script be needed?
You might try the FEATURES program from the XYLEM package, which was
described in:

Fristensky, B. (1993) Feature expressions: creating and manipulating
sequence datasets. Nucl. Acids Res. 21:5997-6003.
FEATURES is a program that can read GenBank Features Tables and
extract the corresponding sequences, Feature expressions, and
annotation lines. FEATURES is a Unix program, which can be run from
the command line, as a text-based interactive program, or from a GDE menu.
To see an example of how FEATURES works, and to retrieve the XYLEM package,
XYLEM can also be downloaded from directory 'psgendb' at ftp.cc.umanitoba.ca.
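
If a small script is preferred over a full package, the "/note" fields
can also be pulled out with a few lines of code. The following is only a
minimal sketch in Python and is not part of XYLEM or FEATURES. The
function name extract_notes and the sample text are illustrative
assumptions. It assumes EMBL-style "FT" lines, and that a /note value
contains no embedded double quotes.

```python
def extract_notes(embl_text):
    """Return a list of (location, note) pairs, one per /note qualifier."""
    results = []
    location = None    # location of the feature currently being read
    note_parts = None  # pieces of a /note value still being collected

    for line in embl_text.splitlines():
        if not line.startswith("FT"):
            continue  # only feature table lines are of interest
        body = line[2:].strip()

        if note_parts is not None:
            # Continuation line of a multi-line /note value.
            note_parts.append(body)
        elif body.startswith('/note="'):
            note_parts = [body[len('/note="'):]]
        elif body.startswith("/") or not body:
            continue  # some other qualifier, or an empty FT line
        else:
            fields = body.split(None, 1)
            if len(fields) == 2:
                # Feature key followed by its location.
                location = fields[1]
            elif location is not None:
                # A lone token is taken to be a wrapped location.
                location += fields[0]
            continue

        # The note is complete once a line ends with the closing quote.
        if note_parts[-1].endswith('"'):
            note_parts[-1] = note_parts[-1][:-1]
            results.append((location, " ".join(note_parts)))
            note_parts = None

    return results


# Sample input taken from the annotation quoted in the question.
SAMPLE = """\
FT   CDS_pept        complement(3043..4155)
FT                   /note="MTCY190.03c, probable anthranilate
FT                   phosphoribosyltransferase, trpD, len: 370, similar to eg
FT                   SW:TRPD_LACCA P17170, (43.2% identity in 308 aa overlap),
FT                   initiation codon uncertain, gtg at 4086 favoured by
FT                   homology but this has no clear ribosome binding site"
"""

for loc, note in extract_notes(SAMPLE):
    print(loc)
    print("    " + note)
```

Run over each cosmid entry in turn, this gives one location and note per
annotated feature, which can then be written out as a list for
circulation.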
Brian Fristensky                |
Department of Plant Science     |  Best advice I've heard in a long time:
University of Manitoba          |
Winnipeg, MB R3T 2N2 CANADA     |  "Don't confuse having a career with
frist at cc.umanitoba.ca        |   having a life."
Office phone: 204-474-6085      |
FAX: 204-261-5732               |