Translating from nucleic acid to amino acid.
John Powell
jip at helix.nih.gov
Thu Mar 2 16:14:59 EST 1995
We have placed a modified version of the GDE translate program (for
translating from nucleic acid to amino acid) on our anonymous FTP site.
The original code has been modified to handle packed FASTA-formatted input
files. We also "purified" the code; there are no known memory leaks. The
program processes the input one sequence at a time.
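Reading one sequence at a time is what keeps memory use flat on a large packed (multi-sequence) FASTA file. As a rough illustration of that streaming pattern only, and not the C code of the modified program, here is a minimal Python sketch. The helper name read_fasta is hypothetical, and the sketch assumes each record starts with a ">" header line.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs one record at a time.

    Only the current record is held in memory, so the size of the
    input does not matter. read_fasta is a hypothetical helper for
    illustration; it is not part of the translate program.
    """
    header = None
    chunks = []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None and line.strip():
            # Sequence data may be wrapped over several lines.
            chunks.append(line.strip())
    # Emit the final record at end of input.
    if header is not None:
        yield header, "".join(chunks)


if __name__ == "__main__":
    demo = io.StringIO(">seq1 first example\nATGGCC\nTAA\n>seq2\nGGGCCC\n")
    for name, seq in read_fasta(demo):
        print(name, len(seq))
```

Because read_fasta is a generator, the same loop works on a file or on standard input from a pipe.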
We did a 6-frame translation of gbest - all 86K+ sequences in one pass on
a SPARC2 with 65M swap space with the following command:
zcat gbest.seq.Z | gb2fasta - | translate -tbl 1 -frame 6 - | compress -c > gbest_6f.seq.Z
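For readers unfamiliar with the term, a 6-frame translation reads the sequence in three codon offsets on the given strand and three on the reverse complement. The Python sketch below illustrates the idea; it is not the program's actual implementation. Several points are assumptions: that frames 4 to 6 are the reverse-complement frames, that stop codons print as "*", that codons containing ambiguous bases print as "X", and that the "Universal" table is the standard genetic code. The names translate_frame and six_frames are hypothetical. The ".framenumber" suffix on the sequence name follows the man page.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate_frame(seq, offset):
    """Translate seq starting at a 0-based offset; incomplete codons are dropped."""
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    # Assumption: codons with ambiguous bases are reported as 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frames(name, seq):
    """Yield ("name.frame", protein) for frames 1 to 6.

    Assumption: frames 1-3 read the given strand and frames 4-6 read
    the reverse complement; the real program may number them differently.
    """
    seq = seq.upper().replace("U", "T")  # treat RNA input as DNA
    reverse = seq.translate(COMPLEMENT)[::-1]
    for frame in range(1, 7):
        strand = seq if frame <= 3 else reverse
        yield f"{name}.{frame}", translate_frame(strand, (frame - 1) % 3)


if __name__ == "__main__":
    for label, protein in six_frames("demo", "ATGGCCTAA"):
        print(label, protein)
```

Under those assumptions, the demo input ATGGCCTAA gives MA* for demo.1 and LGH for demo.4.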
The program is available via anonymous ftp from milo.dcrt.nih.gov
(128.231.129.60) under pub/translate as the compressed tar file
translate.tar.Z.
The man page for the modified program follows:
------------------------------------------------------------------------------
TRANSLATE(1) USER COMMANDS TRANSLATE(1)
NAME
translate - translates from nucleic acid to amino acid
SYNOPSIS
translate [-tbl codon_table] [-frame #] [-min_frame #]
[-3] [-gde] [-noc] infile|-
translate [-h[elp]]
DESCRIPTION
The translate program translates the selected sequences from
DNA/RNA to amino acid. Translate can be used with sequences
in either packed FASTA or GDE format. Output is written to
standard output.
Note that the frame number is appended to the sequence name
as ".framenumber".
OPTIONS
[-tbl codon_table]
stop codon table to use:
1 = Universal 2 = Mycoplasma
3 = Yeast 4 = Vert. mito.
Default is Universal.
[-frame #] Nucleic acid "frame" to translate:
1 = first frame 2 = second frame
3 = third frame 6 = all six frames
Defaults to all six frames.
[-min_frame #]
minimum open reading frame length (i.e. the shortest
amino acid sequence to translate). Default is zero (no
minimum).
[-3] use triple letter codes. Default is single
letter codes.
[-gde] input sequence is in GDE format. Default is
FASTA format. GDE format with '#' or '%' in the
first line is not recognized.
[-noc] do not include the sequence description/comments
from the first line of sequence in the output.
Useful only with FASTA format.
infile|- input sequence can be either a packed FASTA
sequence file or can be taken from standard
input (-) through a pipe, or a GDE format file
with -gde switch.
Keywords:
--
--------
John Powell phone: (301) 496-2963
Building 12A, Room 2033 FAX: (301) 402-2867
National Institutes of Health
Bethesda, MD 20892 Internet: jip at helix.nih.gov