Readseq, multi-format sequence reader/writer, is updated
gilbertd at chipmunk.bio.indiana.edu
Wed Dec 30 15:39:37 EST 1992
* ReadSeq -- 30 Dec 92
* Reads and writes nucleic/protein sequences in various formats.
Readseq has been updated. There have been a number of enhancements
and a few bug corrections since the previous general release in Nov 91
(see below). If you are using earlier versions, I recommend you update to
Readseq is particularly useful as it automatically detects many
sequence formats, and interconverts among them.
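As a rough illustration of what automatic format detection can look like, the sketch below guesses a format from the first non-blank line of a file. It is not Readseq's actual detection logic, which this announcement does not describe. The function name guess_format is hypothetical, and only a few widely known leading markers are checked (">" for Pearson/Fasta, ">XX;" for NBRF, "LOCUS" for GenBank, "ID" for EMBL, "#NEXUS" for PAUP).

```python
def guess_format(text):
    """Guess a sequence format from the first non-blank line.

    Illustrative heuristic only; not the algorithm Readseq uses.
    Returned names follow the format list in the usage text.
    """
    for line in text.splitlines():
        if not line.strip():
            continue  # skip leading blank lines
        if line.startswith(">"):
            # NBRF headers look like ">P1;CODE"; plain ">" is Pearson/Fasta
            if len(line) > 3 and line[3] == ";":
                return "NBRF"
            return "Pearson/Fasta"
        if line.startswith("LOCUS"):
            return "GenBank/GB"
        if line.startswith("ID   "):
            return "EMBL"
        if line.upper().startswith("#NEXUS"):
            return "PAUP"
        # No known marker on the first content line: assume raw sequence
        return "Plain/Raw"
    return "Plain/Raw"  # empty input


if __name__ == "__main__":
    print(guess_format(">seq1 example\nACGTACGT\n"))
```

A real detector has to look at more than one line, since several formats share similar openings.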
Formats added to this release include
+ MSF multi sequence format used by GCG software
+ PAUP's multiple sequence (NEXUS) format
+ PIR/CODATA format used by PIR
+ ASN.1 format used by NCBI
+ Pretty print with various options for nice looking output.
As well, Phylip format can now be used as input. Options to
reverse-complement and to degap sequences have been added. A menu
addition for users of the GDE sequence editor is included.
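For readers unfamiliar with the two sequence operations, here is a minimal sketch of reverse-complementing and degapping a nucleotide string. It is an illustration in Python, not Readseq's C code. The function names are hypothetical, and the sketch assumes only the unambiguous bases A, C, G, T and U are complemented; every other character (ambiguity codes, gap symbols) is passed through unchanged.

```python
# Complement table for unambiguous bases only (an assumption of this sketch).
# U is treated as RNA uracil and complements to A.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide string."""
    return seq.translate(_COMPLEMENT)[::-1]


def degap(seq, gap="-"):
    """Remove every occurrence of the gap symbol from a sequence."""
    return seq.replace(gap, "")


if __name__ == "__main__":
    aligned = "AAG--C"
    print(degap(aligned))
    print(reverse_complement(degap(aligned)))
```

The gap argument defaults to "-", the same default symbol shown for the -degap option in the usage text.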
This program is available thru Internet gopher, as
browse into the IUBio-Software+Data/molbio/readseq/ folder
select the readseq.shar document
Or thru anonymous FTP in this manner:
my_computer> ftp ftp.bio.indiana.edu (or IP address 184.108.40.206)
password: my_username at my_computer
ftp> cd molbio/readseq
ftp> get readseq.shar
readseq.shar is a Unix shell archive of the readseq files.
This file can be edited by any text editor to reconstitute the
original files, for those who do not have a Unix system or an
Unshar program. Read the top of this .shar file for further
The brief usage of readseq is as follows.
readSeq (27Dec92), multi-format molbio sequence reader.
usage: readseq [-options] in.seq > out.seq
  -a[ll]                select All sequences
  -c[aselower]          change to lower case
  -C[ASEUPPER]          change to UPPER CASE
  -degap[=-]            remove gap symbols
  -i[tem=2,3,4]         select Item number(s) from several
  -l[ist]               List sequences only
  -o[utput=]out.seq     redirect Output
  -p[ipe]               Pipe (command line, <stdin, >stdout)
  -r[everse]            change to Reverse-complement
  -v[erbose]            Verbose progress
  -f[ormat=]#           Format number for output, or
  -f[ormat=]Name        Format name for output:
      1. IG/Stanford        10. Olsen (in-only)
      2. GenBank/GB         11. Phylip3.2
      3. NBRF               12. Phylip
      4. EMBL               13. Plain/Raw
      5. GCG                14. PIR/CODATA
      6. DNAStrider         15. MSF
      7. Fitch              16. ASN.1
      8. Pearson/Fasta      17. PAUP
      9. Zuker              18. Pretty (out-only)

Pretty format options:
  -wid[th]=#                  sequence line width
  -tab=#                      left indent
  -col[space]=#               column space within sequence line on output
  -gap[count]                 count gap chars in sequence numbers
  -nameleft, -nameright[=#]   name on left/right side [=max width]
  -nametop                    name at top/bottom
  -numleft, -numright         seq index on left/right side
  -numtop, -numbot            index on top/bottom
  -match[=.]                  use match base for 2..n species
  -inter[line=#]              blank line(s) between sequence blocks
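The usage text above can be turned into a concrete invocation. The sketch below only assembles an argument list from options that appear in that listing (-all, -reverse, -degap, -format=, -output=). The helper name build_readseq_command and the file names are hypothetical, and the long option spellings are an assumption based on the bracket notation in the usage text; the command is not executed here.

```python
import shlex


def build_readseq_command(in_path, out_path, fmt,
                          all_seqs=True, reverse=False, degap=False):
    """Build a readseq argument list; fmt is a format number or name."""
    cmd = ["readseq"]
    if all_seqs:
        cmd.append("-all")        # select all sequences in the input
    if reverse:
        cmd.append("-reverse")    # reverse-complement each sequence
    if degap:
        cmd.append("-degap")      # strip gap symbols
    cmd.append(f"-format={fmt}")  # output format number or name
    cmd.append(f"-output={out_path}")
    cmd.append(in_path)           # options come first, input file last
    return cmd


if __name__ == "__main__":
    # Hypothetical file names; 8 is Pearson/Fasta in the format list.
    cmd = build_readseq_command("in.seq", "out.fasta", 8, degap=True)
    print(shlex.join(cmd))
```

Passing the resulting list to subprocess.run would execute it on a machine where a readseq binary is installed and on the PATH.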
* Copyright 1990 by d.g.gilbert
* biology dept., indiana university, bloomington, in 47405
* e-mail: gilbertd at bio.indiana.edu
* This program may be freely copied and used by anyone.
* Developers are encouraged to incorporate parts in their
* programs, rather than devise their own private sequence
* This should compile and run with any ANSI C compiler.
* Please advise me of any bugs, additions or corrections.
Don Gilbert gilbert at bio.indiana.edu
biocomputing office, biology dept., indiana univ., bloomington, in 47405