Readseq, multi-format sequence reader/writer, is updated

Don Gilbert gilbertd at chipmunk.bio.indiana.edu
Wed Dec 30 15:39:37 EST 1992


 * ReadSeq  -- 30 Dec 92
 * Reads and writes nucleic/protein sequences in various formats. 

Readseq has been updated.  This release includes a number of enhancements
and a few bug fixes since the previous general release in Nov 91
(see below).  If you are using an earlier version, I recommend you update to
this release.

Readseq is particularly useful as it automatically detects many
sequence formats, and interconverts among them.
Formats added to this release include
  + MSF multi sequence format used by GCG software
  + PAUP's multiple sequence (NEXUS) format
  + PIR/CODATA format used by PIR
  + ASN.1 format used by NCBI
  + Pretty print with various options for nice looking output.
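
To give a feel for what automatic format detection involves, here is a minimal
Python sketch that guesses a format from the first non-blank line of a file.
It is not readseq's own detection code (readseq is written in C and handles
many more formats); the function name guess_format and the particular checks
are assumptions chosen for illustration, based only on the well-known leading
markers of a few formats.

```python
def guess_format(text):
    """Guess a sequence format from the first non-blank line of `text`.

    Illustrative heuristics only; NOT the logic readseq itself uses.
    """
    for line in text.splitlines():
        if not line.strip():
            continue  # skip leading blank lines
        if line.upper().startswith("#NEXUS"):
            return "PAUP"            # NEXUS files open with a #NEXUS token
        if line.startswith("LOCUS"):
            return "GenBank"         # GenBank entries open with a LOCUS line
        if line.startswith("ID   "):
            return "EMBL"            # EMBL entries open with an ID line
        if line.startswith(">") and len(line) > 3 and line[3] == ";":
            return "NBRF"            # e.g. ">P1;" or ">DL;" followed by an id
        if line.startswith(">"):
            return "Pearson/Fasta"   # plain ">" header line
        if line.startswith(";"):
            return "IG/Stanford"     # IG entries open with ";" comment lines
        return "unknown"
    return "unknown"


if __name__ == "__main__":
    print(guess_format(">seq1 example\nACGTACGT\n"))
```

A real detector has to look further than one line, since several formats share
the same opening character.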

In addition, Phylip format can now be used as input.  Options to
reverse-complement and to degap sequences have been added.  A menu
addition for users of the GDE sequence editor is included.
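
For readers unfamiliar with the two new sequence operations, the following
Python sketch shows what "degap" and "reverse-complement" mean for a
nucleotide string.  It is an illustration under simple assumptions, not
readseq's implementation: the function names are hypothetical, only the bases
A, C, G, T and U are complemented, and any other character (ambiguity codes,
gap symbols) is passed through unchanged.

```python
# Translation table for the plain bases; U is treated like T, so RNA input
# comes back written with DNA letters.  Other characters are left as they are.
COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def degap(seq, gap="-"):
    """Remove every occurrence of the gap symbol from the sequence."""
    return seq.replace(gap, "")


def reverse_complement(seq):
    """Complement each base, then reverse the order of the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    aligned = "AT-GC-"
    print(degap(aligned))                      # gaps removed
    print(reverse_complement(degap(aligned)))  # degapped, then reversed
```

Removing gaps first, as in the last line, keeps alignment padding out of the
reversed result.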

This program is available thru Internet gopher, as

  gopher ftp.bio.indiana.edu
  browse into the IUBio-Software+Data/molbio/readseq/ folder
  select the readseq.shar document

Or thru anonymous FTP in this manner:
  my_computer> ftp  ftp.bio.indiana.edu  (or IP address 129.79.224.25)
    username:  anonymous
    password:  my_username at my_computer
  ftp> cd molbio/readseq
  ftp> get readseq.shar
  ftp> bye

readseq.shar is a Unix shell archive of the readseq files.
This file can be edited by any text editor to reconstitute the
original files, for those who do not have a Unix system or an
Unshar program.  Read the top of this .shar file for further
instructions.

The brief usage of readseq is as follows.

readSeq (27Dec92), multi-format molbio sequence reader.
usage: readseq [-options] in.seq > out.seq
 options
    -a[ll]         select All sequences
    -c[aselower]   change to lower case
    -C[ASEUPPER]   change to UPPER CASE
    -degap[=-]     remove gap symbols
    -i[tem=2,3,4]  select Item number(s) from several
    -l[ist]        List sequences only
    -o[utput=]out.seq  redirect Output
    -p[ipe]        Pipe (command line, <stdin, >stdout)
    -r[everse]     change to Reverse-complement
    -v[erbose]     Verbose progress
    -f[ormat=]#    Format number for output,  or
    -f[ormat=]Name Format name for output:
         1. IG/Stanford           10. Olsen (in-only)
         2. GenBank/GB            11. Phylip3.2
         3. NBRF                  12. Phylip
         4. EMBL                  13. Plain/Raw
         5. GCG                   14. PIR/CODATA
         6. DNAStrider            15. MSF
         7. Fitch                 16. ASN.1
         8. Pearson/Fasta         17. PAUP
         9. Zuker                 18. Pretty (out-only)

   Pretty format options:
    -wid[th]=#            sequence line width
    -tab=#                left indent
    -col[space]=#         column space within sequence line on output
    -gap[count]           count gap chars in sequence numbers
    -nameleft, -nameright[=#]   name on left/right side [=max width]
    -nametop              name at top/bottom
    -numleft, -numright   seq index on left/right side
    -numtop, -numbot      index on top/bottom
    -match[=.]            use match base for 2..n species
    -inter[line=#]        blank line(s) between sequence blocks

 * Copyright 1990 by d.g.gilbert
 * biology dept., indiana university, bloomington, in 47405
 * e-mail: gilbertd at bio.indiana.edu
 *
 * This program may be freely copied and used by anyone.
 * Developers are encouraged to incorporate parts in their
 * programs, rather than devise their own private sequence
 * format.
 *
 * This should compile and run with any ANSI C compiler.
 * Please advise me of any bugs, additions or corrections.
-- 
Don Gilbert                                     gilbert at bio.indiana.edu
biocomputing office, biology dept., indiana univ., bloomington, in 47405



