readseq & MSF format problem - here's the patch.

L.H. Bell lhb at s-crim1.dl.ac.uk
Mon Oct 10 10:50:07 EST 1994


-- 

Dear All,

	I wrote to this group some time ago about a problem I had
found with readseq's reformatting of sequences into MSF format. 

The problem is that the number of lines readseq writes in MSF format
is determined by the last sequence read, so if that sequence is shorter
than the others, the longer ones are truncated - an example is given
below.

The problem occurs in the 1Feb93 version of readseq, whether it is run
interactively or from the command line as

'readseq input -pipe -all -form=msf > output'.

I posted to this newsgroup but nobody had a fix, so I e-mailed Don
Gilbert, the author of readseq, who has recently got back to me with a
patch that fixes the problem. The patch is enclosed at the end of this
message.

Hope this helps,

Lachlan Bell
------
L H Bell   *    Phone: 0925 603492	* 	e-mail: L.Bell at daresbury.ac.uk 
SEQNET




------------------------------ Input file  ------------------------------
>DL;New
New     100 bp     DNA               25-SEP-1994, 100 bases, 1DFC393E checksum.
 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
 tttttttttt tttttttttt tttttttttt tttttttttt tttttttttt*
>DL;New2
New2      41 bp     DNA               25-SEP-1994, 41 bases, 77C5E16C checksum.
 cccccccccc cccccccccc cccccccccc cccccccccc c*

------------------------------ Output file  ------------------------------
 a.nbrf  MSF: 41  Type: N  January 01, 1776  12:00  Check: 6622 ..

 Name: New              Len:   100  Check:  8935  Weight:  1.00
 Name: New2             Len:    41  Check:  7687  Weight:  1.00

//

            New  aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
           New2  cccccccccc cccccccccc cccccccccc cccccccccc c

------------------------------             ------------------------------

------------------------------    PATCH    ------------------------------
sunflower% diff readseq.c.old  readseq.c
*** 713,719 ****
  {
  boolean   closein = false;
  short     ifile, nseq, atseq, format, err = 0, seqtype = kDNA,
!           nlines, seqout = 0, phylvers = 2;
  long      i, skiplines, seqlen, seqlen0;
  unsigned long  checksum= 0, checkall= 0;
  char      *seq, *cp, *firstseq = NULL, *seqlist, *progname, tempname[256];
--- 713,719 ----
  {
  boolean   closein = false;
  short     ifile, nseq, atseq, format, err = 0, seqtype = kDNA,
!           nlines, maxlines=0, seqout = 0, phylvers = 2;
  long      i, skiplines, seqlen, seqlen0;
  unsigned long  checksum= 0, checkall= 0;
  char      *seq, *cp, *firstseq = NULL, *seqlist, *progname, tempname[256];
***************
*** 1031,1036 ****
--- 1031,1037 ----

            indexout();
            nlines = writeSeq( fout, seq, seqlen, outform, seqidptr);
+         if (nlines>maxlines) maxlines=nlines;
            seqout++;
            }

***************
*** 1095,1101 ****

      indexout();  noutindex--; /* mark eof */

!     for (leaf=0; leaf<nlines; leaf++) {
        if (outform == kMSF && leaf == 1) {
          fputs("//\n\n", foo);
          }
--- 1096,1102 ----

      indexout();  noutindex--; /* mark eof */

!     for (leaf=0; leaf<maxlines; leaf++) {
        if (outform == kMSF && leaf == 1) {
          fputs("//\n\n", foo);
          }









