Readseq and GCG

B.L.Cohen gbga13 at udcf.gla.ac.uk
Wed Nov 15 08:19:55 EST 1995


The question has been answered correctly below, as I know to my cost: I have
run into this problem on at least two separate occasions and am now
permanently bald on top!
B


In article <199511132241.OAA10959 at socrates.ucsf.EDU>,
postmaster at CGL.UCSF.EDU (Mail Delivery Subsystem, by way of
lonetto at cgl.ucsf.edu (Michael A. Lonetto)) wrote:

> Sorry for posting this, mail to the sender bounced:
> 
> Sequences not recognized because they are not quite right.
> 
> Two problems:  1) GCG uses "." for gaps, this file has "-".  This can
> be fixed with a text editor search/replace.
> 
> 2) Checksum problem.  This can be fixed by running the file through
> "reformat -msf gde842_9.msf{*}"
> 
> Note that if you run reformat (at least through V8.0) without first
> changing the gap character it will remove all your gaps (probably
> not what you want).
> 
> The third problem is that ReadSeq saves all MSF files as nucleotide, so
> protein files have to have their type changed manually (not a problem here).
> 
> 
> >Does anyone know why MSF files formatted by readseq are not recognized by GCG
> >Version 8.1?  (I don't know whether they were recognized by earlier versions).
> >
> >The file test.msf as written by readseq (through gde) is as follows:
> >
> > gde842_9  MSF: 100  Type: N  January 01, 1776  12:00  Check: 9397 ..
> >
> > Name: test1            Len:   100  Check:  1581  Weight:  1.00
> > Name: test2            Len:   100  Check:  6389  Weight:  1.00
> > Name: test3            Len:   100  Check:  1427  Weight:  1.00
> >
> >//
> >
> >          test1  AAACGATGCA CATATGTATT GTGCTCTAGA TACAGCATCA ---AGCTCTA
> >          test2  AAATGATGCA CACATGTACT GTGCTTTAGA TACAGCACAA CAGAGTGCTA
> >          test3  AAAAAGTGGT GCGGAATCTC TGGCAGCTAT TACCCGCGAC GCTAACATTA
> >
> >          test1  CTGCAGGAGC AACT------ ACATCTGTTA TGGTAAAAAA TGAAAATTTA
> >          test2  CTAATGGTGC AACATTAGCT TCATCTGTTA TGATAAAAAA TGAAAATTTA
> >          test3  CTGAG----- -------ACC AATTACTTCG TAGTCAAAAT TGAGAAATTA
> >
> >
> >If I try to use "distances" on this file I get:
> >
> > *** ERROR, bad sequence format in test.msf ! ***
> > *** No files in test.msf ! ***
> >
> >Thanks in advance to anyone who can help.
> >
> >--
> >---
> >Basil Allsopp                       |  E-mail   basil at ovisun.ovi.ac.za
> >Onderstepoort Veterinary Institute  |  Phone    +27 12 5299385
> >Onderstepoort 0110, South Africa    |  Fax      +27 12 5299431
> 
> -------------=-=-=-=-=-=-=-------------=-=-=-=-=-=-=-------------
> Michael Lonetto ** 415-476-1493 ** lonetto at cgl.ucsf.edu
> UCSF Depts. of Stomatology and Micro.,San Francisco,CA 94143-0512
> =-=-=-=-=-=-=----- http://terminator.ucsf.edu/ -----=-=-=-=-=-=-=
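
For anyone who would rather script the text-editor step than do it by hand,
here is a minimal sketch in Python. It assumes the file looks like the example
quoted above: a header, a "//" divider, then alignment lines made up of a
sequence name followed by the sequence blocks. The function name
fix_msf_text and its parameters are my own invention, not part of ReadSeq or
GCG. It only touches the gap characters after the divider (and, optionally,
the "Type:" field in the header); it does not recompute any checksums, so the
"reformat -msf" step described in the quoted reply is still needed afterwards.

```python
import re


def fix_msf_text(text, gap_from="-", gap_to=".", seq_type=None):
    """Return MSF text with gap characters swapped in the alignment body.

    Hypothetical helper, assuming the layout shown in the quoted example:
    header lines, a line containing only "//", then alignment lines.
    If seq_type is given (e.g. "P"), the "Type:" field on the header line
    that contains "MSF:" is rewritten as well. Checksums are left alone.
    """
    out = []
    in_alignment = False
    for line in text.splitlines():
        if not in_alignment:
            if line.strip() == "//":
                # Everything after the divider is alignment data.
                in_alignment = True
            elif seq_type and "MSF:" in line:
                line = re.sub(
                    r"(Type:\s*)\S+",
                    lambda m: m.group(1) + seq_type,
                    line,
                )
        else:
            # Keep the indent and sequence name untouched, so a name that
            # happens to contain the gap character is not altered.
            m = re.match(r"^(\s*\S+)(.*)$", line)
            if m:
                line = m.group(1) + m.group(2).replace(gap_from, gap_to)
        out.append(line)
    return "\n".join(out) + "\n"
```

To use it, read the ReadSeq output into a string, pass it through
fix_msf_text (with seq_type="P" for a protein alignment), write the result
back out, and then run reformat on the new file as described above.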

-- 
Bernie Cohen                   Phone (+44) (0)141 339 8855 ext. 5103/5101
Molecular Genetics              Fax               330 5994
University of Glasgow
56 Dumbarton Rd,
Glasgow G11 6NU
Scotland, UK.



