GCG7 to Paup translation? (A)

Doug.Eernisse at UM.CC.UMICH.EDU Doug.Eernisse at UM.CC.UMICH.EDU
Sun Oct 4 13:21:20 EST 1992


Jeremy John Ahouse asks:
>I need to find a program that will take me from the multiple sequence
> alignments that I can get out of GCG 7's pileup program to the format
> required by Paup on the Mac.
>
>         Do you all do this by hand?  Do you have a good/better multiple
> sequence alignment tool on the Mac?
 
My HyperCard 2.x stack, DNA Translator, does that conversion. Be
sure you get the most recent version (v. 1.0k6): the previous
version (1.0k5) was one of the first to support Pileup, and it had
one line of code out of place, introduced when I also added support
for input/output formats for Jotun Hein's (Unix) Treealign program.
Other multiple sequence alignment formats that can be converted to 
either Paup (Nexus) or simple named string format are simple 
"interleave" format, Eugene (or Prophet), ClustalV, and Phylip 
"outfile" format.
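 
For readers who want to see what the interleave-to-named-string step
amounts to, here is a minimal Python sketch. It is not the code used in
DNA Translator. It assumes a hypothetical, stripped-down interleaved
layout in which every data line is "name chunk chunk ..." and blocks are
separated by blank lines; real Pileup output also carries header lines
that would need to be skipped first. The function name
interleaved_to_named_strings is made up for this illustration.

```python
def interleaved_to_named_strings(text):
    """Join the pieces of each sequence across interleaved blocks.

    Assumption (not from any format specification): each data line is
    'name chunk [chunk ...]' and anything with fewer than two fields
    is ignored.
    """
    order = []   # names in order of first appearance
    chunks = {}  # name -> list of sequence pieces
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank line or a line without sequence data
        name, pieces = fields[0], fields[1:]
        if name not in chunks:
            order.append(name)
            chunks[name] = []
        chunks[name].extend(pieces)
    return [(name, "".join(chunks[name])) for name in order]


# Hypothetical two-block interleaved alignment.
example = """\
seq_a  ACTG ACTG
seq_b  ACCG ACT-

seq_a  GGTA
seq_b  GGTC
"""

for name, seq in interleaved_to_named_strings(example):
    print(name, seq)
```

Each printed line is then one record in the simple named string format.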
 
The most recent version of my stack is usually on my account:
ftp as "anonymous" (password "guest") to "um.cc.umich.edu"
then "cd gdef" and "get dnastack.hqx"
You will need to debinhex the file, as is usual for Mac file 
transfers, and this will result in a self-extracting archive.
On ftp.bio.indiana.edu you can find the same thing either
in /Incoming/DNAstacks.10k6.hqx or eventually in
/molbio/mac, but I do not upload current versions there as
frequently.
 
This also includes a manual sequence aligner stack called Aligner.
Perhaps the most useful features of Aligner are the various
nucleotide or amino acid coloring routines. These are not especially
fast but have options that I haven't seen in other packages.
For example, you can color nucleotides in triplets according to
their inferred amino acid coding. This, combined with the
ability to match the sequences with dashes where they are identical
to the first sequence, gives an informative view of a protein-
encoding DNA sequence alignment. You can also color according
to hydrophobicity, chemical, functional, etc. groupings of the
amino acids. Be sure to increase the memory partition of HyperCard
to a minimum of 3 MB if you want to do much of this.
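 
The dash-matching view mentioned above can be illustrated with a short
Python sketch. This is only an approximation of the idea, not Aligner's
own routine: the function name match_to_first and its match_char
parameter are hypothetical, and the sequences are assumed to be already
aligned to the same length. If your alignment also uses "-" for gaps,
passing a different match character such as "." avoids ambiguity.

```python
def match_to_first(seqs, match_char="-"):
    """Replace residues identical to the first sequence with match_char.

    The first sequence is returned unchanged; all sequences are assumed
    to be aligned and of equal length.
    """
    if not seqs:
        return []
    reference = seqs[0]
    result = [reference]
    for seq in seqs[1:]:
        if len(seq) != len(reference):
            raise ValueError("sequences must have the same aligned length")
        masked = "".join(
            match_char if a == b else b for a, b in zip(reference, seq)
        )
        result.append(masked)
    return result


# Hypothetical three-sequence alignment.
aligned = ["ACTGAC", "ACCGAC", "ACTGAT"]
for row in match_to_first(aligned):
    print(row)
```

Only the positions that differ from the first sequence remain visible,
which is what makes substitutions easy to spot in a coding region.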
The input format for your sequences is the "named string" format
resulting from the above-mentioned conversions:
 
name_1<space(s)>ACTG...<cr>
name_2<space(s)>ACCG...<cr>
etc.
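 
As a rough illustration of going from the named string format to
something Paup can read, the Python sketch below parses such lines and
writes a minimal Nexus data block. It is a sketch under stated
assumptions, not the conversion performed by DNA Translator: the
function names parse_named_strings and to_nexus are hypothetical, the
gap symbol is assumed to be "-", and a given Paup version may expect
additional or different format options.

```python
def parse_named_strings(text):
    """Parse 'name<space(s)>SEQUENCE' lines into (name, sequence) pairs."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        records.append((fields[0], "".join(fields[1:])))
    return records


def to_nexus(records, datatype="dna"):
    """Write a minimal Nexus data block (assumes '-' is the gap symbol)."""
    if not records:
        raise ValueError("no sequences given")
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("sequences must all have the same aligned length")
    nchar = lengths.pop()
    width = max(len(name) for name, _ in records)
    lines = [
        "#NEXUS",
        "begin data;",
        f"  dimensions ntax={len(records)} nchar={nchar};",
        f"  format datatype={datatype} gap=-;",
        "  matrix",
    ]
    # Pad names so the sequence columns line up.
    lines += [f"  {name.ljust(width)}  {seq}" for name, seq in records]
    lines += ["  ;", "end;"]
    return "\n".join(lines)


# Hypothetical input in the named string format.
named = "name_1  ACTG\nname_2  ACCG\n"
nexus_text = to_nexus(parse_named_strings(named))
print(nexus_text)
```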
 
A description of these stacks was published recently in CABIOS
8:177-184, but that was before the above-mentioned features were
added.
 
Doug Eernisse
Doug_Ee at um.cc.umich.edu
Museum of Zoology, Univ. of Michigan
Ann Arbor, MI 48109 USA
 
Almost forgot to mention: The stacks and many of the custom
external commands/functions employed come with "free for
noncommercial use only" restrictions.



