Importing sequences

James Bonfield jkb at mrc-lmb.cam.ac.uk
Thu Dec 3 13:16:04 EST 1998


In article <36663A9C.29A06C4D at mailer.uni-marburg.de> Andreas Doll <doll at mailer.uni-marburg.de> writes:
>How can I import sequences in FASTA format into the Staden package?
>Is there a possibility to convert it into an SCF file?

We cannot directly load FASTA files (although it would appear to be a most
obvious thing to add!). We can load plain text files though, with no
header at all. So just strip off the ">id" header line from the FASTA file.
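
For a file holding a single entry, the header can be dropped with a
one-line filter. This is only a minimal sketch, assuming a POSIX shell
with awk available; the file names seq.fasta and seq.txt are placeholders
and the sequence data is made up:

```shell
# Create a small single-entry FASTA file to work with (example data only).
printf '>seq1 example entry\nACGTACGTAC\nGGTTAACC\n' > seq.fasta

# Keep every line that does not start with ">" (header) or ";" (comment),
# leaving plain sequence text with no header at all.
awk '!/^[>;]/' seq.fasta > seq.txt
```

The resulting seq.txt contains just the sequence lines.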

If you have one single FASTA file with lots of entries, then try using
the fasta2exp script, held in $STADENROOT/src/scripts. It writes each
entry to its own Experiment file, named after the entry's ID with ".exp"
appended. If you don't have a copy of it, here it is.

	James


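#!/usr/bin/nawk -f
# fasta2exp: split a multi-entry FASTA file into one Experiment file per
# entry. Each file is named <id>.exp, where <id> is the first word of the
# entry's header line.
BEGIN {
    file="";
}

# Skip ";" comment lines.
/^;/ {
    next;
}

# Header line: terminate the previous entry, then start a new file.
/^>/ {
    # Remove only the leading ">", not any ">" later in the header.
    sub(/^>/, "");
    if (file != "") {
        print "//" >> file;
        close(file);
    }
    file=$1".exp";
    print "ID   "$1 > file;
    print "SQ" >> file;
    print "Creating", $1;
    next;
}

# Sequence line: written only once a header has been seen.
file != "" {
    print "     "$0 >> file;
}

# Terminate the last entry, if the input contained any.
END {
    if (file != "") {
        print "//" >> file;
        close(file);
    }
}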
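
Because the script writes one file per entry, named after the first word
of each header, it can help to preview what it will create before running
it. A rough sketch, assuming a POSIX shell with awk; multi.fasta and
expected.txt are placeholder names and the data is made up:

```shell
# Example multi-entry FASTA file (made-up data).
printf '>seq1 first entry\nACGTACGT\n>seq2 second entry\nGGCCTTAA\n' > multi.fasta

# List the Experiment file names the script would create, one per entry:
# strip the leading ">" and take the first word of each header line.
awk '/^>/ { sub(/^>/, ""); print $1 ".exp" }' multi.fasta > expected.txt
cat expected.txt
```

With the script saved as fasta2exp, it can then be run as
"nawk -f fasta2exp multi.fasta" (or with awk where nawk is not installed).
Note that two entries sharing the same ID would map to the same file name.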
--
James Bonfield (jkb at mrc-lmb.cam.ac.uk)   Tel: 01223 402499   Fax: 01223 213556
Medical Research Council - Laboratory of Molecular Biology,
Hills Road, Cambridge, CB2 2QH, England.
Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/


