Fasta sequence input?

James Bonfield jkb at mrc-lmb.cam.ac.uk
Wed Jan 29 11:52:27 EST 1997


Michele Clamp wrote:

>Is there any way I can input fasta format sequence into gap4 for 
>assembling? I haven't got any trace files but I would like to try 
>and assemble using just the sequence.

We don't have a standard program for reading fasta format files. If your
sequence input is a single fasta file containing multiple sequences then you
could use a simple script to split them up. The following should work:

-----------------------------------------------------------------------------
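#!/usr/bin/nawk -f
# Split a multi-sequence fasta file into one file per sequence.
# Each output file is named after the first word of its ">" header line.
BEGIN {
    file = "";
}

# Skip ";" comment lines.
/^;/ {
    next;
}

# Header line: close the previous output file and start a new one.
/^>/ {
    sub(/^>/, "");
    if (file != "") {
        close(file);
    }
    file = $1;
    print "Creating", $1;
    next;
}

# Sequence line: append to the current file (ignored before the first header).
file != "" {
    print $0 >> file;
}

END {
    if (file != "") {
        close(file);
    }
}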
-----------------------------------------------------------------------------

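This is a 'new awk' script. You may need to adjust that first line to awk,
gawk, or mawk (or whatever is available on your system). The script is rather
simplistic: it writes one file per sequence into the current directory, named
after the first word of each header line, and appends to any file of that name
that already exists. The best strategy is therefore to make a new directory
first.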

E.g., with the script saved as an executable file named fasta-split somewhere
on your path:

mkdir split
cd split
fasta-split < ../fasta_seqs
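
As a rough end-to-end sketch of the steps above, the following shell session
builds a tiny two-sequence input and splits it. It makes several assumptions:
the file names fasta_seqs and fasta-split, the sequence names seq1 and seq2,
and the temporary directory are illustrative only; the awk program is a
condensed copy of the splitting logic rather than the full script; and
"mktemp -d" is assumed to be available on your system. Running the script
through "awk -f" avoids having to edit the interpreter line.

```shell
#!/bin/sh
# Demo: split a small two-sequence fasta file using plain "awk -f".
# All file and directory names here are illustrative assumptions.
set -e
demo_dir=$(mktemp -d)
cd "$demo_dir"

# A tiny example input containing two sequences.
cat > fasta_seqs <<'EOF'
>seq1 first example
ACGTACGT
GGCC
>seq2 second example
TTTTAAAA
EOF

# A condensed copy of the splitting logic, saved as fasta-split.
cat > fasta-split <<'EOF'
/^;/ { next }
/^>/ { sub(/^>/, ""); if (file != "") close(file); file = $1; print "Creating", $1; next }
file != "" { print $0 >> file }
EOF

# Work in a fresh directory so the output files cannot collide with anything.
mkdir split
cd split

# Run via "awk -f" so the interpreter line does not need editing.
awk -f ../fasta-split < ../fasta_seqs

# List the per-sequence files that were created.
ls
```

Each output file holds only the sequence lines; the header line itself is not
copied, which matches the behaviour of the script above.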

I hope this helps.

	James
-- 
James Bonfield (jkb at mrc-lmb.cam.ac.uk)   Tel: 01223 402499   Fax: 01223 213556
Medical Research Council - Laboratory of Molecular Biology,
Hills Road, Cambridge, CB2 2QH, England.
Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/


