Consed, sequence assemble

Tim Cutts timc at chiark.greenend.org.uk
Sun Oct 17 04:20:07 EST 1999


In article <7u5hg0$g1l$1 at news.tamu.edu>, Mei <hmpeng at ppserver.tamu.edu> wrote:
>Hello,
>
>I am interested in assembling some of the EST sequences that I have downloaded
>from Entrez.  So far, I have been using the “csplit” command in Unix, then a Perl
>script to rename the files, and finally a shell script to generate fake phd
>files for Consed.  This approach works well if I have fewer than 100
>sequences, because csplit only splits up to 99 files.  I’d like to know how
>to split and rename the FASTA file according to the gi numbers in the
>definition lines when I have a large number of sequences to assemble.  A hint
>on how to write a Perl script for this purpose would be greatly appreciated.

Incidentally, the GNU version of split does not have these limitations.
It's part of the GNU textutils package, I think.
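
As for the script asked about above, here is a minimal sketch of the
split-and-rename step, written in Python rather than Perl.  It assumes that
each definition line starts with ">gi|NUMBER|", which may not hold for every
download; records without a gi number fall back to a made-up name.  The
function name split_fasta_by_gi and the ".fasta" suffix are only
illustrative choices, not anything Consed requires.

```python
import re
from pathlib import Path

# Assumption: definition lines look like ">gi|1234567|gb|AA000001|...".
GI_PATTERN = re.compile(r"^>gi\|(\d+)")


def split_fasta_by_gi(fasta_path, out_dir):
    """Write each FASTA record to its own file, named after its gi number.

    Returns the list of files written, in input order.
    Note: two records with the same gi number would overwrite each other.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    out = None
    record_index = 0

    with open(fasta_path) as fasta:
        for line in fasta:
            if line.startswith(">"):
                # A new definition line: close the previous record's file.
                if out is not None:
                    out.close()
                record_index += 1
                match = GI_PATTERN.match(line)
                # Fallback name for headers that carry no gi number.
                name = match.group(1) if match else f"record{record_index}"
                path = out_dir / f"{name}.fasta"
                out = open(path, "w")
                written.append(path)
            # Lines before the first ">" are skipped; everything else
            # (header and sequence lines) goes to the current record's file.
            if out is not None:
                out.write(line)

    if out is not None:
        out.close()
    return written
```

Calling split_fasta_by_gi("ests.fasta", "split") would leave one file per
sequence in the "split" directory, with no fixed limit on the number of
files; "ests.fasta" is a placeholder name for the Entrez download.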

Tim.




