large DNA sequences --> smaller overlapping sequences
walker at ncbi.nlm.nih.gov
Thu Jun 11 09:48:56 EST 1998
> Does anybody know of a program to split up a large DNA-sequence file
> (4 Mb) into smaller files/sequences of 200 kb with 10 kb overlap?
The SEALS package
contains a number of small utilities for common sequence
manipulations such as this one.
If your sequence is in FASTA format in a file called 'chromosome.fa',
you could split it with this command:
fenestrate chromosome.fa -window= 200_000 -overlap=10_000
To save each subsequence in a separate file, you might try:
fenestrate chromosome.fa -window= 200_000 -overlap=10_000 | \
shatter -word= -5
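
If SEALS is not at hand, the same windowing can be sketched in a few
lines of Python. This is only an illustration, not how fenestrate or
shatter work internally: the function names, the output file naming
and the header format are all made up here, and it assumes a FASTA
file holding a single record and that "overlap" means consecutive
windows share that many bases (so each window starts window - overlap
bases after the previous one).

```python
def windows(seq, window=200_000, overlap=10_000):
    """Yield (start, end, subsequence) for consecutive overlapping windows.

    Assumption: each window starts (window - overlap) positions after the
    previous one, so neighbours share `overlap` bases. Coordinates are
    0-based, end-exclusive.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must be >= 0 and smaller than window")
    if not seq:
        return
    step = window - overlap
    start = 0
    while True:
        end = min(start + window, len(seq))
        yield start, end, seq[start:end]
        if end >= len(seq):  # last window reached the end of the sequence
            break
        start += step


def read_single_fasta(path):
    """Return (header, sequence) from a FASTA file with exactly one record."""
    header, parts = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    raise ValueError("expected a single FASTA record")
                header = line[1:]
            elif line:
                parts.append(line)
    if header is None:
        raise ValueError("no FASTA header found")
    return header, "".join(parts)


def split_fasta(path, prefix, window=200_000, overlap=10_000, width=60):
    """Write each window to its own file (hypothetical naming: prefix_001.fa)."""
    header, seq = read_single_fasta(path)
    names = []
    for i, (start, end, sub) in enumerate(windows(seq, window, overlap), 1):
        name = f"{prefix}_{i:03d}.fa"
        with open(name, "w") as out:
            # Report 1-based, inclusive coordinates in the header line.
            out.write(f">{header} {start + 1}-{end}\n")
            for j in range(0, len(sub), width):
                out.write(sub[j:j + width] + "\n")
        names.append(name)
    return names
```

Calling split_fasta('chromosome.fa', 'chunk') would then write
chunk_001.fa, chunk_002.fa and so on to the current directory. The
whole sequence is read into memory, which should be no problem for a
4 Mb file.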