Theobroma cacao

Marta Matvienko marta at
Mon Aug 2 10:53:21 EST 2004

A new public project is available at our website.

Cacao ESTs from the GenBank collection were processed using the PyMood
Sequence Processor:

New cacao FASTA files and cacao tab-delimited files containing sequence
and data information on good, bad, and masked sequences are available
for download from our website at:

The PyMood Sequence Processor function removes and masks undesired
sequences (low-quality, vector, contaminant, specific motifs, etc.) in
FASTA files.  It sorts DNA and protein sequences according to:

     1. Their level of homology to reference sequence files. The
reference FASTA file can contain vector sequences, repeats, primers, or
any other undesired motifs (nucleotide or protein) that need to be
excluded from subsequent analyses.
     2. The sequence composition: length, percentage of "N" letters, and
"GC" content.
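
The composition criteria in item 2 can be illustrated with a minimal
Python sketch. This is not PyMood's code or its interface: the function
names (parse_fasta, composition, sort_by_composition) and the default
thresholds are hypothetical, chosen only to show how length, percentage
of "N" letters, and "GC" content might be used to separate good
sequences from bad ones.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def composition(seq):
    """Return length, percentage of N, and percentage of G+C for a sequence.

    Percentages are computed over the full length, N letters included
    (an assumption of this sketch).
    """
    seq = seq.upper()
    length = len(seq)
    if length == 0:
        return {"length": 0, "pct_n": 0.0, "pct_gc": 0.0}
    n_count = seq.count("N")
    gc_count = seq.count("G") + seq.count("C")
    return {
        "length": length,
        "pct_n": 100.0 * n_count / length,
        "pct_gc": 100.0 * gc_count / length,
    }


def sort_by_composition(records, min_length=100, max_pct_n=5.0,
                        gc_range=(20.0, 80.0)):
    """Split records into (good, bad) lists using hypothetical thresholds."""
    good, bad = [], []
    for header, seq in records:
        stats = composition(seq)
        ok = (
            stats["length"] >= min_length
            and stats["pct_n"] <= max_pct_n
            and gc_range[0] <= stats["pct_gc"] <= gc_range[1]
        )
        (good if ok else bad).append((header, stats))
    return good, bad
```

For example, sort_by_composition(parse_fasta(open("ests.fasta").read()))
would return two lists of (header, statistics) pairs, where "ests.fasta"
is a placeholder file name. Sorting by homology (item 1) needs a
similarity search against the reference file and is not covered here.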

Questions and suggestions are welcome.

Marta Matvienko

Marta Matvienko, Ph.D
Allometra, LLC
Phone: 530-792-8864
Fax: 530-753-1152

Email: marta at

