missing sequences in yeast_gb.fasta

Mike Cherry cherry at genome.stanford.edu
Mon Mar 19 23:15:21 EST 2001

For many years I have scanned GenBank three times a week and reported
any new or updated Saccharomyces cerevisiae sequences to this

This process was used by SGD to create a dataset containing all
S. cerevisiae sequences found in GenBank.  This dataset is
made available via FTP at:


This dataset is also available for searching via WU-BLAST
from the SGD BLAST form:


I recently discovered that this dataset was missing many sequences it
should have contained.  The error started around January 1, 2001 and
was fixed on March 11, 2001.  The problem was caused by my error and
has been fixed.  If you downloaded the yeast_gb.fasta file in the past
three months, please retrieve the current copy.  There are currently
18,695 S. cerevisiae entries in the dataset.  Also, if you searched the
"GenBank Sequences" dataset on the SGD BLAST page, I suggest that you
redo that search to be sure nothing was missed.
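
One rough way to check a local copy is to count its entries and compare
the result with the figure above.  The following is only a sketch: it
assumes the file is ordinary FASTA with one ">" header line per entry,
and the function name count_fasta_records is made up for illustration.
The dataset grows as GenBank is updated, so the count will not stay at
18,695; a count well below it would suggest an incomplete copy.

```python
def count_fasta_records(path):
    """Count FASTA entries by counting header lines that begin with '>'.

    Assumption: standard FASTA layout, one '>' header line per entry.
    """
    count = 0
    # errors="replace" keeps the scan going if a header has odd characters.
    with open(path, "r", encoding="ascii", errors="replace") as handle:
        for line in handle:
            if line.startswith(">"):
                count += 1
    return count

# Example use, assuming the file is in the current directory:
#   n = count_fasta_records("yeast_gb.fasta")
#   print(n, "entries found")
```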

This problem was limited to this GenBank sequences dataset.  The ORF,
genomic, all protein, and UTR datasets are correct and were not affected.



J. Michael Cherry, Ph.D.         Internet: cherry at stanford.edu
Associate Professor (Research)   Department of Genetics
Medical Center, Room M341        Stanford University School of Medicine
Voice:    650-723-7541           Stanford, California  94305-5120
FAX:      650-723-7016           http://genome-www.stanford.edu/~cherry


YEAST bionet newsgroup see: http://www.bio.net/hypermail/YEAST/
YEAST e-mail: messages sent to yeast at net.bio.net
subscribe: e-mail biosci-server at net.bio.net with: subscribe yeast
unsubscribe: e-mail biosci-server at net.bio.net with: unsubscribe yeast
YEAST on the WWW: http://genome-www.stanford.edu/Saccharomyces/VL-yeast.html
problems with the YEAST newsgroup? E-mail the moderator: francis at cmmt.ubc.ca
