Lincoln Stein lstein at
Thu Apr 14 21:20:56 EST 1994


Release Two of the Whitehead Institute/MIT Center for Genome Research
human physical mapping data is now available. 

This data release contains YAC screening data for 1598 STSs.  For each
STS, we report addresses for the YACs found to contain the STS.
Because the STS coverage is not yet sufficiently dense, it is not yet
possible to construct extensive contigs directly from these data.
However, these data should be useful for identifying YACs containing a
specific marker.  In addition, the data can be combined with STS
content data from other sources (such as the CEPH/Genethon screening)
to generate more complete coverage.
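
As a rough illustration of the lookup and merging described above, the
sketch below assumes the screening results have already been parsed
into a mapping from STS name to a set of YAC addresses.  The STS names,
YAC addresses and the function merge_sts_content are made up for
illustration; the real file formats are described in the README that
accompanies the release.

```python
from collections import defaultdict


def merge_sts_content(*sources):
    """Union STS-to-YAC hits from several screening data sets."""
    merged = defaultdict(set)
    for source in sources:
        for sts, yacs in source.items():
            merged[sts].update(yacs)
    return dict(merged)


# Hypothetical screening results (names and addresses are invented).
whitehead_hits = {
    "STS_A": {"756_a_3", "801_h_12"},
    "STS_B": {"801_h_12"},
}
other_hits = {
    "STS_A": {"912_c_7"},
    "STS_C": {"756_a_3"},
}

merged_hits = merge_sts_content(whitehead_hits, other_hits)

# YACs reported to contain a specific marker, from either source.
print(sorted(merged_hits["STS_A"]))
```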

The 1598 STSs screened fall into the following categories:

(i)  667 Genetically mapped polymorphic STSs.  Priority has been 
given to screening published, genetically mapped STSs against the 
YAC library. These STSs allow contigs to be anchored to the genetic 
map, and because they are already widely used they are of immediate 
value to the human genetics community. No change has been made to 
these STSs; we use the published oligonucleotide sequences. The 
genetically mapped polymorphic STSs consist of 369 CA repeat markers 
from Jean Weissenbach[2], 71 CA repeat markers from Jim Weber, 149 
tetranucleotide repeats from the Cooperative Human Linkage Center 
(CHLC) and 78 additional published markers from several other 
sources.

(ii) 434 Random Genome-wide STSs.  These STSs were generated at 
the Center by sequencing random genomic clones from a small insert 
library.  Sequences are analyzed by a series of computer programs to 
eliminate common repetitive sequences, and PCR primers are picked 
by the PRIMER program[3]. The PCR assays are then tested under 
standard conditions.  Assays giving one band in ethidium-bromide 
staining proceed to chromosomal assignment and library screening. 
STSs are assigned to a chromosome by use of the NIGMS 
Human/Rodent Somatic Cell Hybrid Mapping Panel #1. 
Approximately 75% of STSs can be unambiguously assigned to a 
chromosome using this panel. Chromosomal assignment is not yet 
complete for some of these STSs; the remaining assignments will be 
made available in subsequent data releases. The random genome-wide 
STSs are designed to minimize biases in clone location, which 
should aid in obtaining as complete and correct a map as possible. 

(iii) 282 unpublished CA-repeat-containing STSs.  In the course of 
human genetic map construction, Weissenbach and colleagues have 
identified many CA-repeats that were not sufficiently polymorphic to 
be genotyped for the Genethon human genetic map.  These STSs have 
been generously provided by J. Weissenbach. These STSs have not 
been previously published or assigned D-number locus names; any 
use of the STSs themselves should refer to [J. Weissenbach, 
unpublished data]. These markers are also in the process of being 
mapped to chromosomes using the NIGMS Human/Rodent Somatic 
Cell Hybrid Mapping Panel #1.

(iv) 82 new chromosome 22 STSs.  With the goal of testing 
approaches for fine structure ordering and closure, we plan to 
generate a higher density of STSs in one or a few regions.  
Specifically, we generated STSs from sequences derived from flow-
sorted chromosome libraries, in collaboration with the Human 
Genome Center for Chromosome 22 in Philadelphia.  YACs 
corresponding to 133 chromosome 22 STSs included in this release 
already provide considerable coverage of the long arm of 
chromosome 22. 

(v) 134 STSs from public databases. 124 STSs were taken from the 
Genome Data Base (GDB) and are used as described; a further 10 
were derived from sequences appearing in GenBank.  
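
The chromosomal assignment step mentioned under (ii) and (iii) can be
sketched as follows.  This is only a toy illustration: the three
hybrid lines and their chromosome contents are invented, the function
assign_chromosome is hypothetical, and requiring an exact match
between the PCR pattern and a chromosome's presence pattern is a
simplifying assumption rather than a description of the scoring
actually used with the NIGMS panel.

```python
def assign_chromosome(pcr_pattern, panel):
    """Return the chromosomes whose presence across the hybrid lines
    matches the STS's PCR-positive/negative pattern exactly.

    pcr_pattern: dict mapping hybrid name -> True if the STS amplified
    panel:       dict mapping hybrid name -> set of human chromosomes
                 retained by that hybrid
    """
    all_chromosomes = set().union(*panel.values())
    matches = []
    for chrom in sorted(all_chromosomes):
        if all((chrom in panel[hybrid]) == positive
               for hybrid, positive in pcr_pattern.items()):
            matches.append(chrom)
    return matches


# Invented toy panel: each hybrid retains a few human chromosomes.
toy_panel = {
    "hybrid_1": {"1", "22"},
    "hybrid_2": {"2", "22"},
    "hybrid_3": {"1", "2"},
}

# An STS that amplifies in hybrids 1 and 2 but not in hybrid 3.
pattern = {"hybrid_1": True, "hybrid_2": True, "hybrid_3": False}
candidates = assign_chromosome(pattern, toy_panel)

# An assignment is unambiguous when exactly one chromosome matches.
print(candidates, len(candidates) == 1)
```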

STS content mapping was carried out in the CEPH mega-YAC library, 
which was generously provided by Daniel Cohen and colleagues.  We 
screened plates 709 to 972, encompassing the set of about 25,000 
YACs with average inserts in the 1 Mb range.  We have maintained 
the original plate nomenclature used by CEPH so that YAC names 
reported here should correspond to those used by CEPH and others 
screening copies of this library. 
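
As a quick consistency check on these figures, the plate range can be
converted into a clone count.  The sketch assumes standard 96-well
microtiter plates, which is not stated above; under that assumption
the range works out to roughly the 25,000 YACs quoted.

```python
FIRST_PLATE = 709
LAST_PLATE = 972
WELLS_PER_PLATE = 96  # assumption: one YAC per well of a 96-well plate

n_plates = LAST_PLATE - FIRST_PLATE + 1   # inclusive plate range
n_wells = n_plates * WELLS_PER_PLATE

print(n_plates, n_wells)
```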

Because the STS coverage is not yet very dense, we have not reported
any contig analysis in this data release (although we observe some
STSs linked by at least two YACs).  We expect to report such contig
analysis in future releases. This data release consists of a simple
flat file, which is adequate for reporting YAC addresses.  In
subsequent data releases, we plan to add an email query facility (like
that already available for the Center's mouse genomic map data) to
assist investigators in finding contigs in regions of interest.
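
Until such a facility exists, the kind of double linkage mentioned
above (two STSs sharing at least two YACs) can be found with a few
lines of code.  The sketch below is a minimal example on invented
data; the function linked_pairs and the STS and YAC names are
hypothetical, and the input is assumed to be a mapping from STS name
to the set of YAC addresses that contain it.

```python
from itertools import combinations


def linked_pairs(sts_to_yacs, min_shared=2):
    """Return STS pairs that share at least min_shared YACs,
    mapped to the set of YACs they have in common."""
    links = {}
    for sts_a, sts_b in combinations(sorted(sts_to_yacs), 2):
        shared = sts_to_yacs[sts_a] & sts_to_yacs[sts_b]
        if len(shared) >= min_shared:
            links[(sts_a, sts_b)] = shared
    return links


# Invented example data.
hits = {
    "STS_A": {"756_a_3", "801_h_12", "912_c_7"},
    "STS_B": {"756_a_3", "801_h_12"},
    "STS_C": {"912_c_7"},
}

links = linked_pairs(hits)
for (sts_a, sts_b), shared in links.items():
    print(sts_a, sts_b, sorted(shared))
```

Comparing every pair of STSs is quadratic in the number of STSs, which
is acceptable for a data set of this size; a larger set might call for
indexing by YAC first.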


The data can be obtained in two ways:

1.  Via anonymous ftp to  Log in as "anonymous" and
use your e-mail address as password.  The release can be found in the
directory /distribution/human_STS_releases/apr94.  The file "README"
describes the file formats and gives other information.

2.  Via a "World-Wide Web" browser.  Point your WWW client (e.g. NCSA
Mosaic) at "" and follow the links
"Genome Center Data", "Human".

This project is an ongoing one.  As new STSs are screened, they will
be released on a quarterly basis on approximately the following
dates:

1 July 1994
1 October 1994
1 January 1995

Please address questions and comments to me at the address below.

Lincoln Stein

Lincoln D. Stein                Whitehead Institute/MIT Genome Center
lstein at	One Kendall Square, Building 300
617-252-1916                    Cambridge, MA 02139
