Replies to NAASC letter
Joe Ecker
jecker at ATGENOME.BIO.UPENN.EDU
Thu Jun 2 14:08:27 EST 1994
Several more replies to the NAASC letter.
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Response # 10
1. Do you think that we are ready to begin some level of directed genome
sequencing in the US?
Although I am not a member of the Arabidopsis community, I am actively
involved in sequencing as part of the Human Genome Project, so I thought I'd
add my two cents to this discussion. My bottom line is DEFINITELY YES
with a few notes. The real live cost per base of final sequence at
sequencing centers (ours at the University of Oklahoma, the WashU and
the Sanger Center) is somewhere between 50 cents and $1. By the time
an RFA is written and applications are eventually funded, the cost will
be very close to 50 cents/base final sequence (4-5 fold coverage and
accuracy >99.99%) if done in centers dedicated to sequencing. The cost
for sequencing done in individual research labs is 5-10 fold higher
because of the lack of economy of scale, lack of standardized protocols,
and lack of fully trained personnel dedicated to the task at hand, i.e.
megabase sequencing. A few facilities such as that set up for C. elegans
would be ideal for large scale sequencing of the Arabidopsis genome at an
annual budget of, say, $5 million for each center (set up 4 such centers).
I also think this should be done for the human genome asap as well.
The plan would be that individual groups who have mapped their favorite
regions, containing their favorite gene(s), then would send (or take)
their physical contigs (cosmids/P1's) to a regional center for sequencing.
With a budget of $5 million, cosmids can presently be sequenced at the rate
of one per one to two days via shotgun with 16 ABI sequencers (12 for the
initial shotgun cloning and 4 for closure and error correction). This
approach is extremely cost effective (as demonstrated by the C. elegans
groups) and would yield >5 million bases of completed sequences per center
per year.
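
As a rough check on the figures above, the arithmetic can be sketched in a
few lines of Python. The cosmid insert size (40 kb) and year-round operation
(365 days) are assumptions made for illustration and are not stated in the
letter; the function names are likewise hypothetical.

```python
# Back-of-envelope check of the throughput and cost figures quoted above.
# ASSUMPTION: a cosmid insert is taken as ~40 kb (not stated in the letter).
# ASSUMPTION: the center runs 365 days a year (not stated in the letter).
COSMID_INSERT_BASES = 40_000
WORKING_DAYS = 365


def bases_per_year(days_per_cosmid, insert_bases=COSMID_INSERT_BASES,
                   working_days=WORKING_DAYS):
    """Finished bases per year when one cosmid is completed every
    `days_per_cosmid` days."""
    return working_days / days_per_cosmid * insert_bases


def cost_per_base(annual_budget, finished_bases):
    """Dollars spent per finished base."""
    return annual_budget / finished_bases


if __name__ == "__main__":
    budget = 5_000_000  # $5 million per center per year, as in the letter
    for days in (1, 2):
        total = bases_per_year(days)
        print(f"{days} day(s) per cosmid: {total / 1e6:.1f} Mb/year, "
              f"${cost_per_base(budget, total):.2f} per base")
```

Under these assumptions a center would finish roughly 7 to 15 Mb per year,
above the >5 million bases quoted, and at exactly 5 Mb per year a $5 million
budget works out to $1 per base, the upper end of the range given earlier.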
2. How important is genome sequencing in terms of funding priorities (vs.
placing cDNAs on the map, completion of the physical map, adding more
PCR-based markers to the map, etc.)?
Sequencing from regions already mapped will yield this information
directly.
3. Who should support systematic genome sequencing if it is a big-$ effort?
USDA but this is a political issue.
4. What impact on Arabidopsis research will be incurred if sequencing does
not begin today (in 2 years; in 5 years, in 10 years)?
The Arabidopsis community will lack the detailed information needed
for the "real" biology that they want to do.
5. What type of organizational model for genome sequencing would you
support: sequencing centers vs. individual interested labs?
Definitely ONLY in sequencing centers. Let the mapping and biology
be done in individual labs but sequencing is only cost effective if done
in centers.
6. What quality standards would you expect for the sequence: high or low
accuracy (high accuracy = higher cost)?
Highly accurate sequence is the most desirable and can be done. In fact,
accuracy of the actual bases is not as great a problem as one would suspect.
With 4-5 fold coverage the accuracy is >99.99%, or less than 1 error per
10,000 bases. The real problem is in closing gaps between shotgun-generated
contigs, not in error correction. Thus, a 100kb region with 10 gaps of 100
bases each would have 1000 unknown bases (effectively "errors") and the
sequence would be 99% accurate over 100kb. Would this be useful
information? For some it might be, but then again you might be missing
the more interesting information that would be contained in the unsequenced
gaps. My preference is for contiguous regions of highly accurate sequence
that can be obtained by 4-5 fold coverage and a reasonable closure strategy.
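
The gap-versus-miscall comparison in this answer can be worked through as a
small Python sketch. The numbers are the ones given in the answer (100kb
region, 10 gaps of 100 bases, 1 error per 10,000 bases); the function names
are hypothetical and chosen only for illustration.

```python
# Compare bases lost to unclosed gaps with bases lost to miscalls,
# using the figures given in the answer above.

def fraction_known(region_bases, n_gaps, gap_bases):
    """Fraction of the region that is actually sequenced when `n_gaps`
    gaps of `gap_bases` each remain unclosed."""
    unknown = n_gaps * gap_bases
    return (region_bases - unknown) / region_bases


def expected_miscalls(region_bases, bases_per_error=10_000):
    """Expected number of wrong bases at 1 error per `bases_per_error`."""
    return region_bases / bases_per_error


if __name__ == "__main__":
    region = 100_000
    unknown = 10 * 100
    print(f"unknown bases in gaps: {unknown}")
    print(f"fraction known: {fraction_known(region, 10, 100):.2%}")
    print(f"expected miscalls at 99.99%: {expected_miscalls(region):.0f}")
```

With these inputs the gaps account for 1000 unknown bases against about 10
expected miscalls, which is why closure rather than error correction
dominates the effective accuracy.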
7. ANY SPECIFIC OR GENERAL COMMENTS THAT YOU WOULD LIKE TO MAKE!
It is now well documented that sequencing is both cost effective and
extremely informative. If cosmid or P1 contig maps are available, these
regions should be sequenced immediately in centers. It turns out that
we can sequence faster than the mappers can map so go for it!! However,
the ultimate problems lie in dealing with the data: a massive
computational issue arises, and it is only now being
addressed. How will the final end user, the biologist, be able to view
the data, search for new features, make new discoveries, etc?
Cheers to one and all as you struggle with these very important issues.
--bruce
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
\ Bruce A. Roe Professor of Chemistry and Biochemistry /
/ Dept. of Chem. and Biochem. INTERNET: BROE at aardvark.ucs.uoknor.edu \
\ University of Oklahoma BITNET: BROE at uokucsvx /
/ 620 Parrington Oval, Rm 208 AT&TNET: 405-325-4912 or 405-325-7610 \
\ Norman, Oklahoma 73019 FAXnet: 405-325-6111 /
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Response # 11
1. Do you think that we are ready to begin some level of directed
genome sequencing in the US?
YES, BUT NOT AT THE EXPENSE OF ANY OTHER PROJECTS. THIS SHOULD
ONLY BE DONE WITH NEW MONEY.
2. How important is genome sequencing in terms of funding
priorities (vs. placing cDNAs on the map, completion of the
physical map, adding more PCR-based markers to the map, etc.)?
THE PHYSICAL MAP IS THE MOST IMPORTANT, cDNAs ARE NEXT AND
SEQUENCING LAST.
3. Who should support systematic genome sequencing if it is a
big-$ effort?
IT DOES NOT MATTER, BUT IT SHOULD ONLY BE DONE WITH NEW MONEY. NO
MONEY SHOULD BE TAKEN FROM EXISTING PROGRAMS.
4. What impact on Arabidopsis research will be incurred if
sequencing does not begin today (in 2 years; in 5 years, in 10
years)?
IF ARABIDOPSIS IS TO CONTINUE TO COMPETE WITH OTHER MODEL SYSTEMS
SUCH AS WORMS AND FLIES, SEQUENCING SHOULD START NOW. IF
ARABIDOPSIS IS VIEWED ONLY AS A MODEL SYSTEM FOR PLANTS,
SEQUENCING CAN START IN TWO YEARS.
5. What type of organizational model for genome sequencing would
you support: sequencing centers vs. individual interested labs?
MULTIPLE CENTERS WILL PROBABLY BE THE MOST EFFICIENT.
6. What quality standards would you expect for the sequence: high
or low accuracy (high accuracy = higher cost)?
LOW ACCURACY NEEDS TO BE DEFINED IN TERMS OF MISTAKES/KB BEFORE
THIS CAN BE ANSWERED.