Replies to NAASC letter

Joe Ecker jecker at ATGENOME.BIO.UPENN.EDU
Thu Jun 2 14:08:27 EST 1994


Several more replies to the NAASC letter.

|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Response # 10

1. Do you think that we are ready to begin some level of directed genome 
   sequencing in the US?   

   Although I am not a member of the Arabidopsis community, I am actively
   involved in sequencing as part of the Human Genome Project, so I thought I'd
   add my 2 cents to this discussion.  My bottom line is DEFINITELY YES
   with a few notes.  The real live cost per base of final sequence at
   sequencing centers (ours at the University of Oklahoma, the WashU and
   the Sanger Center) is somewhere between 50 cents and $1.  By the time
   an RFA is written and applications are eventually funded, the cost will
   be very close to 50 cents/base final sequence (4-5 fold coverage and
   accuracy >99.99%) if done in centers dedicated to sequencing.  The cost
   for sequencing done in individual research labs is 5-10 fold higher
   because of the lack of economy of scale, lack of standardized protocols,
   and lack of fully trained personnel dedicated to the task at hand, i.e.
   megabase sequencing.  A few facilities such as that set up for C. elegans
   would be ideal for large scale sequencing of the Arabidopsis genome at an
   annual budget of say $5million for each center (set up 4 such centers).
   I also think this should be done for the human genome asap as well.
   The plan would be that individual groups who have mapped their favorite
   regions, containing their favorite gene(s), then would send (or take)
   their physical contigs (cosmids/P1's) to a regional center for sequencing.
   With a budget of $5million, cosmids can be sequenced presently at the rate
   of one per one to two days via shotgun with 16 ABI sequencers (12 for the
   initial shotgun cloning and 4 for closure and error correction).  This
   approach is extremely cost effective (as demonstrated by the C. elegans
   groups) and would yield >5 million bases of completed sequences per center
   per year.
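
   The throughput and cost figures above can be cross-checked with a short
   calculation.  This is only a rough sketch: the 40 kb cosmid insert size
   and the 250 working days per year are assumptions that are not stated
   above, and the function names are purely illustrative.

```python
# Back-of-envelope check of the per-center throughput and cost figures.
# ASSUMPTIONS (not from the reply): ~40 kb of insert per cosmid and
# 250 working days per year.

def bases_per_year(days_per_cosmid, cosmid_insert_bp=40_000, working_days=250):
    """Finished bases per center per year at a given cosmid turnaround."""
    cosmids_per_year = working_days / days_per_cosmid
    return cosmids_per_year * cosmid_insert_bp


def cost_per_base(annual_budget_usd, finished_bases):
    """Average cost in dollars per finished base."""
    return annual_budget_usd / finished_bases


if __name__ == "__main__":
    budget = 5_000_000  # dollars per center per year, as proposed above
    for days in (1, 2):
        finished = bases_per_year(days)
        print(f"{days} day(s) per cosmid: {finished:,.0f} bases/year, "
              f"${cost_per_base(budget, finished):.2f} per base")
```

   Under these assumptions, one cosmid every two days gives about 5 million
   finished bases per year at roughly $1 per base, and one cosmid per day
   gives about 10 million bases at roughly 50 cents per base, which is in
   line with the range quoted above.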
    
2. How important is genome sequencing in terms of funding priorities (vs. 
   placing cDNAs on the map, completion of the physical map, adding more 
   PCR-based markers to the map, etc.)?

   Sequencing from regions already mapped will yield this information
   directly.
  
3. Who should support systematic genome sequencing if it is a big-$ effort?
   
   USDA but this is a political issue.

4. What impact on Arabidopsis research will be incurred if sequencing does 
   not begin today (in 2 years; in 5 years,  in 10 years)? 
   
   The Arabidopsis community will lack the detailed information needed
   for the "real" biology that they want to do.

5. What type of organizational model for genome sequencing would you 
   support: sequencing centers vs. individual interested labs?
   
   Definitely ONLY in sequencing centers.  Let the mapping and biology
   be done in individual labs but sequencing is only cost effective if done
   in centers.

6. What quality standards would you expect for the sequence: high or low  
   accuracy (high accuracy = higher cost)?
  
   Highly accurate sequence is the most desirable and can be done.  In fact,
   accuracy of the actual bases is not as great a problem as one would suspect.
   With 4-5 fold coverage the accuracy is >99.99%, or fewer than 1 error per
   10,000 bases.  The real problem is in closing gaps between shotgun
   generated contigs, not in error correction.  Thus, a 100kb region with 10
   gaps of 100 bases each would have 1000 unknown bases (in effect "errors")
   and the sequence would be only 99% accurate over 100kb.  Would this be useful
   information?  For some it might be, but then again you might be missing
   the more interesting information that would be contained in the unsequenced
   gaps.  My preference is for contiguous regions of highly accurate sequence
   that can be obtained by 4-5 fold coverage and a reasonable closure strategy.
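
   The gap arithmetic above can be written out as a small calculation.  This
   is a minimal sketch with a made-up function name; it treats every base
   inside a gap as an error and applies the 1-in-10,000 miscall rate to the
   remaining sequenced bases.

```python
# Effective accuracy of a region when unsequenced gap bases count as errors.
# The function name and return layout are illustrative only.

def effective_accuracy(region_bp, n_gaps, gap_bp, per_base_error_rate):
    """Return (unknown_bases, expected_miscalls, overall_accuracy)."""
    unknown = n_gaps * gap_bp                    # bases never sequenced
    sequenced = region_bp - unknown              # bases actually read
    miscalled = sequenced * per_base_error_rate  # expected wrong base calls
    accuracy = 1 - (unknown + miscalled) / region_bp
    return unknown, miscalled, accuracy


if __name__ == "__main__":
    unknown, miscalled, accuracy = effective_accuracy(
        region_bp=100_000, n_gaps=10, gap_bp=100, per_base_error_rate=1e-4
    )
    print(f"unknown bases in gaps: {unknown}")
    print(f"expected miscalls:     {miscalled:.1f}")
    print(f"overall accuracy:      {accuracy:.2%}")
```

   For the 100kb example, the 1000 gap bases outnumber the expected miscalls
   (about 10) by roughly a hundred to one, which is why closure rather than
   error correction dominates the final figure.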

7. ANY SPECIFIC OR GENERAL COMMENTS THAT YOU WOULD LIKE TO MAKE!
   
   It is now well documented that sequencing is both cost effective and
   extremely informative.  If cosmid or P1 contig maps are available, these
   regions should be sequenced immediately in centers.  It turns out that
   we can sequence faster than the mappers can map so go for it!!  However,
   the ultimate problem lies in dealing with the data: a massive
   computational issue arises, and it is only now being
   addressed.  How will the final end user, the biologist, be able to view
   the data, search for new features, make new discoveries, etc?

Cheers to one and all as you struggle with these very important issues.
--bruce
  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 \  Bruce A. Roe                 Professor of Chemistry and Biochemistry /
 /  Dept. of Chem. and Biochem.  INTERNET: BROE at aardvark.ucs.uoknor.edu  \
 \  University of Oklahoma       BITNET:   BROE at uokucsvx                 /
 /  620 Parrington Oval, Rm 208  AT&TNET:  405-325-4912 or 405-325-7610  \
 \  Norman, Oklahoma 73019       FAXnet:   405-325-6111                  /
  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Response # 11

1. Do you think that we are ready to begin some level of directed 
genome sequencing in the US?  

YES, BUT NOT AT THE EXPENSE OF ANY OTHER PROJECTS.  THIS SHOULD 
ONLY BE DONE WITH NEW MONEY.


2. How important is genome sequencing in terms of funding 
priorities (vs.  placing cDNAs on the map, completion of the 
physical map, adding more  PCR-based markers to the map, etc.)?

THE PHYSICAL MAP IS THE MOST IMPORTANT, cDNAs ARE NEXT AND 
SEQUENCING LAST.


3. Who should support systematic genome sequencing if it is a 
big-$ effort?

IT DOES NOT MATTER, BUT IT SHOULD ONLY BE DONE WITH NEW MONEY.  NO 
MONEY SHOULD BE TAKEN FROM EXISTING PROGRAMS.


4. What impact on Arabidopsis research will be incurred if 
sequencing does not begin today (in 2 years; in 5 years,  in 10 
years)?

IF ARABIDOPSIS IS TO CONTINUE TO COMPETE WITH OTHER MODEL SYSTEMS 
SUCH AS WORMS AND FLIES, SEQUENCING SHOULD START NOW.  IF 
ARABIDOPSIS IS VIEWED ONLY AS A MODEL SYSTEM FOR PLANTS, 
SEQUENCING CAN START IN TWO YEARS.


5. What type of organizational model for genome sequencing would 
you support:  sequencing centers vs. individual interested labs?

MULTIPLE CENTERS WILL PROBABLY BE THE MOST EFFICIENT.  


6. What quality standards would you expect for the sequence: high 
or low accuracy (high accuracy = higher cost)?

LOW ACCURACY NEEDS TO BE DEFINED IN TERMS OF MISTAKES/KB BEFORE 
THIS CAN BE ANSWERED.



