Making a "true" EST consensus

Martin Jones rmiller at
Thu Dec 19 00:07:05 EST 1996

 Hi there,

 Got some nucleotide sequence alignment/search/database questions for you :

 How do we link 3' EST to 5' EST fragments from the same clone in order 
to make the linked consensus useful for subsequent searching, alignment
and/or translation?
 We're developing a set of EST consensus sequences to submit to a public
database, and naturally we'd like these to be of the greatest utility
possible.  We are thinking about the most useful format for the submission.

 What is the  best way to link data for ESTs which come from the same 
clone -- a way that will preferably result in gaps inserted in the linker 
region when someone comes along and searches the database with the 
sequence of the full clone ?

 Specifically, we'll be creating artificial consensus sequences from two
EST consensuses, e.g. a 5' EST AAAAAAAAAAAAAA and a 3' EST ZZZZZZZZZ.

 So our questions are:

  * What are the ramifications of 

     using NNN's (unassigned) :

           AAAAAAAAAAAAANNNNNNNNNNNNNNNNNZZZZZZZZZZZZZZZZ

     or using ----'s (gap) :

           AAAAAAAAAAAAA-----------------ZZZZZZZZZZZZZZZZ

     between the two sequences ?

  * How many linker characters would be ideal ?

  * What else could be used ?
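
 To make the two options concrete, here is a minimal Python sketch of
building the linked sequence. The function name link_ests, its parameters,
and the default linker length of 17 are illustrative assumptions only; the
ideal length is exactly the open question above, and the A/Z strings are
the same placeholders used in the examples.

```python
def link_ests(five_prime: str, three_prime: str,
              linker_char: str = "N", linker_len: int = 17) -> str:
    """Join a 5' EST and a 3' EST with a run of linker characters.

    Hypothetical helper for illustration: linker_char would be "N"
    (unassigned base) or "-" (gap); linker_len is an arbitrary default,
    not a recommended value.
    """
    if len(linker_char) != 1:
        raise ValueError("linker_char must be a single character")
    if linker_len < 0:
        raise ValueError("linker_len must be non-negative")
    # Concatenate: 5' consensus, then the linker run, then 3' consensus.
    return five_prime + linker_char * linker_len + three_prime


# Placeholder sequences, as in the examples above (not real bases).
five = "AAAAAAAAAAAAA"
three = "ZZZZZZZZZZZZZZZZ"

print(link_ests(five, three, linker_char="N"))  # N-linked variant
print(link_ests(five, three, linker_char="-"))  # gap-linked variant
```

 Keeping the linker character and length as parameters makes it easy to
generate both variants and compare how a given search tool treats them.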

 We invite any helpful comments, and feel free to e-mail a copy of 
your reply to info at to make certain we see it.

                                thanks in advance,


Robert T. Miller, Ph.D.                          rmiller at

Manager - Durban Satellite - South African National Bioinformatics Institute 

Faculty of Medicine / Dept of Virology / University of Natal 
Private Bag 7 / Congella 4013 / Durban / South Africa 
phone +27 (031) 3603743                     fax +27 (031) 3603744 or 2604441 
