rd at sanger.ac.uk
Mon Sep 25 07:47:31 EST 2000
It is a bit harsh to say that Keith's letter should have been removed
by the moderator. I believe the purpose of moderating this news group is
to keep junk mail and off-topic messages out of the group, not to
remove misunderstandings, which would be a daunting duty.
In fact I would like to clarify Jean's reply, which is (of course)
correct in saying that we handle the whole IUPAC code for nucleic acid
ambiguities (parsing and displaying), and that 'X' is not an IUPAC
nucleotide ambiguity code (I think it is the protein ambiguity code).
> n is sometimes shown as - in displays
might create some confusion. We store four binary bits per DNA
position, one for each of A, C, G, T. 'n' is used for 1111, i.e. could
be anything, whereas '-' is used for 0000, i.e. nothing. Strictly
this is a gap character; a good use, for example, would be to indicate
gaps in multiple alignments. But we also use it for regions where we have
not been given any sequence.
Perhaps acedb should initialise all DNA arrays to 1111, so that if no
data is added you get a sequence of n's not of dashes?
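
To make the four-bit scheme concrete, here is a minimal sketch in Python (not acedb's actual C code). The post only says there is one bit for each of A, C, G, T; which bit goes with which base, and the names IUPAC, encode, decode and new_dna_array, are assumptions made up for illustration. The fill argument shows the difference between initialising an array to 0000 and to 1111.

```python
# Hypothetical bit assignment: one bit per base, order chosen arbitrarily.
A, C, G, T = 0b0001, 0b0010, 0b0100, 0b1000

# IUPAC nucleotide codes as unions of the base bits.
IUPAC = {
    "a": A, "c": C, "g": G, "t": T,
    "r": A | G, "y": C | T, "m": A | C, "k": G | T,
    "s": C | G, "w": A | T,
    "b": C | G | T, "d": A | G | T, "h": A | C | T, "v": A | C | G,
    "n": A | C | G | T,  # 1111: could be anything
    "-": 0,              # 0000: nothing (gap, or no data supplied)
}

# All 16 four-bit values are covered, so decoding is a plain reverse lookup.
DECODE = {bits: ch for ch, bits in IUPAC.items()}


def encode(seq):
    """Turn a string of IUPAC codes into one 4-bit value per position."""
    return bytearray(IUPAC[ch] for ch in seq.lower())


def decode(codes):
    """Turn 4-bit values back into display characters."""
    return "".join(DECODE[b] for b in codes)


def new_dna_array(length, fill="-"):
    """Allocate an array with every position set to the fill code."""
    return bytearray([IUPAC[fill]]) * length


print(decode(new_dna_array(6)))            # zero-initialised array
print(decode(new_dna_array(6, fill="n")))  # array initialised to 1111
```

With fill="-" an untouched array displays as dashes, which is the behaviour that caused the confusion; with fill="n" the same array displays as n's.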