Comments on Cytogenetic Data in AceDB May 10 kirbym@harwell-rad

b.arnold at ic.ac.uk b.arnold at ic.ac.uk
Thu May 12 04:22:18 EST 1994


Hello,
I've recently joined the ACEDB community (about a month ago) so bear with me if
it seems that I'm making erroneous assumptions or outright mistakes.

Just a few comments prompted by Michelle Kirby's message of a couple of days ago.


I'm currently working with Peter Little's group on chromosome 11, specifically to
look at the possibility of using acedb to display our data. The ideas that Peter
has had regarding how the data should be displayed are as follows:

1. As much of the data obtained is relational (between genes, cosmid clones, YACs,
etc.), there is little or no chance of being able to map anything to specific
points on the chromosome. It is far simpler to map things with regard to
intervals along the chromosome (i.e. everything is mapped relative to its
neighbours and not to pre-determined numerical positions).

2. As more is found out about the order of loci/genes etc. along the chromosome,
this data could be entered and would remain fixed until further information is
found that leads to a change in the order of the loci/genes.

With the above in mind, it would be nice if acedb had a display function that
could recognise such terms as left of, right of, or between (yes, I don't use
proximal or distal too often), and create a map based solely on these positional
pointers rather than relying on an exact numerical location for each item to be
properly positioned. This would require something akin to a linked list (my C/C++
programming is practically nil, so this term is used in the hope that it is
correct). As the data is read in to the display matrix, the order in which it
is entered would not matter: each item is placed into the matrix at the required
point relative to its neighbours. Items are displayed on an arbitrary scale where
distance data is not known, and at specific separations where map distance or
sequence length data is known.
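
To make the idea of ordering by positional pointers concrete, here is a minimal
Python sketch. It is not acedb code, and it uses a topological sort from the
standard library rather than a literal linked list. The function name
order_from_left_of and the (left, right) pair format are assumptions made for
illustration only.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def order_from_left_of(pairs):
    """Return one ordering consistent with 'left of' statements.

    pairs: iterable of (a, b) tuples, each meaning 'a is left of b'.
    The statements may be supplied in any order.
    """
    sorter = TopologicalSorter()
    for left, right in pairs:
        # 'right' is registered with 'left' as a predecessor,
        # so 'left' is always emitted first.
        sorter.add(right, left)
    return list(sorter.static_order())


# 'a between b and c' can be written as two statements: b left of a, a left of c.
print(order_from_left_of([("a", "c"), ("b", "a")]))
```

Contradictory statements (a left of b and b left of a) make static_order()
raise graphlib.CycleError, which is one way such conflicts could be reported.
Where the statements do not fix a unique order, the result is just one of the
consistent orderings.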

This approach is a little more general than the one you propose but could probably
be more easily tailored to a wider variety of mapping problems with regard to
relative positions rather than point locations.

An example would be:

It is known that genes a-h map in interval 1-3, with a, b and c mapping to
interval 1-2 within interval 1-3. The order of a, b and c has been determined as
b a c, and no further information is known about d-h.

The display Peter and I envisage would look something like the following:

      1	-	
	|  b
	|  a
	|  c
      2	-	d,e,f,g,h
	|
	|
	|
	|
	|
	|
      3	-
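
A display of this kind could be produced from a small data structure. The
following Python sketch is only an illustration: the function render, the
markers list and the contents mapping are hypothetical names, d-h are assumed
to belong to interval 2-3, and the layout is simplified (unordered loci are
shown on one row below their upper marker rather than beside it).

```python
def render(markers, contents):
    """Draw intervals top to bottom as plain text.

    markers:  boundary names in map order, e.g. ["1", "2", "3"].
    contents: maps (upper, lower) to (loci, ordered); ordered loci get one
              row each, unordered loci share a single comma-separated row.
    """
    lines = []
    for upper, lower in zip(markers, markers[1:]):
        loci, ordered = contents.get((upper, lower), ([], False))
        if ordered:
            rows = list(loci)
        else:
            rows = [",".join(loci)] if loci else []
        lines.append(f"{upper} -")
        for row in rows:
            lines.append(f"  |  {row}")
        if not rows:
            lines.append("  |")  # empty interval: just draw the axis
    lines.append(f"{markers[-1]} -")
    return "\n".join(lines)


print(render(
    ["1", "2", "3"],
    {("1", "2"): (["b", "a", "c"], True),
     ("2", "3"): (["d", "e", "f", "g", "h"], False)},
))
```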

If it were later found that d and e mapped closer to 3 than to 2, it would be
possible to break interval 2-3 down into two intervals called 2.1 and 2.2, with
d and e being mapped between 2.2 and 3 and f-h mapping between 2 and 2.1.
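
The refinement step can be sketched in Python as follows. This is one possible
reading, stated as an assumption: 2.1 and 2.2 are treated as new boundary
markers inserted between 2 and 3, and the loci of the old interval are
reassigned to the new sub-intervals. The names refine, markers, contents and
placement are hypothetical.

```python
def refine(markers, contents, interval, new_markers, placement):
    """Split one interval by inserting new boundary markers inside it.

    markers:     boundary names in map order.
    contents:    maps (upper, lower) to the list of loci in that interval.
    interval:    the (upper, lower) pair to split; must be adjacent markers.
    new_markers: names of the boundaries to insert, in map order.
    placement:   maps each new sub-interval to the loci now assigned to it.
    Returns new (markers, contents); the inputs are left unchanged.
    """
    upper, lower = interval
    i = markers.index(upper)
    if i + 1 >= len(markers) or markers[i + 1] != lower:
        raise ValueError("interval must be a pair of adjacent markers")
    old = contents.get(interval, [])
    moved = [locus for loci in placement.values() for locus in loci]
    if sorted(moved) != sorted(old):
        raise ValueError("every locus must be reassigned exactly once")
    new_order = markers[: i + 1] + list(new_markers) + markers[i + 1:]
    new_contents = {k: v for k, v in contents.items() if k != interval}
    new_contents.update(placement)
    return new_order, new_contents


markers, contents = refine(
    ["1", "2", "3"],
    {("1", "2"): ["b", "a", "c"], ("2", "3"): ["d", "e", "f", "g", "h"]},
    ("2", "3"),
    ["2.1", "2.2"],
    {("2", "2.1"): ["f", "g", "h"], ("2.2", "3"): ["d", "e"]},
)
print(markers)
```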

As can be seen, this would lead to smaller and smaller intervals until one
eventually arrives at the actual sequence level of display to which a
particular gene or locus would map.

A tentative model may look something like:


?Interval Name text
	  Position Left_of ?Interval XREF Right_of #overlap
	  	   Right_of ?Interval XREF Left_of #overlap
		   

Here overlap is a constructed type that indicates to what degree the intervals
overlap (a rough percentage overlap would do).
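
For readers less familiar with acedb models, the intent of the reciprocal
Left_of/Right_of tags can be mimicked in a few lines of Python. This is only an
analogy, not how acedb stores data: the class Interval and the helper
set_left_of are hypothetical, and copying the overlap percentage to both sides
of the link is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Interval:
    name: str
    # Each dict maps a neighbouring interval's name to a rough overlap percentage.
    left_of: dict = field(default_factory=dict)
    right_of: dict = field(default_factory=dict)


def set_left_of(a, b, overlap_pct=0.0):
    """Record 'a is left of b' and the reciprocal 'b is right of a'."""
    if not 0.0 <= overlap_pct <= 100.0:
        raise ValueError("overlap must be a percentage between 0 and 100")
    a.left_of[b.name] = overlap_pct
    b.right_of[a.name] = overlap_pct  # reciprocal link, kept in step by hand


i12 = Interval("1-2")
i23 = Interval("2-3")
set_left_of(i12, i23, overlap_pct=10.0)
print(i23.right_of)
```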
	  


Please feel free to contact me if there is anything you don't understand above
and I'll try and explain it better.

Benedict Arnold		(email b.arnold at ic.ac.uk)



