Notes from ACEDB Workshop, part 2 of 2

John L. McCarthy jlmccarthy at lbl.gov
Wed Sep 8 11:44:30 EST 1993


This is the second part of some very rough outline notes I took at the
ACEDB Workshop in Boston last month. I cannot vouch for their accuracy, but
hope I have not misquoted anyone too badly. Please post corrections and
clarifications to this newsgroup.

The full set of notes is about 20 pages long. I have broken it into two
parts to accommodate my news reader and mail software.

If you have questions about particular points, you probably should send
them directly to the people who made those points at the workshop.  A list
of workshop attendees with Email addresses was posted to this newsgroup a
few weeks ago.

A copy of this document is available via anonymous ftp
on weeds.mgh.harvard.edu  under pub/acedb-dev/doc

-John McCarthy, Lawrence Berkeley Laboratory


====================FRIDAY=============================
AM demos & tutorial (very popular!)

GENETIC MAPS
an item gets put on genetic map if it has a name and coordinates
can set center and extent within ?Chromosome
interesting links to 2 and 3 point data, which displays distance difs
 nice quick visual summary of how well map fits expmtl data
menu items for show/highlight selected objects (from KeySet)

show = only display ones on keyset, highlight keeps others

[need better names of menu items "show Keyset items", "Highlight ]

middle button lets you recenter, also at bottom of page it scrolls 1/2
page at a time
middle button to left of left green bar (locator), also zooms
if you move it left and right with middle button, scales continuously

in ?Chromosome:
  Centre        x y   y = initial spread when loci are displayed
  Extent        x y   map coords of left & right ends of black bar
  Error_factor  x     x = number that multiplies times 2nd number
    (treat with grain of salt because things get bumped right)
  Bigwood: might be nice to have this not be linear
Otto knows how this all works
          can be set different for dense maps

also can put anchor loci beside black bar by setting tag Marker gene
in ?Gene class
[really need Richard and Jean to define tags like these!!]

Private maps ... may go away (no longer in RD's 2.0)
physical chromo map: 
contigs can have gMap coords & get displayed on map
if you choose this item, it strings together contigs with gaps
shows boxes to left of scale representing clones, cDNA's etc via Keyset
in order to see distribution along the physical map
tied via pMap tags

Chrom-band class: can have dark, light, and color
also can set centromere, don't have telomeres

deficiencies and duplications
1,2, or thick bar  -- set as tag in rearrangement class
click to see name displayed above
GMap Data button shows things inside or outside rearrangements
displayed in dif colors - linked via ??? tags

linkage data has been tailored to C. elegans, but some use elsewhere
2ptdata, 3ptdata will be changed a lot in 2-0
2pt_data is for "selfed"
                 Wild Type   A   B   AB
mendelian ratio      9       3   3   1
slots to show calculated distance and errors
they have 19 dif types of crosses represented in 2-0 data file
there is a description file that goes along with it (wdoc--should be
part of help/hints/metadata)
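
The 9:3:3:1 ratio above is what a selfed double heterozygote gives when
the two markers are unlinked. As a rough illustration only (textbook
Mendelian arithmetic, not the calculation ACEDB itself performs, which
these notes do not describe), the sketch below assumes both mutations are
recessive, start out in cis (a b / + +), and recombine with the same
fraction p in both germlines. The function names are made up for this
example.

```python
import math

def selfed_phenotype_fractions(p):
    """Expected F2 phenotype fractions from selfing a cis double
    heterozygote (a b / + +) with recombination fraction p.
    Returns (wild_type, a_only, b_only, double_mutant)."""
    # Double mutants need two non-recombinant "a b" gametes,
    # each produced with frequency (1 - p) / 2.
    double_mutant = (1.0 - p) ** 2 / 4.0
    # a/a animals are always 1/4 of the brood; remove those that are also b/b.
    a_only = 0.25 - double_mutant
    b_only = 0.25 - double_mutant
    wild_type = 1.0 - double_mutant - a_only - b_only
    return wild_type, a_only, b_only, double_mutant

def recombination_from_recombinants(r):
    """Invert R = (2p - p^2) / 2, where R is the combined fraction of
    a-only and b-only progeny, to recover p."""
    return 1.0 - math.sqrt(1.0 - 2.0 * r)

# Unlinked markers (p = 0.5) give 9/16, 3/16, 3/16, 1/16.
print(selfed_phenotype_fractions(0.5))
```

With tightly linked markers the two single-mutant classes shrink toward
zero, which is why their combined fraction can be used to estimate p.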

3pt_data: 
A-non-B recombinants
B-non-A recombinants
one or more unknown  markers
a5c12b
a 0 c 8 b   does not unambiguously determine order
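
One way to read the 3-point results above, as a sketch under stated
assumptions: treat "a 5 c 12 b" as 5 recombinants falling between a and c
and 12 between c and b, and place c proportionally between the two
flanking markers. When one count is zero, as in "a 0 c 8 b", c may lie
outside the a-b interval, so the sketch declines to place it. The function
name and the linear interpolation are illustrative, not ACEDB's method.

```python
def place_middle_marker(n_left, n_right, pos_a, pos_b):
    """Interpolate the position of c between a and b from recombinant
    counts, or return None when the result does not fix the order."""
    if n_left == 0 or n_right == 0:
        # No recombinants on one side: c could sit outside the a-b
        # interval, so this result alone does not determine the order.
        return None
    fraction = n_left / (n_left + n_right)
    return pos_a + fraction * (pos_b - pos_a)

# "a 5 c 12 b" with hypothetical map positions a = 0.0 and b = 17.0
print(place_middle_marker(5, 12, 0.0, 17.0))
# "a 0 c 8 b": ambiguous, so nothing is placed
print(place_middle_marker(0, 8, 0.0, 17.0))
```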

with new system, can enter multi-point data, not just 3-point
could use to test RFLP's between known markers
on gMap, main markers shown by large squares, others by small
function lets you display relative likelihoods for selected locus
red line end points constrained by mult point data, def/dup, pMap

framework probably appropriate even if detailed calculations are not

Frank: what happens if you have contradictions (highlight in dark)
RD: can fix certain loci and recalculate others via keyset
[would also be nice to have screen dump with lines to explain each
component]

can throw out any particular data item and recalculate local set
Jean may be putting in additional info re LOD scores, etc for Genethon
for dif kinds of experimental data

======================lunch=============================
DATA LOADING Dave Matthews
asks RD to eliminate setenv ACEDB_SU
should give full main menu for everyone with write access

RD: C.elegans has moved towards on-line updates & master is db
DM: would like transaction report
         new objects created explicitly, implicitly
         objects changed, new fields added, fields replaced
         links made to existing objects
         deletes, renames of individual objects, data fields
SL: before a load, she dumps current db, then dumps new & does dif
  also knows what things should not happen.
Otto: could you query the lexique to see if new objs created?
    
RD: in past has thought would be good idea to have dry run input
Mike: does script to look up objects he expects to find in db
Brad: in prior project, did a transaction log and tools to operate on it 
     it took a lot of time to do, and was very little used
RD: thinks it would be easier to produce big file and tools to use it
maybe code in front of each input line to indicate what happens to it
Otto: what about dump option including session # for each object&tag
Mike: essentially an ACEDIFF where you do dump before you save
can do query to get all newly created & updated objects
RD: this gets back to yesterday's idea to be able to do keyset of
changed objects.
JM: can we do search for empty objects?
RD: other thing would be to do a more efficient acedif
make list of all things touched, acedump, quit
RD: it would be a good idea and not hard to do query for all items
that have been changed, and/or all empty items (italicized)
also thinks it's reasonable to do acefile annotated with what happens
tool could pass over this to create summary report of what happened
Problem with dry run is that it would not reflect load A, then XREF A
Brad: no one has proposed automated solution to all this
Wants data providers to give him a text file that explains data file
that can be understood by undergraduate.
Thinks .ace files are going to be the database for some time to come.
Needs to establish new strategies for data collection in the lab.
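
The dump-before, dump-after, then diff approach that SL and Mike describe
above can be sketched roughly as follows. This is not an existing ACEDB
tool. It assumes a simplified .ace layout (objects separated by blank
lines, a "Class : name" header line, one tag per following line, //
comments ignored), and the function names and sample objects are
hypothetical.

```python
import re

def parse_ace(text):
    """Map (class, name) -> set of tag lines, for a simplified .ace dump."""
    objects = {}
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in paragraph.splitlines()]
        lines = [ln for ln in lines if ln and not ln.startswith("//")]
        if not lines:
            continue
        # Header is assumed to look like:  Class : "name"
        cls, _, name = lines[0].partition(":")
        key = (cls.strip(), name.strip().strip('"'))
        objects.setdefault(key, set()).update(lines[1:])
    return objects

def diff_ace(old_text, new_text):
    """Report objects created, deleted, or changed between two dumps."""
    old, new = parse_ace(old_text), parse_ace(new_text)
    report = {
        "created": sorted(new.keys() - old.keys()),
        "deleted": sorted(old.keys() - new.keys()),
        "changed": {},
    }
    for key in sorted(old.keys() & new.keys()):
        added, removed = new[key] - old[key], old[key] - new[key]
        if added or removed:
            report["changed"][key] = {
                "added": sorted(added),
                "removed": sorted(removed),
            }
    return report

# Made-up dumps taken before and after a load
before = 'Locus : "geneA"\nMap "X"\n\nLocus : "geneB"\nMap "I"\n'
after = 'Locus : "geneA"\nMap "X"\nClone "cloneX"\n\nLocus : "geneC"\nMap "IV"\n'
print(diff_ace(before, after))
```

A report like this covers part of DM's list (objects created, deleted,
fields added or replaced) but, like any comparison of dumps, it cannot
tell an explicit creation from one implied by an XREF.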

=================IMAGES
Otto: Alex virtual file system can keep correspondence of virtual and
real machines
Frank: most neuroscience is 3-D image data, acquired via dif methods
hard to register dif types of info with one another (same scales, etc.)
proposes new standard class ?Image
?Image  Text                        //path to where image is stored
        Class                       //type of image (postscript, gif, pict)
        size                        //bits
        preferred display program
        might give many images
        Box  x  y  width  height    //parts of image
another thing you want to do is overlay, with approx registration

PACS - picture archiving and communications systems
most use Sybase, etc. 

Dave and Brad want to integrate geographic maps where stuff grows

RD: Hotspot  x y w h #Class
#Class Locus ?Locus
           Clone ?Clone
           etc. for all classes              (could be useful elsewhere)

Frank: general... how to indicate hotspots-draw box (or shade) 
as mouse passes over it, menu choice for display hotspots, 
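
As a minimal sketch of the hotspot idea above, assuming each hotspot is
just the x y w h box from RD's model plus the class and name of the object
it links to: finding the hotspot under the mouse is a rectangle
containment test. The Hotspot type, its field names, and hotspot_at are
hypothetical, not part of ACEDB.

```python
from dataclasses import dataclass

@dataclass
class Hotspot:
    x: int
    y: int
    w: int
    h: int
    target_class: str   # e.g. "Locus" or "Clone"
    target_name: str

    def contains(self, px, py):
        """True if point (px, py) lies inside this box (right and bottom
        edges exclusive)."""
        return (self.x <= px < self.x + self.w
                and self.y <= py < self.y + self.h)

def hotspot_at(hotspots, px, py):
    """Return the first hotspot under the mouse position, or None."""
    for spot in hotspots:
        if spot.contains(px, py):
            return spot
    return None

# Made-up hotspots on an image
spots = [
    Hotspot(10, 10, 20, 20, "Locus", "geneA"),
    Hotspot(50, 10, 5, 5, "Clone", "cloneX"),
]
hit = hotspot_at(spots, 15, 15)
print(hit.target_class, hit.target_name)
```

A display could call something like hotspot_at on each mouse move and draw
or shade the returned box, which is the behaviour Frank asks about.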

Eric: medical community just beginning to do smart things with images,
so not likely to learn much from PACS, but GIS has already done a lot
and may have more to offer.

============break=============
METADATA: Cartinhour, McCarthy
Sam: already a fair amount of metadata in acedb
model itself, help files, hints
can extend in several directions...
e.g., to document tags and data fields
e.g., Chrom_band 
                gMap ?Chromosome int int

=====================================================
DOCUMENTATION
Mike: can take many forms ... 
Ian Dunham has chap in book on physical map databases
FAQ, tutorial, on-line help, clean up doc/ directory, doc list,
4 ace docs: intro, installation guide, config guide, syntax
Sam & John white papers on metadata
Sam: need new docs on table-maker, query language, tools
         perhaps a list of canned queries to cut & paste
       along with explanation of what it was intended to do
     (perhaps via //comment on query line)  RD has some too
programmers guide, how to do a new module, application
     RD: biblio.c is a nice eg for this
weeds has anonymous ftp incoming
Mary Callahan has written intro thing with RD
Sam & Mike contrib to book
RD not too enthusiastic about making help more of a database
Dave M: README would be even more helpful for installation
RD: this is currently in "Notes"
Mike: other newsgroups have had one person do FAQ and then put on
newsgroup & get redone periodically
Brad: volunteers to put together initial FAQ
other volunteers: Marc, Otto, Lisa, Sam, John, Richard, Jean, David,
Staffan, Brad, Edie (for review, etc. -- worked on commercial doc)
Eric: what about Texinfo a la GNU project?
Sam: will do intro manual re how to create models
Doug: need virgin users to critique documents, interfaces, etc.

========================================================
ACEDB PLATFORMS & PERFORMANCE (Richard)
                                                     Performance
Sun Sparc
    I  (ipc)
    II (ipx, 470)
    10                                         


