A framework for Cytome exploration

Peter Van Osta pvosta_NO_SPAM_ at _NO_SPAM_cs.com
Tue Jan 18 10:44:34 EST 2005


Besides the on-line version of my article on the Human Cytome Project and
the application of cytomics in medicine and drug discovery, I have now
started an article on a concept for large-scale cytome exploration. It is
at this moment just the beginning.

http://ourworld.compuserve.com/homepages/pvosta/hcpframe.htm and

A framework for cytome exploration

By Peter Van Osta
Here I want to present and discuss some ideas on the exploration of the
cytome and the conversion of the spatial, spectral and temporal properties
of the cytome and its cells into their in-silico digital representation.
We want to go from physics to quantitative features and finally come to an
interpretation and understanding of the underlying biological process. We
want to extract attributes from the physical process which give us
information about the status and development of the process and its
underlying structures.

First we have to create an in-silico digital representation starting from
the analogue reality captured by an instrument. The second stage (after
creation of an in-silico representation) is to extract meaningful parts
(objects) related to biologically relevant structures and processes.
Thirdly we apply features to the extracted objects, such as area and
(spectral) intensity, which represent (relevant) attributes of the
observed structure and process. Finally we have to separate and cluster
objects based on their feature properties into biologically relevant
subgroups, such as healthy versus disease.
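The four stages above can be sketched on a toy example. This is only a minimal illustration under stated assumptions: the function names (`digitize`, `extract_objects`, `measure`, `classify`), the threshold, the 4-connectivity rule and the single-feature cut-off that stands in for real clustering are all hypothetical choices, not part of any existing system.

```python
from collections import deque

def digitize(analogue, levels=256):
    """Stage 1: quantize analogue readings in [0.0, 1.0] into integer grey levels."""
    return [[min(levels - 1, int(v * levels)) for v in row] for row in analogue]

def extract_objects(image, threshold):
    """Stage 2: group 4-connected pixels brighter than the threshold into objects."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and image[ny][nx] > threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            objects.append(pixels)
    return objects

def measure(image, pixels):
    """Stage 3: attach features (area in pixels, mean intensity) to one object."""
    total = sum(image[y][x] for y, x in pixels)
    return {"area": len(pixels), "mean_intensity": total / len(pixels)}

def classify(features, area_cutoff):
    """Stage 4: split objects into two subgroups (a stand-in for real clustering)."""
    return "large" if features["area"] >= area_cutoff else "small"

# Toy "analogue" scene: one 2x2 bright object and one single bright pixel.
analogue = [
    [0.0, 0.9, 0.9, 0.0, 0.0],
    [0.0, 0.9, 0.9, 0.0, 0.8],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
image = digitize(analogue)
objects = extract_objects(image, threshold=128)
features = [measure(image, obj) for obj in objects]
labels = [classify(f, area_cutoff=3) for f in features]
```

In a real setting each stage would be far more elaborate, but the hand-over of data from one stage to the next would keep this shape.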

In order to quantify the physical properties of space and time of a
biological sample we must be able to create an appropriate digital
representation of these physical properties in-silico. This digital
representation is then accessible to algorithms for content extraction.
The content or objects of interest are then presented to a
quantification engine which associates physically meaningful properties or
features with the extracted objects. These object features build a
multidimensional feature space which can be fed into feature
analysers to find object/feature clusters, trends, associations and
correlations.

Managing the flow

My personal interest is to build a framework in which acquisition,
detection and quantification are designed as modules each using plug-ins
to do the actual work and which operate on objects being transferred
through the framework. Data representing space, time and spectral sampling
are distributed throughout a data management system to be processed.  The
focus is not on the individual device to create the data or on individual
algorithms, but on the management of the dataflow through a distributed
system to convert spatial, spectral and temporal data into a feature
(hyper-) space for quantitative analysis. A software framework manages the
flow and transformation of data from physics to features. Up- and
downscaling of cell-based research is dynamically managed by the system as
the scale of processing does not require a change in basic design. I will
mostly focus on imaging technology, but the basic principles should be
applicable to any digitized content extraction process. Images are digital
information matrices of a higher order; they only become images as such
when we want to look at them.

Probing the sample

When applying digital imaging technology to a biological sample, a clear
understanding of the physical characteristics of the sample and its
interaction with the “sampling” device is a prerequisite for a
successful application of technology.

The basic principle of a digital imaging system is to create a digital
in-silico representation of the spatial, temporal and spectral physical
process which is being studied. In order to achieve this we try to lay
down an equidistant sampling grid on the biological specimen. The physical
layout of this sampling grid in reality is never a precise isotropic
cubic sampling pattern. The inner and outer resolution of the temporal and
spectral sampling is determined by the physical characteristics of the
sample (electromagnetic spectral range and spectral sampling layout) and
the interaction with the detection technology being used.

The instrument which converts the spatial (scale, dimensions), spectral
(electromagnetic energy, wavelength) and temporal continuum of the sample
into its digital representation allows us to take a view on biology beyond
the capacity of our own perceptive system. It rescales space, spectrum and
time into a digital representation accessible to human perception
(contrast-range, colour) and ideally also to quantification. Instruments
rescale spatial dimensions, spectral ranges and time into a scale which is
accessible to the human mind. The digital image acts as a see-through
window on a part of the physical properties of the biological sample, not
on the instrument as such.

We want to insert a probe system into the sample which changes its state
according to the physical characteristics of the sample. The changes in
the probe system are ideally perfectly aligned in a spatial-spectral and
temporal space with the physical properties of the sample itself. Each
probe system senses the state of the specimen with a finite aperture and
so provides us with a view on the biological structure. As such, all
sensing is done in XYZ, spectrum and time; it is only the inner and outer
resolution of our sampling which changes. 2D imaging is the same as 3D
imaging with the third dimension collapsed to one layer, but due to the
depth of focus (D.O.F.) this layer still represents a physical Z-slice.

In the spectral domain we probe electromagnetic energy along the spectral
axis with a certain inner and outer resolution. We slide up and down the
spectral axis within the limits of one spectral probing system, which
transforms electromagnetic energy. A single CCD camera probes the visible
spectrum in one sweep. A 3CCD camera uses 3 probes to do its spectral
sampling. However, increasing or decreasing the density of the spectral
sampling is only a matter of spectral dynamics. We tend to use “spectral
imaging” for anything which samples the visible spectrum with more than
the spectral resolution of a 3CCD camera. Up- and downscaling our spectral
sampling from broad to narrow, parallel or sequential, continuous or
discontinuous is a matter of applying an appropriate detector array. A
system which manages 1 to n spectral probing devices, such as cameras or
PMTs, each sampling a part of the spectrum and all spatially aligned,
allows us to probe the spectrum in a dynamic way.

The time axis is also probed with a varying temporal inner and outer
resolution; depending on the characteristics of the detection device,
the time-slicing can be collapsed or expanded. Time can be sampled
continuously or discontinuously (time-lapse).

The result is a 5-dimensional system expanding or collapsing each
dimension (XYZ, lambda, time) according to the requirements of
exploration. The device attached to the exploration core imposes the
inner and outer resolution limits upon the system. In-silico these are
only high-order matrix arrays representing a 5D space. We could call this
a continuously variable in-silico representation.
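A minimal sketch of such a 5D representation, in which any dimension can be collapsed to a single sample, might look like this. The class name `Sample5D`, the axis names and the flat row-major storage are assumptions made for illustration; the spectral (lambda) axis is called `channel` here because `lambda` is a reserved word in Python.

```python
from dataclasses import dataclass, field
from math import prod

# Fixed axis order: time, spectral channel (lambda), then Z, Y, X.
AXES = ("t", "channel", "z", "y", "x")

@dataclass
class Sample5D:
    """A 5D sample grid stored as one flat list; missing axes are collapsed to size 1."""
    shape: dict
    data: list = field(default_factory=list)

    def __post_init__(self):
        # Any axis not mentioned is collapsed to a single sample.
        self.shape = {axis: self.shape.get(axis, 1) for axis in AXES}
        if not self.data:
            self.data = [0] * prod(self.shape.values())

    def _offset(self, **index):
        """Row-major offset; axes left out of the index default to position 0."""
        offset = 0
        for axis in AXES:
            i = index.get(axis, 0)
            if not 0 <= i < self.shape[axis]:
                raise IndexError(f"{axis}={i} outside 0..{self.shape[axis] - 1}")
            offset = offset * self.shape[axis] + i
        return offset

    def get(self, **index):
        return self.data[self._offset(**index)]

    def set(self, value, **index):
        self.data[self._offset(**index)] = value

    def collapsed_axes(self):
        return [axis for axis in AXES if self.shape[axis] == 1]

# A plain 2D image: time, spectrum and Z are each collapsed to one sample.
frame = Sample5D({"y": 4, "x": 6})
frame.set(7, y=2, x=3)

# A time-lapse, three-channel Z-stack uses all five dimensions.
stack = Sample5D({"t": 10, "channel": 3, "z": 5, "y": 4, "x": 6})
```

The same container serves both cases; only the extent of each axis differs, which is what expanding or collapsing a dimension means here.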

The inner and outer resolution of the probing system is determined by the
physical XYZ sampling characteristics of the sampling device, such as its
point spread function (PSF). For a digital microscope the resolving power
of the objective (XYZ) and its depth of field/focus are important issues in
experimental design and in determining the application range of a device.
The interaction of the detection device with the image created by the
optics of the system (Nyquist sampling demands, distribution of spectral
sensitivity, dynamic range) also plays an important role. The pixel or
voxel representation in-silico, however, is basically “unaware” of this
meta-information about how the digital density pattern was created.
Detection and quantification algorithms act on the digital information as
such, and only the back-translation into physically meaningful data
requires a back-propagation into the real-world layout and dimensions.
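The back-translation from pixel-based measurements into physical units can be sketched by carrying the acquisition meta-information alongside the data. The `Calibration` record, its field names and all numeric values below are hypothetical; the Nyquist check uses the common rule of thumb of at least two samples per smallest resolvable distance.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Calibration:
    """Meta-information the pixel data itself is 'unaware' of (illustrative fields)."""
    pixel_size_um: float      # XY size of one pixel in the sample plane
    z_step_um: float          # distance between Z-slices
    frame_interval_s: float   # time between successive frames

def length_um(length_px, cal):
    """Convert a length measured in pixels into micrometres."""
    return length_px * cal.pixel_size_um

def area_um2(area_px, cal):
    """Convert an area measured in pixels into square micrometres."""
    return area_px * cal.pixel_size_um ** 2

def volume_um3(voxel_count, cal):
    """Convert a voxel count into cubic micrometres."""
    return voxel_count * cal.pixel_size_um ** 2 * cal.z_step_um

def elapsed_s(frame_index, cal):
    """Convert a frame index into elapsed time in seconds."""
    return frame_index * cal.frame_interval_s

def satisfies_nyquist(cal, optical_resolution_um):
    """True if there are at least two pixels per smallest resolvable distance."""
    return cal.pixel_size_um <= optical_resolution_um / 2

# Hypothetical acquisition settings, chosen only to make the arithmetic easy to follow.
cal = Calibration(pixel_size_um=0.5, z_step_um=2.0, frame_interval_s=2.5)

object_area = area_um2(200, cal)        # 200 pixels at 0.25 um^2 each
object_volume = volume_um3(200, cal)    # 200 voxels at 0.5 um^3 each
time_of_frame = elapsed_s(12, cal)
```

The detection and quantification code can stay in pixel units throughout; only the final reporting step needs the calibration record.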

How do we physically organise the sampling of biological specimens? The
exploration of cellular or tissue samples is organised in an
array pattern, ranging from a single tissue slice on a glass slide up to a
large-scale grid of, for instance, cell or tissue expression arrays. The
granularity or density of the array pattern is determined by the
experimental demands and upstream and downstream processing capacity. Of
course the optical characteristics of the sample carrier (glass, plastic)
will determine the spatial sampling limits in its inner and outer
resolution. The optical and mechanical characteristics of the device used
to explore the (sub) cellular physical domain will also lead to a spatial,
spectral and temporal application domain. The coarse grid-like pattern of
samples on a sample carrier is explored at each array position at
the appropriate inner and outer resolution, within the optical physical
boundaries of the device used to capture the data. The outer resolution
barrier of the individual detector in space and time is extended by both
spatial and temporal tiling at a range of intervals. Spectral multiplexing
is done by using spectral selection devices with the appropriate
spectral characteristics for the spectral profile of the sample.
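Spatial tiling, which extends the outer resolution beyond a single field of view, can be sketched as a small calculation of tile positions. The function names, the overlap parameter and the dimensions used in the example are assumptions for illustration only.

```python
import math

def tile_origins(extent_um, field_um, overlap_um=0.0):
    """Start positions along one axis so that fields of width field_um cover extent_um."""
    if field_um <= overlap_um:
        raise ValueError("field of view must be larger than the overlap")
    if extent_um <= field_um:
        return [0.0]  # a single field already covers the whole extent
    step = field_um - overlap_um
    count = math.ceil((extent_um - field_um) / step) + 1
    return [i * step for i in range(count)]

def tile_grid(width_um, height_um, field_w_um, field_h_um, overlap_um=0.0):
    """(x, y) origins of all tiles needed to cover a rectangular region, row by row."""
    xs = tile_origins(width_um, field_w_um, overlap_um)
    ys = tile_origins(height_um, field_h_um, overlap_um)
    return [(x, y) for y in ys for x in xs]

# Hypothetical region of 1000 x 700 um scanned with a 400 x 300 um field of view
# and 40 um of overlap between neighbouring tiles.
x_positions = tile_origins(1000, 400, overlap_um=40)
tiles = tile_grid(1000, 700, 400, 300, overlap_um=40)
```

The same one-dimensional function could be applied to the time axis, with the extent read as the total duration and the field as the recording window of the detector.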

The resulting discrete representation of the sampled spatial, spectral and
temporal grid at each array position is sent to a storage medium to
provide an audit trail for quality assessment and data validation.

The detection of appropriate objects for further quantification is done
either in-line within the acquisition process or distributed to another
process dealing with the object selection.
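One way to picture this hand-over between acquisition, detection and quantification is as a chain of modules that each delegate the actual work to an interchangeable plug-in, in the spirit of the framework proposed earlier. The sketch below is only an illustration: the class names `Module` and `Framework`, the plug-in names and the toy data are all assumptions, not part of any existing system.

```python
class Module:
    """A processing stage that delegates the actual work to a selectable plug-in."""

    def __init__(self, name):
        self.name = name
        self._plugins = {}
        self.active = None

    def register(self, plugin_name, func):
        """Add a plug-in; the first one registered becomes the active one."""
        self._plugins[plugin_name] = func
        if self.active is None:
            self.active = plugin_name

    def select(self, plugin_name):
        if plugin_name not in self._plugins:
            raise KeyError(plugin_name)
        self.active = plugin_name

    def process(self, item):
        if self.active is None:
            raise RuntimeError(f"module {self.name!r} has no plug-in")
        return self._plugins[self.active](item)

class Framework:
    """Passes an item through the modules in order and records which plug-in ran."""

    def __init__(self, modules):
        self.modules = list(modules)
        self.history = []

    def run(self, item):
        for module in self.modules:
            item = module.process(item)
            self.history.append((module.name, module.active))
        return item

# Toy plug-ins: "acquisition" returns fixed intensities for any array position.
acquisition = Module("acquisition")
acquisition.register("simulated", lambda position: [3, 9, 7, 1, 8])

detection = Module("detection")
detection.register("threshold_5", lambda samples: [v for v in samples if v > 5])
detection.register("threshold_7", lambda samples: [v for v in samples if v > 7])

quantification = Module("quantification")
quantification.register(
    "count_and_mean",
    lambda objs: {"count": len(objs), "mean": sum(objs) / len(objs)},
)

framework = Framework([acquisition, detection, quantification])
first = framework.run("A1")

detection.select("threshold_7")   # swap the detection plug-in, keep the framework
second = framework.run("A1")
```

Swapping a plug-in changes the result without touching the framework, which is the property the modular design is after. Whether a module runs in-line or in another process is left out of this sketch.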

The selected objects are sent to a quantification module which attaches an
array of quantitative descriptors (shape, density …) to each object.
Objects belonging to the same biological entity are tagged to allow for a
linked exploration of the feature space created for each individual
object. The resulting data arrays can be fed into analytical tools
appropriate for analysing a high dimensional linked feature space or
feature hyperspace.
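The tagging of objects that belong to the same biological entity, and the linked feature space that results, can be sketched as follows. The record layout (`entity`, `kind`, `features`), the helper names and the feature values are hypothetical and serve only to show the linking.

```python
from collections import defaultdict

def build_feature_space(objects, feature_names):
    """Return (rows, links): one feature vector per object, plus entity -> row indices."""
    rows, links = [], defaultdict(list)
    for obj in objects:
        links[obj["entity"]].append(len(rows))
        rows.append([obj["features"][name] for name in feature_names])
    return rows, dict(links)

def entity_means(rows, links, feature_names):
    """Mean of each feature over all objects linked to the same entity."""
    summary = {}
    for entity, indices in links.items():
        summary[entity] = {
            name: sum(rows[i][col] for i in indices) / len(indices)
            for col, name in enumerate(feature_names)
        }
    return summary

# Hypothetical detected objects, each tagged with the cell it belongs to.
detected = [
    {"entity": "cell_1", "kind": "nucleus",
     "features": {"area": 40.0, "density": 0.8}},
    {"entity": "cell_1", "kind": "cytoplasm",
     "features": {"area": 120.0, "density": 0.4}},
    {"entity": "cell_2", "kind": "nucleus",
     "features": {"area": 50.0, "density": 0.6}},
]

names = ["area", "density"]
rows, links = build_feature_space(detected, names)
summary = entity_means(rows, links, names)
```

The `rows` table is what would be handed to an analysis tool, while `links` keeps the connection back to the biological entity so that clusters can be explored per cell rather than per object.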

Copyright notice and disclaimer

My web pages represent my interests, my opinions and my ideas, not those
of my employer or anyone else. I have created these web pages without any
commercial goal, but solely out of personal and scientific interest. You
may download, display, print and copy, any material at this website, in
unaltered form only, for your personal use or for non-commercial use
within your organization. Should my web pages or portions of my web pages
be used on any Internet or World Wide Web page or informational
presentation, I ask that a link back to my website (and where appropriate
back to the source document) be established. I expect at least a short
notice by email when you copy my web pages, or parts of them, for your own
use. Any
information here is provided in good faith but no warranty can be made for
its accuracy. As this is a work in progress, it is still incomplete and
even inaccurate. Although care has been taken in preparing the information
contained in my web pages, I do not and cannot guarantee the accuracy
thereof. Anyone using the information does so at their own risk and shall
be deemed to indemnify me from any and all injury or damage arising from
such use. To the best of my knowledge, all graphics, text and other
presentations not created by me on my web pages are in the public domain
and freely available from various sources on the Internet or elsewhere
and/or kindly provided by the owner. If you notice something incorrect or
have any questions, send me an email.

First on-line version published on 9 Jan. 2005, last update on 10 Jan.

Email: pvosta_NOJUNK_ at _NOJUNK_cs.com  remove the _NOJUNK_ before sending
an email.

The author of this webpage is Peter Van Osta, MD.
