What is 'hidden' in the HMMs?

joe at removethispart.gs.washington.edu
Mon Nov 18 17:53:17 EST 2002

In article <arb61p$3l6$1 at mercury.hgmp.mrc.ac.uk>,
Thorsten Burmester  <t_burmester at yahoo.com> wrote:
>Probably a simple question:
>Why are the Hidden Markov Models called "hidden"? What exactly is the
>hidden layer?

These models have states that cannot be seen directly.  For example, in
modelling a DNA sequence that has GC-rich and AT-rich regions, we might
have an underlying state "In GC-rich region" or "In AT-rich region",
and imagine occasional changes between those two states.  But what we see
are the bases.  There will be some G's and C's in an AT-rich region (and
vice versa).  So we cannot immediately see which kind of region we are in.
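
As a rough illustration of the paragraph above, here is a minimal Python
sketch that generates a sequence from such a two-state model.  The emission
probabilities, the switching rate, and the names EMIT, SWITCH and simulate
are all made up for this example; they are assumptions, not estimates from
any real genome.

```python
import random

# Hypothetical base frequencies for each hidden state (illustration only).
EMIT = {
    "GC-rich": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "AT-rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}
SWITCH = 0.02  # assumed chance of changing region at each position


def simulate(n, seed=0):
    """Return (hidden states, observed bases) for a sequence of length n."""
    rng = random.Random(seed)
    state = rng.choice(sorted(EMIT))  # start in either region
    states, bases = [], []
    for _ in range(n):
        # Occasionally move to the other kind of region.
        if rng.random() < SWITCH:
            state = "AT-rich" if state == "GC-rich" else "GC-rich"
        # Emit a base according to the current hidden state.
        probs = EMIT[state]
        base = rng.choices(list(probs), weights=list(probs.values()))[0]
        states.append(state)
        bases.append(base)
    return states, bases


states, bases = simulate(60)
print("".join(bases))  # what we actually observe
print("".join("g" if s == "GC-rich" else "a" for s in states))  # hidden
```

Only the first printed line corresponds to data we would have in hand; the
second line is the state path that an analysis has to infer.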

A model that makes each base depend only on the previous one will
have trouble modelling GC-richness or AT-richness.  To make G's and C's
cluster, it has to make a G likely to be followed by another G or C, so
a single G in an AT-rich region would then tend to be followed by other
G's and C's.

But with the "hidden" underlying state, we can easily imagine that occasional
G's and C's would show up in AT-rich regions without changing the
fact that the next base is still likely to be an A or a T.
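
The point can be checked numerically with a small sketch of the forward
(filtering) recursion for a two-state model.  The emission table, the
probability STAY of remaining in the same region, the 50/50 starting belief,
and the function names are assumptions chosen for illustration only.

```python
# Hypothetical parameters (illustration only).
EMIT = {
    "GC-rich": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "AT-rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}
STAY = 0.98  # assumed probability of staying in the same region


def predict_at(p_at):
    """P(next state is AT-rich), allowing for a possible switch."""
    return p_at * STAY + (1 - p_at) * (1 - STAY)


def filtered_posterior(bases):
    """P(state is AT-rich | bases seen so far), at each position."""
    p_at = 0.5  # assumed starting belief
    out = []
    for b in bases:
        prior_at = predict_at(p_at)
        # Weigh each state by how well it explains the observed base.
        w_at = prior_at * EMIT["AT-rich"][b]
        w_gc = (1 - prior_at) * EMIT["GC-rich"][b]
        p_at = w_at / (w_at + w_gc)
        out.append(p_at)
    return out


def prob_next_a_or_t(p_at):
    """P(next base is A or T) given the current belief about the state."""
    prior_at = predict_at(p_at)
    at = EMIT["AT-rich"]["A"] + EMIT["AT-rich"]["T"]
    gc = EMIT["GC-rich"]["A"] + EMIT["GC-rich"]["T"]
    return prior_at * at + (1 - prior_at) * gc


seq = "ATTATAATGTATTA"  # one stray G at index 8
post = filtered_posterior(seq)
print(post[7], post[8])           # belief in AT-rich before and after the G
print(prob_next_a_or_t(post[8]))  # chance the base after the G is A or T
```

With these assumed numbers the stray G lowers the belief that we are in an
AT-rich region only modestly, and the next base remains more likely than not
to be an A or a T.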

These models make it easy, with few parameters, to model long-range
correlations that would otherwise require elaborate models in which
each base depends on many of the bases before it.
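
To give a feel for "few parameters", here is a small Python sketch that
counts free parameters under one common convention (each probability
distribution over k outcomes contributes k - 1 free parameters, and the
hidden Markov model's starting distribution is counted too).  The function
names and the counting convention are assumptions made for this example.

```python
def markov_chain_free_params(order, alphabet=4):
    """Free parameters of a chain where each base depends on `order`
    previous bases: one next-base distribution per possible context."""
    return (alphabet ** order) * (alphabet - 1)


def hmm_free_params(n_states, alphabet=4):
    """Free parameters of a hidden Markov model with n_states states."""
    initial = n_states - 1                   # starting distribution
    transitions = n_states * (n_states - 1)  # one row per state
    emissions = n_states * (alphabet - 1)    # base frequencies per state
    return initial + transitions + emissions


print(hmm_free_params(2))  # two hidden states: GC-rich and AT-rich
for k in (1, 2, 5):
    print(k, markov_chain_free_params(k))
```

The count for the chain grows by a factor of four with each extra base of
memory, while the two-state hidden Markov model stays at a handful of
parameters however long its regions persist.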

Joe Felsenstein         joe at removethispart.gs.washington.edu
 Department of Genome Sciences, University of Washington,
 Box 357730, Seattle, WA 98195-7730 USA

