Send bugs to: Ali Rahimi (ali@mit.edu)
Rabiner's excellent tutorial on hidden markov models [1] contains a few subtle mistakes which can result in flawed HMM implementations. This note is intended as a companion to the tutorial and addresses subtle mistakes which appear the sections on ``scaling'' and ``multiple observations sequences.'' Following is a summary of the terms introduced in the tutorial and corrections to some of the equations.
Section III.A of Rabiner introduces in eq (r18) the forward
variable
:
Section V.A introduces the scaled forward variable
and the scaled backward variable
. These variables are easy to compute on modern machines
and will not result in underflows. Section V.A also describes how to
use the unscaled variables to compute
and
.
Rabiner's eqs (r91-r92b) for computing
are misleading,
and no recursion is provided for computing
. This
section derives recursions for both
and
.
We are looking for a recursion to calculate a variable
such that
The following is a corrected version of the recursion of eqs (r91-r92b)
![]() |
|||
we also define the term
which we will use to scale
, and
observe its realtionship to
:
The following recursion produces desired values of
if we
satisfy ourselves with defining
:
![]() |
|||
Note that defining
is not the same
as imposing the requirement
![]() |
The proof that the recursion produces the desired result is again inductive:
![]() |
|||
| (8) |
We have shown recursions for computing
and
, with
and
defined by eqs
(5,6). The next section uses provides
alternative ways of computing
and
using these
variables.
Substituting the scaled variables in the definition for
, we get:
can be computed from
using eq (2):
![]() |
|||
| (11) |
These two entities can be used as-is in the Baum-Welch and Viterbi algorithms.
Section V.B of Rabiner explains how to use multiple
observations sequences for training. In the M step of Baum-Welch, a
new state transition matrix is computed according to eq (r109):
Onces
and
have been computed, these equations can be
used directly to update the state transition matrix and the emission
probabilities. Equation (r111) incorrectly substitutes eqs
(2) and (11) into (r109). Equations
(13) and (14) are easy to use
and should be used for computing the updates. However, for the sake of
completeness, equations analogous to (r111) with the correct
substitutions are included here:
![]() |
|||
![]() |
There are two salient corrections proposed in the paper: the first corrects Rabiner's notation for computing the scaled variables. The second correction is in the way the HMM parameters are updated in the M step under multiple observation sequences. This note also provides an inductive proof that the recursions provide the desired results.
If you notice bugs in this note, please inform the author.