It is one of the more elaborate ML algorithms – a statistical model that analyzes the features of data and groups it accordingly.

The HMM is based on augmenting the Markov chain.

– A Markov chain is a model that tells us something about the probabilities of sequences of random variables, called states, each of which can take on values from some set.

– These sets can be words, or tags, or symbols representing anything, like the weather.

– A Markov chain makes a very strong assumption that if we want to predict the future in the sequence, all that matters is the current state.

– The states before the current state have no impact on the future except via the current state.

– Example: It’s as if to predict tomorrow’s weather you could examine today’s weather but you weren’t allowed to look at yesterday’s weather.
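The "only today matters" assumption can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `TRANSITIONS` table, the function names, and the probabilities in it are assumptions chosen for the example.

```python
import random

# Hypothetical two-state weather chain; the probabilities are illustrative.
TRANSITIONS = {
    "sunny": {"sunny": 0.9, "rainy": 0.1},
    "rainy": {"sunny": 0.5, "rainy": 0.5},
}


def next_state(current, rng):
    """Sample tomorrow's state using only today's state (the Markov assumption)."""
    options = TRANSITIONS[current]
    return rng.choices(list(options), weights=list(options.values()))[0]


def simulate(start, days, seed=0):
    """Return a list of days + 1 states, beginning with `start`."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(days):
        # Only path[-1] is consulted; earlier days are never looked at.
        path.append(next_state(path[-1], rng))
    return path


print(simulate("sunny", 7))
```

Note that `next_state` receives no history at all, which is exactly the restriction described in the example above.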

The HMM finds use in pattern recognition, natural language processing (NLP), data analytics, and other fields.

### A simple weather model

The probabilities of weather conditions (modeled as either rainy or sunny), given the weather on the preceding day, can be represented by a transition matrix:^{[3]}

$$
P = \begin{bmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{bmatrix}
$$

The matrix *P* represents the weather model in which a sunny day is 90% likely to be followed by another sunny day, and a rainy day is 50% likely to be followed by another rainy day. The columns can be labelled “sunny” and “rainy”, and the rows can be labelled in the same order.

[Figure: the above matrix as a graph.]

(*P*)_{i j} is the probability that, if a given day is of type *i*, it will be followed by a day of type *j*.

Notice that the rows of *P* sum to 1: this is because *P* is a stochastic matrix.
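As a quick check of the row-sum property, here is a small sketch that stores the weather matrix as a nested list. The names `STATES`, `is_row_stochastic`, and `transition_prob` are made up for this example.

```python
import math

STATES = ["sunny", "rainy"]

# Row i is today's weather, column j is tomorrow's weather.
P = [
    [0.9, 0.1],  # from sunny: to sunny, to rainy
    [0.5, 0.5],  # from rainy: to sunny, to rainy
]


def is_row_stochastic(matrix, tol=1e-9):
    """True if every entry is non-negative and every row sums to 1."""
    return all(
        all(p >= 0 for p in row) and math.isclose(sum(row), 1.0, abs_tol=tol)
        for row in matrix
    )


def transition_prob(today, tomorrow):
    """Look up (P)_ij by state name."""
    return P[STATES.index(today)][STATES.index(tomorrow)]


print(is_row_stochastic(P))
print(transition_prob("rainy", "sunny"))
```

Each row is a complete probability distribution over tomorrow's weather, which is why it has to sum to 1.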

#### Predicting the weather

The weather on day 0 (today) is known to be sunny. This is represented by a vector in which the “sunny” entry is 100%, and the “rainy” entry is 0%:

$$
\mathbf{x}^{(0)} = \begin{bmatrix} 1 & 0 \end{bmatrix}
$$

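The weather on day 1 (tomorrow) can be predicted by:

$$
\mathbf{x}^{(1)} = \mathbf{x}^{(0)} P = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{bmatrix} = \begin{bmatrix} 0.9 & 0.1 \end{bmatrix}
$$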

Thus, there is a 90% chance that day 1 will also be sunny.

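The weather on day 2 (the day after tomorrow) can be predicted in the same way:

$$
\mathbf{x}^{(2)} = \mathbf{x}^{(1)} P = \begin{bmatrix} 0.9 & 0.1 \end{bmatrix} \begin{bmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{bmatrix} = \begin{bmatrix} 0.86 & 0.14 \end{bmatrix}
$$

or

$$
\mathbf{x}^{(2)} = \mathbf{x}^{(0)} P^{2} = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{bmatrix}^{2} = \begin{bmatrix} 0.86 & 0.14 \end{bmatrix}
$$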

General rules for day *n* are:

$$
\mathbf{x}^{(n)} = \mathbf{x}^{(n-1)} P
$$

$$
\mathbf{x}^{(n)} = \mathbf{x}^{(0)} P^{n}
$$
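The day-by-day rule can be tried out with plain Python lists. This is a rough sketch: `step` and `predict` are names invented here, and the matrix is the weather model from this section (90% sunny after sunny, 50% rainy after rainy).

```python
P = [
    [0.9, 0.1],  # from sunny: to sunny, to rainy
    [0.5, 0.5],  # from rainy: to sunny, to rainy
]


def step(x, matrix):
    """Multiply the row vector x by the matrix: one day forward."""
    return [
        sum(x[i] * matrix[i][j] for i in range(len(x)))
        for j in range(len(matrix[0]))
    ]


def predict(x0, matrix, n):
    """Apply the one-day rule n times, starting from the day-0 vector x0."""
    x = list(x0)
    for _ in range(n):
        x = step(x, matrix)
    return x


x0 = [1.0, 0.0]  # day 0 is known to be sunny
for day in range(4):
    sunny, rainy = predict(x0, P, day)
    print(f"day {day}: sunny={sunny:.3f}, rainy={rainy:.3f}")
```

Repeating the one-day step *n* times is the same as multiplying the day-0 vector by the *n*-th power of *P*, so both general rules give the same forecast.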

#### Steady state of the weather

In this example, predictions for the weather on more distant days are increasingly inaccurate and tend towards a steady state vector. This vector represents the probabilities of sunny and rainy weather on all days, and is independent of the initial weather.
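To see the independence from the initial weather, one can run the forecast from both possible starting days and compare. The sketch below is only an illustration: the helper names are invented, and the 50-day horizon is an arbitrary choice.

```python
P = [
    [0.9, 0.1],  # from sunny: to sunny, to rainy
    [0.5, 0.5],  # from rainy: to sunny, to rainy
]


def step(x, matrix):
    """Advance a row vector of probabilities by one day."""
    return [
        sum(x[i] * matrix[i][j] for i in range(len(x)))
        for j in range(len(matrix[0]))
    ]


def iterate(x0, matrix, days):
    """Apply `step` the given number of times."""
    x = list(x0)
    for _ in range(days):
        x = step(x, matrix)
    return x


from_sunny = iterate([1.0, 0.0], P, 50)  # day 0 sunny
from_rainy = iterate([0.0, 1.0], P, 50)  # day 0 rainy

# Largest difference between the two long-range forecasts.
gap = max(abs(a - b) for a, b in zip(from_sunny, from_rainy))
print(from_sunny)
print(from_rainy)
print(gap)
```

After enough days the two forecasts agree to within floating-point rounding, whichever weather we started from.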

The steady state vector is defined as:

$$
\mathbf{q} = \lim_{n \to \infty} \mathbf{x}^{(n)}
$$

but converges to a strictly positive vector only if *P* is a regular transition matrix (that is, there is at least one *P*^{n} with all non-zero entries).

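Since **q** is independent of the initial conditions, it must be unchanged when transformed by *P*. This makes it an eigenvector (with eigenvalue 1), and means it can be derived from *P*. For the weather example:

$$
\begin{aligned}
P &= \begin{bmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{bmatrix} \\
\mathbf{q} P &= \mathbf{q} \\
\mathbf{q} (P - I) &= \mathbf{0} \\
\begin{bmatrix} q_1 & q_2 \end{bmatrix} \begin{bmatrix} -0.1 & 0.1 \\ 0.5 & -0.5 \end{bmatrix} &= \begin{bmatrix} 0 & 0 \end{bmatrix} \\
-0.1 q_1 + 0.5 q_2 &= 0
\end{aligned}
$$

and since $q_1$ and $q_2$ form a probability vector, we know that

$$
q_1 + q_2 = 1
$$

Solving this pair of simultaneous equations gives the steady state distribution:

$$
\begin{bmatrix} q_1 & q_2 \end{bmatrix} = \begin{bmatrix} \tfrac{5}{6} & \tfrac{1}{6} \end{bmatrix} \approx \begin{bmatrix} 0.833 & 0.167 \end{bmatrix}
$$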

In conclusion, in the long term, about 83.3% of days are sunny.
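The 83.3% figure can be double-checked with exact fractions. This is a sketch for the two-state case only: `steady_state_two_state` is a hypothetical helper that solves the two conditions "**q** is unchanged by *P*" and "the entries of **q** sum to 1" by hand.

```python
from fractions import Fraction

# Weather model with exact fractions: rows are today, columns are tomorrow.
P = [
    [Fraction(9, 10), Fraction(1, 10)],  # from sunny
    [Fraction(1, 2), Fraction(1, 2)],    # from rainy
]


def steady_state_two_state(matrix):
    """Solve qP = q with q1 + q2 = 1 for a 2x2 transition matrix.

    With a = P(sunny -> rainy) and b = P(rainy -> sunny), the condition
    -a*q1 + b*q2 = 0 together with q1 + q2 = 1 gives q1 = b / (a + b).
    """
    a = matrix[0][1]
    b = matrix[1][0]
    q1 = b / (a + b)
    return [q1, 1 - q1]


def step(x, matrix):
    """Multiply the row vector x by the matrix."""
    return [
        sum(x[i] * matrix[i][j] for i in range(len(x)))
        for j in range(len(matrix[0]))
    ]


q = steady_state_two_state(P)
print(q)
print(step(q, P) == q)  # q is unchanged by one more day
print(round(float(q[0]), 3), round(float(q[1]), 3))
```

Using `Fraction` avoids rounding error, so the check that **q** is unchanged by *P* is an exact equality rather than an approximate one.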

Let’s discuss the rest in the comments!