Hidden Markov Model applied to biological sequence. Part 1

Introduction on Markov Chains Models

The Markov Chains (MC) [1][2] and the Hidden Markov Model (HMM) [3] are powerful statistical models that can be applied in a variety of different fields, such as: protein homologies detection [4]; speech recognition [5]; language processing [6]; telecommunications [7]; and tracking animal behaviour [8][9].

HMM has been widely used in bioinformatics since its inception. It is most commonly applied to the analysis of sequences, specifically to DNA sequences [10], for their classification [11], or the detection of specific regions of the sequence, most notably the work made on CpG islands [12].

Overview

The Markov Chain models can be applied to all situations in which the history of a previous event is known, whether directly observable or not (hidden). In this way, the probability of transition from one event to another can be measured, and the probability of future events computed.

The Markov Chain models are discrete dynamical systems of finite states in which transitions from one state to another are based on a probabilistic model, rather than a deterministic one. It follows that the information for a generic state X of a chain at the time t is expressed by the probabilities of transition from the time: t-1.

Continue reading “Hidden Markov Model applied to biological sequence. Part 1”