Technical Report

A Hidden Markov Model for Finding Promoters in DNA Sequences

We present a Hidden Markov Model (HMM) for finding binding sites, such as promoters, in a DNA sequence. The approach allows variable-length spacers between the consensus sequences. The model was built from 150 known promoters of the Escherichia coli genome and uses the Expectation-Maximization (EM) algorithm to reestimate its parameters. To test the model, we used 30 regions of E. coli, each known to contain a promoter. By randomly cutting these regions, we produced 20 sets of 30 sequences. On average, the model located 78% of the consensus sequences in a set correctly or nearly correctly (within 6 bp). The program is available on the World Wide Web and can be useful as a tool for finding a promoter in any prokaryotic DNA sequence.
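The idea of two consensus boxes separated by a variable-length spacer can be illustrated with a minimal sketch. It is not the report's model: it replaces the trained HMM with fixed log-odds scores and searches every placement for the best one. The consensus strings TTGACA and TATAAT are the widely cited E. coli -35 and -10 boxes; the emission probabilities, the uniform background, the spacer range of 15 to 21 bp, and the function names `box_score` and `best_promoter` are all assumptions made for illustration.

```python
import math

# Hypothetical parameters, NOT taken from the report.
BOX_35 = "TTGACA"             # widely cited E. coli -35 consensus
BOX_10 = "TATAAT"             # widely cited E. coli -10 consensus
P_MATCH = 0.7                 # assumed probability of emitting the consensus base
P_OTHER = 0.1                 # assumed probability of each non-consensus base
BACKGROUND = 0.25             # assumed uniform background base frequency
SPACER_RANGE = range(15, 22)  # assumed allowed spacer lengths: 15..21 bp


def box_score(window, consensus):
    """Log-odds of a window under the box model versus the background."""
    return sum(
        math.log((P_MATCH if base == cons else P_OTHER) / BACKGROUND)
        for base, cons in zip(window, consensus)
    )


def best_promoter(seq):
    """Return (score, start_35, spacer_len, start_10) for the best placement.

    Spacer bases are scored as background (log-odds 0) and every allowed
    spacer length is treated as equally likely, so only the boxes count.
    Returns None if the sequence is too short for any placement.
    """
    best = None
    n35, n10 = len(BOX_35), len(BOX_10)
    for i in range(len(seq) - n35 + 1):
        s35 = box_score(seq[i:i + n35], BOX_35)
        for spacer in SPACER_RANGE:
            j = i + n35 + spacer          # start of the -10 box
            if j + n10 > len(seq):
                break                     # longer spacers cannot fit either
            total = s35 + box_score(seq[j:j + n10], BOX_10)
            if best is None or total > best[0]:
                best = (total, i, spacer, j)
    return best
```

Calling `best_promoter(seq)` on an uppercase DNA string gives the highest-scoring pair of box positions together with the spacer length between them. In a trained HMM, the emission and spacer-length probabilities would be estimated from known promoters (for example by EM) rather than fixed by hand as they are here.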

1998

Authors

Joao Carlos Setubal

Nalvo Franco de Almeida Jr.

Venue

IC Technical Reports 1998

Location

Universidade Estadual de Campinas, Instituto de Computação