CMP 243 Homework 3, Part 1

Due: Thurs. Oct 28

Pairwise sequence alignment, Markov models, introduction to HMMs

New Reading:

  1. Chapter 3 of the text

Written part:

  1. Do problem 2.9 on page 22 of the text. Note: here and below, for the problems that ask you to align two amino acid sequences, use the BLOSUM50 score values on page 16. For DNA, use the score function suggested in the problem.
  2. Do problem 2.10 on page 32 of the text.
  3. Suppose you are given the DNA sequence X = ACTCCGATCG. Using the Markov models defined by the tables at the bottom of page 50 of the text, calculate the probability of the sequence X using the Markov model of CpG islands, call it M+, and also the probability of X using the Markov model of regular DNA, not in CpG islands, (call this model M-). You may assume that the probability distribution for the first nucleotide is uniform for each of these models. These Markov models are the type that run forever; there is no end state. Then calculate the log odds ratio

    log_2 (P(X|M+)/P(X|M-)).

    This gives the log odds in total bits. What is the difference between this score and the scores plotted in the histogram on page 53 of the text? Where would this sequence go in that histogram?

  4. Using the hidden Markov model in the example on page 54 of the text, calculate the probability that in four die rolls, the first 3 tosses are from a fair die, and they are 3, 1 and 5, and the fourth toss is from a loaded die, and it is 6. Assume the probability of starting with a fair die is 0.99.

    This a an example of a hidden Markov model of the type that runs forever, without an end state. In the probability calculation for such HMMs, we do not include the probability of a transition to the end state, as is done in Equation (3.6) on page 54. If you prefer, you can add an end state to this model, and add transitions to this end state from the "Fair" and "Loaded" states with any probability that you choose. Be sure to adjust the outgoing probabilities for each of these states so they sum to 1. Then you may apply Equation (3.6) directly to this problem. Be sure to specify that you are solving the problem this way, if you choose to do this.

  5. Calculate the probability of getting a pair of sixes in two die rolls from the casino in this example on page 54. Note that here I am not specifying the hidden state, i.e. I am not telling you if the each die is fair or not. You must consider the possible cases. Again, assume the probability of starting with a fair die is 0.99. As above, specify if you are using a "runs forever" model or a model with a stop state, and if a model with a stop state, what the new transition probabilities are.
  6. Do problem 3.4 on page 57 of the text.

Questions regarding about page content should be directed to haussler@cse.ucsc.edu.
Last modified October 21, 1999.

Back to the CMP 243 Class Page.