Week 6 Homework -- Nov 22, 2011
see images: 06week_HW_Q*.jpg

01 Max Likelihood Question

Using Maximum Likelihood, fit a Markov model and compute the probabilities of
all the transitions given these three observation sequences (5 transitions in
each):

    ABCABC -- A->A=0, A->B=2, A->C=0, B->B=0, B->C=2, C->A=1
    AABBCC -- A->A=1, A->B=1, A->C=0, B->B=1, B->C=1, C->A=0, C->C=1
    AAACCC -- A->A=2, A->B=0, A->C=1, B->B=0, B->C=0, C->A=0, C->C=2

    totals:  A->A=3, A->B=3, A->C=1, B->B=1, B->C=3, C->A=1, C->C=3  (15 in all)
    totals leaving each state:  from A = 7, from B = 4, from C = 4
    (note the outgoing transition probabilities of each state add to 1)

All three sequences start with A, so:

    P(A0) = 1.0    P(B0) = 0.0    P(C0) = 0.0

    P(A->A) = 3/7    P(A->B) = 3/7    P(A->C) = 1/7
    P(B->A) = 0/4    P(B->B) = 1/4    P(B->C) = 3/4
    P(C->A) = 1/4    P(C->B) = 0/4    P(C->C) = 3/4

(see the counting sketch after question 03 below)

02 Stationary Distribution Question

Given the Markov model shown (note A->A is 0.9...), what is the stationary
distribution over the states?

    P(A_inf) = 5/6
    P(B_inf) = 1/6

From video 11.7: in the stationary distribution the probability of the next
step is the same as the previous step, P(A_t) = P(A_t-1), and by total
probability:

    P(A_t) = P(A_t | A_t-1) * P(A_t-1) + P(A_t | B_t-1) * P(B_t-1)

Substituting X = P(A_t-1), noting that P(B_t-1) = 1 - P(A_t-1), and using the
transition probabilities from the question video, solve for X:

    X = P(A_t | A_t-1) * X + P(A_t | B_t-1) * (1 - X)
    X = 0.9*X + 0.5*(1 - X)
      = 0.9*X + 0.5 - 0.5*X
      = 0.4*X + 0.5
    X - 0.4*X = 0.5
    0.6*X = 0.5
    X = 0.5/0.6 = 5/6 = 0.8333

    P(B_inf) = 1/6 = 0.1667

(a quick numerical check is sketched after question 03 below)

03 HMM Question

Given the Hidden Markov Model shown, with the two observations X,Y and the
initial probability distribution P(A0) = P(B0) = 0.5:

What's the posterior probability of being in A at time zero if we observe X
at time zero?

    P(A0 | X0) = 0.1111  <***

    P(A0 | X0) = P(X0 | A0) * P(A0) / P(X0)
               = 0.1 * 0.5 / 0.45 = 0.05 / 0.45 = 5/45 = 1/9

    P(X0) = P(X0 | A0) * P(A0) + P(X0 | B0) * P(B0)
          = 0.1 * 0.5 + 0.8 * 0.5 = 0.05 + 0.4 = 0.45

What's the posterior probability of being in A at time one if we observe X at
time zero?

    P(A1 | X0) = 0.5  <***

    P(A1 | X0) = P(X0 | A1) * P(A1) / P(X0) = P(X0) * P(A1) / P(X0) = P(A1)

X0 doesn't tell us anything about the next state here, and the probability of
being in A at any time is 0.5, yah? He does it with total probability, but it
comes out the same because the equal transition probabilities mean no history
is preserved in the states:

    P(A1 | X0) = P(A1 | A0) * P(A0 | X0) + P(A1 | B0) * P(B0 | X0)
               = 0.5 * 1/9 + 0.5 * 8/9 = 0.5

What's the posterior probability of being in A at time one if we observe X at
time zero and X at time one?

    P(A1 | X0,X1) = 0.1111  <***

My way:

    P(A1 | X0,X1) = P(X0,X1 | A1) * P(A1) / P(X0,X1)
                  = P(X0) * P(X1 | A1) * P(A1) / P(X0,X1)
                  = 0.45 * 0.1 * 0.5 / 0.2025 = 1/9

    -- P(X0 | A1) is the same as P(X0) here, and the probability of seeing an
       X at any step is 0.45, so seeing two X's is 0.45 * 0.45 = 0.2025.

He does it differently, as a forward update: weight the predicted belief
P(A1 | X0) = 0.5 by the X1 emission and normalize. Because that predicted
belief equals the prior P(A0) = 0.5, the numbers come out identical to the
P(A0 | X0) calculation (this is the step I originally wasn't sure about):

    P(A1 | X0,X1) = P(X1 | A1) * P(A1 | X0) / [ P(X1 | A1) * P(A1 | X0) + P(X1 | B1) * P(B1 | X0) ]
                  = 0.1 * 0.5 / (0.1 * 0.5 + 0.8 * 0.5) = 0.05 / 0.45 = 1/9
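For question 01, a minimal counting sketch in Python. The sequence strings are
from the question; the dictionary layout and the names (trans, out_totals) are
just my own choices.

    from collections import Counter

    sequences = ["ABCABC", "AABBCC", "AAACCC"]

    # count initial states and transitions across all three sequences
    first = Counter(seq[0] for seq in sequences)
    trans = Counter((a, b) for seq in sequences for a, b in zip(seq, seq[1:]))
    out_totals = Counter()            # total transitions leaving each state
    for (a, _b), n in trans.items():
        out_totals[a] += n

    # maximum-likelihood estimates: counts divided by totals
    p0 = {s: first[s] / len(sequences) for s in "ABC"}
    p_trans = {(a, b): trans[(a, b)] / out_totals[a] for a in "ABC" for b in "ABC"}

    print(p0)                                        # {'A': 1.0, 'B': 0.0, 'C': 0.0}
    print(p_trans[("A", "A")], p_trans[("B", "C")])  # 3/7 = 0.4286, 3/4 = 0.75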
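For question 02, a quick numerical check of the 5/6 answer: push a
distribution through the transition model until it stops changing. The
0.9/0.1 and 0.5/0.5 rows are the same numbers used in the derivation above.

    # rows = current state, columns = next state, order [A, B]
    T = [[0.9, 0.1],     # from A: P(A->A)=0.9, P(A->B)=0.1
         [0.5, 0.5]]     # from B: P(B->A)=0.5, P(B->B)=0.5

    p = [0.5, 0.5]       # any starting distribution converges to the same answer
    for _ in range(100):
        p = [p[0] * T[0][0] + p[1] * T[1][0],
             p[0] * T[0][1] + p[1] * T[1][1]]

    print(p)             # -> [0.8333..., 0.1666...], i.e. [5/6, 1/6]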
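For question 03, a tiny forward-filter check of the three answers (1/9, 1/2,
1/9). The transition and emission numbers are the ones used above; the
function name and dictionary encoding are mine.

    trans = {("A", "A"): 0.5, ("A", "B"): 0.5,
             ("B", "A"): 0.5, ("B", "B"): 0.5}
    emit = {("A", "X"): 0.1, ("A", "Y"): 0.9,
            ("B", "X"): 0.8, ("B", "Y"): 0.2}

    def forward(observations, prior={"A": 0.5, "B": 0.5}):
        """Return P(state_t | observations up to time t) for each step t."""
        belief = dict(prior)
        history = []
        for t, obs in enumerate(observations):
            if t > 0:
                # prediction step: push the belief through the transitions
                # (after one X this gives P(A1|X0) = 0.5, the second answer)
                belief = {s: sum(belief[r] * trans[(r, s)] for r in "AB") for s in "AB"}
            # measurement step: weight by the emission probability, then normalize
            belief = {s: belief[s] * emit[(s, obs)] for s in "AB"}
            z = sum(belief.values())
            belief = {s: v / z for s, v in belief.items()}
            history.append(dict(belief))
        return history

    print(forward(["X"]))        # P(A0|X0)    = 1/9 = 0.111
    print(forward(["X", "X"]))   # P(A1|X0,X1) = 1/9 = 0.111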
04 Particle Filter Question 1

In a world with four states (2x2): in the top two states we observe A with
probability 80% (and B with probability 20%); in the bottom two states we
observe B with probability 80% (and A with probability 20%). We have three
particles, one in each box except the top right, labelled counterclockwise --
a, b, c. We observe A, which means we are more likely to be in one of the top
states. What is the probability that we sample each particle, given that we
just observed A?

    P(sample a) = 0.6666
    P(sample b) = 0.1666
    P(sample c) = 0.1666

His way (and the other way I did it): take the raw probability of each
particle producing the A observation and normalize:

    raw / eta = normalized
    P(A | a) = 0.8 / 1.2 = 0.6666
    P(A | b) = 0.2 / 1.2 = 0.1666
    P(A | c) = 0.2 / 1.2 = 0.1666
    normalizer eta = 1.2; the normalized weights total 1 (0.9998 with rounding)

Using Bayes rule with P(A) written out by total probability:

    P(A) = P(A | a) * P(a) + P(A | b) * P(b) + P(A | c) * P(c)
         = 0.8 * 0.3333 + 0.2 * 0.3333 + 0.2 * 0.3333
         = 0.2666 + 0.0666 + 0.0666 = 0.4  (0.3999 with rounding)

    P(a | A) = P(A | a) * P(a) / P(A) = 0.8 * 0.3333 / 0.4 = 0.6666
    P(b | A) = P(A | b) * P(b) / P(A) = 0.2 * 0.3333 / 0.4 = 0.1666
    P(c | A) = P(A | c) * P(c) / P(A) = 0.2 * 0.3333 / 0.4 = 0.1666

    and the three posteriors do add up to 1, as they should.

(see the weighting sketch after question 06 below)

05 Particle Filter Question 2

Same world and particles as before, but now looking at state transitions.
Number the columns 1,2 left to right and the rows a,b top to bottom. Take a
single particle at random (uniform over the three) and simulate its next
state: a particle moves at random to an adjacent state (N/E/S/W, not
diagonal) with probability 1. Having picked a particle, what's the
probability that it finds itself in each state?

    P(a1) = 1/6
    P(a2) = 2/6
    P(b1) = 2/6
    P(b2) = 1/6

Count up the moves that land in each state -- this gives the numerator --
then normalize by the total number of possible (particle, move) pairs (6) to
get P(state). (an enumeration sketch also follows question 06 below)

06 Particle Filter Question 3

Implement a particle filter for robot localization using only ONE particle.
Which statements are true?

    -- measurements are ignored -- Y (missed this one)
       -- the particle is weighted by the measurement, but because there is
          only one particle the weight is normalized back to 1, so the
          measurement is effectively ignored
    -- the result is generally poor -- Y -- so true...
    -- it cannot represent multi-modal distributions -- Y
       -- multimodal means more than one peak, so one particle can't
          represent both
    -- the initial state (if known) is ignored --
       -- probably, but you might have your particle near the initial state,
          so it could be considered in the filtering results
    -- state transitions are ignored --
       -- the particle will still propagate according to the state transitions
    -- none of the above --
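For question 04, a sketch of the weight-and-normalize step. The particle
rows and the 0.8/0.2 observation model are from the question; the
"top"/"bottom" encoding of the 2x2 world is just mine.

    # only the row a particle sits in matters for the observation model
    particles = {"a": "top", "b": "bottom", "c": "bottom"}
    p_obs = {("top", "A"): 0.8, ("top", "B"): 0.2,
             ("bottom", "A"): 0.2, ("bottom", "B"): 0.8}

    observation = "A"
    weights = {name: p_obs[(row, observation)] for name, row in particles.items()}
    eta = sum(weights.values())                          # 1.2
    resample_prob = {name: w / eta for name, w in weights.items()}

    print(resample_prob)   # {'a': 0.667, 'b': 0.167, 'c': 0.167}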
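For question 05, a brute-force enumeration of the six (particle, move) pairs.
The adjacency table for the 2x2 grid is mine, using the a/b row and 1/2
column names from the question.

    from collections import defaultdict
    from fractions import Fraction

    # adjacent (N/E/S/W) cells in the 2x2 grid, states named row + column
    neighbors = {"a1": ["a2", "b1"], "a2": ["a1", "b2"],
                 "b1": ["a1", "b2"], "b2": ["b1", "a2"]}
    particles = ["a1", "b1", "b2"]      # every cell except the top-right (a2)

    prob = defaultdict(Fraction)
    for start in particles:                    # pick a particle uniformly: 1/3
        for nxt in neighbors[start]:           # then move to an adjacent cell: 1/2
            prob[nxt] += Fraction(1, 3) * Fraction(1, len(neighbors[start]))

    print(dict(prob))   # a2 and b1 get 2/6, a1 and b2 get 1/6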
07 Particle Filter Question 4

More questions: which statements are true?

    -- they are easy to implement -- Y
    -- they scale quadratically with the dimensionality of the state space -- n
       (exponentially, not quadratically... I hope) -- yesss!!
       -- the Kalman filter is the one that scales quadratically
    -- they can only be applied to discrete state spaces -- n -- "clearly wrong"
    -- none of the above -- n

08 Max Min Question

A 2x2 zero-sum game with the utilities to the player Max as shown. Put the
probabilities for Max's strategy for each play on the left, put the
probabilities for Min's strategy for each play on the top, and give the value
of the game -- the expected utility to Max -- at the bottom.

Payoff matrix (zero-sum game, so we only need Max's values; Min gets the
negative of each):

                 Min:one   Min:two
    Max:one        +4        -5
    Max:two        -3        +5

Mixed strategy, since the better payoffs for Max sit on the off-diagonal
(Max:one vs Min:two and Max:two vs Min:one).

For Max, let P = probability of playing one, and set the expected payoff
equal for Min's two choices:

    if Min chooses one:  4*P + -3*(1-P)
    if Min chooses two: -5*P +  5*(1-P)

    4P - 3(1-P) = -5P + 5(1-P)
    4P - 3 + 3P = -5P + 5 - 5P
    7P - 3 = -10P + 5
    17P = 8
    P = 8/17 for Max playing one, 9/17 for Max playing two

For Min, let Q = probability of playing one, and set the expected payoff
equal for Max's two choices:

    if Max chooses one:  4*Q + -5*(1-Q)
    if Max chooses two: -3*Q +  5*(1-Q)

    4Q - 5(1-Q) = -3Q + 5(1-Q)
    4Q - 5 + 5Q = -3Q + 5 - 5Q
    9Q - 5 = -8Q + 5
    17Q = 10
    Q = 10/17 for Min playing one, 7/17 for Min playing two

Expected utility: each outcome times its probability. Repeating the payoff
matrix with the probabilities on the edges:

                      10/17     7/17
                      Min:one   Min:two
    8/17  Max:one       +4        -5
    9/17  Max:two       -3        +5

    Max:one,Min:one = 8/17 * 10/17 * 4  =  80/289 * 4  =  320/289
    Max:one,Min:two = 8/17 *  7/17 * -5 =  56/289 * -5 = -280/289
    Max:two,Min:one = 9/17 * 10/17 * -3 =  90/289 * -3 = -270/289
    Max:two,Min:two = 9/17 *  7/17 * 5  =  63/289 * 5  =  315/289

    total = 85/289 = 5/17 = 0.2941

(a small verification sketch follows question 09 below)

09 Scheduling Question

For the given network of actions, with the precedence relations between them
(blue lines) and the duration of each action shown in each box (numbers),
fill in for each action the earliest start time in the upper left and the
latest start time in the upper right (start with both 0 at Start; Finish ends
with both 7). (see paper scribbles)

*** see image: 06week_HW_Q9_rightANS.jpg ***
*** got the LS of the bottom-left duration-1 box wrong... it should be 2...
    (a generic earliest/latest-start sketch also follows below)
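For question 08, a quick check of the 8/17, 10/17, and 5/17 numbers. The two
expressions for p and q are just the indifference equations above solved in
closed form; nothing here beyond the payoff matrix comes from the lecture.

    from fractions import Fraction as F

    # payoffs to Max: rows = Max's move (one, two), columns = Min's move (one, two)
    u = [[F(4), F(-5)],
         [F(-3), F(5)]]

    # Max plays "one" with probability p chosen so Min is indifferent:
    #   u[0][0]*p + u[1][0]*(1-p) = u[0][1]*p + u[1][1]*(1-p)
    p = (u[1][1] - u[1][0]) / (u[0][0] - u[1][0] - u[0][1] + u[1][1])
    # Min plays "one" with probability q chosen so Max is indifferent:
    #   u[0][0]*q + u[0][1]*(1-q) = u[1][0]*q + u[1][1]*(1-q)
    q = (u[1][1] - u[0][1]) / (u[0][0] - u[0][1] - u[1][0] + u[1][1])

    value = (u[0][0]*p*q + u[0][1]*p*(1-q)
             + u[1][0]*(1-p)*q + u[1][1]*(1-p)*(1-q))
    print(p, q, value)   # 8/17 10/17 5/17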
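For question 09, the actual network only exists in the image, so this is just
a generic earliest-start / latest-start pass over a made-up toy graph
(Start -> A(3), Start -> B(1), both -> Finish) to show the two recursions;
the node names and durations are hypothetical, not the homework's.

    # ES(n) = max over predecessors p of ES(p) + duration(p)   (0 at Start)
    # LS(n) = min over successors  s of LS(s) - duration(n)    (ES at Finish)
    duration = {"Start": 0, "A": 3, "B": 1, "Finish": 0}     # hypothetical graph
    preds = {"Start": [], "A": ["Start"], "B": ["Start"], "Finish": ["A", "B"]}
    order = ["Start", "A", "B", "Finish"]                    # topological order

    es = {}
    for n in order:                                          # forward pass
        es[n] = max((es[p] + duration[p] for p in preds[n]), default=0)

    succs = {n: [m for m in order if n in preds[m]] for n in order}
    ls = {}
    for n in reversed(order):                                # backward pass
        ls[n] = min((ls[s] - duration[n] for s in succs[n]), default=es[n])

    print(es)   # {'Start': 0, 'A': 0, 'B': 0, 'Finish': 3}
    print(ls)   # {'Start': 0, 'A': 0, 'B': 2, 'Finish': 3}  -- B has slack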
10 Game Tree Question -- check this again!!!

A game tree for a stochastic two-player game: there are Max nodes
(up-triangles), Min nodes (down-triangles), and chance nodes (circles). For
the chance nodes all successors are equiprobable (so the leftmost has a 1/3
chance of going to each successor). In this game the result of each game is
either +1, -1, or 0, and all the players know that these are the only
possible outcomes, so they can take that into account when trying to figure
out which nodes to prune away. Back up all the values -- fill in a value for
each of the nodes -- then check off all the nodes that can be pruned away by
a procedure that is similar to alpha-beta but updated to handle chance nodes:
a node can be pruned away if evaluating it is not necessary to figure out
what the best moves are for Max and Min.

my answer -- see image: 06week_HW_Q10_myANS.jpg
**** got the leftmost +1 wrong...
**** because Min could have gotten a -1 to select there, that node still
     needs to be checked (it can't be pruned).

(a backup sketch follows question 11 below)

11 Strategy Question

A two-player game where each player has three possible moves.

Does player A have a dominant strategy? -- no
    A:a is dominated by A:b and A:c, but A:b is better against B:e and B:f
    while A:c is better against B:d.

Does player B have a dominant strategy? -- yes
    B:e is better than both B:d and B:f no matter what A does.

Then check all the boxes that are equilibrium points -- A:b, B:e
    A gets 8 there; if A switches, A gets either 0 or 5.
    B gets 7 there; if B switches, B gets either 2 or 4.
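For question 10, the tree itself is only in the image, so this is just a
generic expectiminimax value backup over a small hypothetical tree: Max nodes
take the max of their children, Min nodes the min, chance nodes the
equally-weighted average. The pruning part would additionally use the fact
that every outcome lies in [-1, +1] to bound what a chance node can still
return; that bookkeeping is left out here to keep the sketch short.

    def backup(node):
        """node is ('leaf', value) or (kind, child, child, ...)."""
        kind, rest = node[0], node[1:]
        if kind == "leaf":
            return rest[0]
        values = [backup(child) for child in rest]
        if kind == "max":
            return max(values)
        if kind == "min":
            return min(values)
        return sum(values) / len(values)     # "chance": equiprobable successors

    # hypothetical tree: a Max node choosing between two chance nodes
    tree = ("max",
            ("chance", ("leaf", 1), ("leaf", -1), ("leaf", 0)),
            ("chance", ("min", ("leaf", 1), ("leaf", 0)), ("leaf", 1)))
    print(backup(tree))   # max(0, 0.5) = 0.5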