Probability graphical model
Independent parameters.
How many independent parameters are required to uniquely define the CPD of C (the conditional probability distribution associated with the variable C) in the same graphical model as above, if A, B, and D are binary, and C and E have three values each?
here’s a brief explanation: A multinomial distribution over m possibilities x1,…,xm has m parameters, but m−1 independent parameters, because we have the constraint that all parameters must sum to 1, so that if you specify m−1 of the parameters, the final one is fixed. In a CPD P(X | Y), if X has m values and Y has k values, then we have k distinct multinomial distributions, one for each value of Y, and we have m−1 independent parameters in each of them, for a total of k(m−1). More generally, in a CPD P(X | Y1,…,Yr), if each Yi has ki values, we have a total of k1×…×kr×(m−1) independent parameters. |
Example: Let’s say we have a graphical model that just had X→Y, where both variables are binary. In this scenario, we need 1 parameter to define the CPD of X. The CPD of X contains two entries P(X=0) and P(X=1). Since the sum of these two entries has to be equal to 1, we only need one parameter to define the CPD.
Now we look at Y. The CPD for Y contains 4 entries which correspond to: P(Y=0 | X=0),P(Y=1 | X=0),P(Y=0 | X=1),P(Y=1 | X=1). Note that P(Y=0 | X=0) and P(Y=1 | X=0) should sum to one, so we need 1 independent parameter to describe those two entries; likewise, P(Y=0 | X=1) and P(Y=1 | X=1) should also sum to 1, so we need 1 independent parameter for those two entries. |
Therefore, we need 1 independent parameter to define the CPD of X and 2 independent parameters to define the CPD of Y.
*Inter-causal reasoning.
Consider the following model for traffic jams in a small town, which we assume can be caused by a car accident, or by a visit from the president (and the accompanying security motorcade).
Calculate P(Accident = 1 | Traffic = 1) and P(Accident = 1 | Traffic = 1, President = 1). Separate your answers with a space, e.g., an answer of |
0.15 0.25
means that P(Accident = 1 | Traffic = 1) = 0.15 and P(Accident = 1 | Traffic = 1, President = 1) = 0.25. Round your answers to two decimal places and write a leading zero, like in the example above. |
To calculate the required values, we can apply Bayes’ rule. For instance,
P(A=1|T=1,P=1)=P(A=1,T=1,P=1)P(T=1,P=1)=P(A=1,T=1,P=1)P(A=0,T=1,P=1)+P(A=1,T=1,P=1). We can then use the chain rule of Bayesian networks to substitute the correct values in, e.g.,
P(A=1,T=1,P=1)=P(P=1)×P(A=1)×P(T=1|P=1,A=1) This example of inter-causal reasoning meshes well with common sense: if we see a traffic jam, the probability that there was a car accident is relatively high. However, if we also see that the president is visiting town, we can reason that the president’s visit is the cause of the traffic jam; the probability that there was a car accident therefore drops correspondingly.