Understanding the chain rule in probability theory

Writer Andrew Mclaughlin
$\begingroup$

When my teacher introduced the chain rule I found it quite easy, but when I try to prove something with it I get confused about which forms of the rule are allowed. For example, I can't see why the following holds:

$$ p(x,y\mid z)=p(y\mid z)p(x\mid y,z) $$

I cannot see how one arrives at this equation from the general rule. Can you please help me think about this rule correctly?


I found this post useful for my question:

Is order of variables important in probability chain rule

$\endgroup$

1 Answer

$\begingroup$

$$p(x,y|z) = \frac{p(x,y,z)}{p(z)} = \frac{p(x|y,z)p(y,z)}{p(z)} = p(x|y,z)p(y|z)$$

In the first step we use the definition of conditional probability. In the second step we apply the same definition to the numerator, splitting the joint probability $p(x,y,z)$ into a conditional $p(x|y,z)$ and a joint $p(y,z)$. Finally, we divide $p(y,z)$ by $p(z)$, applying the definition of conditional probability once more, and obtain the result. Each step assumes the quantity in the denominator is nonzero.
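If you want to convince yourself numerically, here is a minimal sketch in Python. The binary variables, the random joint table, and the helper names (`p_z`, `p_yz`, `chain_rule_gap`) are my own illustrative assumptions and not part of the question; the code just evaluates both sides of the identity for every assignment.

```python
import itertools
import random

random.seed(0)  # fixed seed so the check is reproducible

# Assumed setup: three binary variables x, y, z with a random joint p(x, y, z).
states = [0, 1]
weights = {xyz: random.random() + 0.1 for xyz in itertools.product(states, repeat=3)}
total = sum(weights.values())
p_xyz = {xyz: w / total for xyz, w in weights.items()}  # normalised joint table


def p_z(z):
    """Marginal p(z): sum the joint over x and y."""
    return sum(p_xyz[(x, y, z)] for x in states for y in states)


def p_yz(y, z):
    """Marginal p(y, z): sum the joint over x."""
    return sum(p_xyz[(x, y, z)] for x in states)


def chain_rule_gap(x, y, z):
    """Return |p(x,y|z) - p(x|y,z) p(y|z)| for one assignment."""
    lhs = p_xyz[(x, y, z)] / p_z(z)               # p(x, y | z)
    p_x_given_yz = p_xyz[(x, y, z)] / p_yz(y, z)  # p(x | y, z)
    p_y_given_z = p_yz(y, z) / p_z(z)             # p(y | z)
    return abs(lhs - p_x_given_yz * p_y_given_z)


# Largest discrepancy over all eight assignments of (x, y, z).
max_gap = max(chain_rule_gap(x, y, z) for x, y, z in itertools.product(states, repeat=3))
print(max_gap < 1e-12)
```

The two sides should agree up to floating-point rounding, because the factor $p(y,z)$ cancels exactly as in the derivation above.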

Another way of looking at it is that you can temporarily ignore any variable that appears to the right of the conditioning bar in every term. Without $z$, the expression is just the usual product rule:

$$p(x,y) = p(x|y)p(y)$$

Now condition every one of these probabilities on $z$, that is, add $z$ to the right of each bar, and you get your original formula.
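The same reasoning extends to more than two variables. As a sketch, in my own notation ($x_1,\dots,x_n$ do not appear in the question) and assuming every conditioning event has nonzero probability, the general chain rule with everything conditioned on $z$ reads:

$$p(x_1,\dots,x_n\mid z)=\prod_{i=1}^{n}p(x_i\mid x_1,\dots,x_{i-1},z)$$

Taking $n=2$ with $x_1=y$ and $x_2=x$ gives $p(y,x\mid z)=p(y\mid z)\,p(x\mid y,z)$, which is your formula since $p(y,x\mid z)=p(x,y\mid z)$.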

$\endgroup$
