
How is the definition of joint probability not circular? According to the chain rule: $$ P(A,B) = P(A \cap B) = P(A \mid B)P(B) $$ Yet, according to Wikipedia, the Kolmogorov definition of conditional probability is (axiomatically) as follows: $$ P(A \mid B) = \frac{P(A \cap B)}{P(B)} $$ Isn't this circular? What is the rigorous definition of $P(A \cap B)$ in terms of, say, two random variables $X$ and $Y$ that we stipulate take the values $A$ and $B$ in their respective ranges?

You define conditional probability by the Kolmogorov definition, which requires $P(B) > 0$. Multiplying both sides of that definition by $P(B)$ proves $P(A\cap B) = P(A\mid B)P(B)$. Then you just have to define what you mean when you write $P(A,B)$. You've defined it to mean $P(A,B)=P(A\cap B)$. Thus, from the definition of conditional probability and the definition of $P(A,B)$, you can prove the relation $P(A,B)=P(A\mid B)P(B)$. There's nothing circular here: $P(A\cap B)$ is never defined in terms of $P(A\mid B)$. We're just using definitions!
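To see the order of the definitions at work, here is a small worked example. The fair six-sided die and the particular events $A$ and $B$ are assumptions chosen for illustration, not part of the question. Take $A = \{2,4,6\}$ and $B = \{4,5,6\}$, so $A\cap B = \{4,6\}$. The probability of the intersection is computed directly from the measure, then the conditional probability is defined from it, and only then does the product rule follow:

$$ P(A\cap B) = \tfrac{2}{6} = \tfrac{1}{3}, \qquad P(B) = \tfrac{3}{6} = \tfrac{1}{2}, \qquad P(A\mid B) = \frac{P(A\cap B)}{P(B)} = \frac{1/3}{1/2} = \tfrac{2}{3}, \qquad P(A\mid B)P(B) = \tfrac{2}{3}\cdot\tfrac{1}{2} = \tfrac{1}{3} = P(A\cap B) $$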

Note that we don't need a separate "rigorous definition" of $P(A\cap B)$. We already have a rigorous definition of what $P(C)$ means for any event $C$ (a measurable set of outcomes). $A\cap B$ is just another such set to plug into that definition: set $C=A\cap B$, and you've got your rigorous definition.
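For the random-variable version of the question, one common reading is that "$X$ takes the value $a$" is the event $\{\omega : X(\omega) = a\}$, and the joint event is the intersection of two such sets. The sketch below illustrates this on a finite sample space. Everything in it is an assumption made for illustration: the two fair dice, the uniform measure, the choice of `X` (first die) and `Y` (sum of both dice), and the helper names `prob` and `cond_prob`.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered outcomes of two fair dice (illustrative assumption).
omega = set(product(range(1, 7), repeat=2))


def prob(event):
    """P(C) for any subset C of omega, under the uniform measure."""
    return Fraction(len(event & omega), len(omega))


def cond_prob(a, b):
    """Kolmogorov definition: P(A | B) = P(A ∩ B) / P(B), for P(B) > 0."""
    p_b = prob(b)
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return prob(a & b) / p_b


# Random variables are just functions on the sample space.
def X(w):
    return w[0]          # value of the first die


def Y(w):
    return w[0] + w[1]   # sum of both dice


# "X = 3" and "Y = 8" are events, i.e. subsets of omega.
A = {w for w in omega if X(w) == 3}
B = {w for w in omega if Y(w) == 8}

# The joint probability is P applied to the set A ∩ B; no conditioning needed.
joint = prob(A & B)

# The product rule then follows from the definition of cond_prob.
assert joint == cond_prob(A, B) * prob(B)
```

Note that `prob(A & B)` is evaluated without ever calling `cond_prob`, which is the sense in which the definition is not circular.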
