Stats: Conditional Probability
Recall that the probability of an event occurring given that another event has already occurred is called a conditional probability.
The probability that event B occurs, given that event A has already occurred, is
P(B|A) = P(A and B) / P(A)
This formula comes from the general multiplication rule, P(A and B) = P(A) * P(B|A), and a little bit of algebra: divide both sides by P(A).
Since we are given that event A has occurred, we have a reduced sample space. Instead of the entire sample space S, the sample space is now A, since we know A has occurred. The old rule, that a probability is the number of outcomes in the event divided by the number of outcomes in the sample space, still applies: it is the number in A and B (the outcome must be in A, since A has occurred) divided by the number in A. If you then divide the numerator and denominator of the right-hand side by the number in the sample space S, you have the probability of A and B divided by the probability of A.
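The reduced-sample-space argument can be checked with a few lines of Python. This is only a sketch: the counts below are made up for illustration (they do not come from any example in these notes), and the variable names are hypothetical.

```python
from fractions import Fraction

# Hypothetical counts, chosen only for illustration.
n_S = 50         # outcomes in the whole sample space S
n_A = 20         # outcomes in event A
n_A_and_B = 5    # outcomes in both A and B

# Reduced sample space: the number in (A and B) divided by the number in A.
by_counts = Fraction(n_A_and_B, n_A)

# Divide numerator and denominator by n_S to work with probabilities instead.
p_A = Fraction(n_A, n_S)
p_A_and_B = Fraction(n_A_and_B, n_S)
by_probabilities = p_A_and_B / p_A

# Both routes give the same conditional probability P(B|A).
print(by_counts, by_probabilities)
```

Exact fractions are used so the two results can be compared without rounding.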
The question, "Do you smoke?" was asked of 100 people. Results are shown in the table.
Gender    Yes    No    Total
Male       19    41       60
Female     12    28       40
Total      31    69      100
• What is the probability of a randomly selected individual being a male who smokes? This is just a joint probability: the number of "Male and Smoke" divided by the total = 19/100 = 0.19.
• What is the probability of a randomly selected individual being a male? This is a marginal probability: the total for male divided by the total = 60/100 = 0.60. Since no mention is made of smoking or not smoking, it includes all the cases.
• What is the probability of a randomly selected individual smoking? Again, since no mention is made of gender, this is a marginal probability, the total who smoke divided by the total = 31/100 = 0.31.
• What is the probability of a randomly selected male smoking? This time, you're told that you have a male, so the sample space is reduced to the 60 males - think of stratified sampling. What is the probability that the male smokes? Well, 19 males smoke out of 60 males, so 19/60 = 0.3167 (approx).
• What is the probability that a randomly selected smoker is male? This time, you're told that you have a smoker and asked to find the probability that the smoker is also male. There are 19 male smokers out of 31 total smokers, so 19/31 = 0.6129 (approx).
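The five answers above can be reproduced from the table counts. The following is a minimal sketch, not a required method for this class; the dictionary layout and variable names are my own choices.

```python
from fractions import Fraction

# Counts from the "Do you smoke?" table: counts[gender][answer].
counts = {
    "Male": {"Yes": 19, "No": 41},
    "Female": {"Yes": 12, "No": 28},
}

total = sum(sum(row.values()) for row in counts.values())   # everyone surveyed
males = sum(counts["Male"].values())                        # row total
smokers = sum(row["Yes"] for row in counts.values())        # column total

p_male_and_smoker = Fraction(counts["Male"]["Yes"], total)  # joint
p_male = Fraction(males, total)                             # marginal
p_smoker = Fraction(smokers, total)                         # marginal
p_smoker_given_male = p_male_and_smoker / p_male            # conditional
p_male_given_smoker = p_male_and_smoker / p_smoker          # conditional

print(p_male_and_smoker, p_male, p_smoker)
print(p_smoker_given_male, p_male_given_smoker)
```

Each conditional probability is the same joint probability divided by a different marginal, which is why the last two answers differ.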
After that last part, you have just worked a Bayes' Theorem problem. I know you didn't realize it - that's the beauty of it. A Bayes' problem can be set up so it appears to be just another conditional probability. In this class we will treat Bayes' problems as another conditional probability and not involve the large messy formula given in the text (and every other text).
There are three major manufacturing companies that make a product: Aberations, Brochmailians, and Chompieliens. Aberations has a 50% market share, and Brochmailians has a 30% market share. 5% of Aberations' product is defective, 7% of Brochmailians' product is defective, and 10% of Chompieliens' product is defective.
This information can be placed into a joint probability distribution:
Company          Good                    Defective             Total
Aberations       0.50 - 0.025 = 0.475    0.05(0.50) = 0.025    0.50
Brochmailians    0.30 - 0.021 = 0.279    0.07(0.30) = 0.021    0.30
Chompieliens     0.20 - 0.020 = 0.180    0.10(0.20) = 0.020    0.20
Total            0.934                   0.066                 1.00
The market share for Chompieliens wasn't given, but since the marginals must add to 1.00, they have a 20% market share.
Notice that the 5%, 7%, and 10% defective rates don't go into the table directly. This is because they are conditional probabilities and the table is a joint probability table. These defective probabilities are conditional upon which company made the product. That is, the 7% is not P(Defective), but P(Defective|Brochmailians). The joint probability P(Defective and Brochmailians) = P(Defective|Brochmailians) * P(Brochmailians).
The "good" probabilities can be found by subtraction as shown above, or by multiplication using conditional probabilities. If 7% of Brochmailians' product is defective, then 93% is good. 0.93(0.30)=0.279.
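As a rough sketch of how the joint table is filled in, the code below multiplies each conditional defect rate by the company's market share. The dictionary names (share, defect_rate, joint) are my own and are not part of the notes.

```python
from fractions import Fraction

# Market shares P(company); the third share is whatever is left over.
share = {"Aberations": Fraction(50, 100), "Brochmailians": Fraction(30, 100)}
share["Chompieliens"] = 1 - sum(share.values())

# Conditional defect rates P(Defective | company).
defect_rate = {
    "Aberations": Fraction(5, 100),
    "Brochmailians": Fraction(7, 100),
    "Chompieliens": Fraction(10, 100),
}

# Joint probabilities: P(company and Defective) = P(Defective | company) * P(company),
# and likewise for Good using the complement of the defect rate.
joint = {
    name: {
        "Good": (1 - defect_rate[name]) * share[name],
        "Defective": defect_rate[name] * share[name],
    }
    for name in share
}

for name, row in joint.items():
    print(name, float(row["Good"]), float(row["Defective"]), float(share[name]))
```

The "Good" entries here use the multiplication route; subtracting the defective entry from the market share gives the same values.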
• What is the probability a randomly selected product is defective? P(Defective) = 0.066
• What is the probability that a defective product came from Brochmailians? P(Brochmailians|Defective) = P(Brochmailians and Defective) / P(Defective) = 0.021/0.066 = 7/22 = 0.318 (approx).
• Are these events (which company made the product, and whether it is defective) independent? No. If they were, then P(Brochmailians|Defective) = 0.318 would have to equal P(Brochmailians) = 0.30, but it doesn't. Also, P(Aberations and Defective) = 0.025 would have to equal P(Aberations)*P(Defective) = 0.50*0.066 = 0.033, but it doesn't.
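These three answers can be read off the table with a short script. This is a sketch under the assumption that the joint and marginal values are typed in from the table by hand; the variable names are hypothetical.

```python
from fractions import Fraction

# Marginals P(company) and joints P(company and Defective), copied from the table.
p_company = {
    "Aberations": Fraction(500, 1000),
    "Brochmailians": Fraction(300, 1000),
    "Chompieliens": Fraction(200, 1000),
}
p_company_and_defective = {
    "Aberations": Fraction(25, 1000),
    "Brochmailians": Fraction(21, 1000),
    "Chompieliens": Fraction(20, 1000),
}

# Marginal P(Defective): add down the Defective column.
p_defective = sum(p_company_and_defective.values())

# Conditional P(Brochmailians | Defective): joint divided by marginal.
p_b_given_d = p_company_and_defective["Brochmailians"] / p_defective

# Independence would need joint == product of marginals for every company.
independent = all(
    p_company_and_defective[c] == p_company[c] * p_defective for c in p_company
)

print(float(p_defective), p_b_given_d, independent)
```

A single cell where the joint differs from the product of the marginals is enough to rule out independence.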
The second question asked above is a Bayes' problem. Again, my point is, you don't have to know Bayes' formula just to work a Bayes' problem.
However, just for the sake of argument, let's say that you want to know what Bayes' formula is.
Let's use the same example, but shorten each event to its one-letter initial, i.e., A, B, C, and D instead of Aberations, Brochmailians, Chompieliens, and Defective.
P(D|B) is not a Bayes' problem. This is given in the problem. Bayes' formula finds the reverse conditional probability P(B|D).
It is based on the fact that the given event (D) is made of three parts: the part of D in A, the part of D in B, and the part of D in C.
P(B|D) = P(B and D) / [ P(A and D) + P(B and D) + P(C and D) ]
Inserting the multiplication rule for each of these joint probabilities gives
P(B|D) = P(D|B)*P(B) / [ P(D|A)*P(A) + P(D|B)*P(B) + P(D|C)*P(C) ]
However, and I hope you agree, it is much easier to take the joint probability divided by the marginal probability. The table does the adding for you and makes the problems doable without having to memorize the formulas.
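For anyone who wants to see the formula route worked out, here is a minimal sketch. The function name bayes and its parameters are hypothetical, chosen for this illustration; it simply evaluates Bayes' formula with the market shares as P(A), P(B), P(C) and the defect rates as P(D|A), P(D|B), P(D|C).

```python
from fractions import Fraction


def bayes(prior, likelihood, target):
    """Return P(target | D) using Bayes' formula.

    prior[h] is P(h); likelihood[h] is P(D | h).
    """
    numerator = likelihood[target] * prior[target]
    # The denominator adds up the part of D in each company, i.e. P(D).
    denominator = sum(likelihood[h] * prior[h] for h in prior)
    return numerator / denominator


prior = {"A": Fraction(50, 100), "B": Fraction(30, 100), "C": Fraction(20, 100)}
likelihood = {"A": Fraction(5, 100), "B": Fraction(7, 100), "C": Fraction(10, 100)}

p_b_given_d = bayes(prior, likelihood, "B")
print(p_b_given_d)  # 7/22
```

The result matches the 0.021/0.066 answer from the table, since the denominator in the formula is just the Defective column total.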