Conditional probability is hard; doctors, it seems, don't understand the implications of it for tests they do on a regular basis. In the new book The Numbers Game (it has a long subtitle) by Michael Blastland and Andrew Dilnot (see below; order here and I will donate the profit to charity)... they report that only two of twenty-four physicians could correctly work out the probability that a woman has breast cancer given that she tests positive on a mammogram.
Here are the details as they give them in the book; see if you can figure it out. We know that about 0.8%, or .008, of all women in the forty-to-fifty age group have the condition (I haven't checked that figure; that is eight out of every thousand women... will follow up, I promise). The test is 90% accurate at spotting a person who has the disease... if you have it, nine times out of ten they will catch it. The false positive rate is only 7%; if you don't have it, there is a 93% chance they will give you a clean bill of health. So you go to the doctor and he gives you that long face and says, "Sorry, you tested positive for breast cancer." Now the million-dollar question... what is the probability that you really have it? Out of all the women in the forty-to-fifty age group who get that mind-numbing statement, what percent REALLY have cancer?
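For readers who like symbols (my notation, not the book's), writing C for "has cancer" and + for "tests positive," the givens are

$$P(C) = 0.008, \qquad P(+ \mid C) = 0.90, \qquad P(+ \mid \overline{C}) = 0.07,$$

and the question asks for $P(C \mid +)$, which Bayes' theorem expresses as

$$P(C \mid +) = \frac{P(+ \mid C)\,P(C)}{P(+ \mid C)\,P(C) + P(+ \mid \overline{C})\,P(\overline{C})}.$$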
If you are in my stats class, or ever were, you should know how to solve that, and it seems a little shocking that so many doctors could not (the authors actually said, "Most were not only wrong, but hopelessly wrong.").
My way of addressing these kinds of problems is to make a table and figure out where all the labels go. The problem assumes that only 8 out of a thousand women really have it, so if we assume a million tests, 8,000 women really have cancer... 992,000 do not. We put this in the table.
Now let's analyze what the test says... of the 8,000 people who DO have cancer, the test will catch 90%, or 7,200, of these people (the real victims are the 800 who walk away thinking they are clear while the cancer goes untreated). So we put these numbers in the correct squares of the table.
Now we want to look at the much larger group who do not have cancer; even though the test is 93% accurate in identifying those who are free from cancer, it will falsely tell 7% of the group, or 69,440 of these people, that they tested positive for the cancer. We include this information in the table here.
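Filling in every cell (and assuming a round million women tested), the completed table looks like this:

                   Have cancer    No cancer        Total
    Test positive        7,200       69,440       76,640
    Test negative          800      922,560      923,360
    Total                8,000      992,000    1,000,000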
Now we want to look at just the people who got a notice that they tested positive, and we see that there are 76,640 of them; but most of them, 69,440, are false positives. Because such a large group does not have the disease, the false positives have swamped the true positives, and fewer than ten percent of these women (7,200 out of 76,640, or about 9.4%) really have cancer.
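For anyone who would rather let a computer do the bookkeeping, here is a quick Python sketch of the same counting argument (the variable names are mine):

```python
# Bayes' rule by brute-force counting, using the numbers from the post.
population = 1_000_000
prevalence = 0.008      # 8 in 1,000 women in the forty-to-fifty group
sensitivity = 0.90      # P(test positive | cancer)
false_pos_rate = 0.07   # P(test positive | no cancer)

have_cancer = population * prevalence             # 8,000
no_cancer = population - have_cancer              # 992,000

true_positives = have_cancer * sensitivity        # 7,200
false_negatives = have_cancer - true_positives    # 800
false_positives = no_cancer * false_pos_rate      # 69,440

all_positives = true_positives + false_positives  # 76,640
print(true_positives / all_positives)             # about 0.094 -- under 10%
```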
I hope that, if these numbers are anywhere near the true probability, when someone tests positive the doctor will at least say, "Hey, most of these are false positives... but just to be sure, let's do a second test... because the probability of another false positive is only 7% (it hasn't changed just because you already had one false positive)." Most of these people will get a clean bill of health on the second trial, but about 4,861 of them (7% of the 69,440 false positives) will still receive a second false positive, a number not far below the 6,480 true sufferers who test positive twice; even two positive tests in a row leave the probability of cancer at only about 57%... maybe there are a few questions you want to ask your G.P., like, "When was the last time you took a statistics course?"
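Here is the same second-test arithmetic in a short sketch, carrying the numbers over from the table above and keeping the post's assumption that the two tests are independent (an assumption the comments below take issue with):

```python
# A second, assumed-independent test for the 76,640 women who tested
# positive the first time.
sensitivity = 0.90
false_pos_rate = 0.07

true_positives = 7_200      # women with cancer who tested positive once
false_positives = 69_440    # healthy women who tested positive once

second_true_pos = true_positives * sensitivity       # 6,480
second_false_pos = false_positives * false_pos_rate  # about 4,861
print(second_true_pos / (second_true_pos + second_false_pos))  # about 0.57
```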
Ok, I checked and the rate seems to be even a little less than the .8% in the book. Even with that, breast cancer is one of the top three or four cancers for total number of years of life lost... so with knowledge of the possible errors, get a screening, but be ready to ask for a second (and third) test to confirm the results if they are positive.
I see variations on the same mistake all the time. Highly educated people often confuse P(A|B) with P(B|A). For example, p-values are the probability of the data given the null hypothesis, not the probability of the null hypothesis given the data. Very few people understand that.
This assumes independence between tests though. A false positive may be due to a logically imperfect test. The test looks for chemical indications of breast cancer, but these same conditions or chemicals might be present for other reasons. For example, one might fail a drug test because of a poppyseed muffin. The right chemicals are there, but they may not indicate what they are assumed to. For this reason, I would argue that a false positive cannot be assumed to occur randomly, and a woman whose test is a false positive is probably more than 7% likely to have another false positive. Whatever condition confounded the test the first time is (I suspect) likely to persist in the second test.
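A toy simulation makes this point concrete. The numbers here are entirely made up: suppose 5% of healthy women carry a persistent condition that trips the test 90% of the time, while the rest trip it only about 2.6% of the time, so the overall false positive rate still averages about 7%:

```python
import random

# Toy model: false positives cluster in a confounded subgroup, so a
# second false positive is far more likely than the headline 7% rate.
random.seed(1)
first_positives = 0
second_positives = 0

for _ in range(1_000_000):          # one million healthy women
    confounded = random.random() < 0.05
    p_false_pos = 0.90 if confounded else 0.0263
    if random.random() < p_false_pos:        # first test: false positive
        first_positives += 1
        if random.random() < p_false_pos:    # the condition persists
            second_positives += 1

print(second_positives / first_positives)    # about 0.59, far above 0.07
```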
Another point I have wondered about is that if 0.008 of the population have the condition, can we assume that 0.008 of the sample have that condition? What if ladies take the test because they suspect they may have breast cancer? Or, are the women who pursue routine checkups systematically healthier (or less healthy) than the population mean?
Nate is exactly right: false positives are almost never a random result, and so a follow-up positive is not independent of the original result.
For mammography, there are lots of potential complications (confounding variables) that can seem to indicate the presence of a cancer that isn't there, or obscure one that is.
Some researchers suggest that the true false positive rate may be upwards of 15% for these reasons.
But breast cancer is still one of the top three or four cancers in terms of years of life lost... so get the scan, and if you get a positive result, keep in mind that it is probably a false positive, and get another, and another if needed, to be sure.
Don't want to lose any of my readers.