Sunday 2 September 2007

Contrary to Popular Opinion, You can NOT prove ANYTHING (to most people) with Statistics.

Honest, the first day back at school, and barely halfway through the day, at a school improvement meeting, someone quotes the oft-cited, “You can prove anything with statistics.” I’ve always felt that was untrue, and held a sort of Lincoln-ish idea that “You can fool some people with bad statistics all the time, and you can fool lots of people with bad statistics most of the time, but you can’t fool everybody with bad statistics all of the time.” But today I found evidence to suggest that in fact, you may not be able to prove anything to the general public with statistics.
OK, so it is the start of the year and I’m searching out stories with data for my classes, and along the way I came across studies (good ones mostly) that suggest a whole bunch of stuff that you probably couldn’t “prove” to folks in general. Here are four in particular that might seem to go against the “common intelligence”, and some of the evidence that supports them. How many do you believe?
1) Participation in Sport may actually increase crime, and most certainly does not reduce it!

2) Seventy-five thousand people (raise your hand if that is more than the population of your home town) in the US could have been saved from Hospital caused Death if Doctors had applied an inexpensive treatment that has been available (in Medical Journals) since 2004.

3) Your GM and Ford car dealer has the option of marking up the rate on your financing above the rate that GMAC or FOMOCO would offer you, and the lender gives the dealer a big percentage for cheating you. Average jack-up for white males is about $300, and minorities get soaked two to three times as much. (Think, before you read below, about how you might find this out with statistics.)

4) The number of citations a law review article gets is reduced if the length of the title is long, or (wait for it) if the article has an equation, it gets cited way less (OK, math teachers knew this already).

1) This is one that has come up several times in the past few years, the most recent by a French political scientist, Sebastien Roche. He has come out in opposition to the proposition that sport is a constructive socializing tool for young people. He was writing in opposition to a French Govt. motion to increase funding for sports in neighborhoods that were involved in the rioting in 2005. His research suggests: “the practice of sport never reduces the number of crimes” and, he extends the statement to add that sports “give the opportunity to develop physical abilities useful for street crime: running, how to use impulsive behaviour, how to master the use of force.” He also points out that a large percentage (so large perhaps, it was not included) of the young people interviewed
WOW, but wait, in case anyone on the coaching staff is still talking to me, he also wrote, “As well as sport can be a time of gathering, a time to party, practising sport can involve moments of high emotion that can push people to get their revenge after a humiliating loss or when they consider that the referees are responsible for their loss. This is the logic of the hooliganism: “They won but didn’t deserve it. Justice must be done”. For this young and alert population, it is the logic of the escalation of violence”.

There is even a (tongue in cheek) video

Items 2-4 all came about from a new field of statistics and economics made possible by the power of computing and the availability of large amounts of data about… well, about everything… The process is called “data mining”, and the examples come from a new book called Super Crunchers: Why Thinking-By-Numbers Is the New Way to Be Smart.

The item on life-saving (or the failure to save) health practices comes from an article at the Freakonomics blog site, which states:
“In December 2004, [Berwick] brazenly announced a plan to save 100,000 lives over the next year and a half. The “100,000 Lives Campaign” challenged hospitals to implement six changes in care to prevent avoidable deaths. He wasn’t looking for subtle or sophisticated changes. He wasn’t calling for increased precision in surgical operations. No … he wanted hospitals to change some of their basic procedures.

Berwick’s most surprising suggestion, however, is the one with the oldest pedigree. He noticed that thousands of ICU patients die each year from infections after a central line catheter is placed in their chests. About half of all intensive care patients have central line catheters, and ICU infections are deadly (carrying mortality rates of up to 20 percent). He then looked to see if there was any statistical evidence of ways to reduce the chance of infection. He found a 2004 article in Critical Care Medicine that showed that systematic hand-washing (combined with a bundle of improved hygienic procedures such as cleaning the patient’s skin with an antiseptic called chlorhexidine) could reduce the risk of infection from central-line catheters by more than 90 percent. Berwick estimated that if all hospitals just implemented this one bundle of procedures, they might be able to save as many as 25,000 lives per year.”

The item on car loan rates comes from the same book, where the author explains the process:
many do not know that auto lenders, such as Ford Motor Credit or GMAC, often give dealers the option of marking up a borrower’s interest rate. When a car buyer works with the dealer to arrange financing, the dealer normally sends the customer’s credit information to a potential lender. The lender then responds with a private message to the dealer that offers a “buy rate” — the interest rate at which the lender is willing to lend. Lenders will often pay a dealer — sometimes thousands of dollars — if the dealer can get the consumer to sign a loan with an inflated interest rate …

The researchers used the fact that many states now allow you to pay a fee to obtain driver's license information, and plugged in the ethnicity of thousands of buyers to get the minority differences. And they wonder why people don’t trust car dealers!
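
For anyone who wants to try this with a class, the comparison boils down to something like the sketch below. Everything in it is made up for illustration: the records, the dollar figures, and the group labels "A" and "B" are hypothetical, not the study's data, and the function name is my own. The idea is simply that once each loan has a markup (what the buyer paid beyond what the lender's buy rate would have cost) and a group label, you average by group and compare.

```python
from collections import defaultdict

# Hypothetical loan records, invented for illustration only (not data from the book).
# "markup" is the extra dollars paid above what the lender's buy rate would have cost.
loans = [
    {"group": "A", "markup": 250.0},
    {"group": "A", "markup": 350.0},
    {"group": "B", "markup": 700.0},
    {"group": "B", "markup": 900.0},
]

def average_markup_by_group(records):
    """Return {group: mean markup} for a list of loan records."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        totals[rec["group"]] += rec["markup"]
        counts[rec["group"]] += 1
    return {group: totals[group] / counts[group] for group in totals}

averages = average_markup_by_group(loans)
for group, mean in sorted(averages.items()):
    print(f"group {group}: average markup ${mean:.2f}")
```

A real study would also have to control for things like credit score and loan size before blaming the gap on the dealer; this sketch only shows the first step.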

The section of the book that deals with citations reports that the number of citations (references in other articles) a law review article gets is heavily correlated with a small set of variables. They found, in general:

Articles with shorter titles and fewer footnotes were cited significantly more, whereas articles that included an equation or an appendix were cited a lot less. Longer articles were cited more, but the regression formula predicted that citations per page peak for articles that were a whopping fifty-three pages long…. (you see, the rate of increase decreases until a point is reached at which the derivative of the citation function is equal to zero…….WAKE UP!)
Law review editors who want to maximize their citation rates should also avoid publishing criminal and labor law articles, and focus instead on constitutional law. And they should think about publishing more women. White women were cited 57 percent more often than white men, and minority women were cited more than twice as often. (Hey, one for the ladies...and you thought we weren't paying attention)
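
For the two of you still awake, here is a minimal sketch of that derivative argument. It assumes, purely for illustration, that citations per page follow a quadratic in page length; the book excerpt does not give the actual regression formula, and the coefficients A, B, and C below are hypothetical numbers picked only so the peak lands at 53 pages.

```python
# Hypothetical coefficients, chosen only so the peak lands at 53 pages;
# they are NOT the regression estimates from the book.
A, B, C = 0.10, 0.0106, -0.0001

def citations_per_page(pages):
    """Assumed quadratic model: rate = A + B*pages + C*pages**2."""
    return A + B * pages + C * pages ** 2

def peak_pages(b, c):
    """Page length where the derivative b + 2*c*pages equals zero (needs c < 0)."""
    if c >= 0:
        raise ValueError("no interior maximum unless c is negative")
    return -b / (2 * c)

peak = peak_pages(B, C)
print(f"predicted peak at about {peak:.1f} pages")

# Sanity check: the assumed rate is lower five pages to either side of the peak.
lower_on_both_sides = (
    citations_per_page(peak - 5) < citations_per_page(peak)
    and citations_per_page(peak + 5) < citations_per_page(peak)
)
print("lower on both sides of the peak:", lower_on_both_sides)
```

Setting the derivative B + 2*C*pages to zero gives pages = -B / (2*C), which is all the "WAKE UP" parenthetical is saying.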

I figure it is the equation aversion that probably explains why doctors were not aware of the new catheter cleaning details: the authors used statistics to explain the results. And I think it is why you can not prove ANYTHING to most folks with statistics. Proof comes from a process nested in trust, and the average person deals with statistics from a position of fear, loathing, and occasionally irrational hatred. Hence, you explain about the distribution of the data, talk a little about correlation coefficients, and their eyes are rolling back in their heads as they contemplate chocolate sauce drizzling over a mountain of ice cream... (don't laugh, don't you know we're in the middle of an Obesity Epidemic?)

Oh well, Smile, School starts on Monday....
