Wednesday, 16 July 2008

BI-MODAL???


Several years ago at the AP Stats reading, I remember Ann Watkins (Cal State, Northridge) talking about how rare bimodal distributions are, or at least that lots of things we think might be bimodal really aren’t. The classic example we always talked about in class (before enlightenment) was the distribution of heights when both men and women were included.

Turns out it just isn’t so… according to Ann’s speech. So I took some statistics from the National Health Service in the US that said the average height of white males over 20 was 70.2 inches, and for females it was 64.6 inches. Then I pulled out my trusty TI-84. It seems incredible that almost none of the pages which give average height pay any attention to the standard deviation; but since the distributions shown led me to believe it was around 2.5 inches for both groups, I used a generous 3.0 inches for the std. dev. of both groups. In the first image shown, the two distributions are graphed separately with a window ranging from 60 to 75 inches, with a cursor on the male curve at 67 inches.

If we combine the two (I simply added the functions and divided by two), as if we measured the nation’s adults without regard to gender, the distribution looks like this:
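For anyone without a TI-84 handy, here is a rough Python sketch of the same "add the two curves and divide by two" step. The function names and the grid-based peak counter are my own illustration (assumptions, not what the calculator does internally); the means and the 3.0-inch standard deviation are the figures quoted above.

```python
import math

def normal_pdf(x, mean, sd):
    """Height of the normal curve with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def combined_pdf(x, mean_f=64.6, mean_m=70.2, sd=3.0):
    """Equal-weight blend: add the two curves and divide by two."""
    return (normal_pdf(x, mean_f, sd) + normal_pdf(x, mean_m, sd)) / 2

def count_peaks(f, lo, hi, step=0.01):
    """Count local maxima of f on an evenly spaced grid from lo to hi."""
    n = int(round((hi - lo) / step))
    ys = [f(lo + i * step) for i in range(n + 1)]
    # A grid point is a peak if it rises from the left and does not rise to the right.
    return sum(1 for i in range(1, n) if ys[i - 1] < ys[i] >= ys[i + 1])

print(count_peaks(combined_pdf, 50.0, 85.0))  # a single hump
```

With these inputs the blended curve has just one peak, sitting about halfway between the two means, so a grid search like this is one way to double-check what the graph suggests.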

So how much would we have to separate them to get a double mode? Well, a lot, it seems. Remember that the two means, 70.2 and 64.6, are already almost two standard deviations apart. I decided to see what would happen if we made the males even taller, so I moved them up to THREE standard deviations above the female height, 73.6 inches (every girl’s dream, all the guys are over six feet). The resulting combined distribution looked like this.
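The "how far apart?" question can also be poked at numerically. A standard result for an equal-weight blend of two normal curves with the same standard deviation is that a second mode appears only once the means are more than two standard deviations apart. The sketch below is my own illustration (the helper names and the list of gaps tried are assumptions); it counts peaks for a few gaps, keeping the female mean at 64.6 and the standard deviation at 3.0.

```python
import math

def blend(x, mean_f, mean_m, sd):
    """Equal-weight blend of two normal curves sharing one sd."""
    def pdf(mean):
        z = (x - mean) / sd
        return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))
    return 0.5 * (pdf(mean_f) + pdf(mean_m))

def peaks_for_gap(gap_in_sd, mean_f=64.6, sd=3.0, step=0.01):
    """Count humps when the male mean sits gap_in_sd sds above the female mean."""
    mean_m = mean_f + gap_in_sd * sd
    lo, hi = mean_f - 5 * sd, mean_m + 5 * sd
    n = int(round((hi - lo) / step))
    ys = [blend(lo + i * step, mean_f, mean_m, sd) for i in range(n + 1)]
    return sum(1 for i in range(1, n) if ys[i - 1] < ys[i] >= ys[i + 1])

for gap in (1.5, 5.6 / 3.0, 2.5, 3.0):
    print(round(gap, 2), peaks_for_gap(gap))
```

The real-data gap (5.6 inches, a bit under two standard deviations) gives one hump, while the 2.5 and 3.0 gaps give two, which fits the two-standard-deviation rule of thumb.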

OK, so why bring this up now? Well, I recently read a report about the entry-level salaries of newly minted lawyers (and minted is right for most of them). Most lawyers start out in private practice and make a pretty good salary (way more than teachers), but some start out in public law and often (almost always) make considerably less. The posted image of the distribution is shown below, and it is one of the most striking bimodal distributions I have ever seen for real data.

Yeah, you want to know… the lower mode is a bridge between $40,000 and $50,000 (the data seems to have been reported in $5,000 increments). This was 11% of the total data set. And the fat cats? Their modal hump was at $135,000 to $145,000 (fresh out of the box... wow!) with about 17% of the data set.

OK… TRUTH in Statistics time… data like this is biased by selective reporting. Some of the folks who got low salaries just didn’t send theirs in (“Yes, I have the lowest starting salary of anyone in my class”). Others may have lied (yes, people do that) and made theirs a little higher than the truth; but even with very accurate reporting, this looks like one of those true bimodal distributions. OK, keep your eye out for others like this, and when you find them, send me a link… (wow, only a month till school starts…)
