Tuesday 5 March 2019

Islands in the Mist, ---- of Polynomials, and Pretty Geometry



I once read a description of math as like seeing islands in a great ocean covered by a mist. As you learn the subject you work around on an island and clear away some of the mist. Often your education jumps from one island to another at the direction of a teacher and eventually you have mental maps of parts of many separate islands. But at some point, you clear away a fog on part of an island and see it connects off to another island you had partially explored, and now you know something deeper about both islands and the connectedness of math.

I was recently reminded of one of those connections, one that ties together several varied topics from the high school education of most good math students. It starts with that over-criticized (and, I think, under-appreciated) Algebra I technique of factoring.
Almost every student in introductory algebra is introduced to a "sum and product" rule that relates the roots of a simple quadratic (one with a quadratic coefficient of one) to its coefficients. The rule says that if the roots are at p and q, then the linear coefficient will be the negative of p+q, and the constant term will be their product, pq. So, for example, the simple quadratic with roots at x=2 and x=3 will be \(x^2 - 5x + 6\).

I know from experience that if you take a cross section of 100 students who enter calculus classes after two or more years of algebra, very few will know that you can extend that idea to cubics and higher-degree polynomials. An example for a polynomial with four roots will probably suffice for most to understand. Because the constant terms in linear factors are always the opposite of the roots {if 3 is a root, (x-3) is a factor}, it is easiest to negate all the roots before doing the math involved (at least for me it always was).
So if we wanted to find the simple polynomial with roots at -1, -2, -3, and -4 (chosen so all the multipliers are positive), we would find that the fourth-degree polynomial will have 10 for the coefficient of \(x^3\) because 1+2+3+4 = 10, just as it works in the second term of a quadratic. After that, the method starts to combine sets of them. The next coefficient will be the sum of the products of each pair of negated roots. In the example I created we would add 1·2 + 1·3 + 1·4 + 2·3 + 2·4 + 3·4 to get \(35x^2\). The next term sums all triple products of the numbers, 1·2·3 + 1·2·4 + 1·3·4 + 2·3·4 = 50, for the linear coefficient. And in the constant term, we simply multiply all of them together to get 24. The full polynomial is \(x^4 + 10x^3 + 35x^2 + 50x + 24\).
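For anyone who wants to check the arithmetic, here is a minimal Python sketch of the rule just described: negate the roots, then sum the products of every pair, triple, and so on. The function name `coefficients_from_roots` is only an illustrative choice.

```python
from itertools import combinations
from math import prod

def coefficients_from_roots(roots):
    """Coefficients (highest power first) of the monic polynomial with these roots."""
    negated = [-r for r in roots]  # "negate all the roots before doing the math"
    coeffs = [1]
    for k in range(1, len(negated) + 1):
        # Sum of the products of every k-element subset of the negated roots
        coeffs.append(sum(prod(subset) for subset in combinations(negated, k)))
    return coeffs

print(coefficients_from_roots([-1, -2, -3, -4]))  # [1, 10, 35, 50, 24]
print(coefficients_from_roots([2, 3]))            # [1, -5, 6]
```

The second call reproduces the quadratic with roots at 2 and 3, so the familiar sum and product rule is just the two-root case of the same recipe.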

After you've carried that around for a while and maybe forgotten how to get all the other terms, the easy part may remain: the second coefficient is the sum of the opposites of the roots, and the constant term is the product of those opposites. Then you get to calculus and you learn how easy it is to take the derivative of a polynomial. Then maybe you are playing around with some simple derivatives and you realize that a function \(f(x) = x^n + Ax^{n-1} + \dots\) will have a derivative \(f'(x) = nx^{n-1} + A(n-1)x^{n-2} + \dots\). You realize that if f(x) has roots that sum to -A, then f'(x) has roots that sum to -(n-1)A/n [If you're younger and this seems unclear, note that the roots of f(x) are the same as the roots of n·f(x), so dividing f'(x) through by n does not change its roots; for example, \(y = x^2 - 1\) has the same roots (±1) as \(2x^2 - 2\) or \(3x^2 - 3\), etc.].

Much later, you come back across this thought, but because you are at a different place in your understanding of math, you realize that means that the average of the zeros of f(x) is -A/n, because there are n of them. And the average of the zeros of f'(x) must also be -A/n, because there are n-1 of them and they sum to -(n-1)A/n. Since f''(x) is related to f'(x) by this same method, -A/n must be the average of the zeros of every derivative of f(x) that does not descend to a constant.
Because that seems too glib to pass muster with most of my students, here is an example of these last two paragraphs, to show how interrelated they are.  Take the example \( f(x) = x^4 + 3x^3 + 7x^2 + 2x + 4 \).  We simply inspect to see that the roots have a sum of -3, and since there are four of them, their average is -3/4.  Without computing the derivative, we know the roots of f' will sum to \( \frac{3(-3)}{4} = \frac{-9}{4} \), and since there are three of them, their average is ...yeah... -3/4.  We can find f'' and the rest by continuing this, but the big flashing light here is that the average stays the same, so the sum of the roots of any derivative is just the average root times the degree of that derivative.
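The same example can be checked mechanically. This is a small Python sketch, assuming coefficients are listed from the highest power down and using the sign convention that the roots of \(x^n + Ax^{n-1} + \dots\) sum to -A; the helper names `derivative` and `mean_of_roots` are illustrative only.

```python
from fractions import Fraction

def derivative(coeffs):
    """Differentiate a polynomial given as coefficients, highest power first."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def mean_of_roots(coeffs):
    """Average of the roots, read off the two leading coefficients only."""
    n = len(coeffs) - 1
    return Fraction(-coeffs[1], n * coeffs[0])

p = [1, 3, 7, 2, 4]  # x^4 + 3x^3 + 7x^2 + 2x + 4
means = []
while len(p) >= 2:   # stop once the derivative is a constant
    means.append(mean_of_roots(p))
    p = derivative(p)

print(means)  # four copies of Fraction(-3, 4): f, f', f'' and f'''
```

No roots are ever computed here; the average comes straight from the first two coefficients, which is the whole point of the observation.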

 You smugly nest that away in your mind and go on about your business, occasionally refreshing it by relating it to a friend or colleague in the coffee shop or at a conference.

Someday down the line you wonder, or someone you relate it to asks, whether that will work with polynomials that have complex roots, and you quickly convince yourself that it will, and feel pretty smug for knowing all this. Then you stumble across an old copy of Professor Dan Kalman's paper on Marden's Theorem (at least you will if you are as lucky as I was). (Professor Kalman was awarded the 2009 Lester R. Ford Award of the MAA for his 2008 paper on this theorem. Jörg Siebeck discovered this theorem 81 years before Morris Marden wrote about it (1945). However, Prof. Kalman writes, "I call this Marden’s Theorem because I first read it in M. Marden’s wonderful book".) The theorem says that if you take a cubic with complex roots that are not collinear (even if the coefficients are complex numbers), there is a really beautiful geometric tie-in to the average idea, and more.

I will illustrate with an example that is easy to picture. Suppose we take a cubic with roots of 2+5i, 2-5i and 6, \(f(x) = x^3 - 10x^2 + 53x - 174\). The derivative will be \(3x^2 - 20x + 53\), with zeros at the complex conjugates \(x = \frac{1}{3}(10 - i\sqrt{59})\) and \(x = \frac{1}{3}(10 + i\sqrt{59})\). We can quickly see that both f and f' have an average of 10/3 for their zeros, but these first-derivative zeros will play a special geometric role a little later in Marden's theorem.

The second derivative of the original cubic gives us 6x-20, with a zero which agrees with the average of the zeros above.
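A quick numerical check of this example, as a sketch using only Python's standard library. The variable names are illustrative, and the derivative zeros come from the ordinary quadratic formula.

```python
import cmath

def f(z):
    return z ** 3 - 10 * z ** 2 + 53 * z - 174

roots = [2 + 5j, 2 - 5j, 6 + 0j]

# f'(x) = 3x^2 - 20x + 53, solved with the quadratic formula
a, b, c = 3, -20, 53
disc = cmath.sqrt(b * b - 4 * a * c)  # sqrt(-236) = 2i*sqrt(59)
critical = [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]

inflection = 20 / 6  # zero of f''(x) = 6x - 20

mean_roots = sum(roots) / len(roots)
mean_critical = sum(critical) / len(critical)
print(mean_roots, mean_critical, inflection)  # all three are 10/3, up to rounding
```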

All those little islands with a common algebraic truth seem somehow connected, but then a little more of the mist clears, and the geometry is revealed.

If we plot these zeros on the complex plane, the three zeros of the original function form the vertices of a triangle. The two zeros of the first derivative fall inside that triangle, with the zero of f'' at the midpoint of the segment joining them.

So the vertices of the triangle are the roots of f(x), and the two zeros of f' (the red points) are the foci of the ellipse inscribed in the triangle. And f'' has a zero at the center of the ellipse. The ellipse is tangent to the sides of the triangle at their midpoints, and it turns out it is the maximal-area ellipse you can inscribe in that triangle, called the Steiner inellipse. (A little algebra, a little calculus, a little geometry, a little trig... maybe there are really no islands, just one math land mass.)
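The focus claim can be tested numerically without drawing anything. The sketch below, in Python, uses two facts about an ellipse: every point on it has the same total distance to the two foci, and a tangent line touches it exactly where that total is smallest along the line. The sampling approach and the names `focal_sum` and `min_on_side` are illustrative assumptions, not part of Marden's theorem itself.

```python
vertices = [2 + 5j, 2 - 5j, 6 + 0j]
foci = [(10 + 1j * 59 ** 0.5) / 3, (10 - 1j * 59 ** 0.5) / 3]  # zeros of f'

def focal_sum(z):
    """Total distance from the point z to the two foci."""
    return sum(abs(z - focus) for focus in foci)

sides = [(vertices[i], vertices[(i + 1) % 3]) for i in range(3)]

# If the midpoints lie on one ellipse with these foci, these three values agree
midpoint_sums = [focal_sum((p + q) / 2) for p, q in sides]

def min_on_side(p, q, steps=1000):
    """Smallest focal sum over evenly spaced sample points along the side p-q."""
    return min(focal_sum(p + (q - p) * k / steps) for k in range(steps + 1))

side_minima = [min_on_side(p, q) for p, q in sides]

print(midpoint_sums)  # each is about 5.7735, which is 10/sqrt(3)
print(side_minima)    # the same values: each side is closest at its midpoint
```

The common value is the length of the major axis, so the midpoints all sit on one ellipse with those foci, and no other sampled point of a side gets inside it.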

I backed it all up one level by integrating f(x), but the four roots of the antiderivative did not appear to relate to the three roots of the cubic in any pretty way. They do obey the Gauss-Lucas Theorem. The Gauss–Lucas theorem gives a geometrical relation between the roots of a polynomial P and the roots of its derivative P'. The set of roots of a real or complex polynomial is a set of points in the complex plane. The theorem states that the roots of P' all lie within the convex hull of the roots of P, that is, the smallest convex polygon containing the roots of P. When P has a single root, this convex hull is a single point, and when the roots lie on a line, the convex hull is a segment of this line. The Gauss–Lucas theorem, named after Carl Friedrich Gauss and Félix Lucas, is similar in spirit to Rolle's theorem, another high school calculus basic.
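For a cubic, the convex hull of the roots is just the triangle, so Gauss-Lucas can be checked for the earlier example with barycentric coordinates: a point is inside the triangle exactly when its three weights are all non-negative. This is a minimal Python sketch, and the helper names `cross` and `barycentric` are illustrative.

```python
import math

def cross(u, v):
    """2D cross product of complex numbers treated as plane vectors."""
    return u.real * v.imag - u.imag * v.real

def barycentric(p, a, b, c):
    """Weights (wa, wb, wc) with p = wa*a + wb*b + wc*c and wa + wb + wc = 1."""
    area = cross(b - a, c - a)
    return (cross(b - p, c - p) / area,
            cross(c - p, a - p) / area,
            cross(a - p, b - p) / area)

a, b, c = 2 + 5j, 2 - 5j, 6 + 0j  # roots of f
critical = [complex(10 / 3, math.sqrt(59) / 3),
            complex(10 / 3, -math.sqrt(59) / 3)]  # zeros of f'

weights = [barycentric(p, a, b, c) for p in critical]
inside = [all(w >= 0 for w in ws) for ws in weights]
print(inside)  # [True, True]
```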

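And here is a tie-in for the stats students: the line containing the foci and centroid is the line of best fit for the three vertices in the orthogonal least squares sense, the one that minimizes the sum of squared perpendicular distances. It is not, in general, the ordinary regression line of y on x; in the example above the foci lie on the vertical line x = 10/3.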
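Reading "least squares" here in the orthogonal sense (perpendicular distances rather than vertical ones), the claim can be checked for the example triangle. The Python sketch below takes the best-fit direction from the standard principal-axis angle formula for a 2D scatter; the variable names are illustrative.

```python
import math

vertices = [(2.0, 5.0), (2.0, -5.0), (6.0, 0.0)]
foci = [(10 / 3, math.sqrt(59) / 3), (10 / 3, -math.sqrt(59) / 3)]

n = len(vertices)
mean_x = sum(x for x, _ in vertices) / n
mean_y = sum(y for _, y in vertices) / n
sxx = sum((x - mean_x) ** 2 for x, _ in vertices)
syy = sum((y - mean_y) ** 2 for _, y in vertices)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in vertices)

# Direction of the orthogonal least squares line through the centroid
theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
fit_direction = (math.cos(theta), math.sin(theta))

# Direction of the segment joining the foci
focal_direction = (foci[0][0] - foci[1][0], foci[0][1] - foci[1][1])

# A zero cross product means the two directions are parallel
cross = fit_direction[0] * focal_direction[1] - fit_direction[1] * focal_direction[0]

# Slope of the ordinary y-on-x regression line, for contrast
ols_slope = sxy / sxx

print(cross, ols_slope)
```

For this triangle the orthogonal fit is the vertical line through the centroid, which is the line through the foci, while the ordinary y-on-x regression line is horizontal.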

If you have only three distinct roots to a higher-degree polynomial (one with some or all of the roots multiple, such as \(f(x) = (x-a)^J (x-b)^K (x-c)^L\)), then the ellipse with foci at the two zeros of f' other than a, b, and c will be tangent to the sides at points that divide them in ratios of J/K, K/L, and L/J. This is due to Linfield, who published it in 1920.

And if you have a four-sided polygon which is tangent to an ellipse at all four midpoints of its sides, it seems that there is a complex polynomial with roots at its vertices whose derivative has zeros at the foci of the ellipse.  I managed to create an easy example by using the idea of a rhombus centered at the origin.  The polynomial \(f(x) = x^4 + 3x^2 - 4\) has zeros at 2i, -2i, 1 and -1.  The derivative, \(4x^3 + 6x\), has zeros at 0 (the center) and at \(+i\sqrt{3/2}\) and \(-i\sqrt{3/2}\).  Using these foci and the point (1/2, 1), which is the midpoint of one side of the rhombus, I get the ellipse \(4x^2 + y^2 = 2\), which seems to work.
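The rhombus example is easy to verify numerically. The Python sketch below takes the derivative as \(4x^3 + 6x\), so its nonzero zeros are at \(\pm i\sqrt{3/2}\), and checks that the ellipse \(4x^2 + y^2 = 2\) has those foci and touches each side at its midpoint. The helper names are illustrative; the tangency test uses the standard tangent line \(4x_0x + y_0y = 2\) at a point \((x_0, y_0)\) of this ellipse.

```python
import math

def f(z):
    return z ** 4 + 3 * z ** 2 - 4

def f_prime(z):
    return 4 * z ** 3 + 6 * z

rhombus = [1 + 0j, 2j, -1 + 0j, -2j]  # roots of f, in order around the origin
foci = [1j * math.sqrt(1.5), -1j * math.sqrt(1.5)]  # nonzero zeros of f'

sides = [(rhombus[i], rhombus[(i + 1) % 4]) for i in range(4)]
midpoints = [(p + q) / 2 for p, q in sides]

def focal_sum(z):
    """Total distance from z to the two foci (constant on the ellipse)."""
    return sum(abs(z - focus) for focus in foci)

def on_ellipse(z):
    """True if z lies on 4x^2 + y^2 = 2."""
    return math.isclose(4 * z.real ** 2 + z.imag ** 2, 2)

def tangent_value(m, z):
    """Left side of the tangent line 4*x0*x + y0*y = 2 at m, evaluated at z."""
    return 4 * m.real * z.real + m.imag * z.imag

# Each side is tangent at its midpoint if both endpoints lie on that tangent line
tangent_ok = [math.isclose(tangent_value(m, p), 2) and math.isclose(tangent_value(m, q), 2)
              for m, (p, q) in zip(midpoints, sides)]

print([on_ellipse(m) for m in midpoints])  # [True, True, True, True]
print(tangent_ok)                          # [True, True, True, True]
print([focal_sum(m) for m in midpoints])   # each is about 2.8284, which is 2*sqrt(2)
```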

I don't have a clear, easy way to recognize which fourth-degree polynomials would have that property, so if you want to be next to teach me some math, send me what you know.

I communicated several times in 2007 with Professor Kalman when we shared some information about the history of a problem we were both working on.  He went on to include that material, along with Marden's Theorem, in his wonderful book Uncommon Mathematical Excursions: Polynomia and Related Realms. If you pick it up, check the acknowledgements. There is actually a hat-tip from the professor to yours truly for (a very tiny bit of) assistance with the material for Lill's graphic method of solving for the roots of a polynomial. Still, I'm grateful for any recognition.




1 comment:

DanK said...

Interesting post Pat. Nice to cross paths again, and thanks for mentioning my book.