Thursday 7 February 2013

New Biggest Prime, but NOT Next Biggest

One of my ex-students, Savannah B., wrote to tell me she had read about the recent discovery of the newest "Largest Prime," \(2^{57,885,161}-1\). This prime was discovered on January 25, 2013 by Curtis Cooper at the University of Central Missouri as part of the Great Internet Mersenne Prime Search, or GIMPS. To perform its testing, the project relies on an algorithm developed by Édouard Lucas (you may remember him as the guy who invented the famous Towers of Hanoi puzzle) and Derrick Henry Lehmer (the younger of the father/son team of number theorists and prime hunters). The algorithm is specialized to testing Mersenne numbers and is particularly efficient on binary computer architectures.
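For the curious, the Lucas-Lehmer test itself is remarkably short. Here is a plain-Python sketch of the idea (GIMPS of course uses heavily optimized FFT-based arithmetic to handle exponents in the millions, so this toy version is only practical for small exponents):

```python
def lucas_lehmer(p):
    """Return True if the Mersenne number 2**p - 1 is prime (p prime, p >= 2)."""
    if p == 2:
        return True                # M_2 = 3 is prime
    m = (1 << p) - 1               # the Mersenne number M_p = 2^p - 1
    s = 4                          # the Lucas-Lehmer starting value
    for _ in range(p - 2):
        s = (s * s - 2) % m        # iterate s -> s^2 - 2  (mod M_p)
    return s == 0                  # M_p is prime exactly when s ends at 0

# The exponents 2, 3, 5, 7, and 13 give Mersenne primes; 11 does not.
print([p for p in (2, 3, 5, 7, 11, 13) if lucas_lehmer(p)])  # [2, 3, 5, 7, 13]
```

The same loop, carried out with very fast multiplication, is what certified the new record prime.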

This particular prime is 17,425,170 digits long in its decimal form. It would take 3,461 pages to display it; printed out on standard printer paper, single-sided, it would require approximately 7 reams. (And no, children, I did not check that data.)
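The digit count, at least, is easy to verify: since \(2^p\) is never a power of ten, \(2^p-1\) has \(\lfloor p\log_{10}2\rfloor + 1\) decimal digits. Two lines of Python confirm it:

```python
import math

# Number of decimal digits in 2^p - 1 for the new record exponent.
p = 57885161
digits = math.floor(p * math.log10(2)) + 1
print(digits)  # 17425170
```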

The previous "Largest Prime" had been around since 2009, and by comparison required a paltry 12,837,064 decimal digits to write out. That number was \(2^{42,643,801}-1\).

When I responded to the note, I assured her that there were many more to be discovered; after all, the primes are infinite. Even more interestingly, there are many primes guaranteed to lie between these last two "Largest" primes. In fact, I told her there were several million of them.

"But if they are not yet discovered, how can we know there are so many?" I hear you ask.
And I reply with one of my favorite rhymes from number theory:

Chebyshev said it, so I'll say it again,
there's always a prime between N and 2N.

This is sometimes called Bertrand's Postulate, but Chebyshev proved it, so I prefer to give it the stature and deserved name of a theorem. Besides, I love to give his whole name: Pafnuty Lvovich Chebyshev. He was a pretty awesome mathematician with lots of notches in his proof belt, and due to the vagaries of transliterating Russian into western tongues, you can find almost as many spellings of his last name, some starting with C and others with T.

So what does that have to do with "several million" primes between \(2^{42,643,801}-1\) and \(2^{57,885,161}-1\)?

What Chebyshev proved is that if you take any number, say 30, then there will always be at least one prime between that number and twice that number, 2 × 30 = 60. Sure enough, there are several of them; 31, for example, will work. You can try a few for yourself, and it will always work. That's what it means to have a proof.
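"Always work" can be brute-force checked for small numbers, at least. A short Python experiment (using a simple trial-division helper I've called `is_prime`) verifies the theorem for every N up to 1000:

```python
def is_prime(n):
    """Trial-division primality check; fine for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

# Chebyshev/Bertrand: for every N > 1 there is a prime p with N < p < 2N.
for n in range(2, 1001):
    assert any(is_prime(p) for p in range(n + 1, 2 * n)), n
print("verified for every N from 2 to 1000")
```

No counterexample ever turns up, of course; the theorem guarantees it for all N.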

Now if we look at the primes of the form used in the GIMPS project, they all have an interesting similarity: each is one less than a power of two. Some very early Mersenne primes are \(3 = 2^2-1\), \(7 = 2^3-1\), and \(31 = 2^5-1\). What happened to the fourth power? Well, the search only considers prime exponents (if the exponent is composite, then \(2^n-1\) is composite too). In fact, some prime exponents don't work either: \(2^{11}-1\) fails; it's the product of 23 and 89. So what the GIMPS project does is use computers to check which exponents work, using the algorithm mentioned above. By the way, that Lucas guy in the algorithm's name proved in 1876 that \(2^{127}-1\) (which was on Mersenne's original list) is indeed prime, as Mersenne claimed. This was the largest known prime number for 75 years, and is still the largest ever verified by hand. After that came calculating machines, then computers.
This prime was the milestone for so long that when the BBC announced a new "baby" computer at Manchester University, proving it prime was used as the ultimate test of the machine. In 1951 Ferrier, using a mechanical desk calculator and techniques based on partial inverses of Fermat's little theorem, slightly bettered the record by finding a 44-digit prime: \(\frac{2^{148}+1}{17}\) = 20988936657440586486151264256610222593863921.
In that same year, the age of electronic computer records began when Miller and Wheeler found several new larger primes, including a 79-digit prime.
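As a sanity check on the claim above that a prime exponent is not enough, a couple of lines of Python confirm the factorization of \(2^{11}-1\):

```python
# 11 is prime, and yet 2^11 - 1 is not: it factors as 23 x 89.
m = 2 ** 11 - 1
factors = [d for d in range(2, m) if m % d == 0]
print(m, factors)  # 2047 [23, 89]
```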
Another interesting thing about Mersenne primes is that when you write them in binary, they use only 1's. Seven in binary, for example, is 111. And that new largest prime? In binary it is written with 57,885,161 consecutive 1's.
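Python's built-in `bin` makes this easy to see:

```python
# A Mersenne number 2^p - 1 is a "repunit" in base two: p consecutive 1's.
for p in (3, 5, 13):
    print(p, bin(2 ** p - 1))
# 3 0b111
# 5 0b11111
# 13 0b1111111111111
```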

But for the beautiful theorem proven by my Russian buddy, Pafnuty, it is important to point out that each power of two is twice the last. Let's start with 2, 4, 8, 16, etc. Chebyshev's theorem says that there has to be a prime between each consecutive pair. So when we get to \(2^{42,643,801}-1\), the second largest known prime, we know that \(2^{42,643,802}-1\) is one more than twice as much... and sing the jingle, children... there must be a prime in there somewhere; between \(2^{42,643,801}-1\) and \(2^{42,643,802}-2\), in fact. And between \(2^{42,643,801}-1\) and \(2^{57,885,161}-1\) we are guaranteed another prime every time we step up a power of two. A total of 57,885,161 − 42,643,802 = 15,241,359 primes at a minimum (and there could be many more).
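The count of guaranteed primes is just a difference of exponents, one prime per doubling step; a one-liner confirms the arithmetic:

```python
# One guaranteed prime for each doubling step as the exponent
# climbs from 42,643,802 up to 57,885,161.
guaranteed = 57885161 - 42643802
print(guaranteed)  # 15241359
```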

The trouble is, in that first interval, between \(2^{42,643,801}-1\) and \(2^{42,643,802}-1\), there are 12,837,064 numbers to be tested (THIS IS OH SO WRONG, see correction below). Of course you can eliminate half of them as even right off the bat, and when you also eliminate the multiples of three you will have narrowed the search down to only a third of that, or about 4 million numbers. So crank up the old PC and get crackin'. I would love for one of mine to discover the next "Second Largest Prime." Keep in mind, it would be the largest non-Mersenne prime ever found.
Fact-checking correction: Joshua Zucker was kind enough to send a comment correcting my gross math error in that count of the numbers to be checked. As he pointed out, the figure I gave grossly underestimates the count of numbers in that first interval between \(2^{42,643,801}-1\) and \(2^{42,643,802}-1\). He points out that there must be N numbers between N and 2N, so we would need to check about \(2^{42,643,801}\) numbers. This simply dwarfs my original figure into obscurity.
But don't lose hope. Joshua also points out that there are also probably many more primes than the absolute minimum guaranteed by Pafnuty's gem of a theorem.

It is somewhat well known that a good approximation for the number of primes up to some n is given by \(\pi(n) \approx \frac{n}{\ln(n)} \). Actually this approximation tends to undercount the primes; it can be improved by subtracting one from the denominator, \(\frac{n}{\ln(n)-1}\). For example, the original approximation predicts 144 primes below 1000. It turns out there are actually 168. The adjusted value, dividing by \(\ln(1000)-1\), predicts 169... see, better. The good news is that both approximations become very good when the numbers get large, and \(2^{42,643,801}-1\), as I have pointed out above, is very large.
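A little Python reproduces all three of those numbers, counting the actual primes with a simple sieve of Eratosthenes:

```python
import math

def count_primes(n):
    """Count primes up to n with a sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return sum(sieve)

n = 1000
print(count_primes(n))              # 168 actual primes below 1000
print(int(n / math.log(n)))         # 144 predicted by n / ln(n)
print(int(n / (math.log(n) - 1)))   # 169 predicted by the adjusted form
```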
So here is the arithmetic as given by my Wolfram Alpha app: 5.747041164664005282865876855507854781902617... × 10^12837055 is the number of primes expected up to \(2^{42,643,801}-1\). The number predicted up to twice that amount is 1.149408205979101362235389622701947229078465... × 10^12837056. So the difference leaves almost as many primes between the two numbers as there are up to the first, roughly 5.747 × 10^12837055. That is such a huge number, you would think they would be popping up everywhere, so like I said before,

GO ON, You can do it.
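For the curious, the scale of those Wolfram Alpha figures can be reproduced with ordinary floating point by working entirely in logarithms. This is only a sketch of the \(\frac{n}{\ln(n)}\) estimate (the long decimal expansions above would need arbitrary-precision arithmetic):

```python
import math

# Estimate pi(N) ~ N / ln(N) for N = 2^42,643,801 via base-10 logarithms,
# since N itself is far too large to hold as a float.
p = 42643801
log10_N = p * math.log10(2)                        # log10 of N = 2^p
log10_pi = log10_N - math.log10(p * math.log(2))   # log10 of N / ln(N)
exponent = math.floor(log10_pi)
mantissa = 10 ** (log10_pi - exponent)
print(exponent)   # 12837055, matching the power of ten above
print(mantissa)   # about 5.7, matching the leading digits above
```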

It did get me wondering, though: what is the largest number below which we can say with confidence there is no undiscovered prime? I'm sure we know all the primes up into some number of millions. Derrick Norman Lehmer, the father of the Lehmer in the algorithm at top, published tables of prime numbers and prime factorizations reaching 10,017,000 by 1909. Surely that has been improved on. So, as the teacher in Ferris Bueller says, "Anyone... Anyone?"


Cameron said...

I am prime, after all there is only one of me.

Joshua Zucker said...

The prime number theorem is a much better estimate of how many primes are in that range. There's a LOT more than one prime between N and 2N for almost any N you pick. Your "several million" primes is like saying a rubik's cube has "a couple" possible positions.

And you write "The trouble is, in that first interval, between 2^42643801-1 and 2^42643802-1
there are 12,837,064 numbers to be tested" which is another gross understatement: there are 2^42643801 numbers to be tested! This is an almost unimaginably huge number.