As intriguing as it is to have an actual “explicit” formula for π(x), making use of it is another matter. The first term of the formula is Li(x), which is precisely what we want as the asymptotic value of π(x). However, the rest of the formula makes it too cumbersome for deriving the prime number theorem. Fortunately, Chebyshev had already found, in 1850, an equivalent form of the theorem that is much easier to work with.

The basis of the alternative formulation is the function ψ(x) defined by the finite sum

ψ(x) = ∑_{p^n≤x} log p

where the sum is over all prime powers p^n not exceeding x. ψ(x) can also be defined as a step function, similar to F(x), that jumps by log p at each power of each prime p. Chebyshev was able to show that π(x) ~ Li(x) is a consequence of ψ(x) ~ x, though of course he couldn't prove the latter. The key point is that ψ(x) grows faster than π(x), by roughly a factor of log(x), yet is easier to work with. In fact, since Li(x) ~ x/log(x), if π(x) ~ Li(x), then ψ(x) ~ π(x) log(x), and conversely.

Where did this strange definition of ψ(x) come from? It isn’t quite as weird as it may seem. For one thing, ψ(x) may be thought of as a function that counts the number of primes, only in a slightly different way than π(x) does. The latter function increases by 1 every time x is a prime. ψ(x), on the other hand, increases by log(x) every time x is a power of a prime. In exchange for this odd way of counting, the conjectured asymptotic relationship π(x) ~ Li(x) is equivalent to the relationship ψ(x) ~ x, which is a much simpler function of x — almost as simple as possible, in fact. One might therefore expect ψ(x) to be easier to work with.

In any case, we can write ψ(x) = ∑_{1<n≤x} Λ(n) where

Λ(n) = log p if n = p^{k} for some prime p and k ≥ 1, and Λ(n) = 0 otherwise
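To make these definitions concrete, here is a minimal Python sketch (not from the text; the function names are mine) that computes Λ(n) by trial division and ψ(x) as the sum of Λ(n) over n ≤ x:

```python
import math

def mangoldt(n):
    """Von Mangoldt's Lambda(n): log p if n = p^k for a prime p and k >= 1, else 0."""
    if n < 2:
        return 0.0
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            # p is the smallest prime factor; n is a power of p exactly
            # when dividing out all factors of p leaves 1
            m = n
            while m % p == 0:
                m //= p
            return math.log(p) if m == 1 else 0.0
    return math.log(n)  # no factor up to sqrt(n): n itself is prime

def psi(x):
    """Chebyshev's psi(x) = sum of Lambda(n) over 1 < n <= x."""
    return sum(mangoldt(n) for n in range(2, int(x) + 1))

# psi(x)/x is already close to 1 for quite modest x
print(psi(1000) / 1000)
```

Even at x = 1000 the ratio ψ(x)/x is within a fraction of a percent of 1, which is the behavior the prime number theorem asserts in the limit.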

Λ(n) was introduced by von Mangoldt and is named after him. But where does it come from? It's simply the coefficient of n^{-s} in a certain Dirichlet series, namely that of -ζ′(s)/ζ(s) = -(log ζ(s))′. (This is called the “logarithmic derivative” of ζ(s).) The identity follows by standard calculus from the Euler product of ζ(s), because

-ζ′(s)/ζ(s) = (∑_{p} log(1-p^{-s}))′ = ∑_{p} (log p)p^{-s}/(1 - p^{-s}) = ∑_{p} ∑_{k≥1} (log p)p^{-ks} = ∑_{n≥2} Λ(n)n^{-s}
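This identity can be checked numerically for a real s > 1. The sketch below (with truncation points and tolerances of my own choosing) uses the exact value ζ(2) = π²/6, approximates ζ′(2) by its truncated series -∑ (log n)/n², and compares -ζ′(2)/ζ(2) with the truncated sum ∑ Λ(n)/n², computing Λ with a small sieve:

```python
import math

def mangoldt_table(N):
    """lam[n] = Lambda(n) for 0 <= n <= N, via a prime sieve."""
    lam = [0.0] * (N + 1)
    is_prime = [True] * (N + 1)
    for p in range(2, N + 1):
        if is_prime[p]:
            for m in range(p * p, N + 1, p):
                is_prime[m] = False
            pk = p
            while pk <= N:          # mark every power of p with log p
                lam[pk] = math.log(p)
                pk *= p
    return lam

N = 50_000
s = 2.0
zeta = math.pi ** 2 / 6                                            # zeta(2), exactly
zeta_prime = -sum(math.log(n) / n ** s for n in range(2, N + 1))   # truncated series

lam = mangoldt_table(N)
lhs = -zeta_prime / zeta
rhs = sum(lam[n] / n ** s for n in range(2, N + 1))
print(lhs, rhs)   # both near 0.57
```

The two truncated values agree to about three decimal places; the small residual difference is just the neglected tails of the two series.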

The expression that relates ζ(s) to ψ(x) is

-ζ′(s)/ζ(s) = s ∫_{0≤x<∞} ψ(x)x^{-s-1} dx

(This can be shown from the fact that ∫_{n≤x<∞} x^{-s-1} dx = n^{-s}/s, and so

-ζ′(s)/ζ(s) = ∑_{n≥2} Λ(n)n^{-s} = ∑_{n≥2} sΛ(n) ∫_{n≤x<∞} x^{-s-1} dx = s ∫_{0≤x<∞} (∑_{1<n≤x} Λ(n)) x^{-s-1} dx = s ∫_{0≤x<∞} ψ(x)x^{-s-1} dx)
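The integral identity can also be tried numerically: approximate the truncated integral by a Riemann sum over a fine grid, using the fact that ψ(x) is a step function whose value is a cumulative sum of Λ, and compare with the Dirichlet series. A sketch under my own truncation choices:

```python
import math

def mangoldt_table(N):
    """lam[n] = Lambda(n) for 0 <= n <= N, via a prime sieve."""
    lam = [0.0] * (N + 1)
    is_prime = [True] * (N + 1)
    for p in range(2, N + 1):
        if is_prime[p]:
            for m in range(p * p, N + 1, p):
                is_prime[m] = False
            pk = p
            while pk <= N:
                lam[pk] = math.log(p)
                pk *= p
    return lam

N = 2000
s = 2.0
lam = mangoldt_table(N)

# cumulative sums give the step function psi(x) at integer cutoffs
psi = [0.0] * (N + 1)
for n in range(2, N + 1):
    psi[n] = psi[n - 1] + lam[n]

# left Riemann sum for the integral over 1 <= x <= N of psi(x) x^(-s-1) dx
h = 0.01
steps = int((N - 1) / h)
integral = sum(psi[int(1 + k * h)] * (1 + k * h) ** (-s - 1) * h
               for k in range(steps))

series = sum(lam[n] / n ** s for n in range(2, N + 1))
print(s * integral, series)
```

The two sides agree to a few parts in a thousand; the remaining discrepancy comes from cutting the integral and the series off at N.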

By manipulations similar to those Riemann used to obtain an explicit formula for π(x), von Mangoldt found from this, in 1894, that

ψ(x) = x - ∑_{ρ} x^{ρ}/ρ + ∑_{1≤n<∞} x^{-2n}/2n - ζ′(0)/ζ(0)

where ρ in the sum refers (as before) to the nontrivial zeros of ζ(s). This formula can be further simplified to

ψ(x) = x - ∑_{ρ} x^{ρ}/ρ + log(x^{2}/(x^{2}-1))/2 - log(2π)

This is similar to Riemann's explicit formula for π(x), but considerably easier to work with. The fact that the leading term is x makes it quite plausible that what we want (per Chebyshev) for the prime number theorem, namely ψ(x) ~ x, is true. An immediate consequence of the formula is that there must be infinitely many zeros in the critical strip: otherwise the right-hand side would be a finite sum of continuous functions, hence continuous, whereas ψ(x) clearly is not, since it jumps at every prime power.
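The explicit formula can even be evaluated numerically. The imaginary parts below are the well-known first ten nontrivial zeros of ζ(s) (all on the critical line; each pairs with its complex conjugate). With only ten zeros the agreement is necessarily rough, and it sharpens as more zeros are included; this is a sketch, not a serious computation:

```python
import math

# imaginary parts of the first ten nontrivial zeros of zeta (6 decimals)
GAMMAS = [14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
          37.586178, 40.918719, 43.327073, 48.005151, 49.773832]

def psi_exact(x):
    """psi(x) computed directly from its definition as a sum over prime powers."""
    total = 0.0
    for p in range(2, int(x) + 1):
        if all(p % q for q in range(2, math.isqrt(p) + 1)):  # p is prime
            pk = p
            while pk <= x:
                total += math.log(p)
                pk *= p
    return total

def psi_explicit(x):
    """Truncated von Mangoldt formula: rho = 1/2 + i*gamma and its
    conjugate together contribute 2*Re(x^rho / rho)."""
    zero_sum = sum(2 * (x ** complex(0.5, g) / complex(0.5, g)).real
                   for g in GAMMAS)
    return (x - zero_sum
            + 0.5 * math.log(x ** 2 / (x ** 2 - 1))
            - math.log(2 * math.pi))

x = 20.0
print(psi_exact(x), psi_explicit(x))
```

Already with ten zeros the truncated formula lands within a few units of the true value ψ(20) ≈ 19.27, and the oscillating terms x^{ρ}/ρ are what carve the smooth main term x into ψ's staircase shape.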

The goal, then, is to show that ψ(x)/x → 1 as x → ∞. The last two terms of the formula, divided by x, tend to 0, so in order to have ψ(x) ~ x it suffices to show that ∑_{ρ} x^{ρ-1}/ρ → 0 as x → ∞. Suppose the limit of that sum can be evaluated term by term (a step that itself requires justification); then it suffices to show that x^{ρ-1} → 0 for each ρ. For that, it remains only to show that Re(ρ) < 1 for every nontrivial zero ρ of ζ(s), because then Re(ρ-1) < 0, and so the absolute value

|x^{ρ-1}| = x^{Re(ρ-1)} → 0 as x → ∞.
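For a single hypothetical zero with Re(ρ) < 1 the decay is easy to see numerically. Here ρ = 0.9 + 14i is an illustrative made-up value, not an actual zero of ζ(s):

```python
rho = complex(0.9, 14.0)   # hypothetical: Re(rho) = 0.9 < 1

for x in (1e1, 1e4, 1e8):
    # |x^(rho-1)| = x^(Re(rho)-1) = x^(-0.1), which shrinks as x grows
    print(x, abs(x ** (rho - 1)))
```

The modulus is exactly x^{-0.1}, so it tends to 0, however slowly; the imaginary part of ρ only makes the term rotate in the complex plane without affecting its size.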

It was these last two steps that both Hadamard and de la Vallée Poussin independently accomplished in 1896, completing the proof of the prime number theorem. In other words, considerably less than the Riemann hypothesis needs to be known about the zeros of ζ(s). All that’s required is that none of the nontrivial zeros lie on the boundary of the critical strip, Re(s) = 1 (or equivalently, by the functional equation, Re(s) = 0). The fact that there are no zeros with Re(s) = 1 turns out to be pretty easy to deduce in a variety of ways from the Euler product.

In the time since 1896, many other proofs of the prime number theorem have been found, some of them fairly short. There are even proofs which are “elementary” in the sense that they don't use complex analysis or Fourier transforms. The first such “elementary” proof was devised by Paul Erdős and Atle Selberg in the late 1940s. There are a number of variations of this proof, but none are especially straightforward or “natural”, nor do they provide much insight into the theorem. Various other short proofs use more sophisticated techniques involving Fourier transforms and “Tauberian theorems”. Proofs of this sort were pioneered by Norbert Wiener and Shikao Ikehara around 1930.

Yet another kind of short proof of the prime number theorem is based on a Tauberian theorem (that is, a theorem giving conditions under which a series converges) due to A. E. Ingham. The reason that Tauberian theorems play a prominent role here is that they relate the convergence of a series (such as a Dirichlet series) to the size of its coefficients. In particular, the prime number theorem is equivalent to ψ(x) ~ x, which means that ψ(x)/x → 1 as x → ∞. But ψ(N)/N = (1/N)∑_{1<n≤N} Λ(n) is simply the average value (up to the Nth term) of the coefficients of the Dirichlet series for -ζ′(s)/ζ(s). The average value of the coefficients of that series being 1 is then equivalent to the average value of the coefficients of -ζ′(s)/ζ(s) - ζ(s) being 0. So one can write out exactly what those coefficients are, make estimates on them, and eventually establish the desired result.
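In this Tauberian language the claim is that the running average of the coefficients Λ(n) tends to 1, equivalently that the average of Λ(n) - 1 (the coefficients of -ζ′(s)/ζ(s) - ζ(s)) tends to 0. A quick numerical sketch of that trend, using a sieve for Λ (cutoff values are my own):

```python
import math

def mangoldt_table(N):
    """lam[n] = Lambda(n) for 0 <= n <= N, via a prime sieve."""
    lam = [0.0] * (N + 1)
    is_prime = [True] * (N + 1)
    for p in range(2, N + 1):
        if is_prime[p]:
            for m in range(p * p, N + 1, p):
                is_prime[m] = False
            pk = p
            while pk <= N:
                lam[pk] = math.log(p)
                pk *= p
    return lam

N = 10 ** 5
lam = mangoldt_table(N)
for cutoff in (10 ** 3, 10 ** 4, 10 ** 5):
    avg = sum(lam[2 : cutoff + 1]) / cutoff   # this is psi(cutoff)/cutoff
    print(cutoff, avg, avg - 1)               # average -> 1, centered average -> 0
```

The averages hover near 1 (and the centered averages near 0) at every cutoff, which is the statement the Tauberian machinery is used to prove in the limit.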

However, most of these shorter proofs share a shortcoming: they are generally unable to say much about the absolute error in the approximation. That is, π(x) ~ Li(x) means

π(x) = Li(x) + O(f(x))

where f(x)/Li(x) → 0 and the “O” notation means |π(x)-Li(x)| < Cf(x) for some constant C and all large x. We would like to know as much as possible about f(x). It could be as large as x^{1-ε} for some ε > 0, or something much smaller, like log(x). Most of the shorter proofs of the prime number theorem don't give much information about this, because the answer really depends on the location of the zeros of ζ(s).
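For illustration, one can measure the actual error at a modest x and compare it against a candidate bound such as x^{1/2} log(x). This is a sketch with my own conventions: Li(x) is taken as the integral from 2 to x of dt/log(t), computed by a simple trapezoidal rule, and π(x) by a sieve:

```python
import math

def prime_count(N):
    """pi(N): the number of primes <= N, by a simple sieve."""
    is_prime = [True] * (N + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(N) + 1):
        if is_prime[p]:
            for m in range(p * p, N + 1, p):
                is_prime[m] = False
    return sum(is_prime)

def Li(x, h=0.01):
    """Integral from 2 to x of dt/log(t), trapezoidal rule with step h."""
    f = lambda t: 1.0 / math.log(t)
    steps = int((x - 2) / h)
    total = 0.5 * (f(2.0) + f(float(x)))
    total += sum(f(2.0 + k * h) for k in range(1, steps))
    return total * h

x = 10_000
pi_x = prime_count(x)
li_x = Li(x)
err = abs(pi_x - li_x)
print(pi_x, li_x, err)
print(err < math.sqrt(x) * math.log(x))  # comfortably within this bound here
```

At x = 10000 the error is about 16, far below the candidate bound of roughly 921; of course a single data point proves nothing about the true order of f(x).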

We therefore want to learn as much as possible about the location of the zeros of ζ(s), since that is what determines the best bounds on the approximation of π(x) by Li(x), with the best possible bounds following if the Riemann hypothesis is true. So let's take a closer look at those zeros.