Review of The Man Who Loved Only Numbers


When Paul Erdős died in 1996 Paul Hoffman had known him for about 10 years and interviewed him a number of times. Hoffman had also known and interviewed many of Erdős’ friends and associates. So it’s fair to say that Hoffman had a lot more knowledge of the subject of his biography than most biographers (unless they’re family or close friends) ever do. The biography is indeed quite good, and provides a clear and informative portrait of the very unique, appealing, and colorful individual that Erdős was.

However, what defined Paul Erdős even more than his idiosyncrasies and his humanity was that he was quintessentially a mathematician – a singularly talented and prolific one. Another biography of Erdős, by PhD physicist Bruce Schechter (My Brain is Open: The Mathematical Journeys of Paul Erdos), was published at almost the same time as Hoffman’s. (I have reviewed that as well, and I’ll try not to duplicate much from that other review.) Hoffman freely concedes that he is not a mathematician. While Schechter isn’t a professional mathematician, he is more able to give a complete and accurate account of Erdős’ work than Hoffman.

The picture of Erdős that comes across in both biographies is much the same, since both books are based on a lot of the same source material. Hoffman’s book has the advantage that the source of all Erdős quotes (except those Hoffman got directly) are documented in footnotes. But because Hoffman also can report many anecdotes from his interviews with Erdős himself and with his associates, this biography offers a somewhat clearer picture of the subject as a person, apart from his work.

One anecdote is especially revealing. On an occasion in the late 1960s Erdős (who was about 55 at the time) was staying with an old friend from Hungary in southern California, very near the beach. One day Erdős went out to walk on an esplanade above the beach. Only ten minutes later his hosts received a phone call, from someone who lived close to the esplanade about five blocks away, to report that Erdős turned up on their doorstep saying he was lost and needed help finding his way back to his friend’s place. So Erdős, in spite of his prodigious memory for details of his mathematics and his mathematical collaborators, couldn’t even remember how to retrace his steps. The natural explanation is that he was so absorbed in his mathematical thoughts that no vestige of his short walk had registered in his memory.

As other anecdotes made clear, Erdős was fully capable of recalling details of mathematical conversations he’d engaged in years before, and he could also keep track of more or less simultaneous conversations he carried on with several different mathematicians at the same meeting. He could also recall details of perhaps thousands of technical papers he’d read decades before. It seems reasonable to conclude that his ability to concentrate and to recall mathematical detail had a great deal to do with his singular power as a mathematician – as exemplified by his ability either to solve quickly new mathematical problems or at least to judge accurately their level of difficulty almost effortlessly.

In contrast to this clear portrait of Erdős as a person, Hoffman’s lack of mathematical background means he must rely on the testimony of others he interviewed to describe Erdős’ mathematics. The result is a somewhat less satisfactory account. One example is what Hoffman calls “friendly numbers”. He says this means a pair of numbers (a and b) where the sum of proper divisors of a is equal to b and the sum of proper divisors of b is equal to a. This is actually the definition of “amicable numbers”. The example given is the pair 220 and 284, which was known to Pythagoras. Those are indeed “amicable numbers”. The accepted definition of “friendly numbers” is numbers that have the same ratio between themselves and their own sum of divisors. By this definition, 220 and 284 aren’t “friendly”.

An even more serious problem is the discussion of Bernhard Riemann’s non-Euclidean geometry. Hoffman writes “He [Riemann] builds a seemingly ridiculous assumption that it’s not possible to draw two lines parallel to each other. His non-Euclidean geometry replaces Euclid’s plane with a bizarre abstraction called curved space.” It’s not actually bizarre at all, since the surface of any sphere is one example. Straight lines on the surface of a sphere are (parts of) “great circles” (which are by definition the largest circles that can be drawn on a sphere, like the Earth’s equator). Great circles are never “parallel”, since they always intersect.

Apart from these and a few other mathematical glitches, Hoffman’s book is almost free of trivial proofreading errors. But there’s one glaring exception (at least in the paperback edition). Sixteen pages of pictures are included in the middle of the book, and given appropriate page numbers. However, the footnotes at the end of the book are all keyed to page numbers, and they haven’t been corrected to account for the picture pages, so that all references to pages after the pictures are off by 16 – a very annoying problem if one wants to actually check these references.

In spite of these problems, Hoffman’s book provides a fine portrait, based on personal experiences, of Erdős the man.

Posted in Book reviews, Number theory | Tagged , , | Leave a comment

Review of Mathematics without Apologies


Michael Harris’ book is definitely one that mathematicians shouldn’t miss. It’s also a must-read for people who aren’t professional mathematicians but do have a deep appreciation for the subject and have more than a passing interest in understanding what mathematics is “about”. It will not, however, win any converts among people who think they “don’t like math” and would require serious persuasion to change their minds.

As the title should make clear, the book is intended as a counterpoint to G. H. Hardy’s A Mathematician’s Apology. Hardy poses the question “[W]hy is it really worthwhile to make a serious study of mathematics? What is the proper justification of a mathematician’s life?” Hardy’s answer is somewhat defensive and hedged. He rejects a justification based on practical usefulness of mathematics, which he doesn’t defend robustly. Instead, he concludes that the justification for the life of any “real” mathematician is to “have added something to knowledge, and have helped others to add more; and that these somethings have a value which differs in degree only, and not in kind, from that of the creations of the great mathematicians, or of any other artists, great or small, who have left some kind of memorial behind them.”

Harris, on the other hand, rejects a defensive stance and aims to justify mathematics and the work of mathematicians “without apology”. He stands with Hardy in likening mathematics to other fine arts. But he goes further to justify both mathematics and other art forms not in utilitarian or moralistic terms, but in aesthetic terms of the pleasure from creating and appreciating art of high quality. His argument is first set out in Chapter 3 (“Not Merely Good, True, and Beautiful”). And after excursions through diverse mathematical, philosophical, biographical, artistic, and literary topics, the argument concludes in Chapter 10 (“No apology”). Along the way, it’s noted that other top tier scientists besides mathematicians have also justified their work in terms of pleasure. There is, for instance, Richard Feynman, one of whose books is The Pleasure Of Finding Things Out. Although the things studied in physics and mathematics are rather different (though some like Max Tegmark (Our Mathematical Universe) disagree), there is a distinct pleasure associated with important discoveries in any scientific field. Creative art also involves discovery and “finding things out”. Both the creative artist/scientist and others who simply learn to appreciate their work experience the associated pleasure.

Michael Harris’ most noteworthy mathematical accomplishment is his work within a vast generalization of number theory (in more classical forms of which Hardy also made his mark many decades ago). This work is part of an extensive series of related conjectures called “The Langlands Program”, after Robert Langlands who conceived it and has contributed to its continual expansion. The LP is highly abstruse, technical mathematics, and Harris makes no serious attempt in this book to sketch its outlines. He touches only briefly on some of the more concrete manifestations of the program in connection with what are known as “elliptic curves”. A very concise way to describe the LP is in terms of elaborate “correspondences” between seemingly unrelated mathematical objects such as “Galois representations” and “automorphic forms”. There’s a more complete semi-technical discussion of these correspondences in Edward Frenkel’s Love and Math, and a technical introduction (accessible only to math graduate students and professionals) is An Introduction to the Langlands Program.

Some people have found Harris’ book frustrating because it seems to lack “structure”. In fact, however, it does have structure, but as is often true in advanced mathematics, the structure isn’t always readily apparent. As Harris would point out, that is precisely what appeals to mathematicians about their vocation and avocation: its pleasure lies in discovering hidden structures that aren’t obvious on the surface. The Langlands Program is an apt metaphor for mathematics itself. Earlier examples of the discovery of rich hidden structures in mathematics abound – from Galois theory to Hilbert spaces.

Many topics are discussed in the book, some only cursorily, some in more depth. Here are just a few of them; mathematical Platonism (Harris, like most mathematicians, is an adherent), category theory, Hindu and Buddhist philosophy, literature (such as Thomas Pynchon’s), mathematical tricks, the mind-body problem, economics and finance, mathematical “charisma”, life in Paris, mathematics and sex (in reference to Frenkel’s Love and Math – and much more. Some of this is autobiographical, but there’s almost always some connection, anecdotal or otherwise, with the book’s main themes. It’s a smörgåsbord.

The book definitely isn’t light reading. It’s also heavily footnoted – an average of about 60 per chapter. (Most are worth reading.) There’s also a running series of imagined dialogues – at a very elementary level – on “How to Explain Number Theory at at Dinner Party”. So don’t expect to finish it in a day or two. A better approach is to take it a chapter at a time, and allow that to sink in before proceeding.

Posted in Book reviews, Langlands program, Number theory | Tagged | 4 Comments

Algebraic number theory – index

This is a list of posts in the algebraic number theory series to date, oldest first.

1. A brief history of algebra

2. Numbers – rational and irrational, real and imaginary

3. Diophantine equations

4. Groups and rings

5. Fields and Galois theory

6. Modular arithmetic

7. Rings of algebraic integers

8. Rings and ideals

9. Uniqueness of factorization

10. Failure of unique factorization

11. Why do we care about unique factorization?

12. More concepts from ring theory

13. Factorization of prime ideals in extension fields

14. Splitting of prime ideals in quadratic extensions of ℚ, part 1

15. Splitting of prime ideals in algebraic number fields

16. Roots of unity and cyclotomic fields

17. Cyclotomic fields, part 2



Posted in Algebraic number theory, Number theory | Leave a comment

Cyclotomic fields, part 2

In our previous article on cyclotomic fields we were talking about why the Galois group G of ℚ(μn)/ℚ is isomorphic to (ℤ/nℤ)×, where n∈ℤ and μn is the group of nth roots of unity, the roots of xn-1=0 in some extension of ℚ. (Check here for a list of previous articles on algebraic number theory.)

So far we’ve shown that G is isomorphic to a subgroup of (ℤ/nℤ)×. We still need to show it is actually isomorphic to the whole group, or equivalently that |G|, the order of G, which is equal to the degree of the field extension, [ℚ(μn):ℚ], actually equals |(ℤ/nℤ)×|, which is φ(n), the number of positive integers less than and relatively prime to n.

The group μn is cyclic. Any generator of the group is, by definition, a primitive nth root of unity. We let ζ be an arbitrary but fixed such generator. Then ℚ(μn)=ℚ(ζ). Let f(x) be the minimal polynomial of ζ. f(x)∈ℚ[x] is irreducible over ℚ. All other elements of μn are of the form ζa for some a∈ℤ, where a is well-defined modulo n. Further, ζa is a primitive nth root of unity if and only if a is relatively prime to n, i. e. the greatest common divisor (a,n)=1. The degree of f(x) equals [ℚ(μn):ℚ] and |G|. At this point, all we know about these numbers is that they divide φ(n) (since |G| does).

The group homomorphism j:G→(ℤ/nℤ)× was defined by the relation σ(ζ)=ζj(σ), for σ∈G. We showed that j(σ) is injective and independent of the choice of ζ. Hence G is isomorphic to a subgroup of (ℤ/nℤ)×, and therefore the degree of f(x) (and [ℚ(μn):ℚ] and |G|) is ≤φ(n). G is isomorphic to a proper subgroup of (ℤ/nℤ)×, i. e. not the whole group, if j is not surjective and the degree of f(x) is strictly less than φ(n).

As we noted last time, the problem here is that we don’t yet know that all φ(n) primitive nth roots of unity are zeroes of f(x), so that |G| and the degree of f(x) equal φ(n), and hence G(ℚ(μn)/ℚ) ≅ (ℤ/nℤ)×. Stated another way, we don’t yet know that the field homomorphism on ℚ(μn) induced by mapping ζ to ζa is actually an automorphism of the field, hence an element of the Galois group. It could fail to be if, say, the minimal polynomial of ζa is different from that of ζ, which could happen if the degree of f(x) is less than φ(n) because not all primitive nth roots of unity are zeroes of f(x). In order to rule out this possibility, we will show that the degree of f(x) is ≥φ(n).

There are various ways to prove the isomorphism, and even a number of ways to prove that f(x) has φ(n) distinct roots, so its degree is ≥φ(n). Many of these proofs use machinery (such as discriminants, factorization and ramification of primes, etc.) that we haven’t extensively discussed yet, so I’ll avoid using such things in the proof. However, we’ll get to these topics eventually, and also show a way to construct the automorphisms σ∈G explicitly – after finishing the proof that G(ℚ(μn)/ℚ) ≅ (ℤ/nℤ)×.

So let’s get started. The roots of the minimal polynomial f(x)∈ℚ[x] are all conjugates σ(ζ) for σ∈G, so f(x)=Πσ∈G(x-σ(ζ)). Hence f(x) is monic (leading coefficient 1). The coefficients of f(x) are symmetric functions of all conjugates of ζ, so the coefficients are all left fixed by all σ∈G. f(x) divides xn-1, so all its roots – the conjugates of ζ – are algebraic integers. So the coefficients are also algebraic integers (sums of products of powers of algebraic integers) – members of the ring of integers Oℚ(ζ). They are also in the base field, since they’re left fixed by G. A basic fact is that Oℚ(ζ)∩ℚ = ℤ – any algebraic integer that lies in the base field is necessarily an integer of the base field. Hence f(x)∈ℤ[x].

Suppose that for any a∈ℤ relatively prime to n, i. e. (a,n)=1, ζa is also a root of f(x): f(ζa)=0. Since these ζa with 1≤a<n are distinct primitive nth roots of unity if ζ is, and there are φ(n) of them, the degree of f(x), and hence |G|, is ≥ φ(n). But we already showed |G|≤φ(n), hence |G|=φ(n). Since G is isomorphic to a subgroup of (ℤ/nℤ)×, we must actually have an isomorphism: G ≅ (ℤ/nℤ)×.

So all we have to show is f(ζa)=0 for 1≤a<n and (a,n)=1. The first thing to note is that it suffices to prove this just for primes p with (p,n)=1. For suppose we had that. For general a with (a,n)=1, let p be a prime that divides a. Then (p,n)=1. Consider ζa/p. Since (a/p,n)=1, ζa/p is a primitive nth root of unity with one fewer prime divisor in the exponent than ζa. So by induction on the number of prime divisors of the exponent f(ζa/p)=0. But if the result is true for prime powers of primitive nth roots of unity that satisfy f(x)=0, then f(ζa)=0 since ζa=(ζa/p)p. Alternatively, you can recall that (according to a theorem of Dirichlet), there are infinitely many primes p in the arithmetic progression a+nk for (a,n)=1 and k∈ℤ. Since ζn=1, ζap for all such p.

So let p be prime and (p,n)=1. Then note that f(x)p-f(xp)∈pℤ[x] for any f(x)∈ℤ[x]. This can be proved by induction on the degree of f(x). Suppose the highest degree term of f(x) is Axm, with p∤A. Then (Axm)p-Axmp∈pℤ[x] because Ap≡A (mod p), because (ℤ/pℤ)× is cyclic of order p-1 (Fermat’s theorem). So if h(x)=f(x)-Axn, then we just have to show h(x)p-h(xp)∈pℤ[x]. But that can be assumed true by induction, unless the degree of h(x) is 1. In the latter case, if h(x)=Ax+B, we need (Ax+B)p-(Axp+B)∈pℤ[x]. But all the coefficients in (Ax+B)p except the first and last contain binomial coefficients divisible by p, and the remaining terms are handled with Fermat’s theorem as before.

Finally, then, suppose the opposite of what we want to show, namely that there is a prime p with p∤n and f(ζp)≠0. By what we just showed, f(ζp) is divisible by p in Oℚ(ζ). We have f(x)=Πi∈I(x-ζi) for I={i∈ℤ: 1≤i<n and (i,n)=1}. So f(ζp) divides a product of nonzero factors ζpi. By a lemma we’ll prove in a moment, if J={(i,j): i,j∈ℤ, 0≤i,j<n, i≠j}, Π(i,j)∈Jij) = (-1)n-1nn. Hence f(ζp) divides nn and p|n, contrary to assumption. This contradiction means f(ζp)=0, as required. We’ve now shown f(x) has at least φ(n) roots, hence G(ℚ(μn)/ℚ) ≅ (ℤ/nℤ)×.

Now for the last lemma: We have xn-1=Π0≤i<n(x-ζi). Equating the constant terms gives (-1)n-10≤i<nζi. And by taking derivatives of both sides, nxn-10≤i<nΠ0≤j<n, j≠i(x-ζj). Substituting x=ζk, nζk(n-1)0≤j<n, j≠kkj). Taking products of this for 0≤k<n gives, with the set J as above, Π(i,j)∈Jij) = nn0≤k<nζk)n-1. But the last product on the right side was evaluated above, so finally we are left with (-1)n-1nn on the right (since n-1 has the same even/odd parity as its square).

Well, that was a bit of work, wasn’t it? But nothing too esoteric, apart from a little Galois theory and some classic number theoretical facts. (Thanks to [1, pp 96-8] for the bulk of the proof.)

Actually, it is possible to do this without the lemma, using the theorem on primes in an arithmetic progression. Suppose f(x) is any polynomial in ℤ[x] such that f(ζ)=0 when ζ is a primitive nth root of unity. Then for any a∈ℤ with (a,n)=1, since f(x)p-f(xp)∈pℤ[x] for any prime p∈ℤ, we have 0=f(ζ)p≡f(ζp) mod pOℚ(ζ). But there are infinitely many primes p≡a mod n, and for such p, ζpa. Consequently, f(ζa) is a member of an infinite number of distinct prime ideals, which is possible only if f(ζa)=0. Hence f(x) has degree ≥φ(n), which is the crucial fact we found before.

We can now define the cyclotomic polynomial Φn(x)=Π0<i<n, (i,n)=1(x-ζi), for any primitive nth root of unity ζ. From the foregoing, we know a lot about Φn(x): its roots are precisely all the primitive nth roots of unity (in ℂ), its degree is φ(n), it is irreducible (over ℚ), its coefficients are in ℤ, and it is the minimal polynomial of ζ. The notation Φn(x) is on account of its relation to the Euler function φ(n).

We also have this factorization of xn-1 in ℤ[x]: xn-1 = Πd|nΦd(x). This holds, since the roots of each Φd(x) are precisely the roots of unity in the cyclic group μn that have exact order d for each d that divides n. (Each root has one and only one exact order d satisfying d|n.) This relation is occasionally useful, and it yields interesting facts such as Σd|nφ(d) = n (by taking degrees of polynomials on both sides).

It turns out that the irreducibility of Φn(x) is relatively easy to prove for certain n, namely those that are powers of a single prime. So let p be prime and q=pr for an integer r≥1. Let f(x)=Φq(x). The roots of f(x) are primitive qth roots of unity, namely ζ∈μq such that ζ has order q. There are φ(q) of these and φ(q)=q-q/p=q(1-1/p)=(p-1)pr-1 (because every pth element of the set {0,1,…,q-1} is divisible by p). So clearly f(x)=(xq-1)/(xq/p-1). Let g(x)=(xp-1)/(x-1) and h(x)=g(x+1)=((x+1)p-1)/x=xp-10<j<p(p j)xj-1, where (p j) is a binomial coefficient, which is divisible by p if 0<j<p. Finally, consider the polynomial h(xq/p)=g(xq/p+1).

Suppose f(x) splits in ℚ[x]. Then since f(x)=g(xq/p), the latter splits, and consequently g(xq/p+1)=h(xq/p) does too. But h(xq/p) is what’s known as an Eisenstein polynomial, because the leading coefficient is not divisible by p, the constant term is p (not divisible by p2), and all other nonzero coefficients (the binomial coefficients) are divisible by p. However, Eisenstein polynomials are irreducible over ℚ. This contradiction means f(x) must be irreducible over ℚ. QED.

The fact that Φq(x) is irreducible if q=pr, and hence G(ℚ(μq)/ℚ)≅(ℤ/qℤ)×, can be used as the basis for yet another proof of this isomorphism for arbitrary n, by considering prime power divisors q of n, the corresponding extensions ℚ(μq)/ℚ, and their Galois groups in building up the full extension ℚ(μn)/ℚ and its Galois group. But we won’t go into that now.

In the next installment, we’ll discuss many more fun facts about cyclotomic fields.


[1] Goldstein, Larry Joel – Analytic Number Theory

Posted in Algebraic number theory, Number theory | Leave a comment

Roots of unity and cyclotomic fields

In preparation for many good things that are to come, we need to have a talk about another important class of field extensions of ℚ – the cyclotomic extensions. (Check here for a list of previous articles on algebraic number theory.)

A cyclotomic field in general is a field that is an extension of some base field formed by adjoining all the roots of the polynomial f(x) = xn-1=0 for some specific positive n∈ℤ to the base field. Usually, though not always, this will mean roots that lie in some large field in which f(x) splits completely and that contains ℚ as the base field, such as ℂ, the complex numbers. f(x) is known as the nth cyclotomic polynomial. Mostly the same theory applies if the base field is a finite algebraic extension of ℚ, but we’ll use ℚ as the base field for simplicity.

Since f(1)=0, x-1 is one factor of f(x), and f(x)/(x-1) = xn-1 + … + x + 1 ∈ ℤ[x], with all coefficients equal to 1. If n is even, -1 is also a root of f(x). However, all other roots of f(x) in ℂ are complex numbers that are not in ℝ. Some of these roots, known as the “primitive” nth roots of unity – denoted by ζn (or just ζ if the context is clear) – have the property that all other roots are a power ζk for some integer k, 1≤k<n. So the smallest subfield of ℂ that contains ℚ and all roots of f(x) is ℚ(ζ), known as the nth cyclotomic field.

It is possible to express all the roots of f(x) in the form e2πi/n, where ez is the complex-valued exponential function, which can be defined in various ways. The most straightforward way is in terms of an infinite series, ez = Σ0≤n<∞zn/n!. The exponential function ez can also be defined as the solution of the differential equation dF(z)/dz = F(z) with initial value F(1)=e, the base of the natural logarithms. So there is the rather unusual circumstance that the roots of an algebraic equation can be expressed as special values of a transcendental function. Mathematicians long hoped that other important examples like this could be found (a problem sometimes referred to as “Kronecker’s Jugendtraum“, a special case of Hilbert’s twelfth problem), but that hope has mostly not been fulfilled.

The most well-known nontrivial root of unity is the fourth root, i=&radic(-1), which satisfies x4-1 = (x2+1)(x2-1) = 0.

All complex roots of unity have absolute value 1, i. e. |ζ|=1, since |ζ| is a positive real number such that |ζ|n=1. The set of all complex numbers with |z|=1 is simply the unit circle in the complex plane, since if z=x+iy, then |z|2 = x2+y2 = 1. (Note that the linguistic root of words like “circle”, “cyclic”, and “cyclotomic” is the Greek κύκλος (kuklos).) Since e = sin(θ) + i⋅cos(θ) for any θ, with θ=2πk/n the real and imaginary parts of a general nth root of unity ζ=e2πi⋅k/n are just Re(ζ)=sin(2πk/n) and Im(ζ)=i⋅cos(2πk/n).

There are many reasons why cyclotomic fields are important, and we’ll eventually discuss a number of them. One simple reason is that roots of algebraic equations can sometimes be expressed in terms of real-valued roots (such as cube roots, d1/3 for some d), and roots of unity. See, for example, this article, where we discussed the Galois group of the splitting field of f(x)=x3-2.

The set of all complex nth roots of unity forms a group under multiplication, denoted by μn. This group is cyclic, of order n, generated by any primitive nth roots of unity. (Any finite subgroup of the multiplicative group of a field is cyclic.) As such, it is isomorphic to the additive group ℤn = ℤ/nℤ, the group of integers modulo n. Because of this, many of the group properties of μn are just restatements of number theoretic properties of ℤn. For instance, each element of order n in μn is a generator of the whole group – one of the primitive nth roots of unity. Since μn⊆ℚ(ζn), adjoining all of μn gives the same extension ℚ(ζn) = ℚ(μn).

Now, ℤ/nℤ is a ring, and its elements that are not divisors of zero are invertible, i. e they are units of the ring. They form a group under ring multiplication, which in this case is written as (ℤ/nℤ)× (sometimes Un for short). An integer m is invertible in ℤ/nℤ if and only if it is prime to n, i. e. (m,n)=1 (because of the Euclidean algorithm). The number of such distinct integers modulo n is a function of n, written φ(n). This number is important enough to have its own symbol, because it was studied by Euler as fundamental to the arithmetic of ℤ/nℤ. Thus φ(n) is also the order of the group (ℤ/nℤ)×.

Let ζ=e(2πi)m/n, for 0≤m<n, be an element of μn. The correspondence m↔e(2πi)m/n establishes a group isomorphism between the additive cyclic group ℤ/nℤ and the multiplicative group μn. Modulo n, m generates ℤ/nℤ additively if and only if (m,n)=1, which is if and only if the corresponding ζ generates μn. So the number of generators of μn – which is the number of primitive nth roots of unity – is the same as the order of (ℤ/nℤ)×, i. e. φ(n).

One has to be careful, because the multiplicative structure of μn parallels the additive structure of ℤ/nℤ, not the multiplicative structure of (ℤ/nℤ)×. (Because if ζM and ζN are typical elements of μn then ζM×ζNM+N.) Hence even though there are φ(n) generators of μn, these generators do not form a group by themselves (a product of generators isn’t in general a generator), so the set of them isn’t isomorphic to (ℤ/nℤ)×, even though the latter also has φ(n) elements. Give this a little thought if it seems confusing.

Moreover, the group (ℤ/nℤ)× is not necessarily cyclic. It is cyclic if n is 1, 2, 4, pe, or 2pe for odd prime p, but not otherwise. Confusingly, if the group does happen to be cyclic then integers modulo n that generate the whole group are called “primitive roots” for the integer n. If (ℤ/nℤ)× happens to be cyclic, then only those m∈(ℤ/nℤ)× having order φ(n) are “primitive roots” that generate the group, while all m∈(ℤ/nℤ)× have the property that if ζ∈μn has order n and generates the latter group, then so does ζm, as we showed above. Got that straight, now? This needs to be understood when working in detail with roots of unity.

Another reason for the importance of cyclotomic fields is that the Galois group of the extension [ℚ(ζn):ℚ] is especially easy to describe. Indeed, it is isomorphic to the group of order φ(n) we’ve just discussed: (ℤ/nℤ)×. There’s a little work in proving this isomorphism, but let’s first note what it implies. Let G=G(ℚ(ζ)/ℚ) be the Galois group. It is an abelian group of order φ(n) since it’s isomorphic to (ℤ/nℤ)×. Further, any subgroup of G′ of G is abelian and by Galois theory determines an abelian extension (i. e., an extension that is Galois with an abelian Galois group) of ℚ as the fixed field of G′. Conversely, it can be shown (not easily) that every abelian extension of ℚ is contained in some cyclotomic field. (This is the Kronecker-Weber theorem.)

Half of the proof of the isomorphism is easy. Pick one generator ζ of μn, i. e. a primitive nth root of unity. We’ll see that it doesn’t matter which of the φ(n) possibilities we use. Suppose σ∈G is an automorphism in the Galois group. Since σ is an automorphism and ζ generates the field extension, all we need to know is how σ acts on ζ. Since σ is an automorphism, σ(ζ) has the same order as ζ, so it’s also a primitive nth root of unity. Therefore σ(ζ) = ζm for some m, 1≤m<n. As we saw above, m is uniquely determined and has to be a unit of ℤ/nℤ, with (m,n)=1, in order for ζm to be, like ζ, a generator of the cyclic multiplicative group μn. Hence m∈(ℤ/nℤ)×. Call this map from G to (ℤ/nℤ)× j, so that σ(ζ)=ζj(σ). To see that it’s a group homomorphism, suppose σ12∈G, with j(σ1)=r, j(σ2)=s. Then σ21(ζ)) = σ2r) = (ζs)r = ζsr, hence j(σ2σ1) = j(σ2)j(σ1). j is clearly injective since j(σ)=1 means σ(ζ)=ζ, so σ is the identity element of G. Finally, to see that j doesn’t depend on the choice of primitive nth root of unity, suppose ζm with m∈(ℤ/nℤ)× is another one. Then σ(ζm) = σ(ζ)m = (ζj(σ))m = (ζm)j(σ).

Thus G is isomorphic to a subgroup of (ℤ/nℤ)×. That’s enough to show G is abelian, so the extension ℚ(ζ)/ℚ is abelian. To complete the proof of an isomorphism G≅(ℤ/nℤ)× we would need to show that the injective homomorphism j is also surjective, i. e. every m∈(ℤ/nℤ)× determines some σ∈G such that m=j(σ). We can certainly define a function from ℚ(ζ) to ℚ(ζ) by σ(ζ)=ζm for a generator ζ of the field ℚ(ζ). One might naively think that’s enough, but the problem is that one has to show that σ is a field automorphism of ℚ(ζ).

The map σ defined that way certainly permutes the nth roots of unity in μn, the roots of the polynomial f(x)=xn-1. However, not all permutations of elements of μn, of which there are n!, yield automorphisms of ℚ(ζ). The problem here is that if z(x) is the minimal polynomial of some ζ, i. e. the irreducible polynomial of smallest degree in ℤ[x] such that z(ζ)=0, then by Galois theory the order |G| of the Galois group G is the degree of the field extension, which is the degree of z(x). Since G is isomorphic to a subgroup of the group (ℤ/nℤ)×, and the latter has order φ(n), all we know is that |G| divides φ(n). It could be that other primitive nth roots of unity have minimal polynomials in ℤ[x] that are not the same as z(x), though they have the same degree |G|. For σ to be an automorphism, σ(ζ) needs to have the same minimal polynomial as ζ, and we don’t know that immediately from the relation σ(ζ)=ζm.

We will defer discussion of the rest of the proof that G(ℚ(μn)/ℚ)≅(ℤ/nℤ)× for the next installment, since some new and important concepts will be introduced.

Posted in Algebraic number theory, Number theory | Leave a comment

Splitting of prime ideals in algebraic number fields

*.overline {text-decoration: overline;}

Our series of articles on algebraic number theory is back again. Maybe this time it won’t be so sporadic. Stranger things have happened. The previous installment, of which this is a direct continuation, is here. All previous installments are listed here.

When we left off, we were talking about how to determine the way a prime ideal factors in the ring of integers of a quadratic extension of ℚ. Such a field is of the form ℚ(√d) for some square-free d∈ℤ. We were using very simple elementary reasoning with congruences, and we found a fairly simple rule, namely:

If p∈ℤ is an odd prime (i. e., not 2), and K=ℚ(√d) is a quadratic extension of ℚ (where d is not divisible by a square) then

  1. p splits completely in K if and only if p∤d and d is a square modulo p.
  2. p is prime (i. e. inert) in K if and only if d is not a square modulo p.
  3. p is ramified in K if and only if p|d.

The prime 2 behaves a little more weirdly, but the result is that 2 ramifies if and only if d≡2 or 3 (mod 4); 2 is inert if and only if d≡5 (mod 8); 2 splits if and only if d≡1 (mod 8).

One limitation was that our simple reasoning made it necessary to assume that OK, the ring of integers of K, was a PID (principal ideal domain).

Let’s review what we were trying to do. We were investigating the factorization of a prime ideal (p)=pOℚ(√d) in Oℚ(√d). If Oℚ(√d) is a PID, then there is a simple approach to investigate how p splits. If p splits then (p)=P1⋅P2, where Pi=(αi), i=1,2. Any quadratic extension is Galois, and the Galois group permutes the prime ideal factors of (p). The factors are conjugate, so if α1=a+b√d we can assume α21*=a-b√d. Hence (p)=(α1)⋅(α1*)= (α1α1*)= (a2-db2).

Taking norms (to eliminate possible units ε∈Oℚ(√d)) reduces the problem to a Diophantine equation of the form ±p=a2-db2. With the problem thus reduced, a necessary condition for (p) to split (or ramify) is that the equation can be solved for a,b∈ℤ. A sufficient condition to show that (p) is inert, i. e. doesn’t split or ramify, is to show that the equation can’t be solved.

Let’s look at how that might work. For example, let d=3. Looking at the equations modulo 3, we have ±p≡a2 (mod 3). That is, either p or -p is a square modulo 3. Say p=5. The only nonzero square mod 3 is 1, and 5≢1 (mod 3). However -5≡1 (mod 3), so could we have -5=a2-3b2? Suppose there were some a,b∈ℤ such that -5=a2-3b2. Then instead of looking at the equation modulo 3, we could look at it modulo 5, and find that then a2≡3b2 (mod 5). If 5 divides either a or b, it divides both, and so 25 divides a2-3b2, which is impossible since 25∤5. Therefore 5∤b. ℤ/(5) is a field, so b must have an inverse c such that cb≡1 (mod 5). Therefore, (ac)2 ≡ 3(bc)2 ≡ 3 (mod 5), and so 3 is a square mod 5. But that can’t be, since only 1 and 4 are squares modulo 5. The contradiction implies -5=a2-3b2 has no solution for a,b∈ℤ.

All that does show 5 doesn’t split or ramify in ℚ(√3), hence it must be intert, but this approach is messy and still requires knowing that the integers of ℚ(√3) form a PID. We need to find a better way. Fortunately, there is one. But first let’s observe that this elementary discussion shows there is a fairly complicated interrelationship among:

  1. Factorization of (prime) ideals in extension fields,
  2. Whether a given ring of integers is a PID,
  3. Whether an integer prime can be represented as the norm of an integer in an extension field,
  4. Whether an integer can be represented by an expression of the form a2+db2 for a,b∈ℤ (in the case of quadratic extensions),
  5. Whether, for primes p,q∈ℤ, p is a square modulo q and/or q is a square modulo p.

The problem of representing an integer by an expression like a2+db2 is a question of solving a Diophantine equation, and more specifically is of the type known as representing a number by the value of a quadratic form. This question was studied extensively by Gauss, who proved a remarkable and very important result, known as the law of quadratic reciprocity, which relates p being a square modulo q to q being a square modulo p, for primes p,q.

We will take up quadratic reciprocity soon (and eventually much more general “reciprocity laws”), but right now, let’s attack head on the issue of determining how a prime of a base field splits in the ring of integers of an extension field. We will use abstract algebra instead of simple arithmetic to deal with this question. For simplicity, we’ll assume here that the base field is ℚ, even though many results can be stated, and are often valid, for more arbitrary base fields.

Chinese Remainder Theorem

The first piece of abstract algebra we’ll need is the Chinese Remainder Theorem (CRT). Although it’s been known since antiquity to hold for the ring ℤ, generalizations are actually true for any commutative ring.

Let R be a commutative ring, and suppose you have a collection of ideals Ij, for j in some index set, j∈J. Suppose that the ideals are relatively prime in pairs. In general that means that Ii+Ij=R if i≠j, and further, the product of ideals, Ii⋅Ij, is Ii∩Ij when i≠j. If R is Dedekind, then each ideal has a unique factorization into prime ideals, and they are relatively prime if Ii and Ij have no prime ideal factors in common when i≠j. Let I be the product of all Ij for j∈J, which is also the intersection of all Ij for j∈J, since the ideals are coprime in pairs.

The direct product of rings Ri for 1≤i≤k is defined to be the set of all ordered k-tuples (r1, … ,rk), for ri∈Ri, with ring structure given by element-wise addition and multiplication. The direct product is written as R1×…×Rk, or &Pi1≤i≤kRi.

Given all that, the CRT says the quotient ring R/I is isomorphic to the direct product of quotient rings Π1≤i≤k(R/Ii) via the ring homomorphism f(x)=(x+I1, … ,x+Ik) for all x∈R.

The CRT is very straightforward, since f is obviously a surjective ring homomorphism, and the kernel is I, since it’s the intersection of all Ii. (It’s straightforward, at least, if you’re used to concepts like “surjective” and “kernel”.)

Now we’ll apply the CRT in two different situations. First let R be the ring of integers OK of a finite extension K/ℚ, and Ii=Pi, 1≤i≤g, be the set of all distinct prime ideals of OK that divide (p)=pOK for some prime p∈ℤ. Then (p)=P1e1 ⋅⋅⋅ Pgeg, where ei are the ramification indices of each prime factor of (p). An application of CRT then shows that OK/(p) ≅ Π1≤i≤g(OK/Piei). Recall that for each i, OK/Pi is isomorphic to the finite field 𝔽qi, where qi=pfi for some fi, known as the degree of inertia of Pi. (This field is the extension of degree fi of 𝔽p=ℤ/pℤ.) Further, Σ1≤i≤geifi=[K:ℚ], the degree of the extension. Check here if you need to review these facts. Specifying how (p) splits in OK amounts to determination of the Pi and the numbers ei, fi, and g.

The second situation where we apply CRT involves the ring of polynomials in one variable over the finite field 𝔽p=ℤ/pℤ, denoted by 𝔽p[x]. Let f(x) be a monic irreducible polynomial with integer coefficients, i. e. an element of ℤ[x]. Let f(x) be f(x) with all coefficients reduced modulo p, an element of 𝔽p[x]. f(x) will not, in general, be irreducible in 𝔽p[x], so it will be a product of powers of irreducible factors: Π1≤i≤g(fi(x)ei), where fi(x)∈𝔽p[x]. Each quotient ring 𝔽p[x]/(fi(x)) is a finite field that is an extension of 𝔽p of some degree fi. In general, ei, fi, and g will be different, of course, from the same numbers in the preceding paragraph. But the CRT gives us an isomorphism 𝔽p[x]/(f(x)) ≅ Π1≤i≤g(𝔽p[x]/(fi(x)ei)).

Now, here’s the good news. For many field extensions K/ℚ, there exists an appropriate choice of f(x)∈ℤ[x] such that for most primes (depending on K and f(x)), the numbers ei, fi, and g will be the same for both applications of the CRT. Consequently, we will have OK/(p) ≅ 𝔽p[x]/(f(x)), because for corresponding factors of the direct product of rings, OK/Piei ≅ 𝔽p[x]/(fi(x)ei). As it happens, most primes don’t ramify for given choices of K and f(x), so that things are even simpler, since all ei=1, and all factors of the direct products are fields.

We can’t go into all of the details now as to how to choose f(x) and what the limitations on this result are. However, here are the basics. Any finite algebraic extension of ℚ (and indeed of any base field that is a finite algebraic extension of ℚ) can be generated by a single algebraic number θ: K=ℚ(θ), called a “primitive element”. In fact, &theta can be chosen to be an integer of K. Then the ring of integers of K, OK, is a finitely generated module over ℤ. (A module is like a vector space, except that all coefficients belong to a ring rather than a field.) The number of generators is the index [OK:ℤ[θ]]. (ℤ[θ] is just all polynomials in θ with coefficients in ℤ.) If p∈ℤ is any prime that does not divide [OK:ℤ[θ]], then the result of the preceding paragraph holds. If for some p and some choice of θ p does divide the index, then there may be another choice of θ for which p doesn’t divide the index. Unfortunately, there are some fields (even of degree 3 over ℚ) where this isn’t possible for some choices of p.

The situation is especially nice in the case of quadratic fields, K=ℚ(√d), square-free d∈ℤ. If d≢1 (mod 4); we can take θ=√d and f(x)=x2-d, since OK=ℤ[√d]. If d≡1 (mod 4), then the index [OK:ℤ[√d]]=2, and there’s a possible problem only for p=2. However, we still have OK/(p) ≅ 𝔽p[x]/(x2d) for all p≠2. From that it’s obvious that, except for p=2, (p) ramifies if p|d, (p) splits if d is a square modulo p, or else (p) is inert. That is exactly the conclusion we began with at the beginning of this article, on the basis of elementary considerations. Only now we need not assume that OK is a PID.

There are four important lessons to take away from this discussion.

First, there is a very close relationship between the arithmetic of algebraic number fields and the arithmetic of polynomials over a finite field. Not only do we have the isomorphism discussed above, but it turns out that a number of similar powerful theorems are true for both algebraic number fields and the field of quotients of polynomial rings over a finite field.

Second, a lot of the arithmetic of algebraic number fields can be analyzed in terms of what happens “locally” with the prime ideals of the ring of integers of the field.

Third, many of the results of algebraic number theory are fairly simple if the rings of integers are PIDs (or, equivalently, have unique factorization). Such results often remain true when the rings aren’t PIDs, though they can be a lot harder to prove. Often the path to proving such results involves considering the degree to which a given ring of integers departs from being a PID.

Fourth, and perhaps most importantly, abstract algebra is a very powerful tool for understanding algebraic number fields – and it is much easier to work with and understand than trying to use “elementary” methods with explicit calculations involving polynomials and their roots.

We will see these lessons validated time and again as we get deeper into the subject.

So where do we go from here? There are a lot of directions we could take, so we’ll probably jump around among a variety of topics.

Posted in Algebraic number theory, Number theory | Leave a comment

Splitting of prime ideals in quadratic extensions of ℚ, part 1

Our discussion of algebraic number theory returns by popular demand. Way back last April we presented some generalities on factorization of prime ideals in extension fields. (For explanation of what that means, including other necessary concepts, you’ll have to review earlier installments of this series, which can be found here.)

If this is all Greek to you, I apologize, but that’s unavoidable at this stage of a rather technical subject. You may want to go back to the earliest parts of the series to see how the subject got its start and why it may be interesting.

The discussion in the previous installment probably seems rather dry and abstract, but when we look at simple examples, such as quadratic extensions, why it’s interesting becomes clearer.

Because of how the ramification indexes and inertial degrees are related, for any prime ideal (p) of ℤ there are only three different possibilities for how the ideal factors in the ring of integers of a quadratic extension:

  1. (p)=P1⋅P2, so (p) splits completely. (e=f=1, g=2)
  2. (p)=P is a prime ideal in ℚ(√d), so p is inert. (e=g=1, f=2)
  3. (p)=P2 where P is prime, and p is ramified. (f=g=1, e=2)

It turns out that there are simple criteria for each of these cases. But figuring out what the criteria are is tricky.

Recall that in ℚ(√3) we found (13)=(4+√3)⋅(4-√3) and (-11)=(8+5√3)⋅(8-5√3), so both (11) and (13) split completely. Clearly, (3)=(√3)2, so (3) is an example of a prime ideal if ℚ that is ramified. How about an example of a prime ideal that is inert in the extension? This is a little harder for a couple of reasons. (p) will be inert just in case it neither splits nor is ramified, but we don’t yet have simple criteria to rule out those cases.

So let’s back up a little and look at the details. We found examples where (p) splits in the integers of ℚ(√3) by solving the equation ±p=a2-3b2, because that gave elements a±b√3 whose norm was ±p. Being able to find such elements guaranteed that the prime split. But ℚ(√d) with d=3 is a special case, since here d≡3 (mod 4). In that case, and also if d≡2 (mod 4), the integers of the extension have the form a+b√d with a,b∈ℚ.

If d≡1 (mod 4), integers can also have the form (a+b√d)/2, with a,b∈ℚ, and we might have a factorization like (p) = ((a+b√d)/2)⋅((a-b√d)/2), so we would have also to consider solvability of ±4p=a2-3b2. If we were to look at solvability of the approriate equation, according as to whether or not d≡1 (mod 4), the solvability would be a sufficient condition for (p) to split (or ramify if a=0). Notice that this sufficient condition for (p) to split holds regardless of whether or not ℚ(√d) is a PID.

Now we need to find a convenient necessary condition for (p) to split. Unfortunately, solvability of one simple equation is not a necessary condition in general. It would be, as we’ll see in a minute, if the ring of integers of ℚ(√d) happens to be a PID, as is true when d=3. However, in quadratic extensions where the ring of integers isn’t a PID, being unable to solve the applicable equation doesn’t guarantee (p) cannot split, because there might be non-principal ideals that are factors of (p).

So let’s ignore that problem for a moment and just focus on the case where the ring of integers of ℚ(√d) is a PID. Can we then find a necessary condition on p for (p) to split or ramify, i. e. for (p) to not be a prime ideal of the integers of ℚ(√d)? That is, what must be true about p if (p) splits or ramifies?

If (p) splits or ramifies, then (p)=P1⋅P2 for nontrivial ideals Pi. (The ideals are the same or distinct according as (p) ramifies or splits.) Assuming ℚ(√d) is a PID, then P1 is generated by a+b√d where both a,b∈ℚ, if d≡2 or 3 (mod 4), or else by (a+b√d)/2 with a,b∈ℚ, if d≡1 (mod 4). Likewise, the conjugate ideal P2 is generated by a-b√d or (a-b√d)/2. Since p∈P1⋅P2, by definition of a product of ideals, p is of the form p = ε(a+b√d)(a-b√d) = ε(a2-db2) or p = ε(a+b√d)(a-b√d)/4 = ε(a2-db2)/4 for some integer ε of ℚ(√d).

Recall that the norm of an element of a Galois extension field is the product of all conjugates of the element. So for an element that is also in the base field, the norm (with respect to a quadratic extension, which is always Galois) is the square of the element. Taking norms of both sides of the possible equations, then either p2 = N(ε)(a2-db2)2 or 16p2 = N(ε)(a2-db2)2. For simplicity, consider just the first case. Then N(ε) is a positive integer that has to be 1, p, or p2. If N(ε)≠1 then N(a±b√d) = a2-db2 must be ±1, so a±b√d must be a unit, and both Pi must be non-proper ideals (i. e. equal to the whole ring). Hence N(ε)=1. This will be true also in the other case (when d≡1 (mod 4)), so ±p=a2-db2 or ±4p=a2-db2. Consequently, solvability of the appropriate equation (depending on d mod 4), is a necessary condition for (p) to split or ramify.

So we have a necessary and sufficient condition for (p) to split or ramify in ℚ(√d), in terms of solvability of Diophantine equations, provided Oℚ(√d) is a PID. Since the only other possibility is for (p) to be inert, we also have a necessary and sufficient condition for that.

However, still assuming that Oℚ(√d) is a PID, we can find a further necessary condition for (p) to split or ramify. Take those equations we just found and reduce them modulo p. Then both equations become a2≡db2 (mod p). Since p is prime, ℤ/pℤ is a field. Assume first that b≢0 (mod p). Then b has an inverse in the finite field. So we have d≡(a/b)2 (mod p). In other words, d is a square mod p. This is the additional necessary condition we were looking for on p in order for (p) to split or ramify. Since it’s a necessary condition, if d is not a square mod p, then (p) must not split or ramify, and thus p is inert. And so, for d to be a non-square mod p is a sufficient condition for p to be inert.

(What if b≡0 (mod p)? Then b=b1p. So ±p = a2 – (b1p)2d or else ±4p = a2 – (b1p)2d. Either way, p∣a, hence p2 divides the right side of either equation, and hence the left side also. But that’s not possible unless p=2 – which is always a special case.)

To summarize, then, let p≠2 be prime and d square-free and not 0 or 1. Then the solvability of ±tp=a2-db2 (where t is 4 or 1 according as d≡1 (mod 4) or not), is sufficient for (p) to split or ramify. And if the integers of ℚ(√d) are a PID, then solvability of the appropriate equation provides a necessary and sufficient condition for (p) to split or ramify. Further, in that case, d being a square mod p is necessary for (p) to split or ramify.

Remarkably, d being a square mod p is a necessary and sufficient condition for (p) to split or ramify, even if the integers of ℚ(√d) aren’t a PID, but that’s harder to prove. Since solvability of Diophantine equations is generally not obvious by inspection, it’s very convenient to have a necessary and sufficient conditions for (p) to split or ramify simply in terms of the properties of d mod p.

In the next installment, which hopefully will not be as long in coming as this one, we’ll show a much cleaner way to state necessary and sufficient conditions for (p) to split, ramify, or remain inert, in the case of any quadratic extension of ℚ, whether or not the ring of integers is a PID. This will be done in terms of what has long been called a “reciprocity law”.

However, that will be only the beginning. It turns out that there are far more general kinds of reciprocity laws for many other types of field extensions. That’s what “class field theory” is all about, and why the whole subject is so appealing, once you get the basic ideas.


Posted in Algebraic number theory, Number theory | Leave a comment