Algebraic number theory, the subject we are laying the groundwork for here, is the theory of numbers that are solutions of certain types of polynomial equations. So we need to have a little chat about different types of numbers we may encounter. Much of this will be familiar to people who have paid attention in high school and college math classes. But even for these folks, the review may help refresh some memories.

The most intuitively “natural” sort of numbers are the “counting numbers”: 1, 2, 3, etc. The precise mathematical term for them, in fact, is **natural numbers**. 0 is also considered a natural number, though the concept was invented in India and seems to have been unknown to the ancient Greeks.

The next most “natural” sort of number includes the negatives of all natural numbers. Collectively all natural numbers and their negatives are known as **integers**. Mathematicians use the symbol ℤ to denote the set of all integers. The idea of negative numbers seems to have existed in China before 400 CE. The Chinese had specific tools for reckoning with negative quantities (e. g. debts), but they had even less algebra than the Greeks. For their part, the Greeks seem to have had no concept of negative quantities as such. Negative numbers may have made their first appearance in the written record in the work of the Indian mathematician Brahmagupta early in the 7th century CE. He seems to have been the first to use both 0 and negative numbers systematically, and even recognized that a negative number could be the root of a quadratic equation. (For instance, both +2 and -2 are solutions of x^{2}-4=0.) But since it was not easy to see a negative number of tangible things or count with negative numbers on one’s fingers, suspicion of them as mere fictions was widespread for centuries in the West. (Just as many today still regard “imaginary” numbers with deep suspicion.)

If the concept of symbolic equations involving unknown quantities had been better understood, negative numbers would have been accepted much more readily. They provide a means, after all, of solving even the simplest equations, such as x+1=0, a first degree equation in which all the coefficients are natural numbers.

The operation of division is the inverse of multiplication, and so the reciprocal of a nonzero number n is 1/n — 1 divided by n. Negative numbers are merely formed using subtraction (the inverse of addition), since -n = 0-n. So it’s curious that the Greeks didn’t think of negative numbers, though they, and other ancient people, did embrace fractions readily. One assumes this is because fractions arise naturally in geometry, measurement, commerce, and so forth. Fractions are just ratios of two natural numbers, in the form a/b for positive integers a and b, and so they are called **rational** numbers. If the numbers involved were allowed to be negative as well, the rational numbers too could be negative. Mathematicians use ℚ for the set of all such rational numbers. If the Greeks *had* been more capable of thinking abstractly in terms of solutions to equations, it would have been easy to define rational numbers as possible solutions to any linear equation of the form ax+b=0, where a (≠ 0) and b are integers.
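To make the last remark concrete, here is a minimal Python sketch that solves ax+b=0 exactly using the standard library's `fractions.Fraction`. The function name `solve_linear` is just an illustrative choice, not anything from the text.

```python
from fractions import Fraction

def solve_linear(a: int, b: int) -> Fraction:
    """Return the unique rational x with a*x + b == 0 (a must be nonzero)."""
    if a == 0:
        raise ValueError("a must be nonzero")
    return Fraction(-b, a)   # Fraction reduces to lowest terms automatically

x = solve_linear(6, -4)      # the equation 6x - 4 = 0
print(x)                     # 2/3
print(6 * x - 4 == 0)        # True
```

Every rational number arises this way, since a/b is the solution of bx-a=0.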

Geometry was the most developed form of mathematics in ancient Greece, so it was natural to think of numbers (apart from simple counting) as the lengths of lines, areas of circles, volumes of solids, etc. In other words, it was easy to perceive that arithmetic rules of working with counting numbers behaved in the same way as rules for adding and subtracting the lengths of lines, or computing areas and volumes by multiplication and division. It looked as though, perhaps, all numbers of any consequence should be rational. It thus came as a shocking revelation to the classical Greeks that there were “numbers” that could occur as lengths of lines in a geometric figure which could *not* be rational numbers. A proof of this was discovered by followers of Pythagoras, specifically that the length of the diagonal of a square whose sides had 1 unit of length could *not* be a rational number. In modern notation this length is simply √2.

The proof that √2 is not rational is simple. Suppose it were rational. Then √2 = a/b for natural numbers a and b. Hence a^{2} = 2b^{2}. We may suppose that the fraction is in lowest terms, so that a and b have no whole number factors in common. (Otherwise, just divide those out.) a^{2} is clearly an even whole number, so a must be even also. (If 2 divides a^{2}, it has to divide a as well, by the rule of **unique factorization** into prime numbers, also known as the **fundamental theorem of arithmetic**. As we shall see, this can be proven fairly easily.) So a is divisible by 2; say a = 2A. Then 4A^{2} = 2b^{2}, hence 2A^{2} = b^{2}. It follows that b^{2} is even, so b must be also. But that is a contradiction, since we could assume a and b had no factor in common. This contradiction means that the original assumption that √2 was rational must be false. QED.
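A finite computer search is no substitute for the proof, but it can be reassuring to see the claim hold numerically. The sketch below (the name `has_rational_sqrt2` and the search bound are arbitrary choices) looks for natural numbers a and b with a^{2} = 2b^{2} and finds none.

```python
from math import isqrt

def has_rational_sqrt2(max_b: int) -> bool:
    """Check whether a*a == 2*b*b for some natural a and some 1 <= b <= max_b."""
    for b in range(1, max_b + 1):
        a = isqrt(2 * b * b)          # the only possible candidate for a
        if a * a == 2 * b * b:
            return True
    return False

print(has_rational_sqrt2(10_000))     # False
```

The exact integer square root `math.isqrt` is used rather than floating point, so no rounding error can hide a solution.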

This was so shocking to its discoverers that everyone who learned of it was pledged to secrecy. After all, “not rational”, or “irrational” meant to the Greeks (just as in English) “unreasonable”. This linguistic fluke suggested that the whole field of endeavor of Greek mathematicians was deeply flawed, so it would be devastating to their prestige if this notion became widely known.

In truth, there is nothing inherently contradictory or unreasonable about “irrational” numbers. They simply are not ratios of integers, but they can occur as solutions of polynomial equations with rational coefficients: for example, ±√2, which are solutions of x^{2}-2 = 0. Numbers of this sort are called **algebraic numbers**, for obvious reasons. This class of algebraic numbers is the principal subject dealt with in **algebraic number theory**.

Algebraic numbers clearly exist, since the length of the diagonal of a unit square is certainly a meaningful concept. We’ve just seen the proof that some algebraic numbers are not rational. What are they then? In some sense, answering this question is what the subject of algebraic number theory is largely about. The theory attempts to say what they are in terms of mathematical properties they have. We will be spending most of our time on this issue.

Before we dive into that, let’s look at the broader context. Recall the result Gauss proved in his thesis, the fundamental theorem of algebra. This theorem is about the roots of a polynomial equation of the form

a_{n}x^{n} + a_{n-1}x^{n-1} + … + a_{1}x + a_{0} = 0

where n is a positive integer, x is an “unknown”, a_{n} ≠ 0, and for 0≤j≤n all a_{j} are rational (symbolically, a_{j}∈ℚ). Such an equation, as we noted, is said to be of **degree** n. This can be simplified a little, because if a_{n} ≠ 0, then we can divide both sides of the equation by it, without affecting any of the solutions of the equation (known as **roots**), and therefore assume that the coefficient of x^{n} is 1. The polynomial in such an equation is said to be **monic**. For simplicity, we often write the polynomial part of an equation as f(x) so that the equation is f(x) = 0.

What Gauss actually proved is that the polynomial in this equation factors completely into polynomials with real coefficients which have degree at most 2, even if the coefficients are any **real numbers**, not necessarily rationals. Hence there can be at most as many real roots as the degree of the polynomial. Intuitively, a real number is one that can be represented in decimal form as a whole number plus a fractional part that is an infinite sequence of decimal digits. The decimal digits in the fractional part may or may not repeat in groups, from a certain point on. (For instance, .000123123123… is a repeating decimal, while the fractional parts of the numbers √2 and π never repeat.) It is not hard to show that a number is rational if, and only if, its fractional part is a repeating decimal (a terminating decimal counts as one that repeats the digit 0). Thus the rational numbers form a subset of all real numbers. The set of all real numbers is denoted by ℝ.
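The reason a rational number has a repeating decimal is visible in ordinary long division: dividing by q leaves only q possible remainders, so a remainder must eventually recur, and from then on the digits cycle. Here is a small sketch of that idea in Python; the function name `decimal_expansion` and its return format are illustrative assumptions.

```python
def decimal_expansion(p: int, q: int):
    """Long division of p/q for p >= 0, q > 0.

    Returns (integer part, non-repeating digits, repeating digits).
    The repeating part is empty when the decimal terminates.
    """
    integer_part, remainder = divmod(p, q)
    digits = []
    seen = {}   # remainder -> position in digits where it first occurred
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        digit, remainder = divmod(remainder * 10, q)
        digits.append(str(digit))
    if remainder == 0:
        return integer_part, "".join(digits), ""
    start = seen[remainder]
    return integer_part, "".join(digits[:start]), "".join(digits[start:])

print(decimal_expansion(1, 7))         # (0, '', '142857')
print(decimal_expansion(1, 4))         # (0, '25', '')
print(decimal_expansion(41, 333000))   # (0, '000', '123')
```

The last line recovers the example .000123123123…, which equals 41/333000.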

Irrational numbers such as √2 are examples of real numbers which are not rational. However, √2 is an algebraic number because it is a root of x^{2}-2=0. Yet not all real numbers are algebraic. In fact, there are vastly more real numbers than there are algebraic numbers. In some sense, there are just as many rational numbers (or even integers) as there are algebraic numbers, because the set of all algebraic numbers can be put into a 1-to-1 correspondence with either ℤ or ℚ. (Algebraic numbers may actually be “complex”, as will be discussed shortly, but for now just think about *real* algebraic numbers.) All of these sets are subsets of ℝ that are strictly smaller than ℝ, because they cannot be put into 1-to-1 correspondence with all of ℝ. (Proof: If **A** is the set of all (real or complex) algebraic numbers, we can show it has a 1:1 correspondence with the positive integers. Any α∈**A** satisfies f(α)=0 for a nonzero polynomial f(x) with integer coefficients, obtained by clearing denominators. For any positive integer N, there are only finitely many polynomials of degree ≤ N with integer coefficients ≤ N in absolute value, and each such nonzero polynomial has at most N roots. In fact there are at most M = (2N+1)^{N+1} such polynomials, hence at most NM roots, so the roots that arise for each N can be numbered using finitely many positive integers. Taking N = 1, 2, 3, … in turn and skipping repeats, one can, in principle, write down all members of **A** in some order. Now keep only the real members of the list and define a new number r as the number whose n^{th} digit to the right of the decimal point is one more than the corresponding digit of the n^{th} member in the list (or 0 if that digit is 9). r is therefore a real number which cannot appear anywhere in the list, since it differs from every one of its members in at least one place. So there must be elements of ℝ that aren’t algebraic.)
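Both halves of this argument can be tried out on a small scale. The following Python sketch, with the hypothetical helper names `polynomials_of_height` and `diagonal`, first checks the count (2N+1)^{N+1} of coefficient lists for small N, then applies the digit-changing rule to a short made-up list of decimal expansions.

```python
from itertools import product

def polynomials_of_height(n: int):
    """Yield coefficient tuples (a_0, ..., a_n) with every |a_j| <= n."""
    return product(range(-n, n + 1), repeat=n + 1)

for n in range(1, 4):
    count = sum(1 for _ in polynomials_of_height(n))
    print(n, count, count == (2 * n + 1) ** (n + 1))   # the last value is True

def diagonal(expansions):
    """Build digits that differ from the n-th expansion in its n-th place."""
    return "".join(str((int(digits[n]) + 1) % 10)
                   for n, digits in enumerate(expansions))

# Three made-up fractional parts standing in for the start of the list.
print(diagonal(["500", "333", "149"]))   # 640
```

The count includes the zero polynomial and polynomials of degree less than N, which is why the text says "at most".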

A real number which is not algebraic is said to be **transcendental**. Curiously, even though there is a vast quantity of transcendental numbers, it is quite difficult to prove that specific numbers, such as π, are transcendental. In fact, it was not until 1873 that a “familiar” number (e, the base of the natural logarithms in this case) was shown to be transcendental, by Charles Hermite. π was shown to be transcendental in 1882, by C. F. Lindemann.

Let’s return to Gauss’ fundamental theorem of algebra. It is now possible to prove something more general than what Gauss initially showed. Namely, we can work in any algebraic structure called a **field**. And in an appropriate field, every polynomial of degree n has exactly n roots (counting repeated roots the appropriate number of times). We’ll explain more carefully later what a field is, but for now just accept that it is any system of objects (like numbers) that allow the arithmetic operations of addition, subtraction, multiplication, and division (except division by 0) following rules just like those of rational or real numbers. In this case, let F be any field, and f(x) be a monic polynomial with coefficients in F. Then it is possible to construct a slightly larger field E that contains F by adding certain new elements which are defined by simple polynomial relations. For instance, if F=ℚ, we can add or **adjoin** another element which we will denote by α and which has the property that α^{2}=2. This new field, which we denote by F(α), consists of all possible sums and differences of α with elements of F, as well as all products and quotients of such expressions. There are standard ways to do this construction rigorously and to prove that the result E=F(α) is a field, called an **extension field**. F is said to be a **subfield** of E. (Note that as sets, F⊆E.) You may think of the extension E, if you wish, as a collection of formal expressions of sums, differences, products, and quotients involving α and elements of F, always simplified by using the relationship α^{2}=2 whenever possible. That is, always replace α^{2} by 2 whenever it occurs.
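The "formal expressions" point of view is easy to imitate in code. In the sketch below, an element of ℚ(α) with α^{2}=2 is stored as a pair of rationals (a, b) standing for a + bα. The class name `QSqrt2` and its methods are illustrative assumptions, and the components are assumed to be `Fraction` values.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class QSqrt2:
    """An element a + b*alpha of Q(alpha), where alpha**2 == 2."""
    a: Fraction
    b: Fraction

    def __add__(self, other):
        return QSqrt2(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b*alpha)(c + d*alpha) = (ac + 2bd) + (ad + bc)*alpha,
        # after replacing alpha^2 by 2.
        return QSqrt2(self.a * other.a + 2 * self.b * other.b,
                      self.a * other.b + self.b * other.a)

    def inverse(self):
        # 1/(a + b*alpha) = (a - b*alpha) / (a^2 - 2b^2). The denominator is
        # nonzero for a nonzero element because sqrt(2) is irrational.
        d = self.a * self.a - 2 * self.b * self.b
        if d == 0:
            raise ZeroDivisionError("zero has no inverse")
        return QSqrt2(self.a / d, -self.b / d)

alpha = QSqrt2(Fraction(0), Fraction(1))
one = QSqrt2(Fraction(1), Fraction(0))
x = QSqrt2(Fraction(1), Fraction(1))      # 1 + alpha

print(alpha * alpha == QSqrt2(Fraction(2), Fraction(0)))   # True
print(x * x.inverse() == one)                              # True
```

Nothing here needs to know what α "is" as a real number; only the relation α^{2}=2 is used.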

Given all this, it can be shown that there is at least one root of f(x)=0 in some extension field of the field F that contains the coefficients of f(x). Call this root α, so that f(α)=0. With polynomials, there is a process very much like long division of integers which allows one to compute the quotient of f(x) divided by x-α, yielding another polynomial g(x) = f(x)/(x-α). This algorithm guarantees the coefficients of g(x) are in E=F(α) if the coefficients of f(x) are. (In particular, if the coefficients are actually in F.) Consequently, f(x) = (x-α)g(x), where g(x) is monic and has degree exactly one less than that of f(x). We can repeat this process with g(x), and so after exactly n steps, we will arrive at a complete factorization of f(x) into linear factors with coefficients (the constant terms) that are in some extension field of F. We might have to adjoin n different symbols (the roots of f(x)), but at least it can be done. (In fact, it can be shown there is a single additional element θ, called a **primitive element**, or a **generator** of the field, which is the only element that needs to be adjoined to F to produce an extension field E=F(θ) in which f(x) splits into linear factors. In other words, this field E contains all the roots of f(x)=0.)
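Division by a linear factor x-α is especially simple and is often called synthetic division. Here is a minimal Python sketch of it, using a cubic with rational roots so that everything stays inside ℚ. The function name `divide_by_linear` and the convention of listing coefficients from the highest degree down are assumptions made for this example.

```python
from fractions import Fraction

def divide_by_linear(coeffs, alpha):
    """Divide f(x) by (x - alpha).

    coeffs lists the coefficients of f from the highest degree down.
    Returns (quotient coefficients, remainder); the remainder equals f(alpha).
    """
    acc = []
    carry = 0
    for c in coeffs:
        carry = carry * alpha + c
        acc.append(carry)
    return acc[:-1], acc[-1]

# f(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
f = [Fraction(1), Fraction(-6), Fraction(11), Fraction(-6)]

g, r = divide_by_linear(f, Fraction(1))
print(g == [1, -5, 6], r == 0)     # True True: g(x) = x^2 - 5x + 6

h, r2 = divide_by_linear(g, Fraction(2))
print(h == [1, -3], r2 == 0)       # True True: h(x) = x - 3
```

Each step uses only additions and multiplications of the coefficients and α, which is why the quotient's coefficients stay in F(α), and a monic input gives a monic quotient.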

Note that unlike other sorts of numbers we considered before, the “numbers” in an extension field of ℚ may be somewhat abstract objects, such as formal expressions. They certainly can’t all be just expressions involving radicals, if the degree of the lowest degree polynomial they satisfy is 5 or more (as Abel and Galois proved). Nevertheless, as long as they are elements of ℝ, i. e. real algebraic numbers, they still “make sense”, say, as the length of a geometric object.

The real numbers themselves are rather abstract objects when one tries to construct them rigorously. There are ways to do this other than using decimal expansions. One such method, involving set theory, is called the method of **Dedekind cuts**, after its inventor Richard Dedekind (1831-1916). (Dedekind’s name will come up again, because he was one of the primary contributors to algebraic number theory.) More generally, we can adjoin to ℚ all possible **limits** of (convergent) **sequences** {a_{n}} of rational numbers to form the **completion** of ℚ considered as a **metric space**. We won’t attempt to describe these abstract constructions further. The point is that once one goes beyond the field ℚ of rationals, larger fields consist of objects which are somewhat more abstract — and to an extent arbitrary, subject only to the rules which define a field.

A perfect example of this is the field of **complex numbers**, which is obtained by adjoining the element i to the field ℝ of real numbers, subject only to the relation i^{2}=-1. So we can say that i=√(-1). What is i “actually”? It doesn’t matter. The only thing one needs to know is i^{2}=-1. There’s no cause for any skepticism about such **imaginary** numbers. Their existence is just as secure as that of any other abstract object of modern mathematics. If we adjoin i to ℝ, the field ℂ = ℝ(i) of complex numbers is what we get.

Another way to describe ℂ is as the set of all “numbers” of the form a+bi with a,b∈ℝ, i. e. ℂ = {a+bi | a,b∈ℝ}. Addition and multiplication are defined on this set by the rules (a+bi)+(c+di) = (a+c)+(b+d)i, and (a+bi)×(c+di) = (ac-bd)+(bc+ad)i. This is very much as if i were an “unknown” symbol like x, except that we always simplify expressions by using the relation i^{2}=-1.

There are other ways to think of this field. For instance, we can take it to be the set of all ordered pairs {(a,b) | a,b∈ℝ} where addition and multiplication are given by (a,b)+(c,d) = (a+c,b+d) and (a,b)×(c,d) = (ac-bd,bc+ad), as suggested by the preceding paragraph. In this notation, it is apparent that ℂ is “nothing but” the Cartesian plane ℝ×ℝ with a peculiar sort of multiplication. (Indeed, topologically, ℂ and ℝ×ℝ are the same.)
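The ordered-pair description translates directly into code. This short Python sketch (the function names `add` and `mul` are arbitrary) implements the two rules on plain tuples and compares one product with Python's built-in `complex` type.

```python
def add(z, w):
    """(a,b) + (c,d) = (a+c, b+d)"""
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def mul(z, w):
    """(a,b) x (c,d) = (ac-bd, bc+ad)"""
    (a, b), (c, d) = z, w
    return (a * c - b * d, b * c + a * d)

i = (0, 1)
print(mul(i, i))                      # (-1, 0), that is, i^2 = -1
print(mul((1, 2), (3, 4)))            # (-5, 10)
print(complex(1, 2) * complex(3, 4))  # (-5+10j), the same product
```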

One of the requirements of a field is that division by any element of the field except 0 is always possible; that is, all nonzero elements have a **multiplicative inverse**. What is 1/(a+bi), the inverse of a+bi? First, we use the notation (a+bi)* = a-bi for the operation of **complex conjugation**. a-bi is said to be the **complex conjugate** of a+bi. This is used quite frequently. Next we note that (a+bi)×(a+bi)* = a^{2} + b^{2}, a non-negative real number that is 0 if and only if a=b=0. So the square root of this is a real number, and we use the notation |a+bi| = √((a+bi)×(a+bi)*). This is called the **norm** (also the absolute value or modulus) of the complex number a+bi. It follows that if a+bi≠0, then its inverse is given by 1/(a+bi) = (a+bi)*/|a+bi|^{2}.
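As a quick check of the inverse formula, the following Python sketch computes 1/(a+bi) from the conjugate, using exact rational components so the result can be verified without rounding. The pair representation and the names `conjugate`, `inverse` and `mul` are assumptions of this example.

```python
from fractions import Fraction

def conjugate(z):
    a, b = z
    return (a, -b)

def mul(z, w):
    (a, b), (c, d) = z, w
    return (a * c - b * d, b * c + a * d)

def inverse(z):
    """1/(a+bi) = (a-bi)/(a^2+b^2) for a nonzero a+bi."""
    a, b = z
    n = a * a + b * b            # (a+bi)(a+bi)* = |a+bi|^2
    if n == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    c, d = conjugate(z)
    return (c / n, d / n)

z = (Fraction(3), Fraction(4))   # 3 + 4i, with |z|^2 = 25
w = inverse(z)
print(w == (Fraction(3, 25), Fraction(-4, 25)))   # True
print(mul(z, w) == (1, 0))                        # True
```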

Just a little more terminology and we can move on. The set of all polynomials in one variable that have coefficients in a field F is denoted by F[x]. A non-constant polynomial f(x)∈F[x] is said to be **irreducible** over F if its only factors belonging to F[x] are nonzero constants and constant multiples of itself. An irreducible polynomial is closely analogous to a prime number in the integers. Suppose an element α is a member of some extension E of F. f(x)∈F[x] is said to be a **minimal polynomial** for α if f(x) is not the zero polynomial, f(α) = 0, and no nonzero polynomial in F[x] of lower degree than f(x) has α as a root. It’s easy to show that when f(α)=0, f(x) is a minimal polynomial for α if and only if f(x) is irreducible over F. α is said to have **degree** n over F if n is the degree of its minimal polynomial. The **degree of an extension** E⊇F, denoted by [E:F], can be defined in various ways, but what it boils down to is that, when E=F(θ) for a primitive element θ, [E:F] is the degree of θ over F. In some ways, a better definition of the degree [E:F] comes about when we regard E as a **vector space** over F. This is a concept from **linear algebra**. In these terms, [E:F] is the **dimension** of E as a vector space over F.
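For a polynomial of degree 2 or 3 with integer coefficients, irreducibility over ℚ can be tested by hand: such a polynomial is reducible exactly when it has a rational root, and by the rational root theorem any rational root p/q in lowest terms has p dividing the constant term and q dividing the leading coefficient. The Python sketch below tries those finitely many candidates. The helper names are illustrative, and the code assumes nonzero leading and constant terms.

```python
from fractions import Fraction

def divisors(n: int):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients from highest degree down) by Horner's rule."""
    acc = 0
    for c in coeffs:
        acc = acc * x + c
    return acc

def rational_roots(coeffs):
    """All rational roots of an integer polynomial with nonzero end coefficients."""
    if coeffs[0] == 0 or coeffs[-1] == 0:
        raise ValueError("leading and constant coefficients must be nonzero")
    candidates = {Fraction(sign * p, q)
                  for p in divisors(coeffs[-1])
                  for q in divisors(coeffs[0])
                  for sign in (1, -1)}
    return sorted(x for x in candidates if evaluate(coeffs, x) == 0)

print(rational_roots([1, 0, -2]))        # []  so x^2 - 2 is irreducible over Q
print(rational_roots([1, 0, -4]) == [-2, 2])   # True: x^2 - 4 = (x-2)(x+2)
```

Since x^{2}-2 has no rational root, it is irreducible over ℚ and is therefore the minimal polynomial of √2, which has degree 2 over ℚ. The shortcut does not apply in degree 4 or higher, where a polynomial can factor without having a rational root.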

Given all that, we note that [ℂ:ℝ]=2 and that i is a primitive element for the extension ℂ⊇ℝ. ℂ has the very special property of being **algebraically closed**. This means that any polynomial in ℂ[x] factors completely into linear factors in ℂ[x]. In other words, there are no irreducible polynomials in ℂ[x] having degree more than 1, and all roots of any f(x)∈ℂ[x] actually lie in ℂ itself. These facts follow from Gauss’ fundamental theorem of algebra. (ℂ does have extensions of infinite degree, such as the field of **rational functions of one variable**, but we won’t go into that now.)

In order to avoid ambiguity, whenever discussing extension fields of some field F, we always assume the extensions are subfields of some fixed algebraically closed field that contains F. A smallest such field is known as an **algebraic closure** of F. For instance, ℂ is an algebraic closure of ℝ. ℚ has an algebraic closure (the field of all algebraic numbers) contained in ℂ that is of infinite degree over ℚ, but much smaller than ℂ itself. (For instance, the algebraic closure of ℚ contains no transcendental (i. e. non-algebraic) numbers.)

We’ve now given an overview, in fairly concrete terms, of the kind of numbers that occur in algebraic number theory. The next installment will be a discussion of **Diophantine equations**. These can be understood in very elementary terms, but actually solving them in many cases requires algebraic number theory, and is one of the principal motivations of the theory.
