# Terence Tao

*Updated:* 23 weeks 3 days ago

### Goursat and Furstenberg-Weiss type lemmas

I’m collecting in this blog post a number of simple group-theoretic lemmas, all of the following flavour: if $H$ is a subgroup of some product $G_1 \times \dots \times G_k$ of groups, then one of three things has to happen:

- ($H$ too small) $H$ is contained in some proper subgroup $G'_1 \times \dots \times G'_k$ of $G_1 \times \dots \times G_k$, or the elements of $H$ are constrained to some sort of equation that the full group $G_1 \times \dots \times G_k$ does not satisfy.
- ($H$ too large) $H$ contains some non-trivial normal subgroup $N_1 \times \dots \times N_k$ of $G_1 \times \dots \times G_k$, and as such actually arises by pullback from some subgroup of the quotient group $G_1/N_1 \times \dots \times G_k/N_k$.
- (Structure) There is some useful structural relationship between $H$ and the groups $G_1,\dots,G_k$.

It is perhaps easiest to explain the flavour of these lemmas with some simple examples, starting with the case where we are just considering subgroups $H$ of a single group $G$.

**Lemma 1** Let $H$ be a subgroup of a group $G$. Then exactly one of the following holds:

- (i) ($H$ too small) There exists a non-trivial group homomorphism $\eta: G \to K$ into a group $K = (K,\cdot)$ such that $\eta(h) = 1$ for all $h \in H$.
- (ii) ($H$ normally generates $G$) $G$ is generated as a group by the conjugates $gHg^{-1}$ of $H$.

*Proof:* Let $G'$ be the group normally generated by $H$, that is to say the group generated by the conjugates $gHg^{-1}$ of $H$. This is a normal subgroup of $G$ containing $H$ (indeed it is the smallest such normal subgroup). If $G'$ is all of $G$ we are in option (ii); otherwise we can take $K$ to be the quotient group $K := G/G'$ and $\eta$ to be the quotient map. Finally, if (i) holds, then all of the conjugates $gHg^{-1}$ of $H$ lie in the kernel of $\eta$, and so (ii) cannot hold.

Here is a “dual” to the above lemma:

**Lemma 2** Let $H$ be a subgroup of a group $G$. Then exactly one of the following holds:

- (i) ($H$ too large) $H$ is the pullback $H = \pi^{-1}(H')$ of some subgroup $H'$ of $G/N$ for some non-trivial normal subgroup $N$ of $G$, where $\pi: G \to G/N$ is the quotient map.
- (ii) ($H$ is core-free) $H$ does not contain any non-trivial conjugacy class $\{ ghg^{-1}: g \in G \}$.

*Proof:* Let $N$ be the normal core of $H$, that is to say the intersection of all the conjugates $gHg^{-1}$ of $H$. This is the largest normal subgroup of $G$ that is contained in $H$. If $N$ is non-trivial, we can quotient it out and end up with option (i). If instead $N$ is trivial, then there is no non-trivial element $h$ that lies in the core, hence no non-trivial conjugacy class lies in $H$ and we are in option (ii). Finally, if (i) holds, then every conjugacy class of an element of $N$ is contained in $N$ and hence in $H$, so (ii) cannot hold.
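The two proofs above are constructive, so they can be tried out on a small example. The following is a minimal sketch, assuming permutations of $\{0,\dots,n-1\}$ are stored as tuples; the helper names (`generated_subgroup`, `normal_closure`, `normal_core`) are illustrative choices and not from the original post. It computes the group normally generated by a subgroup (Lemma 1) and the normal core (Lemma 2) inside the symmetric group on four letters.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(gens, n):
    """Subgroup of S_n generated by gens (in a finite group, closure under products suffices)."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return elems

def conjugates(H, G):
    """All conjugates g H g^{-1}, one per g in G (with repetitions)."""
    return [{compose(compose(g, h), inverse(g)) for h in H} for g in G]

def normal_closure(H, G, n):
    """Group generated by all conjugates of H: the smallest normal subgroup containing H."""
    gens = set().union(*conjugates(H, G))
    return generated_subgroup(gens, n)

def normal_core(H, G):
    """Intersection of all conjugates of H: the largest normal subgroup inside H."""
    core = set(H)
    for c in conjugates(H, G):
        core &= c
    return core

n = 4
G = set(permutations(range(n)))                                  # the symmetric group S_4
three_cycle = generated_subgroup([(1, 2, 0, 3)], n)              # cyclic group of order 3
dihedral = generated_subgroup([(1, 2, 3, 0), (2, 1, 0, 3)], n)   # dihedral group of order 8

closure_3 = normal_closure(three_cycle, G, n)
closure_d = normal_closure(dihedral, G, n)
core_3 = normal_core(three_cycle, G)
core_d = normal_core(dihedral, G)
print(len(closure_3), len(closure_d), len(core_3), len(core_d))
```

In this example the order-3 subgroup normally generates only the alternating group, so it falls under option (i) of Lemma 1 (the sign homomorphism annihilates it), while the dihedral subgroup normally generates everything. The dihedral subgroup has a non-trivial core (the Klein four-group), so it falls under option (i) of Lemma 2, whereas the order-3 subgroup is core-free.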

For subgroups of nilpotent groups, we have a nice dichotomy that detects properness of a subgroup through abelian representations:

**Lemma 3** Let $H$ be a subgroup of a nilpotent group $G$. Then exactly one of the following holds:

- (i) ($H$ too small) There exists a non-trivial group homomorphism $\eta: G \to K$ into an abelian group $K = (K,+)$ such that $\eta(h) = 0$ for all $h \in H$.
- (ii) $H = G$.

Informally: if $h$ is a variable ranging in a subgroup $H$ of a nilpotent group $G$, then either $h$ is unconstrained (in the sense that it really ranges in all of $G$), or it obeys some abelian constraint $\eta(h) = 0$.

*Proof:* By definition of nilpotency, the lower central series

$$ G_2 := [G,G]; \quad G_3 := [G,G_2]; \quad \dots$$

eventually becomes trivial.

Since $G_2$ is a normal subgroup of $G$, $HG_2$ is also a subgroup of $G$. Suppose first that $HG_2$ is a proper subgroup of $G$, then the quotient map $\eta: G \to G/HG_2$ is a non-trivial homomorphism to an abelian group $G/HG_2$ that annihilates $H$, and we are in option (i). Thus we may assume that $HG_2 = G$, and thus

$$ G_2 = [G,G] = [HG_2, HG_2].$$

Note that modulo the normal group $G_3$, $G_2$ commutes with $G$, hence

$$ [HG_2, HG_2] \subset [H,H] G_3 \subset H G_3$$

and thus

$$ G = H G_2 \subset H H G_3 = H G_3.$$

We conclude that $HG_3 = G$. One can continue this argument by induction to show that $HG_i = G$ for every $i$; taking $i$ large enough we end up in option (ii). Finally, it is clear that (i) and (ii) cannot both hold.

**Remark 4** When the group $G$ is locally compact and $H$ is closed, one can take the homomorphism $\eta$ in Lemma 3 to be continuous, and by using Pontryagin duality one can also take the target group $K$ to be the unit circle $\mathbf{R}/\mathbf{Z}$. Thus $\eta$ is now a character of $G$. Similar considerations hold for some of the later lemmas in this post. Discrete versions of the above lemma, in which the group $H$ is replaced by some orbit of a polynomial map on a nilmanifold, were obtained by Leibman and are important in the equidistribution theory of nilmanifolds; see this paper of Ben Green and myself for further discussion.

Here is an analogue of Lemma 3 for special linear groups, due to Serre (IV-23):

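**Lemma 5** Let $p \geq 5$ be a prime, and let $H$ be a closed subgroup of $SL_2(\mathbf{Z}_p)$, where $\mathbf{Z}_p$ is the ring of $p$-adic integers. Then exactly one of the following holds:

- (i) ($H$ too small) There exists a proper subgroup $H'$ of $SL_2(\mathbf{F}_p)$ such that $h \bmod p \in H'$ for all $h \in H$.
- (ii) $H = SL_2(\mathbf{Z}_p)$.

*Proof:* It is a standard fact that the reduction of $SL_2(\mathbf{Z}_p)$ mod $p$ is $SL_2(\mathbf{F}_p)$, hence (i) and (ii) cannot both hold.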

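Suppose that (i) fails, then for every $g \in SL_2(\mathbf{Z}_p)$ there exists $h \in H$ such that $h = g \bmod p$, which we write as

$$ h = g + O(p).$$

We now claim inductively that for any $j \geq 0$ and $g \in SL_2(\mathbf{Z}_p)$, there exists $h \in H$ with $h = g + O(p^{j+1})$; taking limits as $j \to \infty$ using the closed nature of $H$ will then place us in option (ii).

The case $j=0$ is already handled, so now suppose $j=1$. If $g \in SL_2(\mathbf{Z}_p)$, we see from the $j=0$ case that we can write $g = hg'$ where $h \in H$ and $g' = 1 + O(p)$. Thus to establish the $j=1$ claim it suffices to do so under the additional hypothesis that $g = 1 + O(p)$.

First suppose that $g = 1 + pX + O(p^2)$ for some $X \in M_2(\mathbf{Z}_p)$ with $X^2 = 0 \bmod p$. By the $j=0$ case, we can find $h \in H$ of the form $h = 1 + X + pY + O(p^2)$ for some $Y \in M_2(\mathbf{Z}_p)$. Raising to the $p^{\mathrm{th}}$ power and using $X^2 = 0 \bmod p$ and $p \geq 5$, we note that $h^p = 1 + pX + O(p^2)$, giving the claim in this case.

Any $2 \times 2$ matrix of trace zero with coefficients in $\mathbf{F}_p$ is a linear combination of $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$, $\begin{pmatrix} 1 & -1 \\ 1 & -1 \end{pmatrix}$ and is thus a sum of matrices that square to zero. Hence, if $g \in SL_2(\mathbf{Z}_p)$ is of the form $g = 1 + O(p)$, then $g = 1 + pY + O(p^2)$ for some matrix $Y$ of trace zero, and thus one can write $g$ (up to $O(p^2)$ errors) as the finite product of matrices of the form $1 + pY$ with $Y^2 = 0$. By the previous arguments, each such matrix $1 + pY$ lies in $H$ up to $O(p^2)$ errors, and hence $g$ does also. This completes the proof of the $j=1$ case.

Now suppose $j \geq 2$ and the claim has already been proven for $j-1$. Arguing as before, it suffices to close the induction under the additional hypothesis that $g = 1 + O(p^j)$, thus we may write $g = 1 + p^j X + O(p^{j+1})$. By induction hypothesis, we may find $h \in H$ with $h = 1 + p^{j-1} X + O(p^j)$. But then $h^p = 1 + p^j X + O(p^{j+1})$, and we are done.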

We note a generalisation of Lemma 3 that involves two groups rather than just one:

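**Lemma 6** Let $H$ be a subgroup of a product $G_1 \times G_2$ of two nilpotent groups $G_1, G_2$. Then exactly one of the following holds:

- (i) ($H$ too small) There exist group homomorphisms $\eta_1: H_1 \to K$, $\eta_2: G_2 \to K$ into an abelian group $K = (K,+)$, with $\eta_2$ non-trivial, such that $\eta_1(h_1) + \eta_2(h_2) = 0$ for all $(h_1,h_2) \in H$, where $H_1$ is the projection of $H$ to $G_1$.
- (ii) $H = H_1 \times G_2$ for some subgroup $H_1$ of $G_1$.

*Proof:* Consider the group $\{ h_2 \in G_2: (1,h_2) \in H \}$. This is a subgroup of $G_2$. If it is all of $G_2$, then $H$ must be a Cartesian product $H = H_1 \times G_2$ and option (ii) holds. So suppose that this group is a proper subgroup of $G_2$. Applying Lemma 3, we obtain a non-trivial group homomorphism $\eta_2: G_2 \to K$ into an abelian group $K = (K,+)$ such that $\eta_2(h_2) = 0$ whenever $(1,h_2) \in H$. For any $h_1$ in the projection $H_1$ of $H$ to $G_1$, there is thus a unique quantity $\eta_1(h_1) \in K$ such that $\eta_1(h_1) + \eta_2(h_2) = 0$ whenever $(h_1,h_2) \in H$. One easily checks that $\eta_1$ is a homomorphism, so we are in option (i).

Finally, it is clear that (i) and (ii) cannot both hold, since (i) places a non-trivial constraint on the second component $h_2$ of an element $(h_1,h_2)$ of $H$ for any fixed choice of $h_1$.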

We also note a similar variant of Lemma 5, which is Lemme 10 of this paper of Serre:

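**Lemma 7** Let $p \geq 5$ be a prime, and let $H$ be a closed subgroup of $SL_2(\mathbf{Z}_p) \times SL_2(\mathbf{Z}_p)$. Then exactly one of the following holds:

- (i) ($H$ too small) There exists a proper subgroup $H'$ of $SL_2(\mathbf{F}_p) \times SL_2(\mathbf{F}_p)$ such that $h \bmod p \in H'$ for all $h \in H$.
- (ii) $H = SL_2(\mathbf{Z}_p) \times SL_2(\mathbf{Z}_p)$.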

*Proof:* As in the proof of Lemma 5, (i) and (ii) cannot both hold. Suppose that (i) does not hold, then for any there exists such that . Similarly, there exists with . Taking commutators of and , we can find with . Continuing to take commutators with and extracting a limit (using compactness and the closed nature of ), we can find with . Thus, the closed subgroup of does not obey conclusion (i) of Lemma 5, and must therefore obey conclusion (ii); that is to say, contains . Similarly contains ; multiplying, we end up in conclusion (ii).

The most famous result of this type is of course the Goursat lemma, which we phrase here in a somewhat idiosyncratic manner to conform to the pattern of the other lemmas in this post:

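**Lemma 8 (Goursat lemma)** Let $H$ be a subgroup of a product $G_1 \times G_2$ of two groups $G_1, G_2$. Then one of the following holds:

- (i) ($H$ too small) $H$ is contained in $G'_1 \times G'_2$ for some subgroups $G'_1$, $G'_2$ of $G_1, G_2$ respectively, with either $G'_1 \subsetneq G_1$ or $G'_2 \subsetneq G_2$ (or both).
- (ii) ($H$ too large) There exist normal subgroups $N_1, N_2$ of $G_1, G_2$ respectively, not both trivial, such that $H = \pi^{-1}(H')$ arises from a subgroup $H'$ of $G_1/N_1 \times G_2/N_2$, where $\pi: G_1 \times G_2 \to G_1/N_1 \times G_2/N_2$ is the quotient map.
- (iii) (Isomorphism) There is a group isomorphism $\phi: G_1 \to G_2$ such that $H = \{ (g_1, \phi(g_1)): g_1 \in G_1 \}$ is the graph of $\phi$. In particular, $G_1$ and $G_2$ are isomorphic.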

Here we almost have a trichotomy, because option (iii) is incompatible with both option (i) and option (ii). However, it is possible for options (i) and (ii) to simultaneously hold.

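*Proof:* If either of the projections $\pi_1: H \to G_1$, $\pi_2: H \to G_2$ from $H$ to the factor groups $G_1, G_2$ (thus $\pi_1(h_1,h_2) = h_1$ and $\pi_2(h_1,h_2) = h_2$) fails to be surjective, then we are in option (i). Thus we may assume that these maps are surjective.

Next, if either of the maps $\pi_1$, $\pi_2$ fails to be injective, then at least one of the kernels $\{1\} \times N_2 := \ker \pi_1$, $N_1 \times \{1\} := \ker \pi_2$ is non-trivial. We can then descend down to the quotient $G_1/N_1 \times G_2/N_2$ and end up in option (ii).

The only remaining case is when the group homomorphisms $\pi_1, \pi_2$ are both bijections, hence are group isomorphisms. If we set $\phi := \pi_2 \circ \pi_1^{-1}$ we end up in case (iii).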
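The case analysis in this proof (projections, then kernels, then the graph case) can be checked by brute force on a small abelian example. Below is a minimal sketch, assuming the two factors are the cyclic groups $\mathbf{Z}/m$ and $\mathbf{Z}/n$ written additively; the function names are illustrative and not from the original post. It enumerates every subgroup of the product and records which of the options (i), (ii), (iii) it satisfies.

```python
from itertools import product

def all_subgroups(m, n):
    """All subgroups of Z_m x Z_n; each one is generated by at most two elements."""
    G = list(product(range(m), range(n)))
    found = set()
    for a in G:
        for b in G:
            H = frozenset(((i * a[0] + j * b[0]) % m, (i * a[1] + j * b[1]) % n)
                          for i in range(m * n) for j in range(m * n))
            found.add(H)
    return found

def classify(H, m, n):
    """Return the set of Goursat options ("i", "ii", "iii") that the subgroup H satisfies."""
    options = set()
    proj1 = {a for a, _ in H}
    proj2 = {b for _, b in H}
    if len(proj1) < m or len(proj2) < n:
        options.add("i")                 # some projection is not surjective
    N1 = {a for a, b in H if b == 0}     # kernel of the projection to the second factor
    N2 = {b for a, b in H if a == 0}     # kernel of the projection to the first factor
    if len(N1) > 1 or len(N2) > 1:
        options.add("ii")                # H contains the non-trivial subgroup N1 x N2
    if len(H) == m == n and len(proj1) == m and len(proj2) == n:
        options.add("iii")               # H is the graph of a bijective homomorphism
    return options

subgroups = all_subgroups(4, 4)
results = [classify(H, 4, 4) for H in subgroups]
graphs = [r for r in results if "iii" in r]
print(len(subgroups), len(graphs))
```

For $\mathbf{Z}/4 \times \mathbf{Z}/4$ every subgroup lands in at least one option, the graph case occurs exactly for the two automorphisms $x \mapsto x$ and $x \mapsto 3x$, and subgroups such as $\mathbf{Z}/4 \times \{0\}$ satisfy (i) and (ii) at the same time.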

We can combine the Goursat lemma with Lemma 3 to obtain a variant:

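**Corollary 9 (Nilpotent Goursat lemma)** Let $H$ be a subgroup of a product $G_1 \times G_2$ of two nilpotent groups $G_1, G_2$. Then one of the following holds:

- (i) ($H$ too small) There exists $i=1,2$ and a non-trivial group homomorphism $\eta_i: G_i \to K$ into an abelian group $K = (K,+)$ such that $\eta_i(h_i) = 0$ for all $(h_1,h_2) \in H$.
- (ii) ($H$ too large) There exist normal subgroups $N_1, N_2$ of $G_1, G_2$ respectively, not both trivial, such that $H = \pi^{-1}(H')$ arises from a subgroup $H'$ of $G_1/N_1 \times G_2/N_2$.
- (iii) (Isomorphism) There is a group isomorphism $\phi: G_1 \to G_2$ such that $H = \{ (g_1, \phi(g_1)): g_1 \in G_1 \}$ is the graph of $\phi$. In particular, $G_1$ and $G_2$ are isomorphic.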

*Proof:* If Lemma 8(i) holds, then by applying Lemma 3 we arrive at our current option (i). The other options are unchanged from Lemma 8, giving the claim.

Now we present a lemma involving three groups that is known in ergodic theory contexts as the “Furstenberg-Weiss argument”, as an argument of this type arose in this paper of Furstenberg and Weiss, though perhaps it implicitly appears in other contexts also. It has the remarkable feature of being able to enforce the abelian nature of one of the groups once the other options of the lemma are excluded.

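**Lemma 10 (Furstenberg-Weiss lemma)** Let $H$ be a subgroup of a product $G_1 \times G_2 \times G_3$ of three groups $G_1, G_2, G_3$. Then one of the following holds:

- (i) ($H$ too small) There is some proper subgroup $G'_3$ of $G_3$ and some $i=1,2$ such that $h_3 \in G'_3$ whenever $(h_1,h_2,h_3) \in H$ and $h_i = 1$.
- (ii) ($H$ too large) There exists a non-trivial normal subgroup $N_3$ of $G_3$ with $G_3/N_3$ abelian, such that $H = \pi^{-1}(H')$ arises from a subgroup $H'$ of $G_1 \times G_2 \times G_3/N_3$, where $\pi: G_1 \times G_2 \times G_3 \to G_1 \times G_2 \times G_3/N_3$ is the quotient map.
- (iii) $G_3$ is abelian.

*Proof:* If the group $\{ h_3 \in G_3: (1,h_2,h_3) \in H \text{ for some } h_2 \in G_2 \}$ is a proper subgroup of $G_3$, then we are in option (i) (with $i=1$), so we may assume that

$$ \{ h_3 \in G_3: (1,h_2,h_3) \in H \text{ for some } h_2 \in G_2 \} = G_3.$$

Similarly we may assume that

$$ \{ h_3 \in G_3: (h_1,1,h_3) \in H \text{ for some } h_1 \in G_1 \} = G_3.$$

Now let $g_3, g'_3$ be any two elements of $G_3$. By the above, we can find $h_1 \in G_1$ and $h_2 \in G_2$ such that $(1,h_2,g_3)$ and $(h_1,1,g'_3)$ both lie in $H$. Taking commutators to eliminate the first two components, we conclude that $(1,1,[g_3,g'_3]) \in H$. Thus $H$ contains $\{1\} \times \{1\} \times [G_3,G_3]$. If $G_3$ fails to be abelian, then $N_3 := [G_3,G_3]$ is a non-trivial normal subgroup of $G_3$ with $G_3/N_3$ abelian, and $H$ is the pullback of its image in $G_1 \times G_2 \times G_3/N_3$, placing us in option (ii). The only remaining case is when $G_3$ is abelian, which is option (iii).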

As before, we can combine this with previous lemmas to obtain a variant in the nilpotent case:

**Lemma 11 (Nilpotent Furstenberg-Weiss lemma)** Let be a subgroup of a product of three nilpotent groups . Then one of the following hold:

- (i) ( too small) There exists and group homomorphisms , for some abelian group , with non-trivial, such that whenever , where is the projection of to .
- (ii) ( too large) There exists a non-trivial normal subgroup of , such that arises from a subgroup of .
- (iii) is abelian.

Informally, this lemma asserts that if $(h_1,h_2,h_3)$ is a variable ranging in some subgroup $H$, then either (i) there is a non-trivial abelian equation that constrains $h_3$ in terms of either $h_1$ or $h_2$; (ii) $h_3$ is not fully determined by $h_1$ and $h_2$; or (iii) $G_3$ is abelian.

*Proof:* Applying Lemma 10, we are already done if conclusions (ii) or (iii) of that lemma hold, so suppose instead that conclusion (i) holds for say . Then the group is not of the form , since it only contains those with . Applying Lemma 6, we obtain group homomorphisms , into an abelian group , with non-trivial, such that whenever , placing us in option (i).

The Furstenberg-Weiss argument is often used (though not precisely in this form) to establish that certain key structure groups arising in ergodic theory are abelian; see for instance Proposition 6.3(1) of this paper of Host and Kra for an example.

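One can get more structural control on $G_1, G_2, G_3$ in the Furstenberg-Weiss lemma in option (iii) if one also broadens options (i) and (ii):

**Lemma 12 (Variant of Furstenberg-Weiss lemma)** Let $H$ be a subgroup of a product $G_1 \times G_2 \times G_3$ of three groups $G_1, G_2, G_3$. Then one of the following holds:

- (i) ($H$ too small) There is some proper subgroup $G'_{ij}$ of $G_i \times G_j$ for some $1 \leq i < j \leq 3$ such that $(h_i,h_j) \in G'_{ij}$ whenever $(h_1,h_2,h_3) \in H$. (In other words, the projection of $H$ to $G_i \times G_j$ is not surjective.)
- (ii) ($H$ too large) There exist normal subgroups $N_1, N_2, N_3$ of $G_1, G_2, G_3$ respectively, not all trivial, such that $H = \pi^{-1}(H')$ arises from a subgroup $H'$ of $G_1/N_1 \times G_2/N_2 \times G_3/N_3$, where $\pi: G_1 \times G_2 \times G_3 \to G_1/N_1 \times G_2/N_2 \times G_3/N_3$ is the quotient map.
- (iii) $G_1, G_2, G_3$ are abelian and isomorphic. Furthermore, there exist isomorphisms $\phi_1: G_1 \to K$, $\phi_2: G_2 \to K$, $\phi_3: G_3 \to K$ to an abelian group $K = (K,+)$ such that $$ H = \{ (g_1,g_2,g_3) \in G_1 \times G_2 \times G_3: \phi_1(g_1) + \phi_2(g_2) + \phi_3(g_3) = 0 \}.$$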

The ability to encode an abelian additive relation in terms of group-theoretic properties is vaguely reminiscent of the group configuration theorem.

*Proof:* We apply Lemma 10. Option (i) of that lemma implies option (i) of the current lemma, and similarly for option (ii), so we may assume without loss of generality that $G_3$ is abelian. By permuting we may also assume that $G_1, G_2$ are abelian, and will use additive notation for these groups.

We may assume that the projections of $H$ to $G_1 \times G_2$ and $G_3$ are surjective, else we are in option (i). The group $\{ g_3 \in G_3: (0,0,g_3) \in H \}$ is then a normal subgroup of $G_3$; we may assume it is trivial, otherwise we can quotient it out and be in option (ii). Thus $H$ can be expressed as a graph $\{ (h_1,h_2,\phi(h_1,h_2)): h_1 \in G_1, h_2 \in G_2 \}$ for some map $\phi: G_1 \times G_2 \to G_3$. As $H$ is a group, $\phi$ must be a homomorphism, and we can write it as $\phi(h_1,h_2) = -\phi_1(h_1) - \phi_2(h_2)$ for some homomorphisms $\phi_1: G_1 \to G_3$, $\phi_2: G_2 \to G_3$. Thus elements $(h_1,h_2,h_3)$ of $H$ obey the constraint $\phi_1(h_1) + \phi_2(h_2) + h_3 = 0$.

If $\phi_1$ or $\phi_2$ fails to be injective, then we can quotient out by their kernels and end up in option (ii). If $\phi_1$ fails to be surjective, then the projection of $H$ to $G_2 \times G_3$ also fails to be surjective (since for $(h_1,h_2,h_3) \in H$, $\phi_2(h_2) + h_3$ is now constrained to lie in the range of $\phi_1$) and we are in option (i). Similarly if $\phi_2$ fails to be surjective. Thus we may assume that the homomorphisms $\phi_1, \phi_2$ are bijective and thus group isomorphisms. Setting $\phi_3$ to the identity, we arrive at option (iii).
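The model example for option (iii) is the "additive relation" subgroup of a cube of a cyclic group. The sketch below is an illustration under that assumption (three copies of $\mathbf{Z}/n$ with the relation $a+b+c=0$, names chosen here for the example); it checks that this subgroup avoids options (i) and (ii), and contrasts it with a subgroup cut out by $a+b=0$, which does not.

```python
from itertools import product

n = 5
triples = list(product(range(n), repeat=3))
H_sum = {(a, b, c) for a, b, c in triples if (a + b + c) % n == 0}
H_bad = {(a, b, c) for a, b, c in triples if (a + b) % n == 0}

def pair_projection(H, i, j):
    """Projection of H to the product of the i-th and j-th factors."""
    return {(h[i], h[j]) for h in H}

def axis_kernel(H, i):
    """Elements g such that H contains the triple with g in slot i and 0 elsewhere."""
    return {h[i] for h in H if all(h[k] == 0 for k in range(3) if k != i)}

def too_small(H):
    """Option (i): some projection to a pair of factors is not surjective."""
    return any(len(pair_projection(H, i, j)) < n * n for i, j in [(0, 1), (0, 2), (1, 2)])

def too_large(H):
    """Option (ii) in this abelian setting: H contains a non-trivial subgroup of one factor."""
    return any(len(axis_kernel(H, i)) > 1 for i in range(3))

print(too_small(H_sum), too_large(H_sum), too_small(H_bad), too_large(H_bad))
```

Here `H_sum` is neither too small nor too large, matching option (iii) with all three isomorphisms equal to the identity, while `H_bad` fails on both counts: its projection to the first two factors is a proper subgroup, and it contains the whole third factor.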

Combining this lemma with Lemma 3, we obtain a nilpotent version:

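**Corollary 13 (Variant of nilpotent Furstenberg-Weiss lemma)** Let $H$ be a subgroup of a product $G_1 \times G_2 \times G_3$ of three nilpotent groups $G_1, G_2, G_3$. Then one of the following holds:

- (i) ($H$ too small) There are homomorphisms $\eta_i: G_i \to K$, $\eta_j: G_j \to K$ to some abelian group $K = (K,+)$ for some $1 \leq i < j \leq 3$, with $\eta_i, \eta_j$ not both trivial, such that $\eta_i(h_i) + \eta_j(h_j) = 0$ whenever $(h_1,h_2,h_3) \in H$.
- (ii) ($H$ too large) There exist normal subgroups $N_1, N_2, N_3$ of $G_1, G_2, G_3$ respectively, not all trivial, such that $H = \pi^{-1}(H')$ arises from a subgroup $H'$ of $G_1/N_1 \times G_2/N_2 \times G_3/N_3$, where $\pi: G_1 \times G_2 \times G_3 \to G_1/N_1 \times G_2/N_2 \times G_3/N_3$ is the quotient map.
- (iii) $G_1, G_2, G_3$ are abelian and isomorphic. Furthermore, there exist isomorphisms $\phi_1: G_1 \to K$, $\phi_2: G_2 \to K$, $\phi_3: G_3 \to K$ to an abelian group $K = (K,+)$ such that $$ H = \{ (g_1,g_2,g_3) \in G_1 \times G_2 \times G_3: \phi_1(g_1) + \phi_2(g_2) + \phi_3(g_3) = 0 \}.$$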

Here is another variant of the Furstenberg-Weiss lemma, attributed to Serre by Ribet (see Lemma 3.3):

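**Lemma 14 (Serre’s lemma)** Let $H$ be a subgroup of a finite product $G_1 \times \dots \times G_k$ of groups $G_1,\dots,G_k$ with $k \geq 2$. Then one of the following holds:

- (i) ($H$ too small) There is some proper subgroup $G'_{ij}$ of $G_i \times G_j$ for some $1 \leq i < j \leq k$ such that $(h_i,h_j) \in G'_{ij}$ whenever $(h_1,\dots,h_k) \in H$.
- (ii) ($H$ too large) One has $H = G_1 \times \dots \times G_k$.
- (iii) One of the $G_i$ has a non-trivial abelian quotient $G_i/N_i$.

*Proof:* The claim is trivial for $k=2$ (and we don’t need (iii) in this case), so suppose that $k \geq 3$. We can assume that each $G_i$ is a perfect group, $G_i = [G_i,G_i]$, otherwise we can quotient out by the commutator and arrive in option (iii). Similarly, we may assume that all the projections of $H$ to $G_i \times G_j$, $1 \leq i < j \leq k$, are surjective, otherwise we are in option (i).

We now claim that for any $1 \leq j < k$ and any $g_k \in G_k$, one can find $(h_1,\dots,h_k) \in H$ with $h_i = 1$ for $1 \leq i \leq j$ and $h_k = g_k$. For $j=1$ this follows from the surjectivity of the projection of $H$ to $G_1 \times G_k$. Now suppose inductively that $1 < j < k$ and the claim has already been proven for $j-1$. Since $G_k$ is perfect, it suffices to establish this claim for $g_k$ of the form $g_k = [g'_k, g''_k]$ for some $g'_k, g''_k \in G_k$. By induction hypothesis, we can find $(h'_1,\dots,h'_k) \in H$ with $h'_i = 1$ for $1 \leq i < j$ and $h'_k = g'_k$. By surjectivity of the projection of $H$ to $G_j \times G_k$, one can find $(h''_1,\dots,h''_k) \in H$ with $h''_j = 1$ and $h''_k = g''_k$. Taking commutators of these two elements, we obtain the claim.

Setting $j = k-1$, we conclude that $H$ contains $1 \times \dots \times 1 \times G_k$. Similarly for permutations. Multiplying these together we see that $H$ contains all of $G_1 \times \dots \times G_k$, and we are in option (ii).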

### Is there a non-analytic function with all differences analytic?

I was asked the following interesting question from a bright high school student I am working with, to which I did not immediately know the answer:

**Question 1** Does there exist a smooth function $f: \mathbf{R} \to \mathbf{R}$ which is not real analytic, but such that all the differences $x \mapsto f(x+h) - f(x)$ are real analytic for every $h \in \mathbf{R}$?

The hypothesis implies that the Newton quotients $\frac{f(x+h)-f(x)}{h}$ are real analytic for every $h \neq 0$. If analyticity was preserved by smooth limits, this would imply that $f'$ is real analytic, which would make $f$ real analytic. However, we are not assuming any uniformity in the analyticity of the Newton quotients, so this simple argument does not seem to resolve the question immediately.

In the case that $f$ is periodic, say periodic with period $1$, one can answer the question in the negative by Fourier series. Perform a Fourier expansion $f(x) = \sum_{n \in \mathbf{Z}} c_n e^{2\pi i nx}$. If $f$ is not real analytic, then there is a sequence $n_j$ going to infinity such that $|c_{n_j}| = e^{-o(n_j)}$ as $j \to \infty$. From the Borel-Cantelli lemma one can then find a real number $h$ such that $|e^{2\pi i h n_j} - 1| \gg \frac{1}{n_j^2}$ (say) for infinitely many $j$, hence $|(e^{2\pi i h n_j} - 1) c_{n_j}| \gg n_j^{-2} e^{-o(n_j)}$ for infinitely many $j$. Thus the Fourier coefficients of $x \mapsto f(x+h) - f(x)$ do not decay exponentially and hence this function is not analytic, a contradiction.

I was not able to quickly resolve the non-periodic case, but I thought perhaps this might be a good problem to crowdsource, so I invite readers to contribute their thoughts on this problem here. In the spirit of the polymath projects, I would encourage comments that contain thoughts that fall short of a complete solution, in the event that some other reader may be able to take the thought further.

### Boosting the van der Corput inequality using the tensor power trick

In this previous blog post I noted the following easy application of Cauchy-Schwarz:

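**Lemma 1 (Van der Corput inequality)** Let $v, u_1, \dots, u_n$ be unit vectors in a Hilbert space $H$. Then

$$ \left(\sum_{i=1}^n |\langle v, u_i \rangle_H|\right)^2 \leq \sum_{1 \leq i,j \leq n} |\langle u_i, u_j \rangle_H|.$$

*Proof:* The left-hand side may be written as $|\langle v, \sum_{i=1}^n \epsilon_i u_i \rangle_H|^2$ for some unit complex numbers $\epsilon_i$. By Cauchy-Schwarz we have

$$ |\langle v, \sum_{i=1}^n \epsilon_i u_i \rangle_H|^2 \leq \langle \sum_{i=1}^n \epsilon_i u_i, \sum_{j=1}^n \epsilon_j u_j \rangle_H$$

and the claim now follows from the triangle inequality.

As a corollary, correlation becomes transitive in a statistical sense (even though it is not transitive in an absolute sense):

**Corollary 2 (Statistical transitivity of correlation)** Let $v, u_1, \dots, u_n$ be unit vectors in a Hilbert space $H$ such that $|\langle v, u_i \rangle_H| \geq \delta$ for all $i=1,\dots,n$ and some $0 < \delta \leq 1$. Then we have $|\langle u_i, u_j \rangle_H| \geq \delta^2/2$ for at least $\delta^2 n^2/2$ of the pairs $(i,j) \in \{1,\dots,n\}^2$.

*Proof:* From the lemma, we have

$$ \sum_{1 \leq i,j \leq n} |\langle u_i, u_j \rangle_H| \geq \delta^2 n^2.$$

The contribution of those $i,j$ with $|\langle u_i, u_j \rangle_H| < \delta^2/2$ is at most $\delta^2 n^2/2$, and all the remaining summands are at most $1$, giving the claim.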
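Both statements are easy to test numerically. The following is a minimal sketch, assuming real vectors in a finite-dimensional space with the usual dot product, and taking $\delta$ to be the smallest correlation $|\langle v, u_i \rangle|$ in the sample; the function names are illustrative. It compares the two sides of the van der Corput inequality and counts the pairs promised by the statistical transitivity corollary.

```python
import math
import random

def dot(u, v):
    """Standard real inner product."""
    return sum(a * b for a, b in zip(u, v))

def random_unit_vector(dim, rng):
    """A random unit vector, obtained by normalising a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def van_der_corput_sides(v, us):
    """Return (left-hand side, right-hand side) of the van der Corput inequality."""
    lhs = sum(abs(dot(v, u)) for u in us) ** 2
    rhs = sum(abs(dot(ui, uj)) for ui in us for uj in us)
    return lhs, rhs

def correlated_pairs(v, us):
    """Return (delta, number of pairs with |<u_i,u_j>| >= delta^2/2, guaranteed lower bound)."""
    delta = min(abs(dot(v, u)) for u in us)
    threshold = delta ** 2 / 2
    count = sum(1 for ui in us for uj in us if abs(dot(ui, uj)) >= threshold)
    return delta, count, delta ** 2 * len(us) ** 2 / 2

rng = random.Random(0)
v = random_unit_vector(6, rng)
us = [random_unit_vector(6, rng) for _ in range(20)]
lhs, rhs = van_der_corput_sides(v, us)
delta, count, bound = correlated_pairs(v, us)
print(lhs <= rhs, count >= bound)
```

For random vectors the correlations with $v$ are small, so the guaranteed number of pairs is modest; the point of the check is only that the inequality and the count are never violated.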

One drawback with this corollary is that it does not tell us *which* pairs $u_i, u_j$ correlate. In particular, if the vector $v$ also correlates with a separate collection $w_1,\dots,w_n$ of unit vectors, the pairs $(i,j)$ for which $u_i, u_j$ correlate may have no intersection whatsoever with the pairs in which $w_i, w_j$ correlate (except of course on the diagonal $i=j$ where they must correlate).

While working on an ongoing research project, I recently found that there is a very simple way to get around the latter problem by exploiting the tensor power trick:

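**Corollary 3 (Simultaneous statistical transitivity of correlation)** Let $v^a, u^a_i$ be unit vectors in a Hilbert space $H$ for $a=1,\dots,m$ and $i=1,\dots,n$ such that $|\langle v^a, u^a_i \rangle_H| \geq \delta_a$ for all $a=1,\dots,m$, $i=1,\dots,n$ and some $0 < \delta_a \leq 1$. Then there are at least $(\delta_1 \dots \delta_m)^2 n^2/2$ pairs $(i,j) \in \{1,\dots,n\}^2$ such that $\prod_{a=1}^m |\langle u^a_i, u^a_j \rangle_H| \geq (\delta_1 \dots \delta_m)^2/2$. In particular (by Cauchy-Schwarz) we have $|\langle u^a_i, u^a_j \rangle_H| \geq (\delta_1 \dots \delta_m)^2/2$ for all $a$.

*Proof:* Apply Corollary 2 to the unit vectors $v^1 \otimes \dots \otimes v^m$ and $u^1_i \otimes \dots \otimes u^m_i$, $i=1,\dots,n$ in the tensor power Hilbert space $H^{\otimes m}$.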
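The reason the tensor power trick works is that inner products multiply under tensor products: $\langle a \otimes b, c \otimes d \rangle = \langle a, c \rangle \langle b, d \rangle$. A minimal sketch of this identity, assuming real finite-dimensional vectors and representing a tensor product by its flattened list of coordinate products (the helper names are illustrative):

```python
from functools import reduce

def dot(u, v):
    """Standard real inner product."""
    return sum(x * y for x, y in zip(u, v))

def tensor(u, w):
    """Flattened tensor (Kronecker) product of two vectors."""
    return [x * y for x in u for y in w]

def tensor_all(vectors):
    """Tensor product of a list of vectors, taken left to right."""
    return reduce(tensor, vectors)

a, b = [1, 2], [3, 4, 5]
c, d = [0, 1], [1, -1, 2]
left = dot(tensor(a, b), tensor(c, d))   # inner product computed in the tensor product space
right = dot(a, c) * dot(b, d)            # product of the individual inner products
print(left, right)
```

In particular a single correlation between the tensored vectors encodes the product of the $m$ individual correlations, which is why one application of Corollary 2 in the tensor power space controls all $m$ families simultaneously.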

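It is surprisingly difficult to obtain even a qualitative version of the above conclusion (namely, if $v^a$ correlates with all of the $u^a_i$, then there are many pairs $(i,j)$ for which $u^a_i$ correlates with $u^a_j$ for all $a$ simultaneously) without some version of the tensor power trick. For instance, even the powerful Szemerédi regularity lemma, when applied to the set of pairs $(i,j)$ for which one has correlation of $u^a_i$, $u^a_j$ for a single $a$, does not seem to be sufficient. However, there is a reformulation of the argument using the Schur product theorem as a substitute for (or really, a disguised version of) the tensor power trick. For simplicity of notation let us just work with real Hilbert spaces to illustrate the argument. We start with the identity

$$ \langle u^a_i, u^a_j \rangle_H = \langle v^a, u^a_i \rangle_H \langle v^a, u^a_j \rangle_H + \langle \pi_a(u^a_i), \pi_a(u^a_j) \rangle_H$$

where $\pi_a$ is the orthogonal projection to the complement of $v^a$. This implies a Gram matrix inequality

$$ (\langle u^a_i, u^a_j \rangle_H)_{1 \leq i,j \leq n} \succ (\langle v^a, u^a_i \rangle_H \langle v^a, u^a_j \rangle_H)_{1 \leq i,j \leq n} \succ 0$$

for each $a$, where $A \succ B$ denotes the claim that $A-B$ is positive semi-definite. By the Schur product theorem, we conclude that

$$ \left(\prod_{a=1}^m \langle u^a_i, u^a_j \rangle_H\right)_{1 \leq i,j \leq n} \succ \left(\prod_{a=1}^m \langle v^a, u^a_i \rangle_H \langle v^a, u^a_j \rangle_H\right)_{1 \leq i,j \leq n}$$

and hence for a suitable choice of signs $\epsilon_1,\dots,\epsilon_n$,

$$ \sum_{1 \leq i,j \leq n} \epsilon_i \epsilon_j \prod_{a=1}^m \langle u^a_i, u^a_j \rangle_H \geq \delta_1^2 \dots \delta_m^2 n^2.$$

One now argues as in the proof of Corollary 2.

A separate application of tensor powers to amplify correlations was also noted in this previous blog post giving a cheap version of the Kabatjanskii-Levenstein bound, but this seems to not be directly related to this current application.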

### The inclusion-exclusion principle for commuting projections

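The (classical) Möbius function $\mu: \mathbf{N} \to \mathbf{Z}$ is the unique function that obeys the classical Möbius inversion formula:

**Proposition 1 (Classical Möbius inversion)** Let $f,g: \mathbf{N} \to A$ be functions from the natural numbers to an additive group $A$. Then the following two claims are equivalent:

- (i) $f(n) = \sum_{d|n} g(d)$ for all $n \in \mathbf{N}$.
- (ii) $g(n) = \sum_{d|n} \mu(n/d) f(d)$ for all $n \in \mathbf{N}$.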

There is a generalisation of this formula to (finite) posets, due to Hall, in which one sums over chains in the poset:

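**Proposition 2 (Poset Möbius inversion)** Let $\mathcal{N}$ be a finite poset, and let $f,g: \mathcal{N} \to A$ be functions from that poset to an additive group $A$. Then the following two claims are equivalent:

- (i) $f(n) = \sum_{d \leq n} g(d)$ for all $n \in \mathcal{N}$, where $d$ is understood to range in $\mathcal{N}$.
- (ii) $g(n) = \sum_{k=0}^\infty (-1)^k \sum_{n = n_0 > n_1 > \dots > n_k} f(n_k)$ for all $n \in \mathcal{N}$, where in the inner sum $n_0,\dots,n_k$ are understood to range in $\mathcal{N}$ with the indicated ordering.

Comparing Proposition 2 with Proposition 1, it is natural to refer to the function $\mu(d,n) := \sum_{k=0}^\infty (-1)^k \sum_{n = n_0 > n_1 > \dots > n_k = d} 1$ as the Möbius function of the poset; the condition (ii) can then be written as

$$ g(n) = \sum_{d \leq n} \mu(d,n) f(d).$$

*Proof:* If (i) holds, then we have

$$ g(n) = f(n) - \sum_{d < n} g(d) \ \ \ \ \ (1)$$

for any $n \in \mathcal{N}$. Iterating this we obtain (ii). Conversely, from (ii) and separating out the $k=0$ term, and grouping all the other terms based on the value of $d := n_1$, we obtain (1), and hence (i).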
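The chain description of the Möbius function translates directly into a short recursion: peeling off the first step $n_1$ of a chain from $n$ down to $d$ gives $\mu(d,n) = -\sum_{d \leq m < n} \mu(d,m)$ for $d < n$, with $\mu(n,n) = 1$. Here is a minimal sketch, assuming the poset is given as a list of hashable elements together with a strict order predicate (the names `poset_mobius` and `strictly_divides` are illustrative); it compares the result on a divisor poset with the classical Möbius function and then checks the inversion formula on sample data.

```python
from functools import lru_cache

def poset_mobius(elements, less_than):
    """Moebius function mu(d, n) of a finite poset, computed as a signed count of chains."""
    elements = tuple(elements)

    @lru_cache(maxsize=None)
    def mu(d, n):
        if d == n:
            return 1      # the single chain of length k = 0
        if not less_than(d, n):
            return 0      # there are no chains from n down to d
        # Split off the first step n_1 = m < n; terms with m not above d vanish.
        return -sum(mu(d, m) for m in elements if less_than(m, n))

    return mu

def classical_mobius(n):
    """Classical Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n had a square factor
            result = -result
        p += 1
    return -result if n > 1 else result

def strictly_divides(d, n):
    return d != n and n % d == 0

divisors = [d for d in range(1, 61) if 60 % d == 0]
mu = poset_mobius(divisors, strictly_divides)

# Sample data: g is arbitrary, f is its sum over divisors, and g is recovered from f.
g = {d: d * d + 1 for d in divisors}
f = {n: sum(g[d] for d in divisors if n % d == 0) for n in divisors}
recovered = {n: sum(mu(d, n) * f[d] for d in divisors if n % d == 0) for n in divisors}
print(recovered == g)
```

The recursion is the same step as equation (1) in the proof, applied to the counting function rather than to $g$ itself.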

In fact it is not completely necessary that the poset $\mathcal{N}$ be finite; an inspection of the proof shows that it suffices that every element $n$ of the poset has only finitely many predecessors $\{ d \in \mathcal{N}: d < n \}$.

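It is not difficult to see that Proposition 2 includes Proposition 1 as a special case, after verifying the combinatorial fact that the quantity

$$ \sum_{k=0}^\infty (-1)^k \sum_{d = n_k | n_{k-1} | \dots | n_1 | n_0 = n} 1,$$

in which the inner sum ranges over chains of distinct divisors, is equal to $\mu(n/d)$ when $d$ divides $n$, and vanishes otherwise.

I recently discovered that Proposition 2 can also lead to a useful variant of the inclusion-exclusion principle. The classical version of this principle can be phrased in terms of indicator functions: if $A_1,\dots,A_\ell$ are subsets of some set $X$, then

$$ \prod_{j=1}^\ell (1 - 1_{A_j}) = \sum_{J \subset \{1,\dots,\ell\}} (-1)^{|J|} 1_{\bigcap_{j \in J} A_j}.$$

In particular, if there is a finite measure $\nu$ on $X$ for which $A_1,\dots,A_\ell$ are all measurable, we have

$$ \nu\left(X \backslash \bigcup_{j=1}^\ell A_j\right) = \sum_{J \subset \{1,\dots,\ell\}} (-1)^{|J|} \nu\left(\bigcap_{j \in J} A_j\right).$$

One drawback of this formula is that there are exponentially many terms on the right-hand side: $2^\ell$ of them, in fact. However, in many cases of interest there are “collisions” between the intersections $\bigcap_{j \in J} A_j$ (for instance, perhaps many of the pairwise intersections $A_i \cap A_j$ agree), in which case there is an opportunity to collect terms and hopefully achieve some cancellation. It turns out that it is possible to use Proposition 2 to do this, in which one only needs to sum over chains in the resulting poset of intersections: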

**Proposition 3 (Hall-type inclusion-exclusion principle)** Let be subsets of some set , and let be the finite poset formed by intersections of some of the (with the convention that is the empty intersection), ordered by set inclusion. Then for any , one has

Using the Möbius function on the poset , one can write these formulae as

and
*Proof:* It suffices to establish (2) (to derive (3) from (2) observe that all the are contained in one of the , so the effect of may be absorbed into ). Applying Proposition 2, this is equivalent to the assertion that

**Example 4** If with , and are all distinct, then we have for any finite measure on that makes measurable that

**Example 5 (Variant of Legendre sieve)** If are natural numbers, and is some sequence of complex numbers with only finitely many terms non-zero, then by applying the above proposition to the sets and with equal to counting measure weighted by the we obtain a variant of the Legendre sieve

If the poset has bounded depth then the number of terms in Proposition 3 can end up being just polynomially large in rather than exponentially large. Indeed, if all chains in have length at most then the number of terms here is at most . (The examples (4), (5) are ones in which the depth is equal to two.) I hope to report in a later post on how this version of inclusion-exclusion with polynomially many terms can be useful in an application.
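To see the saving from collisions concretely, one can compare the classical sum over all index sets with a sum over chains in the poset of intersections. The sketch below is an illustration under stated assumptions rather than a transcription of the proposition: it takes the sets to be proper subsets of $X$, uses counting measure, and uses the chain formula in the form "number of uncovered points $= \sum_k (-1)^k \sum |E_k|$ over chains $X = E_0 \supsetneq E_1 \supsetneq \dots \supsetneq E_k$ of intersections", which the test verifies by brute force on the example. The function names are hypothetical.

```python
from itertools import combinations

def intersection_poset(X, sets):
    """All distinct intersections of some of the sets, with X as the empty intersection."""
    X = frozenset(X)
    poset = {X}
    for r in range(1, len(sets) + 1):
        for combo in combinations(sets, r):
            poset.add(X.intersection(*combo))
    return poset

def chain_sum(E, poset, weight):
    """Sum of (-1)^k * weight(E_k) over all chains E = E_0 > E_1 > ... > E_k in the poset."""
    total = weight(E)
    for F in poset:
        if F < E:  # F is a proper subset of E
            total -= chain_sum(F, poset, weight)
    return total

def classical_sum(X, sets, weight):
    """Classical inclusion-exclusion: sum over all 2^l index sets J."""
    X = frozenset(X)
    total = 0
    for r in range(len(sets) + 1):
        for combo in combinations(sets, r):
            total += (-1) ** r * weight(X.intersection(*combo))
    return total

X = set(range(10))
A, B, C = {0, 1, 2}, {0, 1, 3, 4}, {0, 1, 5}   # all pairwise intersections equal {0, 1}
poset = intersection_poset(X, [A, B, C])
uncovered = len(X - (A | B | C))
print(uncovered, classical_sum(X, [A, B, C], len), chain_sum(frozenset(X), poset, len))
```

In this example the poset has only five elements ($X$, the three sets, and their common intersection), and the chain sum collapses to $|X| - |A| - |B| - |C| + 2|A \cap B|$.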

Actually in our application we need an abstraction of the above formula, in which the indicator functions are replaced by more abstract idempotents:

**Proposition 6 (Hall-type inclusion-exclusion principle for idempotents)** Let be pairwise commuting elements of some ring with identity, which are all idempotent (thus for ). Let be the finite poset formed by products of the (with the convention that is the empty product), ordered by declaring when (note that all the elements of are idempotent so this is a partial ordering). Then for any , one has

Morally speaking this proposition is equivalent to the previous one after applying a “spectral theorem” to simultaneously diagonalise all of the , but it is quicker to just adapt the previous proof to establish this proposition directly. Using the Möbius function for , we can rewrite these formulae as

and
*Proof:* Again it suffices to verify (6). Using Proposition 2 as before, it suffices to show that

### Job advertisement: Director of UCLA Endowed Olga Radko Math Circle (ORMC)

*[I am posting this advertisement in my capacity as chair of the Steering Committee for the UCLA Endowed Olga Radko Math Circle – T.]*

The Department of Mathematics at the University of California, Los Angeles, is inviting applications for the position of an Academic Administrator who will serve as the Director of the UCLA Endowed Olga Radko Math Circle (ORMC). The Academic Administrator will have the broad responsibility for administration of the ORMC, an outreach program with weekly activities for mathematically inclined students in grades K-12. Currently, over 300 children take part in the program each weekend. Instruction is delivered by a team of over 50 docents, the majority of whom are UCLA undergraduate and graduate students.

The Academic Administrator is required to teach three mathematics courses in the undergraduate curriculum per academic year as assigned by the Department. This is also intended to help with the recruitment of UCLA students as docents and instructors for the ORMC.

As the director of ORMC, the Academic Administrator will have primary responsibility for all aspects of ORMC operations:

- Determining the structure of ORMC, including the number and levels of groups
- Recruiting, training and supervising instructors, docents, and postdoctoral fellows associated with the ORMC
- Developing curricular materials and providing leadership in development of innovative ways of explaining mathematical ideas to school children
- Working with the Mathematics Department finance office to ensure timely payment of stipends and wages to ORMC instructors and docents, as appropriate
- Maintaining ORMC budget and budgetary projections, ensuring that the funds are used appropriately and efficiently for ORMC activities, and applying for grants as appropriate to fund the operations of ORMC
- Working with the Steering Committee and UCLA Development to raise funds for ORMC, both from families whose children participate in ORMC and other sources
- Admitting students to ORMC, ensuring appropriate placement, and working to maintain a collegial and inclusive atmosphere conducive to learning for all ORMC attendees
- Reporting to and working with the ORMC Steering Committee throughout the year

A competitive candidate should have leadership potential and experience with developing mathematical teaching materials for the use of gifted school children, as well as experience with teaching undergraduate mathematics courses. Candidates must have a Ph.D. degree (or equivalent) or expect to complete their Ph.D. by June 30, 2021.

Applications should be received by March 15, 2021. Further details on the position and the application process can be found at the application page.

### 246B, Notes 4: The Riemann zeta function and the prime number theorem

Previous set of notes: Notes 3. Next set of notes: 246C Notes 1.

One of the great classical triumphs of complex analysis was in providing the first complete proof (by Hadamard and de la Vallée Poussin in 1896) of arguably the most important theorem in analytic number theory, the prime number theorem:

**Theorem 1 (Prime number theorem)** Let $\pi(x)$ denote the number of primes less than a given real number $x$. Then

$$ \lim_{x \to \infty} \frac{\pi(x)}{x / \ln x} = 1.$$

(Actually, it turns out to be slightly more natural to replace the approximation $\frac{x}{\ln x}$ in the prime number theorem by the logarithmic integral $\int_2^x \frac{dt}{\ln t}$, which turns out to be a more precise approximation, but we will not stress this point here.)

The complex-analytic proof of this theorem hinges on the study of a key meromorphic function related to the prime numbers, the Riemann zeta function $\zeta$. Initially, it is only defined on the half-plane $\{ s \in \mathbf{C}: \mathrm{Re}\, s > 1 \}$:

**Definition 2 (Riemann zeta function, preliminary definition)** Let $s \in \mathbf{C}$ be such that $\mathrm{Re}\, s > 1$. Then we define

$$ \zeta(s) := \sum_{n=1}^\infty \frac{1}{n^s}. \ \ \ \ \ (1)$$

Note that the series is locally uniformly convergent in the half-plane $\{ s \in \mathbf{C}: \mathrm{Re}\, s > 1 \}$, so in particular $\zeta$ is holomorphic on this region. In previous notes we have already evaluated some special values of this function:

$$ \zeta(2) = \frac{\pi^2}{6}; \quad \zeta(4) = \frac{\pi^4}{90}; \quad \zeta(6) = \frac{\pi^6}{945}. \ \ \ \ \ (2)$$

However, it turns out that the zeroes (and pole) of this function are of far greater importance to analytic number theory, particularly with regards to the study of the prime numbers.

The Riemann zeta function has several remarkable properties, some of which we summarise here:

**Theorem 3 (Basic properties of the Riemann zeta function)**

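- (i) (Euler product formula) For any $s \in \mathbf{C}$ with $\mathrm{Re}\, s > 1$, we have $$ \zeta(s) = \prod_p \frac{1}{1 - \frac{1}{p^s}} \ \ \ \ \ (3)$$ where the product is absolutely convergent (and locally uniform in $s$) and is over the prime numbers $p = 2, 3, 5, \dots$.
- (ii) (Trivial zero-free region) $\zeta$ has no zeroes in the region $\{ s: \mathrm{Re}\, s > 1 \}$.
- (iii) (Meromorphic continuation) $\zeta$ has a unique meromorphic continuation to the complex plane (which by abuse of notation we also call $\zeta$), with a simple pole at $s=1$ and no other poles. Furthermore, the Riemann xi function $$ \xi(s) := \frac{1}{2} s(s-1) \pi^{-s/2} \Gamma(s/2) \zeta(s) \ \ \ \ \ (4)$$ is an entire function of order $1$ (after removing all singularities). The function $(s-1)\zeta(s)$ is an entire function of order one after removing the singularity at $s=1$.
- (iv) (Functional equation) After applying the meromorphic continuation from (iii), we have $$ \zeta(s) = 2^s \pi^{s-1} \sin\left(\frac{\pi s}{2}\right) \Gamma(1-s) \zeta(1-s) \ \ \ \ \ (5)$$ for all $s \in \mathbf{C}$ (excluding poles). Equivalently, we have $$ \xi(s) = \xi(1-s) \ \ \ \ \ (6)$$ for all $s \in \mathbf{C}$. (The equivalence between (5) and (6) is a routine consequence of the Euler reflection formula and the Legendre duplication formula, see Exercises 26 and 31 of Notes 1.)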

*Proof:* We just prove (i) and (ii) for now, leaving (iii) and (iv) for later sections.

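The claim (i) is an encoding of the fundamental theorem of arithmetic, which asserts that every natural number $n$ is uniquely representable as a product $n = \prod_p p^{a_p}$ over primes, where the $a_p$ are natural numbers, all but finitely many of which are zero. Writing this representation as $\frac{1}{n^s} = \prod_p \frac{1}{p^{a_p s}}$, we see that

$$ \sum_{n \in S_{x,m}} \frac{1}{n^s} = \prod_{p \leq x} \sum_{a=0}^m \frac{1}{p^{as}}$$

whenever $x \geq 1$, $m \geq 0$, and $S_{x,m}$ consists of all the natural numbers of the form $n = \prod_{p \leq x} p^{a_p}$ for some $a_p \leq m$. Sending $m$ and $x$ to infinity, we conclude from monotone convergence and the geometric series formula that

$$ \sum_{n=1}^\infty \frac{1}{n^s} = \prod_p \sum_{a=0}^\infty \frac{1}{p^{as}} = \prod_p \frac{1}{1 - \frac{1}{p^s}}$$

whenever $s > 1$ is real, and then from dominated convergence we see that the same formula holds for complex $s$ with $\mathrm{Re}\, s > 1$ as well. Local uniform convergence then follows from the product form of the Weierstrass $M$-test (Exercise 19 of Notes 1).

The claim (ii) is immediate from (i) since the Euler product is absolutely convergent and all terms are non-zero.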
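The truncated identity used in this argument is a finite statement and can be checked exactly, and the limiting Euler product can be compared numerically with a known special value. The following is a minimal sketch, assuming a positive integer exponent $s$ so that exact rational arithmetic applies; the function names are illustrative, and the truncation is over primes $p \leq x$ with exponents at most $m$.

```python
import math
from fractions import Fraction
from itertools import product

def primes_up_to(x):
    """Primes p <= x by trial division (adequate for small x)."""
    return [p for p in range(2, x + 1) if all(p % q for q in range(2, int(p ** 0.5) + 1))]

def truncated_sum(x, m, s):
    """Sum of 1/n^s over all n whose prime factors are <= x with exponents <= m."""
    ps = primes_up_to(x)
    total = Fraction(0)
    for exps in product(range(m + 1), repeat=len(ps)):
        n = math.prod(p ** a for p, a in zip(ps, exps))
        total += Fraction(1, n ** s)
    return total

def truncated_product(x, m, s):
    """Product over primes p <= x of the finite geometric sums 1 + p^-s + ... + p^-ms."""
    result = Fraction(1)
    for p in primes_up_to(x):
        result *= sum(Fraction(1, p ** (a * s)) for a in range(m + 1))
    return result

def euler_product(x, s):
    """Partial Euler product over primes p <= x, in floating point."""
    return math.prod(1.0 / (1.0 - p ** (-s)) for p in primes_up_to(x))

exact_match = truncated_sum(7, 3, 2) == truncated_product(7, 3, 2)
approx = euler_product(2000, 2)
print(exact_match, approx, math.pi ** 2 / 6)
```

The partial product over primes up to 2000 already agrees with $\pi^2/6$ to about three decimal places, and it approaches that value from below because it is a sum of $1/n^2$ over only those $n$ with no prime factor exceeding 2000.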

We remark that by sending $s$ to $1$ in Theorem 3(i) we conclude that

$$ \sum_{n=1}^\infty \frac{1}{n} = \prod_p \frac{1}{1 - \frac{1}{p}}$$

and from the divergence of the harmonic series we then conclude Euler’s theorem $\sum_p \frac{1}{p} = \infty$. This can be viewed as a weak version of the prime number theorem, and already illustrates the potential applicability of the Riemann zeta function to control the distribution of the prime numbers.

The meromorphic continuation (iii) of the zeta function is initially surprising, but can be interpreted either as a manifestation of the extremely regular spacing of the natural numbers $n$ occurring in the sum (1), or as a consequence of various integral representations of $\zeta$ (or slight modifications thereof). We will focus in this set of notes on a particular representation of $\zeta$ as essentially the Mellin transform of the theta function $\theta$ that briefly appeared in previous notes, and the functional equation (iv) can then be viewed as a consequence of the modularity of that theta function. This in turn was established using the Poisson summation formula, so one can view the functional equation as ultimately being a manifestation of Poisson summation. (For a direct proof of the functional equation via Poisson summation, see these notes.)

Henceforth we work with the meromorphic continuation of $\zeta$. The functional equation (iv), when combined with special values of $\zeta$ such as (2), gives some additional values of $\zeta$ outside of its initial domain $\{s: \mathrm{Re}(s) > 1\}$, most famously

$$\zeta(-1) = -\frac{1}{12}.$$

If one *formally* compares this formula with (1), one arrives at the infamous identity

$$1 + 2 + 3 + \dots = -\frac{1}{12}$$

although this identity has to be interpreted in a suitable non-classical sense in order for it to be rigorous (see this previous blog post for further discussion).

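From Theorem 3 and the non-vanishing nature of $\Gamma$, we see that $\zeta$ has simple zeroes (known as *trivial zeroes*) at the negative even integers $-2, -4, \dots$, and all other zeroes (the *non-trivial zeroes*) inside the *critical strip* $\{s \in \mathbf{C}: 0 \leq \mathrm{Re}(s) \leq 1\}$. (The non-trivial zeroes are conjectured to all be simple, but this is hopelessly far from being proven at present.) As we shall see shortly, these latter zeroes turn out to be closely related to the distribution of the primes. The functional equation tells us that if $\rho$ is a non-trivial zero then so is $1-\rho$; also, we have the identity

$$\zeta(\overline{s}) = \overline{\zeta(s)},$$

so the non-trivial zeroes are also symmetric under complex conjugation. In particular, the set of non-trivial zeroes is symmetric with respect to reflection across both the real axis and the line $\{s \in \mathbf{C}: \mathrm{Re}(s) = \frac{1}{2}\}$, known as the *critical line*. We have the following infamous conjecture: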

**Conjecture 4 (Riemann hypothesis)** All the non-trivial zeroes of $\zeta$ lie on the critical line $\{s: \mathrm{Re}(s) = \frac{1}{2}\}$.

This conjecture would have many implications in analytic number theory, particularly with regard to the distribution of the primes. Of course, it is far from proven at present, but the partial results we have towards this conjecture are still sufficient to establish results such as the prime number theorem.

Return now to the original region where $\mathrm{Re}(s) > 1$. To take more advantage of the Euler product formula (3), we take complex logarithms to conclude that

$$-\log \zeta(s) = \sum_p \log\left(1 - \frac{1}{p^s}\right)$$

for suitable branches of the complex logarithm, and then on taking derivatives (using for instance the generalised Cauchy integral formula and Fubini’s theorem to justify the interchange of summation and derivative) we see that

$$-\frac{\zeta'(s)}{\zeta(s)} = \sum_p \frac{\log p / p^s}{1 - \frac{1}{p^s}}.$$

From the geometric series formula we have

$$\frac{\log p / p^s}{1 - \frac{1}{p^s}} = \sum_{j=1}^\infty \frac{\log p}{p^{js}}$$

and so (by another application of Fubini’s theorem) we have the identity

$$-\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^\infty \frac{\Lambda(n)}{n^s} \qquad (8)$$

for $\mathrm{Re}(s) > 1$, where the von Mangoldt function $\Lambda(n)$ is defined to equal $\log p$ whenever $n = p^j$ is a power of a prime $p$ for some $j = 1, 2, \dots$, and $0$ otherwise. The contribution of the higher prime powers $p^2, p^3, \dots$ is negligible in practice, and as a first approximation one can think of the von Mangoldt function as the indicator function of the primes, weighted by the logarithm function.

The series $\sum_{n=1}^\infty \frac{1}{n^s}$ and $\sum_{n=1}^\infty \frac{\Lambda(n)}{n^s}$ that show up in the above formulae are examples of Dirichlet series, which are a convenient device to transform various sequences of arithmetic interest into holomorphic or meromorphic functions. Here are some more examples:

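**Exercise 5 (Standard Dirichlet series)** Let $s$ be a complex number with $\mathrm{Re}(s) > 1$.

- (i) Show that $-\zeta'(s) = \sum_{n=1}^\infty \frac{\log n}{n^s}$.
- (ii) Show that $\zeta^2(s) = \sum_{n=1}^\infty \frac{\tau(n)}{n^s}$, where $\tau(n) := \sum_{d|n} 1$ is the divisor function of $n$ (the number of divisors of $n$).
- (iii) Show that $\frac{1}{\zeta(s)} = \sum_{n=1}^\infty \frac{\mu(n)}{n^s}$, where $\mu(n)$ is the Möbius function, defined to equal $(-1)^k$ when $n$ is the product of $k$ distinct primes for some $k \geq 0$, and $0$ otherwise.
- (iv) Show that $\frac{\zeta(2s)}{\zeta(s)} = \sum_{n=1}^\infty \frac{\lambda(n)}{n^s}$, where $\lambda(n)$ is the Liouville function, defined to equal $(-1)^k$ when $n$ is the product of $k$ (not necessarily distinct) primes for some $k \geq 0$.
- (v) Show that $\log \zeta(s) = \sum_{n=1}^\infty \frac{\Lambda(n)}{\log n} \frac{1}{n^s}$, where $\log \zeta$ is the holomorphic branch of the logarithm that is real for $s > 1$, and with the convention that $\frac{\Lambda(n)}{\log n}$ vanishes for $n = 1$.
- (vi) Use the fundamental theorem of arithmetic to show that the von Mangoldt function is the unique function $\Lambda: \mathbf{N} \to \mathbf{R}$ such that $\log n = \sum_{d|n} \Lambda(d)$ for every positive integer $n$. Use this and (i) to provide an alternate proof of the identity (8). Thus we see that (8) is really just another encoding of the fundamental theorem of arithmetic.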
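The divisor-sum characterisation of the von Mangoldt function in part (vi), namely $\log n = \sum_{d|n} \Lambda(d)$, can be checked by brute force for small $n$. The sketch below is only an illustration; the function names `von_mangoldt` and `divisor_sum_of_lambda` are hypothetical, and trial division is used purely for simplicity.

```python
import math


def von_mangoldt(n):
    """Return log p if n = p^j for a prime p and some j >= 1, and 0.0 otherwise."""
    if n < 2:
        return 0.0
    # Search for the smallest prime factor p of n by trial division.
    p = 2
    while p * p <= n:
        if n % p == 0:
            break
        p += 1
    else:
        # No factor up to sqrt(n): n itself is prime.
        return math.log(n)
    # p is the smallest prime factor; n is a prime power iff nothing else divides it.
    m = n
    while m % p == 0:
        m //= p
    return math.log(p) if m == 1 else 0.0


def divisor_sum_of_lambda(n):
    """Sum of Lambda(d) over all positive divisors d of n."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)


for n in (1, 2, 12, 30, 64, 97):
    print(n, divisor_sum_of_lambda(n), math.log(n))
```

The two printed columns agree up to floating-point rounding, which is the numerical shadow of unique factorisation: each prime power $p^a$ exactly dividing $n$ contributes $a \log p$ to the sum.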

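Given the appearance of the von Mangoldt function $\Lambda$, it is natural to reformulate the prime number theorem in terms of this function:

**Theorem 6 (Prime number theorem, von Mangoldt form)** One has

$$\sum_{n \leq x} \Lambda(n) = x + o(x)$$

as $x \to \infty$.

Let us see how Theorem 6 implies Theorem 1. Firstly, for any $x \geq 2$, we can write

$$\sum_{n \leq x} \Lambda(n) = \sum_{p \leq x} \log p + \sum_{j=2}^\infty \sum_{p \leq x^{1/j}} \log p.$$

The sum $\sum_{p \leq x^{1/j}} \log p$ is non-zero for only $O(\log x)$ values of $j$, and is of size $O(x^{1/2} \log x)$, thus

$$\sum_{n \leq x} \Lambda(n) = \sum_{p \leq x} \log p + O(x^{1/2} \log^2 x).$$

Since $x^{1/2} \log^2 x = o(x)$, we conclude from Theorem 6 that

$$\sum_{p \leq x} \log p = x + o(x)$$

as $x \to \infty$. Next, observe from the fundamental theorem of calculus that

$$\frac{1}{\log p} - \frac{1}{\log x} = \int_p^x \frac{1}{\log^2 y} \frac{dy}{y}.$$

Multiplying by $\log p$ and summing over all primes $p \leq x$, we conclude that

$$\pi(x) - \frac{\sum_{p \leq x} \log p}{\log x} = \int_2^x \left(\sum_{p \leq y} \log p\right) \frac{1}{\log^2 y} \frac{dy}{y}.$$

From Theorem 6 we certainly have $\sum_{p \leq y} \log p = O(y)$, thus

$$\pi(x) - \frac{x + o(x)}{\log x} = O\left(\int_2^x \frac{dy}{\log^2 y}\right).$$

By splitting the integral into the ranges $2 \leq y \leq \sqrt{x}$ and $\sqrt{x} < y \leq x$ we see that the right-hand side is $o(x / \log x)$, and Theorem 1 follows.
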
**Exercise 7** Show that Theorem 1 conversely implies Theorem 6.
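To get a feel for the von Mangoldt form of the prime number theorem, one can tabulate the partial sums $\sum_{n \leq x} \Lambda(n)$ (the sum of $\log p$ over prime powers $p^k \leq x$) against $x$. The following is a small sketch under the assumption that $x$ is a modest integer; the name `chebyshev_psi` is a conventional label for this partial sum, not notation used in the notes.

```python
import math


def chebyshev_psi(x):
    """Sum of Lambda(n) for n <= x, i.e. the sum of log p over prime powers p^k <= x.

    Assumes x is an integer with x >= 1.
    """
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    total = 0.0
    for p in range(2, x + 1):
        if sieve[p]:
            # p is prime: strike out its multiples from p*p onwards.
            for m in range(p * p, x + 1, p):
                sieve[m] = 0
            # Each prime power p, p^2, p^3, ... up to x contributes log p.
            power = p
            while power <= x:
                total += math.log(p)
                power *= p
    return total


for x in (10, 100, 1_000, 10_000, 100_000):
    print(x, chebyshev_psi(x), chebyshev_psi(x) / x)
```

The printed ratio drifts towards $1$ as $x$ grows, which is what Theorem 6 asserts; the numerics say nothing about the rate of convergence, which is governed by the zeroes of $\zeta$.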

The alternate form (8) of the Euler product identity connects the primes (represented here via proxy by the von Mangoldt function) with the logarithmic derivative of the zeta function, and can be used as a starting point for describing further relationships between $\zeta$ and the primes. Most famously, we shall see later in these notes that it leads to the remarkably precise Riemann-von Mangoldt explicit formula:

**Theorem 8 (Riemann-von Mangoldt explicit formula)** For any non-integer $x > 1$, we have

$$\sum_{n \leq x} \Lambda(n) = x - \lim_{T \to \infty} \sum_{\rho: |\mathrm{Im}(\rho)| \leq T} \frac{x^\rho}{\rho} - \log(2\pi) - \frac{1}{2} \log(1 - x^{-2})$$

where $\rho$ ranges over the non-trivial zeroes of $\zeta$ (counting multiplicity).

Actually, it turns out that this formula is in some sense *too* precise; in applications it is often more convenient to work with smoothed variants of this formula in which the sum on the left-hand side is smoothed out, but the contribution of zeroes with large imaginary part is damped; see Exercise 22. Nevertheless, this formula clearly illustrates how the non-trivial zeroes $\rho$ of the zeta function influence the primes. Indeed, if one formally differentiates the above formula in $x$, one is led to the (quite nonrigorous) approximation

$$\Lambda(n) \approx 1 - \sum_\rho n^{\rho - 1}. \qquad (9)$$

Comparing Theorem 8 with Theorem 6, it is natural to suspect that the key step in the proof of the latter is to establish the following slight but important extension of Theorem 3(ii), which can be viewed as a very small step towards the Riemann hypothesis:

**Theorem 9 (Slight enlargement of zero-free region)** There are no zeroes of $\zeta$ on the line $\{1 + it: t \in \mathbf{R}\}$.

It is not quite immediate to see how Theorem 6 follows from Theorem 8 and Theorem 9, but we will demonstrate it below the fold.

Although Theorem 9 only seems like a slight improvement of Theorem 3(ii), proving it is surprisingly non-trivial. The basic idea is the following: if there was a zero at $1 + it$, then there would also be a different zero at $1 - it$ (note $t$ cannot vanish due to the pole at $s = 1$), and then the approximation (9) becomes

$$\Lambda(n) \approx 1 - n^{it} - n^{-it} + \dots = 1 - 2\cos(t \log n) + \dots.$$

But the expression $1 - 2\cos(t \log n)$ can be negative for large regions of the variable $n$, whereas $\Lambda(n)$ is always non-negative. This conflict eventually leads to a contradiction, but it is not immediately obvious how to make this argument rigorous. We will present here the classical approach to doing so using a trigonometric identity of Mertens.

In fact, Theorem 9 is basically equivalent to the prime number theorem:

**Exercise 10** For the purposes of this exercise, assume Theorem 6, but do not assume Theorem 9. For any non-zero real , show that

This equivalence can help explain why the prime number theorem is remarkably non-trivial to prove, and why the Riemann zeta function has to be either explicitly or implicitly involved in the proof.

This post is only intended as the briefest of introduction to complex-analytic methods in analytic number theory; also, we have not chosen the shortest route to the prime number theorem, electing instead to travel in directions that particularly showcase the complex-analytic results introduced in this course. For some further discussion see this previous set of lecture notes, particularly Notes 2 and Supplement 3 (with much of the material in this post drawn from the latter).

** — 1. Meromorphic continuation and functional equation — **

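We now focus on understanding the meromorphic continuation of $\zeta$, as well as the functional equation that that continuation satisfies. The arguments here date back to Riemann’s original paper on the zeta function. The general strategy is to relate the zeta function $\zeta(s)$ for $\mathrm{Re}(s) > 1$ to some sort of integral involving the parameter $s$, which is manipulated in such a way that the integral makes sense for values of $s$ outside of the halfplane $\{s: \mathrm{Re}(s) > 1\}$, and can thus be used to define the zeta function meromorphically in such a region. Often the Gamma function $\Gamma$ is involved in the relationship between the zeta function and integral. There are many such ways to connect $\zeta$ to an integral; we present some of the more classical ones here.

One way to motivate the meromorphic continuation is to look at the continuous analogue

$$\frac{1}{s-1} = \int_1^\infty \frac{1}{x^s}\, dx, \quad \mathrm{Re}(s) > 1$$

of (1). This clearly extends meromorphically to the whole complex plane. So one now just has to understand the analytic continuation properties of the residual

$$\frac{1}{s-1} - \zeta(s) = \int_1^\infty \frac{1}{x^s}\, dx - \sum_{n=1}^\infty \frac{1}{n^s}, \quad \mathrm{Re}(s) > 1.$$

For instance, using the Riemann sum type quadrature

$$\int_n^{n+1} \frac{1}{x^s}\, dx = \frac{1}{n^s} + \int_n^{n+1} \left(\frac{1}{x^s} - \frac{1}{n^s}\right) dx$$

one can write this residual as

$$\sum_{n=1}^\infty \int_n^{n+1} \left(\frac{1}{x^s} - \frac{1}{n^s}\right) dx;$$

since $\frac{1}{x^s} - \frac{1}{n^s} = O_s\left(\frac{1}{n^{\mathrm{Re}(s)+1}}\right)$ for $n \leq x \leq n+1$, it is a routine application of the Fubini and Morera theorems to establish analytic continuation of the residual to the half-plane $\{s: \mathrm{Re}(s) > 0\}$, thus giving a meromorphic extension

$$\zeta(s) = \frac{1}{s-1} - \sum_{n=1}^\infty \int_n^{n+1} \left(\frac{1}{x^s} - \frac{1}{n^s}\right) dx$$

of $\zeta$ to the region $\{s: \mathrm{Re}(s) > 0\}$. Among other things, this shows that (the meromorphic continuation of) $\zeta$ has a simple pole at $s = 1$ with residue $1$.
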
**Exercise 11** Using the trapezoid rule, show that for any in the region with , there exists a unique complex number for which one has the asymptotic

**Exercise 12** Obtain the refinement

One can keep going in this fashion using the Euler-Maclaurin formula (see this previous blog post) to extend the range of meromorphic continuation to the rest of the complex plane. However, we will now proceed in a different fashion, using the theta function

$$\theta(z) := \sum_{n \in \mathbf{Z}} e^{\pi i n^2 z}$$

that made an appearance in previous notes, and try to transform this function into the zeta function. We will only need this function for imaginary values $z = ix$ of the argument in the upper half-plane (so $x > 0$); from Exercise 7 of Notes 2 we have the modularity relation

$$\theta(i/x) = x^{1/2}\, \theta(ix). \qquad (12)$$

In particular, since $\theta(ix)$ decays exponentially to $1$ as $x \to \infty$, $\theta(ix)$ blows up like $x^{-1/2}$ as $x \to 0$.

We will attempt to apply the Mellin transform (Exercise 11 from Notes 2) to this function; formally, this is the integral

$$\int_0^\infty x^s\, \theta(ix)\, \frac{dx}{x}.$$

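There is however a problem: as $x$ goes to infinity, $\theta(ix)$ converges to one, and the integral here is unlikely to be convergent. So we will compute the Mellin transform of $\theta - 1$:

$$\int_0^\infty x^s\, (\theta(ix) - 1)\, \frac{dx}{x}.$$

The function $\theta(ix) - 1$ decays exponentially as $x \to \infty$, and blows up like $x^{-1/2}$ as $x \to 0$, so this integral will be absolutely integrable when $\mathrm{Re}(s) > 1/2$. Since

$$\theta(ix) - 1 = 2 \sum_{n=1}^\infty e^{-\pi n^2 x}$$

we can write

$$\int_0^\infty x^s\, (\theta(ix) - 1)\, \frac{dx}{x} = 2 \int_0^\infty \sum_{n=1}^\infty x^s e^{-\pi n^2 x}\, \frac{dx}{x}.$$

By the Fubini–Tonelli theorem, the integrand here is absolutely integrable, and hence

$$\int_0^\infty x^s\, (\theta(ix) - 1)\, \frac{dx}{x} = 2 \sum_{n=1}^\infty \int_0^\infty x^s e^{-\pi n^2 x}\, \frac{dx}{x}.$$

From the Bernoulli definition of the Gamma function (Exercise 29(ii) of Notes 1) and a change of variables we have

$$\int_0^\infty x^s e^{-\pi n^2 x}\, \frac{dx}{x} = \frac{\Gamma(s)}{\pi^s n^{2s}}$$

and hence by (1) we obtain the identity

$$\int_0^\infty x^s\, (\theta(ix) - 1)\, \frac{dx}{x} = 2 \pi^{-s}\, \Gamma(s)\, \zeta(2s)$$

whenever $\mathrm{Re}(s) > 1/2$. Replacing $s$ by $s/2$, we can rearrange this as a formula for the $\xi$ function (4), namely

$$\xi(s) = \frac{s(s-1)}{4} \int_0^\infty x^{s/2}\, (\theta(ix) - 1)\, \frac{dx}{x} \qquad (13)$$

whenever $\mathrm{Re}(s) > 1$.

Now we exploit the modular identity (12) to improve the convergence of this formula. The convergence of $\theta(ix) - 1$ is much better near $\infty$ than near $0$, so we use (13) to split

$$\xi(s) = \frac{s(s-1)}{4} \left( \int_0^1 x^{s/2}\, (\theta(ix) - 1)\, \frac{dx}{x} + \int_1^\infty x^{s/2}\, (\theta(ix) - 1)\, \frac{dx}{x} \right)$$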

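and then transform the first integral using the change of variables $x \mapsto 1/x$ to obtain

$$\xi(s) = \frac{s(s-1)}{4} \left( \int_1^\infty x^{-s/2}\, (\theta(i/x) - 1)\, \frac{dx}{x} + \int_1^\infty x^{s/2}\, (\theta(ix) - 1)\, \frac{dx}{x} \right).$$

Using (12) we can write this as

$$\xi(s) = \frac{s(s-1)}{4} \left( \int_1^\infty (x^{s/2} + x^{(1-s)/2})\, (\theta(ix) - 1)\, \frac{dx}{x} + \int_1^\infty (x^{(1-s)/2} - x^{-s/2})\, \frac{dx}{x} \right).$$

Direct computation shows that

$$\int_1^\infty (x^{(1-s)/2} - x^{-s/2})\, \frac{dx}{x} = \frac{2}{s-1} - \frac{2}{s} = \frac{2}{s(s-1)}$$

and thus

$$\xi(s) = \frac{1}{2} + \frac{s(s-1)}{4} \int_1^\infty (x^{s/2} + x^{(1-s)/2})\, (\theta(ix) - 1)\, \frac{dx}{x}$$

whenever $\mathrm{Re}(s) > 1$. However, the integrand here is holomorphic in $s$ and exponentially decaying in $x$, so from the Fubini and Morera theorems we easily see that the right-hand side is an entire function of $s$; also from inspection we see that it is symmetric with respect to the symmetry $s \mapsto 1-s$. Thus we can define $\xi$ as an entire function, and hence $\zeta$ as a meromorphic function, and one verifies the functional equation (6).

It remains to establish that $\xi$ is of order $1$. From (11) we have $\theta(ix) - 1 = O(e^{-\pi x})$ for $x \geq 1$, so from the triangle inequality

$$|\xi(s)| \lesssim 1 + |s|^2 \int_1^\infty x^{(|s|+1)/2}\, e^{-\pi x}\, \frac{dx}{x} \lesssim 1 + |s|^2\, \Gamma\left(\frac{|s|+1}{2}\right).$$

From the Stirling approximation (Exercise 30(v) from Notes 1) we conclude that $\xi(s) \lesssim \exp(O(|s| \log |s|))$ for $|s| \geq 2$ (say), and hence $\xi$ is of order at most $1$ as required. (One can show that $\xi$ has order exactly one by inspecting what happens to $\xi(s)$ as $s \to +\infty$, using that $\zeta(s) = 1 + o(1)$ in this regime.) This completes the proof of Theorem 3.
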
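The modularity relation for the theta function on the imaginary axis is what drives the functional equation, and it is easy to test numerically. The sketch below assumes the normalisation $\theta(z) = \sum_{n \in \mathbf{Z}} e^{\pi i n^2 z}$, so that $\theta(ix) = \sum_{n \in \mathbf{Z}} e^{-\pi n^2 x}$ and the relation reads $\theta(i/x) = x^{1/2} \theta(ix)$; the function name `theta_on_imaginary_axis` and the truncation parameter `n_max` are illustrative.

```python
import math


def theta_on_imaginary_axis(x, n_max=60):
    """Approximate theta(ix) = sum over all integers n of exp(-pi n^2 x), for x > 0.

    The sum is symmetric in n, so we sum n >= 1 once and double it.
    For very small x more terms (a larger n_max) would be needed.
    """
    tail = sum(math.exp(-math.pi * n * n * x) for n in range(1, n_max + 1))
    return 1.0 + 2.0 * tail


for x in (0.05, 0.3, 1.0, 2.5):
    lhs = theta_on_imaginary_axis(1.0 / x)
    rhs = math.sqrt(x) * theta_on_imaginary_axis(x)
    print(x, lhs, rhs)
```

For small $x$ the left-hand side is computed from a series that converges extremely fast, while the right-hand side needs many more terms; this is the same asymmetry between the behaviour near $0$ and near $\infty$ that the proof exploits when it splits the Mellin integral at $x = 1$.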
**Exercise 13 (Alternate derivation of meromorphic continuation and functional equation)**

- (i) Establish the identity whenever .
- (ii) Establish the identity whenever , is not an integer, , where is the branch of the logarithm with real part in , and is the contour consisting of the line segment , the semicircle , and the line segment .
- (iii) Use (ii) to meromorphically continue to the entire complex plane .
- (iv) By shifting the contour to the contour for a large natural number and applying the residue theorem, show that again using the branch of the logarithm to define .
- (v) Establish the functional equation (5).

**Exercise 14** Use the formula from Exercise 12, together with the functional equation, to give yet another proof of the identity .

**Exercise 15 (Relation between zeta function and Bernoulli numbers)**

- (i) For any complex number with , use the Poisson summation formula (Proposition 3(v) from Notes 2) to establish the identity
- (ii) For as above and sufficiently small, show that Conclude that for any natural number , where the Bernoulli numbers are defined through the Taylor expansion Thus for instance , , and so forth.
- (iii) Show that for any odd natural number . (This identity can also be deduced from the Euler-Maclaurin formula, which generalises the approach in Exercise 12; see this previous post.)
- (iv) Use (14) and the residue theorem (now working inside the contour , rather than outside) to give an alternate proof of (15).

**Exercise 16 (Convexity bounds)**

- (i) Establish the bounds for any and with .
- (ii) Establish the bounds for any and with . (Hint: use the functional equation.)
- (iii) Establish the bounds for any and with . (Hint: use the Phragmén-Lindelöf principle, Exercise 19 from Notes 2, after dealing somehow with the pole at .)

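Improvements to the bounds in Exercise 16(iii) inside the critical strip are known as *subconvexity estimates*. For instance, it is currently known that $\zeta(\frac{1}{2}+it) \lesssim_\mu (1+|t|)^{\mu}$ for any $t \in \mathbf{R}$ and $\mu > \frac{13}{84}$, a result of Bourgain; the Lindelöf hypothesis asserts that this bound in fact holds for all $\mu > 0$, although this remains unproven (it is however a consequence of the Riemann hypothesis).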

**Exercise 17 (Riemann-von Mangoldt formula)** Show that for any $T \geq 2$, the number of zeroes of $\zeta$ in the rectangle $\{\sigma + it: 0 \leq \sigma \leq 1; 0 \leq t \leq T\}$ is equal to $\frac{T}{2\pi} \log \frac{T}{2\pi} - \frac{T}{2\pi} + O(\log T)$. (*Hint:* apply the argument principle to $\xi$ evaluated at a rectangle $\{\sigma + it: -1 \leq \sigma \leq 2; 0 \leq t \leq T'\}$ for some $T' = T + O(1)$ that is chosen so that the horizontal edges of the rectangle do not come too close to any of the zeroes (cf. the selection of the radii in the proof of the Hadamard factorisation theorem in Notes 1), and use the functional equation and Stirling’s formula to control the asymptotics for the horizontal edges.)

We remark that the error term $O(\log T)$, due to von Mangoldt in 1905, has not been significantly improved despite over a century of effort. Even assuming the Riemann hypothesis, the error has only been reduced very slightly to $O(\log T / \log\log T)$ (a result of Littlewood from 1924).

**Remark 18** Thanks to the functional equation and Rouche’s theorem, it is possible to numerically verify the Riemann hypothesis in any finite portion of the critical strip, so long as the zeroes in that strip are all simple. Indeed, if there was a zero off of the critical line , then an application of the argument principle (and Rouche’s theorem) in some small contour around but avoiding the critical line would be capable of numerically determining that there was a zero off of the line. Similarly, for each simple zero on the critical line, applying the argument principle for some small contour around that zero and symmetric around the critical line would numerically verify that there was exactly one zero within that contour, which by the functional equation would then have to lie exactly on that line. (In practice, more efficient methods are used to numerically verify the Riemann hypothesis over large finite portions of the strip, but we will not detail them here.)

** — 2. The explicit formula — **

We now prove the Riemann-von Mangoldt explicit formula. Since $\xi$ is a non-trivial entire function of order $1$, with zeroes at the non-trivial zeroes of $\zeta$ (the trivial zeroes having been cancelled out by the Gamma function), we see from the Hadamard factorisation theorem (in the form of Exercise 35 from Notes 1) that

away from the zeroes of $\xi$, where $\rho$ ranges over the non-trivial zeroes of $\zeta$ (note from Exercise 11 that there is no zero at the origin), and is some constant. From (4) we can calculate while from Exercise 27 of Notes 1 we have and thus (after some rearranging) where

One can compute the values of explicitly:

**Exercise 19** By inspecting both sides of (16) as , show that , and hence .

Jensen’s formula tells us that the number of non-trivial zeroes of in a disk is at most for any and . One can obtain a local version:

**Exercise 20 (Local bound on zeroes)**

- (i) Establish the upper bound whenever and with . (*Hint:* use (10). More precise bounds are available with more effort, but will not be needed here.)
- (ii) Establish the bounds uniformly in . (*Hint:* use the Euler product.)
- (iii) Show that for any , the number of non-trivial zeroes with imaginary part in is . (*Hint:* use Jensen’s formula and the functional equation.)
- (iv) For , , and , with not a zero of , show that (*Hint:* use Exercise 9 of Notes 1.)

Meanwhile, from Perron’s formula (Exercise 12 of Notes 2) and (8) we see that for any non-integer , we have

We can compute individual terms here and then conclude the Riemann-von Mangoldt explicit formula:
**Exercise 21 (Riemann-von Mangoldt explicit formula)** Let and . Establish the following bounds:

- (i) .
- (ii) .
- (iii) For any positive integer , we have
- (iv) For any non-trivial zero , we have
- (v) We have .
- (vi) We have .

(*Hint:* for (i)-(iii), shift the contour to for an that gets sent to infinity, and use the residue theorem. The same argument works for (iv) except when is really close to , in which case a detour to the contour may be called for. For (vi), use Exercise 20 and partition the zeroes depending on what unit interval falls into.)

- (viii) Using the above estimates, conclude Theorem 8.

The explicit formula in Theorem 8 is completely exact, but turns out to be a little bit inconvenient for applications because it involves all the zeroes , and the series involving them converges very slowly (indeed the convergence is not even absolute). In practice it is preferable to work with a smoothed version of the formula. Here is one such smoothing:

**Exercise 22 (Smoothed explicit formula)**

- (i) Let be a smooth function compactly supported on . Show that is entire and obeys the bound (say) for some , all , and all .
- (ii) With as in (i), establish the identity with the summations being absolutely convergent by applying the Fourier inversion formula to , shifting the contour to frequencies for some , applying (8), and then shifting the contour again (using Exercise 20 and (i) to justify the contour shifting).
- (iii) Show that whenever is a smooth function, compactly supported in , with the summation being absolutely convergent.
- (iv) Explain why (iii) is *formally* consistent with Theorem 8 when applied to the non-smooth function .

** — 3. Extending the zero free region, and the prime number theorem — **

We now show how Theorem 9 implies Theorem 6. Let be parameters to be chosen later. We will apply Exercise 22 to a function which equals one on , is supported on , and obeys the derivative estimates

for all and , and for all and . Such a function can be constructed by gluing together various rescaled versions of (antiderivatives of) standard bump functions. For such a function, we have On the other hand, we have and and hence

We split into the two cases and , where is a parameter to be chosen later. For , there are only zeroes, and all of them have real part strictly less than $1$ by Theorem 9. Hence there exists such that for all such zeroes. For each such zero, we have from the triangle inequality and so the total contribution of these zeroes to (17) is .

For each zero with , we integrate by parts twice to get some decay in : and from the triangle inequality and the fact that we conclude Since is convergent (this follows from Exercise 20), we conclude (for large enough depending on ) that the total contribution here is . Thus, after choosing suitably, we obtain the bound and thus whenever is sufficiently large depending on (since depends only on , which depends only on ). A similar argument (replacing by in the construction of ) gives the matching lower bound whenever is sufficiently large depending on . Sending , we obtain Theorem 6.

**Exercise 23** Assuming the Riemann hypothesis, show that

(*Hint:* find a holomorphic continuation of to the region in a manner similar to how was first holomorphically continued to the region .)

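It remains to prove Theorem 9. The claim is clear for $t = 0$ thanks to the simple pole of $\zeta$ at $s = 1$, so we may assume $t \neq 0$. Suppose for contradiction that there was a zero of $\zeta$ at $1 + it$, thus

$$\zeta(\sigma + it) = O(\sigma - 1)$$

for $\sigma > 1$ sufficiently close to $1$. Taking logarithms, we see in particular that

$$\mathrm{Re} \log \zeta(\sigma + it) \leq -\log \frac{1}{\sigma - 1} + O(1).$$

Using Exercise 5(v), we conclude that

$$\sum_n \frac{\Lambda(n)}{n^\sigma \log n} \cos(t \log n) \leq -\log \frac{1}{\sigma - 1} + O(1).$$

Note that the summands here are oscillatory due to the cosine term. To manage the oscillation, we use the simple pole at $s = 1$ that gives

$$\zeta(\sigma) = \frac{1}{\sigma - 1} + O(1)$$

for $\sigma > 1$ sufficiently close to one, and on taking logarithms as before we get

$$\sum_n \frac{\Lambda(n)}{n^\sigma \log n} = \log \frac{1}{\sigma - 1} + O(1).$$

These two estimates come close to being contradictory, but not quite (because we could have $\cos(t \log n)$ close to $-1$ for most numbers $n$ that are weighted by $\frac{\Lambda(n)}{n^\sigma \log n}$). To get the contradiction, we use the analytic continuation of $\zeta$ to $1 + 2it$ to conclude that

$$\zeta(\sigma + 2it) = O(1)$$

and hence

$$\sum_n \frac{\Lambda(n)}{n^\sigma \log n} \cos(2t \log n) \leq O(1).$$

Now we take advantage of the *Mertens inequality*

$$3 + 4\cos\theta + \cos(2\theta) = 2(1 + \cos\theta)^2 \geq 0$$

(which is a quantitative variant of the observation that if $\cos\theta$ is close to $-1$ then $\cos(2\theta)$ has to be close to $+1$) as well as the non-negative nature of $\Lambda(n)$ to conclude that

$$\sum_n \frac{\Lambda(n)}{n^\sigma \log n} \left(3 + 4\cos(t \log n) + \cos(2t \log n)\right) \geq 0$$

and hence

$$3 \log \frac{1}{\sigma - 1} - 4 \log \frac{1}{\sigma - 1} + O(1) \geq 0.$$

This leads to the desired contradiction by sending $\sigma \to 1^+$, and proves the prime number theorem.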
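The trigonometric inequality of Mertens used in this argument is classically stated as $3 + 4\cos\theta + \cos(2\theta) \geq 0$, which follows from the identity $3 + 4\cos\theta + \cos(2\theta) = 2(1+\cos\theta)^2$. Assuming this is the form intended here, the short sketch below checks both the identity and the non-negativity on a fine grid of angles; the function names are illustrative.

```python
import math


def mertens_expression(theta):
    """The classical Mertens combination 3 + 4 cos(theta) + cos(2 theta)."""
    return 3.0 + 4.0 * math.cos(theta) + math.cos(2.0 * theta)


def closed_form(theta):
    """The same quantity written as a perfect square: 2 (1 + cos(theta))^2."""
    return 2.0 * (1.0 + math.cos(theta)) ** 2


grid = [2.0 * math.pi * k / 10_000 for k in range(10_000)]
min_value = min(mertens_expression(theta) for theta in grid)
max_gap = max(abs(mertens_expression(theta) - closed_form(theta)) for theta in grid)

print(min_value, max_gap)
```

The minimum is attained at $\theta = \pi$, where $\cos\theta = -1$ forces $\cos(2\theta) = +1$; this is exactly the configuration that a hypothetical zero at $1+it$ would need, and that boundedness of $\zeta$ at $1+2it$ rules out.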

**Exercise 24** Establish the inequality

**Remark 25** There are a number of ways to improve Theorem 9 that move a little closer in the direction of the Riemann hypothesis. Firstly, there are a number of *zero-free regions* for the Riemann zeta function known that give lower bounds for $|\zeta(s)|$ (and in particular preclude the existence of zeroes) a small amount inside the critical strip, and can be used to improve the error term in the prime number theorem; for instance, the *classical zero-free region* shows that there are no zeroes in the region $\{\sigma + it: \sigma > 1 - \frac{c}{\log(2 + |t|)}\}$ for some sufficiently small absolute constant $c > 0$, and lets one improve the $o(x)$ error term in Theorem 6 to $O(x \exp(-c\sqrt{\log x}))$ (with a corresponding improvement in Theorem 1, provided that one replaces $\frac{x}{\log x}$ with the logarithmic integral $\int_2^x \frac{dt}{\log t}$). A further improvement in the zero free region and in the prime number theorem error term was subsequently given by Vinogradov. We also mention a number of important *zero density estimates* which provide non-trivial upper bounds for the number of zeroes in other, somewhat larger regions of the critical strip; the bounds are not strong enough to completely exclude zeroes as is the case with zero-free regions, but can at least limit the collective influence of such zeroes. For more discussion of these topics, see the various lecture notes to this previous course.

### A statement from mathematics department faculty at Stanford and MIT

*[The following statement is signed by several mathematicians at Stanford and MIT in support of one of their recently admitted graduate students, and I am happy to post it here on my blog. -T]*

We were saddened and horrified to learn that Ilya Dumanski, a brilliant young mathematician who has been admitted to our graduate programs at Stanford and MIT, has been imprisoned in Russia, along with several other mathematicians, for participation in a peaceful demonstration. Our thoughts are with them. We urge their rapid release, and failing that, that they be kept in humane conditions. A petition in their support has been started at

https://www.ipetitions.com/petition/a-call-for-immediate-release-of-arrested-students/

Signed,

Roman Bezrukavnikov (MIT)

Alexei Borodin (MIT)

Daniel Bump (Stanford)

Sourav Chatterjee (Stanford)

Otis Chodosh (Stanford)

Ralph Cohen (Stanford)

Henry Cohn (MIT)

Brian Conrad (Stanford)

Joern Dunkel (MIT)

Pavel Etingof (MIT)

Jacob Fox (Stanford)

Michel Goemans (MIT)

Eleny Ionel (Stanford)

Steven Kerckhoff (Stanford)

Jonathan Luk (Stanford)

Eugenia Malinnikova (Stanford)

Davesh Maulik (MIT)

Rafe Mazzeo (Stanford)

Haynes Miller (MIT)

Ankur Moitra (MIT)

Elchanan Mossel (MIT)

Tomasz Mrowka (MIT)

Bjorn Poonen (MIT)

Alex Postnikov (MIT)

Lenya Ryzhik (Stanford)

Paul Seidel (MIT)

Mike Sipser (MIT)

Kannan Soundararajan (Stanford)

Gigliola Staffilani (MIT)

Nike Sun (MIT)

Richard Taylor (Stanford)

Ravi Vakil (Stanford)

Andras Vasy (Stanford)

Jan Vondrak (Stanford)

Brian White (Stanford)

Zhiwei Yun (MIT)

### 246B, Notes 3: Elliptic functions and modular forms

Previous set of notes: Notes 2. Next set of notes: Notes 4.

On the real line, the quintessential examples of a periodic function are the (normalised) sine and cosine functions $\sin(2\pi x)$, $\cos(2\pi x)$, which are $1$-periodic in the sense that

$$\sin(2\pi(x+1)) = \sin(2\pi x); \quad \cos(2\pi(x+1)) = \cos(2\pi x).$$

By taking various polynomial combinations of $\sin(2\pi x)$ and $\cos(2\pi x)$ we obtain more general trigonometric polynomials that are $1$-periodic; and the theory of Fourier series tells us that all other $1$-periodic functions (with reasonable integrability conditions) can be approximated in various senses by such polynomial combinations. Using Euler’s identity, one can use $e^{2\pi i x}$ and $e^{-2\pi i x}$ in place of $\sin(2\pi x)$ and $\cos(2\pi x)$ as the basic generating functions here, provided of course one is willing to use complex coefficients instead of real ones. Of course, by rescaling one can also make similar statements for other periods than $1$. $1$-periodic functions $f: \mathbf{R} \to \mathbf{C}$ can also be identified (by abuse of notation) with functions $f: \mathbf{R}/\mathbf{Z} \to \mathbf{C}$ on the quotient space $\mathbf{R}/\mathbf{Z}$ (known as the *additive $1$-torus* or *additive unit circle*), or with functions $f: [0,1] \to \mathbf{C}$ on the fundamental domain (up to boundary) $[0,1]$ of that quotient space with the periodic boundary condition $f(0) = f(1)$. The map $x \mapsto (\cos(2\pi x), \sin(2\pi x))$ also identifies the additive unit circle $\mathbf{R}/\mathbf{Z}$ with the *geometric unit circle* $\{(x,y) \in \mathbf{R}^2: x^2 + y^2 = 1\}$, thanks in large part to the fundamental trigonometric identity $\cos^2 x + \sin^2 x = 1$; this can also be identified with the *multiplicative unit circle* $\{z \in \mathbf{C}: |z| = 1\}$. (Usually by abuse of notation we refer to all of these three sets simultaneously as the “unit circle”.) Trigonometric polynomials on the additive unit circle then correspond to ordinary polynomials of the real coordinates $(x,y)$ of the geometric unit circle, or Laurent polynomials of the complex variable $z$.

What about periodic functions on the complex plane? We can start with *singly periodic functions* $f: \mathbf{C} \to \mathbf{C}$ which obey a periodicity relationship $f(z + \omega) = f(z)$ for all $z$ in the domain and some period $\omega \in \mathbf{C} \setminus \{0\}$; such functions can also be viewed as functions on the “additive cylinder” $\omega\mathbf{Z} \backslash \mathbf{C}$ (or equivalently $\mathbf{C}/\omega\mathbf{Z}$). We can rescale $\omega = 1$ as before. For holomorphic functions, we have the following characterisations:

**Proposition 1 (Description of singly periodic holomorphic functions)**

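- (i) Every $1$-periodic entire function $f: \mathbf{C} \to \mathbf{C}$ has an absolutely convergent expansion $$f(z) = \sum_{n=-\infty}^\infty a_n e^{2\pi i n z} = \sum_{n=-\infty}^\infty a_n q^n \qquad (1)$$ where $q$ is the nome $q := e^{2\pi i z}$, and the $a_n$ are complex coefficients such that $$\limsup_{n \to +\infty} |a_n|^{1/n} = \limsup_{n \to +\infty} |a_{-n}|^{1/n} = 0. \qquad (2)$$ Conversely, every doubly infinite sequence $(a_n)_{n \in \mathbf{Z}}$ of coefficients obeying (2) gives rise to a $1$-periodic entire function via the formula (1).
- (ii) Every bounded $1$-periodic holomorphic function $f$ on the upper half-plane $\{z: \mathrm{Im}(z) > 0\}$ has an expansion $$f(z) = \sum_{n=0}^\infty a_n e^{2\pi i n z} = \sum_{n=0}^\infty a_n q^n \qquad (3)$$ where the $a_n$ are complex coefficients such that $$\limsup_{n \to +\infty} |a_n|^{1/n} \leq 1. \qquad (4)$$ Conversely, every infinite sequence $(a_n)_{n \geq 0}$ obeying (4) gives rise to a $1$-periodic holomorphic function which is bounded away from the real axis (i.e., bounded on $\{z: \mathrm{Im}(z) \geq \varepsilon\}$ for every $\varepsilon > 0$).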

*Proof:* If is -periodic, then it can be expressed as for some function on the “multiplicative cylinder” , since the fibres of the map are cosets of the integers , on which is constant by hypothesis. As the map is a covering map from to , we see that will be holomorphic if and only if is. Thus must have a Laurent series expansion with coefficients obeying (2), which gives (1), and the inversion formula (5) follows from the usual contour integration formula for Laurent series coefficients. The converse direction to (i) also follows by reversing the above arguments.

For part (ii), we observe that the map is also a covering map from to the punctured disk , so we can argue as before except that now is a bounded holomorphic function on the punctured disk. By the Riemann singularity removal theorem (Exercise 35 of 246A Notes 3) extends to be holomorphic on all of , and thus has a Taylor expansion for some coefficients obeying (4). The argument now proceeds as with part (i).

The additive cylinder and the multiplicative cylinder can both be identified (on the level of smooth manifolds, at least) with the geometric cylinder , but we will not use this identification here.

Now let us turn attention to *doubly periodic* functions of a complex variable , that is to say functions that obey two periodicity relations

Within the world of holomorphic functions, the collection of doubly periodic functions is boring:

**Proposition 2** Let $f$ be an entire doubly periodic function (with periods $\omega_1, \omega_2$ linearly independent over $\mathbf{R}$). Then $f$ is constant.

In the language of Riemann surfaces, this proposition asserts that the torus is a non-hyperbolic Riemann surface; it cannot be holomorphically mapped non-trivially into a bounded subset of the complex plane.

*Proof:* The fundamental domain (up to boundary) enclosed by $0, \omega_1, \omega_1 + \omega_2, \omega_2$ is compact, hence $f$ is bounded on this domain, hence bounded on all of $\mathbf{C}$ by double periodicity. The claim now follows from Liouville’s theorem. (One could alternatively have argued here using the compactness of the torus $\mathbf{C}/(\omega_1\mathbf{Z} + \omega_2\mathbf{Z})$.)

To obtain more interesting examples of doubly periodic functions, one must therefore turn to the world of *meromorphic functions* – or equivalently, holomorphic functions into the Riemann sphere . As it turns out, a particularly fundamental example of such a function is the Weierstrass elliptic function

One can also study the space of *all* such tori, modulo isomorphism; this is a basic example of a moduli space, known as the (classical, level one) modular curve. This curve can be described in a number of ways. On the one hand, it can be viewed as the upper half-plane $\mathbf{H} := \{z \in \mathbf{C}: \mathrm{Im}(z) > 0\}$ quotiented out by the discrete group $SL_2(\mathbf{Z})$; on the other hand, by using the $j$-invariant, it can be identified with the complex plane $\mathbf{C}$; alternatively, one can compactify the modular curve and identify this compactification with the Riemann sphere $\mathbf{C} \cup \{\infty\}$. (This identification, by the way, produces a very short proof of the little and great Picard theorems, which we proved in 246A Notes 4.) Functions on the modular curve (such as the $j$-invariant) can be viewed as $SL_2(\mathbf{Z})$-invariant functions on $\mathbf{H}$, and include the important class of modular functions; they naturally generalise to the larger class of (weakly) modular forms, which are functions on $\mathbf{H}$ which transform in a very specific way under $SL_2(\mathbf{Z})$-action, and which are ubiquitous throughout mathematics, and particularly in number theory. Basic examples of modular forms include the Eisenstein series, which are also the Laurent coefficients of the Weierstrass elliptic functions $\wp$. More number theoretic examples of modular forms include (suitable powers of) theta functions $\theta$, and the modular discriminant $\Delta$. Modular forms are $1$-periodic functions on the half-plane, and hence by Proposition 1 come with Fourier coefficients $a_n$; these coefficients often turn out to encode a surprising amount of number-theoretic information; a dramatic example of this is the famous modularity theorem, (a special case of which was) used amongst other things to establish Fermat’s last theorem. Modular forms can be generalised to other discrete groups than $SL_2(\mathbf{Z})$ (such as congruence groups) and to other domains than the half-plane $\mathbf{H}$, leading to the important larger class of automorphic forms, which are of major importance in number theory and representation theory, but which are well outside the scope of this course to discuss.

** — 1. Doubly periodic functions — **

Throughout this section we fix two complex numbers $\omega_1, \omega_2$ that are linearly independent over $\mathbf{R}$, which then generate a lattice $\Lambda := \omega_1\mathbf{Z} + \omega_2\mathbf{Z}$.

We now study the doubly periodic meromorphic functions with respect to these periods that are not identically zero. We first observe some constraints on the poles of these functions. Of course, by periodicity, the poles will themselves be periodic, and thus the set of poles forms a finite union of disjoint cosets of the lattice . Similarly, the zeroes form a finite union of disjoint cosets . Using the residue theorem, we can obtain some further constraints:

**Lemma 3 (Consequences of residue theorem)** Let be a doubly periodic meromorphic function (not identically zero) with periods , poles at , and zeroes at .

- (i) The sum of residues at each (i.e., we sum one residue per coset) is equal to zero.
- (ii) The number of poles (counting multiplicity, but only counting once per coset) is equal to the number of zeroes (again counting multiplicity, and once per coset).
- (iii) The sum of the poles (counting multiplicity, and working in the group ) is equal to the sum of the zeroes .

*Proof:* For (i), we first apply a translation so that none of the pole cosets intersects the fundamental parallelogram boundary ; this of course does not affect the sum of residues. Then, by the residue theorem, the sum in (i) is equal to the expression
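The displays in this proof, and the argument for part (ii), appear to have been lost. What follows is only a sketch of the standard argument, under assumed notation that the surviving text does not confirm: $P$ is a fundamental parallelogram with vertices $a$, $a+\omega_1$, $a+\omega_1+\omega_2$, $a+\omega_2$ (traversed in that order, taken to be anticlockwise), $\gamma_1$ and $\gamma_2$ are the edges from $a$ to $a+\omega_1$ and from $a$ to $a+\omega_2$, and $z_j$ (resp. $w_j$) are the poles (resp. zeroes) of $f$ in $P$, repeated according to multiplicity in part (iii).

```latex
\begin{align*}
% (i): by double periodicity the integrals over opposite edges cancel
\sum_j \operatorname{Res}(f; z_j)
  &= \frac{1}{2\pi i} \int_{\partial P} f(z)\, dz = 0, \\
% (ii): apply (i) to f'/f, which is doubly periodic and whose residue
% at a zero or pole of f is the order of f there (negative at poles)
\#\{\text{zeroes in } P\} - \#\{\text{poles in } P\}
  &= \frac{1}{2\pi i} \int_{\partial P} \frac{f'(z)}{f(z)}\, dz = 0, \\
% (iii): the residue of z f'(z)/f(z) at a zero w of order m is m w
\sum_j w_j - \sum_j z_j
  &= \frac{1}{2\pi i} \int_{\partial P} z\, \frac{f'(z)}{f(z)}\, dz \\
  &= \omega_1 \cdot \frac{1}{2\pi i} \int_{\gamma_2} \frac{f'(z)}{f(z)}\, dz
   - \omega_2 \cdot \frac{1}{2\pi i} \int_{\gamma_1} \frac{f'(z)}{f(z)}\, dz.
\end{align*}
```

In the last line, each of the two normalised integrals is an integer (a winding number), so the right-hand side lies in the lattice regardless of the orientation convention; this is the step carried out in the next paragraph.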

For part (iii), we again translate so that none of the pole or zero cosets intersects , noting from part (ii) that any such translation affects the sum of poles and sum of zeroes by the same amount. By the residue theorem, it now suffices to show that

lies in the lattice . But one can rewrite this using the double periodicity as so it suffices to show that is an integer for . But (a slight modification of) the argument principle shows that this number is precisely the winding number around the origin of the image of under the map , and the claim follows.

This lemma severely limits the possible number of behaviors for the zeroes and poles of a meromorphic function. To formalise this, we introduce some general notation:

**Definition 4 (Divisors)**

- (i) A divisor on the torus is a formal integer linear combination , where ranges over a finite collection of points in the torus (i.e., a finite collection of cosets ), and are integers, with the obvious additive group structure; equivalently, the space of divisors is the free abelian group with generators for (with the convention ).
- (ii) The number is the *degree* of a divisor , the point is the *sum* of , and each is the *order* of the divisor at (with the convention that the order is if does not appear in the sum). A divisor is *non-negative* (or *effective*) if for all . We write if is non-negative (i.e., the order of is greater than or equal to that of at every point ), and if and .
- (iii) Given a meromorphic function (or equivalently, a doubly periodic function ) that is not identically zero, the *principal divisor* is the divisor , where ranges over the zeroes and poles of , and is the order of the zero (if is a zero) or negative the order of the pole (if is a pole).
- (iv) Given a divisor , we define to be the space of all meromorphic functions that are either zero, or are such that . That is to say, consists of those meromorphic functions that have at most a pole of order at if is positive, or at least a zero of order if is negative.

A divisor can be viewed as an abstraction of the concept of a set of zeroes and poles (counting multiplicity). Observe that principal divisors obey the laws , when are meromorphic and non-zero. In particular, the space of principal divisors is a subgroup of the space of all divisors. By Lemma 3(ii), all principal divisors have degree zero, and from Lemma 3(iii), all principal divisors have sum zero as well. Later on we shall establish the converse claim that every divisor of degree and sum zero is a principal divisor; see Exercise 7.
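Since the symbols of Definition 4 did not survive, it may help to record one consistent way of writing them. The notation below ($c_P$, $\deg$, $\operatorname{sum}$, $\operatorname{ord}_P$, $L(D)$) is the usual one and is an assumption here, not something the surviving text confirms.

```latex
\begin{align*}
D &= \sum_P c_P \cdot (P), \qquad c_P \in \mathbf{Z},\ P \in \mathbf{C}/\Lambda, \\
\deg(D) &= \sum_P c_P \in \mathbf{Z}, \qquad
\operatorname{sum}(D) = \sum_P c_P P \in \mathbf{C}/\Lambda, \\
(f) &= \sum_P \operatorname{ord}_P(f) \cdot (P), \\
L(D) &= \{ f \text{ meromorphic on } \mathbf{C}/\Lambda : f = 0 \text{ or } (f) + D \geq 0 \}, \\
% the laws obeyed by principal divisors
(fg) &= (f) + (g), \qquad (f/g) = (f) - (g).
\end{align*}
```

In this notation, Lemma 3(ii) reads $\deg((f)) = 0$ and Lemma 3(iii) reads $\operatorname{sum}((f)) = 0$.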

**Remark 5** One can define divisors on other Riemann surfaces, such as the complex plane . Observe from the fundamental theorem of algebra that if one has two non-zero polynomials , then if and only if divides as a polynomial. This may give some hint as to the origin of the terminology “divisor”. The machinery of divisors turns out to have a rich algebraic and topological structure when applied to more general Riemann surfaces than tori, for instance enabling one to associate an abelian variety (the Jacobian variety) to every algebraic curve; see these 246C notes for further discussion.

It is easy to see that is always a vector space. All non-zero meromorphic functions belong to at least one of the , namely , so to classify all the meromorphic functions on , it would suffice to understand what all the spaces are.

Liouville’s theorem (in the form of Proposition 2) tells us that all elements of – that is to say, the holomorphic functions on – are constant; thus is one-dimensional. If is a negative divisor, the elements of are thus constant and have at least one zero, thus in these cases is trivial.

Now we gradually work our way up to higher degree divisors . A basic fact, proven from elementary linear algebra, is that every time one adds a pole to , the dimension of the space only goes up by at most one:

**Lemma 6** For any divisor and any , is a subspace of of codimension at most one. In particular, is finite-dimensional for any .

*Proof:* It is clear that is a subspace of . If has order at , then there is a linear functional that assigns to each meromorphic function the coefficient of the Laurent expansion of at (note from periodicity that the exact choice of coset representative is not relevant). A little thought reveals that the kernel of is precisely , and the first claim follows. The second claim follows from iterating the first claim, noting that any divisor can be obtained from a suitable negative divisor by the addition of finitely many poles .
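The linear functional in this proof can be written out as follows. The names $m$, $z_0$, $a_n$ and $\lambda$ are assumptions made for this sketch, since the original symbols did not survive; here $D$ has order $m$ at the point $P = z_0 + \Lambda$.

```latex
\begin{align*}
f(z) &= \sum_{n \geq -(m+1)} a_n (z - z_0)^n
  \quad \text{near } z_0, \text{ for } f \in L(D + (P)), \\
\lambda(f) &:= a_{-(m+1)}, \qquad \ker \lambda = L(D), \\
\dim L(D + (P)) &\leq \dim L(D) + 1.
\end{align*}
```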

Now consider the space for some point . Lemma 6 tells us that the dimension of this space is either one or two, since was one-dimensional. The space consists of functions that possibly have a simple pole at most at , and no other poles. But Lemma 3(i) tells us that the residue at has to vanish, and so is in fact in and thus is constant. (One could also argue here using the other two parts of Lemma 3; how?) So is no larger than , and is thus also one-dimensional.

Now let us study the space – the space of meromorphic functions that have at most a double pole at and no other poles. Again, Lemma 6 tells us that this space is one or two dimensional. To figure out which, we can normalise to be the origin coset . The question is now whether there is a doubly periodic meromorphic function that has a double pole at each point of . A naive candidate for such a function would be the infinite series

however this series turns out not to be absolutely convergent. Somewhat in analogy with the discussion of the Weierstrass and Hadamard factorisation theorems in Notes 1, we then proceed instead by working with the normalised function defined by the formula (6).

Let us first verify that the series in (6) is absolutely convergent for . There are only finitely many with , and all the summands are finite for , so we only need to establish convergence of the tail However, from the fundamental theorem of calculus we have so to demonstrate absolute convergence it suffices to show that But a simple volume packing argument (considering the areas of the translates of the fundamental domain ) shows that the number of lattice points in any disk , is , and so by dyadic decomposition as in Notes 1, the series is absolutely convergent. Further repetition of the arguments from Notes 1 shows that the series in (6) converges locally uniformly in , and thus is holomorphic on this set. Furthermore, for any , the same arguments show that stays bounded in a punctured neighbourhood of , thus by the Riemann singularity removal theorem is equal to plus a bounded holomorphic function in the neighbourhood of . Thus is meromorphic with double poles (and vanishing residue) at every lattice point , and no other poles.

Now we show that is doubly periodic, thus and for . We just prove the first identity, as the second is analogous. From (6) we have

The series on the right is absolutely convergent, and on every coset of it telescopes to zero. The claim then follows by Fubini’s theorem.

By construction, lies in , and is clearly non-constant. Thus is two-dimensional, being spanned by the constant function and . By translation, we see that is two-dimensional for any other point as well.
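The displays in the construction above were lost. The following sketch records the standard formulas that the text appears to describe, with (6) presumably being the second line; the symbols $\wp$, $z_0$ and $R$ are assumptions, and the implied constants depend on the lattice.

```latex
\begin{align*}
% naive candidate: not absolutely convergent
&\sum_{z_0 \in \Lambda} \frac{1}{(z - z_0)^2} \\
% normalised version, presumably (6)
\wp(z) &:= \frac{1}{z^2} + \sum_{z_0 \in \Lambda \setminus \{0\}}
  \left( \frac{1}{(z - z_0)^2} - \frac{1}{z_0^2} \right) \\
% tail estimate for |z| \le R and |z_0| \ge 2R, via the fundamental theorem of calculus
\frac{1}{(z - z_0)^2} - \frac{1}{z_0^2}
  &= \int_0^1 \frac{2z}{(z_0 - tz)^3}\, dt = O\!\left( \frac{R}{|z_0|^3} \right) \\
% lattice point count and the resulting convergence
\# \left( \Lambda \cap D(0, R) \right) &= O(R^2) \quad (R \geq 1),
\qquad \sum_{z_0 \in \Lambda \setminus \{0\}} \frac{1}{|z_0|^3} < \infty \\
% periodicity: the right-hand side telescopes on each coset of \omega_1 \mathbf{Z}
\wp(z + \omega_1) - \wp(z)
  &= \sum_{z_0 \in \Lambda}
  \left( \frac{1}{(z + \omega_1 - z_0)^2} - \frac{1}{(z - z_0)^2} \right) = 0
\end{align*}
```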

From (6) it is also clear that the function is even: . In particular, for any avoiding the half-lattice (so that and occupy different locations in the torus ), the function has a zero at both and . By Lemma 3(ii) there are no other zeroes of this function (and this claim is also consistent with Lemma 3(iii)); thus the divisor of this function is given by

If lies in the half-lattice but not in (thus, it lies in one of the *half-periods* , , or ) then from the even and doubly periodic nature of we see that for all , so in fact must have at least a double zero at , and again from Lemma 3(ii) these are the only zeroes of this function. So the identity (9) also holds in this case.
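The divisor identity (9) referred to here did not survive. Assuming the point in question is called $z_0$ and the Weierstrass function is $\wp$, it presumably reads as follows.

```latex
\begin{equation*}
\left( \wp - \wp(z_0) \right) = (z_0) + (-z_0) - 2 \cdot (0),
\qquad z_0 \notin \Lambda.
\end{equation*}
```

When $z_0$ is a half-period, $z_0$ and $-z_0$ are the same point of the torus and the right-hand side becomes $2 \cdot (z_0) - 2 \cdot (0)$, which matches the double zero found above. In both cases the degree and the sum are zero, as Lemma 3 requires.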

**Exercise 7 (Classification of principal divisors)**

- (i) Let be four points such that . Show that the divisor is a principal divisor. (*Hint:* if are all distinct, use the function If some of the coincide, use some transformed version of the Weierstrass elliptic function instead.)
- (ii) Show that every divisor of degree zero and sum zero is a principal divisor.
- (iii) Two divisors are said to be *equivalent* if their difference is a principal divisor. Show that two divisors are equivalent if and only if they have the same degree and same sum.
- (iv) Show that the quotient group (known as the *divisor class group* or Picard group) is isomorphic (as a group) to , and that the subgroup arising from degree zero divisors (also known as the Jacobian variety of ) is isomorphic to .

Now let us study the space , where we again normalise for sake of discussion. Lemma 6 tells us that this space is two or three dimensional, being spanned by , , and possibly one other function. Note that the derivative of the meromorphic function is also doubly periodic with a triple pole at , so it lies in and is not a linear combination of or (as these have a lower order singularity at ). Thus is three-dimensional, being spanned by . A formal term-by-term differentiation of (6) gives (7). To justify (7), observe that the arguments that demonstrated the meromorphicity of the right-hand side of (6) also show the meromorphicity of (7). From Fubini’s theorem, the fundamental theorem of calculus, and (6) we see that

for any contour in from one point to another , and the claim (7) now follows from another appeal to the fundamental theorem of calculus. Of course, will then also be three-dimensional for any other point on the torus.

From (7) we also see that is odd; this also follows from the even nature of . From the oddness and periodicity, has to have zeroes at the half-periods ; in particular, from Lemma 3(ii) there are no other zeroes, and the principal divisor is given by

Turning now to , we could differentiate yet again to generate a doubly periodic function with a fourth order pole at the origin, but we can also work with the square of the Weierstrass function. From Lemma 6 we conclude that is four-dimensional and is spanned by . In a similar fashion, is a five-dimensional space spanned by .
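The displays (7) and (10), and the spanning sets in the last few paragraphs, were lost. A sketch of what they presumably are, in the standard notation ($\wp$ for the Weierstrass function, $L(n \cdot (0))$ for the spaces of Definition 4(iv)), which is an assumption here:

```latex
\begin{align*}
% presumably (7)
\wp'(z) &= -2 \sum_{z_0 \in \Lambda} \frac{1}{(z - z_0)^3} \\
% presumably (10): simple zeroes at the three half-periods, triple pole at the origin
(\wp') &= \left( \tfrac{\omega_1}{2} \right) + \left( \tfrac{\omega_2}{2} \right)
  + \left( \tfrac{\omega_1 + \omega_2}{2} \right) - 3 \cdot (0) \\
% the spaces discussed so far
L(2 \cdot (0)) &= \operatorname{span}(1, \wp), &
L(3 \cdot (0)) &= \operatorname{span}(1, \wp, \wp'), \\
L(4 \cdot (0)) &= \operatorname{span}(1, \wp, \wp', \wp^2), &
L(5 \cdot (0)) &= \operatorname{span}(1, \wp, \wp', \wp^2, \wp \wp').
\end{align*}
```

One can check that (10) is consistent with Lemma 3: the degree is $3 - 3 = 0$, and the three half-periods sum to $\omega_1 + \omega_2$, which is zero in the torus.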

Something interesting happens though at . Lemma 6 tells us that this space is the span of , and possibly one other function, which will have a pole of order six at the origin. Here we have *two* natural candidates for such a function: the cube of the Weierstrass function, and the square of its derivative. Both have a pole of order exactly six and lie in , and so must be a linear combination of . But since are even and are odd, must in fact just be a linear combination of . To work out the precise combination, we see by repeating the derivation of (7) that
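The computation that this sentence leads into, ending in the differential equation (8), was lost. Here is a sketch of the standard derivation, assuming the usual definition $G_k := \sum_{z_0 \in \Lambda \setminus \{0\}} z_0^{-k}$ of the Eisenstein series (a name and normalisation the surviving text does not confirm at this point).

```latex
\begin{align*}
\wp(z) &= \frac{1}{z^2} + 3 G_4 z^2 + 5 G_6 z^4 + O(z^6), \\
\wp'(z) &= -\frac{2}{z^3} + 6 G_4 z + 20 G_6 z^3 + O(z^5), \\
\wp'(z)^2 &= \frac{4}{z^6} - \frac{24 G_4}{z^2} - 80 G_6 + O(z^2), \\
4 \wp(z)^3 &= \frac{4}{z^6} + \frac{36 G_4}{z^2} + 60 G_6 + O(z^2), \\
\wp'(z)^2 - 4 \wp(z)^3 + 60 G_4 \wp(z) + 140 G_6 &= O(z^2).
\end{align*}
```

The left-hand side of the last line is doubly periodic and holomorphic after removing the singularity at the lattice points, hence constant by Liouville's theorem, and it vanishes at the origin. This gives $\wp'^2 = 4 \wp^3 - 60 G_4 \wp - 140 G_6$, which is presumably the identity (8).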

**Exercise 8** Derive (8) directly from Proposition 2 by showing that the difference between the two sides is doubly periodic and holomorphic after removing singularities.

**Exercise 9 (Classification of doubly periodic meromorphic functions)**

- (i) For any , show that has dimension , and every element of this space is a polynomial combination of .
- (ii) Show that every doubly periodic meromorphic function is a rational function of .

We have an alternate form of (8):

**Exercise 10** Define the roots , , .

- (i) Show that are distinct, and that for all . (*Hint:* use (10).) Conclude in particular that , , and .
- (ii) Show that the modular discriminant is equal to , and is in particular non-zero.
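The formulas in the statement of Exercise 10 were lost. Assuming the roots are the values of $\wp$ at the three half-periods, and that (8) is the differential equation $\wp'^2 = 4\wp^3 - 60 G_4 \wp - 140 G_6$, the statement presumably reads as follows.

```latex
\begin{align*}
e_1 &:= \wp\left( \tfrac{\omega_1}{2} \right), \quad
e_2 := \wp\left( \tfrac{\omega_2}{2} \right), \quad
e_3 := \wp\left( \tfrac{\omega_1 + \omega_2}{2} \right), \\
\wp'(z)^2 &= 4 (\wp(z) - e_1)(\wp(z) - e_2)(\wp(z) - e_3), \\
e_1 + e_2 + e_3 &= 0, \qquad
e_1 e_2 + e_2 e_3 + e_3 e_1 = -15 G_4, \qquad
e_1 e_2 e_3 = 35 G_6, \\
\Delta &:= (60 G_4)^3 - 27 (140 G_6)^2
  = 16 (e_1 - e_2)^2 (e_2 - e_3)^2 (e_3 - e_1)^2.
\end{align*}
```

The three symmetric-function identities come from comparing coefficients of the factored cubic with $4x^3 - 60 G_4 x - 140 G_6$.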

If we now define the elliptic curve

to be the union of a certain cubic curve in the complex plane together with the point at infinity (where the notions of “curve” and “plane” are relative to the underlying complex field rather than the more familiar real field ), then we have a map from to , with the convention that the origin is mapped to the point at infinity . For instance, the half-periods , , are mapped to the points of respectively.

**Lemma 11** The map defined by (11) is a bijection between and .
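The definition of the elliptic curve and of the map (11) was lost. Assuming the symbols $E$ and $\pi$ (our names) and the form of (8) above, they are presumably as follows.

```latex
\begin{align*}
E &:= \{ (x, y) \in \mathbf{C}^2 : y^2 = 4 x^3 - 60 G_4 x - 140 G_6 \} \cup \{ \infty \}, \\
\pi &: \mathbf{C}/\Lambda \to E, \qquad
\pi(z + \Lambda) := (\wp(z), \wp'(z)) \ \text{ for } z \notin \Lambda, \qquad
\pi(0 + \Lambda) := \infty, \\
\pi\left( \tfrac{\omega_1}{2} \right) &= (e_1, 0), \quad
\pi\left( \tfrac{\omega_2}{2} \right) = (e_2, 0), \quad
\pi\left( \tfrac{\omega_1 + \omega_2}{2} \right) = (e_3, 0).
\end{align*}
```

That $\pi$ takes values in $E$ is exactly the differential equation (8).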

Among other things, this lemma implies that the elliptic curve is topologically equivalent (i.e., homeomorphic) to a torus, which is not an entirely obvious fact (though if one squints hard enough, the real analogue of an elliptic curve does resemble a distorted slice of a torus embedded in ).

*Proof:* Clearly is the only point that maps to , and (from (10)) the half-periods are the only points that map to . It remains to show that all the other points arise via from exactly one element of . The function has exactly two zeroes by Lemma 3(ii), which lie at for some as is even; since , is not equal to , hence is not a half-period. As is odd, the map (11) must therefore map to the two points of the elliptic curve that lie above , and the claim follows.

Analogously to the Riemann sphere , the elliptic curve can be given the structure of a Riemann surface, by prescribing the following charts:

- (i) When is a point in other than or , then locally is the graph of a holomorphic branch of the square root of near , and one can use as a coordinate function in a sufficiently small neighbourhood of .
- (ii) In the neighbourhood of for some , the function has a simple zero at and so has a local inverse that maps a neighbourhood of to a neighbourhood of , and a point sufficiently near can be parameterised by . One can then use as a coordinate function in a neighbourhood of .
- (iii) A neighbourhood of consists of and the points in the remaining portion of with sufficiently large; then is asymptotic to a square root of , so in particular and should both go to zero as goes to infinity in . We rewrite the defining equation of the curve in terms of and as . The function has a simple zero at zero and thus has a holomorphic local inverse that maps to , and we have in a neighbourhood of infinity. We can then use as a coordinate function in a neighbourhood of , with the convention that this coordinate function vanishes at infinity.

It is then a tedious but routine matter to check that has the structure of a Riemann surface. We then claim that the bijection defined by (11) is holomorphic, and thus a complex diffeomorphism of Riemann surfaces. In the neighbourhood of any point of the torus other than the origin , maps to a neighbourhood of a finite point of (possibly one of the three points ), and the holomorphicity is a routine consequence of composing together the various local holomorphic functions and their inverses. In the neighbourhood of the origin , maps for small to a point of with a Laurent expansion

from the Laurent expansions of , so in particular the coordinate takes the form where the error term is holomorphic, with mapping to . In particular the map is a local complex diffeomorphism here and again we have holomorphicity. We thus conclude that the elliptic curve is complex diffeomorphic to the torus using the map . From Exercise 9, the meromorphic functions on may be identified with the rational functions on .

While we have shown that all tori are complex diffeomorphic to elliptic curves, the converse statement that all elliptic curves are diffeomorphic to tori will have to wait until the next section for a proof, once we have set up the machinery of modular forms.
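The formulas behind chart (iii) and behind the holomorphicity check at the origin were lost. A sketch, assuming coordinates $(x, y)$ on the curve, the map $\pi(z) = (\wp(z), \wp'(z))$, and the coordinate $x/y$ near infinity (all our reconstruction):

```latex
\begin{align*}
% defining equation rewritten in terms of 1/x and x/y
\left( \frac{x}{y} \right)^2
  &= \frac{1/x}{4 - 60 G_4 (1/x)^2 - 140 G_6 (1/x)^3}, \\
% behaviour of \pi near the origin of the torus
\pi(z) &= \left( \frac{1}{z^2} + O(z^2),\ -\frac{2}{z^3} + O(z) \right), \\
\frac{x}{y} &= \frac{\wp(z)}{\wp'(z)} = -\frac{z}{2} + O(z^5).
\end{align*}
```

The right-hand side of the first line is a function of $1/x$ with a simple zero at zero, which is what allows $1/x$ to be recovered holomorphically from $(x/y)^2$ near infinity. The last line has non-vanishing derivative at $z = 0$, so $\pi$ is a local complex diffeomorphism there.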

**Exercise 12 (Group law on elliptic curves)**

- (i) Let be three distinct elements of the torus that are not equal to the origin . Show that if and only if the three points , , are collinear in , in the sense that they lie on a common complex line for some complex numbers with not both zero.
- (ii) What happens in (i) if (say) and agree? What about if ?
- (iii) Using (i), (ii), give a purely geometric definition of a group addition law on the elliptic curve which is compatible with the group addition law on the torus via (11). (We remark that the associativity property of this law is not obvious from a purely geometric perspective, and is related to the Cayley-Bacharach theorem in classical geometry; see this previous blog post.)

**Exercise 13 (Addition law)** Show that for any lying in distinct cosets of , one has
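The identity to be proved in Exercise 13 was lost. It is presumably the classical addition formula for the Weierstrass function, stated here under the assumption that the two points are called $z$ and $w$ and that $z$, $w$, $-w$ represent distinct non-zero points of the torus, so that $\wp(z) \neq \wp(w)$.

```latex
\begin{equation*}
\wp(z + w) = \frac{1}{4} \left( \frac{\wp'(z) - \wp'(w)}{\wp(z) - \wp(w)} \right)^2
  - \wp(z) - \wp(w).
\end{equation*}
```

Geometrically, the quotient in brackets is the slope of the line through $(\wp(z), \wp'(z))$ and $(\wp(w), \wp'(w))$, which ties this formula to the collinearity statement of Exercise 12(i).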

**Exercise 14 (Special case of Riemann-Roch)**

- (i) Show that if two divisors are equivalent (in the sense of Exercise 7(iii)), then the vector spaces and are isomorphic (in particular, they have the same dimension).
- (ii) If is a divisor of some degree , show that the dimension of the space is zero if , equal to if , equal to if and has non-zero sum, and equal to if and has zero sum. (*Hint:* use Exercise 7(iii) and part (i) to replace with an equivalent divisor of a simple form.)
- (iii) Verify the identity for any divisor . This is a special case of the more general Riemann-Roch theorem, discussed in these 246C notes.

**Exercise 15 (Elliptic integrals)**

- (i) Show that is a covering map from to the thrice-punctured plane .
- (ii) Let be a contour in from some complex number to another complex number , and suppose that there is a holomorphic branch of the square root of in a neighbourhood of . Show that there exist complex numbers with , such that
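The identity that part (ii) leads into, referred to later as (12), did not survive. A sketch under assumptions: write $\zeta_1, \zeta_2$ for the endpoints of $\gamma$ and $z_1, z_2$ for the numbers to be found (these names are ours), and take the cubic under the square root to be the one from (8). The claim is then presumably the first line below; the second line records the substitution that motivates it.

```latex
\begin{align*}
\int_\gamma \frac{d\zeta}{\sqrt{4 \zeta^3 - 60 G_4 \zeta - 140 G_6}}
  &= z_2 - z_1,
  \qquad \wp(z_1) = \zeta_1, \quad \wp(z_2) = \zeta_2, \\
% substitute \zeta = \wp(z) and use (8)
d\zeta = \wp'(z)\, dz, \qquad
\sqrt{4 \wp(z)^3 - 60 G_4 \wp(z) - 140 G_6} &= \pm \wp'(z).
\end{align*}
```

The sign ambiguity is harmless: since $\wp$ is even and $\wp'$ is odd, replacing a lift $z$ of the contour by $-z$ flips the sign of $\wp'$ without changing $\wp$.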

**Remark 16** The integral is an example of an elliptic integral; many other elliptic integrals (such as the integral arising when computing the perimeter of an ellipse) can be transformed into this form (or into a closely related integral) by various elementary substitutions. Thus the Weierstrass elliptic function can be employed to evaluate elliptic integrals, which may help explain the terminology “elliptic” that occurs throughout these notes. In 246C notes we will introduce the notion of a meromorphic -form on a Riemann surface. The identity (12) can then be interpreted in this language as the differential form identity , where are the standard coordinates on the elliptic curve ; the meromorphic -form is initially only defined on outside of the four points , but this identity in fact reveals that the form extends holomorphically to all of ; it is an example of what is known as an Abelian differential of the first kind.

**Remark 17** The elliptic curve (for various choices of parameters ) can be defined in other fields than the complex numbers (though some technicalities arise in characteristic two and three due to the pathological behaviour of the discriminant in those cases). On the other hand, the Weierstrass elliptic function is a transcendental function which only exists in complex analysis and does not have a direct analogue in other fields. So this connection between elliptic curves and tori is specific to the complex field. Nevertheless, many facts about elliptic curves that were initially discovered over the complex numbers through this complex-analytic link to tori, were then reproven by purely algebraic means, so that they could be extended without much difficulty to many other fields than the complex numbers, such as finite fields. (For instance, the role of the complex torus can be replaced by the Jacobian variety, which was briefly introduced in Exercise 7.) Elliptic curves over such fields are of major importance in number theory (and cryptography), but we will not discuss these topics further here.

** — 2. Modular functions and modular forms — **

In Exercise 32 of 246A Notes 5, it was shown that two tori and are complex diffeomorphic if and only if one has

for some integers with . From this it is not difficult to see that if are two lattices in , then and are diffeomorphic if and only if for some , i.e., the lattices are complex dilations of each other.

Let us write for the set of all tori quotiented by the equivalence relation of complex diffeomorphism; this is the (classical, level one, noncompactified) modular curve. By the above discussion, this set can also be identified with the set of pairs of linearly independent (over ) complex numbers quotiented by the equivalence relation given implicitly by (13). One can simplify this a little by observing that any pair is equivalent to for some in the upper half-plane , namely either or depending on the relative phases of and ; this quantity is known as the *period ratio*. From (13) (swapping the roles of as necessary), we then see that two pairs are equivalent if one has

If we use the relation to write

we see that approaches the real line as if is non-zero; also, if is zero, then from we must have , and will either have imaginary part going off to infinity (if goes to infinity) or real part going to infinity (if is bounded and goes to infinity). In all cases we then conclude that goes to infinity as goes to infinity, uniformly for in any fixed compact subset of , which makes the action of on proper (for any compact set , one has intersecting for at most finitely many ).

If acted freely on (i.e., any element of other than the identity and negative identity has no fixed points in ), then the quotient would be a Riemann surface by the discussion in Section 2 of 246A Notes 5. Unfortunately, this is not quite true. For instance, the point is fixed by the Möbius transformation coming from the rotation matrix of , and the point is similarly fixed by the transformation coming from the matrix of . Geometrically, these fixed points come from the fact that the Gaussian integers are invariant with respect to rotation by , while the Eisenstein integers are invariant with respect to rotation by . On the other hand, these are basically the only two places where the action is not free:

**Exercise 18** Suppose that is an element of which is fixed by some element of which is not the identity or negative identity. Let be the lattice .

- (i) Show that obeys a dilation invariance for some complex number which is not real.
- (ii) Show that the dilation in part (i) must have magnitude one. (Hint: look at a non-zero element of of minimal magnitude.)
- (iii) Show that there is no rotation invariance with . (Hint: again, work with a non-zero element of of minimal magnitude, and use the fact that is closed under addition and subtraction. It may help to think geometrically and draw plenty of pictures.)
- (iv) Show that is equivalent to either the Gaussian lattice or the Eisenstein lattice , and conclude that the period ratio is equivalent to either or .

**Remark 19** The conformal map on the complex numbers preserves the Gaussian integers and thus descends to a conformal map from the Gaussian torus to itself; similarly the conformal map preserves the Eisenstein integers and thus descends to a conformal map from the Eisenstein torus to itself. These rare examples of complex tori equipped with additional conformal automorphisms are examples of tori (or elliptic curves) endowed with complex multiplication. There are additional examples of elliptic curves endowed with conformal *endomorphisms* that are still considered to have complex multiplication, and have a particularly nice algebraic number theory structure, but we will not pursue this topic further here.

**Remark 20** The fact that the action of on lattices contains fixed points is somewhat annoying, as it prevents one from immediately viewing the modular curve as a Riemann surface. However by passing to a suitable finite index subgroup of , one can remove these fixed points, leading to a theory that is cleaner in some respects. For instance, one can work with the congruence group , which roughly speaking amounts to decorating the lattices (or their tori ) with an additional “-marking” that eliminates the fixed points. This leads to a modification of the theory which is for instance well suited for studying theta functions; the role of the -invariant in the discussion below is then played by the modular lambda function , which also gives a uniformisation of the twice-punctured complex plane . However we will not develop this parallel theory further here.

If we let be the elements of not equivalent to or , and the equivalence class of tori not equivalent to the Gaussian torus or the Eisenstein torus , then can be viewed as the quotient of the Riemann surface by the free and proper action of , so it has the structure of a Riemann surface; can thus be thought of as the Riemann surface with two additional points added. Later on we will also add a third point (known as the *cusp*) to the Riemann surface to compactify it to .

A function on the modular curve can be thought of, equivalently, as a function that is -invariant in the sense that for all and , or equivalently that one has the identity

whenever and are integers with . Similarly if takes values in the Riemann sphere rather than . If is holomorphic (resp. meromorphic) on , this will in particular define a holomorphic (resp. meromorphic) function on , and morally on all of as well (although we have not yet defined a Riemann surface structure on all of ).

We define a modular function to be a meromorphic function on that obeys the condition (15), and which also has at most polynomial growth at the cusp in the sense that one has a bound of the form

for all with sufficiently large imaginary part, and some constants (this bound is needed for technical reasons to ensure “meromorphic” behaviour at the cusp , as opposed to an essential singularity). Specialising to the matrices we see that the condition (15) in particular implies the -periodicity and the inversion law for all . Conversely, these two special cases of (15) imply the general case:

**Exercise 21**

- (i) Let be two elements of with . Show that it is possible to transform the quadruplet to the quadruplet after a finite number of applications of the moves and (*Hint:* use the principle of infinite descent, applying the moves in a suitable order to decrease the lengths of and when the dot product is not too small, taking advantage of the Lagrange identity to determine when this procedure terminates. It may help to think geometrically and draw plenty of pictures.) Conclude that the two matrices (17) generate all of .
- (ii) Show that a function obeys (15) if and only if it obeys both (18) and (19).
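The numbered displays (15) to (19) used in this discussion and in Exercise 21 were lost. The following is a sketch of what they presumably are, with $f$ a function on the upper half-plane and $\tau$ the period ratio; the exact form of the growth bound (16) in particular is an assumption.

```latex
\begin{align*}
% (15): invariance under the modular group
f\left( \frac{a\tau + b}{c\tau + d} \right) &= f(\tau),
  \qquad a, b, c, d \in \mathbf{Z},\ ad - bc = 1, \\
% (16): at most polynomial growth in the nome q = e^{2\pi i \tau}
|f(\tau)| &\leq C e^{C \operatorname{Im}(\tau)}, \\
% (17): the two generating matrices
&\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad
 \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \\
% (18) and (19): the two special cases
f(\tau + 1) &= f(\tau), \qquad
f\left( -\frac{1}{\tau} \right) = f(\tau).
\end{align*}
```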

**Exercise 22 (Standard fundamental domain)** Define the *standard fundamental domain* for to be the set

- (i) Show that every lattice is equivalent (up to dilations) to a lattice with , with unique except when it lies on the boundary of , in which case the lack of uniqueness comes either from the pair for some , or from the pair for some . (*Hint:* arrange so that is a non-zero element of of minimal magnitude.)
- (ii) Show that can be identified with the fundamental domain after identifying with for , and with for . Show also that the set is then formed the same way, but first deleting the points from .

We will give some examples of modular functions (beyond the trivial example of constant functions) shortly, but let us first observe that when one differentiates a modular function one gets a more general class of function, known as a modular form. In more detail, observe from (14) that the derivative of the Möbius transformation is , and hence by the chain rule and (15) the derivative of a modular function would obey the variant law
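The displays used in this step were lost. A sketch, assuming that (14) is the partial-fraction form of the Möbius transformation (valid for $c \neq 0$ and $ad - bc = 1$) and that $f$ obeys (15):

```latex
\begin{align*}
% presumably (14)
\frac{a\tau + b}{c\tau + d} &= \frac{a}{c} - \frac{1}{c (c\tau + d)}, \\
% derivative of the Möbius transformation
\frac{d}{d\tau} \frac{a\tau + b}{c\tau + d}
  &= \frac{a(c\tau + d) - c(a\tau + b)}{(c\tau + d)^2}
   = \frac{1}{(c\tau + d)^2}, \\
% chain rule applied to (15)
f'\left( \frac{a\tau + b}{c\tau + d} \right) &= (c\tau + d)^2 f'(\tau).
\end{align*}
```

The first line can be checked by putting the right-hand side over the common denominator $c(c\tau + d)$ and using $ad - 1 = bc$.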

Motivated by this, we can define a (weakly) *modular form* of weight for any natural number to be a meromorphic function which obeys the modularity relation for all and all integers with (with the convention that for any non-zero complex ), and which is meromorphic at the cusp in the sense of (16). Thus for instance modular functions are weakly modular forms of weight . A *modular form* of weight is a weakly modular form of weight which is holomorphic (not just meromorphic) on , and also “holomorphic at ” in the sense that is bounded for large enough. Note that when viewed as a function of the nome , a modular form can be thought of as a certain type of holomorphic function on the disk (using the Riemann singularity removal theorem to remove the singularity at the origin ), while weakly modular forms (and in particular modular functions) are certain types of meromorphic functions on this disk. A modular form that vanishes at infinity is known as a *cusp form*.

**Exercise 23** Let be a natural number. Show that a function obeys (20) if and only if it is -periodic in the sense of (18) and obeys the law
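The modularity relation (20) and the law in Exercise 23 (cited later as (21)) were lost. In the standard normalisation, which we assume here, a weight $k$ form $f$ satisfies the following, with $q$ denoting the nome.

```latex
\begin{align*}
% presumably (20)
f\left( \frac{a\tau + b}{c\tau + d} \right) &= (c\tau + d)^k f(\tau),
  \qquad a, b, c, d \in \mathbf{Z},\ ad - bc = 1, \\
% presumably (21): the case a = 0, b = -1, c = 1, d = 0
f\left( -\frac{1}{\tau} \right) &= \tau^k f(\tau), \\
% expansion of a modular form in the nome
q &:= e^{2\pi i \tau}, \qquad f(\tau) = \sum_{n=0}^{\infty} a_n q^n .
\end{align*}
```

With $k = 0$ these reduce to (15) and (19), and a cusp form is one with $a_0 = 0$.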

**Exercise 24 (Lattice interpretation of modular forms)** Let be a modular form of weight . Show that there is a unique function from lattices to complex numbers such that

Observe that the product of a modular form of weight and a modular form of weight is a modular form of weight , and that the ratio of two modular forms of weight will be a modular function (if the denominator is not identically zero). Also, the space of modular forms of a given weight is a vector space, as is the space of modular functions. This suggests a way to generate non-trivial modular functions, by first locating some modular forms and then taking suitable rational combinations of these forms.

Somewhat analogously to how we used Lemma 3 to investigate the spaces for divisors on a torus, we will investigate the space of modular forms via the following basic formula:

**Theorem 25 (Valence formula)** Let be a modular form of weight , not identically zero. Then we have

Informally, this formula asserts that the point only “deserves” to be counted in with multiplicity due to its order stabiliser, while the point only “deserves” to be counted in with multiplicity due to its order stabiliser. (The cusp has an infinite stabiliser, but this is compensated for by taking the order with respect to the nome variable rather than the period ratio variable .) The general philosophy of weighting points by the reciprocal of the order of their stabiliser occurs throughout mathematics; see this blog post for more discussion.
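The formula (22) in Theorem 25 was lost. It is presumably the standard valence formula below. The notation is an assumption: $\rho := e^{2\pi i/3}$ (equivalent under $\tau \mapsto \tau + 1$ to $e^{\pi i/3}$), $\operatorname{ord}_\infty$ is the order of vanishing in the nome variable, and the sum runs over the remaining points of the modular curve, one representative per orbit.

```latex
\begin{equation*}
\operatorname{ord}_\infty(f)
  + \frac{1}{2} \operatorname{ord}_i(f)
  + \frac{1}{3} \operatorname{ord}_\rho(f)
  + \sum_{\tau \not\sim i, \rho} \operatorname{ord}_\tau(f)
  = \frac{k}{12}.
\end{equation*}
```

As a consistency check with the end of this section: for a weight $12$ form with no zeroes in the half-plane, the formula forces $\operatorname{ord}_\infty(f) = 1$, a simple zero at the cusp.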

*Proof:* Firstly, from Exercise 22, we can place all the zeroes in the fundamental domain . When parameterised in terms of the nome , this domain is compact, hence has only finitely many zeros, so the sum in (22) is finite.

As in the proof of Lemma 3(ii), we use the residue theorem. For simplicity, let us first suppose that there are no zeroes on the boundary of the fundamental domain except possibly at the cusp . Then for large enough, we have from the residue theorem that

where is the closed contour consisting of the polygonal path concatenated with the circular arc . From the -periodicity, the contributions of the two vertical edges and cancel each other out. The contribution of the horizontal edge can be written using the change of variables as which by the residue theorem is equal to . Finally, using the modularity (21), one calculates that the contribution of the left arc is equal to minus the contribution of the right arc . This gives the proof of the valence theorem in the case that there are no zeroes on the boundary of .

Suppose now that there is a zero on the right edge of , and hence also on the left edge by periodicity, for some . One can account for this zero by perturbing the contour to make a little detour to the right of (e.g., by a circular arc), and a matching detour to the right of . One can then verify that the same argument as before continues to work, with this boundary zero being counted exactly once. Similarly, if there is a zero on the left arc for some , and hence also at by modularity, one can make a detour slightly above and slightly below (with the two detours being related by the transform to ensure cancellation), and again we can argue as before. If instead there is a zero at , one makes an (approximately) semicircular detour above ; in this case the detour does not cancel out, but instead contributes a factor of in the limit as the radius of the detour goes to zero. Finally, if there is a zero at (and hence also at ), one makes detours by two arcs of angle approximately at these two points; these two (approximate) sixth-circles end up contributing a factor of in the limit, giving the claim.
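The quantities in the first part of this proof were lost, so here is a sketch of the two non-trivial contour contributions. The notation is assumed: $A_L$ is the arc $e^{i\theta}$ with $\theta$ decreasing from $2\pi/3$ to $\pi/2$, $A_R$ the arc with $\theta$ decreasing from $\pi/2$ to $\pi/3$, $R$ the height of the horizontal edge, and $F(q) = f(\tau)$ with $q = e^{2\pi i \tau}$.

```latex
\begin{align*}
% logarithmic derivative of f(-1/\tau) = \tau^k f(\tau)
\frac{f'}{f}\left( -\frac{1}{\tau} \right) \frac{1}{\tau^2}
  &= \frac{k}{\tau} + \frac{f'}{f}(\tau), \\
% \tau \mapsto -1/\tau maps A_L onto A_R with reversed orientation
\frac{1}{2\pi i} \int_{A_R} \frac{f'}{f}
  &= -\frac{1}{2\pi i} \int_{A_L} \left( \frac{k}{\tau} + \frac{f'}{f}(\tau) \right) d\tau, \\
\frac{1}{2\pi i} \left( \int_{A_L} + \int_{A_R} \right) \frac{f'}{f}
  &= -\frac{k}{2\pi i} \cdot i \left( \frac{\pi}{2} - \frac{2\pi}{3} \right)
   = \frac{k}{12}, \\
% horizontal edge: q runs once clockwise around a small circle
\frac{1}{2\pi i} \int_{1/2 + iR}^{-1/2 + iR} \frac{f'(\tau)}{f(\tau)}\, d\tau
  &= -\frac{1}{2\pi i} \oint_{|q| = e^{-2\pi R}} \frac{F'(q)}{F(q)}\, dq
   = -\operatorname{ord}_\infty(f).
\end{align*}
```

Since the vertical edges cancel, the residue theorem then gives that the number of zeroes inside the contour equals $k/12 - \operatorname{ord}_\infty(f)$, which is the valence formula when there are no zeroes on the boundary.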

**Exercise 26 (Quick applications of the valence formula)**

- (i) Let be a modular form of weight , not identically zero. Show that is equal to or an even number that is at least .
- (ii) (Liouville theorem for ) If is a modular form of weight zero, show that it is constant. (Hint: apply the valence theorem to various shifts of by constant.)
- (iii) For , show that the vector space of modular forms of weight is at most one dimensional. (Hint: in these cases, there are a very limited number of solutions to the equation with natural numbers.)
- (iv) Show that there are no cusp forms of weight when or , and for the space of cusp forms of weight is at most one dimensional.
- (v) Show that for any , the space of cusp forms of weight is a subspace of the space of modular forms of weight of codimension at most one, and that both spaces are finite-dimensional.

A basic example of modular forms are provided by the Eisenstein series

that we have already encountered for even integers greater than two (we ignore the odd Eisenstein series as they vanish). We can view this as a function on by the formula Observe that if are integers with , then using the matrix inverse in . Inserting this into (23), (24) we conclude that (Compare also with Exercise 24.) Also, from (23), (24) we have The series here is locally uniformly convergent for , so is holomorphic. Also, using the bounds for non-zero , while where is the famous Riemann zeta function we conclude on summing in and using the hypothesis that In particular, is bounded at infinity. Summarising, we have established that the Eisenstein series is a modular form of weight , which is not identically zero (since it approaches the non-zero value at the cusp ). Combining this with Exercise 26(iii), we see that we have completely classified the modular forms of weight for , namely they are the scalar multiples of . For instance, the coefficients and appearing in the previous section are modular forms of weight and weight respectively, and the modular discriminant from Exercise 10 is a modular form of weight . From that exercise, this modular form never vanishes on , hence by the valence formula it must have a simple zero at , and in particular is a cusp form. From Exercise 26 it is the unique cusp form of weight , up to constants.
**Exercise 27** Give an alternate proof that is a cusp form, not using the valence identity, by first establishing that and .

We can now create our first non-trivial modular function, the -invariant

The factor of is traditional, as it gives a nice normalisation at , as we shall see later. One can take advantage of complex multiplication to compute two special values immediately:

*Proof:* Using the rotation symmetry we see that , hence which implies that and hence . Similarly, using the rotation symmetry we have , hence . (One can also use the valence formulae to get the vanishing ).

Being modular, we can think of as a map from to . We have the following fundamental fact:

**Proposition 29** The map is a bijection.

*Proof:* Note that for any , if and only if is a zero of . It thus suffices to show that for every , the zeroes of the function in consist of precisely one orbit of . This function is a modular form of weight that does not vanish at infinity (since does not vanish while does). By the valence formula, we thus have

- has a simple zero at precisely one -orbit, not equivalent to or .
- has a double zero at (and equivalent points), and no other zeroes.
- has a triple zero at (and equivalent points), and no other zeroes.

Note that this proof also shows that has a double zero at and has a triple zero at , but that has a simple zero for any not equivalent to or .

We can now give the entire modular curve the structure of a Riemann surface by declaring to be the coordinate function. This is compatible with the existing Riemann surface structure on since was already holomorphic on this portion of the curve. Any modular function can then factor as for some meromorphic function that is initially defined on the punctured complex plane ; but from meromorphicity of on and at infinity we see that blows up at an at most polynomial rate as one approaches , , or , and so is in fact a meromorphic function on the entire Riemann sphere and is thus a rational function (Exercise 19 of 246A Notes 4). We conclude

**Proposition 30** Every modular function is a rational function of the -invariant .

Conversely, it is clear that every rational function of is modular, thus giving a satisfactory description of the modular functions.

**Exercise 31** Show that every modular function is the ratio of two modular forms of equal weight (with the denominator not identically zero).

**Exercise 32 (All elliptic curves are tori)** Let be two complex numbers with . Show that there is a lattice such that and , so in particular the elliptic curve

**Remark 33** By applying some elementary algebraic geometry transformations one can show that any (smooth, irreducible) cubic plane curve generated by a polynomial of degree is a Riemann surface complex diffeomorphic to a torus after adding some finite number of points at infinity; also, some degree curves such as

A famous application of the theory of the -invariant is to give a short Riemann surface-based proof of the little Picard theorem (first proven in Theorem 55 of 246A Notes 4):

**Theorem 34 (Little Picard theorem)** Let be entire and non-constant. Then omits at most one point of .

*Proof:* Suppose for contradiction that omits at least two points of . By applying a linear transformation, we may assume that omits the points and . Then is a holomorphic function from to . Since the domain is simply connected, lifts to a holomorphic function from to . Since is complex diffeomorphic to a disk, this lift must be constant by Liouville’s theorem, hence is constant as required. (This is essentially Picard’s original proof of this theorem.)

The great Picard theorem can also be proven by a more sophisticated version of these methods, but it requires some study of the possible behavior of elements of ; see Exercise 37 below.

All modular forms are -periodic, and hence by Proposition 1 should have a Fourier expansion, which is also a Laurent expansion in the nome. As it turns out, the Fourier coefficients often have a highly number-theoretic interpretation. This can be illustrated with the Eisenstein series ; here we follow the treatment in Stein-Shakarchi. To compute the Fourier coefficients we first need a computation:

**Exercise 35** Let and , and let be the nome. Establish the identity

- (i) By applying the Poisson summation formula (Proposition 3(v) of Notes 2).
- (ii) By first establishing the identity by applying Proposition 1 to the difference of the two sides, and differentiating in . (It is also possible to establish (27) from differentiating and then manipulating the identities in Exercises 25 or 27 of Notes 1.)

From (25), (26) (and symmetry) one has

and hence by the above exercise Since it is not difficult to show that the double sum here is absolutely convergent and can be rearranged as we please. If we group the terms based on the product we thus have the Fourier expansion where the divisor function is defined by where the sum is over those natural numbers that divide . Thus for instance and so after some calculation and therefore thus the factor of in the definition of the -invariant normalises the “residue” of at infinity to equal .
**Remark 36** If one expands out a few more terms in the above expansions, one can calculate
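
The first few coefficients of these expansions can be checked with exact integer arithmetic. The sketch below is only an illustration under stated assumptions: it works with the normalised Eisenstein series $E_4 = 1 + 240\sum_{n \geq 1} \sigma_3(n) q^n$ and $E_6 = 1 - 504\sum_{n \geq 1} \sigma_5(n) q^n$ (constant multiples of the weight 4 and weight 6 Eisenstein series), the normalised discriminant $(E_4^3 - E_6^2)/1728$, and $j = E_4^3 / ((E_4^3 - E_6^2)/1728)$. The helper names `sigma`, `mul` and `inverse` are hypothetical and chosen only for this example.

```python
# Truncated q-expansions with exact integer coefficients.
# Assumption: normalised Eisenstein series E4, E6 as described in the lead-in.

def sigma(k, n):
    """Divisor function: sum of d**k over the natural numbers d dividing n."""
    return sum(d**k for d in range(1, n + 1) if n % d == 0)

def mul(a, b, N):
    """Product of two power series (coefficient lists), truncated to N terms."""
    c = [0] * N
    for i, ai in enumerate(a[:N]):
        if ai == 0:
            continue
        for j, bj in enumerate(b[:N - i]):
            c[i + j] += ai * bj
    return c

def inverse(a, N):
    """Reciprocal of a power series with constant term 1, truncated to N terms."""
    inv = [0] * N
    inv[0] = 1
    for n in range(1, N):
        inv[n] = -sum(a[k] * inv[n - k] for k in range(1, n + 1))
    return inv

N = 8
E4 = [1] + [240 * sigma(3, n) for n in range(1, N)]
E6 = [1] + [-504 * sigma(5, n) for n in range(1, N)]

E4_cubed = mul(mul(E4, E4, N), E4, N)
E6_squared = mul(E6, E6, N)

# Normalised discriminant: delta[n] is the coefficient of q^n (delta[0] == 0).
delta = [(x - y) // 1728 for x, y in zip(E4_cubed, E6_squared)]

# j = E4^3 / delta; since delta = q * (1 - 24 q + ...), divide by delta / q
# and remember that the result is shifted by one power of q.
delta_over_q = delta[1:]
j_shifted = mul(E4_cubed, inverse(delta_over_q, N - 1), N - 1)

print("discriminant coefficients:", delta[1:6])
print("j coefficients of q^-1, q^0, q^1, q^2:", j_shifted[:4])
```

Here `j_shifted[m]` is the coefficient of $q^{m-1}$, so the leading entry is the "residue" of $j$ at infinity in this normalisation.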

**Exercise 37 (Great Picard theorem)**

- (i) Show that every fractional linear transformation on with , is either of finite order (elliptic case), conjugate to a translation for some after conjugating by another fractional linear transformation (parabolic case), or conjugate to a dilation for some after conjugating by another fractional linear transformation (hyperbolic case). (*Hint:* study the eigenvalues and eigenvectors of , based on the value of the trace and in particular whether the magnitude of the trace is less than two, equal to two, or greater than two. Note that the trace also has to be an integer.)
- (ii) Let be holomorphic. Show that there exists a holomorphic function such that for all , as well as a fractional linear transformation with and such that for all .
- (iii) If the transformation in (ii) is in the elliptic case of (i), show that is bounded in a neighbourhood of , and hence has a removable singularity at the origin. (*Hint:* will have some finite period and can thus be studied using Proposition 1 after applying a Möbius transform to map to a disk.)
- (iv) If the transformation in (ii) is in the hyperbolic case of (i), show that is bounded in a neighbourhood of , and hence has a removable singularity at the origin. (*Hint:* The standard branch of maps to an annulus, and is invariant with respect to the dilation action . Use this to create a bounded -periodic holomorphic function on .)
- (v) If the transformation in (ii) is in the parabolic case of (i), show that exhibits at most polynomial growth as one approaches , and hence has at most a pole at the origin. (*Hint:* If for instance , then is -periodic and takes values in , and one can now repeat the arguments of (iii). Also use the expansion (28).)
- (vi) Use the previous parts of this exercise to give another proof of the great Picard theorem (Theorem 56 of 246A Notes 4): the image of a holomorphic function in a punctured disk with an essential singularity at omits at most one value of .
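
The trichotomy in part (i) is driven entirely by the trace, as the hint suggests. The following is a minimal sketch of that bookkeeping, assuming the transformation is given by an integer matrix with rows `(a, b)` and `(c, d)` and determinant one; the function name `classify` is hypothetical, and the matrices plus or minus the identity (which act trivially) are reported separately.

```python
def classify(a, b, c, d):
    """Classify z -> (a z + b) / (c z + d), with integer entries and ad - bc = 1,
    by the magnitude of the trace a + d."""
    if a * d - b * c != 1:
        raise ValueError("expected an integer matrix of determinant one")
    t = abs(a + d)
    if t < 2:
        return "elliptic"      # trace 0 or +-1: finite order
    if t == 2:
        # +-identity acts trivially; everything else of trace +-2 is parabolic
        return "identity" if (b == 0 and c == 0) else "parabolic"
    return "hyperbolic"        # |trace| > 2: two distinct real eigenvalues

print(classify(0, -1, 1, 0))   # z -> -1/z
print(classify(1, 1, 0, 1))    # z -> z + 1
print(classify(2, 1, 1, 1))    # trace 3
```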

**Exercise 38 (Dimension of space of modular forms)**

- (i) If is an even natural number, show that the dimension of the space of modular forms of weight is equal to except when is equal to mod , in which case it is equal to . (*Hint:* for this follows from Exercise 26; to cover the larger ranges of , use the modular discriminant to show that the space of cusp forms of weight is isomorphic to the space of modular forms of weight .)
- (ii) If is an even natural number, show that a basis for the space of modular forms of weight is provided by the powers where range over natural numbers (including zero) with .
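
As a sanity check on how parts (i) and (ii) fit together, one can compare the two counts numerically. The sketch below assumes the standard form of the dimension formula, namely $\lfloor k/12 \rfloor + 1$ for even $k \geq 0$ except when $k \equiv 2 \pmod{12}$, where it is $\lfloor k/12 \rfloor$, and assumes the basis in (ii) is indexed by pairs of natural numbers $(a,b)$ with $4a + 6b = k$. Both function names are hypothetical.

```python
def dim_formula(k):
    """Assumed dimension formula for modular forms of even weight k >= 0."""
    if k < 0 or k % 2:
        raise ValueError("expected an even natural number")
    return k // 12 if k % 12 == 2 else k // 12 + 1

def count_monomials(k):
    """Number of pairs (a, b) of natural numbers with 4a + 6b = k."""
    return sum(1 for a in range(k // 4 + 1)
                 for b in range(k // 6 + 1)
                 if 4 * a + 6 * b == k)

for k in range(0, 40, 2):
    print(k, dim_formula(k), count_monomials(k))
```

The two columns agree for every even weight tried, which is consistent with (but of course does not prove) the claim that the monomials in (ii) form a basis.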

Thus far we have constructed modular forms and modular functions starting from Eisenstein series . There is another important, and seemingly quite different, way to generate modular forms coming from theta functions. Typically these functions are not *quite* modular in the sense given in these notes, but are close enough that after some manipulation one can transform theta functions into modular forms. The simplest example of a theta function is the Jacobi theta function

**Exercise 39** Define the Dedekind eta function by the formula

- (i) Establish the modified -periodicity and the modified modularity using the standard branch of the square root. (*Hint:* a direct application of Poisson summation applied to gives a sum that looks somewhat like but with different numerical constants (in particular, one sees terms like instead of arising). Split the index of summation into three components , , based on the residue classes modulo and rearrange each component separately.)
- (ii) Establish the identity (*Hint:* show that both sides are cusp forms of weight that vanish like near the cusp.)

**Remark 40** The relationship between and the power of the eta function can be interpreted (after some additional effort) as a relation between the modular discriminant and the theta function of a certain highly symmetric -dimensional lattice known as the Leech lattice, but we will not pursue this connection further here.

The function has a remarkable factorisation coming from Euler’s pentagonal number theorem

so that There are many proofs of the pentagonal number theorem in the literature. One approach is to first establish the more general Jacobi triple product identity:
**Theorem 41 (Jacobi triple product identity)** For any and , one has

Observe that by replacing by and with we have

and this gives the identity (31) after splitting the integers into the three residue classes modulo . One can obtain many further identities of this type by other substitutions; for instance, by setting in the triple product identity, one obtains
*Proof:* Let us denote the left-hand side and right-hand side of (33) by and respectively. For fixed , both sides are clearly holomorphic in , with . Our strategy in showing that and agree (following Stein-Shakarchi) is to first observe that they have many of the same periodicity properties. We clearly have -periodicity

**Remark 42** Another equivalent form of (32) is

Theta functions can be used to encode various number-theoretic quantities involving quadratic forms, such as sums of squares. For instance, from (30) and collecting terms one obtains the formula

for any natural number , where denotes the number of ways to express a natural number as the sum of squares of integers. From Fourier inversion (Proposition 1 and a rescaling) one then has a representation for any , which allows one to obtain asymptotics for when is large through estimation of the theta function (this is an example of the circle method); moreover, explicit identities relating the theta function to other near-modular forms (such as the Eisenstein series and their relatives) can be used to obtain exact formulae for for small values of that can be used for instance to establish the famous Lagrange four-square theorem that all natural numbers are the sum of four squares. We refer the reader to the Stein-Shakarchi text for an exposition of this connection.
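
The coefficient-collecting step can be made concrete with truncated power series. The sketch below is an illustration only: it treats the theta function as the formal series $\sum_{n \in \mathbf{Z}} x^{n^2}$ in a formal variable $x$, raises it to the $k$-th power, and reads off the number of representations as a sum of $k$ squares. It also compares the case $k = 4$ with Jacobi's classical formula $8 \sum_{d | n,\ 4 \nmid d} d$, which is quoted here as a known fact rather than derived in these notes. All function names are hypothetical.

```python
def theta_coeffs(N):
    """Coefficients of sum over integers n of x^(n^2), up to degree N."""
    coeffs = [0] * (N + 1)
    coeffs[0] = 1
    m = 1
    while m * m <= N:
        coeffs[m * m] = 2      # n = m and n = -m
        m += 1
    return coeffs

def multiply(a, b, N):
    """Product of two coefficient lists, truncated at degree N."""
    out = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j in range(N + 1 - i):
                out[i + j] += ai * b[j]
    return out

def sums_of_squares(k, N):
    """List whose n-th entry counts representations of n as a sum of k squares."""
    theta = theta_coeffs(N)
    result = [1] + [0] * N
    for _ in range(k):
        result = multiply(result, theta, N)
    return result

def jacobi_four_squares(n):
    """Jacobi's formula: 8 times the sum of the divisors of n not divisible by 4."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)

N = 60
r2 = sums_of_squares(2, N)
r4 = sums_of_squares(4, N)
print("r_2(5) =", r2[5], " r_2(3) =", r2[3])
print("every n <= N is a sum of four squares:", all(r4[n] > 0 for n in range(N + 1)))
```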
**Exercise 43 (Hecke operators)** Let be a natural number.

- (i) If is a modular form of weight , and is the corresponding function on lattices given by Exercise 24, and is a positive natural number, show that there is a unique modular form of weight whose corresponding function on lattices is related to by the formula where the sum ranges over all sublattices of whose index is equal to . Show that is a linear operator on the space of weight modular forms that also maps the space of weight cusp forms to itself; this operator is known as a Hecke operator.
- (ii) Give the more explicit formula
- (iii) Show that the Hecke operators all commute with each other, thus whenever is a modular form of weight and are positive natural numbers. Furthermore show that if are coprime.
- (iv) If is a modular form of weight with Fourier expansion , show that for any coprime positive integers that the coefficient of is equal to .
- (v) Establish the multiplicativity of the Ramanujan tau function (the Fourier coefficients of the modular discriminant). (*Hint:* use the one-dimensionality of the space of cusp forms of weight to conclude that is a simultaneous eigenfunction of the Hecke operators.)
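
The multiplicativity in part (v) is easy to test numerically before proving it. The sketch below assumes the product expansion $q \prod_{n \geq 1} (1-q^n)^{24}$ for the modular discriminant (normalised to have leading coefficient one, as suggested by the eta function identity of Exercise 39), and checks $\tau(mn) = \tau(m)\tau(n)$ for coprime $m, n$ in a small range. The function name `ramanujan_tau` is hypothetical.

```python
from math import gcd

def ramanujan_tau(N):
    """Return a list tau with tau[m] the coefficient of q^m in
    q * prod_{n >= 1} (1 - q^n)^24, for 1 <= m <= N (tau[0] is a placeholder)."""
    p = [0] * N
    p[0] = 1
    for n in range(1, N):
        for _ in range(24):
            # multiply the truncated series in place by (1 - q^n)
            for i in range(N - 1, n - 1, -1):
                p[i] -= p[i - n]
    return [0] + p

N = 40
tau = ramanujan_tau(N)
print([tau[m] for m in range(1, 8)])

violations = [(m, n) for m in range(1, N + 1) for n in range(1, N + 1)
              if m * n <= N and gcd(m, n) == 1 and tau[m * n] != tau[m] * tau[n]]
print("coprime pairs violating multiplicativity:", violations)
```

Note that the check is restricted to coprime pairs; for instance $\tau(4)$ is not equal to $\tau(2)^2$.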

*Hecke eigenfunctions* and are of major importance in number theory.

### 246B, Notes 2: Some connections with the Fourier transform

Previous set of notes: Notes 1. Next set of notes: Notes 3.

In Exercise 5 (and Lemma 1) of 246A Notes 4 we already observed some links between complex analysis on the disk (or annulus) and Fourier series on the unit circle:

- (i) Functions that are holomorphic on a disk are expressed by a convergent Fourier series (and also Taylor series) for (so in particular ), where conversely, every infinite sequence of coefficients obeying (1) arises from such a function .
- (ii) Functions that are holomorphic on an annulus are expressed by a convergent Fourier series (and also Laurent series) , where conversely, every doubly infinite sequence of coefficients obeying (2) arises from such a function .
- (iii) In the situation of (ii), there is a unique decomposition where extends holomorphically to , and extends holomorphically to and goes to zero at infinity, and are given by the formulae where is any anticlockwise contour in enclosing , and and where is any anticlockwise contour in enclosing but not .

This connection lets us interpret various facts about Fourier series through the lens of complex analysis, at least for some special classes of Fourier series. For instance, the Fourier inversion formula becomes the Cauchy-type formula for the Laurent or Taylor coefficients of , in the event that the coefficients are doubly infinite and obey (2) for some , or singly infinite and obey (1) for some .

It turns out that there are similar links between complex analysis on a half-plane (or strip) and Fourier *integrals* on the real line, which we will explore in these notes.

We first fix a normalisation for the Fourier transform. If is an absolutely integrable function on the real line, we define its Fourier transform by the formula

From the dominated convergence theorem will be a bounded continuous function; from the Riemann-Lebesgue lemma it also decays to zero as . My choice to place the in the exponent is a personal preference (it is slightly more convenient for some harmonic analysis formulae such as the identities (4), (5), (6) below), though in the complex analysis and PDE literature there are also some slight advantages in omitting this factor. In any event it is not difficult to adapt the discussion in these notes for other choices of normalisation. It is of interest to extend the Fourier transform beyond the class into other function spaces, such as or the space of tempered distributions, but we will not pursue this direction here; see for instance these lecture notes of mine for a treatment.

**Exercise 1 (Fourier transform of Gaussian)** If is a complex number with and is the Gaussian function , show that the Fourier transform is given by the Gaussian , where we use the standard branch for .
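
Before attempting the exercise it can be reassuring to test the claimed formula numerically. The sketch below assumes the normalisation $\hat f(\xi) = \int_{\mathbf{R}} f(x) e^{-2\pi i x \xi}\, dx$, the Gaussian $f(x) = e^{-\pi a x^2}$ with $\mathrm{Re}\, a > 0$, and the expected answer $a^{-1/2} e^{-\pi \xi^2 / a}$; the sample values of $a$ and $\xi$, the truncation length `L`, and the function name `fourier_transform` are arbitrary choices made for this illustration.

```python
import cmath
import math

def fourier_transform(f, xi, L=12.0, n=4800):
    """Trapezoid-rule approximation to the integral of f(x) exp(-2 pi i x xi)
    over [-L, L]; adequate for rapidly decaying analytic f."""
    h = 2 * L / n
    total = 0j
    for k in range(n + 1):
        x = -L + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * f(x) * cmath.exp(-2j * math.pi * x * xi)
    return total * h

a = 1 + 2j          # any complex number with positive real part
xi = 0.7

numeric = fourier_transform(lambda x: cmath.exp(-math.pi * a * x * x), xi)
# cmath.sqrt uses the principal (standard) branch of the square root
exact = cmath.exp(-math.pi * xi * xi / a) / cmath.sqrt(a)

print(numeric, exact, abs(numeric - exact))
```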

The Fourier transform has many remarkable properties. On the one hand, as long as the function is sufficiently “reasonable”, the Fourier transform enjoys a number of very useful identities, such as the Fourier inversion formula

the Plancherel identity and the Poisson summation formula On the other hand, the Fourier transform also intertwines various *qualitative* properties of a function with “dual” qualitative properties of its Fourier transform ; in particular, “decay” properties of tend to be associated with “regularity” properties of , and vice versa. For instance, the Fourier transforms of rapidly decreasing functions tend to be smooth. There are complex analysis counterparts of this Fourier dictionary, in which “decay” properties are described in terms of exponentially decaying pointwise bounds, and “regularity” properties are expressed using holomorphicity on various strips, half-planes, or the entire complex plane. The following exercise gives some examples of this:

**Exercise 2 (Decay of implies regularity of )** Let be an absolutely integrable function.

- (i) If has super-exponential decay in the sense that for all and (that is to say one has for some finite quantity depending only on ), then extends uniquely to an entire function . Furthermore, this function continues to be defined by (3).
- (ii) If is supported on a compact interval then the entire function from (i) obeys the bounds for . In particular, if is supported in then .
- (iii) If obeys the bound for all and some , then extends uniquely to a holomorphic function on the horizontal strip , and obeys the bound in this strip. Furthermore, this function continues to be defined by (3).
- (iv) If is supported on (resp. ), then there is a unique continuous extension of to the lower half-plane (resp. the upper half-plane) which is holomorphic in the interior of this half-plane, and such that uniformly as (resp. ). Furthermore, this function continues to be defined by (3).

**Hint:** to establish holomorphicity in each of these cases, use Morera’s theorem and the Fubini-Tonelli theorem. For uniqueness, use analytic continuation, or (for part (iv)) the Cauchy integral formula.

Later in these notes we will give a partial converse to part (ii) of this exercise, known as the Paley-Wiener theorem; there are also partial converses to the other parts of this exercise.

From (3) we observe the following intertwining property between multiplication by an exponential and complex translation: if is a complex number and is an absolutely integrable function such that the modulated function is also absolutely integrable, then we have the identity

whenever is a complex number such that at least one of the two sides of the equation in (7) is well defined. Thus, multiplication of a function by an exponential weight corresponds (formally, at least) to translation of its Fourier transform. By using contour shifting, we will also obtain a dual relationship: under suitable holomorphicity and decay conditions on , translation by a complex shift will correspond to multiplication of the Fourier transform by an exponential weight. It turns out to be possible to exploit this property to derive many Fourier-analytic identities, such as the inversion formula (4) and the Poisson summation formula (6), which we do later in these notes. (The Plancherel theorem can also be established by complex analytic methods, but this requires a little more effort; see Exercise 8.)

The material in these notes is loosely adapted from Chapter 4 of Stein-Shakarchi’s “Complex Analysis”.

** — 1. The inversion and Poisson summation formulae — **

We now explore how the Fourier transform of a function behaves when extends holomorphically to a strip. For technical reasons we will also impose a fairly mild decay condition on at infinity to ensure integrability. As we shall shortly see, the method of contour shifting then allows us to insert various exponentially decaying factors into Fourier integrals that make the justification of identities such as the Fourier inversion formula straightforward.

**Proposition 3 (Fourier transform of functions holomorphic in a strip)** Let , and suppose that is a holomorphic function on the strip which obeys a decay bound of the form

- (i) (Translation intertwines with modulation) For any in the strip , the Fourier transform of the function is .
- (ii) (Exponential decay of Fourier transform) For any , there is a quantity such that for all (or in asymptotic notation, one has for and ).
- (iii) (Partial Fourier inversion) For any and , one has and
- (iv) (Full Fourier inversion) For any , the identity (4) holds for this function .
- (v) (Poisson summation formula) The identity (6) holds for this function .

*Proof:* We begin with (i), which is a standard application of contour shifting. Applying the definition (3) of the Fourier transform, our task is to show that

For (ii), we apply (i) with to observe that the Fourier transform of is . Applying (8) and the triangle inequality we conclude that

for both choices of sign and all , giving the claim.

For the first part of (iii), we write . By part (i), we have , so we can rewrite the desired identity as

By (3) and Fubini’s theorem (taking advantage of (8) and the exponential decay of as ) the left-hand side can be written as But a routine calculation shows that giving the claim. The second part of (iii) is proven similarly.

To prove (iv), it suffices in light of (iii) to show that

for any . The left-hand side can be written after a change of variables as On the other hand, from dominated convergence as in the proof of (i) we have while from the Cauchy integral formula one has giving the claim.

Now we prove (v). Let . From (i) we have

and for any . If we sum the first identity for we see from the geometric series formula and Fubini’s theorem that and similarly if we sum the second identity for we have Adding these two identities and changing variables, we conclude that We would like to use the residue theorem to evaluate the right-hand side, but we need to take a little care to avoid the poles of the integrand , which are at the integers. Hence we shall restrict to be a half-integer . In this case, a routine application of the residue theorem shows that Noting that stays bounded for in or when is a half-integer, we also see from dominated convergence as before that The claim follows.
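
As a concrete instance of part (v), consider the function $f(x) = 1/(1+x^2)$, which is holomorphic on the strip $\{|\mathrm{Im}\, z| < 1\}$ and decays quadratically there on any slightly thinner strip. The sketch below assumes the normalisation $\hat f(\xi) = \int_{\mathbf{R}} f(x) e^{-2\pi i x \xi}\, dx$ and uses the standard closed form $\hat f(\xi) = \pi e^{-2\pi |\xi|}$ (a known fact, not derived in these notes) to compare the two sides of the Poisson summation formula. The cutoff `M` is an arbitrary choice; the physical-side sum converges slowly, with a tail of size at most $2/M$.

```python
import math

M = 200_000  # truncation of the slowly converging sum over n

# Physical side: sum of f(n) = 1 / (1 + n^2) over |n| <= M.
physical_side = 1.0 + 2.0 * sum(1.0 / (1.0 + n * n) for n in range(1, M + 1))

# Frequency side: sum of pi * exp(-2 pi |n|) over all integers n,
# evaluated with the geometric series formula.
r = math.exp(-2.0 * math.pi)
frequency_side = math.pi * (1.0 + 2.0 * r / (1.0 - r))

print(physical_side, frequency_side, abs(physical_side - frequency_side))
```

The exponential decay of the terms on the frequency side, in contrast to the polynomial decay on the physical side, is the behaviour predicted by part (ii) of the proposition.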
**Exercise 4 (Hilbert transform and Plemelj formula)** Let be as in Proposition 3. Define the Cauchy-Stieltjes transform by the formula

- (i) Show that is holomorphic on and has the Fourier representation in the upper half-plane and in the lower half-plane .
- (ii) Establish the Plemelj formulae and uniformly for any , where the Hilbert transform of is defined by the principal value integral
- (iii) Show that is the unique holomorphic function on that obeys the decay bound and solves the (very simple special case of the) Riemann-Hilbert problem uniformly for all , with both limits existing uniformly in .
- (iv) Establish the identity where the signum function is defined to equal for , for , and for .
- (v) Assume now that has mean zero (i.e., ). Show that extends holomorphically to the strip and obeys the bound (8) (but possibly with a different constant , and with replaced by a slightly smaller quantity ), with the identity holding for . (*Hint:* To exploit the mean zero hypothesis to get good decay bounds on , write as the sum of and and use the mean zero hypothesis to manage the first term. For the contribution of the second term, take advantage of contour shifting to avoid the singularity at . One may have to divide the integrals one encounters into a couple of pieces and estimate each piece separately.)
- (vi) Continue to assume that has mean zero. Establish the identities and (*Hint:* for the latter identity, square both sides of (9) and use (iii).)

**Exercise 5 (Kramers-Kronig relations)** Let be a continuous function on the upper half-plane which is holomorphic on the interior of this half-plane, and obeys the bound for all non-zero in this half-plane and some . Establish the Kramers-Kronig relations

**Exercise 6**

- (i) By applying the Poisson summation formula to the function , establish the identity for any positive real number . Explain why this is consistent with Exercise 24 from Notes 1.
- (ii) By carefully taking limits of (i) as , establish yet another alternate proof of Euler’s identity

**Exercise 7** For in the upper half-plane , define the theta function . Use Exercise 1 and the Poisson summation formula to establish the modular identity

**Exercise 8 (Fourier proof of Plancherel identity)** Let be smooth and compactly supported. For any with , define the quantity

- (i) When is real, show that . (*Hint:* find a way to rearrange the expression .)
- (ii) For non-zero, show that , where the implied constant in the notation can depend on . (*Hint:* integrate by parts several times.)
- (iii) Establish the Plancherel identity (5).

*transmission coefficient* of a Dirac operator with potential and spectral parameter (or , depending on normalisations).

The Fourier inversion formula was only established in Proposition 3 for functions that had a suitable holomorphic extension to a strip, but one can relax the hypotheses by a limiting argument. Here is one such example of this:

**Exercise 9 (More general Fourier inversion formula)** Let be continuous and obey the bound for all and some . Suppose that the Fourier transform is absolutely integrable.

- (i) Show that for any , the regularised function extends to an entire function obeying the hypotheses of Proposition 3 for any , with for any . (Hint: use Exercise 1.)
- (ii) Show that the Fourier inversion formula (4) holds for this function .

**Exercise 10 (Laplace inversion formula)** Let be a continuously twice differentiable function, obeying the bounds for all and some .

- (i) Show that the Fourier transform obeys the asymptotic for any non-zero .
- (ii) Establish the principal value inversion formula for any positive real . (*Hint:* modify the proof of Exercise 9(ii).) What happens when is negative? zero?
- (iii) Define the Laplace transform of for by the formula Show that is continuous on the half-plane , holomorphic on the interior of this half-plane, and obeys the *Laplace-Mellin inversion formula* for any and , where is the line segment contour from to . Conclude in particular that the Laplace transform is injective on this class of functions .

*Bromwich integral*, and often written (with a slight abuse of notation) as . The Laplace transform is a close cousin of the Fourier transform that has many uses; for instance, it is a popular tool for analysing ordinary differential equations on half-lines such as .

**Exercise 11 (Mellin inversion formula)** Let be a continuous function that is compactly supported in . Define the *Mellin transform* by the formula

*Mellin inversion formula* for any and . The regularity and support hypotheses on can be relaxed significantly, but we will not pursue this direction here.

**Exercise 12 (Perron’s formula)** Let be a function which is of subpolynomial growth in the sense that for all and , where depends on (and ). For in the half-plane , form the Dirichlet series

**Exercise 13 (Solution to Schrödinger equation)** Let be as in Proposition 3. Define the function by the formula

$$ u(t,x) := \int_{\mathbf{R}} \hat f(\xi) e^{2\pi i x \xi - 4 \pi^2 i \xi^2 t}\, d\xi. $$

- (i) Show that is a smooth function of that obeys the Schrödinger equation with initial condition for .
- (ii) Establish the formula for and , where we use the standard branch of the square root.

** — 2. Phragmén-Lindelöf and Paley-Wiener — **

The maximum modulus principle (Exercise 26 from 246A Notes 1) for holomorphic functions asserts that if a function continuous on a compact subset of the plane and holomorphic on the interior of that set is bounded in magnitude by a bound on the boundary , then it is also bounded by on the interior. This principle does not directly apply for noncompact domains : for instance, on the entire complex plane , there is no boundary whatsoever and the bound is clearly vacuous. On the half-plane , the holomorphic function (for instance) is bounded in magnitude by on the boundary of the half-plane, but grows exponentially in the interior. Similarly, in the strip , the holomorphic function is bounded in magnitude by on the boundary of the strip, but grows double-exponentially in the interior of the strip. However, if one does not have such absurdly high growth, one can recover a form of the maximum principle, known as the Phragmén-Lindelöf principle. Here is one formulation of this principle:

**Theorem 14 (Lindelöf’s theorem)** Let be a continuous function on a strip for some , which is holomorphic in the interior of the strip and obeys the bound

**Remark 15** The hypothesis (12) is a qualitative hypothesis rather than a quantitative one, since the exact values of do not show up in the conclusion. It is quite a mild condition; any function of exponential growth in , or even with such super-exponential growth as or , will obey (12). The principle however fails without this hypothesis, as discussed previously.
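
The failure of the principle for very rapidly growing functions is easy to see numerically. The sketch below uses the function $\exp(\exp(z))$ on the strip $\{ -\pi/2 \leq \mathrm{Im}\, z \leq \pi/2 \}$; this specific choice of function and strip is an assumption made for illustration (it is a standard example of the double-exponential phenomenon discussed earlier). On the edges of the strip the inner exponential is purely imaginary, so the function has magnitude one there, while on the real axis it grows like $\exp(e^x)$.

```python
import cmath
import math

def F(z):
    """Assumed example: a double exponential, bounded on the edges of the strip."""
    return cmath.exp(cmath.exp(z))

xs = [0.0, 1.0, 2.0, 3.0]

# Magnitudes on the upper edge Im z = pi/2 and on the centre line Im z = 0.
on_boundary = [abs(F(complex(x, math.pi / 2))) for x in xs]
in_interior = [abs(F(complex(x, 0.0))) for x in xs]

for x, b, i in zip(xs, on_boundary, in_interior):
    print(f"x = {x}: boundary {b:.12f}, interior {i:.6e}")
```

The boundary values stay at one (up to rounding) while the interior values blow up, so no bound on the interior can follow from the boundary bound alone.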

*Proof:* By shifting and dilating (adjusting as necessary) we can reduce to the case , , and by multiplying by a constant we can also normalise .

Suppose we temporarily assume that as . Then on a sufficiently large rectangle , we have on the boundary of the rectangle, hence on the interior by the maximum modulus principle. Sending , we obtain the claim.

To remove the assumption that goes to zero at infinity, we use the trick of giving ourselves an epsilon of room. Namely, we multiply by the holomorphic function for some . A little complex arithmetic shows that the function goes to zero at infinity in . Applying the previous case to this function, then taking limits as , we obtain the claim.

**Corollary 16 (Phragmén-Lindelöf principle in a sector)** Let be a continuous function on a sector for some , which is holomorphic on the interior of the sector and obeys the bound

*Proof:* Apply Theorem 14 to the function on the strip .

**Exercise 17** With the notation and hypotheses of Theorem 14, show that the function is log-convex on .

**Exercise 18 (Hadamard three-circles theorem)** Let be a holomorphic function on an annulus . Show that the function is log-convex on .

**Exercise 19 (Phragmén-Lindelöf principle)** Let $f$ be as in Theorem 14 with $a=0$, $b=1$, but with the hypotheses after “Suppose also” in that theorem replaced instead by the bounds $|f(it)| \leq C(1+|t|)^{a_0}$ and $|f(1+it)| \leq C(1+|t|)^{a_1}$ for all $t \in \mathbf{R}$ and some exponents $a_0, a_1 \in \mathbf{R}$ and a constant $C > 0$. Show that one has $|f(\sigma+it)| \leq C' (1+|t|)^{(1-\sigma) a_0 + \sigma a_1}$ for all $\sigma+it \in S$ and some constant $C'$ (which is allowed to depend on the constants $A, B, \delta$ in (12), as well as $C, a_0, a_1$). (Hint: it is convenient to work first in a half-strip such as $\{ \sigma+it \in S: t \geq T \}$ for some large $T$. Then multiply $f$ by something like $\exp( -((1-z) a_0 + z a_1) \log(-iz) )$ for some suitable branch of the logarithm and apply a variant of Theorem 14 for the half-strip. A more refined estimate in this regard is due to Rademacher.) This particular version of the principle gives the *convexity bound* for Dirichlet series such as the Riemann zeta function. Bounds which exploit the deeper properties of these functions to improve upon the convexity bound are known as *subconvexity bounds* and are of major importance in analytic number theory, which is of course well outside the scope of this course.

Now we can establish a remarkable converse of sorts to Exercise 2(ii), known as the Paley-Wiener theorem, which links the exponential growth of (the analytic continuation of) a function with the support of its Fourier transform:

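**Theorem 20 (Paley-Wiener theorem)** Let $f: \mathbf{R} \to \mathbf{C}$ be a continuous function obeying the decay condition

$$ |f(x)| \leq \frac{C}{1+|x|^2} \ \ \ \ \ (13)$$

for all real $x$ and some $C > 0$. Let $M > 0$. Then the following are equivalent:

- (i) The Fourier transform $\hat f$ is supported on $[-M, M]$.
- (ii) $f$ extends analytically to an entire function that obeys the bound $|f(z)| \leq A e^{2\pi M |z|}$ for some $A > 0$.
- (iii) $f$ extends analytically to an entire function that obeys the bound $|f(z)| \leq A e^{2\pi M |\mathrm{Im}\, z|}$ for some $A > 0$.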

The continuity and decay hypotheses on $f$ can be relaxed, but we will not explore such generalisations here.
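As a sanity check on the equivalence of (i) and (iii), one can test a standard example numerically. The sketch below assumes the normalisation $\hat f(\xi) = \int_{\mathbf{R}} f(x) e^{-2\pi i x \xi}\, dx$ and takes $f(x) = (\sin(\pi x)/(\pi x))^2$, whose Fourier transform is the triangle function supported on $[-1,1]$, so that $M = 1$; the function names and the sampling grid are illustrative only.

```python
import cmath
import math

def sinc_squared(z):
    """f(z) = (sin(pi z) / (pi z))^2, extended by f(0) = 1."""
    if z == 0:
        return complex(1.0)
    w = math.pi * z
    return (cmath.sin(w) / w) ** 2

M = 1.0  # assumed support radius: the transform of sinc^2 is the triangle on [-1, 1]

def normalised(z):
    """|f(z)| * exp(-2 pi M |Im z|), which a bound of type (iii) predicts stays bounded."""
    return abs(sinc_squared(z)) * math.exp(-2.0 * math.pi * M * abs(z.imag))

# Sample the square |Re z|, |Im z| <= 10 with spacing 1/4.
grid = [complex(x / 4.0, y / 4.0) for x in range(-40, 41) for y in range(-40, 41)]
worst = max(normalised(z) for z in grid)

# Without the exponential normalisation the function is enormous off the real axis.
raw_far_from_axis = abs(sinc_squared(complex(0.0, 10.0)))
```

Here `worst` stays of size about $1$ on the whole grid, while `raw_far_from_axis` is astronomically large, which matches growth of exactly the type $e^{2\pi M |\mathrm{Im}\, z|}$ with $M = 1$.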

*Proof:* If (i) holds, then by Exercise 9, we have the inversion formula (4), and the claim (iii) then holds by a slight modification of Exercise 2(ii). Also, the claim (iii) clearly implies (ii).

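Now we see why (iii) implies (i). We first assume that we have the stronger bound

$$ |f(z)| \leq A \frac{e^{2\pi M |\mathrm{Im}\, z|}}{1+|z|^2} \ \ \ \ \ (14)$$

for $z \in \mathbf{C}$. Then we can apply Proposition 3 for any $a > 0$, and conclude in particular that $\hat f(\xi) = \int_{\mathbf{R}} f(x+iy) e^{-2\pi i (x+iy) \xi}\, dx$ for any $\xi \in \mathbf{R}$ and $y \in \mathbf{R}$. Applying (14) and the triangle inequality, we see that $|\hat f(\xi)| \leq A' e^{2\pi M |y|} e^{2\pi y \xi}$ for some $A'$ independent of $y$. If $\xi > M$, we can then send $y \to -\infty$ and conclude that $\hat f(\xi) = 0$; similarly for $\xi < -M$ we can send $y \to +\infty$ and again conclude $\hat f(\xi) = 0$. This establishes (i) in this case.

Now suppose we only have the weaker bound on $f$ assumed in (iii). We again use the epsilon of room trick. For any $\varepsilon > 0$, we consider the modified function $f_\varepsilon(z) := f(z) / (1 + i \varepsilon z)^2$. This is still holomorphic on the lower half-plane $\{ z: \mathrm{Im}\, z \leq 0 \}$ and obeys a bound of the form (14) on this half-plane. An inspection of the previous arguments shows that we can still show that $\hat f_\varepsilon(\xi) = 0$ for $\xi > M$ despite no longer having holomorphicity on the entire upper half-plane; sending $\varepsilon \to 0$ using dominated convergence we conclude that $\hat f(\xi) = 0$ for $\xi > M$. A similar argument (now using $1 - i \varepsilon z$ in place of $1 + i \varepsilon z$) shows that $\hat f(\xi) = 0$ for $\xi < -M$. This proves (i).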

Finally, we show that (ii) implies (iii). The function $f(z) e^{2\pi i M z}$ is entire, bounded on the real axis by (13), bounded on the upper imaginary axis by (ii), and has exponential growth. By Corollary 16 (applied to the first and second quadrants separately), it is also bounded on the upper half-plane, which gives (iii) in the upper half-plane as well. A similar argument (using $e^{-2\pi i M z}$ in place of $e^{2\pi i M z}$) also yields (iii) in the lower half-plane.

** — 3. The Hardy uncertainty principle — **

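Informally speaking, the uncertainty principle for the Fourier transform asserts that a function $f$ and its Fourier transform $\hat f$ cannot simultaneously be strongly localised, except in the degenerate case when $f$ is identically zero. There are many rigorous formulations of this principle. Perhaps the best known is the *Heisenberg uncertainty principle*

$$ \left( \int_{\mathbf{R}} (x-x_0)^2 |f(x)|^2\, dx \right)^{1/2} \left( \int_{\mathbf{R}} (\xi-\xi_0)^2 |\hat f(\xi)|^2\, d\xi \right)^{1/2} \geq \frac{1}{4\pi} \int_{\mathbf{R}} |f(x)|^2\, dx, $$

valid for all $x_0, \xi_0 \in \mathbf{R}$ and (say) all Schwartz functions $f$, with the Fourier transform normalised as $\hat f(\xi) = \int_{\mathbf{R}} f(x) e^{-2\pi i x \xi}\, dx$.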

Another manifestation of the uncertainty principle is the following simple fact:

**Lemma 21**

- (i) If $f$ is an integrable function that has exponential decay in the sense that one has $|f(x)| \leq C e^{-a|x|}$ for all $x \in \mathbf{R}$ and some $C, a > 0$, then the Fourier transform $\hat f$ is either identically zero, or only has isolated zeroes (that is to say, the set $\{ \xi \in \mathbf{R}: \hat f(\xi) = 0 \}$ is discrete).
- (ii) If $f$ is a compactly supported continuous function such that $\hat f$ is also compactly supported, then $f$ is identically zero.

*Proof:* For (i), we observe from Exercise 2(iii) that $\hat f$ extends holomorphically to a strip around the real axis, and the claim follows since non-zero holomorphic functions have isolated zeroes. For (ii), we observe from (i) that $\hat f$ must be identically zero, and the claim now follows from the Fourier inversion formula (Exercise 9).

Lemma 21(ii) rules out the existence of a bump function whose Fourier transform is also a bump function, which would have been a rather useful tool to have in harmonic analysis over the reals. (Such functions do exist however in some non-archimedean domains, such as the $p$-adics.) On the other hand, from Exercise 1 we see that we do at least have gaussian functions whose Fourier transform also decays as a gaussian. Unfortunately this is basically the best one can do:

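**Theorem 22 (Hardy uncertainty principle)** Let $f$ be a continuous function which obeys the bound $|f(x)| \leq C e^{-\pi a x^2}$ for all $x \in \mathbf{R}$ and some $C, a > 0$. Suppose also that $|\hat f(\xi)| \leq C' e^{-\pi \xi^2 / a}$ for all $\xi \in \mathbf{R}$ and some $C' > 0$. Then $f$ is a scalar multiple of the gaussian $e^{-\pi a x^2}$, that is to say one has $f(x) = c e^{-\pi a x^2}$ for some $c \in \mathbf{C}$.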

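*Proof:* By replacing $f$ with the rescaled version $x \mapsto f(x/\sqrt{a})$, which replaces $\hat f$ with the rescaled version $\xi \mapsto \sqrt{a} \hat f(\sqrt{a} \xi)$, we may normalise $a = 1$. By multiplying $f$ by a small constant we may also normalise $C = C' = 1$.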

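From Exercise 2(i), $\hat f$ extends to an entire function. By the triangle inequality, we can bound

$$ |\hat f(\xi + i \eta)| \leq \int_{\mathbf{R}} e^{-\pi x^2} e^{2\pi x \eta}\, dx $$

for any $\xi, \eta \in \mathbf{R}$. Completing the square and using $\int_{\mathbf{R}} e^{-\pi x^2}\, dx = 1$, we conclude the bound

$$ |\hat f(\xi + i\eta)| \leq e^{\pi \eta^2}. $$

In particular, if we introduce the normalised function $F(z) := e^{\pi z^2} \hat f(z)$, then

$$ |F(\xi+i\eta)| \leq e^{\pi \xi^2}. \ \ \ \ \ (15)$$

In particular, $F$ is bounded by $1$ on the imaginary axis. On the other hand, from hypothesis $F$ is also bounded by $1$ on the real axis. We can now *almost* invoke the Phragmén-Lindelöf principle (Corollary 16) to conclude that $F$ is bounded on all four quadrants, but the growth bound (15) that we have is just barely too weak. To get around this we use the epsilon of room trick. For any $\varepsilon > 0$, the function $F_\varepsilon(z) := e^{\pi i \varepsilon z^2} F(z)$ is still entire, and is still bounded by $1$ in magnitude on the real line. From (15) we have

$$ |F_\varepsilon(\xi + i \eta)| \leq e^{\pi \xi^2 - 2\pi \varepsilon \xi \eta}, $$

so in particular it is bounded by $1$ on the slightly tilted imaginary axis $(2\varepsilon + i) \mathbf{R}$. We can now apply Corollary 16 in the two acute-angle sectors between $(2\varepsilon+i)\mathbf{R}$ and $\mathbf{R}$ to conclude that $|F_\varepsilon(z)| \leq 1$ in those two sectors; letting $\varepsilon \to 0$, we conclude that $|F(z)| \leq 1$ in the first and third quadrants. A similar argument (using negative values of $\varepsilon$) shows that $|F(z)| \leq 1$ in the second and fourth quadrants. By Liouville’s theorem, we conclude that $F$ is constant, thus we have $\hat f(z) = c e^{-\pi z^2}$ for some complex number $c$. The claim now follows from the Fourier inversion formula (Proposition 3(iv)) and Exercise 1.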
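Two computations in this proof are easy to confirm numerically. The sketch below assumes the normalisation $a = 1$, the completed-square identity $\int_{\mathbf{R}} e^{-\pi x^2} e^{2\pi x \eta}\, dx = e^{\pi \eta^2}$, and a multiplier of the form $e^{\pi i \varepsilon z^2}$ in the epsilon of room step; the function names, step sizes, and sample points are hypothetical choices for illustration.

```python
import cmath
import math

def gaussian_tilt_integral(eta, half_width=12.0, step=0.01):
    """Trapezoid-rule value of the integral of exp(-pi x^2 + 2 pi x eta) over the real line,
    truncated to [eta - half_width, eta + half_width], where the integrand is concentrated."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for k in range(n + 1):
        x = eta - half_width + k * step
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(-math.pi * x * x + 2.0 * math.pi * x * eta)
    return total * step

def tilt_exponent(xi, eta, eps):
    """The exponent pi*xi^2 - 2*pi*eps*xi*eta in the assumed bound on |F_eps(xi + i eta)|."""
    return math.pi * xi * xi - 2.0 * math.pi * eps * xi * eta

# Completing the square: the integral should equal exp(pi * eta^2).
ratio = gaussian_tilt_integral(1.5) / math.exp(math.pi * 1.5 ** 2)

# Magnitude of the multiplier exp(i pi eps z^2) at z = xi + i eta is exp(-2 pi eps xi eta).
eps, xi, eta = 0.1, 0.7, -0.4
multiplier_abs = abs(cmath.exp(1j * math.pi * eps * complex(xi, eta) ** 2))
predicted_abs = math.exp(-2.0 * math.pi * eps * xi * eta)

# On the tilted axis xi = 2 * eps * eta the exponent vanishes, giving the bound 1.
on_tilted_axis = tilt_exponent(2 * 0.1 * 7.0, 7.0, 0.1)
```

The vanishing of the exponent on the line $\xi = 2\varepsilon\eta$ is what makes the tilted axis usable as one edge of a sector of angle strictly less than $\pi/2$, where quadratic-exponential growth is permitted by Corollary 16.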

One corollary of this theorem is that if $f$ is continuous and decays like $e^{-\pi a x^2}$ or better, then $\hat f$ cannot decay any faster than $e^{-\pi \xi^2/a}$ without vanishing identically. This is a stronger version of Lemma 21(ii). There is a more general tradeoff known as the Gel’fand-Shilov uncertainty principle, which roughly speaking asserts that if $f$ decays like $e^{-\pi a |x|^p}$ then $\hat f$ cannot decay faster than $e^{-\pi b |\xi|^q}$ without $f$ vanishing identically, whenever $1 < p, q < \infty$ are dual exponents in the sense that $\frac{1}{p} + \frac{1}{q} = 1$, and $a^{1/p} b^{1/q}$ is large enough (the precise threshold was established in work of Morgan). See for instance this article of Nazarov for further discussion of these variants.

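**Exercise 23** If $f$ is continuous and obeys the bound $|f(x)| \leq C (1+|x|)^M e^{-\pi a x^2}$ for some $M \geq 0$ and $C, a > 0$ and all $x \in \mathbf{R}$, and $\hat f$ obeys the bound $|\hat f(\xi)| \leq C' (1+|\xi|)^M e^{-\pi \xi^2/a}$ for some $C' > 0$ and all $\xi \in \mathbf{R}$, show that $f$ is of the form $f(x) = P(x) e^{-\pi a x^2}$ for some polynomial $P$ of degree at most $M$.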

**Exercise 24** In this exercise we develop an alternate proof of (a special case of) the Hardy uncertainty principle, which can be found in the original paper of Hardy. Let the hypotheses be as in Theorem 22.

- (i) Show that the function is holomorphic on the region and obeys the bound in this region, where we use the standard branch of the square root.
- (ii) Show that the function is holomorphic on the region and obeys the bound in this region.
- (iii) Show that and agree on their common domain of definition.
- (iv) Show that the functions are constant. (You may find Exercise 13 from 246A Notes 4 to be useful.)
- (v) Use the above to give an alternate proof of Theorem 22 in the case when is even. (*Hint:* subtract a constant multiple of a gaussian from to make vanish, and conclude on Taylor expansion around the origin that all the even moments vanish. Conclude that the Taylor series coefficients of around the origin all vanish.)

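**Exercise 25** (This problem is due to Tom Liggett; see this previous post.) Let $a_0, a_1, a_2, \dots$ be a sequence of complex numbers bounded in magnitude by some bound $M$, and suppose that the power series $f(t) := \sum_{n=0}^\infty a_n \frac{t^n}{n!}$ obeys the bound $|f(t)| \leq C e^{-t}$ for all $t \geq 0$ and some $C > 0$.

- (i) Show that the Laplace transform $\mathcal{L} f(s) := \int_0^\infty f(t) e^{-st}\, dt$ extends holomorphically to the region $\{ s: \mathrm{Re}\, s > -1 \}$ and obeys the bound $|\mathcal{L} f(s)| \leq \frac{C}{1 + \mathrm{Re}\, s}$ in this region.
- (ii) Show that the function $s \mapsto \sum_{n=0}^\infty \frac{a_n}{s^{n+1}}$ is holomorphic in the region $\{ s: |s| > 1 \}$, obeys the bound $\frac{M}{|s|-1}$ in this region, and agrees with $\mathcal{L} f$ on the common domain of definition.
- (iii) Show that $\mathcal{L} f(s)$ is a constant multiple of $\frac{1}{s+1}$.
- (iv) Show that the sequence $a_n$ is a constant multiple of the sequence $(-1)^n$.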
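The conclusion of this exercise can be illustrated numerically. The sketch below assumes the power series in question is the exponential generating function $\sum_n a_n t^n/n!$ and the decay hypothesis is $|f(t)| \leq C e^{-t}$ for $t \geq 0$; the helper `exp_series`, the truncation length `N`, and the sample point `t` are illustrative choices.

```python
import math

def exp_series(coeffs, t):
    """Partial sum of sum_n a_n t^n / n! for the finite list coeffs = [a_0, a_1, ...]."""
    total, term = 0.0, 1.0  # term holds t^n / n!
    for n, a in enumerate(coeffs):
        total += a * term
        term *= t / (n + 1)
    return total

N = 80
alternating = [(-1) ** n for n in range(N)]  # the extremal sequence from part (iv)
constant = [1] * N                            # bounded, but violates the decay hypothesis

t = 5.0
f_alt = exp_series(alternating, t)    # approximates exp(-t)
f_const = exp_series(constant, t)     # approximates exp(+t)
```

The alternating sequence reproduces $e^{-t}$ and so meets the decay hypothesis with $C = 1$, whereas the constant sequence, although bounded, gives $e^{t}$ and fails it badly; the exercise asserts that multiples of $(-1)^n$ are the only bounded sequences that pass.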

**Remark 26** There are many further variants of the Hardy uncertainty principle. For instance we have the following uncertainty principle of Beurling, which we state in a strengthened form due to Bonami, Demange, and Jaming: if $f$ is a square-integrable function such that $\int_{\mathbf{R}} \int_{\mathbf{R}} \frac{|f(x)| |\hat f(\xi)|}{(1+|x|+|\xi|)^N} e^{2\pi |x \xi|}\, dx\, d\xi < \infty$ for some $N \geq 0$, then $f$ is equal (almost everywhere) to a polynomial times a gaussian; it is not difficult to show that this implies Theorem 22 and Exercise 23, as well as the Gel’fand-Shilov uncertainty principle. In recent years, PDE-based proofs of the Hardy uncertainty principle have been established, which have then been generalised to establish uncertainty principles for various Schrödinger type equations; see for instance this review article of Kenig. I also have some older notes on the Hardy uncertainty principle in this blog post. Finally, we mention the *Beurling-Malliavin theorem*, which provides a precise description of the possible decay rates of a function whose Fourier transform is compactly supported; see for instance this paper of Mashreghi, Nazarov, and Khavin for a modern treatment.