# Mathematical blogs

### Some slides from Basel

I have just returned from Basel, Switzerland, on the occasion of the awarding of the 2019 Ostrowski prize to Assaf Naor. I was invited to give the laudatio for Assaf’s work, which I have uploaded here. I also gave a public lecture (aimed at the high school level) at the University of Basel entitled “The Notorious Collatz conjecture”; I have uploaded the slides for that here. (Note that the slides here are somewhat unpolished, as I was not initially planning to make them public until I was recently requested to do so. In particular I do not have full attribution for some of the images used in the slides.)

Basel has historically been home to a number of very prominent mathematicians, most notably Jacob Bernoulli, whose headstone I saw at the Basel Minster, and also Leonhard Euler, for whom I could not find a formal memorial, though I did at least see a hotel bearing his name.

### Decrease of Fourier coefficients of stationary measures on the circle (after Jialun Li)

On January 8, 2020, Jialun Li gave the talk “*Decrease of Fourier coefficients of stationary measures on the circle*” in the “flat seminar” that I co-organize with Anton Zorich once per month.

In this post, I’ll transcribe my notes of this nice talk (while taking full responsibility for any errors/mistakes in what follows).

**1. Introduction**

**1.1. Stationary measures**

Consider the linear action of on induces an action on the projective space . For later use, recall that via

Given a probability measure on , we can build a Markov chain / random walk whose steps consist of taking points into where is chosen according to the law of .

The absence of hypothesis on might lead to uninteresting random walks: in fact, if a point is stabilized by two elements , then the random walk starting at associated to is not very interesting.

For this reason, we shall assume that

Hypothesis (i): the support of generates a *Zariski-dense* semigroup .

**Remark 1** *By the Tits alternative, in our current setting of , the hypothesis (i) can be reformulated by replacing “Zariski-dense” with “not solvable”.*

As was famously established by Furstenberg, the random walks associated to have a well-defined asymptotic behaviour whenever (i) is fulfilled:

**Theorem 1 (Furstenberg)** *Under (i), there exists a (unique) probability measure on such that, for all ,*

*as . Here, the convolution of with a probability measure on is a probability measure on defined as*

*so that is the distribution of points obtained from after steps of the Markov chain associated to .*

In the literature, is called *Furstenberg measure*, and it is an important example of –*stationary measure*, i.e., a probability measure on which is “invariant on average”:

**1.2. Lyapunov exponents**

The stationary measure can be used to describe the growth of the norms of random products associated to -almost every whenever satisfies (i) *and* its first moment is finite:

**Theorem 2 (Furstenberg, Guivarch–Raugi)** *If has finite first moment, i.e.,*

*and satisfies (i), then*

*for -almost every .*

The quantity is called *Lyapunov exponent*.
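As a numerical illustration of the growth rate in Theorem 2, the sketch below estimates the exponential growth of random products of two matrices. The matrices `A` and `B`, the uniform law on them, and the function name `estimate_lyapunov` are assumptions chosen only for illustration (they are not taken from the talk), and the code tracks the norm of a single vector rather than the full matrix norm, which is a common numerical shortcut.

```python
import math
import random

# Two hypothetical determinant-one matrices with positive entries (illustrative only).
A = ((2, 1), (1, 1))
B = ((1, 1), (1, 2))

def estimate_lyapunov(steps, seed=0):
    """Estimate (1/n) log |g_n ... g_1 v| for i.i.d. g_i uniform on {A, B}."""
    rng = random.Random(seed)
    v = (1.0, 1.0)
    log_growth = 0.0
    for _ in range(steps):
        g = A if rng.random() < 0.5 else B
        w = (g[0][0] * v[0] + g[0][1] * v[1],
             g[1][0] * v[0] + g[1][1] * v[1])
        norm = math.hypot(w[0], w[1])
        log_growth += math.log(norm)      # accumulate the log of the stretch factor
        v = (w[0] / norm, w[1] / norm)    # renormalise to avoid overflow
    return log_growth / steps

if __name__ == "__main__":
    print(estimate_lyapunov(100_000))
```

Renormalising at every step keeps the computation stable for long products; the running average of the logarithms of the stretch factors is the quantity being estimated.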

**1.3. Regularity of stationary measures**

The Furstenberg measure dictates the distribution of the Markov chains associated to and, for this reason, it is natural to inquire about the *regularity* properties of stationary measures.

In this direction, Guivarch showed that the Furstenberg measures have a certain regularity when satisfies (i) and its exponential moment is finite:

Hypothesis (ii): there exists with .

**Theorem 3 (Guivarch)** *Under (i) and (ii), there are and such that*

*for all and (where is the interval of radius centered at ). In particular, has no atoms.*

More recently, Jialun Li established in this article here another regularity result by showing the decay of the *Fourier coefficients*

(where ). More concretely, he proved that:

**Theorem 4 (Li)** *Under (i) and (ii), we have . In other words, is a Rajchman measure.*

In a certain sense, the role of assumption (i) in the previous theorem is to avoid the following kind of example:

**Example 1** *Let with*

*Note that the semigroup generated by is not Zariski dense in (as and are upper-triangular).* We claim that there is no decay of Fourier coefficients in this situation. Indeed, recall that if we identify with via , then acts on via Möbius transformations, i.e., an element acts on as

*In particular, , , and the Fourier coefficients of the stationary measure given by the standard Hausdorff measure on the middle-third Cantor set do not decay to zero.* In a similar vein, if is a real number such that is a Pisot number, then with

*admits a stationary measure (called the Bernoulli convolution of parameter describing the distribution of the points where with probability ) whose Fourier coefficients do not decay.*
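The lack of decay for the Cantor measure in Example 1 can be checked numerically. The sketch below works in the real-line coordinate with the convention that the Fourier transform is the expectation of $e^{-2\pi i \xi x}$; this convention, the function name `cantor_fourier_modulus`, and the truncation parameter `terms` are assumptions made here. Under this convention the modulus of the transform of the standard Cantor measure on $[0,1]$ is the infinite product of $|\cos(2\pi \xi / 3^k)|$ over $k \geq 1$, and along the frequencies $\xi = 3^n$ it stays constant.

```python
import math

def cantor_fourier_modulus(xi, terms=60):
    """Modulus of the Fourier transform of the middle-third Cantor measure at xi.

    Uses the product formula prod_{k>=1} |cos(2*pi*xi / 3**k)|, truncated
    after `terms` factors (the remaining factors are extremely close to 1).
    """
    product = 1.0
    for k in range(1, terms + 1):
        product *= abs(math.cos(2 * math.pi * xi / 3 ** k))
    return product

if __name__ == "__main__":
    for n in range(6):
        print(n, cantor_fourier_modulus(3 ** n))
```

The printed values agree up to rounding, so the transform does not tend to zero along the powers of three.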

The proof of Theorem 4 is based on a renewal theorem. More concretely, given a function , let

By thinking of as a smooth version of the characteristic function of an interval , we see that is “counting random products with norm in the interval ”. In this context, Guivarch and Le Page established the following renewal theorem:

**Theorem 5 (Guivarch–Le Page)** *Under (i) and (ii), one has*

*as .*

**Remark 2** *Another important fact in the proof of Theorem 4 is the non-arithmeticity of the Jordan projections of the elements of , i.e., the fact that these Jordan projections generate a dense subgroup of (whenever (i) is satisfied).*

Since we will come back later to the discussion of deriving the decay of Fourier coefficients (e.g., Theorem 4) from a renewal theorem, let us now move forward in order to introduce the main result of this post, namely, a *quantitative* version of Theorem 4.

**2. Quantitative decay of Fourier coefficients**

The central result of this post is inspired by the following theorem of Bourgain and Dyatlov.

**Theorem 6 (Bourgain–Dyatlov)** *If is the Patterson–Sullivan measure associated to a Schottky subgroup of , then there exists (depending only on the dimension of , i.e., the Hausdorff dimension of the limit set of the Schottky subgroup) such that*

*for all .*

The method of proof of this result is based on the so-called discretized sum-product estimates from additive combinatorics.

Interestingly enough, this result can be interpreted as a decay of Fourier coefficients of certain *stationary* measures thanks to the following theorem:

**Theorem 7 (Furstenberg, Sullivan, …)** *The Patterson–Sullivan measure of a Schottky subgroup coincides with the stationary measure of some probability measure on satisfying (i) and (ii).*

**Remark 3** *We saw the proof of a version of this result for cocompact lattices of in Proposition 14 of this blog post here.*

The previous theorems suggest that the Fourier coefficients of the Furstenberg measure associated to a probability measure on satisfying (i) and (ii) should decay at a quantitative rate. This statement was recently proved by Jialun Li in this article here.

**Theorem 8 (Li)** *If is a probability measure on satisfying (i) and (ii), then there exists such that the Furstenberg measure associated to verifies*

*for all .*

**Remark 4** *Actually, Li’s theorem is stated in his article for any real split semisimple Lie group .*

The proof of this result is also based on a discretized sum-product estimate. Moreover, this statement is closely related to spectral gap of transfer operators and a renewal theorem:

**Theorem 9 (Li)** *Let be a probability measure on verifying (i) and (ii). Given , consider the transfer operator*

*acting on (with small enough). Then, we have the following spectral gap property: there exists such that the spectral radius of satisfies*

*for all .*

**Theorem 10 (Li)** *Under (i) and (ii), there exists such that the renewal operator satisfies*

*for all .*

In his article, Li first establishes Theorem 8 from a discretized sum-product estimate, and subsequently Theorems 9 and 10 are deduced from Theorem 8.

Nevertheless, Li pointed out in his talk that Theorems 8, 9 and 10 are “morally equivalent” to each other. In fact,

- Theorem 8 ⇒ Theorem 9: the Fourier decay can be used to prove spectral gap for transfer operators via the so-called *Dolgopyat method* (which was discussed in this blog post here);
- Theorem 9 ⇒ Theorem 10: the spectral gap for transfer operators allows one to deduce the renewal theorem because some elementary calculations reveal that is related to ;
- Theorem 10 ⇒ Theorem 8: let us finally fulfil our promise made at the end of the previous section by briefly explaining the idea of the derivation of the Fourier decay in Theorem 8 from the renewal theorem in Theorem 10; since , the th Fourier coefficient of the Furstenberg measure is

  by the Cauchy–Schwarz inequality, the control of is reduced to the study of

since , we see that the size of the integral above depends on the “number of random products with norm in a given interval”, and the answer to this kind of “counting problem” is encoded in the asymptotic property of the renewal operator provided by Theorem 10.

**Remark 5** *The analog of Theorem 10 in Abelian settings is false: the random walks driven by a finitely supported law on which is not arithmetic (i.e., its support generates a dense subgroup) verify a renewal theorem*

*for , but the error term is never exponential because grows polynomially with . (Of course, this phenomenon is avoided in the context of thanks to the fact that the Zariski-density assumption (i) on ensures an exponential growth of with .)*

### Louis Nirenberg

I just heard the news that Louis Nirenberg died a few days ago, aged 94. Nirenberg made a vast number of contributions to analysis and PDE (and his work has come up repeatedly on my own blog); I wrote about his beautiful moving planes argument with Gidas and Ni to establish symmetry of ground states in this post on the occasion of him receiving the Chern medal, and on how his extremely useful interpolation inequality with Gagliardo (generalising a previous inequality of Ladyzhenskaya) can be viewed as an amplification of the usual Sobolev inequality in this post. Another fundamentally useful inequality of Nirenberg is the John-Nirenberg inequality established with Fritz John: if a (locally integrable) function (which for simplicity of exposition we place in one dimension) obeys the bounded mean oscillation property

for all intervals , where is the average value of on , then one has exponentially good large deviation estimates

for all and some absolute constant . This can be compared with Markov’s inequality, which only gives the far weaker decay

The point is that (1) is assumed to hold not just for a given interval , but also all subintervals of , and this is a much more powerful hypothesis, allowing one for instance to use the standard Calderon-Zygmund technique of stopping time arguments to “amplify” (3) to (2). Basically, for any given interval , one can use (1) and repeated halving of the interval until significant deviation from the mean is encountered to locate some disjoint exceptional subintervals where deviates from by , with the total measure of the being a small fraction of that of (thanks to a variant of (3)), and with staying within of at almost every point of outside of these exceptional intervals. One can then establish (2) by an induction on . (There are other proofs of this inequality also, e.g., one can use Bellman functions, as discussed in this old set of notes of mine.) Informally, the John-Nirenberg inequality asserts that functions of bounded mean oscillation are “almost as good” as bounded functions, in that they almost always stay within a bounded distance from their mean, and in fact the space BMO of functions of bounded mean oscillation ends up being superior to the space of bounded measurable functions for many harmonic analysis purposes (among other things, the space is more stable with respect to singular integral operators).
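For reference, the three numbered displays referred to above can be written as follows in one standard formulation. This is only a sketch: the symbols $f$, $I$, $f_I$, $A$, $\lambda$, $c$ and the use of $\lesssim$ are notational assumptions, and normalisations may differ from those of the original displays. The bounded mean oscillation hypothesis reads

$$\frac{1}{|I|} \int_I |f(x) - f_I|\, dx \leq A \qquad (1)$$

for all intervals $I$, with $f_I := \frac{1}{|I|} \int_I f$; the John-Nirenberg conclusion reads

$$\frac{1}{|I|} \left|\{ x \in I : |f(x) - f_I| \geq \lambda A \}\right| \lesssim \exp(-c\lambda) \qquad (2)$$

for all $\lambda > 0$; and Markov’s inequality applied to (1) on the single interval $I$ gives only

$$\frac{1}{|I|} \left|\{ x \in I : |f(x) - f_I| \geq \lambda A \}\right| \leq \frac{1}{\lambda}. \qquad (3)$$

Indeed, (3) follows from (1) because the set in question has measure at most $\frac{1}{\lambda A} \int_I |f - f_I| \leq \frac{|I|}{\lambda}$.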

I met Louis a few times in my career; even in his later years when he was wheelchair-bound, he would often come to conferences and talks, and ask very insightful questions at the end of the lecture (even when it looked like he was asleep during much of the actual talk!). I have a vague memory of him asking me some questions in one of the early talks I gave as a postdoc; I unfortunately do not remember exactly what the topic was (some sort of PDE, I think), but I was struck by how kindly the questions were posed, and how patiently he would listen to my excited chattering about my own work.

### Equidistribution of Syracuse random variables and density of Collatz preimages

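Define the *Collatz map* $\mathrm{Col}: \mathbb{N}+1 \to \mathbb{N}+1$ on the natural numbers $\mathbb{N}+1 = \{1,2,\dots\}$ by setting $\mathrm{Col}(N)$ to equal $3N+1$ when $N$ is odd and $N/2$ when $N$ is even, and let $\mathrm{Col}^{\mathbb{N}}(N) := \{N, \mathrm{Col}(N), \mathrm{Col}^2(N), \dots\}$ denote the forward Collatz orbit of $N$. The notorious Collatz conjecture asserts that $1 \in \mathrm{Col}^{\mathbb{N}}(N)$ for all $N \in \mathbb{N}+1$. Equivalently, if we define the backwards Collatz orbit $(\mathrm{Col}^{\mathbb{N}})^*(N) := \{M \in \mathbb{N}+1 : N \in \mathrm{Col}^{\mathbb{N}}(M)\}$ to be all the natural numbers $M$ that encounter $N$ in their forward Collatz orbit, then the Collatz conjecture asserts that $(\mathrm{Col}^{\mathbb{N}})^*(1) = \mathbb{N}+1$. As a partial result towards this latter statement, Krasikov and Lagarias in 2003 established the bound

$$\#\{N \leq x : N \in (\mathrm{Col}^{\mathbb{N}})^*(1)\} \gg x^\gamma \qquad (1)$$
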
for all and . (This improved upon previous values of obtained by Applegate and Lagarias in 1995, by Applegate and Lagarias in 1995 by a different method, by Wirsching in 1993, by Krasikov in 1989, by Sander in 1990, and some by Crandall in 1978.) This is still the largest value of for which (1) has been established. Of course, the Collatz conjecture would imply that we can take equal to , which is the assertion that a positive density set of natural numbers obeys the Collatz conjecture. This is not yet established, although the results in my previous paper do at least imply that a positive density set of natural numbers iterates to an (explicitly computable) bounded set, so in principle the case of (1) could now be verified by an (enormous) finite computation in which one verifies that every number in this explicit bounded set iterates to . In this post I would like to record a possible alternate route to this problem that depends on the distribution of a certain family of random variables that appeared in my previous paper, that I called *Syracuse random variables*.
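The Collatz map and the counting problem behind (1) are easy to experiment with. The following is a minimal sketch; the function names and the `max_steps` safeguard are illustrative choices made here, not part of the original post.

```python
def collatz(n):
    """One step of the Collatz map on the positive integers."""
    return 3 * n + 1 if n % 2 else n // 2

def forward_orbit(n, max_steps=10_000):
    """Forward orbit of n, stopped once 1 is reached or after max_steps steps."""
    orbit = [n]
    while orbit[-1] != 1 and len(orbit) <= max_steps:
        orbit.append(collatz(orbit[-1]))
    return orbit

def count_reaching_one(x, max_steps=10_000):
    """Number of N <= x whose (truncated) forward orbit reaches 1."""
    return sum(1 for n in range(1, x + 1)
               if forward_orbit(n, max_steps)[-1] == 1)

if __name__ == "__main__":
    print(forward_orbit(6))
    print(count_reaching_one(1000))
```

Such a brute-force count only confirms the conjecture on a finite range; the point of (1) is a lower bound valid for every threshold.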

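**Definition 1 (Syracuse random variables)** For any natural number $n$, a *Syracuse random variable* $\mathbf{Syrac}(\mathbb{Z}/3^n\mathbb{Z})$ on the cyclic group $\mathbb{Z}/3^n\mathbb{Z}$ is defined as a random variable of the form

$$\mathbf{Syrac}(\mathbb{Z}/3^n\mathbb{Z}) = \sum_{m=1}^n 3^{n-m} 2^{-\mathbf{a}_m - \dots - \mathbf{a}_n} \qquad (2)$$

where $\mathbf{a}_1,\dots,\mathbf{a}_n$ are independent copies of a geometric random variable $\mathbf{Geom}(2)$ on the natural numbers with mean $2$, thus

$$\mathbf{P}(\mathbf{a}_1 = a_1, \dots, \mathbf{a}_n = a_n) = 2^{-a_1 - \dots - a_n}$$

for $a_1,\dots,a_n \in \mathbb{N}+1$. In (2) the arithmetic is performed in the ring $\mathbb{Z}/3^n\mathbb{Z}$.

Thus for instance

$$\mathbf{Syrac}(\mathbb{Z}/3\mathbb{Z}) = 2^{-\mathbf{a}_1} \bmod 3$$

$$\mathbf{Syrac}(\mathbb{Z}/3^2\mathbb{Z}) = 2^{-\mathbf{a}_2} + 3 \times 2^{-\mathbf{a}_1-\mathbf{a}_2} \bmod 3^2$$

$$\mathbf{Syrac}(\mathbb{Z}/3^3\mathbb{Z}) = 2^{-\mathbf{a}_3} + 3 \times 2^{-\mathbf{a}_2-\mathbf{a}_3} + 3^2 \times 2^{-\mathbf{a}_1-\mathbf{a}_2-\mathbf{a}_3} \bmod 3^3$$

and so forth. After reversing the labeling of the $\mathbf{a}_i$, one could also view $\mathbf{Syrac}(\mathbb{Z}/3^n\mathbb{Z})$ as the mod $3^n$ reduction of a $3$-adic random variable

$$\mathbf{Syrac}(\mathbb{Z}_3) = \sum_{j=0}^\infty 3^j 2^{-\mathbf{a}_1 - \dots - \mathbf{a}_{j+1}}.$$

The probability density function $b \mapsto \mathbf{P}(\mathbf{Syrac}(\mathbb{Z}/3^n\mathbb{Z}) = b)$ of the Syracuse random variable can be explicitly computed by a recursive formula (see Lemma 1.12 of my previous paper). For instance, when $n=1$, $\mathbf{P}(\mathbf{Syrac}(\mathbb{Z}/3\mathbb{Z}) = b)$ is equal to $0, \frac{1}{3}, \frac{2}{3}$ for $b = 0,1,2 \bmod 3$ respectively, while when $n=2$, $\mathbf{P}(\mathbf{Syrac}(\mathbb{Z}/3^2\mathbb{Z}) = b)$ is equal to

$$0, \frac{8}{63}, \frac{16}{63}, 0, \frac{11}{63}, \frac{4}{63}, 0, \frac{2}{63}, \frac{22}{63}$$

when $b = 0,\dots,8 \bmod 9$ respectively.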
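The small cases can be checked by direct enumeration. The sketch below assumes that the Syracuse random variable is the sum over $m = 1,\dots,n$ of $3^{n-m} 2^{-a_m - \dots - a_n}$ modulo $3^n$, with independent $a_i$ taking the value $a \geq 1$ with probability $2^{-a}$. Since $2$ has multiplicative order $2 \cdot 3^{n-1}$ modulo $3^n$, only the residues of the $a_i$ modulo that order matter, which makes the enumeration finite and exact. The function name is a hypothetical choice, and this brute-force approach is not the recursive formula of Lemma 1.12.

```python
from fractions import Fraction
from itertools import product

def syracuse_law_bruteforce(n):
    """Exact law of the Syracuse random variable on Z/3^n Z (n >= 1), by enumeration.

    Returns a list `law` with law[b] = P(Syrac = b mod 3^n).
    """
    modulus = 3 ** n
    order = 2 * 3 ** (n - 1)  # multiplicative order of 2 modulo 3^n
    # P(a = r mod order) for a geometric variable with P(a = k) = 2^-k, k >= 1.
    weight = {r: Fraction(2 ** (order - r), 2 ** order - 1)
              for r in range(1, order + 1)}
    law = [Fraction(0)] * modulus
    for a in product(range(1, order + 1), repeat=n):
        # 2^(-e) mod 3^n equals 2^((-e) mod order) mod 3^n.
        value = sum(3 ** (n - m) * pow(2, -sum(a[m - 1:]) % order, modulus)
                    for m in range(1, n + 1))
        prob = Fraction(1)
        for r in a:
            prob *= weight[r]
        law[value % modulus] += prob
    return law

if __name__ == "__main__":
    print(syracuse_law_bruteforce(1))
    print(syracuse_law_bruteforce(2))
```

The enumeration has $(2 \cdot 3^{n-1})^n$ terms, so it is only practical for very small $n$.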

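The relationship of these random variables to the Collatz problem can be explained as follows. Let $2\mathbb{N}+1 = \{1,3,5,\dots\}$ denote the odd natural numbers, and define the *Syracuse map* $\mathrm{Syr}: 2\mathbb{N}+1 \to 2\mathbb{N}+1$ by

$$\mathrm{Syr}(N) := \frac{3N+1}{2^{\nu_2(3N+1)}}$$

where the $2$-valuation $\nu_2(3N+1)$ is the number of times $2$ divides $3N+1$. We can define the forward orbit $\mathrm{Syr}^{\mathbb{N}}(N)$ and backward orbit $(\mathrm{Syr}^{\mathbb{N}})^*(N)$ of the Syracuse map as before. It is not difficult to then see that the Collatz conjecture is equivalent to the assertion $(\mathrm{Syr}^{\mathbb{N}})^*(1) = 2\mathbb{N}+1$, and that the assertion (1) for a given $\gamma$ is equivalent to the assertion

$$\#\{N \leq x : N \in (\mathrm{Syr}^{\mathbb{N}})^*(1)\} \gg x^\gamma \qquad (3)$$

for all $x \geq 1$, where $N$ is now understood to range over odd natural numbers. A brief calculation then shows that for any odd natural number $N$ and natural number $n$, one has

$$\mathrm{Syr}^n(N) = 3^n 2^{-a_1 - \dots - a_n} N + \sum_{m=1}^n 3^{n-m} 2^{-a_m - \dots - a_n}$$

where the natural numbers $a_1,\dots,a_n$ are defined by the formula

$$a_i := \nu_2(3\, \mathrm{Syr}^{i-1}(N) + 1),$$

so in particular

$$\mathrm{Syr}^n(N) = \sum_{m=1}^n 3^{n-m} 2^{-a_m - \dots - a_n} \bmod 3^n.$$

Heuristically, one expects the $2$-valuation $a = \nu_2(3N+1)$ associated to a typical odd number $N$ to be approximately distributed according to the geometric distribution $\mathbf{Geom}(2)$, so one therefore expects the residue class $\mathrm{Syr}^n(N) \bmod 3^n$ to be distributed approximately according to the random variable $\mathbf{Syrac}(\mathbb{Z}/3^n\mathbb{Z})$.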
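The affine formula for iterates of the Syracuse map can be verified exactly with rational arithmetic. The sketch below assumes the identity in the form: the $n$-th iterate of $N$ equals $3^n 2^{-(a_1 + \dots + a_n)} N$ plus the sum over $m$ of $3^{n-m} 2^{-(a_m + \dots + a_n)}$, where $a_i$ is the $2$-valuation of three times the $(i-1)$-th iterate plus one. All function names are illustrative.

```python
from fractions import Fraction

def two_valuation(m):
    """Number of times 2 divides the positive integer m."""
    count = 0
    while m % 2 == 0:
        m //= 2
        count += 1
    return count

def syracuse(n):
    """Syracuse map on odd n: apply 3n+1, then remove every factor of 2."""
    m = 3 * n + 1
    return m >> two_valuation(m)

def iterate_with_formula(n0, steps):
    """Return (Syr^steps(n0), value of the affine formula, list of valuations)."""
    valuations = []
    current = n0
    for _ in range(steps):
        valuations.append(two_valuation(3 * current + 1))
        current = syracuse(current)
    formula = Fraction(3 ** steps * n0, 2 ** sum(valuations))
    for m in range(1, steps + 1):
        formula += Fraction(3 ** (steps - m), 2 ** sum(valuations[m - 1:]))
    return current, formula, valuations

if __name__ == "__main__":
    print(iterate_with_formula(7, 4))
```

Because the first term of the formula is divisible by $3^n$, reducing modulo $3^n$ leaves only the sum, which is the expression appearing in the definition of the Syracuse random variables.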

The Syracuse random variables will always avoid multiples of three (this reflects the fact that is never a multiple of three), but attains any non-multiple of three in with positive probability. For any natural number , set

Equivalently, is the greatest quantity for which we have the inequality

for all integers not divisible by three, where is the set of all tuples for which

Thus for instance , , and . On the other hand, since all the probabilities sum to as ranges over the non-multiples of , we have the trivial upper bound
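The quantity just introduced can be computed exactly for small $n$. The sketch below assumes that it is the smallest probability with which the Syracuse random variable hits a residue class not divisible by three (this is how the “greatest quantity” characterisation above reads), and it uses the one-step relation that the variable at level $n$ has the law of $2^{-a}(1 + 3S)$ modulo $3^n$, where $S$ is the variable at level $n-1$ and $a$ is an independent geometric variable; this relation follows from the definition but is written here in my own notation. Function names are illustrative.

```python
from fractions import Fraction

def syracuse_laws(max_n):
    """Exact laws of the Syracuse random variables on Z/3^n Z for n = 0..max_n."""
    laws = {0: [Fraction(1)]}
    for n in range(1, max_n + 1):
        modulus = 3 ** n
        order = 2 * 3 ** (n - 1)  # multiplicative order of 2 modulo 3^n
        law = [Fraction(0)] * modulus
        for x, p in enumerate(laws[n - 1]):
            if p == 0:
                continue
            for r in range(1, order + 1):
                # P(a = r mod order) when P(a = k) = 2^-k for k >= 1.
                w = Fraction(2 ** (order - r), 2 ** order - 1)
                y = (1 + 3 * x) * pow(2, -r % order, modulus) % modulus
                law[y] += p * w
        laws[n] = law
    return laws

def smallest_probability(law):
    """Minimum of the law over residue classes not divisible by three."""
    return min(p for b, p in enumerate(law) if b % 3 != 0)

if __name__ == "__main__":
    laws = syracuse_laws(4)
    for n in range(1, 5):
        print(n, smallest_probability(laws[n]))
```

Comparing the printed values with the average probability $\frac{1}{2 \cdot 3^{n-1}}$ of a residue class not divisible by three gives a quick sense of how far the law is from uniform at each level.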

There is also an easy submultiplicativity result:

**Lemma 2** For any natural numbers , we have

*Proof:* Let be an integer not divisible by , then by (4) we have

If we let denote the set of tuples that can be formed from the tuples in by deleting the final component from each tuple, then we have

with an integer not divisible by three. By definition of and a relabeling, we then have

for all . For such tuples we then have

so that . Since

for each , the claim follows.

From this lemma we see that for some absolute constant . Heuristically, we expect the Syracuse random variables to be approximately equidistributed amongst the residue classes not divisible by three (in Proposition 1.4 of my previous paper I prove a fine scale mixing result that supports this heuristic). As a consequence it is natural to conjecture that . I cannot prove this, but I can show that this conjecture would imply that we can take the exponent in (1), (3) arbitrarily close to one:

**Proposition 3** Suppose that (that is to say, as ). Then

as , or equivalently

as . In other words, (1), (3) hold for all .

I prove this proposition below the fold. A variant of the argument shows that for any value of , (1), (3) holds whenever , where is an explicitly computable function with as . In principle, one could then improve the Krasikov-Lagarias result by getting a sufficiently good upper bound on , which is in principle achievable numerically (note for instance that Lemma 2 implies the bound for any , since for any ).

** — 1. Proof of proposition — **

Assume . Let be sufficiently small, and let be sufficiently large depending on . We first establish the following proposition, that shows that elements in a certain residue class have a lot of Syracuse preimages:

**Proposition 4** There exists a residue class of with the property that for all integers in this class, and all non-negative integers , there exist natural numbers with

and

and at least tuples

obeying the additional properties

*Proof:* We begin with the base case . By (4) and the hypothesis , we see that

for all integers not divisible by . Let denote the tuples in that obey the additional regularity hypotheses

for all ; note that this implies in particular the case of (7). From the Chernoff inequality (noting that the geometric random variable has mean ) and the union bound we have

for an absolute constant (where we use the periodicity of in to define for by abuse of notation). Hence by the pigeonhole principle we can find a residue class not divisible by such that

and hence by the triangle inequality we have

for all in this residue class.

Henceforth is assumed to be an element of this residue class. For , we see from (8)

hence by the pigeonhole principle there exists (so in particular ) such that

so the number of summands here is at least . This establishes the base case .

Now suppose inductively that , and that the claim has already been proven for . By induction hypothesis, there exist natural numbers with

(which in particular imply that ) and at least tuples

obeying the additional properties

and (7) for all .

For each tuple (10), we may write (as in the proof of Lemma 2)

for some integers . We claim that these integers lie in distinct residue classes modulo where

Indeed, suppose that for two tuples , of the above form. Then

(where we now invert in the ring ), or equivalently

By (11), (7), all the summands on the left-hand side are natural numbers of size , hence the sum also has this size; similarly for the right-hand side. From the estimates of , we thus see that both sides are natural numbers between and , by hypothesis on . Thus we may remove the modular constraint and conclude that

and then a routine induction (see Lemma 6.2 of my paper) shows that . This establishes the claim.

As a corollary, we see that every residue class modulo contains

of the at most. Since there were at least tuples to begin with, we may therefore forbid up to residue classes modulo , and still have surviving tuples with the property that avoids all the forbidden classes.

Let be one of the tuples (10). By the hypothesis , we have

Let denote the set of tuples with the additional property

for all , then by the Chernoff bound we have

for some absolute constant . Thus, by the Markov inequality, by forbidding up to classes, we may ensure that

and hence

We thus have

where run over all tuples with being one of the previously surviving tuples, and . By (11) we may rearrange this a little as

By construction, we have

for any tuple in the above sum, hence by the pigeonhole principle we may find an integer

In particular the number of summands is at least . Also observe from (13), (12) that

so in particular

It is a routine matter to verify that all tuples in this sum lie in and obey the requirements (6), (7), closing the induction.

**Corollary 5** For all in the residue class from the previous proposition, and all , we have

In particular, we have

as .

*Proof:* For every tuple in the previous proposition, we have

for some integer . As before, all these integers are distinct, and have magnitude

From construction we also have , so that . The number of tuples is at least

which can be computed from the properties of to be of size at least . This gives the first claim, and the second claim follows by taking to be the first integer for which .

To conclude the proof of Proposition 3, it thus suffices to show that

**Lemma 6** Every residue class has a non-trivial intersection with .

Indeed, if we let be the residue class from the preceding propositions, and use this lemma to produce an element of that lies in this class, then from the inclusion we obtain (3) with , and then on sending to zero we obtain the claim.

*Proof:* An easy induction (based on first establishing that for all natural numbers ) shows that the powers of two modulo occupy every residue class not divisible by . From this we can locate an integer in of the form . Since , the claim follows.
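The first step of this proof, that the powers of two fill out every residue class modulo a power of three that is not divisible by three, is easy to confirm numerically for small exponents. The following sketch uses illustrative function names and checks only finitely many moduli.

```python
def powers_of_two_mod(n):
    """Set of residues of 2^k modulo 3^n, over all k >= 0."""
    modulus = 3 ** n
    seen, value = set(), 1
    while value not in seen:
        seen.add(value)
        value = value * 2 % modulus
    return seen

def units_mod(n):
    """Residue classes modulo 3^n that are not divisible by three."""
    return {b for b in range(3 ** n) if b % 3 != 0}

if __name__ == "__main__":
    for n in range(1, 8):
        print(n, powers_of_two_mod(n) == units_mod(n))
```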

We remark that the same argument in fact shows (assuming of course) that

in the limit for any natural number not divisible by three.

### Some recent papers

Just a brief post to record some notable papers in my fields of interest that appeared on the arXiv recently.

- “A sharp square function estimate for the cone in $\mathbb{R}^3$”, by Larry Guth, Hong Wang, and Ruixiang Zhang. This paper establishes an optimal (up to epsilon losses) square function estimate for the three-dimensional light cone that was essentially conjectured by Mockenhaupt, Seeger, and Sogge, which has a number of other consequences including Sogge’s local smoothing conjecture for the wave equation in two spatial dimensions, which in turn implies the (already known) Bochner-Riesz, restriction, and Kakeya conjectures in two dimensions. Interestingly, modern techniques such as polynomial partitioning and decoupling estimates are not used in this argument; instead, the authors mostly rely on an induction on scales argument and Kakeya type estimates. Many previous authors (including myself) were able to get weaker estimates of this type by an induction on scales method, but there were always significant inefficiencies in doing so; in particular knowing the sharp square function estimate at smaller scales did not imply the sharp square function estimate at the given larger scale. The authors here get around this issue by finding an even stronger estimate that implies the square function estimate, but behaves significantly better with respect to induction on scales.
- “On the Chowla and twin primes conjectures over $\mathbb{F}_q[T]$”, by Will Sawin and Mark Shusterman. This paper resolves a number of well known open conjectures in analytic number theory, such as the Chowla conjecture and the twin prime conjecture (in the strong form conjectured by Hardy and Littlewood), in the case of function fields $\mathbb{F}_q[T]$ where the order $q = p^j$ of the field $\mathbb{F}_q$ is a prime power which is fixed (in contrast to a number of existing results in the “large $q$” limit) but has a large exponent $j$. The techniques here are orthogonal to those used in recent progress towards the Chowla conjecture over the integers (e.g., in this previous paper of mine); the starting point is an algebraic observation that in certain function fields, the Möbius function behaves like a quadratic Dirichlet character along certain arithmetic progressions. In principle, this reduces problems such as Chowla’s conjecture to problems about estimating sums of Dirichlet characters, for which more is known; but the task is still far from trivial.
- “Bounds for sets with no polynomial progressions”, by Sarah Peluse. This paper can be viewed as part of a larger project to obtain quantitative density Ramsey theorems of Szemerédi type. For instance, Gowers famously established a relatively good quantitative bound for Szemerédi’s theorem that all dense subsets of integers contain arbitrarily long arithmetic progressions $a, a+r, \dots, a+(k-1)r$. The corresponding question for polynomial progressions $a, a+P_1(r), \dots, a+P_k(r)$ is considered more difficult for a number of reasons. One of them is that dilation invariance is lost; a dilation of an arithmetic progression is again an arithmetic progression, but a dilation of a polynomial progression will in general not be a polynomial progression with the same polynomials $P_1,\dots,P_k$. Another issue is that the ranges of the two parameters $a, r$ are now at different scales. Peluse gets around these difficulties in the case when all the polynomials $P_1,\dots,P_k$ have distinct degrees, which is in some sense the opposite case to that considered by Gowers (in particular, thanks to a degree lowering argument that is available in this case, she avoids the need for quantitative inverse theorems for high order Gowers norms; such theorems were recently obtained in the integer setting by Manners, but with bounds that are probably not strong enough for the bounds in Peluse’s results). To resolve the first difficulty one has to make all the estimates rather uniform in the coefficients of the polynomials $P_j$, so that one can still run a density increment argument efficiently. To resolve the second difficulty one needs to find a quantitative concatenation theorem for Gowers uniformity norms. Many of these ideas were developed in previous papers of Peluse and Peluse-Prendiville in simpler settings.
- “On blow up for the energy super critical defocusing non linear Schrödinger equations“, by Frank Merle, Pierre Raphael, Igor Rodnianski, and Jeremie Szeftel. This paper (when combined with two companion papers) resolves a long-standing problem as to whether finite time blowup occurs for the defocusing supercritical nonlinear Schrödinger equation (at least in certain dimensions and nonlinearities). I had a previous paper establishing a result like this if one “cheated” by replacing the nonlinear Schrodinger equation by a system of such equations, but remarkably they are able to tackle the original equation itself without any such cheating. Given the very analogous situation with Navier-Stokes, where again one can create finite time blowup by “cheating” and modifying the equation, it does raise hope that finite time blowup for the incompressible Navier-Stokes and Euler equations can be established… In fact the connection may not just be at the level of analogy; a surprising key ingredient in the proofs here is the observation that a certain blowup ansatz for the nonlinear Schrodinger equation is governed by solutions to the (compressible) Euler equation, and finite time blowup examples for the latter can be used to construct finite time blowup examples for the former.

### Breuillard–Sert’s joint spectrum (III)

Last time, we saw that if is a compact subset of a reductive, real linear algebraic group such that the monoid generated by is Zariski dense in , then the Cartan projections and the Jordan projections associated to converge in the Hausdorff topology to the same limit , an object baptised “*joint spectrum* of ” by Breuillard and Sert.

Today, I’ll transcribe below my notes of a talk by Romain Dujardin explaining to the participants of our *groupe de travail* some basic convexity and continuity properties of the joint spectrum. After that, we close the post with a brief discussion of the question of prescribing the joint spectrum.

As usual, all mistakes in what follows are my sole responsibility.

**1. Preliminaries**

Let us warm up by reviewing the setting of the previous posts of this series.

Let be a reductive real linear algebraic group and denote its rank by . By definition, a maximal torus is isomorphic to .

The Cartan decomposition (with a maximal compact subgroup of ) allows one to write any as for a unique where is a choice of Weyl chamber in the Lie algebra of . The interior of the Weyl chamber is denoted by .

**Example 1** *For , we can take in , so that .*

The element is called the Cartan projection of .

**Example 2** *For , , where are the singular values of .*

Similarly, the Jordan projection is defined in terms of the Jordan-Chevalley decomposition. For , this amounts to writing the Jordan normal form with diagonalisable and nilpotent, so that with unipotent, , and has eigenvalues where are the eigenvalues of (ordered by decreasing sizes of their moduli).

The group has a family of distinguished representations such that the components of the vectors , resp. , are linear combinations of , resp. . In particular, the usual formula for the spectral radius implies that as (and, as it turns out, this fact is important in establishing the coincidence of the limits of the sequences and ).
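For two-by-two invertible real matrices, the Cartan and Jordan projections and the spectral radius formula relating them can be illustrated concretely. The sketch below assumes the usual description for general linear groups (logarithms of singular values for the Cartan projection, logarithms of eigenvalue moduli for the Jordan projection); the function names and the test matrix are illustrative choices, not taken from the talk.

```python
import math

def mat_mul(g, h):
    """Product of two 2x2 matrices given as nested tuples."""
    return ((g[0][0] * h[0][0] + g[0][1] * h[1][0], g[0][0] * h[0][1] + g[0][1] * h[1][1]),
            (g[1][0] * h[0][0] + g[1][1] * h[1][0], g[1][0] * h[0][1] + g[1][1] * h[1][1]))

def mat_pow(g, n):
    """n-th power of a 2x2 matrix by repeated multiplication."""
    result = ((1, 0), (0, 1))
    for _ in range(n):
        result = mat_mul(result, g)
    return result

def cartan_projection(g):
    """(log sigma_1, log sigma_2) for the singular values sigma_1 >= sigma_2 > 0."""
    (a, b), (c, d) = g
    t = a * a + b * b + c * c + d * d        # trace of g^T g
    det = abs(a * d - b * c)                 # sigma_1 * sigma_2
    sigma1 = math.sqrt((t + math.sqrt(max(t * t - 4 * det * det, 0))) / 2)
    return (math.log(sigma1), math.log(det / sigma1))

def jordan_projection(g):
    """(log |l_1|, log |l_2|) for the eigenvalues ordered by decreasing modulus."""
    (a, b), (c, d) = g
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    if disc >= 0:
        top = (abs(tr) + math.sqrt(disc)) / 2   # real eigenvalues
    else:
        top = math.sqrt(det)                    # complex conjugate pair, det > 0
    return (math.log(top), math.log(abs(det) / top))

if __name__ == "__main__":
    g = ((2, 3), (0, 1))
    print(jordan_projection(g))
    for n in (1, 10, 40):
        kappa = cartan_projection(mat_pow(g, n))
        print(n, (kappa[0] / n, kappa[1] / n))
```

The smaller singular value and eigenvalue modulus are recovered from the determinant rather than from a difference of large numbers, which avoids cancellation for high powers.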

**Example 3** *For , the representations of on , , have the property that the eigenvalue of with the largest modulus is .*

The rank of can be written as where is the dimension of the center of . In the literature, is called the *semi-simple rank* of . In general, we have “truly” distinguished representations which are completed by a choice of characters of .

**Example 4** *For , , and the representations from the previous example have the property that with is “truly” distinguished and the determinant representation comes from the center.*

**Remark 1** *Recall that a weight of a representation of is a generalized eigenvalue associated to a non-trivial -invariant subspace, i.e., is a weight whenever*

*The weights are partially ordered via if and only if for all , and any irreducible representation possesses a unique maximal weight (and, as it turns out, is one-dimensional).* *In this context, the distinguished representations form a family of representations whose maximal weights provide a basis of .*

A matrix is proximal when its projective action on possesses an attracting fixed point and a repulsive hyperplane . Also, an element is called –*proximal* if and only if the matrices are proximal for all (or, equivalently, ).

A matrix is -proximal whenever is proximal, , and for all , (where is the Fubini-Study distance on the projective space ). Moreover, is –*proximal* if and only if the matrices are -proximal for all .

A beautiful theorem of Abels–Margulis–Soifer asserts that -proximal elements are really abundant: given a Zariski-dense monoid of , there exists such that for all , one can find a finite subset with the property that for any , one can find with -proximal.

In the previous post of this series, we saw that Abels–Margulis–Soifer was at the heart of Breuillard–Sert proof of the following result:

**Theorem 1** *If is compact and the monoid generated by is Zariski-dense in , then the sequences and converge in Hausdorff topology to a compact subset called the joint spectrum of .*

After this brief review of the definition of the joint spectrum, let us now study some of its basic properties.

**2. Convexity of the joint spectrum**

**Theorem 2** * is a convex subset of .*

**Remark 2** *Later, we will see some sufficient conditions to get .*

Similarly to the proof of Theorem 1, some important ideas behind the proof of Theorem 2 are:

- the Jordan projection behaves well under powers: ;
- the Cartan projection is subadditive: ;
- the Cartan and Jordan projections of proximal elements are comparable: there is a constant such that for all -proximal;
- Abels–Margulis–Soifer provides a huge supply of proximal elements.
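The first two properties in the list above can be checked numerically in the simplest case. The following is a minimal sketch, assuming 2×2 real matrices, with the logarithm of the operator norm standing in for the top coordinate of the Cartan projection and the logarithm of the spectral radius standing in for the top coordinate of the Jordan projection. The helper names (`log_norm`, `log_spectral_radius`) and the sample matrices are illustrative choices, not taken from Breuillard–Sert.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def log_norm(a):
    """Log of the top singular value (stand-in for the Cartan projection)."""
    f = sum(x * x for row in a for x in row)   # sigma_1^2 + sigma_2^2
    d = det(a)                                  # |d| = sigma_1 * sigma_2
    top_sq = (f + math.sqrt(max(f * f - 4 * d * d, 0.0))) / 2
    return 0.5 * math.log(top_sq)

def log_spectral_radius(a):
    """Log of the largest eigenvalue modulus (stand-in for the Jordan projection)."""
    tr, d = a[0][0] + a[1][1], det(a)
    disc = tr * tr - 4 * d
    if disc >= 0:                               # real eigenvalues (tr +- sqrt(disc)) / 2
        return math.log((abs(tr) + math.sqrt(disc)) / 2)
    return 0.5 * math.log(d)                    # complex pair of modulus sqrt(det)

g = [[2, 1], [1, 1]]
h = [[1, 2], [0, 1]]

g5 = g
for _ in range(4):
    g5 = matmul(g5, g)

# Jordan projection of a power: should vanish up to rounding.
power_gap = abs(log_spectral_radius(g5) - 5 * log_spectral_radius(g))
# Subadditivity of the Cartan projection: should be non-negative.
subadditivity_slack = log_norm(g) + log_norm(h) - log_norm(matmul(g, h))
print(power_gap, subadditivity_slack)
```

In higher rank one would replace the two scalar quantities by the full vectors of logarithms of singular values and of eigenvalue moduli; the scalar version is only meant to make the two inequalities concrete.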

We start to formalize these ideas with the following lemma:

**Lemma 3** *If and are -proximal elements, then there are and such that for all .*

*Proof:* After replacing by the matrix , our task reduces to studying the behaviour of the eigenvalues of largest modulus of proximal matrices .

By definition of proximality, the matrices converge to a projection on parallel to as . Also, an analogous statement is valid for . In particular, for any , one has

as .

It is not hard to show that there exists such that is not nilpotent: in fact, this happens because is Zariski-dense and the nilpotency condition can be described in polynomial terms. In particular, and, by continuity, there exists with

for all . This ends the proof.
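The convergence of normalized powers of a proximal matrix to a rank-one projection, used in the proof above, can be observed on a small example. This is a minimal sketch, assuming a 2×2 matrix with eigenvalues 2 and 1 chosen purely for illustration; the function name `normalized_power` is hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def normalized_power(g, lam, n):
    """Return (g / lam)^n, computed by repeated multiplication."""
    scaled = [[x / lam for x in row] for row in g]
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = matmul(result, scaled)
    return result

# Proximal example: top eigenvalue 2 with eigenvector (1, 0),
# second eigenvalue 1 with eigenvector (1, -1).
g = [[2.0, 1.0], [0.0, 1.0]]
limit = normalized_power(g, 2.0, 60)

# Expected limit: the projection onto span{(1, 0)} with kernel span{(1, -1)}.
projection = [[1.0, 1.0], [0.0, 0.0]]
print(limit)
```

In dimension two the repulsive hyperplane is just the second eigenline, which is why the kernel of the limiting projection is spanned by (1, -1).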

At this point, we are ready to prove Theorem 2. Since is a compact subset of , the proof of its convexity reduces to showing that for all .

To this end, we begin by applying the Abels–Margulis–Soifer theorem in order to fix and a finite subset so that for any we can find with -proximal. By definition, there exists such that any satisfies for some .

Next, we consider and we recall that . Hence, given , we have that for all sufficiently large, there are with

Now, we select with and -proximal. Recall that, by proximality, there exists a constant with

(and an analogous statement is also true for ). Furthermore, by Lemma 3, there are , say , and with

for all . Observe that .

By dividing by , by taking large (so that ) and by letting (so that ), we see that

for and sufficiently large.

Since is arbitrary and is closed, this proves that . This completes the proof of Theorem 2.

**3. Continuity properties of the joint spectrum**

**3.1. Domination and continuity**

**Definition 4** *We say that is -dominated if there exists such that*

*for all sufficiently large and . (Recall that are the singular values of .)*

**Definition 5** *We say that is -dominated if is -dominated for all .*
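To get a feeling for domination, one can compute the worst ratio of consecutive singular values over all words of a given length. The sketch below assumes that Definition 4 amounts to an exponential gap between consecutive singular values, uniform over words of length n, and it uses two positive matrices of determinant 1 as a hypothetical generating set; neither the set `S` nor the function names come from the papers discussed here.

```python
import math
from itertools import product

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def singular_value_ratio(a):
    """Return sigma_2 / sigma_1 for a 2x2 matrix."""
    f = sum(x * x for row in a for x in row)           # sigma_1^2 + sigma_2^2
    d = abs(a[0][0] * a[1][1] - a[0][1] * a[1][0])     # sigma_1 * sigma_2
    top_sq = (f + math.sqrt(max(f * f - 4 * d * d, 0.0))) / 2
    return d / top_sq

def worst_ratio(generators, n):
    """Largest value of sigma_2 / sigma_1 over all words of length n."""
    worst = 0.0
    for word in product(generators, repeat=n):
        m = [[1, 0], [0, 1]]
        for letter in word:
            m = matmul(m, letter)
        worst = max(worst, singular_value_ratio(m))
    return worst

# Hypothetical generating set: two positive matrices of determinant 1.
S = ([[2, 1], [1, 1]], [[3, 1], [2, 1]])
for n in (2, 4, 6, 8):
    print(n, worst_ratio(S, n))
```

For this particular set the printed ratios shrink rapidly with n, which is the behaviour one expects from a dominated family; a set containing a rotation would not show such decay.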

**Remark 3** *If is -dominated, then it is possible to show that the joint spectrum is well-defined even when is not Zariski dense in .*

The next proposition asserts that the notion of -domination generalizes the concept of matrices with simple spectrum (i.e., all of its eigenvalues have distinct moduli and multiplicity one).

**Proposition 6** * is -dominated if and only if .*

On the other hand, the notion of -domination is related to Schottky families.

**Definition 7** *We say that is a -Schottky family if*

- (a) any is -proximal;
- (b) for all .

**Proposition 8** * is -dominated if and only if there are and so that is a -Schottky family.*

*Proof:* Let us first establish the converse implication (from Schottky families to domination). It is not hard to see that if is -dominated, then is -dominated. Therefore, we can assume that is a -Schottky family. At this point, we invoke the following lemma due to Breuillard–Gelander:

**Lemma 9 (Breuillard–Gelander)** *If is -Lipschitz on a non-empty open subset of , then .*

*Proof:* Thanks to the decomposition, we can assume that . Given and sufficiently small, our assumption on implies that and . These inequalities imply the desired fact that after some computations with the Fubini-Study metric .

If is a -Schottky family, then all elements of are -Lipschitz on a neighborhood of any fixed , for all . By the previous lemma, we conclude that for all sufficiently large and . Thus, is -dominated.

Let us now prove the direct implication (from domination to Schottky families). To this end, we use a result of Bochi–Gourmelon (justifying the nomenclature “domination”): is -dominated if and only if there is a dominated splitting for a natural linear cocycle over the full shift dynamics on , i.e.,

- *Splitting condition*: there are continuous maps and such that for all (here, is the Grassmannian of hyperplanes of );
- *Invariance condition*: and for all (here, denotes the left shift dynamics );
- *Domination condition*: the weakest contraction along dominates the strongest expansion along , that is, there are and such that .

**Remark 4** *For , the equivalence between -domination and the presence of dominated splittings was established by Yoccoz.*

An important metaprinciple in Dynamics (going back to the classical proofs of the stable manifold theorem) asserts that “stable spaces depend only on the future orbit”. In our present context, this is reflected by the fact that one can show that depends only on and depends only on for all .

An interesting consequence of this fact is the following statement about the “non-existence of tangencies between and ”: if is -dominated, then for all . Indeed, this statement can be easily obtained by contradiction: if for some and , then has the property that and . Hence, , a contradiction with the splitting condition above.

At this stage, we are ready to show that if is -dominated, then is a -Schottky family for some and . In fact, given , let be the periodic sequence obtained by infinite concatenation of the word . We claim that, for sufficiently large, is proximal with and , and -Lipschitz outside the -neighborhood of . This happens because the compactness of and the non-existence of tangencies between and provide a uniform transversality between and . By combining this information with the domination condition above (and the fact that for sufficiently large), a small linear-algebraic computation reveals that any is proximal and -Lipschitz outside the -neighborhood of for adequate choices of and .

The proof of the previous proposition gave a clear link between -domination and the notion of dominated splittings. Since a dominated splitting is robust under small perturbations (because it is detected by variants of the so-called cone field criterion), a direct consequence of the proof of the proposition above is:

**Corollary 10** *The -domination property is open: if is -dominated, then any included in a sufficiently small neighborhood of is also -dominated.*

The previous proposition also links -domination to Schottky families and, as it turns out, this is a key ingredient to obtain the continuity of the joint spectrum in the presence of domination.

**Theorem 11** *If is -dominated, then the map is continuous at .*

Very roughly speaking, the proof of this result relies on the fact that if a matrix is “very Schottky” (like a huge power of a proximal matrix), then this matrix is quite close to a rank 1 operator and, in this regime, the Jordan projection behaves in an “almost additive” way.

**3.2. Examples of discontinuity**

**3.2.1. Calculation of a joint spectrum in**

Recall that acts on the Poincaré disk by isometries of the hyperbolic metric. Consider , where and are loxodromic elements of acting by translations along disjoint oriented geodesic axes and on from to and from to . We assume that the endpoints of the axes and are cyclically ordered on as , and we denote by and the translation lengths of and along and .

In the sequel, we want to compute and, for this sake, we need to understand where is a word of length on and .

**Proposition 12** *If and are elements of as above, denotes the distance between the axes of and , and , then is the interval*

*Proof:* One can show (using hyperbolic geometry) that is a loxodromic element whose axis stays between the axes of and while going from a point in to a point in , and the translation length of satisfies

In particular, .

We claim that if is a word on and and is a word obtained from by replacing some letter by , then . In fact, by performing a conjugation if necessary, we can assume that and , so that and .

Therefore, if we start in with and we successively replace by until we reach , then we see from the claim in the previous paragraph that becomes denser in as . This proves that .

**3.2.2. Some joint spectra in**

Let as above and fix . We assume that there exists such that where is the rotation by .

The joint spectrum of in the plane with axes and is a triangle with vertex at , intersecting the -axis on the interval , and the side opposite to the vertex contained in the line . Indeed, one eventually gets this description of because , , can be computed explicitly in terms of the joint spectrum of thanks to the fact that commutes with and . Note that .

Let us now consider , where denotes the rotation by . We claim that for all and, *a fortiori*, is *discontinuous* at (because as ). In fact, given , since , the word

equals . Therefore,

and, by letting , we conclude that , as desired.

**4. Prescribing the joint spectrum**

We close this post with a brief sketch of the following result:

**Theorem 13**

- (1) If is a convex body in , there exists a compact subset of generating a Zariski-dense monoid such that .
- (2) Moreover, if is a convex polyhedron with a finite number of vertices, then there exists a finite subset generating a Zariski-dense monoid such that .

*Proof:* (1) If we forget about the Zariski-denseness condition, then we could take simply . In order to respect the Zariski-density constraint, we fix and we set where is a small neighborhood of the identity. In this way, the monoid generated by is Zariski-dense and it is possible to check that whenever is sufficiently small.

(2) Given a finite set whose convex hull is , we can take where is a finite set with sufficiently many points so that the monoid generated by is Zariski-dense.

### Elgindi’s approximation of the Biot-Savart law

Let be a divergence-free vector field, thus , which we interpret as a velocity field. In this post we will proceed formally, largely ignoring the analytic issues of whether the fields in question have sufficient regularity and decay to justify the calculations. The vorticity field is then defined as the curl of the velocity:

(From a differential geometry viewpoint, it would be more accurate (especially in other dimensions than three) to define the vorticity as the exterior derivative of the musical isomorphism of the Euclidean metric applied to the velocity field ; see these previous lecture notes. However, we will not need this geometric formalism in this post.)

Assuming suitable regularity and decay hypotheses of the velocity field , it is possible to recover the velocity from the vorticity as follows. From the general vector identity applied to the velocity field , we see that

and thus (by the commutativity of all the differential operators involved)

Using the Newton potential formula

and formally differentiating under the integral sign, we obtain the Biot-Savart law

This law is of fundamental importance in the study of incompressible fluid equations, such as the Euler equations

since on applying the curl operator one obtains the vorticity equation

and then by substituting (1) one gets an autonomous equation for the vorticity field . Unfortunately, this equation is non-local, due to the integration present in (1).
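For reference, the chain of identities described above can be sketched in standard notation. The symbols $u$ (velocity), $\omega$ (vorticity) and $p$ (pressure) are assumed names, and the labels (1) and (2) are matched to the way the text refers to the Biot-Savart law and the vorticity equation.

```latex
\begin{align*}
\omega &= \nabla \times u, \qquad \nabla \cdot u = 0, \\
\nabla \times (\nabla \times u) &= \nabla(\nabla \cdot u) - \Delta u
  \quad\Longrightarrow\quad \nabla \times \omega = -\Delta u, \\
u &= \nabla \times (-\Delta)^{-1} \omega, \qquad
  (-\Delta)^{-1}\omega(x) = \frac{1}{4\pi} \int_{\mathbb{R}^3} \frac{\omega(y)}{|x-y|}\, dy, \\
u(x) &= \frac{1}{4\pi} \int_{\mathbb{R}^3} \frac{\omega(y) \times (x-y)}{|x-y|^3}\, dy, \tag{1} \\
\partial_t u + (u \cdot \nabla) u &= -\nabla p, \qquad \nabla \cdot u = 0, \\
\partial_t \omega + (u \cdot \nabla) \omega &= (\omega \cdot \nabla) u. \tag{2}
\end{align*}
```

The passage from the Newton potential to (1) uses $\nabla_x |x-y|^{-1} = -(x-y)/|x-y|^3$, so that the curl of $\omega(y)/|x-y|$ in the $x$ variable equals $\omega(y) \times (x-y)/|x-y|^3$.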

In a recent work, it was observed by Elgindi that in a certain regime, the Biot-Savart law can be approximated by a more “low rank” law, which makes the non-local effects significantly simpler in nature. This simplification was carried out in spherical coordinates, and hinged on a study of the invertibility properties of a certain second order linear differential operator in the latitude variable ; however in this post I would like to observe that the approximation can also be seen directly in Cartesian coordinates from the classical Biot-Savart law (1). As a consequence one can also begin Elgindi’s analysis in constructing somewhat regular solutions to the Euler equations that exhibit self-similar blowup in finite time, though I have not attempted to execute the entirety of the analysis in this setting.

Elgindi’s approximation applies under the following hypotheses:

- (i) (Axial symmetry without swirl) The velocity field is assumed to take the form
for some functions of the cylindrical radial variable and the vertical coordinate . As a consequence, the vorticity field takes the form

- (ii) (Odd symmetry) We assume that and , so that .

A model example of a divergence-free vector field obeying these properties (but without good decay at infinity) is the linear vector field

which is of the form (3) with and . The associated vorticity vanishes.

We can now give an illustration of Elgindi’s approximation:

**Proposition 1 (Elgindi’s approximation)** Under the above hypotheses (and assuming suitable regularity and decay), we have the pointwise bounds

for any , where is the vector field (5), and is the scalar function

Thus under the hypotheses (i), (ii), and assuming that is slowly varying, we expect to behave like the linear vector field modulated by a radial scalar function. In applications one needs to control the error in various function spaces instead of pointwise, and with similarly controlled in other function space norms than the norm, but this proposition already gives a flavour of the approximation. If one uses spherical coordinates

then we have (using the spherical change of variables formula and the odd nature of )

where

is the operator introduced in Elgindi’s paper.

*Proof:* By a limiting argument we may assume that is non-zero, and we may normalise . From the triangle inequality we have

and hence by (1)

In the regime we may perform the Taylor expansion

Since

we see from the triangle inequality that the error term contributes to . We thus have

where is the constant term

and are the linear term

By the hypotheses (i), (ii), we have the symmetries

The even symmetry (8) ensures that the integrand in is odd, so vanishes. The symmetry (6) or (7) similarly ensures that , so vanishes. Since , we conclude that

Using (4), the right-hand side is

where . Because of the odd nature of , only those terms with one factor of give a non-vanishing contribution to the integral. Using the rotation symmetry we also see that any term with a factor of also vanishes. We can thus simplify the above expression as

Using the rotation symmetry again, we see that the term in the first component can be replaced by or by , and similarly for the term in the second component. Thus the above expression is

giving the claim.

**Example 2** Consider the divergence-free vector field , where the vector potential takes the form

for some bump function supported in . We can then calculate

and

In particular the hypotheses (i), (ii) are satisfied with

One can then calculate

If we take the specific choice

where is a fixed bump function supported in some interval and is a small parameter (so that is spread out over the range ), then we see that

(with implied constants allowed to depend on ),

and

which is completely consistent with Proposition 1.

One can use this approximation to extract a plausible ansatz for a self-similar blowup to the Euler equations. We let be a small parameter and let be a time-dependent vorticity field obeying (i), (ii) of the form

where and is a smooth field to be chosen later. Admittedly the signum function is not smooth at , but let us ignore this issue for now (to rigorously make an ansatz one will have to smooth out this function a little bit; Elgindi uses the choice , where ). With this ansatz one may compute

By Proposition 1, we thus expect to have the approximation

We insert this into the vorticity equation (2). The transport term will be expected to be negligible because , and hence , is slowly varying (the discontinuity of will not be encountered because the vector field is parallel to this singularity). The modulating function is similarly slowly varying, so derivatives falling on this function should be lower order. Neglecting such terms, we arrive at the approximation

and so in the limit we expect to obtain a simple model equation for the evolution of the vorticity envelope :

If we write for the logarithmic primitive of , then we have and hence

which integrates to the Riccati equation

which can be explicitly solved as

where is any function of that one pleases. (In Elgindi’s work a time dilation is used to remove the unsightly factor of appearing here in the denominator.) If for instance we set , we obtain the self-similar solution

and then on applying

Thus, we expect to be able to construct a self-similar blowup to the Euler equations with a vorticity field approximately behaving like

and velocity field behaving like

In particular, would be expected to be of regularity (and smooth away from the origin), and blows up in (say) norm at time , and one has the self-similarity

and

A self-similar solution of this approximate shape is in fact constructed rigorously in Elgindi’s paper (using spherical coordinates instead of the Cartesian approach adopted here), using a nonlinear stability analysis of the above ansatz. It seems plausible that one could also carry out this stability analysis using this Cartesian coordinate approach, although I have not tried to do this in detail.
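The integration step for the Riccati equation in the ansatz above can be sketched in general form. Here $y$ stands for the unknown, $c$ for a non-zero constant and $f$ for the constant of integration; all three are placeholder symbols rather than the notation of the post.

```latex
\begin{align*}
\partial_t y = c\, y^2
\;&\Longrightarrow\; \partial_t\!\left(\frac{1}{y}\right) = -\frac{\partial_t y}{y^2} = -c \\
\;&\Longrightarrow\; \frac{1}{y(t)} = f - c\,t
\;\Longrightarrow\; y(t) = \frac{1}{f - c\,t}.
\end{align*}
```

The solution blows up at time $t = f/c$, and $f$ may depend on any variable other than $t$, which is the freedom of choice mentioned in the text.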

### Dartyge’s talk on ellipsephic integers

Last November, I attended the beautiful conference Prime Numbers, Determinism and Pseudorandomness at CIRM. This conference was originally prepared to celebrate the 60th birthday of Christian Mauduit, but unfortunately a tragic event during the summer of 2019 turned it into a celebration of the memory of Christian.

The links to the titles, abstracts, slides and videos for the talks of this excellent meeting can be found here.

In this blog post, I would like to transcribe my notes for the amazing survey talk “On ellipsephic integers” by Cécile Dartyge on one of Christian’s favorite topics in Analytic Number Theory, namely, the statistics of integers missing some digits.

Of course, all mistakes in the sequel are my sole responsibility.

**1. Introduction**

*Ellipsephic integers* refers to a collection of integers with missing digits in a certain basis (e.g., all integers whose representation in basis 10 doesn’t contain the digit 9). Christian Mauduit proposed this nomenclature partly because ellipsis = missing and psiphic = digit in Greek.

Formally, we consider a basis , , and a subset of of cardinality . The corresponding set of ellipsephic integers is

The subset of ellipsephic integers below a certain threshold is denoted by

For the sake of exposition, we shall assume from now on that and

unless it is explicitly said otherwise.
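The definition is easy to experiment with. Below is a minimal sketch that enumerates the ellipsephic integers below a threshold; the function names and the sample choice (base 10, digit 9 forbidden, taken from the example in the introduction) are illustrative, and the code assumes that 0 belongs to the set of allowed digits.

```python
def base_digits(n, g):
    """Digits of n >= 0 in base g, least significant first."""
    if n == 0:
        return [0]
    out = []
    while n > 0:
        n, d = divmod(n, g)
        out.append(d)
    return out

def is_ellipsephic(n, g, allowed):
    """True when every base-g digit of n lies in the allowed set."""
    return all(d in allowed for d in base_digits(n, g))

def ellipsephic_below(x, g, allowed):
    """List the ellipsephic integers n with 0 <= n < x."""
    return [n for n in range(x) if is_ellipsephic(n, g, allowed)]

# Integers below 10^3 whose decimal expansion avoids the digit 9.
no_nines = ellipsephic_below(10**3, 10, set(range(9)))
print(len(no_nines))  # 9**3 = 729 digit strings of length 3 avoid the digit 9
```

The count illustrates the sparseness of these sets: below the k-th power of the base there are only as many ellipsephic integers as the k-th power of the number of allowed digits.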

**2. Ellipsephic integers on arithmetic progressions**

Let . Despite their sparseness, it was proved by Erdös, Mauduit and Sárközy that ellipsephic integers behave well (i.e., “à la Siegel-Walfisz”) along arithmetic progressions:

**Theorem 1 (Erdös–Mauduit–Sárközy)** *There are two constants and such that*

*for all , , and sufficiently large.*

*Proof:* As it is usual in this kind of counting problem, one relies on exponential sums. More precisely, note that

where . The “main term” comes from the case , so that our task consists in estimating the “error term”. To this end, one essentially has to study

where . Observe that

The terms are controlled thanks to the following lemma (giving some saving over the trivial bound for all ):

**Lemma 2 (Erdös–Mauduit–Sárközy)** *Let . For any , one has*

*where .*

In order to take full advantage of the saving on the right-hand side of the inequality, one needs the following lemma:

**Lemma 3 (Mauduit–Sárközy)** *For any and , one has*

The details of the derivation of the desired theorem from the two lemmas above are explained in Section 4 of the Erdös–Mauduit–Sárközy paper.
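A structural fact behind exponential-sum arguments of this kind is that the sum over ellipsephic integers with at most k digits factorizes as a product over digit positions. The sketch below checks this standard factorization numerically; it assumes that 0 is an allowed digit (so that shorter integers can be padded with zeros), and the function names and test parameters are illustrative choices.

```python
import cmath

def e(x):
    """The additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)

def is_ellipsephic(n, g, allowed):
    """True when every base-g digit of n lies in the allowed set."""
    while n > 0:
        n, d = divmod(n, g)
        if d not in allowed:
            return False
    return True

def direct_sum(alpha, g, allowed, k):
    """Sum of e(alpha * n) over ellipsephic n < g^k, by brute force."""
    return sum(e(alpha * n) for n in range(g**k)
               if is_ellipsephic(n, g, allowed))

def product_formula(alpha, g, allowed, k):
    """Product over digit positions j < k of sum_{d allowed} e(alpha * d * g^j)."""
    result = 1
    for j in range(k):
        result *= sum(e(alpha * d * g**j) for d in allowed)
    return result

digits = set(range(9))  # base 10 without the digit 9
lhs = direct_sum(1 / 7, 10, digits, 3)
rhs = product_formula(1 / 7, 10, digits, 3)
print(abs(lhs - rhs))
```

The factorization is what makes bounds on the individual digit sums, such as the saving over the trivial bound mentioned above, multiply up across positions.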

The methods of Erdös–Mauduit–Sárközy above paved the way to further results about ellipsephic integers. For instance, similarly to the Bombieri–Vinogradov theorem, it is natural to expect that the distribution result of Erdös–Mauduit–Sárközy gets better on average: as it turns out, this was done independently by C. Dartyge and C. Mauduit, and S. Konyagin (circa 2000):

**Theorem 4 (Dartyge–Mauduit, Konyagin)** *There exists such that for all there exists with the property that*

*Proof:* One uses Lemmas 2 and 3 above, a large sieve method, and some bounds on the moments

of the function .

**Remark 1** *More recently, K. Aloui, C. Mauduit and M. Mkaouar improved (in 2017) some of the results of Erdös–Mauduit–Sárközy to obtain some distribution results for ellipsephic and palindromic integers.*

**3. Ellipsephic primes and almost primes**

By pursuing sieve methods, Dartyge and Mauduit obtained in 2001 the following result about ellipsephic almost primes:

**Theorem 5 (Dartyge–Mauduit)** *There exists such that*

*where stands for the number of prime factors of (counted with multiplicity).*
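To make the objects in this statement concrete, here is a minimal sketch that lists ellipsephic integers with a bounded number of prime factors counted with multiplicity. The function names, the base, the digit set and the bound are hypothetical choices for illustration and are unrelated to the explicit constants of the theorem.

```python
def big_omega(n):
    """Number of prime factors of n >= 1, counted with multiplicity."""
    count, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    return count + (1 if n > 1 else 0)

def is_ellipsephic(n, g, allowed):
    """True when every base-g digit of n lies in the allowed set."""
    while n > 0:
        n, d = divmod(n, g)
        if d not in allowed:
            return False
    return True

def ellipsephic_almost_primes(x, g, allowed, k):
    """Ellipsephic integers 2 <= n < x with at most k prime factors."""
    return [n for n in range(2, x)
            if is_ellipsephic(n, g, allowed) and big_omega(n) <= k]

sample = ellipsephic_almost_primes(200, 10, {0, 1, 2, 3}, 2)
print(sample[:10])
```

Taking `k = 1` in this sketch lists ellipsephic primes, the case discussed later in this section.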

A natural question motivated by this theorem concerns the determination of explicit values of in the previous statement. The answer to this question is somewhat related to the value of in the last theorem of the previous section and, in this direction, it is possible to show that

- if , then one can take
- (and ) for ,
- (and ) for , …,
- for , and
- as

- if , , then one can take .

In 2009 and 2010, C. Mauduit and J. Rivat proved two conjectures of Gelfond on sums of digits of primes and squares. The methods in these articles gave hope to reach the case (of ellipsephic primes) in the Dartyge–Mauduit theorem above. This was recently accomplished by J. Maynard in 2016: if and , then

**Remark 2** *In his thesis, A. Irving got analogous results for palindromic ellipsephic integers with digits in basis with two prime factors.*

After this brief discussion of ellipsephic almost primes, let us now talk about ellipsephic integers possessing only small prime factors.

**4. Friable ellipsephic integers**

Recall that a friable integer is an integer without large prime factors. For later reference, we denote the largest prime factor of by

It was shown by Erdös–Mauduit–Sárközy that, for any fixed and for all , there are *infinitely many* ellipsephic integers of the form whose largest prime factor is .

Naturally, this result motivates the question of establishing the existence of a *positive proportion* of friable ellipsephic integers. This seems a hard task for *arbitrary* , but this problem becomes more tractable for small values of when the basis is large enough.

In fact, S. Col showed that there exists such that

Moreover, if , then it is possible to take (which is close to one for large). On the other hand, can be taken very small when and sufficiently large.
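Friable ellipsephic integers can be listed by brute force for small parameters. The following is a minimal sketch; the function names and the sample parameters (base 10, digit 9 forbidden, friability bound 5) are hypothetical and only serve to illustrate the definitions.

```python
def largest_prime_factor(n):
    """Largest prime factor of n >= 2 (returns 1 for n = 1)."""
    largest, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            largest = p
            n //= p
        p += 1
    return n if n > 1 else largest

def is_ellipsephic(n, g, allowed):
    """True when every base-g digit of n lies in the allowed set."""
    while n > 0:
        n, d = divmod(n, g)
        if d not in allowed:
            return False
    return True

def friable_ellipsephic(x, y, g, allowed):
    """Ellipsephic integers 1 <= n < x whose prime factors are all <= y."""
    return [n for n in range(1, x)
            if is_ellipsephic(n, g, allowed) and largest_prime_factor(n) <= y]

smooth = friable_ellipsephic(100, 5, 10, set(range(9)))
print(smooth)
```

Comparing `len(smooth)` with the total number of ellipsephic integers below the same threshold gives an empirical proportion, although small experiments of this kind say nothing about the asymptotic regime of the results above.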

**5. Ellipsephic solutions to Vinogradov systems**

A Vinogradov system is a system of equations on the variables of the form:

A major breakthrough on counting solutions to Vinogradov systems was famously obtained by J. Bourgain, C. Demeter and L. Guth (see also the text and the video of L. Pierce’s Bourbaki seminar talk on this subject).

Concerning ellipsephic solutions to Vinogradov systems (i.e., solutions with for all ), Kirsty Briggs showed that for , prime and , the trivial bound on the number of solutions of the Vinogradov system

with can be improved into whenever . (In particular, this result is saying that in the case , the main contribution to the number of ellipsephic solutions of the corresponding Vinogradov system comes from the trivial solutions .)
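The counting problem can be explored by brute force for very small parameters. The sketch below counts solutions of the classical Vinogradov system, in which the sums of j-th powers of the two groups of variables agree for every j up to k, with all variables drawn from a finite set. The exact system and ranges in Briggs's result are not reproduced here, and the function names and parameters are illustrative.

```python
from collections import Counter
from itertools import product

def is_ellipsephic(n, g, allowed):
    """True when every base-g digit of n lies in the allowed set."""
    while n > 0:
        n, d = divmod(n, g)
        if d not in allowed:
            return False
    return True

def count_vinogradov_solutions(values, s, k):
    """Count (x_1..x_s, y_1..y_s) in values^(2s) with equal power sums for 1 <= j <= k."""
    signatures = Counter(
        tuple(sum(x**j for x in xs) for j in range(1, k + 1))
        for xs in product(values, repeat=s)
    )
    # Two s-tuples solve the system exactly when their signatures coincide.
    return sum(c * c for c in signatures.values())

# Ellipsephic integers below 100 in base 10 with digits in {0, 1, 2}.
values = [n for n in range(100) if is_ellipsephic(n, 10, {0, 1, 2})]
total = count_vinogradov_solutions(values, 2, 2)
print(len(values), total)
```

With two variables on each side and k = 2, equal sums and equal sums of squares force equal products, so only the trivial solutions (the second pair is a permutation of the first) survive; for a set of size N this gives 2N² − N solutions.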

**6. Ellipsephic numbers in finite fields**

The notion of finite-field analogs of ellipsephic numbers was studied by several authors including Dartyge, Mauduit and Sárközy.

In order to explain some results in this direction, let us set up some notation. Let be the power of a prime number and denote by a primitive element generating a basis of over . In this way, we can represent a number as

Given a set of digits with , the associated subset of ellipsephic numbers is

Given a polynomial , we can study the set of its ellipsephic values via the set

The size of is described by the following theorem:

**Theorem 6 (Dartyge–Mauduit–Sárközy)** *If , then*

This result is especially interesting when contains a positive proportion of . Moreover, it can be improved when contains consecutive digits.

More recently, a better result was obtained by R. Dietmann, C. Elsholtz and I. Shparlinski for the case . Finally, the reader can consult the work of C. Swaenepoel for further results.

### Breuillard–Sert’s joint spectrum (II)

In the previous post of this series, we gave the statements of some of the results of Breuillard and Sert on the definition and basic properties of the joint spectrum, and we promised to discuss the proofs in subsequent posts.

Today, after a long hiatus, I’ll try to keep part of this promise. More precisely, I’ll transcribe below my notes for the two talks (by Rodolfo Gutiérrez-Romo and myself) aiming to explain to the participants of our *groupe de travail* the proof of the first portion of Theorem 5 in the previous post, i.e., the convergence of (cf. Theorem 5 below), the convergence of (cf. Theorem 7 below), and the equality of the limits (cf. Theorem 9 below).

Evidently, all mistakes in what follows are my sole responsibility.

**1. Spectral radius formula revisited**

Let be a reductive linear algebraic group. Recall that the Cartan projection and Jordan projection were defined in the previous post via the Cartan decomposition and the Jordan–Chevalley decomposition with elliptic, unipotent, and “hyperbolic” conjugated to .

The *semisimple rank* of is where is a maximal torus of and is the center of . We denote by , a system of roots such that is a base of simple roots.

Each induces a weight satisfying and for all , where is the Lie subalgebra of and is a fixed extension of the Killing form on the Lie subalgebra of to the Lie algebra of such that becomes an orthogonal decomposition.

The weights , , are the highest weights of distinguished representations , of . One has

where is a choice of -invariant norm with diagonalisable in an orthonormal basis and stable under the adjoint operation, and denotes the top eigenvalue of a matrix . In particular, the Cartan projection is represented by a vector of logarithms of norms of matrices, the Jordan projection is represented by a vector of the logarithms of the moduli of top eigenvalues of matrices, and, *a fortiori*, the usual formula for the spectral radius implies that:

*for every .*
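The spectral radius formula can be observed numerically in the 2×2 case, where the normalized logarithm of the norm of a power approaches the logarithm of the spectral radius. This is a minimal sketch, assuming the operator norm as the norm and two sample matrices chosen for illustration; the function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def log_norm(a):
    """Log of the operator norm (top singular value) of a 2x2 matrix."""
    f = sum(x * x for row in a for x in row)        # sigma_1^2 + sigma_2^2
    d = a[0][0] * a[1][1] - a[0][1] * a[1][0]       # |d| = sigma_1 * sigma_2
    top_sq = (f + math.sqrt(max(f * f - 4 * d * d, 0.0))) / 2
    return 0.5 * math.log(top_sq)

def cartan_rate(g, n):
    """Return (1/n) * log ||g^n||."""
    power = [[1, 0], [0, 1]]
    for _ in range(n):
        power = matmul(power, g)
    return log_norm(power) / n

unipotent = [[1, 1], [0, 1]]    # spectral radius 1, but norm > 1
hyperbolic = [[2, 1], [1, 1]]   # spectral radius (3 + sqrt(5)) / 2

for n in (1, 10, 100, 1000):
    print(n, cartan_rate(unipotent, n))
print(cartan_rate(hyperbolic, 50))
```

The unipotent example shows why the limit is needed: its norm is larger than 1 for every power, yet the normalized logarithm decays like log(n)/n towards the logarithm of the spectral radius, which is 0.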

**2. Proximal elements and Cartan projections**

As we indicated in § 2.1 of the previous post of this series, the convergence of relies on the notion of *proximal matrices*.

**Definition 2** *Let be the Fubini-Study metric on the projective space of a finite-dimensional real vector space equipped with a Euclidean norm .* Given , we say that is a -proximal matrix whenever:

- has a unique eigenvalue of maximal modulus with eigendirection and -invariant supplementary hyperplane ;
- ;
- for all with and .

In general, we say that an element of a reductive linear algebraic group with distinguished representations , , is –*proximal* when the matrices are -proximal for all .

A basic feature of proximal elements is the fact that their Cartan and Jordan projections are comparable (cf. Lemmas 2.15 and 2.16 of the Breuillard–Sert paper, extracted from Benoist’s paper).

**Lemma 3** *There is a constant such that*

*and*

*for all .*

For each , there is a constant such that the Cartan and Jordan projections of any -proximal element satisfy

Another crucial feature of proximal elements (discovered by Abels, Margulis and Soifer, see Theorem 4.1 of their paper) is their ubiquity in Zariski dense monoids:

**Theorem 4 (Abels–Margulis–Soifer)** *Let be a connected, reductive, real Lie group. Suppose that is a Zariski dense monoid. Then, there exists such that, for all , there exists a finite subset with the property: given , there exists so that is -proximal.*

At this point, we are ready to prove the convergence of Cartan projections:

**Theorem 5** *Let be a connected reductive real Lie group and suppose that is a compact subset generating a Zariski dense subgroup. Then,*

*converges in the Hausdorff topology as .*

*Proof:* By Lemma 2 of the previous post, our task reduces to showing that , , stays in a compact region of , and for each , there exists such that for all and .

By Lemma 3, there exists a constant such that

for all . It follows that for all , that is, is confined in a compact region of .

Let us now estimate for , say with . By the Abels–Margulis–Soifer theorem (Theorem 4), we can select a *finite* subset of the monoid generated by such that for each , there exists so that is -proximal. In particular, we can take such that is -proximal for all . By Lemma 3, we have

and

Since , it follows from the triangle inequality that

Therefore, if we fix and we write , , we can use the Euclidean division , to obtain an element of via the formula

It follows from the definitions and Lemma 3 that

Since

by taking (or equivalently ) we derive that

Hence, given , there exists such that

for all , . This completes the proof.

**3. Twisting and Jordan projections**

A Zariski dense monoid of matrices is *twisting* in the sense that it always contains an element putting a finite configuration of lines and hyperplanes in general positions:

**Lemma 6** *Let be a connected, reductive Lie group and suppose that is a Zariski dense monoid. Given a finite collection , , of irreducible representations of and finite configurations and , , of points and hyperplanes in , there is an element such that*

*for all and .*

*Proof:* Since are irreducible, the sets are non-empty and Zariski open in . Thus,

is Zariski open in and non-empty (because is connected). Since is Zariski dense,

This completes the argument.

**Remark 1** *The conclusion of this lemma can be reinforced as follows (cf. Remark 2.22 of Breuillard–Sert paper): it is possible to select from a finite subset of depending only on and (but not on and ).*

At this stage, we can start the discussion of the convergence of Jordan projections:

**Theorem 7** *Let be a connected reductive real Lie group and suppose that is a compact subset generating a Zariski dense subgroup. Then,*

*converges in the Hausdorff topology as .*

*Proof:* Similarly to the previous section (on convergence of Cartan projections), our task consists in showing that for each , there exists such that

for all , . In this direction, let us fix and let us take , . By the formula for the spectral radius (cf. Lemma 1), we can fix with

By the Abels–Margulis–Soifer theorem (Theorem 4), we can fix a finite subset of the monoid generated by such that for some , we have that is -proximal.

By Lemma 3,

where , and

for all .

Consider the distinguished representations , , of . Note that the dominant eigendirection and the dominated hyperplane for the actions of the proximal matrices on are the same for all .

We fix . By the twisting property in Lemma 6, there exists , say , such that

for all and .

The dynamics of projective actions of the iterates of a proximal matrix is easy to describe: any direction transverse to is attracted towards . By rendering this argument slightly more quantitative (with the aid of the so-called *Tits proximality criterion*), Breuillard and Sert proved in Lemma 3.6 of their paper that

**Lemma 8** *If is -proximal and is a finite subset such that*

*for all and , then there exists such that for all and , one has that is -proximal for all .*

By applying this lemma with , we can select such that is -proximal for all , and .

Once again, it follows from Lemma 3 that

and

for all and .

By Euclidean division, we can write with and define

From our discussion above, we derive that

Since and

by letting (or equivalently, ) we conclude that

for all . This completes the proof.

**4. Coincidence of the limits**

Let be a connected, reductive real Lie group and let be a compact subset generating a Zariski dense monoid. By Theorems 5 and 7, we have that

as .

*Proof:* By the formula for the spectral radius (cf. Lemma 1), for all , one has as . In particular, .

In order to derive the other inclusion, we recall that the proof of Theorem 5 about the convergence of Cartan projections revealed that there exists and a constant such that for all and , there exists with and

Therefore,

Since , we have that is bounded and, *a fortiori*,

as . This shows that , as desired.
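The spectral radius formula (Lemma 1) used in this proof can be checked numerically in the simplest possible setting. The following is a minimal sketch, assuming we work in the group of invertible real 2x2 matrices and only track the top component of each projection: the logarithm of the operator norm (Cartan side) and the logarithm of the spectral radius (Jordan side). The matrix `g` and all helper names are illustrative choices, not taken from the post.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def log_operator_norm(a):
    """log of the largest singular value, via the eigenvalues of a^T a."""
    p = a[0][0] ** 2 + a[1][0] ** 2
    q = a[0][0] * a[0][1] + a[1][0] * a[1][1]
    r = a[0][1] ** 2 + a[1][1] ** 2
    trace, det = p + r, p * r - q * q
    top = (trace + math.sqrt(max(trace * trace - 4 * det, 0))) / 2
    return 0.5 * math.log(top)

def log_spectral_radius(a):
    """log of the largest modulus of an eigenvalue."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = trace * trace - 4 * det
    if disc < 0:                      # complex conjugate pair of modulus sqrt(det)
        return 0.5 * math.log(det)
    root = math.sqrt(disc)
    return math.log(max(abs(trace + root), abs(trace - root)) / 2)

def scaled_cartan(a, n):
    """(1/n) * log ||a^n||."""
    power = a
    for _ in range(n - 1):
        power = matmul(power, a)
    return log_operator_norm(power) / n

g = [[2, 5], [0, 1]]                  # non-normal, so ||g|| exceeds the spectral radius
for n in (1, 5, 20, 40):
    print(n, scaled_cartan(g, n) - log_spectral_radius(g))
```

The printed gap is positive and shrinks roughly like a constant over `n`, which is the one-dimensional shadow of the boundedness statement used in the proof.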

**5. Realization of the joint spectrum by sequences**

Closing this post, let us further discuss Theorem 5 from the previous post by showing that with is realized by a single sequence in the sense that .

To this end, we use the Abels–Margulis–Soifer theorem (Theorem 4) and the strong version in Remark 1 of the twisting property in Lemma 6 to select a finite subset of and some constants such that for each there are with the property that is a *Schottky family* in the sense that is -proximal, and and for all . (This nomenclature comes from the fact that the projective actions of the elements in a Schottky family resemble the classical Schottky groups.) Note that where .

Let us now choose a rapidly increasing sequence so that

for all , and we define by

By definition, any finite word has the form where and is a prefix of . Observe that

By Lemma 3, . Moreover, Lemma 2.17 in the Breuillard–Sert paper ensures that the Schottky property for the family implies that is a -proximal element with

Therefore, it follows from Lemma 3 that

Since converges to , we conclude that converges to as .

### 254A, Notes 10 – mean values of nonpretentious multiplicative functions

Let us call an arithmetic function $f: \mathbb{N} \to \mathbb{C}$ *$1$-bounded* if we have $|f(n)| \leq 1$ for all $n \in \mathbb{N}$. In this section we focus on the asymptotic behaviour of $1$-bounded multiplicative functions. Some key examples of such functions include:

- The Möbius function $\mu$;
- The Liouville function $\lambda$;
- “Archimedean” characters $n \mapsto n^{it}$ (which I call Archimedean because they are pullbacks of a Fourier character $x \mapsto x^{it}$ on the multiplicative group $\mathbb{R}^+$, which has the Archimedean property);
- Dirichlet characters (or “non-Archimedean” characters) $\chi$ (which are essentially pullbacks of Fourier characters on a multiplicative cyclic group $(\mathbb{Z}/q\mathbb{Z})^\times$ with the discrete (non-Archimedean) metric);
- Hybrid characters $n \mapsto \chi(n) n^{it}$.

The space of $1$-bounded multiplicative functions is also closed under multiplication and complex conjugation.

Given a multiplicative function $f$, we are often interested in the asymptotics of long averages such as

$$\frac{1}{X} \sum_{n \leq X} f(n)$$

for large values of $X$, as well as short sums

$$\frac{1}{H} \sum_{x \leq n \leq x+H} f(n)$$

where $H$ and $x$ are both large, but $H$ is significantly smaller than $x$. (Throughout these notes we will try to normalise most of the sums and integrals appearing here as averages that are trivially bounded by $O(1)$; note that other normalisations are preferred in some of the literature cited here.) For instance, as we established in Theorem 58 of Notes 1, the prime number theorem is equivalent to the assertion that

$$\frac{1}{X} \sum_{n \leq X} \mu(n) = o(1) \qquad (1)$$

as $X \to \infty$. The Liouville function behaves almost identically to the Möbius function, in that estimates for one function almost always imply analogous estimates for the other:

**Exercise 1** Without using the prime number theorem, show that (1) is also equivalent to

$$\frac{1}{X} \sum_{n \leq X} \lambda(n) = o(1) \qquad (2)$$

as $X \to \infty$. (Hint: use the identities $\lambda(n) = \sum_{d^2 | n} \mu(n/d^2)$ and $\mu(n) = \sum_{d^2 | n} \mu(d) \lambda(n/d^2)$.)
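As a numerical sanity check (not a proof), one can tabulate the Möbius and Liouville functions with a sieve, verify the first identity from the hint of Exercise 1, and look at the size of the two long averages. This is a minimal sketch; the function names and the cutoff `100_000` are arbitrary choices made for illustration.

```python
def mobius_and_liouville(limit):
    """Return lists mu, lam with mu[n], lam[n] for 1 <= n <= limit (index 0 unused)."""
    spf = list(range(limit + 1))              # smallest prime factor of each n
    for p in range(2, int(limit ** 0.5) + 1):
        if spf[p] == p:
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    mu = [0] * (limit + 1)
    lam = [0] * (limit + 1)
    mu[1] = lam[1] = 1
    for n in range(2, limit + 1):
        p = spf[n]
        m = n // p
        lam[n] = -lam[m]                      # lambda is completely multiplicative
        mu[n] = 0 if m % p == 0 else -mu[m]   # mu vanishes when p^2 divides n
    return mu, lam

def liouville_from_mobius(n, mu):
    """Right-hand side of the identity lambda(n) = sum over d^2 | n of mu(n / d^2)."""
    total, d = 0, 1
    while d * d <= n:
        if n % (d * d) == 0:
            total += mu[n // (d * d)]
        d += 1
    return total

limit = 100_000
mu, lam = mobius_and_liouville(limit)
print("identity holds up to 2000:",
      all(liouville_from_mobius(n, mu) == lam[n] for n in range(1, 2001)))
print("mean of mu :", sum(mu[1:]) / limit)
print("mean of lam:", sum(lam[1:]) / limit)
```

Both means come out very close to zero at this scale, in line with (1) and its Liouville analogue, although of course no finite computation says anything about the limit.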

Henceforth we shall focus our discussion more on the Liouville function, and turn our attention to averages on shorter intervals. From (2) one has

as if is such that for some fixed . However, it is significantly more difficult to understand what happens when grows much slower than this. By using the techniques based on zero density estimates discussed in Notes 6, it was shown by Motohashi and that one can also establish (3). Assuming the Riemann Hypothesis, Maier and Montgomery lowered the threshold to for an absolute constant (the bound is more classical, following from Exercise 33 of Notes 2). By contrast, the randomness heuristics from Supplement 4 suggest that should be able to be taken as small as , and perhaps even if one is particularly optimistic about the accuracy of these probabilistic models. On the other hand, the Chowla conjecture (mentioned for instance in Supplement 4) predicts that cannot be taken arbitrarily slowly growing in , due to the conjectured existence of arbitrarily long strings of consecutive numbers where the Liouville function does not change sign (and in fact one can already show from the known partial results towards the Chowla conjecture that (3) fails for some sequence and some sufficiently slowly growing , by modifying the arguments in these papers of mine).

The situation is better when one asks to understand the mean value on *almost all* short intervals, rather than all intervals. There are several equivalent ways to formulate this question:

**Exercise 2** Let be a function of such that and as . Let be a -bounded function. Show that the following assertions are equivalent:

As it turns out the second moment formulation in (iii) will be the most convenient for us to work with in this set of notes, as it is well suited to Fourier-analytic techniques (and in particular the Plancherel theorem).

Using zero density methods, for instance, it was shown by Ramachandra that

whenever and . With this quality of bound (saving arbitrary powers of over the trivial bound of ), this is still the lowest value of one can reach unconditionally. However, in a striking recent breakthrough, it was shown by Matomaki and Radziwill that as long as one is willing to settle for weaker bounds (saving a small power of or , or just a qualitative decay of ), one can obtain non-trivial estimates on far shorter intervals. For instance, they show

**Theorem 3 (Matomaki-Radziwill theorem for Liouville)** For any , one has

for some absolute constant .

In fact they prove a slightly more precise result: see Theorem 1 of that paper. In particular, they obtain the asymptotic (4) for *any* function that goes to infinity as , no matter how slowly! This ability to let grow slowly with is important for several applications; for instance, in order to combine this type of result with the entropy decrement methods from Notes 9, it is essential that be allowed to grow more slowly than . See also this survey of Soundararajan for further discussion.
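To get a feel for the second moment controlled by Theorem 3, one can compute a discrete analogue of it for the Liouville function. The sketch below assumes the quantity of interest is the average over `x` in `[X, 2X]` of the squared short average of `lambda` over `(x, x+H]`; the exact normalisation in the theorem is not reproduced here, and the function names and parameters are illustrative.

```python
def liouville_table(limit):
    """Return lam with lam[n] = Liouville function at n for 1 <= n <= limit."""
    spf = list(range(limit + 1))              # smallest prime factor sieve
    for p in range(2, int(limit ** 0.5) + 1):
        if spf[p] == p:
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0, 1] + [0] * (limit - 1)
    for n in range(2, limit + 1):
        lam[n] = -lam[n // spf[n]]
    return lam

def short_interval_second_moment(lam, X, H):
    """Average over x in [X, 2X] of |(1/H) * sum_{x < n <= x+H} lam(n)|^2."""
    prefix = [0] * len(lam)
    for n in range(1, len(lam)):
        prefix[n] = prefix[n - 1] + lam[n]
    total = 0.0
    for x in range(X, 2 * X + 1):
        average = (prefix[x + H] - prefix[x]) / H
        total += average * average
    return total / (X + 1)

X = 20_000
lam = liouville_table(2 * X + 200)            # table must reach 2X + H
for H in (1, 10, 100):
    print(H, short_interval_second_moment(lam, X, H))
```

For `H = 1` the moment equals the trivial bound of one, and it drops quickly as `H` grows, which is the qualitative behaviour the theorem asserts for any `H` tending to infinity.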

**Exercise 4** In this exercise you may use Theorem 3 freely.

- (i) Establish the lower bound
for some absolute constant and all sufficiently large . (*Hint:* if this bound failed, then would hold for almost all ; use this to create many intervals for which is extremely large.)
- (ii) Show that Theorem 3 also holds with replaced by , where is the principal character of period . (Use the fact that for all .) Use this to establish the corresponding upper bound
to (i).

(There is a curious asymmetry to the difficulty level of these bounds; the upper bound in (ii) was established much earlier by Harman, Pintz, and Wolke, but the lower bound in (i) was only established in the Matomaki-Radziwill paper.)

The techniques discussed previously were highly complex-analytic in nature, relying in particular on the fact that functions such as or have Dirichlet series , that extend meromorphically into the critical strip. In contrast, the Matomaki-Radziwill theorem does *not* rely on such meromorphic continuations, and in fact holds for more general classes of $1$-bounded multiplicative functions , for which one typically does not expect any meromorphic continuation into the strip. Instead, one can view the Matomaki-Radziwill theory as following the philosophy of a slightly different approach to multiplicative number theory, namely the *pretentious multiplicative number theory* of Granville and Soundararajan (as presented for instance in their draft monograph). A basic notion here is the *pretentious distance* between two $1$-bounded multiplicative functions (at a given scale ), which informally measures the extent to which “pretends” to be like (or vice versa). The precise definition is

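**Definition 5 (Pretentious distance)** Given two $1$-bounded multiplicative functions $f, g$, and a threshold $X > 0$, the *pretentious distance* $\mathbb{D}(f,g;X)$ between $f$ and $g$ up to scale $X$ is given by the formula

$$\mathbb{D}(f,g;X) := \left( \sum_{p \leq X} \frac{1 - \mathrm{Re}(f(p) \overline{g(p)})}{p} \right)^{1/2}.$$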

Note that one can also define an infinite version of this distance by removing the constraint , though in such cases the pretentious distance may then be infinite. The pretentious distance is not quite a metric (because can be non-zero, and furthermore can vanish without being equal), but it is still quite close to behaving like a metric, in particular it obeys the triangle inequality; see Exercise 16 below. The philosophy of pretentious multiplicative number theory is that two -bounded multiplicative functions will exhibit similar behaviour at scale if their pretentious distance is bounded, but will become uncorrelated from each other if this distance becomes large. A simple example of this philosophy is given by the following “weak Halasz theorem”, proven in Section 2:
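The remarks above can be explored numerically. The following sketch assumes the standard Granville–Soundararajan form of the pretentious distance, namely the square root of the sum over primes `p <= X` of `(1 - Re f(p) conj(g(p))) / p`; functions are represented only through their values at primes, and all names are illustrative.

```python
import cmath
import math

def primes_up_to(x):
    """Sieve of Eratosthenes returning all primes <= x (requires x >= 1)."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return [p for p in range(2, x + 1) if sieve[p]]

def pretentious_distance(f, g, X):
    """Assumed definition: sqrt of sum_{p <= X} (1 - Re(f(p) * conj(g(p)))) / p."""
    total = sum((1 - (f(p) * g(p).conjugate()).real) / p for p in primes_up_to(X))
    return math.sqrt(max(total, 0.0))

def archimedean(t):
    """Values at primes of the Archimedean character n -> n^{it}."""
    return lambda p: cmath.exp(1j * t * math.log(p))

one = lambda p: 1 + 0j        # the constant function 1
liouville = lambda p: -1 + 0j # lambda(p) = mu(p) = -1 at every prime
zero = lambda p: 0j           # a 1-bounded function vanishing at all primes

X = 10_000
print("D(1, 1)      =", pretentious_distance(one, one, X))
print("D(0, 0)      =", pretentious_distance(zero, zero, X))   # non-zero self-distance
print("D(1, lambda) =", pretentious_distance(one, liouville, X))
print("D(1, n^{i})  =", pretentious_distance(one, archimedean(1.0), X))
```

The second line illustrates why this is not quite a metric: a function that vanishes at the primes has positive distance to itself.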

**Proposition 6 (Logarithmically averaged version of Halasz)** Let be sufficiently large. Then for any -bounded multiplicative functions , one has

for an absolute constant .

In particular, if does not pretend to be , then the logarithmic average will be small. This condition is basically necessary, since of course .

If one works with non-logarithmic averages , then not pretending to be is insufficient to establish decay, as was already observed in Exercise 11 of Notes 1: if is an Archimedean character for some non-zero real , then goes to zero as (which is consistent with Proposition 6), but does not go to zero. However, this is in some sense the “only” obstruction to these averages decaying to zero, as quantified by the following basic result:
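The contrast between the two kinds of averages for an Archimedean character is easy to see numerically. This is a minimal sketch for the character `n -> n^{it}` with `t = 1`; the function names are illustrative, and the comparison value `1/|1+it|` comes from the standard approximation of the sum by the integral of `x^{it}`.

```python
import cmath
import math

def plain_average(t, X):
    """(1/X) * sum_{n <= X} n^{it}."""
    return sum(cmath.exp(1j * t * math.log(n)) for n in range(1, X + 1)) / X

def logarithmic_average(t, X):
    """(1/log X) * sum_{n <= X} n^{it} / n."""
    return sum(cmath.exp(1j * t * math.log(n)) / n for n in range(1, X + 1)) / math.log(X)

t = 1.0
for X in (10**3, 10**4, 10**5):
    print(X, abs(plain_average(t, X)), abs(logarithmic_average(t, X)))
```

The modulus of the plain average stays close to `1/|1+it|` (about 0.707 for `t = 1`), while the logarithmic average decays slowly, like a constant over `log X`.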

**Theorem 7 (Halasz’s theorem)** Let be sufficiently large. Then for any -bounded multiplicative function , one has

for an absolute constant and any .

Informally, we refer to a $1$-bounded multiplicative function as “pretentious” if it pretends to be a character such as , and “non-pretentious” otherwise. The precise distinction is rather malleable, as the precise class of characters that one views as “obstructions” varies from situation to situation. For instance, in Proposition 6 it is just the trivial character which needs to be considered, but in Theorem 7 it is the characters with . In other contexts one may also need to add Dirichlet characters or hybrid characters such as to the list of characters that one might pretend to be. The division into pretentious and non-pretentious functions in multiplicative number theory is faintly analogous to the division into major and minor arcs in the circle method applied to additive number theory problems; see Notes 8. The Möbius and Liouville functions are model examples of non-pretentious functions; see Exercise 24.

In the contrapositive, Halasz’ theorem can be formulated as the assertion that if one has a large mean

for some , then one has the pretentious property

for some . This has the flavour of an “inverse theorem”, of the type often found in arithmetic combinatorics.

Among other things, Halasz’s theorem gives yet another proof of the prime number theorem (1); see Section 2.

We now give a version of the Matomaki-Radziwill theorem for general (non-pretentious) multiplicative functions that is formulated in a similar contrapositive (or “inverse theorem”) fashion, though to simplify the presentation we only state a qualitative version that does not give explicit bounds.

**Theorem 8 ((Qualitative) Matomaki-Radziwill theorem)** Let , and let , with sufficiently large depending on . Suppose that is a -bounded multiplicative function such that

Then one has

for some .

The condition is basically optimal, as the following example shows:

**Exercise 9** Let be a sufficiently small constant, and let be such that . Let be the Archimedean character for some . Show that

Combining Theorem 8 with standard non-pretentiousness facts about the Liouville function (see Exercise 24), we recover Theorem 3 (but with a decay rate of only rather than ). We refer the reader to the original paper of Matomaki-Radziwill (as well as this followup paper with myself) for the quantitative version of Theorem 8 that is strong enough to recover the full version of Theorem 3, and which can also handle real-valued pretentious functions.

With our current state of knowledge, the only arguments that can establish the full strength of the Halasz and Matomaki-Radziwill theorems are Fourier-analytic in nature, relating sums involving an arithmetic function with its Dirichlet series

which one can view as a discrete Fourier transform of (or more precisely of the measure , if one evaluates the Dirichlet series on the right edge of the critical strip). In this aspect, the techniques resemble the complex-analytic methods from Notes 2, but with the key difference that no analytic or meromorphic continuation into the strip is assumed. The key identity that allows us to pass to Dirichlet series is the following variant of Proposition 7 of Notes 2:

**Proposition 10 (Parseval type identity)** Let be finitely supported arithmetic functions, and let be a Schwartz function. Then

where is the Fourier transform of . (Note that the finite support of and the Schwartz nature of ensure that both sides of the identity are absolutely convergent.)

The restriction that be finitely supported will be slightly annoying in places, since most multiplicative functions will fail to be finitely supported, but this technicality can usually be overcome by suitably truncating the multiplicative function, and taking limits if necessary.

*Proof:* By expanding out the Dirichlet series, it suffices to show that

for any natural numbers . But this follows from the Fourier inversion formula applied at .

For applications to Halasz type theorems, one sets equal to the Kronecker delta , producing weighted integrals of of “” type. For applications to Matomaki-Radziwill theorems, one instead sets , and more precisely uses the following corollary of the above proposition, to obtain weighted integrals of of “” type:

**Exercise 11 (Plancherel type identity)** If is finitely supported, and is a Schwartz function, establish the identity

In contrast, information about the non-pretentious nature of a multiplicative function will give “pointwise” or “” type control on the Dirichlet series , as is suggested from the Euler product factorisation of .

It will be convenient to formalise the notion of , , and control of the Dirichlet series , which as previously mentioned can be viewed as a sort of “Fourier transform” of :

**Definition 12 (Fourier norms)** Let be finitely supported, and let be a bounded measurable set. We define the *Fourier norm*

the *Fourier norm*

and the *Fourier norm*

One could more generally define norms for other exponents , but we will only need the exponents in this current set of notes. It is clear that all the above norms are in fact (semi-)norms on the space of finitely supported arithmetic functions.

As mentioned above, Halasz’s theorem gives good control on the Fourier norm for restrictions of non-pretentious functions to intervals:

**Exercise 13 (Fourier control via Halasz)** Let be a -bounded multiplicative function, let be an interval in for some , let , and let be a bounded measurable set. Show that

(Hint: you will need to use summation by parts (or an equivalent device) to deal with a weight.)

Meanwhile, the Plancherel identity in Exercise 11 gives good control on the Fourier norm for functions on long intervals (compare with Exercise 2 from Notes 6):

**Exercise 14 ( mean value theorem)** Let , and let be finitely supported. Show that

Conclude in particular that if is supported in for some and , then

In the simplest case of the logarithmically averaged Halasz theorem (Proposition 6), Fourier estimates are already sufficient to obtain decent control on the (weighted) Fourier type expressions that show up. However, these estimates are not enough by themselves to establish the full Halasz theorem or the Matomaki-Radziwill theorem. To get from Fourier control to Fourier or control more efficiently, the key trick is to use Hölder’s inequality, which when combined with the basic Dirichlet series identity

The strategy is then to factor (or approximately factor) the original function as a Dirichlet convolution (or average of convolutions) of various components, each of which enjoys reasonably good Fourier or estimates on various regions , and then combine them using the Hölder inequalities (5), (6) and the triangle inequality. For instance, to prove Halasz’s theorem, we will split into the Dirichlet convolution of three factors, one of which will be estimated in using the non-pretentiousness hypothesis, and the other two being estimated in using Exercise 14. For the Matomaki-Radziwill theorem, one uses a significantly more complicated decomposition of into a variety of Dirichlet convolutions of factors, and also splits up the Fourier domain into several subregions depending on whether the Dirichlet series associated to some of these components are large or small. In each region and for each component of these decompositions, all but one of the factors will be estimated in , and the other in ; but the precise way in which this is done will vary from component to component. For instance, in some regions a key factor will be small in by construction of the region; in other places, the control will come from Exercise 13. Similarly, in some regions, satisfactory control is provided by Exercise 14, but in other regions one must instead use “large value” theorems (in the spirit of Proposition 9 from Notes 6), or amplify the power of the standard mean value theorems by combining the Dirichlet series with other Dirichlet series that are known to be large in this region.

There are several ways to achieve the desired factorisation. In the case of Halasz’s theorem, we can simply work with a crude version of the Euler product factorisation, dividing the primes into three categories (“small”, “medium”, and “large” primes) and expressing as a triple Dirichlet convolution accordingly. For the Matomaki-Radziwill theorem, one instead exploits the Turan-Kubilius phenomenon (Section 5 of Notes 1, or Lemma 2 of Notes 9) that for various moderately wide ranges of primes, the number of prime divisors of a large number in the range is almost always close to . Thus, if we introduce the arithmetic functions

and more generally we have a twisted approximation

for multiplicative functions . (Actually, for technical reasons it will be convenient to work with a smoothed out version of these functions; see Section 3.) Informally, these formulas suggest that the “ energy” of a multiplicative function is concentrated in those regions where is extremely large in a sense. Iterations of this formula (or variants of this formula, such as an identity due to Ramaré) will then give the desired (approximate) factorisation of .
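The Turan-Kubilius phenomenon invoked here can be observed directly. The sketch below counts, for every `n` in `[N, 2N]`, the primes in a range `[P, Q]` dividing `n`, and compares the mean and variance of this count with the sum of `1/p` over the range, which Mertens’ theorem puts near `log(log Q / log P)`. The specific values of `N`, `P`, `Q` are arbitrary illustrative choices.

```python
import math

def primes_in_range(P, Q):
    """All primes p with P <= p <= Q, by a simple sieve up to Q."""
    sieve = bytearray([1]) * (Q + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(Q ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, Q + 1, p)))
    return [p for p in range(max(P, 2), Q + 1) if sieve[p]]

def prime_factor_counts(N, P, Q):
    """counts[i] = number of primes in [P, Q] dividing n = N + i, for 0 <= i <= N."""
    counts = [0] * (N + 1)
    for p in primes_in_range(P, Q):
        first = -(-N // p) * p            # smallest multiple of p that is >= N
        for m in range(first, 2 * N + 1, p):
            counts[m - N] += 1
    return counts

N, P, Q = 100_000, 10, 1000
counts = prime_factor_counts(N, P, Q)
mean = sum(counts) / len(counts)
variance = sum((c - mean) ** 2 for c in counts) / len(counts)
reciprocal_sum = sum(1 / p for p in primes_in_range(P, Q))
print("mean count         :", mean)
print("sum of 1/p         :", reciprocal_sum)
print("log(log Q / log P) :", math.log(math.log(Q) / math.log(P)))
print("variance           :", variance)
```

The variance is comparable to the mean, so the count concentrates around its mean once the range of primes is wide enough for the mean to be large; at the small scale used here the concentration is only beginning to show.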

** — 1. Pretentious distance — **

In this section we explore the notion of pretentious distance. The following Hilbert space lemma will be useful for establishing the triangle inequality for this distance:

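**Lemma 15 (Triangle inequality)** Let $u, v, w$ be vectors in a real Hilbert space $H$ with $\|u\|, \|v\|, \|w\| \leq 1$. Then

$$(1 - \langle u, w \rangle)^{1/2} \leq (1 - \langle u, v \rangle)^{1/2} + (1 - \langle v, w \rangle)^{1/2}.$$

*Proof:* First suppose that $u, v, w$ are unit vectors: $\|u\| = \|v\| = \|w\| = 1$. Then by the cosine rule $1 - \langle u, w \rangle = \frac{1}{2} \|u - w\|^2$, and similarly for $1 - \langle u, v \rangle$ and $1 - \langle v, w \rangle$. The claim now follows from the usual triangle inequality $\|u - w\| \leq \|u - v\| + \|v - w\|$.

Now suppose we are in the general case when $\|u\|, \|v\|, \|w\| \leq 1$. In this case we extend $u, v, w$ to unit vectors by working in the product of $H$ with the Euclidean space $\mathbb{R}^3$ and applying the previous inequality to the extended unit vectors

$$u' := (u, \sqrt{1 - \|u\|^2}, 0, 0), \quad v' := (v, 0, \sqrt{1 - \|v\|^2}, 0), \quad w' := (w, 0, 0, \sqrt{1 - \|w\|^2}),$$

observing that the extensions have the same inner products as the original vectors.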
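The extension trick in this proof is concrete enough to test on random data. The sketch below assumes the lemma has the form `sqrt(1 - <u,w>) <= sqrt(1 - <u,v>) + sqrt(1 - <v,w>)` for vectors of norm at most one, and that each vector is extended by placing `sqrt(1 - |u|^2)` in its own extra coordinate; the helper names are illustrative.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def defect(u, v):
    """sqrt(1 - <u, v>), clamped against rounding error."""
    return math.sqrt(max(1.0 - dot(u, v), 0.0))

def extend_to_unit(u, slot):
    """Append three coordinates, with sqrt(1 - |u|^2) in position `slot` (0, 1 or 2)."""
    extra = [0.0, 0.0, 0.0]
    extra[slot] = math.sqrt(max(1.0 - dot(u, u), 0.0))
    return list(u) + extra

def random_vector_in_ball(rng, dim):
    """A random vector of norm strictly less than one."""
    u = [rng.uniform(-1, 1) for _ in range(dim)]
    scale = rng.random() / max(math.sqrt(dot(u, u)), 1e-12)
    return [scale * a for a in u]

rng = random.Random(0)
worst = float("inf")
for _ in range(1000):
    u, v, w = (random_vector_in_ball(rng, 4) for _ in range(3))
    worst = min(worst, defect(u, v) + defect(v, w) - defect(u, w))
print("smallest slack in the triangle inequality:", worst)
```

Using a separate extra coordinate for each vector is what keeps the pairwise inner products unchanged, so the unit-vector case applies verbatim to the extensions.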

**Exercise 16 (Basic properties of pretentious distance)** Let be -bounded multiplicative functions, and let .

- (i) (Metric type properties) Show that , with equality if and only if and for all primes . Furthermore, show that and . (Hint: for the last property, apply Lemma 15 to a suitable Hilbert space .)
- (ii) (Alternate triangle inequality) Show that .
- (iii) (Bounds) One has
and if , then

- (iv) (Invariance) One has , and
In particular, if for all , then .

**Exercise 17** If are Dirichlet characters of periods respectively induced from the same primitive character, and , show that for some absolute constant (the only purpose of which is to keep the triple logarithm positive). (*Hint:* control the contributions of the primes in each dyadic block separately for .)

Next, we relate pretentious distance to the value of Dirichlet series just to the right of the critical strip. There is an annoying minor technicality that the prime has to be treated separately, but this will not cause too much trouble.

**Lemma 18 (Dirichlet series and pretentious distance)** Let be a -bounded multiplicative function, , and . Then

In particular, we always have the upper bound

and if one imposes the technical condition that either for all or for all , then

If for all and , then we may delete the terms in the above claims.

*Proof:* By replacing with we may assume without loss of generality that . We begin with the first claim (8). By expanding out the Euler product, the left-hand side of (8) is equal to

and from Definition 5 and Mertens’ theorem we have

and so it will suffice on canceling the factor and taking logarithms to show that

For , the quantity differs from by at most . Also we have

and hence by Taylor expansion

By the triangle inequality, it thus suffices to show that

But the first bound follows from the mean value estimate and Mertens’ theorems, while the second bound follows from summing the bounds

that also arise from Mertens’ theorems.

The quantity is bounded in magnitude by , giving (9). Under either of the two technical conditions listed, this quantity is equal to either or , and in either case it is comparable in magnitude to , giving (10).

If for and , we may repeat the above arguments with the terms deleted, since we no longer need to control the tail contribution .

Now we explore the geometry of the Archimedean characters with respect to pretentious distance.

**Proposition 19** If is sufficiently large, then

for . In particular one has

for .

The precise exponent here is not of particular significance; any constant between and would work for our application to the Matomaki-Radziwill theorem, with the most important feature being that grows significantly faster than any fixed power of . The ability to raise the exponent beyond will be provided by the Vinogradov estimates for the zeta function. (For Halasz’s theorem one only needs these bounds in the easier range , which does not require the Vinogradov estimates.) As a particular corollary of this proposition and Exercise 16(iii), we see that

whenever ; thus the Archimedean characters do not pretend to be like each other at all once the parameter is changed by at least a unit distance (but not changed by an enormous amount).

*Proof:* By Definition 5, our task is to show that

We begin with the upper bound. For , the claim follows from Mertens’ theorems and the triangle inequality. For , we bound

and the claim again follows from Mertens’ theorems (note that in this case). For , we bound by for and by for , and the claim once again follows from Mertens’ theorems.

Now we establish the lower bound. We first work in the range . In this case we have a matching lower bound

for and some small absolute constant , and hence

giving the lower bound. Now suppose that . Applying Lemma 18 with and replaced by some , we have

and thus

or equivalently by Mertens’ theorem

Applying this bound with replaced by and by we conclude that

and hence by Mertens’ theorem and the triangle inequality (for small enough)

giving the claim.

Finally, assume that . From Mertens’ theorems we have

so by the triangle inequality it will suffice to show that

Taking logarithms in (11) for we have

and also

hence by the fundamental theorem of calculus

for some . However, from the Vinogradov-Korobov estimates (Exercise 43 of Notes 2) we have

whenever ; since we are assuming , the claim follows.

**Exercise 20** Assume the Riemann hypothesis. Establish a bound of the form

for some absolute constant whenever for a sufficiently large absolute constant . (*Hint:* use Perron’s formula and shift the contour to within of the critical line.) Use this to conclude that the upper bound in Proposition 19 can be relaxed (assuming RH) to .

**Exercise 21** Let be a -bounded multiplicative function with for all . For any , show that

Thus some sort of upper bound on in Proposition 19 is necessary.

**Exercise 22** Let be a non-principal character of modulus , and let be sufficiently large depending on . Show that

for all . (One will need to adapt the Vinogradov-Korobov theory to Dirichlet $L$-functions.)

Proposition 19 measures how close the function lies to the Archimedean characters . Using the triangle inequality, one can then lower bound the distance of any other -bounded multiplicative function to these characters:

**Proposition 23** Let be sufficiently large. Then for any -bounded multiplicative function , there exists a real number with such that

whenever . In particular we have

if and . If is real-valued, one can take .

*Proof:* For the first claim, choose to minimize among all real numbers with . Then for any other , we see from the triangle inequality that

But from Proposition 19 we have

giving the first claim. When is real valued, we can similarly use the triangle inequality to bound

which gives

giving the second claim.

We can now quantify the non-pretentious nature of the Möbius and Liouville functions.

**Exercise 24**

- (i) If is sufficiently large, and is a real-valued $1$-bounded multiplicative function, show that
whenever .

- (ii) Show that
whenever and is sufficiently large.

- (iii) If is a Dirichlet character of some period , show that
whenever and is sufficiently large depending on .

** — 2. Halasz’s inequality — **

We now prove Halasz’s inequality. As a warm up, we prove Proposition 6:

*Proof:* (Proof of Proposition 6) By Exercise 16(iv) we may normalise . We may assume that when , since the value of on these primes has no impact on the sum or on . In particular, from Euler products we now have the absolute convergence . Let be a small quantity to be optimized later, and let be a smooth compactly supported function on that equals one on with the derivative bounds , on , so on integration by parts we see that the Fourier transform obeys the bounds

for any and . From the triangle inequality we have

Applying Proposition 10 to a finite truncation of , and then using the absolute convergence of and dominated convergence to eliminate the truncation (or by using Proposition 7 of Notes 2 and then shifting the contour), we can write the right-hand side as

which after rescaling by gives

Now from Lemma 18 one has

where , and thus we can bound (12) by

if is chosen to be a sufficiently large absolute constant. The claim then follows by optimising in .

We remark that an elementary proof of Proposition 6 with was given in Proposition 1.2.6 of .

It was observed that we can also sharpen Proposition 6 when is non-negative by purely elementary methods:

**Proposition 25** Let , and suppose that is multiplicative, -bounded, and non-negative. Then

*Proof:* From Definition 5 and Mertens’ theorems the estimate is equivalent to

For the upper bound, we may assume that vanishes for since these primes make no contribution to either side. We can then bound

For the lower bound, we let be the -bounded multiplicative function with and for and all primes . Then we observe the pointwise bound for all , hence

By the upper bound just obtained and Mertens’ theorems, we have

and the claim follows.

Now we can prove Theorem 7.

*Proof:* (Proof of Theorem 7) We may assume that , since the claim is trivial otherwise. On the other hand, for for a sufficiently large , the second term on the right-hand side is dominated by the first, so the estimate does not become any stronger as one increases beyond , and hence we may assume without loss of generality that .

We abbreviate , thus we wish to show that

It is convenient to remove some exceptional values of . Let be a small quantity to be chosen later, subject to the restriction

From standard sieves (e.g., Theorem 32 from Notes 4), we see that the proportion of numbers in that do not have a “large” prime factor in , or do not have a “medium” prime factor in , is . Thus by paying an error of , we may restrict to numbers that have at least one “large” prime factor in and at least one “medium” prime factor in (and no prime factor larger than ). This is the same as replacing with the Dirichlet convolution

where is the restriction of to numbers with all prime factors in the “small” range , is the restriction of to numbers in with all prime factors in the “medium” range , and is the restriction of to numbers in with all prime factors in the “large” range . We can thus write

This we can write in turn as

where . It is not advantageous to immediately apply Proposition 10 due to the rough nature of (which is not even Schwartz). But if we let be a Schwartz function of total mass whose Fourier transform is supported on , and define the mollified function

then one easily checks that

which from the triangle inequality soon gives the bounds

Hence we may write

Now we apply Proposition 10 and the triangle inequality to bound this by

But we may factor

and we may rather crudely bound , hence we have

Using the Hölder inequalities (5), (6) we have

From Exercise 14 we have

Note that is supported on and hence is effectively also restricted to the range . From standard sieves (using (15)), we have

and thus

Similarly

Finally, from Lemma 18 one has

Putting all this together, we conclude that

Setting for some sufficiently small constant (which in particular will ensure (15) since ), we obtain the claim.

One can optimise this argument to make the constant in Theorem 7 arbitrarily close to ; see this previous post. With an even more refined argument, one can prove the sharper estimate

with , a result initially due to Montgomery and Tenenbaum; see Theorem 2.3.1 of this text of Granville and Soundararajan. In the case of non-negative , an elementary argument gives the stronger bound ; see Corollary 1.2.3 of . However, the slightly weaker estimates in Theorem 7 will suffice for our purposes.

**Theorem (Wirsing’s theorem)** Let be sufficiently large. Then for any real-valued $1$-bounded multiplicative function , one has

for an absolute constant .

Thus for instance, setting , we can use Wirsing’s theorem and Exercise 24 (or Mertens’ theorem) to recover a form of the prime number theorem with a modestly decaying error term, in that

for all large and some absolute constant . (Admittedly, we did use the far stronger Vinogradov-Korobov estimates earlier in this set of notes; but a careful inspection reveals that those estimates were not used in the proof of (16), so this is a non-circular proof of the prime number theorem.)

** — 3. The Matomaki-Radziwill theorem — **

We now give the proof of the Matomaki-Radziwill theorem, though we will leave several of the details to exercises. We first make a small but convenient reduction:

**Exercise 26** Show that to prove Theorem 8, it suffices to do so for functions that are completely multiplicative. (This is similar to Exercise 1.)

Now we use Exercise 11 to phrase the theorem in an equivalent Fourier form:

**Theorem 27 (Matomaki-Radziwill theorem, Fourier form)** Let , and let , with sufficiently large depending on . Let be a fixed smooth compactly supported function, and set . Suppose that is a -bounded completely multiplicative function such that

for some .

Let us assume Theorem 27 for the moment and see how it implies Theorem 8. In the latter theorem we may assume without loss of generality that is small. We may assume that , since the case follows easily from Theorem 7.

Let be a smooth compactly supported function with on . By hypothesis, we have

Let be a Schwartz function of mean whose Fourier transform is supported on . For any , we consider the expression

A routine calculation using the rapid decrease of shows that

and thus the expression (19) can be estimated as

By Cauchy-Schwarz we then have

Averaging this for and using Fubini’s theorem, we have

and thus from (18) and the triangle inequality we have

On the other hand, from Exercise 11 we have

Applying Theorem 27 (with a slightly smaller value of and ), we obtain the claim.

**Exercise 28** In the converse direction, show that Theorem 27 is a consequence of Theorem 8.

**Exercise 29** Let be supported on , and let . Show that

(Hint: use summation by parts to express as a suitable linear combination of sums and , then use the Cauchy-Schwarz inequality and the Fubini-Tonelli theorem.) Conclude in particular that

where . (This argument is due to Saffari and Vaughan.)

It remains to establish Theorem 27. As before we may assume that is small. Let us call a finitely supported arithmetic function *large* on some subset of if

and *small* on if

Note that a function cannot be simultaneously large and small on the same set ; and if a function is large on some subset , then it remains large on after modifying by any small error (assuming is small enough, and adjusting the implied constants appropriately). From the hypothesis (17) we know that is large on . As discussed in the introduction, the strategy is now to decompose into various regions, and on each of these regions split (up to small errors) as an average of Dirichlet convolutions of other factors which enjoy either good estimates or good estimates on the given region.

We will need the following ranges:

- (i) is the interval
- (ii) is the interval
- (iii) is the interval

We will be able to cover the range just using arguments involving the zeroth interval ; the range can be covered using arguments involving the zeroth interval and the first interval ; and the range can be covered using arguments involving all three intervals . Coverage of the remaining ranges of can be done by an extension of the methods given here and will be left to the exercises at the end of the notes.

We introduce some weight functions and some exceptional sets. For any , let be a bump function on of total mass , and let denote the arithmetic function

supported on the primes . We then define the following subsets of :

- (i) is the set of those such that
for some dyadic (i.e., is restricted to be a power of ).

- (ii) is the set of those such that
for some dyadic .

- (iii) is the set of those such that
for some dyadic .

We will establish the following claims:

- (i) If for some , then is small on .
- (ii) is small on .
- (iii) is small on .
- (iv) If is large on , then one has for some .

Note that parts (i) (with ) and (iv) of the claims are already enough to treat the case ; parts (i) (with ), (ii), and (iv) are enough to treat the case ; and parts (i) (with ), (ii), (iii), and (iv) are enough to treat the case .

We first prove (i). For , let denote the function

This function is a variant of the function introduced in (7). A key point is that the convolutions stay close to :

**Exercise 31 (Turan-Kubilius inequalities)** For , show that

(*Hint:* use the second moment method, as in the proof of Lemma 2 of Notes 9.)
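The second moment method in the hint can be seen numerically. The sketch below counts, for each integer up to `X`, its prime divisors in a window `[P, Q]`, then compares the empirical mean and variance with the sum of `1/p` over primes in the window. It uses the unnormalised count rather than the weight function of the notes, and the names and parameters are assumptions chosen for illustration.

```python
def primes_in(lo, hi):
    """Primes p with lo <= p <= hi, by trial division (fine for small ranges)."""
    return [p for p in range(max(lo, 2), hi + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]


def prime_factor_count_stats(X, P, Q):
    """Statistics of omega_{[P,Q]}(n) = #{primes p in [P,Q] : p | n} over 1 <= n <= X.

    Returns (predicted_mean, empirical_mean, empirical_variance), where
    predicted_mean is the sum of 1/p over primes p in [P, Q].
    """
    primes = primes_in(P, Q)
    counts = [0] * (X + 1)
    for p in primes:
        for m in range(p, X + 1, p):          # every multiple of p gains one factor
            counts[m] += 1
    predicted_mean = sum(1.0 / p for p in primes)
    mean = sum(counts[1:]) / X
    variance = sum((c - mean) ** 2 for c in counts[1:]) / X
    return predicted_mean, mean, variance


predicted, mean, variance = prime_factor_count_stats(100000, 10, 100)
print(predicted, mean, variance)
```

The variance is no larger than the mean, so by Chebyshev's inequality the count is close to its mean for most `n`, which is the Turan-Kubilius phenomenon in this toy range.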

Let . Inserting the bounded factor in the above estimates, and applying Exercise 14, we conclude in particular that the expression

is small on . Since is completely multiplicative, we can write this expression as

We now perform some technical manipulations to move the cutoff to a more convenient location. From (20) we have

We would like to approximate by . A brief triangle inequality calculation using the smoothness of , the -boundedness of , and the narrow support of shows that

where is defined similarly to but with a slightly larger choice of initial cutoff . Integrating this we conclude that

Using Exercise 31 and Exercise 14, the error term is small on . Thus we conclude that

is small on , and hence also on . Thus by the triangle inequality, it will suffice to show that

is small on for each . But by construction we definitely have

while from Exercise 14 and the hypothesis we have

and the claim (i) now follows from (6).

We now jump to (iv). The first observation is that the set is quite small. To quantify this we use the following bound:

**Proposition 32 (Large values of )** Let be sufficiently large depending on , and let . Then for any , the set

has measure at most .

*Proof:* We use the high moment method. Let be a natural number to be optimised later, and let be the convolution of copies of . Then on we have . Thus by Markov’s inequality, the measure of is at most

To bound this we use Exercise 14. If we choose to be the first integer for which , then this exercise gives us the bound

From the fundamental theorem of arithmetic we see that , hence

From the prime number theorem we have . Putting all this together, we conclude that the measure of is at most

Since , we obtain the claim.

Applying this proposition with ranging between and and , and applying the union bound, we see that the measure of is at most . To exploit this, we will need some bounds of Vinogradov-Korobov type:

**Exercise 33 (Vinogradov bounds)** For , establish the bound

for any and . (*Hint:* replace with a weighted version of the von Mangoldt function, apply Proposition 7 of Notes 2, and shift the contour, using the zero free region from Exercise 43 of Notes 2.)

We now use the following variant of the Montgomery-Halasz large values theorem (cf. Proposition 9 from Notes 6):

**Proposition 34 (Montgomery-Halasz for primes)** Let , let , and have measure . Then for any -bounded function , one has

*Proof:* By duality, we may write

for some measurable function with . We can rearrange the right-hand side as

which by Cauchy-Schwarz is bounded in magnitude by

We can rearrange this as

which by the elementary inequality is bounded by

(cf. Lemma 6 from Notes 9). By Exercise 33 we have

The claim follows.

Now we can prove (iv). By hypothesis, is large on . On the other hand, the function (21) (with ) is small on . By the triangle inequality, we conclude that

is large on , hence by the pigeonhole principle, there exists such that

is large on . On the other hand, from Proposition 32 and Proposition 34 we have

hence by (6)

By the pigeonhole principle, this implies that

for some interval . The claim (iv) now follows from Exercise 13. Note that this already concludes the argument in the range .

Now we establish (ii). Here the set is not as well controlled in size as , but is still quite small. Indeed, applying Proposition 32 with ranging between and and , and applying the union bound, we see that the measure of is at most . This bound is too large for Proposition 34 to apply, but we may instead use a different bound:

**Exercise 35 (Montgomery-Halasz for integers)** Let , and let have measure . For any -bounded function supported on , show that

(It is possible to remove the logarithmic loss here by being careful, but this loss will be acceptable for our arguments. One can either repeat the arguments used to prove Proposition 34, or else appeal to Proposition 9 from Notes 6.)

The point here is that we can get good bounds even when the function is supported at narrower scales (such as ) than the Fourier interval under consideration (such as or ). In particular, this exercise will serve as a replacement for Exercise 14, which will not give good estimates in this case.

As before, the function (21) is small on , so it will suffice by the triangle inequality to show that

is small on for all . From Exercise 35, we have

while from the definition of we have

and the claim now follows from (6). Note that this already concludes the argument in the range .

Finally, we establish (iii). The function (21) (with ) is small on , so by the triangle inequality as before it suffices to show that

is small on for all . On the one hand, the definition of gives a good bound on :

To conclude using (6) we now need a good bound for . Unfortunately, the function is now supported on too short of an interval for Exercise 14 to give good estimates, and is too large for Exercise 35 to be applicable either.

But from the definition we see that for , we have

for at least one . We can use this to amplify the power of Exercise 14:

**Proposition 36 (Amplified mean value estimate)** One has .

*Proof:* For each , let denote the set of where (23) holds. Since there are values of , and , it suffices by the union bound to show that

(say) for each . Let be the first integer for which the quantity , thus

From (23) we have the pointwise bound

and hence by Exercise 14

where . As is -bounded, and the summand only vanishes when , we can bound the right-hand side by

where denotes the set of primes in the interval .

Suppose has prime factors in this interval (counting multiplicity). Then vanishes unless , in which case we can bound

and

Thus we may bound the above sum by

By the prime number theorem, has elements, so by double counting we have

and thus the previous bound becomes

which sums to

Since

we thus have

so that , we obtain the claim.

Combining this proposition with (22) and (6), we conclude part (iii) of Proposition 30. This establishes Theorem 27 up to the range .

**Exercise 37** Show that for any fixed , Theorem 27 holds in the range

where denotes the -fold iterated logarithm of . (*Hint:* this is already accomplished for . For higher , one has to introduce additional exceptional intervals and extend Proposition 30 appropriately.)

**Exercise 38** Establish Theorem 27 (and hence Theorem 8) in full generality. (This is the preceding exercise, but now with potentially as large as , where the inverse tower exponential function is defined as the least for which . Now one has to start tracking dependence on of all the arguments in the above analysis; in particular, the convenient notation of arithmetic functions being “large” or “small” needs to be replaced with something more precise.)

### Maryam Mirzakhani New Frontiers Prize

Just a short post to announce that nominations are now open for the Maryam Mirzakhani New Frontiers Prize, which is a newly announced annual $50,000 award from the Breakthrough Prize Foundation presented to early-career, women mathematicians who have completed their PhDs within the past two years, and recognizes outstanding research achievement. (I will be serving on the prize committee.) Nominations for this (and other breakthrough prizes) can be made at this page.

### Eigenvectors from Eigenvalues: a survey of a basic identity in linear algebra

Peter Denton, Stephen Parke, Xining Zhang, and I have just uploaded to the arXiv a completely rewritten version of our previous paper, now titled “Eigenvectors from Eigenvalues: a survey of a basic identity in linear algebra”. This paper is now a survey of the various literature surrounding the following basic identity in linear algebra, which we propose to call the *eigenvector-eigenvalue identity*:

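**Theorem 1 (Eigenvector-eigenvalue identity)** Let $A$ be an $n \times n$ Hermitian matrix, with eigenvalues $\lambda_1(A), \dots, \lambda_n(A)$. Let $v_i$ be a unit eigenvector corresponding to the eigenvalue $\lambda_i(A)$, and let $v_{i,j}$ be the $j^{\mathrm{th}}$ component of $v_i$. Then

$$|v_{i,j}|^2 \prod_{k=1; k \neq i}^{n} (\lambda_i(A) - \lambda_k(A)) = \prod_{k=1}^{n-1} (\lambda_i(A) - \lambda_k(M_j))$$

where $M_j$ is the $(n-1) \times (n-1)$ Hermitian matrix formed by deleting the $j^{\mathrm{th}}$ row and column from $A$.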
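The identity is easy to test on a small example. The following minimal sketch checks it for one hypothetical real symmetric 3x3 matrix whose eigenpairs are known in closed form, so that only the standard library is needed. The matrix, the helper names, and the restriction to 3x3 are assumptions made for illustration.

```python
import math

# Hypothetical test case: the 3x3 second-difference matrix.
A = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
R2 = math.sqrt(2.0)
EIGENVALUES = [2.0 - R2, 2.0, 2.0 + R2]
EIGENVECTORS = [                      # unit eigenvectors, in the same order
    [0.5, R2 / 2.0, 0.5],
    [1.0 / R2, 0.0, -1.0 / R2],
    [0.5, -R2 / 2.0, 0.5],
]


def is_unit_eigenpair(lam, v, tol=1e-12):
    """Check that A v = lam v and |v| = 1 for the hard-coded data."""
    Av = [sum(A[r][c] * v[c] for c in range(3)) for r in range(3)]
    unit = abs(sum(x * x for x in v) - 1.0) < tol
    return unit and all(abs(Av[r] - lam * v[r]) < tol for r in range(3))


def minor(matrix, j):
    """Delete row j and column j (0-based) from a square matrix."""
    return [[x for c, x in enumerate(row) if c != j]
            for r, row in enumerate(matrix) if r != j]


def eig_sym_2x2(m):
    """Eigenvalues of a real symmetric 2x2 matrix via the quadratic formula."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    centre = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return [centre - radius, centre + radius]


def identity_sides(i, j):
    """Both sides of |v_{i,j}|^2 prod_{k != i} (l_i(A) - l_k(A)) = prod_k (l_i(A) - l_k(M_j))."""
    lam = EIGENVALUES[i]
    lhs = EIGENVECTORS[i][j] ** 2
    for k, mu in enumerate(EIGENVALUES):
        if k != i:
            lhs *= lam - mu
    rhs = 1.0
    for mu in eig_sym_2x2(minor(A, j)):
        rhs *= lam - mu
    return lhs, rhs


for i in range(3):
    for j in range(3):
        print(i, j, identity_sides(i, j))
```

The check only determines the magnitudes of the eigenvector components, not their signs or phases, which matches what the identity itself provides.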

When we posted the first version of this paper, we were unaware of previous appearances of this identity in the literature; a related identity had been used by Erdos-Schlein-Yau and by myself and Van Vu for applications to random matrix theory, but to our knowledge this specific identity appeared to be new. Even two months after our preprint first appeared on the arXiv in August, we had only learned of one other place in the literature where the identity showed up (by Forrester and Zhang, who also cite an earlier paper of Baryshnikov).

The situation changed rather dramatically with the publication of a popular science article in Quanta on this identity in November, which gave this result significantly more exposure. Within a few weeks we became informed (through private communication, online discussion, and exploration of the citation tree around the references we were alerted to) of over three dozen places where the identity, or some other closely related identity, had previously appeared in the literature, in such areas as numerical linear algebra, various aspects of graph theory (graph reconstruction, chemical graph theory, and walks on graphs), inverse eigenvalue problems, random matrix theory, and neutrino physics. As a consequence, we have decided to completely rewrite our article in order to collate this crowdsourced information, and survey the history of this identity, all the known proofs (we collect seven distinct ways to prove the identity (or generalisations thereof)), and all the applications of it that we are currently aware of. The citation graph of the literature that this *ad hoc* crowdsourcing effort produced is only very weakly connected, which we found surprising:

The earliest explicit appearance of the eigenvector-eigenvalue identity we are now aware of is in a 1966 paper of Thompson, although this paper is only cited (directly or indirectly) by a fraction of the known literature, and also there is a precursor identity of Löwner from 1934 that can be shown to imply the identity as a limiting case. At the end of the paper we speculate on some possible reasons why this identity only achieved a modest amount of recognition and dissemination prior to the November 2019 Quanta article.

### Hédi Daboussi, in memoriam

I was very sad to learn today from R. de la Bretèche and É. Fouvry that Hédi Daboussi passed away yesterday. Hédi played an important part in my life; he was the first actual analytic number theorist that I met, one day at IHP in 1987, at the beginning of my first bachelor-thesis style project. (This was before the internet was widely available.) Fouvry and he advised me on this project, which was devoted to the large sieve, especially the proof of Selberg based on the Beurling functions. They also introduced me to Henryk Iwaniec, who was visiting Orsay at the time (in fact, the meeting at IHP was organized to coincide with a talk of Iwaniec).

Daboussi is probably best known outside the French analytic number theory community for two things: his elegant elementary proof of the Prime Number Theorem, found in 1983, which does not use Selberg’s identity, and which is explained in the nice book of Mendès-France and Tenenbaum, and the “Rencontres de théorie élémentaire et analytique des nombres”, which he organized for a long time as a weekly seminar in Paris, before they were transformed, after his retirement, into (roughly) monthly meetings, which are still known as the “Journées Daboussi”, and are organized by Régis de la Bretèche. The first of these were two days of a meeting in 2006 in honor of Hédi.

For me, the original Monday seminar organized by Daboussi was especially memorable, both because I gave my first “real” mathematics lecture there (I think that it was about my bachelor project), and also because on another occasion (either the same time or close to that), I first met Philippe Michel in Hédi’s seminar. It is very obvious to me that, without him, my life would have been very different, and I will always remember him because of that.

### An uncountable Moore-Schmidt theorem

Asgar Jamneshan and I have just uploaded to the arXiv our paper “An uncountable Moore-Schmidt theorem”. This paper revisits a classical theorem of Moore and Schmidt in measurable cohomology of measure-preserving systems. To state the theorem, let be a probability space, and be the group of measure-preserving automorphisms of this space, that is to say the invertible bimeasurable maps that preserve the measure : . To avoid some ambiguity later in this post when we introduce abstract analogues of measure theory, we will refer to measurable maps as *concrete measurable maps*, and measurable spaces as *concrete measurable spaces*. (One could also call a concrete probability space, but we will not need to do so here as we will not be working explicitly with abstract probability spaces.)

Let be a discrete group. A *(concrete) measure-preserving action* of on is a group homomorphism from to , thus is the identity map and for all . A large portion of ergodic theory is concerned with the study of such measure-preserving actions, especially in the classical case when is the integers (with the additive group law).

Let be a compact Hausdorff abelian group, which we can endow with the Borel -algebra . A *(concrete measurable) –cocycle* is a collection of concrete measurable maps obeying the *cocycle equation*

for -almost every . (Here we are glossing over a measure-theoretic subtlety that we will return to later in this post – see if you can spot it before then!) Cocycles arise naturally in the theory of group extensions of dynamical systems; in particular (and ignoring the aforementioned subtlety), each cocycle induces a measure-preserving action on (which we endow with the product of with Haar probability measure on ), defined by

This connection with group extensions was the original motivation for our study of measurable cohomology, but is not the focus of the current paper.

A special case of a -valued cocycle is a *(concrete measurable) -valued coboundary*, in which for each takes the special form

for -almost every , where is some measurable function; note that (ignoring the aforementioned subtlety), every function of this form is automatically a concrete measurable -valued cocycle. One of the first basic questions in measurable cohomology is to try to characterize which -valued cocycles are in fact -valued coboundaries. This is a difficult question in general. However, there is a general result of Moore and Schmidt that at least allows one to reduce to the model case when is the unit circle , by taking advantage of the Pontryagin dual group of characters , that is to say the collection of continuous homomorphisms to the unit circle. More precisely, we have

**Theorem 1 (Countable Moore-Schmidt theorem)** Let be a discrete group acting in a concrete measure-preserving fashion on a probability space . Let be a compact Hausdorff abelian group. Assume the following additional hypotheses:

- (i) is at most countable.
- (ii) is a standard Borel space.
- (iii) is metrisable.

Then a -valued concrete measurable cocycle is a concrete coboundary if and only if for each character , the -valued cocycles are concrete coboundaries.

The hypotheses (i), (ii), (iii) are saying in some sense that the data are not too “large”; in all three cases they are saying in some sense that the data are only “countably complicated”. For instance, (iii) is equivalent to being second countable, and (ii) is equivalent to being modeled by a complete separable metric space. It is because of this restriction that we refer to this result as a “countable” Moore-Schmidt theorem. This theorem is a useful tool in several other applications, such as the Host-Kra structure theorem for ergodic systems; I hope to return to these subsequent applications in a future post.
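The definitions of cocycle and coboundary can be made concrete in a finite toy model, where there are no null sets to worry about. The sketch below takes the space to be the cyclic group Z/N with the shift map, the acting group to be the integers, and the target group to be Z/M. All of these choices and all function names are assumptions made for illustration. It checks the cocycle equation and searches by brute force for a transfer function exhibiting a cocycle as a coboundary.

```python
from itertools import product

N = 3   # the space is Z/N, acted on by the shift x -> x + 1 (hypothetical toy system)
M = 2   # the target group is Z/M, a finite (hence compact) abelian group


def rho(rho1, gamma, x):
    """Extend rho_1 to rho_gamma for gamma >= 0 by summing along the orbit."""
    return sum(rho1[(x + i) % N] for i in range(gamma)) % M


def satisfies_cocycle_equation(rho1, max_gamma=6):
    """Check rho_{g1+g2}(x) = rho_{g1}(T^{g2} x) + rho_{g2}(x) for small g1, g2 >= 0."""
    return all(
        rho(rho1, g1 + g2, x)
        == (rho(rho1, g1, (x + g2) % N) + rho(rho1, g2, x)) % M
        for g1 in range(max_gamma)
        for g2 in range(max_gamma)
        for x in range(N)
    )


def find_transfer_function(rho1):
    """Brute-force search for F with rho_1(x) = F(x + 1) - F(x) in Z/M.

    Returns such an F as a tuple, or None if rho is not a coboundary.
    """
    for F in product(range(M), repeat=N):
        if all((F[(x + 1) % N] - F[x]) % M == rho1[x] for x in range(N)):
            return F
    return None


print(find_transfer_function((1, 1, 0)))   # sum around the cycle is 0 mod 2
print(find_transfer_function((1, 0, 0)))   # sum around the cycle is 1 mod 2
```

In this model a cocycle is a coboundary exactly when its generator sums to zero around the cycle, which is the simplest instance of the obstruction that the Moore-Schmidt theorem reduces to the circle-valued case.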

Let us very briefly sketch the main ideas of the proof of Theorem 1. Ignore for now issues of measurability, and pretend that something that holds almost everywhere in fact holds everywhere. The hard direction is to show that if each is a coboundary, then so is . By hypothesis, we then have an equation of the form

for all and some functions , and our task is then to produce a function for which

for all .

Comparing the two equations, the task would be easy if we could find an for which

for all . However there is an obstruction to this: the left-hand side of (3) is additive in , so the right-hand side would have to be also in order to obtain such a representation. In other words, for this strategy to work, one would have to first establish the identity

for all . On the other hand, the good news is that if we somehow manage to obtain the equation, then we can obtain a function obeying (3), thanks to Pontryagin duality, which gives a one-to-one correspondence between and the homomorphisms of the (discrete) group to .

Now, it turns out that one cannot derive the equation (4) directly from the given information (2). However, the left-hand side of (2) is additive in , so the right-hand side must be also. Manipulating this fact, we eventually arrive at

In other words, we don’t get to show that the left-hand side of (4) vanishes, but we do at least get to show that it is -invariant. Now let us assume for sake of argument that the action of is ergodic, which (ignoring issues about sets of measure zero) basically asserts that the only -invariant functions are constant. So now we get a weaker version of (4), namely

for some constants .

Now we need to eliminate the constants. This can be done by the following group-theoretic projection. Let denote the space of concrete measurable maps from to , up to almost everywhere equivalence; this is an abelian group where the various terms in (5) naturally live. Inside this group we have the subgroup of constant functions (up to almost everywhere equivalence); this is where the right-hand side of (5) lives. Because is a divisible group, there is an application of Zorn’s lemma (a good exercise for those who are not acquainted with these things) to show that there exists a retraction , that is to say a group homomorphism that is the identity on the subgroup . We can use this retraction, or more precisely the complement , to eliminate the constant in (5). Indeed, if we set

then from (5) we see that

while from (2) one has

and now the previous strategy works with replaced by . This concludes the sketch of proof of Theorem 1.
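For orientation, the chain of equations in this sketch can be written out as follows. This is a reconstruction from the surrounding prose, not a quotation: the symbols $\rho_\gamma$ (cocycle), $T^\gamma$ (action), $\hat k$ (character), $\alpha_{\hat k}$, $F$, $c_{\hat k_1, \hat k_2}$, $w$ and $\tilde \alpha_{\hat k}$ are assumed names, the circle is written additively, and the equation numbers are chosen to match the references (2) to (5) in the text.

```latex
% (2): each character cocycle is a coboundary
\hat k(\rho_\gamma(x)) = \alpha_{\hat k}(T^\gamma x) - \alpha_{\hat k}(x)
% goal: the cocycle itself is a coboundary
\rho_\gamma(x) = F(T^\gamma x) - F(x)
% (3): what would make the goal immediate
\hat k(F(x)) = \alpha_{\hat k}(x)
% (4): additivity needed for (3)
\alpha_{\hat k_1 + \hat k_2}(x) - \alpha_{\hat k_1}(x) - \alpha_{\hat k_2}(x) = 0
% what (2) actually gives: invariance of the defect
(\alpha_{\hat k_1 + \hat k_2} - \alpha_{\hat k_1} - \alpha_{\hat k_2})(T^\gamma x)
  = (\alpha_{\hat k_1 + \hat k_2} - \alpha_{\hat k_1} - \alpha_{\hat k_2})(x)
% (5): by ergodicity the defect is constant
\alpha_{\hat k_1 + \hat k_2}(x) - \alpha_{\hat k_1}(x) - \alpha_{\hat k_2}(x) = c_{\hat k_1, \hat k_2}
% correction by the retraction w onto the constants
\tilde \alpha_{\hat k} := (1 - w)\alpha_{\hat k} = \alpha_{\hat k} - w(\alpha_{\hat k})
% the corrected functions are additive and still solve (2)
\tilde \alpha_{\hat k_1 + \hat k_2} - \tilde \alpha_{\hat k_1} - \tilde \alpha_{\hat k_2} = 0,
\qquad
\hat k(\rho_\gamma(x)) = \tilde \alpha_{\hat k}(T^\gamma x) - \tilde \alpha_{\hat k}(x)
```

The last line uses that $w$ is a homomorphism fixing constants, so $(1-w)$ annihilates $c_{\hat k_1, \hat k_2}$, and that $w(\alpha_{\hat k})$ is a constant, so it cancels in the difference $\tilde \alpha_{\hat k}(T^\gamma x) - \tilde \alpha_{\hat k}(x)$.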

In making the above argument rigorous, the hypotheses (i)-(iii) are used in several places. For instance, to reduce to the ergodic case one relies on the ergodic decomposition, which requires the hypothesis (ii). Also, most of the above equations only hold outside of a set of measure zero, and the hypothesis (i) and the hypothesis (iii) (which is equivalent to being at most countable) are used to avoid the problem that an uncountable union of sets of measure zero could have positive measure (or fail to be measurable at all).

My co-author Asgar Jamneshan and I are working on a long-term project to extend many results in ergodic theory (such as the aforementioned Host-Kra structure theorem) to “uncountable” settings in which hypotheses analogous to (i)-(iii) are omitted; thus we wish to consider actions of uncountable groups, on spaces that are not standard Borel, and cocycles taking values in groups that are not metrisable. Such uncountable contexts naturally arise when trying to apply ergodic theory techniques to combinatorial problems (such as the inverse conjecture for the Gowers norms), as one often relies on the ultraproduct construction (or something similar) to generate an ergodic theory translation of these problems, and these constructions usually give “uncountable” objects rather than “countable” ones. (For instance, the ultraproduct of finite groups is a hyperfinite group, which is usually uncountable.) This paper marks the first step in this project by extending the Moore-Schmidt theorem to the uncountable setting.

If one simply drops the hypotheses (i)-(iii) and tries to prove the Moore-Schmidt theorem, several serious difficulties arise. We have already mentioned the loss of the ergodic decomposition and the possibility that one has to control an uncountable union of null sets. But there is in fact a more basic problem when one deletes (iii): the addition operation , while still continuous, can fail to be measurable as a map from to ! Thus for instance the sum of two measurable functions need not remain measurable, which makes even the very definition of a measurable cocycle or measurable coboundary problematic (or at least unnatural). This phenomenon is known as the *Nedoma pathology*. A standard example arises when is the uncountable torus , endowed with the product topology. Crucially, the Borel -algebra generated by this uncountable product is *not* the product of the factor Borel -algebras (the discrepancy ultimately arises from the fact that topologies permit uncountable unions, but -algebras do not); relating to this, the product -algebra is *not* the same as the Borel -algebra , but is instead a strict sub-algebra. If the group operations on were measurable, then the diagonal set

would be measurable in . But it is an easy exercise in manipulation of -algebras to show that if are any two measurable spaces and is measurable in , then the fibres of are contained in some countably generated subalgebra of . Thus if were -measurable, then all the points of would lie in a single countably generated -algebra. But the cardinality of such an algebra is at most while the cardinality of is , and Cantor’s theorem then gives a contradiction.

To resolve this problem, we give a coarser -algebra than the Borel -algebra, namely the *Baire -algebra* , thus coarsening the measurable space structure on to a new measurable space . In the case of compact Hausdorff abelian groups, can be defined as the -algebra generated by the characters ; for more general compact abelian groups, one can define as the -algebra generated by all continuous maps into metric spaces. This -algebra is equal to when is metrisable but can be smaller for other . With this measurable structure, becomes a measurable group; it seems that once one leaves the metrisable world that is a superior (or at least equally good) space to work with than for analysis, as it avoids the Nedoma pathology. (For instance, from Plancherel’s theorem, we see that if is the Haar probability measure on , then (thus, every -measurable set is equivalent modulo -null sets to a -measurable set), so there is no damage to Plancherel caused by passing to the Baire -algebra.)

Passing to the Baire -algebra fixes the most severe problems with an uncountable Moore-Schmidt theorem, but one is still faced with an issue of having to potentially take an uncountable union of null sets. To avoid this sort of problem, we pass to the framework of *abstract measure theory*, in which we remove explicit mention of “points” and can easily delete all null sets at a very early stage of the formalism. In this setup, the category of concrete measurable spaces is replaced with the larger category of *abstract measurable spaces*, which we formally define as the opposite category of the category of -algebras (with Boolean algebra homomorphisms). Thus, we define an *abstract measurable space* to be an object of the form , where is an (abstract) -algebra and is a formal placeholder symbol that signifies use of the opposite category, and an *abstract measurable map* is an object of the form , where is a Boolean algebra homomorphism and is again used as a formal placeholder; we call the *pullback map* associated to . [UPDATE: It turns out that this definition of a measurable map led to technical issues. In a forthcoming revision of the paper we also impose the requirement that the abstract measurable map be -complete (i.e., it respects countable joins).] The composition of two abstract measurable maps , is defined by the formula , or equivalently .
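The reversal of arrows in this definition is already visible for concrete maps between finite sets, where the pullback of a map is simply the preimage operation on subsets. The following minimal sketch checks that preimages respect unions and complements and that the pullback of a composite is the composite of the pullbacks in the opposite order. The sets, maps, and helper names are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def subsets(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe, key=repr)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def pullback(f, domain):
    """Return f^*, sending a subset E of the codomain to its preimage in the domain."""
    def f_star(E):
        return frozenset(x for x in domain if f[x] in E)
    return f_star


# Hypothetical concrete maps f: X -> Y and g: Y -> Z, stored as dictionaries.
X = frozenset({0, 1, 2, 3})
Y = frozenset({"a", "b", "c"})
Z = frozenset({True, False})
f = {0: "a", 1: "a", 2: "b", 3: "c"}
g = {"a": True, "b": False, "c": True}
g_after_f = {x: g[f[x]] for x in X}

f_star = pullback(f, X)
g_star = pullback(g, Y)
composite_star = pullback(g_after_f, X)

# Contravariance: (g o f)^* = f^* o g^* on every subset of Z.
contravariant = all(composite_star(E) == f_star(g_star(E)) for E in subsets(Z))

# f^* is a Boolean algebra homomorphism: it respects unions and complements.
homomorphism = all(
    f_star(E | F) == f_star(E) | f_star(F) and f_star(Y - E) == X - f_star(E)
    for E in subsets(Y) for F in subsets(Y)
)
print(contravariant, homomorphism)
```

In the abstract setting one keeps only the homomorphism between the algebras of sets and forgets the underlying points, which is what makes it possible to quotient out null sets.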

Every concrete measurable space can be identified with an abstract counterpart , and similarly every concrete measurable map can be identified with an abstract counterpart , where is the pullback map . Thus the category of concrete measurable spaces can be viewed as a subcategory of the category of abstract measurable spaces. The advantage of working in the abstract setting is that it gives us access to more spaces that could not be directly defined in the concrete setting. Most importantly for us, we have a new abstract space, the *opposite measure algebra* of , defined as where is the ideal of null sets in . Informally, is the space with all the null sets removed; there is a canonical abstract embedding map , which allows one to convert any concrete measurable map into an abstract one . One can then define the notion of an abstract action, abstract cocycle, and abstract coboundary by replacing every occurrence of the category of concrete measurable spaces with their abstract counterparts, and replacing with the opposite measure algebra ; see the paper for details. Our main theorem is then

**Theorem 2 (Uncountable Moore-Schmidt theorem)** Let be a discrete group acting abstractly on a -finite measure space . Let be a compact Hausdorff abelian group. Then a -valued abstract measurable cocycle is an abstract coboundary if and only if for each character , the -valued cocycles are abstract coboundaries.

With the abstract formalism, the proof of the uncountable Moore-Schmidt theorem is almost identical to the countable one (in fact we were able to make some simplifications, such as avoiding the use of the ergodic decomposition). A key tool is what we call a “conditional Pontryagin duality” theorem, which asserts that if one has an abstract measurable map for each obeying the identity for all , then there is an abstract measurable map such that for all . This is derived from the usual Pontryagin duality and some other tools, most notably the completeness of the -algebra of , and the Sikorski extension theorem.

We feel that it is natural to stay within the abstract measure theory formalism whenever dealing with uncountable situations. However, it is still an interesting question as to when one can guarantee that the abstract objects constructed in this formalism are representable by concrete analogues. The basic questions in this regard are:

- (i) Suppose one has an abstract measurable map into a concrete measurable space. Does there exist a representation of by a concrete measurable map ? Is it unique up to almost everywhere equivalence?
- (ii) Suppose one has a concrete cocycle that is an abstract coboundary. When can it be represented by a concrete coboundary?

For (i) the answer is somewhat interesting (as I learned after posing this MathOverflow question):

- If does not separate points, or is not compact metrisable or Polish, there can be counterexamples to uniqueness. If is not compact or Polish, there can be counterexamples to existence.
- If is a compact metric space or a Polish space, then one always has existence and uniqueness.
- If is a compact Hausdorff abelian group, one always has existence.
- If is a complete measure space, then one always has existence (from a theorem of Maharam).
- If is the unit interval with the Borel -algebra and Lebesgue measure, then one has existence for all compact Hausdorff assuming the continuum hypothesis (from a theorem of von Neumann) but existence can fail under other extensions of ZFC (from a theorem of Shelah, using the method of forcing).
- For more general , existence for all compact Hausdorff is equivalent to the existence of a lifting from the -algebra to (or, in the language of abstract measurable spaces, the existence of an abstract retraction from to ).
- It is a long-standing open question (posed for instance by Fremlin) whether it is relatively consistent with ZFC that existence holds whenever is compact Hausdorff.

Our understanding of (ii) is much less complete:

- If is metrisable, the answer is “always” (which among other things establishes the countable Moore-Schmidt theorem as a corollary of the uncountable one).
- If is at most countable and is a complete measure space, then the answer is again “always”.

In view of the answers to (i), I would not be surprised if the full answer to (ii) was also sensitive to axioms of set theory. However, such set theoretic issues seem to be almost completely avoided if one sticks with the abstract formalism throughout; they only arise when trying to pass back and forth between the abstract and concrete categories.