Euler, Erdős

Random addendum

By Jean Pierre Mutanguha

Published: Mon 01 May 2017 Modified: Mon 01 May 2017

tags: Dynamics Group Theory

After writing the previous post, I came across several related things that I thought I'd like to share. This serves as an appendix of sorts. The first is an answer to the question I posed at the end of that post:

Is there a situation where we know that random objects have a given property but we still have no example of an object with said property?

Like I said before, for properties like being trascendental, it is easy to use a counting argument to show that there are uncountable transcendental numbers but proving that a particular number is transcendental is rather involved. It has been done however. So I was asking for a property shared by almost all numbers but for which we have no explicit example.

About a week after asking the question I got an answer from an unexpected place. PBS Infinite Series did a video on exchanging the digits of \( \pi \) and \( e \). Around the 8:30 mark, she discusses normal numbers and being normal is precisely the kind of property I was looking for!

Let's focus on the set \( (0,1) \). Pick a number \( r \) in this set and write its decimal representation. There's some ambiguity here since some rational numbers have two representations: 1 = 0.999... You should choose the terminating one in this case but in the end you will notice it won't matter either way. If the representation terminates, we can ask, what is the proportion of a given digit in the presentation? For example: if \( r = .123 \) then \( \frac{1}{3} \) of the digits are \( 1 \), \( 2 \), \( 3 \) and the other 7 digits do not appear. Let \( d \in { 0, \ldots, 9} \) denote a digit and \( P_r(d) \) the proportion of \( d \) in the (finite) decimal representation of \( r \). So in our example,

\[ P_{.123}(d) =\begin{cases} \frac{1}{3}, & d = 1,2,3 \\ 0, & \text{otherwise} \end{cases} \]

For another example, let \( r = .0123456789 \) then \( P_r(d) = \frac{1}{10} \), i.e. r has an equal distribution of all digits.

In general, \( r \) will not have a terminating representation and \( P_r(d) \) will be the limit of the proportion of \( d \) as you look at more and more digits of \( r \). So, if \( r = \frac{12}{99} = .121212... \) then

\[ P_{r}(d) =\begin{cases} \frac{1}{2}, & d = 1,2 \\ 0, & \text{otherwise} \end{cases} \]

But \( P_r(d) \) is defined in terms of limits and limits don't always exist! So you shouldn't expect \( P_r(d) \) to be a well-defined function for all \( r \in (0,1) \).

Definition 1. However, suppose that for some \( r \), \( P_r(d) \) is well-defined and in fact \( P_r(d) = \frac{1}{10} \) for all digits \( d \). Then we say that \( r \) is simply normal in base 10. You repeat the whole process in different base representation, and if \( P_r(d) = \frac{1}{b} \) for all digits \( d \) in base \( b \), then \( r \) is simply normal in base \( b \). \( r \) is (simply) normal if it is simply normal in all bases \( b \ge 2 \).

So \( r=.0123456789 \) is simply normal in base 10. If a number is simply normal in one base, there is no reason to expect it to be simply normal in another. Rational numbers are numbers that have terminating representation in some base \( b \). Even though \( \frac{1}{3} \) doesn't terminate in base 10, it is equal to \( .1_3 \) in base 3. In fact, a rational number (in \( (0,1) \)) will have a terminating representation with only one digit in some base (the denominator) and so will not be simply normal in this base. The rational number will actually have two representations in this base but neither one will be simply normal. So the choice we made earlier doesn't actually matter and rational numbers will fail to be simply normal in some base. This summarizes as:

Remark 1. If \( r \in (0,1) \) is rational, then \( r \) is not normal.

It is really easy to construct simply normal numbers in a specific base but I don't know if there are examples of numbers that simply normal in two bases. This might be easy to do if one base is a power of another but I haven't given it much thought. What I want to focus on is that being normal is a very strong condition. You're saying something about the representation of a number in all bases. My intuition would say that no such numbers exist but Emile Borel proved almost the opposite.

Borel 1909. almost all numbers are normal.

The proof I've seen uses ergodic theory and I can't get into it in this post. It is a proof by contradiction and, at the end, one shows that it is impossible for the statement to be false. Unfortunately, the proof doesn't give even a single example of normal numbers. Wikipedia references a paper by Bercher and Figueira that gives an example of a computable normal number. But no digit is actually explicitly computed in the paper. So can we really say that we have an example? Wikipedia also mentions that a Chaitin's constant as being normal but it is (provably) uncomputable.

There's another issue going on here as well. So I'm not entirely sure what would satisfy me as an example of a normal number. Being normal is an asymptotic property that we would never be able to verify in a lifetime. There's an algorithm that outputs the digits of a normal number (in a requested base) but the algorithm is so slow that no digit will actually be output before the heatdeath of the universe, do we then have an example of a normal number? Even if the algorithm outputs a digit everyday, will this mean then that we have an explicit example? This becomes a philosophical question and it seems to me that, for some inexplicable reason, I will not be satisfied with a construction of a number that was made specificially to exhibit the property of normality. Something about such a construction doesn't feel concrete. On the other hand, if \( pi \) is proven to be normal then I would accept it as an explicit example, maybe because \( \pi \) came from an unrelated area.

Anyway, let me leave it at that with normal numbers. Jeremy Kun mentioned Kolmogorov randomness to me a few weeks ago in response to the blog. He blogs about the idea himself and you can read it: here. Roughly speaking, Kolmogorov randomness seems to say that a string of letters/digits is random if there is no shorter complete description of them than themselves. Like some strings of letters can be described in few letters like “a million a's in a row” but others like “kasndfkjwrefusdlsdfasdfskiowreutwevbs” can't be described in fewer letters. This is a completely different characterization of randomness that seems independent of probability but seems to capture our intuitive notion of what “random” letters should look like. There's an extension of this idea of randomness to infinite strings and according to wikipedia, random infinite strings will be normal in the given alphabet. If the alphabet are digits of a given base, it is not clear to me if the number described by this infinite string is a normal number, but according to Borel's theorem, it almost surely will be!

Finally, there's one last cool idea that I have been reading about lately: amenability of groups. So last time, I wrote about how it would be impossible to put a uniform probability measure on \( {\mathbb Z} \). I didn't really say what uniform meant and I only used it to claim that all singletons have the same probability; I also didn't define a measure but a property of measures, countably additive, allowed me to add up all the (countable) probabilities and get a contradiction. After that, I described a way we could define probability \( P \) on \( {\mathbb Z} \) by looking at the limit of uniform probabilities of larger and larger finite sets \( S_n \). In what sense would this new probability be uniform? Is the probability defined for all sets?

To answer the first question, you could start by assuming that singletons would have equal probability. Even go one step further and assume finite sets of the same size should have equal probability. But you quickly run into trouble with infinite subsets since they're all countable and hence the “same size” as one another and yet they can't all have the same probability. So uniformity should at least use part of the structure of \( {\mathbb Z} \). In the last post, I said that the set \( S_n \) I used to construct the probability were preferred since they used the linear order of the integers. But integers have more structure than just linear order, they form a group! We will say that the probability is uniform if it isn't affected by addition! So the sets \( A = { 0, 3, 10} \) and \( B = { 2, 5, 12 } \) should have the same probability since \( B = A + 2 \), i.e., \( B \) is obtained by shifiting \( A \) to right by two units. This idea already gives us that singletons all have the same probability. The set of odd numbers and set of even numbers will have the same probability since shift one by one unit gives you the other; so, \( P(even) = P(odd) \). So how can we know that the limit of probabilities that I described above will indeed be compatible with the group structure of the integers?

Another useful property of probability is that we can add probabilities of mutually exclusive events. It is known as the additive property. This gives us, \( P(even) + P(odd) = P(even or odd) = P(integer) = 1 \). By uniformity, \( 1 = P(even) + P(odd) = 2\cdot P(even) \) and \( P(even) = P(odd) = 0.5 \). So before even defined an explicity probability on \( {\mathbb Z} \), we know that for it to be uniform, there's 50-50 chance that an integer is even-odd. However, due to my argument from last time, we won't be able to add countably many probabilities of mutually exclusive events. So we want \( P \) to only be a finitely additive instead of countably additive like measures usually are. One last thing we would like is being able to find the probability of any event/subset of \( {\mathbb Z} \). Can \( P \) defined as a limit of probabilities compute the probability of any set? This is the really hard part!

Like I said before with normal numbers, limits do not always exist. It is easy to construct sets \( X \) so that \( \lim_{n \to \infty}\frac{|X \cap S_n|}{|S_n|} \) does not exist. So we can't ask what the probability \( P(X) \) is. This doesn't fail because of the way I chose the \( S_n \)'s; it was bound to fail because sets of \( {\mathbb Z} \) can be arbirtrarily complicated but limits require a lot of order. To fix this we need to choose (nontrivial) ultrafilter and take the limit with respect to this ultrafilter. Ultrafilters are generalizations of regular limits that were created to solved problems just like this. In an ultrafilter, all bounded sequences have limits! Unfortunately, we need to invoke the axiom of choice to use them and are thus this definition of \( P \) is nonconstructive. It means that even though a probability on every subset of \( {\mathbb Z} \) now exists, some of them can't be computed. We have now defined a finitely additive probability `measure' on \( {\mathbb Z} \).

Definition 2. A (finitely generated) group is amenable you can give it a finitely additive probability `measure.'

Remark 2. \( {\mathbb Z} \) is amenable.

It's really remarkable that when I wrote the post last month, I was talking about the amenability of \( {\mathbb Z} \) and I didn't know about it. Of course there was more to it because we needed an ultrafilter to fix the issue with limits not existing. Turns out that \( S_n \)'s I described are an example of a Følner sequence, which are generally not even required to be nested or neatly ordered. The sets in the sequences are characterized by having boundaries with vanishing relative sizes. So like my \( S_n \)'s boundaries always have two elements (maximum/minimum) but the \( S_n \) themselves are growing in size: \( 2n+1 \). Then the relative sizes of the boundaries \( \frac{2}{2n+1} \) tends to \( 0 \).

Even more remarkable is how I found out about amenability. My department hosted this year's spring redbud topology conference and one speaker briefly mentioned that counting lattice points (square grids) inside a circle,Gauss Circle Problem, is related to the amenability of the \( {\mathbb Z}^2 \). I then googled amenability and was really surprised to see it was the rigorous version of what I had written about. And how on earth are probability measures on to lattice points inside a circle? Roughly speaking, the lattice points in bigger and bigger circles for a Følner sequence since and the relative sizes of the boundaries is approximated by the ratio of circumference to area of a circle \( \approx \frac{2\pi r}{\pi r^2} \) and this tends to \( 0 \) as \( r \) grows. This gives a link between amenability and the isoperimetric inequality, which I briefly mention in my maths & crafts post.

Amenable groups were first studied by Lebesgue as he tried to understand the Banach-Tarski paradox. The paradox says that one can take a ball, “split” it up to finitely many pieces, and rearranged the pieces to form two identical copies of the original ball. Mathematically, this only leads to a contradiction if you attempted to assign size to every subset of three-space, \( {\mathbb R}^3 \). If you assume every subset has a volume, then it seems as if we can create volume out of nothing. So to resolve this paradox, one would need to prove that the notion of volume can not be extended to every set, which gives rise to amenability of group of isometries. Terrence Tao has a two posts on amenability and the Banach-Tarski paradox and they go into more details on this.

I love it when math takes me to such unexpected areas.