Large cardinals and knot theory

Saturday 2008 January 26

I just came across the paper by P. Dehornoy: From large cardinals to braids via distributive algebra. J. Knot Theory Ramifications 4 (1995), no. 1, 33-79, which rather amazingly describes an application
of large cardinals in set theory to knot theory. The connection goes via left distributive algebras:
these are sets with a binary operation a[b] satisfying a[b[c]]=a[b][a[c]]. Typical examples are given by
a group G with a[b] given by the conjugation action of the group on itself: a[b] = aba^{-1}. (By coincidence, Grigor Sargsyan mentioned this a few days ago in his comments on my post about set theory.)
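
The conjugation example can be checked by brute force on a small group. The following is only a sketch: the choice of the symmetric group S3, the encoding of permutations as tuples, and the helper names compose, inverse and act are all assumptions made for illustration.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation that applies q first and then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def act(a, b):
    """The operation a[b] = a b a^{-1} (conjugation of b by a)."""
    return compose(compose(a, b), inverse(a))

# All six elements of S3, written as tuples (image of 0, image of 1, image of 2).
S3 = list(permutations(range(3)))

# Left distributivity: a[b[c]] = a[b][a[c]] for every triple.
left_distributive = all(
    act(a, act(b, c)) == act(act(a, b), act(a, c))
    for a in S3 for b in S3 for c in S3
)

# Conjugation also satisfies a[a] = a, like most of the obvious examples.
idempotent = all(act(a, a) == a for a in S3)

print(left_distributive, idempotent)
```

Both checks succeed for any group, since a(bcb^{-1})a^{-1} and (aba^{-1})(aca^{-1})(aba^{-1})^{-1} multiply out to the same element.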

Most of the obvious examples of left distributive algebras satisfy a[a]=a. Some examples not satisfying this come from the elementary embeddings of suitable models of set theory into themselves. There are two natural operations on the elementary embeddings of a suitable model into itself:
composition (corresponding to the group product in the example above) and an action a[b], which can be thought of informally as the image of the elementary embedding b under the action of a (corresponding to the action of a group on itself). The operation a[b] makes the set of elementary embeddings into a left distributive algebra that in general does not satisfy a[a]=a. (The existence of suitable elementary embeddings is essentially a rather powerful large cardinal axiom: the smallest ordinal not fixed by an elementary embedding turns out to be a very large cardinal.)

Dehornoy’s paper explains how these new left distributive algebras first constructed using large cardinals were used to prove new results about braid groups in knot theory. A typical application is a definition of a linear order on braid groups, extending several previously known partial orders.

My impression is that most (and maybe all) the results about braid groups and left distributive algebras first proved using large cardinals have also later been proved by more elementary methods. This is rather like the applications of von Neumann algebras to knot theory by Vaughan Jones: they provided the initial motivation, but once the new results were known they were also proved by more elementary means.

Probability paradox

Thursday 2007 December 27

A result that amazed me when I first heard of it is that Gaussian measure on infinite dimensional Hilbert spaces does not exist.

At first sight it seems obvious that it exists: on any finite dimensional quotient one can define a Gaussian probability measure, and these are all “compatible”. So their “inverse limit” should give a Gaussian probability measure on the original infinite dimensional Hilbert space.

In fact this cannot work: the ball of radius R in 1 dimensional space has Gaussian volume V<1, so the ball of radius R in n dimensions has Gaussian measure at most V^n which tends to 0 as n tends to infinity. So the ball of radius R in infinite dimensions has Gaussian measure 0 for any R. As Hilbert space is the union of a countable number of such balls, it has Gaussian measure 0, contradicting the fact that Gaussian measure gives it measure 1.
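
The bound V^n in the argument above is easy to tabulate. This is a rough numerical sketch, assuming the standard Gaussian (variance 1 in each coordinate) and the arbitrary choice R = 3; the function names are made up for the example.

```python
import math

def gaussian_interval_mass(radius):
    """Standard Gaussian measure V of the interval [-radius, radius]."""
    return math.erf(radius / math.sqrt(2))

def ball_mass_upper_bound(radius, dim):
    """Upper bound V^n for the Gaussian measure of the ball of the given
    radius in dim dimensions (the ball sits inside the cube [-R, R]^n)."""
    return gaussian_interval_mass(radius) ** dim

R = 3.0  # assumed radius, chosen only for illustration
for n in (1, 10, 100, 1000, 10000):
    print(n, ball_mass_upper_bound(R, n))
```

Even with R = 3, where V is already very close to 1, the bound is negligible by the time n reaches ten thousand.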

In fact it is possible to construct Gaussian measure on a Hilbert space H, but it has support larger than H. More precisely, if S is a Hilbert-Schmidt operator from H to K, then although Gaussian measure is not well defined, its image under S is a well defined probability measure on K (Sazonov’s theorem).

The non-existence of Gaussian measure on infinite dimensional Hilbert spaces is one of the things that makes quantum (or rather Euclidean) field theory hard: roughly speaking the functions one wants to integrate are only defined on H and not on the larger space K.

Uselessness of set theory?

Tuesday 2007 December 25

Mathematics is supposedly based on set theory. But if one looks closely, it is amazing (to me) just how little set theory is really used. Almost all “ordinary” mathematics seems to use no sets larger than the continuum. In other words most of mathematics can be carried out in second order arithmetic: roughly speaking this means that one can use the integers and subsets of the integers but nothing more complicated. Although third order and higher order constructions occasionally occur, it usually seems easy to replace them by second order constructions. (By ordinary math I mean things like the Atiyah-Singer index theorem, Fermat’s last theorem, the Weil conjectures, and so on; more or less everything that is not some sort of set theory or logic.)

In fact one seems to need a lot less than this if one is willing to work harder. I suspect that a lot of stuff can be encoded in Peano arithmetic if one is willing to work hard (for example, encoding constructible reals in terms of integers is tiresome but possible). The few examples known of theorems that cannot be proved using Peano arithmetic (as in the Paris-Harrington theorem) tend to have extremely rapidly growing functions appearing, and I would guess that proofs without such large functions somewhere can usually be encoded in Peano arithmetic. One can probably go a lot lower: Peano arithmetic has much weaker fragments, such as primitive recursive arithmetic. In practice it seems rare to have functions that require more than a finite tower of exponentials to describe, which presumably corresponds to something even weaker than primitive recursive arithmetic (does anyone know what this is called?).

Harvey Friedman has a program called “reverse mathematics” to identify what axioms are really needed to prove various results, but this seems to concentrate on various fragments of second order arithmetic. He has found many examples of results about the integers, often with a Ramsey-theoretic flavor, that need reasonably strong subsystems of second order arithmetic to prove.

It is easy to find mathematical results whose proofs need much stronger systems using Gödel’s theorem: for example, the consistency of second order arithmetic is a perfectly good statement about integers that is (presumably…) true, but cannot be proved in second order arithmetic, though it can easily be proved even in weak set theories. I do not know of any mathematical theorems about the integers that cannot be proved in second order arithmetic that are not related to consistency results.

So my question is the following: why do we use so little set theory in ordinary mathematics? Is it because almost all interesting mathematical results only need tiny fragments of Peano arithmetic to prove, or is it because we are too stupid to make proper use of the vastly more powerful axioms in set theory?

TeXmacs

Friday 2007 June 15

I recently discovered the TeXmacs editor, a sort of WYSIWYG editor similar to emacs for TeX (or more precisely LaTeX). It’s freely available from http://texmacs.org/ and should run on any computer with TeX. It makes writing stuff in TeX a lot easier (no more tracking down missing $ or } signs for example). Strongly recommended.

My dad has more money than yours.

Saturday 2007 May 26

In this well known children’s game, both players name an integer, and the winner is the one who names the largest integer, or better still the one who utterly humiliates his opponent by naming a vastly larger integer. Integers must be named in a constructive way, say by writing down an explicit Turing machine which halts after a large number of steps (so for example values of the Busy Beaver function are not allowed, as this function cannot be defined constructively).

The problem is to find a good strategy for this game, in other words to give a (constructive) definition of a very large integer.

An easy way to write down a large integer is to write down a rapidly growing function, and take some value of this function.

If the only functions we use are addition of 1, (or addition of some fixed constant), we get numbers like 1+1+1+1+1+1 or MMMCCXXXVIII.

If we allow addition and multiplication as operations we get numbers such as 18362528394746. This covers essentially all numbers needed in physics (unless one wants to know how long it will take for a large black hole to decay by Hawking radiation, or something like that). A common mistake made at this level is to write a long sequence of nines: since ones are thinner it is better to write a long sequence of ones.

The next step is to allow exponentiation. This allows one to write down numbers such as Skewes number
10^{10^{10^{34}}}
This covers almost all numbers used in mathematics.

Exponentiation is repeated multiplication, so it does not take much imagination to define an operation of repeated exponentiation, and so on. Repeating this idea gives the class of primitive recursive functions. These are roughly the functions that can be defined by starting with the successor function and allowing functions to be defined by recursion over the integers. A typical example of a number produced with such functions is Graham’s number, which seems to be the largest number that has turned up “naturally” in mathematics (rather than by someone deliberately trying to produce a large number).
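
The ladder of operations described above can be written as a single function in which each level just repeats the previous one. This is a sketch only: the name hyper and the numbering of the levels (1 for addition, 2 for multiplication, 3 for exponentiation, 4 for repeated exponentiation) are conventions assumed for the example.

```python
def hyper(level, a, b):
    """Apply the operation at the given level to a and b.

    Level 1 is addition; each higher level applies the level below it
    b times, starting from 0 for multiplication and from 1 otherwise.
    """
    if level == 1:
        return a + b
    result = 0 if level == 2 else 1
    for _ in range(b):
        # One more application of the lower operation, with a on the left,
        # so towers of exponents associate to the right.
        result = hyper(level - 1, a, result)
    return result

print(hyper(2, 3, 4))   # 3 * 4
print(hyper(3, 2, 10))  # 2 ** 10
print(hyper(4, 2, 4))   # 2 ** (2 ** (2 ** 2))
```

Each fixed level only uses bounded loops, which is why every one of these operations is primitive recursive. Only small arguments are practical here, since the loops really do run as many times as the values suggest.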

The next step is functions such as Ackermann’s function, one of the simplest functions that is not primitive recursive. Primitive recursive functions allow one to define functions by recursion over the first infinite ordinal ω, but Ackermann’s function allows one to use recursion over the ordinal ω×ω=ω^2. To define bigger functions, one can use larger (countable, constructive) ordinals. Some obvious ways of constructing these are ordinal addition, multiplication, and exponentiation.
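
For concreteness, here is a sketch of one common two-argument form of Ackermann’s function (often called the Ackermann-Péter function); which variant is meant above is an assumption on my part. Every recursive call is made on a pair (m, n) that is smaller in the lexicographic order on pairs, which is an ordering of type ω×ω.

```python
def ackermann(m, n):
    """Two-argument Ackermann-Peter function, defined by recursion on
    pairs (m, n) ordered lexicographically."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    # The inner call lowers n with m fixed; the outer call lowers m.
    return ackermann(m - 1, ackermann(m, n - 1))

for m in range(4):
    print(m, [ackermann(m, n) for n in range(4)])
```

Only tiny arguments are feasible: the value at (4, 2) already has tens of thousands of digits, and this naive recursion would exceed Python’s recursion limit long before reaching it.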

The smallest ordinal one cannot form in this way is called ε_0, the limit of
\omega, \omega^\omega,  \omega^{\omega^\omega}, \omega^{ \omega^{\omega^\omega}}...
which is the ordinal measuring the strength of Peano arithmetic, in the sense that it is the smallest ordinal that Peano arithmetic cannot prove to be well ordered (Gentzen’s theorem). A reasonably natural function at this level, related to Ramsey’s theorem, was found by Paris and Harrington as part of the proof of the Paris-Harrington theorem.

More generally, for any theory extending Peano arithmetic there is a countable ordinal measuring its strength, defined as the smallest (constructive) ordinal that the theory cannot prove is well ordered. There are also some associated rapidly increasing functions, which one can in fact define directly from the theory without mentioning ordinals: take all computable functions f_n that the theory can prove are well defined, and “diagonalize” over them. (More precisely, define a new function f(n) = 1 + max_{i,j≤n} f_i(j); this new function is increasing and eventually larger than any of the functions f_n.)
This produces a function that is well defined (at least if the theory is ω-consistent) but that grows so fast that the theory cannot prove it is well defined.
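
The diagonalization step can be illustrated on a toy family. The sketch below does not enumerate the provably total functions of any theory; the family f_i(j) = j^i + i is a hypothetical stand-in chosen only so that the code runs, and the indices are taken to start at 0.

```python
def family(i):
    """Hypothetical stand-in for the i-th function in an enumeration
    f_0, f_1, f_2, ...; here f_i(j) = j**i + i."""
    return lambda j: j ** i + i

def diagonal(n):
    """f(n) = 1 + max of f_i(j) over all i, j <= n."""
    return 1 + max(family(i)(j) for i in range(n + 1) for j in range(n + 1))

for n in range(6):
    print(n, diagonal(n))
```

By construction diagonal(n) exceeds f_i(n) as soon as n is at least i, so it eventually dominates every member of the family, whatever the family is.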

Incidentally, this gives a proof of one version of Gödel’s incompleteness theorem: the theory cannot prove that the function f is well defined, as it is larger than all functions it can prove are well defined, but the ω-consistency of the theory easily implies f is well defined, so the theory cannot prove its own ω-consistency.

So to find large functions we need strong theories. Beyond Peano arithmetic there is second order arithmetic, followed by various set theories, such as Zermelo-Fraenkel set theory. We can go further using the hierarchy of known large cardinal axioms, the largest of which is the axiom for Reinhardt cardinals. (These are so large that they are not consistent with the axiom of choice, so one has to use ZF (Zermelo-Fraenkel set theory without the axiom of choice) rather than ZFC.)

So to summarize, we get a really large integer as follows: take Zermelo-Fraenkel set theory with a Reinhardt cardinal. List all computable functions that it can prove are well defined, and diagonalize over them. The value of this function at some reasonably large integer (say a trillion) will be larger than any (constructive) number written down by anyone without training in logic and set theory. It is necessary to take the value at quite a large integer because the function starts off growing quite slowly, since most functions that the theory proves are well defined are quite small.

There is one problem: it is not clear that this function is well defined, and there is no known way of proving that it is. This is inevitable: any function that one can see is well defined must be quite small.

(Added later): if one allows numbers that are just defined, rather than constructively defined, one can do much better; for example, the busy beaver function at reasonably large values will exceed all the numbers mentioned above. It is easy to define very large numbers from powerful theories with a minor variation of the ideas above as follows: for each theory one can define a function f(n) to be the largest number that the theory can specify and prove is well defined in fewer than n symbols. For example, one could take the largest number that Zermelo-Fraenkel set theory with a Reinhardt cardinal can define and prove is well defined in at most a googolplex symbols.

Toroidal black holes?

Tuesday 2007 May 22

A well known theorem about black holes (due to Hawking?) states that their boundary must be a sphere, rather than a higher genus surface. For genus greater than 1 the proof seems fine, but the proof for genus 1 (a torus) seems incomplete: it rules them out physically as being unstable, but this seems to allow the possibility that they exist as unstable mathematical objects.

Consider a very thin torus of dust, rotating at just the right speed for the “centrifugal force” to balance the gravitational attraction. If the torus is thin enough this seems to produce a rotating toroidal black hole.

The obvious way to prove these exist is to write down an explicit formula for the metric, but this seems quite hard. There are many known exact solutions with the right symmetry, but they are mostly rather messy and it is hard to see what is going on (and in any case if they gave toroidal black holes the people who found them would probably have noticed).

The complement of a toroidal black hole is not simply connected, so by taking its universal cover one gets a chain of universes which one can travel between by going through the hole in the torus.

Linked and knotted black holes are left as an exercise for the reader.

Planck units (uselessness of).

Wednesday 2007 May 16

Planck units are determined by setting the speed of light c, Planck’s constant h, and the gravitational constant G all equal to 1. The speed of light and Planck’s constant (give or take a factor of two pi) seem fundamental, but it is not clear why the gravitational constant G should be 1.

A minor problem is that it is probably missing a factor of 4π or maybe 8π as this is what appears in the Lagrangian. This is not serious but is a little worrying: no-one would suggest using 2c or c/2 as a fundamental unit of velocity, so an ambiguity of 2 is not a good sign for a supposedly fundamental unit.
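
To see how much these factors matter, one can compute the Planck length and mass under the different conventions. This is a sketch under stated assumptions: the numerical values are the CODATA 2018 figures in SI units, which are not quoted anywhere above, and the function names are made up for the example.

```python
import math

# CODATA 2018 values in SI units (an assumption; the text gives no numbers).
C = 299_792_458.0        # speed of light, m/s
HBAR = 1.054_571_817e-34  # reduced Planck constant, J s
G = 6.674_30e-11          # gravitational constant, m^3 kg^-1 s^-2

def planck_length(hbar, g, c):
    """Length obtained by setting hbar = g = c = 1."""
    return math.sqrt(hbar * g / c**3)

def planck_mass(hbar, g, c):
    """Mass obtained by setting hbar = g = c = 1."""
    return math.sqrt(hbar * c / g)

conventions = {
    "hbar, G": (HBAR, G),
    "h = 2 pi hbar, G": (2 * math.pi * HBAR, G),
    "hbar, 8 pi G": (HBAR, 8 * math.pi * G),
}

for name, (hbar, g) in conventions.items():
    print(name, planck_length(hbar, g, C), planck_mass(hbar, g, C))
```

Using h instead of h/2π changes the length by a factor of √(2π), about 2.5, and using 8πG instead of G changes it by √(8π), about 5, so the "fundamental" length is only pinned down to within an order of magnitude.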

A much more serious problem is that G appears as a non-renormalizable term in the Lagrangian of the standard model with gravitation; in fact, the only non-renormalizable term that has been measured to be non-zero. According to Wilson’s view of the renormalization group flow, this Lagrangian should be thought of as a low energy effective Lagrangian of some unknown theory. His theory predicts that there should be lots of non-zero non-renormalizable terms, which should all be extremely small. The constant G by a fluke happens to be detectable, because gravity just happens to be cumulative and is not masked by renormalizable interactions. So this suggests that G is just one of an infinite number of terms in the Lagrangian that could be used to determine a set of units.

Another problem is that according to Wilson’s theory all the coupling constants (presumably including G) change under the renormalization group flow so are not really constant. This suggests that there is nothing particularly significant about the Planck mass, or length, or energy, or whatever: all we know is that classical general relativity breaks down by Planck densities or lengths. By analogy, the Fermi constant for weak interactions could be used to produce a fundamental set of units, where the fundamental energy is about 300 GeV, but nothing special happens at this energy; it is just an order of magnitude estimate for the point where the old Fermi theory of the weak interaction breaks down (and for the masses of the vector bosons of the electroweak interaction).
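
The figure of about 300 GeV can be checked directly. The sketch below assumes the measured value of the Fermi constant in natural units, roughly 1.166 × 10^-5 GeV^-2, which is not quoted above; the variable names are made up for the example.

```python
import math

# Fermi constant in natural units, G_F / (hbar c)^3, in GeV^-2.
# The numerical value is an assumption taken from standard tables.
FERMI_CONSTANT = 1.1663787e-5

# G_F has dimensions of 1 / energy^2, so the natural energy scale is
# its inverse square root.
fermi_energy_gev = 1.0 / math.sqrt(FERMI_CONSTANT)

print(fermi_energy_gev)
```

This gives an energy a little under 300 GeV, in line with the order of magnitude estimate above.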