6accdae13eff7i3l9n4o4qrr4s8t12ux
On this foundation I have also tried to simplify the theories which concern the squaring of curves, and I have arrived at certain general Theorems. The anagram encodes how many times each letter occurs in a sentence which Newton wanted to keep secret. It is a simple way of securing the information: its keeper, in this case Newton, can later prove his priority to others by revealing the encrypted message. So let's turn to the content of the above anagram. It expresses the fundamental theorem of calculus by stating that
“Data aequatione quotcunque fluentes quantitates involvente, fluxiones invenire; et vice versa”
which means "Given an equation involving any number of fluent quantities to find the fluxions, and vice versa" — the fundamental theorem of calculus in Newton's terminology. Nevertheless, the priority debate is not settled by this, since at the time of writing the cited letter neither Newton nor Leibniz had published anything specific; only half-baked ideas, manuscripts and vague statements were circulating. In short, the whole story is much subtler than it is usually told to be. For more on this topic, see [2]. Finally, here is the page containing the anagram:
So what is the kissing number problem? The task sounds fairly simple. Take a unit ball in n dimensions and put as many unit balls in contact with it as you can, in such a way that the surrounding balls do not overlap. What is the maximal number of touching unit balls? This number is the n-dimensional kissing number, which we denote by τ_n.
One-dimensional balls are just intervals, and a given interval can only be in contact with two non-overlapping intervals (one on each side), so it is clear that τ_1 = 2.
In two dimensions, if two unit disks touch the given one D, their centres together with the centre O of D form a triangle with two sides of length 2 and an angle φ at O. The condition that the two disks do not overlap means that the third side has length at least 2, which by the law of cosines is equivalent to φ ≥ 60°. The minimal length 2 of the third side is attained exactly at φ = 60°, i.e. when the three centres form an equilateral triangle. Since 360°/60° = 6, six such disks can be arranged around D with no space left between them, yielding τ_2 = 6.
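The angle argument is easy to verify numerically. Here is a minimal Python sketch of the law-of-cosines computation from the paragraph above:

```python
import math

# Two unit disks touching the central one have centres at distance 2 from
# the central centre O.  Non-overlap means their centres are at least 2
# apart, so the angle phi they subtend at O satisfies the law of cosines:
#   2^2 <= |c1 - c2|^2 = 2^2 + 2^2 - 2*2*2*cos(phi)
phi_min = math.acos((2**2 + 2**2 - 2**2) / (2 * 2 * 2))  # = pi/3 = 60 degrees

# At most 2*pi / phi_min disks fit around the central one (rounding guards
# against floating-point noise; the ratio is exactly 6).
tau_2 = round(2 * math.pi / phi_min)
print(phi_min, tau_2)
```

The computed minimal angle is π/3, giving six disks, in agreement with τ_2 = 6.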
The kissing number in three dimensions was the subject of a famous correspondence between Isaac Newton and David Gregory in 1694. Newton correctly thought that the kissing number was twelve, as found in the fcc lattice, but Gregory thought that a thirteenth sphere could fit. That's why the n = 3 case is often referred to as the problem of thirteen spheres. One would think that this debate should be easily resolved, but a rigorous mathematical proof was lacking until 1953, when Schütte and van der Waerden [2] gave a detailed argument proving that τ_3 = 12. An elementary proof was given by Leech in 1956 [3], but his proof is still far from being trivial. There have been efforts to bring Leech's proof down to the level of an elementary undergraduate course [4].
We mention as an interesting fact that although Gregory was wrong about the thirteenth sphere, there is still a lot of space left between the spheres once the kissing number is reached. In fact, there is enough room that any permutation of the twelve balls can be achieved by moving them continuously on the surface of the middle sphere while staying in touch with it. There is a nice constructive proof of this, which uses the fact that τ_3 = 12 is realized by the icosahedron, as depicted below. For details, see [1], pp. 29-30.
The three-dimensional case already demonstrates that the kissing number problem is highly non-trivial, and things become even more difficult in higher dimensions. No general formula is known for τ_n.
The solution for four dimensions was given by Oleg Musin in 2003 [5]. He proved that τ_4 = 24, as realized by the 24-cell. In higher dimensions the kissing number is generally not known; only lower and upper bounds are available. Somewhat surprisingly, the eight- and twenty-four-dimensional kissing numbers have been determined: τ_8 = 240 and τ_24 = 196,560. Both were proved by Odlyzko and Sloane [6] and by Levenshtein [7] independently in 1979. These special dimensions (8 and 24) seem random at first glance, but they are no coincidence: their exceptional nature is explained by the existence of some highly symmetric objects. τ_8 is realized by the E8 root system, which I've previously mentioned here and here, while τ_24 is realized by the Leech lattice. As for the bounds, Bachoc and Vallentin [8] developed a method, based on semidefinite programming, to find upper bounds for the kissing number. This was used in high-accuracy calculations [9] to produce upper bounds on the kissing number in dimensions up to 24. The table below contains bounds (found in [10]) which might not be up-to-date.
| Dimension | Lower bound on τ_n | Upper bound on τ_n |
|---|---|---|
| 1 | **2** | **2** |
| 2 | **6** | **6** |
| 3 | **12** | **12** |
| 4 | **24** | **24** |
| 5 | 40 | 45 |
| 6 | 72 | 78 |
| 7 | 126 | 134 |
| 8 | **240** | **240** |
| 9 | 306 | 364 |
| 10 | 500 | 554 |
| 11 | 582 | 870 |
| 12 | 840 | 1357 |
| 13 | 1154 | 2069 |
| 14 | 1606 | 3183 |
| 15 | 2564 | 4866 |
| 16 | 4320 | 7355 |
| 17 | 5346 | 11,072 |
| 18 | 7398 | 16,572 |
| 19 | 10,668 | 24,812 |
| 20 | 17,400 | 36,764 |
| 21 | 27,720 | 54,584 |
| 22 | 49,896 | 82,340 |
| 23 | 93,150 | 124,416 |
| 24 | **196,560** | **196,560** |
Table. Bounds on kissing numbers in dimensions 1 to 24. (Known kissing numbers are in boldface.)
These numbers suggest an exponential growth of τ_n, and indeed it has been shown that

2^(0.2075·n·(1+o(1))) ≤ τ_n ≤ 2^(0.401·n·(1+o(1))).
[String-art images of the exceptional root systems E6, E7, E8, F4 and G2.]
I'd like to devote this post to describing all five exceptional root systems and how the above-mentioned string arts were made. First, I explain the mathematics behind the construction; then I consider each root system separately and mention some interesting facts about each.
The main objects of our interest here are root systems, so let me quickly repeat the definition.
A finite set Φ of non-zero vectors in some real (finite-dimensional) Euclidean space V with scalar product (·,·) is called a root system if the following conditions are met:

- the elements of Φ (the roots) span V;
- for every root α, the only scalar multiples of α in Φ are α and -α;
- Φ is closed under the reflection through the hyperplane perpendicular to any root;
- for any two roots α and β, the number 2(β,α)/(α,α) is an integer.
It can be shown that for any root system Φ there exists a subset Δ of roots, called a base, such that:

- Δ is a basis of the vector space V;
- every root can be written as an integer linear combination of elements of Δ with coefficients that are either all non-negative or all non-positive.
Let n stand for the dimension of V and Δ = {α_1, ..., α_n} be a base of the root system Φ. We also let s_i denote the reflection through the hyperplane perpendicular to α_i, that is

s_i(x) = x - 2 (x, α_i)/(α_i, α_i) · α_i,     (1)
where id denotes the identity transformation and for any i we have s_i s_i = id. These are the simple reflections and they generate a group W, named after Weyl. The product of the simple reflections

c = s_1 s_2 ⋯ s_n     (2)

is called a Coxeter element.
Although c clearly depends on the choice of base and on the order in which the simple reflections are multiplied in (2), any two orderings give Coxeter elements which are conjugate to each other by some element of the Weyl group. Thus the order of c is always the same. It is called the Coxeter number h of Φ, and it is the smallest positive integer h such that

c^h = id.     (3)
| Root system | A_n | B_n | C_n | D_n | E_6 | E_7 | E_8 | F_4 | G_2 |
|---|---|---|---|---|---|---|---|---|---|
| Coxeter number h | n+1 | 2n | 2n | 2n-2 | 12 | 18 | 30 | 12 | 6 |
Table. The Coxeter number of simple Lie algebras.
Let's give another characterization of the Coxeter number, which could serve as an alternative definition: h equals the number of roots divided by the rank, h = |Φ|/n.
It is clear from (3) that any eigenvalue of c must be an h-th root of unity, that is, of the form e^(2πim/h) for some integer m.
It can be shown that e^(2πi/h) is always an eigenvalue of c. Let w denote a corresponding eigenvector, i.e.

c w = e^(2πi/h) w,
and consider the real and imaginary parts of w,

u = Re w,   v = Im w.
The Coxeter plane P is the 2-dimensional real vector space spanned by u and v:

P = span{u, v}.
A Coxeter projection is the orthogonal projection of a root system onto its Coxeter plane. Any root α has horizontal (x) and vertical (y) components given by

x = (α, u),   y = (α, v).
This procedure provides the points for the needles in the string art. Next, we draw lines between roots that are closest to each other, that is, between roots α and β whenever the distance |α - β| is minimal. The colour of the lines in the projection has no particular meaning; it depends only on the distance from the origin.
Here I describe the exceptional root systems in the increasing order of ranks.
The Lie algebra G2 has the simplest root system among the exceptional ones. There are 12 root vectors, which can be nicely represented by points of the plane x + y + z = 0 in ℝ³. A base is given by the vectors

α_1 = (1, -1, 0),   α_2 = (-2, 1, 1).
Permutations and an overall sign change of the components of (1, -1, 0) and (2, -1, -1) give all 12 root vectors of G2. Its Weyl group is of order 12 and is isomorphic to the dihedral group D_6. Clearly, there are two different root lengths whose ratio is √3. Since the root system lies in a plane, it coincides with its Coxeter projection.
The root vectors form two hexagons, hence the root system of G2 contains the solution of the kissing number problem in two dimensions.
The next root system is the one corresponding to the exceptional Lie algebra F4. It consists of 48 root vectors, which can be constructed from the base

α_1 = (0, 1, -1, 0),   α_2 = (0, 0, 1, -1),   α_3 = (0, 0, 0, 1),   α_4 = (1/2, -1/2, -1/2, -1/2).
Its Weyl group has 1152 elements, and it is the symmetry group of the so-called 24-cell, which is contained in the root system twice (once as the 24 long roots and once as the 24 short ones). Thus the root system of F4 gives the solution of the kissing number problem in four dimensions.
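As a sanity check on the definitions above, a short numpy sketch builds the simple reflections of F4 (in one standard coordinate convention; a different convention only changes the Coxeter element by conjugation), multiplies them into a Coxeter element, and confirms that its order equals the Coxeter number h = 12 and that there are 48 roots:

```python
import itertools
import numpy as np

# Simple roots of F4 in standard coordinates (one common convention).
simple = np.array([
    [0, 1, -1, 0],
    [0, 0, 1, -1],
    [0, 0, 0, 1],
    [0.5, -0.5, -0.5, -0.5],
])

def reflection(a):
    """Matrix of the reflection through the hyperplane orthogonal to a."""
    return np.eye(4) - 2 * np.outer(a, a) / (a @ a)

# Coxeter element: the product of the four simple reflections.
c = np.eye(4)
for a in simple:
    c = c @ reflection(a)

# Its order should be the Coxeter number h = 12.
order = next(k for k in range(1, 25)
             if np.allclose(np.linalg.matrix_power(c, k), np.eye(4)))

# The 48 roots of F4: +-e_i, +-e_i +- e_j (i < j), (+-1/2, ..., +-1/2).
roots = []
for i in range(4):
    for s in (1, -1):
        v = np.zeros(4); v[i] = s
        roots.append(v)
for i, j in itertools.combinations(range(4), 2):
    for si, sj in itertools.product((1, -1), repeat=2):
        v = np.zeros(4); v[i] = si; v[j] = sj
        roots.append(v)
for signs in itertools.product((1, -1), repeat=4):
    roots.append(np.array(signs) / 2)

print(order, len(roots))  # 12 48
```

The same skeleton (simple roots in, Coxeter element and its eigenvectors out) is what produces the needle positions for all the projections in this post.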
Next in line is the root system of E6, which has 72 elements, and a possible base (written in ℝ⁸, where the root system spans a 6-dimensional subspace; this is the usual convention inherited from E8) is

α_1 = (e_1 + e_8)/2 - (e_2 + e_3 + e_4 + e_5 + e_6 + e_7)/2,   α_2 = e_1 + e_2,   α_3 = e_2 - e_1,   α_4 = e_3 - e_2,   α_5 = e_4 - e_3,   α_6 = e_5 - e_4.
The order of the Weyl group of E6 is 51,840. In its Coxeter projection some sets of roots project on top of each other, so we see fewer than 72 distinct points.
The root system of E7 has 126 vectors, and a base is given by the simple roots α_1, ..., α_6 of E6 above together with

α_7 = e_6 - e_5.
The size of the Weyl group of E7 is 2,903,040.
I’ve already written about the root system of . For the sake of completeness, I repeat that previous description below.
Among the exceptional Lie algebras, E8 is the largest one. Its root system lives in 8-dimensional space and can be described as follows. Let e_1, ..., e_8 denote the standard basis in ℝ⁸, that is, (e_i)_j = δ_ij.
The root system consists of 240 vectors. It has 112 vectors of the form

±e_i ± e_j,   1 ≤ i < j ≤ 8,
with every possible choice of indices and signs. By the way, these 112 vectors constitute the D8 root system. The remaining 128 vectors can be written as

(±1/2, ±1/2, ±1/2, ±1/2, ±1/2, ±1/2, ±1/2, ±1/2),
where the number of minus signs must be even.
Notice that these points are on the surface of the 7-sphere of radius √2 centred at the origin. In addition, each root has exactly 56 closest neighbours (at distance √2). If we put spheres of radius √2/2 around every root and the origin, we get a very tightly packed configuration. The sphere at the origin is in contact with all 240 other spheres. It turns out that this is the solution of the kissing number problem in 8 dimensions.
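These counting claims are easy to verify by brute force; a small numpy sketch enumerating all 240 roots:

```python
import itertools
import numpy as np

roots = []
# 112 roots of the form +-e_i +- e_j for i < j.
for i, j in itertools.combinations(range(8), 2):
    for si, sj in itertools.product((1, -1), repeat=2):
        v = np.zeros(8); v[i] = si; v[j] = sj
        roots.append(v)
# 128 roots (+-1/2, ..., +-1/2) with an even number of minus signs.
for signs in itertools.product((1, -1), repeat=8):
    if signs.count(-1) % 2 == 0:
        roots.append(np.array(signs) / 2)
roots = np.array(roots)

# All roots lie on the sphere of radius sqrt(2).
on_sphere = np.allclose((roots**2).sum(axis=1), 2)

# Each root has exactly 56 nearest neighbours, at distance sqrt(2).
d2 = ((roots[:, None, :] - roots[None, :, :])**2).sum(axis=-1)
nearest = (np.abs(d2 - 2) < 1e-9).sum(axis=1)
print(len(roots), on_sphere, nearest.min(), nearest.max())  # 240 True 56 56
```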
The number of symmetries of the E8 root system is immense. Its symmetry group (Weyl group) is of order 696,729,600. Compare this with the 8-cube, which has `only' 10,321,920 symmetries.
Bonus Video: The Beauty of E8
“The E_{8} root system, or Gosset 4_{21} polytope, is an exceptional uniform polytope in 8 dimensions, having 240 vertices and 6720 edges. This video shows a 2-dimensional projection of this polytope as it rotates in various ways.”
In 1913, the Hungarian mathematician George Pólya also defined a space-filling function [2], but instead of a square, its range is a right triangle T. In this post, I'll describe Pólya's function and mention an interesting fact about its differentiability, proved by Peter Lax [3].
The construction of f(t) for any number t ∈ [0,1] is fairly simple. Write each t as a binary fraction

t = 0.b_1 b_2 b_3 ...,     (1)
meaning that t = b_1/2 + b_2/4 + b_3/8 + ... with b_k ∈ {0,1}, and assign a sequence of nested triangles to it as follows. First, split the initial triangle T into two similar triangles using its altitude, like this:
Then, with the additional assumption that T is not isosceles, the two smaller triangles aren't of equal size. Let's call the smaller one T_0 and the larger one T_1. Set T¹ = T_0 if b_1 = 0 and T¹ = T_1 if b_1 = 1. Now, replace T with T¹ and iterate the process to get T² ⊃ T³ ⊃ ... and so on. At each step, these triangles become smaller and smaller, shrinking to a point which they all have in common. This point is f(t). Here are some examples. For t = 0, we have b_k = 0 for all k, hence we always pick the smaller triangle, and the limit point is a vertex of T (see Figure 1). Similarly, for t = 1, we must always choose the larger triangle, and f(1) is again a vertex. There are, however, numbers which have two different representations (1). For example, t = 1/2 can be written as 0.1000... or as 0.0111... As it turns out, it doesn't matter which one we pick: f(t) will be the same for both.
For f to be a triangle-filling curve, it must be a continuous surjective function from [0,1] onto T. This is established by
Pólya's Theorem. The function f maps the interval [0,1] continuously onto the triangle T.
As for smoothness, space-filling curves are very ugly functions. Since they're not rectifiable, one expects them not to be differentiable. In fact, this was the homework Peter Lax gave his students: prove that Pólya's function is nowhere differentiable! When nobody turned in the proof, he decided to work out the details and got the following surprising result:
Lax's Differentiability Theorem. Denote by α the smaller acute angle of T.

- If 30° ≤ α (< 45°), then f is nowhere differentiable.
- If 15° < α < 30°, then f is non-differentiable almost everywhere, yet differentiable on an uncountable set of points.
- If α < 15°, then f′ = 0 almost everywhere, yet f is still non-differentiable on an uncountable set of points.
A nice elementary proof can be found in [3]. It’s really worth checking out.
This story began in 1834 when a 26-year-old Scottish engineer named John Scott Russell noticed a strangely behaving water wave while conducting an experiment to determine the most efficient design for canal boats. Let him tell us about this first encounter:
“I was observing the motion of a boat which was rapidly drawn along a narrow channel by a pair of horses, when the boat suddenly stopped – not so the mass of water in the channel which it had put in motion; it accumulated round the prow of the vessel in a state of violent agitation, then suddenly leaving it behind, rolled forward with great velocity, assuming the form of a large solitary elevation, a rounded, smooth and well-defined heap of water, which continued its course along the channel apparently without change of form or diminution of speed. I followed it on horseback, and overtook it still rolling on at a rate of some eight or nine miles an hour [14 km/h], preserving its original figure some thirty feet long [9 m] and a foot to a foot and a half [30-45 cm] in height. Its height gradually diminished, and after a chase of one or two miles [2-3 km] I lost it in the windings of the channel. Such, in the month of August 1834, was my first chance interview with that singular and beautiful phenomenon which I have called the Wave of Translation…” — J.S. Russell, Report on waves [1].
After his discovery, Russell studied these stable solitary waves in a more controlled environment by building a 9-metre wave tank in his back garden, and he made further important observations. One of his findings was that the higher a solitary wave is, the faster it moves. Although Russell was convinced that his discovery was of great importance, not all of his contemporaries were on his side. For example, Airy was highly sceptical of solitary waves, because they seemed to contradict his shallow-water theory, in which a finite-amplitude wave cannot propagate without changing its profile. This tension was resolved by Boussinesq and Rayleigh, who showed (independently) that the nonlinear effect, which causes the change of shape in a travelling wave, can be balanced by dispersion, and thus solitary waves can form.
This is best captured by a nonlinear partial differential equation, which was (re)discovered by the Dutch mathematician Korteweg and his student de Vries in 1895 [2]. The Korteweg-de Vries (KdV) equation reads

u_t + 6 u u_x + u_xxx = 0,     (1)
where u = u(x,t) is the unknown function of two variables, space (x) and time (t), and subscripts denote partial derivatives.
Let us look for solutions which move from left to right with constant velocity c > 0 and have a localized shape, i.e. we search for solutions satisfying

u(x,t) = f(x - ct),   with f, f', f'' → 0 as |x - ct| → ∞.     (2)
Plugging (2) into (1) gives us

-c f' + 6 f f' + f''' = 0,     (3)
which integrated with respect to ξ = x - ct leads to

-c f + 3 f² + f'' = A.     (4)
Multiplying this by f' and integrating once more turns it into

-(c/2) f² + f³ + (1/2)(f')² = A f + B,     (5)
where A and B are constants of integration. The conditions of vanishing derivatives in (2) imply that A = B = 0, leaving us with

(f')² = f² (c - 2f).     (6)
Solving by separation of variables yields the solution

f(ξ) = (c/2) sech²( (√c/2)(ξ - a) ),     (7)
where ξ = x - ct and a is an arbitrary constant. This is called a soliton, and it behaves as Russell's wave of translation: it travels with speed c, and its amplitude c/2 grows with its speed.
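The solitary-wave formula can be checked symbolically. A small SymPy sketch (the numeric value of c and the sample points are arbitrary choices) verifies that the sech² profile satisfies the KdV equation in the form u_t + 6 u u_x + u_xxx = 0:

```python
import sympy as sp

x, t = sp.symbols('x t', real=True)
c = sp.Rational(13, 10)          # any c > 0 will do
a = 0                            # phase constant

# Soliton profile: u = (c/2) sech^2( (sqrt(c)/2) (x - c t - a) )
u = c / 2 * sp.sech(sp.sqrt(c) / 2 * (x - c * t - a))**2

# Residual of the KdV equation u_t + 6 u u_x + u_xxx = 0
residual = sp.diff(u, t) + 6 * u * sp.diff(u, x) + sp.diff(u, x, 3)

# The residual vanishes identically; evaluate it at a few sample points.
vals = [abs(residual.subs({x: xv, t: tv}).evalf())
        for xv, tv in [(0, 0), (sp.Rational(1, 3), 1), (-2, sp.Rational(5, 7))]]
print(all(v < 1e-12 for v in vals))  # True
```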
There are also 2-soliton solutions of (1), in which two solitary waves seem to be unaware of each other for large |t|. In the middle (around t = 0), they overlap and interact nonlinearly. Shortly after the interaction, they reappear with no apparent change in size or shape. Nevertheless, there is some mark of the interaction left on them, namely phase shifts.
Despite these early results, solitary waves fell into scientific obscurity for almost 70 years.
When von Neumann's computer, named MANIAC-I, was completed in Los Alamos in 1951, Enrico Fermi, John Pasta and Stanisław Ulam promptly started to use it for numerical experiments in nonlinear physics and mathematics. As a test, Fermi suggested studying something to which he thought the answer was obvious: a fixed-end chain of 64 masses connected by springs exerting a nonlinear force between neighbouring weights; the force was taken to be a quadratic or cubic function of the displacement rather than a linear one. They started with a periodic vibration, a single sine wave, which, in the case of a linear force, would oscillate in that mode indefinitely.
They expected that the nonlinear force, perturbing the periodic linear solution, would cause oscillations of ever-increasing complexity, i.e. that the system would get into states where more and more Fourier modes are present. Physicists would say that they expected the system to "thermalize". They programmed the computer* and let it calculate. At first, they got the expected result and went out for lunch, but they had forgotten to turn the machine off! When they returned to the lab, they saw something incredible:
“…we indeed observe initially a gradual increase of energy in the higher modes as predicted. Mode 2 starts increasing first, followed by mode 3, and so on. Later on, however, this gradual sharing of energy among successive modes ceases. Instead, it is one or the other mode that predominates. For example, mode 2 decides, as it were, to increase rather rapidly at the cost of all other modes and becomes predominant. At one time, it has more energy than all others put together! The mode 3 undertakes this role. It is only the first few modes which exchange energy among themselves and they do this in a rather regular fashion. Finally, at a later time mode 1 comes back to within one percent of its initial value so that the system seems to be almost periodic.”
*It was Mary Tsingou who programmed the dynamics, ensured its accuracy and provided the graphs of the results.
For more on the FPU problem, see [3,4].
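The experiment is easy to reproduce in miniature. The sketch below is a toy version of the quadratic (α-type) chain — 32 masses instead of 64, with illustrative parameter values of my own choosing — integrated with the velocity Verlet scheme; it only checks that the integration conserves energy, since the qualitative recurrence needs much longer runs to see:

```python
import numpy as np

# FPU-alpha chain: N moving masses, fixed ends, quadratic nonlinearity.
N, alpha, dt, steps = 32, 0.25, 0.02, 5000
u = np.sin(np.pi * np.arange(1, N + 1) / (N + 1))   # single sine mode
v = np.zeros(N)

def accel(u):
    w = np.concatenate(([0.0], u, [0.0]))            # fixed ends
    d = np.diff(w)                                   # spring elongations
    f = d + alpha * d**2                             # nonlinear spring force
    return f[1:] - f[:-1]

def energy(u, v):
    w = np.concatenate(([0.0], u, [0.0]))
    d = np.diff(w)
    return 0.5 * (v**2).sum() + (0.5 * d**2 + alpha * d**3 / 3).sum()

E0 = energy(u, v)
a = accel(u)
for _ in range(steps):                               # velocity Verlet
    v += 0.5 * dt * a
    u += dt * v
    a = accel(u)
    v += 0.5 * dt * a

drift = abs(energy(u, v) - E0) / E0
print(drift < 1e-3)  # True: the symplectic integrator conserves energy well
```

Tracking the per-mode energies of `u` over much longer times reproduces the energy bouncing between the first few Fourier modes described in the quote.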
To make sense of the results of the FPU experiment, Zabusky and Kruskal [5] considered its continuum version by shrinking the masses and springs to infinitesimal size, hence producing a line of deformable material. The corresponding partial differential equation can be transformed into the KdV equation. (For details, click here!) Thus Zabusky and Kruskal [5] started to study numerical solutions of the KdV-type equation

u_t + u u_x + δ² u_xxx = 0     (8)
with δ = 0.022 and the periodic initial condition u(x,0) = cos(πx). At the start the dispersive term δ² u_xxx is small and can be neglected, leaving us with the equation

u_t + u u_x = 0.     (9)
Its solution can be given implicitly by

u = cos(π(x - ut)).     (10)
Such a u tends to develop a discontinuity at the critical time t_B = 1/π. Yet this is not what happens, since the term neglected at the beginning becomes significant: small-wavelength oscillations form and the shock is eluded. Soon after t_B these waves reach their final size and start to move as a train of solitons.
An amazing thing happens around t_R ≈ 30.4 t_B. The solitons of the spatially periodic solution of (8), through nonlinear interaction, arrive almost in the same phase and almost reconstruct the initial cosine curve. Hence t_R is referred to as the recurrence time. If this is the case in general, it explains the result of FPU.
Shortly after these initial results, Gardner et al. [6] gave a method of solving the KdV equation and formulated the following
Conjecture. Let u be any solution of (1) which is defined for all x and t and which vanishes as |x| → ∞. Then there exist a discrete set of positive numbers c_1, ..., c_N — called the eigenspeeds of u — and sets of phase shifts θ_j^± such that

u(x,t) ~ Σ_j (c_j/2) sech²( (√c_j/2)(x - c_j t - θ_j^±) )   as t → ±∞.     (11)
It was proven by Lax [7] that for any pair of speeds there is a corresponding solution of KdV which satisfies (11). The Conjecture was proven in many contexts and by many authors, including Ablowitz and Newell, Manakov, and Shabat.
Gardner, Kruskal and Miura [6] also made a remarkable discovery. The eigenvalues of the Schrödinger operator

L = -d²/dx² - u(x,t)     (12)
are invariant in time if u is a solution of the KdV equation (1). This means that there are infinitely many conserved quantities, i.e. functionals of u and its derivatives which are constant along solutions. For example,

∫ u dx,   ∫ (1/2) u² dx,   H = ∫ ( (1/2) u_x² - u³ ) dx.     (13)
Indeed, Kruskal, Gardner and Miura constructed explicitly an infinite sequence of conserved quantities F_0, F_1, F_2, .... In 1968, Lax showed that the KdV equation is equivalent to an equation of operators (now called a Lax equation) of the form

dL/dt = [B, L] = BL - LB,     (14)
where L is the Schrödinger operator (12) and B is the skew-symmetric operator

B = -4 d³/dx³ - 3 ( u d/dx + (d/dx) u ).     (15)
(If two operators satisfy (14), they are called a Lax pair.) To see why the eigenvalues are conserved, solve the operator differential equation

dU/dt = B U,   U(0) = I.     (16)
Then

d/dt ( U^{-1} L U ) = U^{-1} ( dL/dt - BL + LB ) U,
and due to (14) and (16) we have

d/dt ( U^{-1} L U ) = 0,   i.e.   U(t)^{-1} L(t) U(t) = L(0).
Therefore L(t) = U(t) L(0) U(t)^{-1}, i.e. L(t) and L(0) are similar and thus have the same eigenvalues: L is isospectral.
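The same similarity argument can be illustrated with matrices in place of differential operators. A finite-dimensional analogue (not the actual KdV operators): take a constant skew-symmetric B, form L(t) = e^{tB} L(0) e^{-tB}, and watch the spectrum stay fixed:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

# Finite-dimensional analogue of dL/dt = [B, L] with constant skew B:
# the solution is L(t) = e^{tB} L(0) e^{-tB}.
A = rng.standard_normal((5, 5))
L0 = A + A.T                      # symmetric, like the Schroedinger operator
B = A - A.T                       # skew-symmetric

def L(t):
    U = expm(t * B)               # solves dU/dt = B U, U(0) = I
    return U @ L0 @ U.T           # B skew => U orthogonal, U^{-1} = U^T

# The spectrum is invariant in time.
e0 = np.sort(np.linalg.eigvalsh(L0))
e1 = np.sort(np.linalg.eigvalsh(L(1.7)))
print(np.allclose(e0, e1))  # True
```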
In 1971, Zakharov and Faddeev [8] showed that the KdV equation is a completely integrable Hamiltonian system with infinitely many degrees of freedom. Indeed, it can be written in the Hamiltonian form

u_t = ∂/∂x ( δH/δu ),
where H is the third conserved quantity in (13) and δF/δu denotes the Fréchet (variational) derivative of the functional F, defined by

(d/dε) F(u + εv) |_{ε=0} = ∫ (δF/δu) v dx   for all test functions v.
The symplectic structure is the one associated with the operator ∂/∂x, and the Poisson bracket of two functionals F and G is defined by

{F, G} = ∫ (δF/δu) ∂/∂x (δG/δu) dx.
It can be shown that for any two first integrals F_m and F_n one has

{F_m, F_n} = {F_{m+1}, F_{n-1}},
thus if both indices are odd or both of them are even, iterating this identity leads to

{F_m, F_n} = {F_k, F_k} = 0
for some k. If m and n have different parity, then we get

{F_m, F_n} = {F_k, F_{k+1}} = {F_{k+1}, F_k},
which by antisymmetry implies again that

{F_m, F_n} = 0.
Thus the conserved quantities are in involution, and hence the KdV equation is completely integrable.
The ideas and results presented so far were all about the KdV equation. However, there are lots of other physically relevant nonlinear PDEs which have soliton solutions. Here we briefly describe two of them, namely the sine-Gordon and the nonlinear Schrödinger equation. We also mention several applications.
The sine-Gordon equation is a nonlinear PDE in 1+1 dimensions. It has the form

u_tt - u_xx + sin u = 0.
(Its name is a pun due to Kruskal and refers to the low-amplitude (sin u ≈ u) approximation, which is called the Klein-Gordon equation.) It can be interpreted as the equation describing the twisting of a continuous chain of needles attached to a flexible string. 1-soliton solutions are given by

u(x,t) = 4 arctan( exp( ±γ(x - vt) + δ ) ),     (28)
where δ is an arbitrary real number, v is real satisfying |v| < 1, and

γ = 1/√(1 - v²).
Depending on the sign, we call (28) a kink (+) or an antikink (-).
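The kink can be checked symbolically, exactly as the KdV soliton above. A SymPy sketch (the values of v, δ and the sample points are arbitrary) evaluates the residual of the sine-Gordon equation on the kink:

```python
import sympy as sp

x, t = sp.symbols('x t', real=True)
v = sp.Rational(3, 5)                    # any |v| < 1
gamma = 1 / sp.sqrt(1 - v**2)
delta = sp.Rational(1, 2)

# Kink solution (plus sign); the antikink has the opposite sign in the exponent.
u = 4 * sp.atan(sp.exp(gamma * (x - v * t) + delta))

# Residual of u_tt - u_xx + sin(u) = 0
residual = sp.diff(u, t, 2) - sp.diff(u, x, 2) + sp.sin(u)

vals = [abs(residual.subs({x: xv, t: tv}).evalf())
        for xv, tv in [(0, 0), (1, sp.Rational(1, 3)), (-2, 1)]]
print(all(v_ < 1e-12 for v_ in vals))  # True
```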
Similarly to KdV, 2-soliton solutions are as if 1-solitons would interact, namely we have (anti)kink-(anti)kink and kink-antikink collisions. But there is another type of 2-soliton solution, called breather, which looks like a coupled kink-antikink pair.
The sine-Gordon equation is relativistic, meaning that it is invariant under the Poincaré transformations of the (1+1)-dimensional space-time. For details, see [9].
The nonlinear Schrödinger (NLS) equation reads

i ψ_t = -(1/2) ψ_xx + κ |ψ|² ψ,
where ψ(x,t) is a complex-valued wave function and κ is a constant. The equation is non-relativistic (Galilei-invariant). This is also an exactly solvable Hamiltonian system [10]. Its Hamiltonian is given by

H = ∫ ( (1/2) |ψ_x|² + (κ/2) |ψ|⁴ ) dx.
Now let us list some applications of these soliton equations. The Korteweg-de Vries equation can be applied to describe shallow-water waves with weakly non-linear restoring forces and long internal waves in a density-stratified ocean. It is also useful in modelling ion acoustic waves in a plasma and acoustic waves on a crystal lattice.
The sine-Gordon solitons, kinks and breathers, are used as models of nonlinear excitations in complex systems in physics and even in cellular structures.
The nonlinear Schrödinger equation appears in the Manakov system, a model of wave propagation in fiber optics. The function ψ represents a wave and NLS describes the propagation of the wave through a nonlinear medium. The second-order derivative represents the dispersion, while the nonlinearity enters through the |ψ|²ψ term. The equation models many nonlinear effects in a fiber, including but not limited to self-phase modulation, four-wave mixing, second-harmonic generation and stimulated Raman scattering.
Dynamics is governed by a smooth real-valued function H on the phase space, called the energy function or Hamiltonian, and is prescribed by a system of ODEs, the equations of motion, which read as

dq_i/dt = ∂H/∂p_i,   dp_i/dt = -∂H/∂q_i,   i = 1, ..., n.     (1)
This can be written in the concise matrix-vector form

dx/dt = J ∇H(x),     (2)

where x = (q_1, ..., q_n, p_1, ..., p_n) and J is the 2n × 2n matrix built from n × n zero and identity blocks,

J = [ 0   I ]
    [ -I  0 ].
The following properties of J can be easily checked: Jᵀ = -J = J^{-1}, J² = -I, and det J = 1.
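These properties take one line each to confirm numerically (the block size n = 3 is an arbitrary choice):

```python
import numpy as np

n = 3
I = np.eye(n)
Z = np.zeros((n, n))
J = np.block([[Z, I], [-I, Z]])

print(np.allclose(J.T, -J))                  # J is antisymmetric
print(np.allclose(J @ J, -np.eye(2 * n)))    # J^2 = -Id
print(np.allclose(J.T, np.linalg.inv(J)))    # J^{-1} = J^T (J is orthogonal)
print(round(np.linalg.det(J)))               # det J = 1
```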
To obtain a particular solution, initial values x(0) = x_0 must be specified. Transformations of the phase space which preserve Hamilton's equations (2) are clearly of high importance. They're called canonical transformations. For example, when a system evolves for some time t, a canonical transformation g^t is induced by taking every point x_0 as an initial value and mapping it into g^t(x_0) = x(t). Now let's consider an arbitrary transformation y = f(x) of the phase space.
Then the time derivative of y along a solution reads as

dy/dt = M dx/dt = M J ∇_x H,
and hence, using ∇_x H = Mᵀ ∇_y H,

dy/dt = M J Mᵀ ∇_y H,
where M = ∂f/∂x is the Jacobian of the transformation f. Now it is clear that f is canonical iff M satisfies the following condition:

M J Mᵀ = J.     (8)
Such matrices are called symplectic^{(*)}. Let's take a closer look at the structure of such a matrix by splitting it up into four n × n blocks,

M = [ a  b ]
    [ c  d ].     (9)
Then condition (8) on M can be reformulated for the blocks as

a bᵀ = b aᵀ,   c dᵀ = d cᵀ,   a dᵀ - b cᵀ = I.     (10)
What's the determinant of a symplectic matrix? Well, since the determinant is multiplicative, it is immediate from (8) that (det M)² = 1, so det M = ±1. Actually, we can show that it's +1.
Claim. Every symplectic matrix has determinant +1.
Proof. Assuming that the block d is invertible, M can be written as a product

M = [ I  b d^{-1} ] [ a - b d^{-1} c  0 ]
    [ 0  I        ] [ c               d ],     (11)
thus its determinant is

det M = det( a - b d^{-1} c ) · det d,     (12)
where the first factor in (11) is triangular with unit diagonal, and the multiplicative property of the determinant was also utilized. Now, using eqs. (10) and (12) we get

( a - b d^{-1} c ) dᵀ = a dᵀ - b d^{-1} (d cᵀ) = a dᵀ - b cᵀ = I,

hence det( a - b d^{-1} c ) · det dᵀ = 1 and therefore det M = 1.
The proof is completed by noticing that symplectic matrices with an invertible block d form a dense subset of all symplectic matrices, and since the determinant is a continuous map, we have det M = 1 for all symplectic matrices.
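A numerical illustration: matrices of the form exp(JS) with S symmetric are symplectic (a standard fact, used here only to generate test matrices), and their determinant indeed comes out as +1:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
n = 3
I = np.eye(n)
Z = np.zeros((n, n))
J = np.block([[Z, I], [-I, Z]])

# exp(JS) with S symmetric is symplectic: JS is a "Hamiltonian" matrix,
# i.e. (JS)^T J + J (JS) = 0, which exponentiates to the symplectic condition.
S = rng.standard_normal((2 * n, 2 * n))
S = S + S.T
M = expm(0.3 * J @ S)

print(np.allclose(M @ J @ M.T, J))        # M satisfies M J M^T = J
print(np.isclose(np.linalg.det(M), 1.0))  # and indeed det M = +1
```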
The above fact has a profound consequence: canonical transformations preserve the phase-space volume element! This is Liouville's theorem. Furthermore, if the system has only bounded orbits, we get the Poincaré recurrence theorem, which states that any neighbourhood of an initial value x_0 is intersected infinitely many times by the solution starting from x_0 as t goes to infinity.
In a previous post a funny application of the recurrence theorem to the academic administrative system was described. Here we mention three more applications which are at least as interesting as the one mentioned before:
The question is the probability distribution of the first digit of the powers 2^n. Taking the base-10 logarithm we get

log₁₀ 2ⁿ = n log₁₀ 2 = k_n + x_n,

where k_n is an integer and x_n ∈ [0,1) is the fractional part. The first digit of 2ⁿ is d exactly when x_n lands in [log₁₀ d, log₁₀(d+1)). Since log₁₀ 2 is irrational, the orbit x_n is equidistributed in [0,1), so we can rephrase the question as follows: how often will the orbit intersect the intervals [log₁₀ d, log₁₀(d+1)), d = 1, ..., 9? The frequencies are exactly the lengths of the intervals, therefore

P(first digit is d) = log₁₀(1 + 1/d).
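This logarithmic first-digit law is easy to test empirically; a short Python sketch tallies the leading digits of 2^n and compares them with the interval lengths log₁₀(1 + 1/d):

```python
import math
from collections import Counter

# Leading digits of 2^n for n = 1..10000
counts = Counter(int(str(2**n)[0]) for n in range(1, 10001))

# Empirical frequency vs. the length of [log10(d), log10(d+1)]
for d in range(1, 10):
    print(d, counts[d] / 10000, round(math.log10(1 + 1 / d), 4))
```

The empirical frequencies agree with the predicted ones to within a fraction of a percent; for d = 1 both are about 0.301.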
Here's a video of pendulums of different lengths swinging around the same axis. One can clearly see that around 0:55 there are three sets of pendulums, and in each set the pendulums are synchronized. Then at 1:23 there are two such sets, and the recurrence time is ca. 2:45 minutes.
^{(*) The word ‘symplectic’ means complex and was coined by Weyl in 1939.}
Kenneth R. Meyer: An Application of Poincare’s Recurrence Theorem to Academic Administration
The present trend in science is to apply classical mathematics to nontraditional areas. This note gives an application of a classical theorem of dynamical systems to a long neglected area of study, academic administration, and thus proves that scientific research and academic administration are not mutually disjoint. It is the author's hope that other administrators will apply their early training in scientific research to study the quagmire into which they have slipped and thus carry forth this work.
A recurrent orbit in a system is one that returns infinitely often arbitrarily close to its initial position. Poincaré's recurrence theorem [1] states: In a compact conservative system almost all orbits are recurrent. Poincaré discovered this theorem in his investigations into the motion of celestial bodies, and until now it has not had applications to such terrestrial matters as academic administrative structures. However, we shall show that this theorem can easily be applied to explain an often observed phenomenon.
Lemma 1. An academic administrative system is conservative.
Proof. All decisions are made by applying the principle of least action and therefore the system is conservative by a classical theorem of Maupertuis [2].
Lemma 2. An academic administrative system is compact.
Proof. The system is governed by a finite number of arbitrarily short-sighted deans and is compact by definition.
Lemmas 1 and 2 verify the hypothesis of Poincaré’s recurrence theorem and therefore the conclusions hold for all academic administrations. An immediate consequence of this result is:
Theorem 1. Almost all administrators vacillate.
Finally, since many conservative systems are reversible, an administrator will not only return infinitely often to the same position but must have been there infinitely often in the past.
References
1. H. Poincaré, Les Méthodes Nouvelles de la Mécanique Céleste, Gauthier-Villars, Paris, 1892.
2. E. T. Whittaker, A Treatise on the Analytical Dynamics of Particles and Rigid Bodies, Cambridge University Press, 1904.
[1] K.R. Meyer, An Application of Poincaré’s Recurrence Theorem to Academic Administration, American Mathematical Monthly 88, 32-33, 1981. [PDF]
Consider the complex plane extended with the point at infinity, ℂ∞ = ℂ ∪ {∞}. The linear fractional transformations of ℂ∞, that is

f(z) = (az + b)/(cz + d),   a, b, c, d ∈ ℂ,   ad - bc ≠ 0,     (1)
form a group called the Möbius group. There are four basic types of such transformations: translations z ↦ z + b; rotations z ↦ az with |a| = 1; dilations z ↦ rz with r > 0; and the inversion z ↦ 1/z.
These can be combined to reproduce any transformation of the form (1). This fact shows that the Möbius group consists of conformal (= angle-preserving) transformations of the complex plane. Möbius transformations can be visualized by placing a sphere on the plane and using stereographic projection to identify a point on the surface of the sphere with a point of the extended plane as follows:
Then any motion of the sphere induces a Möbius transformation of ℂ∞. This identification is usually referred to as the Riemann sphere. Here is a nice video [1] demonstrating this:
To the matrix A = (a b; c d) in GL(2, ℂ) we can associate the transformation f_A of the form (1); this assignment is a group homomorphism, i.e.

f_A ∘ f_B = f_{AB}.     (3)

Another trivial observation is that multiplying the coefficients a, b, c, d by an overall non-zero factor gives the same transformation. In `matrix language' this means an equivalence relation on GL(2, ℂ) given by

A ∼ λA,   λ ∈ ℂ \ {0},     (4)
and one has to consider the quotient group, which is called the projective linear group and is isomorphic to the Möbius group:

PGL(2, ℂ) = GL(2, ℂ)/∼ ≅ Möbius group.     (5)
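The statement that matrix multiplication realizes composition — and that scalar multiples of a matrix give the same map — is easy to test numerically. A short sketch with random complex matrices:

```python
import numpy as np

rng = np.random.default_rng(2)

def mobius(M, z):
    """Apply the linear fractional transformation of the matrix M to z."""
    (a, b), (c, d) = M
    return (a * z + b) / (c * z + d)

A = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
B = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
z = 0.3 - 1.2j

# Composition of Moebius transformations = matrix multiplication ...
print(np.isclose(mobius(A, mobius(B, z)), mobius(A @ B, z)))   # True
# ... and scaling the matrix by a non-zero factor changes nothing.
print(np.isclose(mobius(A, z), mobius((2 - 1j) * A, z)))       # True
```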
This shows that the Möbius group is connected, i.e. there always exists a path (of transformations) between any two elements. One might ask whether it's simply connected, which is a stronger property, meaning that any loop can be (continuously) deformed to a point. To answer this question it's useful to choose the factor λ in (4) so that

det(λA) = 1,     (6)
and by that restrict the homomorphism, given in (3), to the group SL(2, ℂ) of determinant-one matrices. This choice fixes λ only up to sign: A and -A give the same Möbius transformation. Then it's apparent that any path connecting I and -I in SL(2, ℂ) maps to a loop around the identity of the Möbius group, and this loop clearly cannot be shrunk to a point. Thus the Möbius group is not simply connected.
Now, let's consider just those Möbius transformations which correspond to rotations of the Riemann sphere. A rotation about the third coordinate axis obviously comes from the transformation z ↦ e^{iθ} z; the corresponding matrices in SL(2, ℂ) are

± [ e^{iθ/2}   0        ]
  [ 0          e^{-iθ/2} ].     (7)

The homomorphism sending a matrix to the induced rotation
then implies, by the same argument as the one below (6), that SO(3) is not simply connected. The rotations around the other two axes are given by the matrices

± [ cos(θ/2)    i sin(θ/2) ]        ± [ cos(θ/2)    sin(θ/2) ]
  [ i sin(θ/2)  cos(θ/2)   ]   and    [ -sin(θ/2)   cos(θ/2) ].     (8)
The three (actually six) matrices given in (7) and (8) are unitary with determinant one. This means that we obtained a homomorphism between the groups SU(2) and SO(3). Again, it's a 2-fold cover, and since SU(2) is simply connected, it's the universal covering of SO(3). As a side note, we mention that the unitary matrices in (7) and (8) can be written as

e^{iθσ_3/2},   e^{iθσ_1/2},   e^{iθσ_2/2},
where σ_1, σ_2, σ_3 are the famous Pauli matrices

σ_1 = [ 0  1 ]   σ_2 = [ 0  -i ]   σ_3 = [ 1   0 ]
      [ 1  0 ],        [ i   0 ],        [ 0  -1 ].
A useful application is that a rotation about an axis given by the unit vector n = (n_1, n_2, n_3), through an angle θ, can be obtained from the unitary matrix

U = e^{-iθ(n·σ)/2} = cos(θ/2) I - i sin(θ/2) (n·σ),

where n·σ = n_1 σ_1 + n_2 σ_2 + n_3 σ_3.
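A numpy check of this recipe. It verifies that U is unitary with unit determinant, and that conjugating v·σ by U rotates v exactly as Rodrigues' rotation formula predicts (the axis, angle and test vector are arbitrary choices):

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

def U(n, theta):
    """U = cos(theta/2) I - i sin(theta/2) (n . sigma)."""
    ns = n[0] * s1 + n[1] * s2 + n[2] * s3
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * ns

def rodrigues(n, theta, v):
    """Rotation of v about the unit axis n by the angle theta."""
    n, v = np.asarray(n, float), np.asarray(v, float)
    return (v * np.cos(theta) + np.cross(n, v) * np.sin(theta)
            + n * np.dot(n, v) * (1 - np.cos(theta)))

n = np.array([1.0, 2.0, 2.0]) / 3.0          # unit axis
theta, v = 0.8, np.array([0.3, -1.0, 0.5])
u = U(n, theta)

# U is unitary with determinant 1 ...
print(np.allclose(u @ u.conj().T, np.eye(2)), np.isclose(np.linalg.det(u), 1))
# ... and conjugation by U rotates v:  U (v.sigma) U^+ = (Rv).sigma
lhs = u @ (v[0] * s1 + v[1] * s2 + v[2] * s3) @ u.conj().T
w = rodrigues(n, theta, v)
print(np.allclose(lhs, w[0] * s1 + w[1] * s2 + w[2] * s3))  # True
```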
^{Footnote 1. Or rather, combined with a reflection across the real axis. ^}
The puzzle
I took some nephews and nieces to the Zoo, and we halted at a cage marked
- Tovus Slithius, male and female.
- Beregovus Mimsius, male and female.
- Rathus Momus, male and female.
- Jabberwockius Vulgaris, male and female.
The eight animals were asleep in a row, and the children began to guess which was which. “That one at the end is Mr. Tove.” “No, no! It’s Mrs. Jabberwock,” and so on. I suggested that they should each write down the names in order from left to right, and offered a prize to the one who got most names right.
As the four species were easily distinguished, no mistake would arise in pairing the animals; naturally a child who identified one animal as Mr. Tove identified the other animal of the same species as Mrs. Tove.
The keeper, who consented to judge the lists, scrutinised them carefully. “Here’s a queer thing. I take two of the lists, say, John’s and Mary’s. The animal which John supposes to be the animal which Mary supposes to be Mr. Tove is the animal which Mary supposes to be the animal which John supposes to be Mrs. Tove. It is just the same for every pair of lists, and for all four species.
“Curiouser and curiouser! Each boy supposes Mr. Tove to be the animal which he supposes to be Mr. Tove; but each girl supposes Mr. Tove to be the animal which she supposes to be Mrs Tove. And similarly for the other animals. I mean, for instance, that the animal Mary calls Mr. Tove is really Mrs. Rathe, but the animal she calls Mrs. Rathe is really Mrs. Tove.”
“It seems a little involved,” I said, “but I suppose it is a remarkable coincidence.”
“Very remarkable,” replied Mr. Dodgson (whom I had supposed to be the keeper) “and it could not have happened if you had brought any more children.”
How many nephews and nieces were there? Was the winner a boy or a girl? And how many names did the winner get right?
A possible solution involves a clever application of matrices, in which each child's list is encoded in a 4×4 matrix in the following way. We label each species with a number: 1 for Tovus, 2 for Beregovus, 3 for Rathus, and 4 for Jabberwockius. Then a guess is nothing but a permutation of the numbers 1, 2, 3, 4. For example, the guess which gets the Beregovus (2) right, but supposes a Tovus (1) to be a Rathus (3), a Rathus (3) to be a Jabberwockius (4), and a Jabberwockius (4) to be a Tovus (1) can be written as 1 ↦ 3, 2 ↦ 2, 3 ↦ 4, 4 ↦ 1. Thus a so-called permutation matrix can be constructed, whose only non-zero elements are the entries corresponding to these four assignments, so it looks something like this
In addition, we must indicate whether the genders were guessed correctly or not. This can be achieved by setting the non-zero entry to +1 if the child got the genders of that species right, and to −1 if not. Let us assume that in the previous example the genders of the Tovuses and the Beregovuses were guessed right; then we get the matrix
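The encoding can be sketched in code (Python with NumPy; the function name and the row/column convention are our own choices, not fixed by the text):

```python
import numpy as np

def guess_matrix(perm, gender_right):
    """Build the signed permutation matrix of a child's list.

    perm[i] = j means species i+1 is guessed to be species j+1 (0-indexed);
    gender_right[i] says whether the genders of species i+1 were guessed right.
    """
    n = len(perm)
    M = np.zeros((n, n), dtype=int)
    for i, j in enumerate(perm):
        M[j, i] = 1 if gender_right[i] else -1
    return M

# The worked example: 1 -> 3, 2 -> 2, 3 -> 4, 4 -> 1, with the genders of
# the Tovuses (1) and Beregovuses (2) guessed correctly.
M = guess_matrix([2, 1, 3, 0], [True, True, False, False])
print(M)
```

Whether the ±1 sits at row j, column i or the transpose is a matter of convention; either choice works as long as it is used consistently.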
Let denote the matrices made from the lists as explained above. The first few matrices belong to the boys, and the rest belong to the girls.
Now, it’s time to decrypt the keeper’s observations. Mr. Dodgson’s first clue simply means that any pair of different matrices anti-commute, i.e.
Let’s translate into matrix form the second clue about the matrices of boys and girls. It states that any matrix belonging to a boy has a square equal to the identity matrix, and any girl’s matrix has square equal to minus the identity, that is
and
These matrix equations are very important. The system was solved in 1928 by Dirac, who found a set of matrices satisfying these conditions and used them to construct his relativistic equation for the electron, which led him to predict the existence of the positron (discovered in 1932). The famous Dirac equation can be written as
where now we should concentrate on the objects . These are the matrices in question. They have the form
(The index 5 is due to the fact that physicists used to call to .)
To answer Eddington's question: there were 5 children, 3 boys and 2 girls; the winner was a boy, who guessed 4 animals correctly in total, both in species and gender.
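The keeper's conditions can be verified directly. The sketch below (our own construction; these matrices are one convenient choice of signed permutation matrices, built from 2×2 blocks via Kronecker products, and not necessarily Dirac's basis or the matrices displayed above) exhibits five pairwise anticommuting 4×4 signed permutation matrices, three squaring to +I and two to −I, and reads off the winner.

```python
import numpy as np

I2 = np.eye(2, dtype=int)
X = np.array([[0, 1], [1, 0]])    # square +I
Z = np.array([[1, 0], [0, -1]])   # square +I
J = np.array([[0, -1], [1, 0]])   # square -I
kron = np.kron

# Five pairwise anticommuting signed permutation matrices:
# three 'boys' (square +I) and two 'girls' (square -I).
boys = [kron(Z, I2), kron(X, X), kron(X, Z)]
girls = [kron(X, J), kron(J, I2)]
children = boys + girls

I4 = np.eye(4, dtype=int)
for a in range(len(children)):
    for b in range(a + 1, len(children)):
        A, B = children[a], children[b]
        assert np.array_equal(A @ B, -(B @ A))  # Mr. Dodgson's first clue
for B in boys:
    assert np.array_equal(B @ B, I4)            # boys square to +I
for G in girls:
    assert np.array_equal(G @ G, -I4)           # girls square to -I

# The winner is the only child with a nonzero diagonal, diag(1, 1, -1, -1):
# all four species right, two species right in gender as well, i.e. four of
# the eight animals correctly named.
print(2 * int(np.trace(np.maximum(children[0], 0))))  # prints 4
```

One can also check that no sixth signed permutation matrix can be added to the set, matching Mr. Dodgson's remark that it "could not have happened if you had brought any more children".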
The investigation of random matrices was initiated by the early work of Wishart [1] in 1928, when he introduced the probability distribution of random matrices now bearing his name. The Wishart distribution generalizes the distribution of a sum of squares of normal variables (a.k.a. the χ²-distribution) and can be used for the maximum-likelihood estimation of the covariance matrix of i.i.d. vectors drawn from a multivariate normal distribution. The Wishart ensemble has degrees of freedom and is characterized by a fixed real positive definite matrix . It consists of real symmetric, positive definite matrices of size with the probability density function
where denotes the gamma function . In the special case , the density (1) takes the form of the χ²-distribution.
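The construction and the one-dimensional reduction can be checked by simulation (a sketch with NumPy; the choice Σ = I and all variable names are our own): a Wishart sample is W = XXᵀ for a Gaussian matrix X, and in one dimension this is just a sum of n squared standard normals, i.e. a χ² variable with mean n and variance 2n.

```python
import numpy as np

rng = np.random.default_rng(0)

# A p x p Wishart matrix with Sigma = I and n degrees of freedom:
# W = X X^T where X has n i.i.d. standard normal columns.
p, n = 3, 7
X = rng.standard_normal((p, n))
W = X @ X.T
assert np.allclose(W, W.T)                # symmetric
assert np.all(np.linalg.eigvalsh(W) > 0)  # positive definite (since n >= p)

# The 1 x 1 case reduces to the chi-squared distribution,
# with mean n and variance 2n.
w = (rng.standard_normal((200_000, n)) ** 2).sum(axis=1)
print(round(w.mean(), 1), round(w.var(), 1))  # close to 7 and 14
```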
The first surprising employment of random matrices was in nuclear physics. It was Wigner who, in his 1951 paper [2], proposed "a new kind of statistical physics" based on random matrices to model certain properties of the excited states of heavy nuclei. His intent was to gain meaningful information about the high-energy states of complex systems by considering ensembles of different Hamiltonians, which are related by their 'symmetry'. The theoretical footing of Wigner's idea is Bohr's compound-nucleus model from 1936 [3], which was designed to explain nuclear reactions, such as neutrons bombarding a nucleus of not too small atomic weight, as two-stage events:
Between the two events there is a relatively long period of time, typically to seconds. To demonstrate his model, Bohr built a mechanical toy model (see Figure 1 below). A number of billiard balls in a shallow basin represent the target nucleus. The incoming ball enters the basin, hitting other particles and quickly distributing its energy among them. If the basin and the particles are regarded as perfectly smooth and elastic, there comes a point when some particles close to the basin's edge have received so much energy that they exit the compound.
At first glance, Wigner's proposal seems baffling for (at least) two reasons. First, the energy levels and eigenstates of a quantum mechanical system are completely determined by its Hamiltonian, so one could ask: "How could a statistical approach be of any use?" One answer is that the energy levels of highly excited states of heavy nuclei become so dense that precise information about individual levels cannot be obtained by exact methods. Second, there is the use of an ensemble of Hamiltonians to describe a single system. This is in opposition to standard statistical mechanics, where one considers copies of identical physical systems, all governed by the same Hamiltonian but differing in initial conditions, and calculates thermodynamic functions by averaging over this ensemble. As Dyson explains in [4]:
“We picture a complex nucleus as a ‘black box’ in which a large number of particles are interacting according to unknown laws. The problem then is to define in a mathematically precise way an ensemble of systems in which all possible laws of interaction are equally probable.”
Wigner considered real symmetric matrices whose algebraically independent entries , are also statistically independent normal variables with the probability distribution
where is a normalization constant. This is called the Gaussian Orthogonal Ensemble (GOE). The word 'orthogonal' indicates that the distribution (2) is invariant under conjugation with orthogonal matrices, that is, with . Various spectral statistics of this ensemble can be calculated, such as the joint eigenvalue distribution, the nearest-neighbour spacing distribution, correlation functions, etc. When compared with experimental data, the effectiveness of the GOE is striking (see Figure 2).
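The orthogonal invariance is easy to illustrate numerically. The sketch below (our own code, assuming the standard recipe of producing a GOE sample by symmetrizing a matrix of i.i.d. Gaussians) conjugates a GOE matrix by a random orthogonal matrix and checks that the spectrum is unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 4

# A GOE sample: independent normal entries, then symmetrized.
A = rng.standard_normal((N, N))
H = (A + A.T) / 2

# A random orthogonal matrix from a QR decomposition of a Gaussian matrix.
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))
assert np.allclose(Q @ Q.T, np.eye(N))

# Conjugation by Q gives a matrix with the same distribution;
# in particular the eigenvalues are exactly unchanged.
print(np.allclose(np.linalg.eigvalsh(Q @ H @ Q.T),
                  np.linalg.eigvalsh(H)))  # True
```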
In 1962, Dyson showed [4] that there are three invariant ensembles of random matrices: orthogonal, unitary, and symplectic. Again, 'orthogonal', 'unitary', and 'symplectic' refer to the symmetry groups. Those with Gaussian entries are called the GOE, GUE, and GSE. The joint distribution of their eigenvalues can be written as
with β = 1 for the orthogonal, β = 2 for the unitary, and β = 4 for the symplectic ensemble. For more on the topic of random matrices in physics see, for example, Mehta's book [5] or the 7th John von Neumann Lecture given by Wigner [6].
An astounding connection was discovered between analytic number theory and random matrices in 1972, when Montgomery gave a talk in Princeton at the Institute for Advanced Study about his results on the non-trivial zeros of the Riemann zeta function
He explained how the zeros on the critical line, viz. those such that , tend to repel each other, and that he had conjectured a formula for the distribution of the gaps between them. After the talk, during teatime, Montgomery spoke to Dyson, who hadn't attended the talk. Here is Montgomery's recollection of the conversation:
“Freeman Dyson was standing across the room. I had spent the previous year at the Institute and I knew him perfectly well by sight, but I had never spoken to him. Chowla said: ‘Have you met Dyson?’ I said no, I hadn’t. He said: ‘I’ll introduce you.’ I said no, I didn’t feel I had to meet Dyson. Chowla insisted, and so I was dragged reluctantly across the room to meet Dyson. He was very polite, and asked me what I was working on. I told him I was working on the differences between the non-trivial zeros of Riemann’s zeta function, and that I had developed a conjecture that the distribution function for those differences had integrand . He got very excited. He said: ‘That’s the form factor for the pair correlation of eigenvalues of random Hermitian matrices!’ “
For a more detailed description, see [7].
A significant boost to random matrix theory was the discovery of its connection with classical and quantum chaos. In the late '70s and early '80s, the quantum spectra of conservative systems with chaotic classical analogues were explored (e.g. the Sinai billiard). Sequences of energy levels of the same symmetry class were generated numerically. In 1984, Bohigas, Giannoni, and Schmit [8] produced a statistically significant set of data consisting of more than 700 eigenvalues of the Sinai billiard. They applied methods of statistical analysis (invented by Dyson and Mehta for problems in nuclear physics) and found that the nearest-neighbour spacings of the levels were distributed like those of the GOE.
They formulated the conjecture that quantum systems with time-reversal symmetry, whose classical counterparts are chaotic, have the spectral statistics of the GOE; in the absence of time-reversal symmetry, the GOE is replaced by the GUE. In contrast, Berry and Tabor conjectured in 1977 [9] that if the corresponding classical dynamics is completely integrable, then the nearest-neighbour spacings are distributed like the waiting times between consecutive events of a Poisson process. Both conjectures are still open, but they are strongly supported by a vast number of examples.
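These two conjectured behaviours are easy to see in simulation. The sketch below (our own code; it uses the consecutive-spacing-ratio statistic, which avoids the unfolding step, rather than the spacing distribution itself) compares eigenvalue spacing ratios of GOE matrices with those of independent uniform points: level repulsion pushes the GOE mean ratio (about 0.536) above the Poisson value 2 ln 2 − 1 ≈ 0.386.

```python
import numpy as np

rng = np.random.default_rng(2)

def spacing_ratios(values):
    """r_i = min(s_i, s_{i+1}) / max(s_i, s_{i+1}) for consecutive spacings."""
    s = np.diff(np.sort(values))
    return np.minimum(s[:-1], s[1:]) / np.maximum(s[:-1], s[1:])

N, samples = 200, 50

goe_r = []
for _ in range(samples):
    A = rng.standard_normal((N, N))
    H = (A + A.T) / 2                      # a GOE sample
    goe_r.extend(spacing_ratios(np.linalg.eigvalsh(H)))

# Independent uniform points: spacings behave like a Poisson process.
poisson_r = spacing_ratios(rng.uniform(size=N * samples))

print(round(float(np.mean(goe_r)), 2), round(float(np.mean(poisson_r)), 2))
```

The ratio statistic is insensitive to the slowly varying mean level density, which is why no unfolding of the spectra is needed here.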
Consult [10] for an extensive review about random matrix theories and their applications in quantum physics.
References