You probably know that, for natural number exponents anyway, exponentiation can be treated as repeated multiplication.
More formally, we define exponentiation recursively by

$$b^0 = 1, \qquad b^{n+1} = b \cdot b^n$$

for $n \in \mathbb{N}$. You probably also know, however, that exponentiation isn’t solely defined for natural number exponents. Raising a number to the power of $-1$, for example, gives us its multiplicative inverse (if it exists), and $b^{1/2}$ is just the square root of $b$. We can even raise complex numbers to complex numbers (e.g., $i^i = e^{-\pi/2}$) (*).
When I first learned this, I began to wonder if repeated exponentiation, or tetration, could be extended to non-integer heights. In other words, I wondered if it was possible to make sense of ${}^{x}b$ for $x \notin \mathbb{N}$ in the same way we can make sense of $b^x$ for $x \notin \mathbb{N}$. By the way, we denote the tetration of $b$ to height $n$ with ${}^{n}b$ or $b \uparrow\uparrow n$.
My first thought was to find any nice properties that tetration satisfies so that I could use those to at least extend tetration to real heights. However, because exponentiation isn’t associative, properties like $b^{m+n} = b^m \cdot b^n$ don’t carry over to tetration (so ${}^{m+n}b \neq \left({}^{m}b\right)^{\left({}^{n}b\right)}$ in most cases). In fact, the only real nice property of tetration comes from its recursive definition: ${}^{n+1}b = b^{\left({}^{n}b\right)}$. This meant I’d have to be more clever to find the most ‘natural’ extension of the tetrational to non-integer heights. Eventually, I gave up on the problem and moved on to other things.
Around six years later, however, I learned that there is, in fact, a way to systematically find non-integer iterates of certain functions, including exponential functions. This makes evaluating, say, $\exp^{\circ 1/2}(1)$ or ${}^{0.5}\sqrt{2}$ possible. Not only that, but the way in which we can find non-integer iterates of functions is strikingly similar to one of the more well-known methods of finding non-integer powers of certain square matrices.
As a refresher, linear transformations are functions between vector spaces which preserve addition and scalar multiplication. Specifically, if $V$ and $W$ are vector spaces over a field $F$ and $T \colon V \to W$ is a linear transformation between $V$ and $W$, then

$$T(u + v) = T(u) + T(v) \quad \text{and} \quad T(cv) = c\,T(v)$$

for any $u, v \in V$ and $c \in F$.
Every finite-dimensional vector space $V$ has a basis $B$, a subset of $V$ which satisfies two properties:
- $B$ is linearly independent, i.e., there does not exist a vector in $B$ that is a linear combination of other vectors in $B$.
- $B$ spans $V$, i.e., every vector in $V$ is a linear combination of some vectors in $B$.
This allows us to represent vectors as columns of scalars. If $V$ is a vector space over the field $F$, $v$ is a vector in $V$, $B = (b_1, b_2, \ldots, b_n)$ is an ordered basis of $V$, and

$$v = c_1 b_1 + c_2 b_2 + \cdots + c_n b_n$$

for some $c_1, c_2, \ldots, c_n \in F$, then we may represent $v$ by the coordinate vector

$$[v]_B = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}.$$

$c_1, c_2, \ldots, c_n$, by the way, are known as the coordinates of $v$ relative to $B$.
Furthermore, if $W$ is another vector space over $F$, $C = (w_1, w_2, \ldots, w_m)$ is an ordered basis of $W$, and $T \colon V \to W$ is a linear transformation, then for each $j$, $T(b_j)$ can be written as a linear combination of vectors in $C$. In other words,

$$T(b_j) = a_{1j} w_1 + a_{2j} w_2 + \cdots + a_{mj} w_m$$

for some $a_{1j}, a_{2j}, \ldots, a_{mj} \in F$. Hence,

$$T(v) = \sum_{j=1}^{n} c_j\,T(b_j) = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} a_{ij} c_j \right) w_i.$$

This means that, as long as we know the coordinates of $v$, all we need in order to find the coordinates of $T(v)$ are the coordinates of $T(b_j)$ for $j = 1, 2, \ldots, n$. We may write these coordinates in an $m$-by-$n$ rectangular array known as a matrix:

$$[T]_B^C = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}.$$
Notice how for each $j$, the $j$-th column of the matrix above is just $[T(b_j)]_C$.
Because we want $[T]_B^C$ to behave like $T$, we define matrix-vector multiplication so that

$$[T]_B^C\,[v]_B = [T(v)]_C, \quad \text{i.e.,} \quad \left([T]_B^C\,[v]_B\right)_i = \sum_{j=1}^{n} a_{ij} c_j.$$
Furthermore, if $X$ is yet another vector space over $F$, $D$ is a basis of $X$, and $S \colon W \to X$ is a linear transformation, then we define the matrix product of $[S]_C^D$ and $[T]_B^C$ by

$$\left([S]_C^D\,[T]_B^C\right)_{ij} = \sum_{k=1}^{m} \left([S]_C^D\right)_{ik} \left([T]_B^C\right)_{kj}$$

so that matrix multiplication acts like the composition of linear transformations:

$$[S]_C^D\,[T]_B^C = [S \circ T]_B^D.$$
Note that $[S]_C^D$ has the same number of columns as $[T]_B^C$ has rows ($m$ in this case). This is because the domain (the set of inputs) of $S$ is the same as the codomain (the set of possible outputs) of $T$. In fact, two matrices can be multiplied if and only if the number of columns in the left-hand matrix is the same as the number of rows in the right-hand matrix.
Matrices with the same number of rows and columns are known as square matrices, and what makes square matrices special is that they can be multiplied by themselves. This means it makes sense to recursively define non-negative integer powers of square matrices. Specifically, if $A$ is an $n \times n$ matrix, then we define

$$A^0 = I, \qquad A^{k+1} = A \cdot A^k$$

for all $k \in \mathbb{N}$, where $I$ is the $n \times n$ identity matrix.
Furthermore, if $A$ is invertible (i.e., there exists a matrix $A^{-1}$ such that $A A^{-1} = A^{-1} A = I$), we can define

$$A^{-k} = \left(A^{-1}\right)^{k}$$

for all $k \in \mathbb{N}$.
Square matrices whose entries outside the main diagonal are all equal to $0$ are known as diagonal matrices, and they have the nice property that raising them to a given power is the same as raising their entries along the main diagonal to that power:

$$\begin{bmatrix} d_1 & & \\ & \ddots & \\ & & d_n \end{bmatrix}^{k} = \begin{bmatrix} d_1^k & & \\ & \ddots & \\ & & d_n^k \end{bmatrix}.$$
This gives us an easy way to compute arbitrarily high powers of diagonalizable matrices, matrices which can be written in the form $A = P D P^{-1}$ where $D$ is a diagonal matrix and $P$ is an invertible matrix:

$$A^k = \left(P D P^{-1}\right)^k = P D P^{-1}\,P D P^{-1} \cdots P D P^{-1} = P D^k P^{-1}.$$

Notice how the adjacent $P^{-1}$’s and $P$’s cancel each other out.

Interestingly, it also gives us a way to raise diagonalizable matrices to fractional powers, and I’ll end this section with an example of a ‘square root’ of a diagonalizable matrix. Defining $A^{1/2} = P D^{1/2} P^{-1}$, where $D^{1/2}$ is obtained by taking a square root of each diagonal entry of $D$, we get

$$A^{1/2} A^{1/2} = P D^{1/2} P^{-1}\,P D^{1/2} P^{-1} = P D P^{-1} = A.$$
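To make this concrete, here’s a small numerical sketch in Python with NumPy (the matrix is my own illustrative choice) of computing a ‘square root’ of a diagonalizable matrix via its eigendecomposition:

```python
import numpy as np

# an example diagonalizable (here symmetric) matrix
A = np.array([[5.0, -2.0],
              [-2.0, 5.0]])

# eigendecomposition A = P D P^{-1}: w holds the eigenvalues of A,
# and the columns of P are the corresponding eigenvectors
w, P = np.linalg.eig(A)

# A^(1/2) = P D^(1/2) P^{-1}, taking square roots along the main diagonal
sqrtA = P @ np.diag(np.sqrt(w)) @ np.linalg.inv(P)

# squaring the 'square root' recovers A
assert np.allclose(sqrtA @ sqrtA, A)
```

The same pattern works for any power $A^t = P D^t P^{-1}$, as long as the eigenvalues admit the chosen power.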
A function composed with itself a certain number of times is known as an iterated function. Normally, we denote $f$ composed with itself $n$ times by $f^n$, but $f^n(x)$ can also mean $f(x)$ raised to the power of $n$ (e.g., $\sin^2(x)$ almost always means $(\sin(x))^2$ and not $\sin(\sin(x))$), so we’ll denote the $n$-th iterate of $f$ by $f^{\circ n}$ instead to avoid ambiguity.
Formally, we define $f^{\circ n}$ recursively as follows:

$$f^{\circ 0} = \operatorname{id}, \qquad f^{\circ (n+1)} = f \circ f^{\circ n}.$$

$\operatorname{id}$, by the way, is the identity function, which maps each element of $f$’s domain to itself.
If $f$ has an inverse $f^{-1}$, then we can define

$$f^{\circ (-n)} = \left(f^{-1}\right)^{\circ n}$$

for all $n \in \mathbb{N}$.
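As a small sketch (the helper name `iterate` is my own), integer iteration, including negative iterates via the inverse, might look like this in Python:

```python
def iterate(f, n, f_inv=None):
    # return the n-th iterate of f; negative n iterates the inverse instead
    def nth(x):
        g = f if n >= 0 else f_inv
        for _ in range(abs(n)):
            x = g(x)
        return x
    return nth

double = lambda x: 2 * x
halve = lambda x: x / 2

assert iterate(double, 3)(1) == 8          # f∘3(1) = 2*2*2*1
assert iterate(double, 0)(7) == 7          # f∘0 = id
assert iterate(double, -2, halve)(8) == 2  # f∘(-2) = (f^{-1})∘2
```

Nothing in this loop, of course, makes sense for a non-integer `n` — which is exactly the gap the rest of the post fills.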
Now in dynamics, it’s often useful to consider dynamical systems defined by functional iteration, treating the number of iterations as time.
For $t \in \mathbb{Z}$, evaluating such a system’s state at time $t$ is as simple as applying some function $f$ or its inverse to the initial state a certain number of times.
For $t \notin \mathbb{Z}$, however, it’s not that easy. I mean, you can’t apply $f$ to a state a non-integer number of times, so if we want to construct a smooth orbit for our dynamical system, we’ll have to find the fractional iterates of $f$. This is akin to the problem of finding fractional powers of square matrices, which we’re able to do via diagonalization, so the question is: can we do something similar to find fractional iterates of real/complex analytic functions?
Surprisingly, the answer is yes, and the analogue to diagonalization for analytic functions is Schröder’s equation, named after German mathematician Ernst Schröder (not to be confused with Erwin Schrödinger).
Given a function $f$ and a constant $s$, we are asked to solve for a function $\Psi$ such that

$$\Psi(f(z)) = s\,\Psi(z)$$

for all $z$ in the domain of $f$. If $\Psi$ has an inverse $\Psi^{-1}$, then we can apply it to both sides and obtain

$$f(z) = \Psi^{-1}\!\left(s\,\Psi(z)\right),$$
and you might see where this is going from here. Iterating gives us

$$f^{\circ n}(z) = \Psi^{-1}\!\left(s^n\,\Psi(z)\right).$$

Functional iteration has now been transformed into exponentiation! What this means is that, as long as we can solve for $\Psi$, we can compute $f^{\circ t}$ with $t$ not necessarily in $\mathbb{N}$.
But how do we solve for $\Psi$? How can we tell whether or not a solution for $\Psi$ even exists?
Solving Schröder’s Equation I
No matter our choice of $f$, there’s always one trivial solution to Schröder’s equation: $\Psi = 0$. Indeed, we see that $0 = s \cdot 0$ for all $z$. However, this solution isn’t helpful because it’s not invertible, or in other words, $\Psi^{-1}$ doesn’t exist.
So, let’s assume that a non-trivial solution to Schröder’s equation exists. Typically, solving for constants is easier than solving for functions in a functional equation, so a good first course of action would be to solve for $s$.
Because $s$ doesn’t depend on $z$, we should substitute $z$ with a value which simplifies our expression somehow. A good candidate would be a fixed point of $f$, which we’ll denote by $z_0$, since $f(z_0) = z_0$ by definition.
Assuming $z_0$ exists, substituting $z = z_0$ into Schröder’s equation gives us

$$\Psi(z_0) = s\,\Psi(z_0),$$

and we’re now left with only two possibilities: either $s = 1$ or $s \neq 1$ and $\Psi(z_0) = 0$. If $s = 1$, then Schröder’s equation becomes effectively useless for finding iterates of $f$. This is because $s^t = 1$ for all $t$, so we’re left with

$$f^{\circ t}(z) = \Psi^{-1}\!\left(1^t\,\Psi(z)\right) = \Psi^{-1}(\Psi(z)) = z.$$
Thus, we’ll only focus on the case where $s \neq 1$ and $\Psi(z_0) = 0$. Taking the derivative of both sides of Schröder’s equation with respect to $z$ yields

$$\Psi'(f(z))\,f'(z) = s\,\Psi'(z),$$

and once again, we’ll substitute $z = z_0$ into this new equation:

$$\Psi'(z_0)\,f'(z_0) = s\,\Psi'(z_0).$$
Because we might want $\Psi$ to be analytic at $z_0$, we’ll assume $\Psi'(z_0)$ exists. We’re again left with two possibilities: either $\Psi'(z_0) = 0$ or $\Psi'(z_0) \neq 0$ and $s = f'(z_0)$. If $\Psi'(z_0) = 0$, then

$$\left(\Psi^{-1}\right)'(\Psi(z_0)) = \frac{1}{\Psi'(z_0)} = \frac{1}{0},$$

so $\left(\Psi^{-1}\right)'(0)$ doesn’t exist, and hence, $\Psi^{-1}$ is not analytic at $0$. However, one might want $\Psi^{-1}$ to be analytic at $0$, so we’ll assume $\Psi'(z_0) \neq 0$. Consequently, $s = f'(z_0)$.
We can now rewrite Schröder’s equation like this:

$$\Psi(f(z)) = f'(z_0)\,\Psi(z).$$
As a side note, there are three types of fixed points of differentiable functions.

The first is the attracting fixed point. A fixed point $z_0$ of $f$ is said to be attracting if $|f'(z_0)| < 1$.

The second is the repelling fixed point. A fixed point $z_0$ of $f$ is said to be repelling if $|f'(z_0)| > 1$.

Finally, the third is the neutral fixed point. A fixed point $z_0$ of $f$ is said to be neutral if $|f'(z_0)| = 1$.

A subtype of the attracting fixed point is the superattracting fixed point. A fixed point $z_0$ of $f$ is said to be superattracting if $f'(z_0) = 0$.
Solving Schröder’s Equation II
To answer whether or not a non-trivial solution for $\Psi$ exists, we turn to a sufficient, though not necessary, condition given by French mathematician Gabriel Koenigs in 1884: if $z_0$ is an attracting, but not superattracting, fixed point of $f$ (i.e., $0 < |f'(z_0)| < 1$) and $f$ is analytic on the unit disk centered at $z_0$, then we are guaranteed a unique analytic function $\Psi$ such that $\Psi(z_0) = 0$ and $\Psi'(z_0) = 1$ (†). $\Psi$, by the way, is known as the Koenigs function of $f$.
To actually compute $\Psi$, we first note that $\Psi(z_0) = 0$ and $\Psi'(z_0) = 1$. With this, we can find the linear approximation of $\Psi$ at $z_0$:

$$\Psi(z) \approx z - z_0.$$
We can improve this approximation by substituting it into Schröder’s equation, which tells us that $\Psi(z) = \Psi(f(z))/s$:

$$\Psi(z) \approx \frac{f(z) - z_0}{s}.$$

We can do this again to get an even better approximation:

$$\Psi(z) \approx \frac{f^{\circ 2}(z) - z_0}{s^2}.$$

In fact, we can keep doing this to obtain better and better approximations of $\Psi$:

$$\Psi(z) \approx \frac{f^{\circ n}(z) - z_0}{s^n}.$$
Taking the limit as $n \to \infty$ makes our approximation exact:

$$\Psi(z) = \lim_{n \to \infty} \frac{f^{\circ n}(z) - z_0}{s^n}.$$

Indeed, we can check that

$$\Psi(f(z)) = \lim_{n \to \infty} \frac{f^{\circ n}(f(z)) - z_0}{s^n} = s \lim_{n \to \infty} \frac{f^{\circ (n+1)}(z) - z_0}{s^{n+1}} = s\,\Psi(z).$$
It should be noted that the reason this method works is because $z_0$ is an attracting fixed point of $f$, so $f^{\circ n}(z) \to z_0$ for $z$ sufficiently close to $z_0$ as $n \to \infty$. Specifically, for $z$ really close to $z_0$, the numerator $f^{\circ n}(z) - z_0$ roughly exponentially decays at the same rate that the denominator $s^n$ decays as $n \to \infty$, so the entire limit converges for $z$ within a certain neighborhood of $z_0$.
Through similar reasoning, one can show that $\Psi^{-1}$ can be expressed by the limit

$$\Psi^{-1}(w) = \lim_{n \to \infty} f^{\circ (-n)}\!\left(z_0 + s^n w\right).$$
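To see these limits in action, here’s a numerical sketch in Python. The map $f(x) = x/2 + x^2/4$ is my own toy example: it has attracting fixed point $z_0 = 0$ with multiplier $s = 1/2$, so truncating the two limits at a large $n$ gives workable approximations of $\Psi$ and $\Psi^{-1}$, and hence a half-iterate of $f$:

```python
import math

def f(x):
    # toy map with attracting fixed point z0 = 0 and multiplier s = f'(0) = 1/2
    return x / 2 + x**2 / 4

def f_inv(y):
    # inverse branch of f near 0, written in a cancellation-free form
    # (solves x/2 + x^2/4 = y, i.e. x = -1 + sqrt(1 + 4y))
    return 4 * y / (1 + math.sqrt(1 + 4 * y))

s, n = 0.5, 40  # multiplier and truncation depth for the limits

def koenigs(z):
    # Psi(z) ~= (f∘n(z) - z0) / s^n, with z0 = 0
    for _ in range(n):
        z = f(z)
    return z / s**n

def koenigs_inv(w):
    # Psi^{-1}(w) ~= f∘(-n)(z0 + s^n w)
    w *= s**n
    for _ in range(n):
        w = f_inv(w)
    return w

def half_iterate(z):
    # f∘(1/2)(z) = Psi^{-1}(s^(1/2) Psi(z))
    return koenigs_inv(math.sqrt(s) * koenigs(z))

# applying the half-iterate twice recovers f, up to truncation error
assert abs(half_iterate(half_iterate(0.3)) - f(0.3)) < 1e-9
```

The final assertion is exactly the defining property of $f^{\circ 1/2}$; replacing $\sqrt{s}$ with $s^t$ gives $f^{\circ t}$ for any real $t$.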
Challenge Exercise (Complex Analysis):
Prove Gabriel Koenigs’ 1884 theorem.
The representation theorists among you might recall that linear transformations (and hence, matrices) can represent more abstract mathematical objects. For example, rotations about the origin of Euclidean space can be represented by rotation matrices, with the multiplication of rotation matrices representing the composition of rotations.
This raises the question: does a representation of analytic functions exist, and if so, would Schröder’s equation translate into matrix diagonalization under such a representation?
Let $f$ be analytic at $0$. By the analyticity of $f$ at $0$, the Maclaurin series

$$f(z) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} z^n = a_0 + a_1 z + a_2 z^2 + \cdots$$

converges to $f(z)$ for $z$ sufficiently close to $0$. We may interpret this sum as a dot product of two infinite-dimensional vectors:

$$f(z) = (a_0, a_1, a_2, \ldots) \cdot (1, z, z^2, \ldots).$$

Alternatively, we may view it as the sole entry of the matrix product of an infinite-dimensional row vector and an infinite-dimensional column vector:

$$f(z) = \begin{bmatrix} a_0 & a_1 & a_2 & \cdots \end{bmatrix} \begin{bmatrix} 1 \\ z \\ z^2 \\ \vdots \end{bmatrix}.$$
It’s tempting to consider the row vector a representation of $f$, the column vector a representation of $z$, and the $1 \times 1$ matrix product a representation of $f(z)$. However, if $z$ is to be represented by an infinite-dimensional column vector, then so should $f(z)$. Specifically, $f(z)$ should be represented by the column vector

$$\begin{bmatrix} 1 \\ f(z) \\ f(z)^2 \\ \vdots \end{bmatrix}$$

and not a $1 \times 1$ matrix, so instead of representing $f$ by an infinite-dimensional row vector, we must instead represent it by the infinite matrix

$$M[f] = \begin{bmatrix} m_{00} & m_{01} & m_{02} & \cdots \\ m_{10} & m_{11} & m_{12} & \cdots \\ m_{20} & m_{21} & m_{22} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix},$$

where $m_{ij}$ is the $j$-th coefficient of the Maclaurin series of $f(z)^i$. Indeed, one can check that

$$M[f] \begin{bmatrix} 1 \\ z \\ z^2 \\ \vdots \end{bmatrix} = \begin{bmatrix} 1 \\ f(z) \\ f(z)^2 \\ \vdots \end{bmatrix},$$

so $M[f]$ is a valid representation of $f$. Furthermore,

$$M[f \circ g] = M[f]\,M[g],$$

so the multiplication of these matrices represents the composition of functions.
We refer to $M[f]$ as the Carleman matrix of $f$, named after Swedish mathematician Torsten Carleman.
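Here’s a small sketch verifying the composition property numerically with truncated Carleman matrices (the truncation size and the example map are my own choices; the truncated identity is exact here because $f(0) = 0$):

```python
import numpy as np

N = 8  # truncation size of the Carleman matrices

def poly_mul(a, b):
    # multiply two power series (coefficient arrays), truncated to N terms
    c = np.zeros(N)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N:
                c[i + j] += ai * bj
    return c

def carleman(coeffs):
    # row i holds the Maclaurin coefficients of f(z)**i
    M = np.zeros((N, N))
    row = np.zeros(N)
    row[0] = 1.0  # f(z)**0 = 1
    for i in range(N):
        M[i] = row
        row = poly_mul(row, coeffs)
    return M

def compose(fc, gc):
    # coefficients of f(g(z)), assuming g(0) = 0
    out = np.zeros(N)
    gp = np.zeros(N)
    gp[0] = 1.0  # g(z)**0
    for k in range(N):
        out += fc[k] * gp
        gp = poly_mul(gp, gc)
    return out

fc = np.zeros(N)
fc[1], fc[2] = 0.5, 0.25  # f(z) = z/2 + z^2/4

# composition of functions becomes multiplication of Carleman matrices
Mf = carleman(fc)
assert np.allclose(Mf @ Mf, carleman(compose(fc, fc)))
```

Truncated Carleman matrices like these are also how fractional iterates are computed in practice when the Koenigs limit isn’t convenient.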
Under this representation, Schröder’s equation translates into

$$M[\Psi]\,M[f] = M[sz]\,M[\Psi],$$

so if $M[\Psi]$ is invertible, then

$$M[f] = M[\Psi]^{-1}\,M[sz]\,M[\Psi].$$

$M[sz]$ can then be readily worked out as

$$M[sz] = \begin{bmatrix} 1 & & & \\ & s & & \\ & & s^2 & \\ & & & \ddots \end{bmatrix},$$

which is a diagonal matrix! The punchline is that Schröder’s equation does, in fact, translate into diagonalization under Carleman’s matrix representation of analytic functions.
Prove that $M[sz]$ is a diagonal matrix for any $s \in \mathbb{C}$.
Addendum: Abel’s Equation and Böttcher’s Equation
Applying $\log_s$ to both sides of Schröder’s equation and replacing $\log_s \circ \Psi$ with $\alpha$ yields the Abel equation, named after Norwegian mathematician Niels Henrik Abel:

$$\alpha(f(z)) = \alpha(z) + 1.$$

Meanwhile, applying $z \mapsto e^z$ to both sides of Schröder’s equation and replacing $e^{\Psi}$ with $\varphi$ gives us Böttcher’s equation, named after Polish mathematician Lucjan Böttcher:

$$\varphi(f(z)) = \varphi(z)^s.$$
When $z_0$ is a superattracting fixed point of $f$, solving Böttcher’s equation tends to be less cumbersome than solving Schröder’s equation, so it’s sometimes useful to convert Schröder’s equation into Böttcher’s equation.
As stated in the introduction, tetration can be seen as repeated exponentiation in the same way that exponentiation can be seen as repeated multiplication and multiplication can be seen as repeated addition. For natural heights, we define

$${}^{0}b = 1, \qquad {}^{n+1}b = b^{\left({}^{n}b\right)}.$$

For $b$ in the interior of the Shell-Thron region, the set of $b$ such that $\lim_{n \to \infty} {}^{n}b$ exists, the function $f(z) = b^z$ has the attracting fixed point

$$z_0 = \frac{W(-\ln b)}{-\ln b},$$

where $W$ is the principal branch of the Lambert W function (the inverse of $z \mapsto z e^z$), so Schröder’s equation can be used to find $f^{\circ t}$ and calculate

$${}^{t}b = f^{\circ t}(1)$$

for $t \notin \mathbb{N}$, and hence, extend ${}^{t}b$ to non-integer heights.
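As a numerical sketch, take $b = \sqrt{2}$, which lies inside the Shell-Thron region with attracting fixed point $z_0 = 2$ (since $\sqrt{2}^{\,2} = 2$) and multiplier $s = f'(2) = \ln 2$. Truncating the Koenigs limits at a finite $n$ lets us approximate ${}^{t}b$ for non-integer $t$:

```python
import math

b = math.sqrt(2)   # a base inside the Shell-Thron region
z0 = 2.0           # attracting fixed point of f(z) = b**z, since b**2 == 2
s = math.log(2)    # multiplier f'(z0) = ln(b) * b**z0 = 2*ln(b) = ln(2)
n = 50             # truncation depth for the Koenigs limits

def f(z):
    return b**z

def f_inv(y):
    return math.log(y) / math.log(b)

def koenigs(z):
    # Psi(z) ~= (f∘n(z) - z0) / s^n
    for _ in range(n):
        z = f(z)
    return (z - z0) / s**n

def koenigs_inv(w):
    # Psi^{-1}(w) ~= f∘(-n)(z0 + s^n w)
    w = z0 + w * s**n
    for _ in range(n):
        w = f_inv(w)
    return w

def tet(t):
    # ᵗb = f∘t(1) = Psi^{-1}(s^t Psi(1))
    return koenigs_inv(s**t * koenigs(1.0))

assert abs(tet(1) - b) < 1e-5              # ¹b = b
assert abs(tet(2) - b**b) < 1e-5           # ²b = b^b
assert abs(tet(1.5) - b**tet(0.5)) < 1e-5  # recursion at a half-integer height
```

The last assertion checks tetration’s defining recursion ${}^{t+1}b = b^{\left({}^{t}b\right)}$ at a non-integer height, which is exactly what makes this a sensible extension.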
We can also calculate the inverse of tetration, known as the super-logarithm, using Schröder’s equation. We denote the base-$b$ super-logarithm by $\operatorname{slog}_b$.
We can also define a function, which I’ll call the product tetrational function and denote with , by
As it turns out, the equality
holds for , so we’re left with a valid extension of to complex inputs.
A similar function, which I’ll call the sum tetrational function and denote with , can be defined by
and it can readily be shown that
so we can extend as such:
where controls the branch cuts of .
Show that $z_0 = \frac{W(-\ln b)}{-\ln b}$ is a fixed point of $z \mapsto b^z$ for $b$ in the interior of the Shell-Thron region.
Show that ${}^{z}b$ has period

$$\frac{2\pi i}{\ln f'(z_0)}$$

for all $b$ in the interior of the Shell-Thron region.
holds for all .
* Here, $z^w = e^{w \operatorname{Log} z}$, where $\operatorname{Log}$ is the principal branch of the natural logarithm. Complex exponentiation isn’t always well-defined since logarithms are multivalued functions, so in most cases, we have to choose a principal value. For example, we choose $i^i$ to be equal to $e^{-\pi/2}$ by convention, but we could also choose it to be equal to $e^{-\pi/2 + 2\pi k}$ for any $k \in \mathbb{Z}$.
† Koenigs’ original statement has the extra condition that $z_0$ be equal to $0$. However, it’s possible to show that the seemingly stronger statement in this post is, in fact, equivalent to the original, which I’ll leave as an exercise for the reader.
Sources and Additional Reading
Special thanks to Grant Sanderson and James Schloss for hosting the annual Summer of Math Exposition.