1 Introduction

The equation \(x^y = y^x\), where x and y are positive, has been studied by many people over many years but, by contrast, the equation \(z^w=w^z\), where z and w are complex numbers, has received almost no attention. The history of the equation \(x^y=y^x\), where x and y are positive, goes back to Daniel Bernoulli, Euler and Goldbach [9, 10]. In 1728 Daniel Bernoulli proved that the only positive integral solutions of \(x^y=y^x\) with \(x\ne y\) are (2, 4) and (4, 2). The rational solutions \((x_n,y_n)\) given by

$$\begin{aligned} x_n = \left( 1 + \frac{1}{n}\right) ^n, \quad y_n = \left( 1+\frac{1}{n} \right) ^{n+1}, \qquad n=1,2,\ldots , \end{aligned}$$
(1)

were known to Euler and Goldbach (see [9, p. 687]), and these may be the origin of the familiar modern exercise that \(x_n \rightarrow \mathrm{e}\) as \(n\rightarrow \infty \). Much later it was shown that these are the only positive rational solutions of the equation \(x^y = y^x\); see [11, 12, 14, 19, 21] for proofs and generalisations of this.
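That the pairs in (1) satisfy \(x^y = y^x\) can be checked exactly. Writing \(L = \log _e(1+1/n)\) we have \(\log _e x_n = nL\) and \(\log _e y_n = (n+1)L\), so \(y_n\log _e x_n = x_n \log _e y_n\) reduces to \(n\,y_n = (n+1)\,x_n\). The following Python sketch tests this with exact rational arithmetic; the function names are ours and are purely illustrative.

```python
from fractions import Fraction

def rational_pair(n):
    """Return (x_n, y_n) from (1) as exact fractions."""
    base = Fraction(n + 1, n)          # 1 + 1/n
    return base ** n, base ** (n + 1)

def satisfies_equation(n):
    """Exact test of x_n^y_n = y_n^x_n.

    With L = log(1 + 1/n) we have log x_n = n L and log y_n = (n + 1) L,
    so y_n log x_n = x_n log y_n reduces to n * y_n == (n + 1) * x_n.
    """
    x, y = rational_pair(n)
    return n * y == (n + 1) * x

for n in range(1, 6):
    x, y = rational_pair(n)
    print(n, x, y, satisfies_equation(n))
```

The case \(n=1\) gives the integer solution (2, 4).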

Fig. 1 The graph of \((\log _e x)/x\)

For positive x and y, we have \(x^y = y^x\) if and only if \(\log _e x/x = \log _e y/y\), where \(\log _e\) is the (real) natural logarithm. It follows from the graph of \(y = (\log _e x)/x\) (see Fig. 1) that if \(x^y=y^x\), \(x>0\), \(y > 0\) and \(x\ne y\) then either \(1< x< \mathrm{e}< y\), or \(1< y< \mathrm{e}< x\). A more careful examination of this graph shows that such solutions (x, y) with \(x\ne y\) correspond to a set of points in the first quadrant that lie on a curve \({\mathscr {C}}\) which is the graph of some function F that is a decreasing homeomorphism of \((1,+\infty )\) onto itself; this curve \({\mathscr {C}}\) is illustrated in Fig. 2. The curve \({\mathscr {C}}\) is symmetric about the line \(y=x\) which it meets at \((\mathrm{e},\mathrm{e})\); further, the only positive, integer solutions of \(x^y=y^x\) with \(x\ne y\) are (2, 4) and (4, 2) because the only integer in the interval \((1,\mathrm{e})\) is 2. In the following we shall ignore the solutions with \(x=y\) except for the point \((\mathrm{e},\mathrm{e})\) which lies on \({\mathscr {C}}\).

Fig. 2 The point \((\mathrm{e},\mathrm{e})\), and the curve \({\mathscr {C}}\) of solutions of \(x^y = y^x\) with \(x \ne y\)

Goldbach showed that the set of all positive solutions (x, y) of \(x^y = y^x\) can be parametrised by the functions x(t) and y(t), where \(t > 0\) and

$$\begin{aligned} x(t) = t^{1/(t-1)} = \exp \left( \frac{\log _e t}{t-1}\right) , \quad y(t) = t^{t/(t-1)} = \exp \left( \frac{t\log _e t}{t-1}\right) , \end{aligned}$$
(2)

so that \(x(t) > \mathrm{e}\) if \(0<t<1\), and \(1< x(t) < \mathrm{e}\) if \(t>1\). The substitution \(t = 1 + 1/s\) in (2) gives an alternative parametric form which provides the rational solutions in (1). If we use this parametric form to provide a computer generated figure, then we obtain the curve \({\mathscr {C}}\) illustrated in Fig. 2. It is clear from this that there is a continuous, strictly decreasing, function F of \((1,+\infty )\) onto itself (whose graph is the curve \({\mathscr {C}}\)) such that for x and y positive, \(x^y=y^x\) if and only if \(y=F(x)\). At this point we simply draw attention to the two different approaches: either the equation \(y=F(x)\), or the parametric form \(\big (x(t),y(t)\big )\).
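As an illustration of the parametric form (2), the following Python sketch evaluates \(\big (x(t),y(t)\big )\) for a few values of t and compares \(y\log _e x\) with \(x\log _e y\). The function name `goldbach` is ours, and the value at \(t=1\) is taken to be the limit \((\mathrm{e},\mathrm{e})\).

```python
import math

def goldbach(t):
    """Return (x(t), y(t)) from (2) for t > 0, with the limit (e, e) at t = 1."""
    if t <= 0:
        raise ValueError("t must be positive")
    if t == 1:
        return math.e, math.e
    u = math.log(t) / (t - 1)
    return math.exp(u), math.exp(t * u)

for t in (0.25, 0.5, 2.0, 3.0):
    x, y = goldbach(t)
    # Compare y*log(x) with x*log(y) rather than the powers themselves.
    print(t, x, y, math.isclose(y * math.log(x), x * math.log(y)))
```

Up to rounding, \(t=2\) gives the point (2, 4) and \(t=1/2\) gives (4, 2), in agreement with the inequalities for x(t) stated above.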

Over a hundred years ago Moulton considered real (but not necessarily positive) solutions of the equation \(x^y = y^x\). He noted that, in general, the expression \(x^y\) is an infinite set of complex numbers, and in [18] he chose to interpret the pair (x, y) of real numbers to be a solution of \(x^y = y^x\) if and only if some (possibly complex) value in \(x^y\) coincides with some (possibly complex) value in \(y^x\); thus, in terms of the sets \(x^y\) and \(y^x\), he required that \(x^y \cap y^x \ne \varnothing \) rather than that the sets \(x^y\) and \(y^x\) be identical. Moulton then considered, in each of the four open quadrants in \({\mathbb {C}}\), all real pairs (x, y) for which \(x^y \cap y^x \ne \varnothing \) but, since then, very little has appeared in the literature about the equation \(z^w=w^z\). For more details, and more references, see, for example [2, 3, 16,17,18, 20].

The Lambert function, which is usually denoted by W, arose out of work by Lambert in 1758, and Euler in 1783. By definition, W is the many-valued inverse of the holomorphic function E, where \(E(z) = z\exp (z)\), and this is defined informally using branch cuts and different sheets or, more formally, by the theory of covering surfaces. In any event, it is easy to see that W is linked to the equation \(x^y = y^x\), for \(x\,\log y = y\,\log x\) is equivalent to \(E(X)=E(Y)\), where \(X = \log (1/x)\) and \(Y = \log (1/y)\). There are a few brief comments in the literature about this link, but usually without any careful analysis of the use of W in this argument.

In this paper we are concerned with a question which has not been considered before, namely the problem of uniformising the equation \(z^w = w^z\). The uniformisation of, for example, the relation \(z^2+w^2=1\) is achieved with the functions \(z(t) = \cos t\) and \(w(t) = \sin t\), where \(t \in {\mathbb {C}}\), and also by the two rational functions \(z(t) = 2t/(1+t^2)\) and \(w(t) = (1-t^2)/(1+t^2)\). Another well known example is the uniformisation of the relation \(w^2 = az^3 + bz + c\), where \(az^3 + bz + c\) has distinct zeros, by the pair \(\big (\wp (z),\wp '(z)\big )\), where \(\wp \) is the classical Weierstrass (doubly periodic) elliptic function; here, \(\wp '(z)^2 = a\wp (z)^3 + b\wp (z)+c\) provided that the periods \(\lambda \) and \(\mu \) of \(\wp \) are suitably chosen. Our aim, then, is to find a simply connected region D in the complex plane \({\mathbb {C}}\), and functions z(t) and w(t) that are single-valued, holomorphic, and non-zero for t in D, and single-valued choices of \(\log z(t)\) and \(\log w(t)\) in D, such that \(z(t) \log w(t) = w(t) \log z(t)\) for all t in D. We shall argue that it is Goldbach’s parametrisation, rather than the use of the Lambert function, that is best suited to uniformise the complex equation \(z^w = w^z\). The complex logarithm and complex exponents are, of course, familiar ideas; nevertheless, as we depend on the exact details of, and the notation for, these ideas we believe that it is appropriate in this paper to begin with a brief, but thorough, discussion of them. We then discuss the ideas of Euler and Lambert, and the complex equation \(z^w = w^z\).

2 The Complex Exponential, Logarithm and Exponents

The function \(z \mapsto \exp z\), defined by \(\exp z = \sum _{n=0}^\infty \frac{z^n}{n\,!}\), is holomorphic and non-zero in the complex plane \({\mathbb {C}}\). For all complex numbers a and b, \(\exp (a+b) = \exp (a)\exp (b)\), and \(\exp z = 1\) if and only if \(z = 2n\pi i\) for some integer n. Thus, in group-theoretic language, \(\exp \) is a homomorphism of the additive group \(({\mathbb {C}},+)\) into the multiplicative group \(({\mathbb {C}}^*,\times )\) of non-zero complex numbers whose kernel \({\mathcal {K}}\) is the cyclic additive group \(2\pi i{\mathbb {Z}}\) generated by \(2\pi i\). It is elementary that \(\exp \) maps \({\mathbb {C}}\) onto \({\mathbb {C}}^*\); thus the quotient group \({\mathbb {C}}/{\mathcal {K}}\) is isomorphic to \({\mathbb {C}}^*\).

As the restriction of \(\exp \) to \({\mathbb {R}}\) is a strictly increasing, continuous map of \({\mathbb {R}}\) onto \((0,+\infty )\), its inverse, which is the real logarithm \(\log _e\), is an isomorphism of the group \(\big ((0,+\infty ),\times \big )\) onto the group \(({\mathbb {R}},+)\). If \(x>0\) then \(\log _e x\) is the pre-image of x under the map \(\exp \). By analogy, the complex logarithm \(\log w\) of a non-zero complex number w is the pre-image of w under the exponential map \(z \mapsto \exp z\); explicitly, \(\log w\) is the set given by

$$\begin{aligned} \log w = \{z \in {\mathbb {C}}:\exp z = w\}. \end{aligned}$$

As \(\exp :{\mathbb {C}} \rightarrow {\mathbb {C}}^*\) is a homomorphism, this means that \(\log w\) is a coset with respect to the subgroup \({\mathcal {K}}\); equivalently, \(\log w\), where \(w\ne 0\), is an element of the quotient group \({\mathbb {C}}/{\mathcal {K}}\). Note that according to this definition, \(\log 0\) is defined, and it is the empty set \(\varnothing \) because, for all z, \(\exp z \ne 0\). Next, let \({\varOmega }= {\mathbb {C}}\backslash (-\infty ,0]\); that is, the complex plane \({\mathbb {C}}\) cut along the negative real axis. Then the principal branch \(\mathrm{Log}\,\) of the complex logarithm is the single-valued function, defined on \({\varOmega }\), whose value at w is the unique member of \(\log w\) that lies in the strip \({\varSigma }\) given by \(\{u+iv:|v| < \pi \}\). Thus \(\mathrm{Log}\,\) is a conformal map of \({\varOmega }\) onto \({\varSigma }\) whose restriction to the interval \((0,+\infty )\) is \(\log _e\).

We now discuss exponents; that is, z ‘raised to the power’ w, which we denote by \(z^w\). If x is positive and y is real we define \(x^y\) to be \(\exp (y\log _e x)\); equivalently, \(x^y = \exp (yt)\), where \(\exp t = x\). By analogy, if z and w are complex numbers with \(z \ne 0\), we define \(z^w\) to be the set given by

$$\begin{aligned} z^w = \{\exp \big (w\zeta \big ) :\exp \zeta = z\}. \end{aligned}$$

As \(\exp \) maps \({\mathbb {C}}\) onto \({\mathbb {C}}^*\), and \(z\ne 0\), there is some \(\zeta _0\) such that \(\exp \zeta _0 = z\), so that

$$\begin{aligned} z^w = \{\exp \big (w\zeta _0\big )\exp (2nw\pi i) :n\in {\mathbb {Z}}\} \end{aligned}$$

(note that this definition does not explicitly involve the logarithm). According to this definition, \(1^w\) is the cyclic subgroup of \({\mathbb {C}}^*\) generated by \(\exp (2\pi i w)\), and \(z^w\) is a coset with respect to this subgroup. We leave the reader to confirm that the standard construction of a quotient group shows that the multiplication of cosets in \({\mathbb {C}}^*/1^w\) is \(({z_1}^w,{z_2}^w) \mapsto (z_1z_2)^w\), and that \(z \mapsto z^w\) is a homomorphism. However, in general it is not true that \((z^v)^w = z^{vw}\). Also, as \(\exp z\) is a complex number, and \(\mathrm{e}^z\) is a subset of \({\mathbb {C}}\), \(\mathrm{e}^z\) is not an alternative notation for \(\exp z\). Although rarely stated in this way, these facts are most efficiently summarized in the language of group theory as follows.

Theorem 1

Let \({\mathbb {C}}\) be the additive group of complex numbers with subgroup \({\mathscr {K}}\), where \({\mathscr {K}} = 2\pi i {\mathbb {Z}}\), and let \({\mathbb {C}}^*\) be the multiplicative group of non-zero complex numbers with subgroup \(1^w\), where \(w\in {\mathbb {C}}\) and \(1^w = \{\exp (2\pi ni w):n\in {\mathbb {Z}}\}\). Suppose that \(w\ne 0\); then

  (1) the map \(\exp \) is a surjective homomorphism of \({\mathbb {C}}\) onto \({\mathbb {C}}^*\) with kernel \({\mathscr {K}}\);

  (2) \(\log w\) is an element of the quotient group \({\mathbb {C}}/{\mathscr {K}}\);

  (3) if \(z \ne 0\), \(z^w\) is an element of the quotient group \({\mathbb {C}}^*/1^w\).
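To make the set-valued definition of \(z^w\) concrete, the following Python sketch lists the elements \(\exp (w\zeta _0)\exp (2nw\pi i)\) for a finite range of n. The function name `power_values` and the default range of n are our own choices; we take \(\zeta _0\) to be the principal value returned by `cmath.log`, although any solution of \(\exp \zeta _0 = z\) would serve.

```python
import cmath

def power_values(z, w, n_values=range(-2, 3)):
    """Return the elements exp(w*zeta0) * exp(2*pi*i*n*w) of z^w for the given n."""
    if z == 0:
        raise ValueError("z must be non-zero")
    zeta0 = cmath.log(z)               # one solution of exp(zeta) = z
    return [cmath.exp(w * (zeta0 + 2j * cmath.pi * n)) for n in n_values]

# 4^(1/2) contains both square roots of 4.
print(power_values(4, 0.5))
# i^i is an infinite set of positive real numbers exp(-(pi/2 + 2*pi*n)).
print(power_values(1j, 1j))
```

The second example shows why \(z^w\) must be treated as a set: every element of \(i^i\) is a positive real number, and no two of the listed values coincide.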

3 The Parametrisation of the Equation \(x^y = y^x\)

In this section we connect Goldbach’s parametrisation of the positive solutions of \(x^y = y^x\) to the function F defined on \((1,+\infty )\) by \(y = F(x)\) if and only if \(x^y = y^x\). Now for us, the parametrisation of the solutions of \(x^y = y^x\) is best written in the form

$$\begin{aligned} x(t) = \exp U(t), \quad y(t) = \exp V(t), \end{aligned}$$

where

$$\begin{aligned} U(t) = \frac{\log t}{t-1}, \quad V(t) = \frac{t\log t}{t-1} = tU(t), \quad t > 0, \end{aligned}$$

where \(U(1)=V(1)=1\). The functions U and V will play a prominent role in this discussion so the following lemma is useful.

Lemma 1

The function U is a strictly decreasing homeomorphism of \((0,+\infty )\) onto itself. The function V is a strictly increasing homeomorphism of \((0,+\infty )\) onto itself, and \(V(t) = U(1/t)\).

Proof

As \(U(t) = \big (\log t - \log 1\big )/(t-1)\), we let \(U(1)=1\), so that the function U is continuous at 1. With this, it is clear that U is a continuous map of \((0,+\infty )\) into itself. Next, for \(t>0\) we have

$$ U'(t) = \frac{\sigma (t)}{(t-1)^2}, \quad \sigma (t) = 1-\log _e t - \frac{1}{t}. $$

As \(\sigma \) is strictly increasing on (0, 1), and strictly decreasing on \((1,+\infty )\), with \(\sigma (1)=0\), we see that \(U'(t) \le 0\) with a strict inequality when \(t \ne 1\). As \(U(t) \rightarrow +\infty \) as \(t \rightarrow 0\), and \(U(t) \rightarrow 0\) as \(t\rightarrow +\infty \), this shows that U is a decreasing homeomorphism of \((0,+\infty )\) onto itself. Finally, as \(V(t) = U(1/t)\), the properties of V follow from those of U. \(\square \)

Note that \(U(1/t) = V(t)\) implies that \(x(1/t) = y(t)\) and \(y(1/t) = x(t)\), so the involution \(t \mapsto 1/t\) of \((0,+\infty )\) corresponds to the symmetry of \({\mathscr {C}}\) about the line \(y=x\).

We now recall that \(x(t) = \exp U(t)\). Then \(y(t) = \exp V(t) = \exp U(1/t) = x(1/t)\). By Lemma 1, \(t \mapsto x(t)\) is a homeomorphism of \((0,+\infty )\) onto \((1,+\infty )\) so we now see that

$$\begin{aligned} y(t) = x(1/t) = x\left( \frac{1}{x^{-1}\big (x(t)\big )} \right) . \end{aligned}$$

If we now let \({\varPhi }(t) = \exp U(t)\), and \(\psi (t)=1/t\), and rewrite this in terms of x rather than t, we see that \(y=F(x)={\varPhi }\psi {\varPhi }^{-1}(x)\); thus the curve \({\mathscr {C}}\) is (in some sense) ‘conjugate’ to the hyperbola given by \(xy=1\).
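The conjugation \(F={\varPhi }\psi {\varPhi }^{-1}\) gives a direct way to evaluate F numerically. The following Python sketch inverts \({\varPhi }\) by bisection, using the fact from Lemma 1 that U is strictly decreasing; the bracket for \(\log t\) and the iteration count are arbitrary choices of ours, and the sketch is only intended for values of x that are not extremely close to 1 or extremely large.

```python
import math

def U(t):
    """U(t) = log(t)/(t - 1) for t > 0, with U(1) = 1."""
    return 1.0 if t == 1 else math.log(t) / (t - 1)

def F(x, iterations=200):
    """Approximate F(x) = Phi(psi(Phi^{-1}(x))) for x > 1 by bisection."""
    if x <= 1:
        raise ValueError("x must exceed 1")
    target = math.log(x)               # solve U(t) = log x for t
    lo, hi = -700.0, 700.0             # bracket for s = log t
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if U(math.exp(mid)) > target:  # U is decreasing, so t is too small
            lo = mid
        else:
            hi = mid
    t = math.exp(0.5 * (lo + hi))      # t = Phi^{-1}(x)
    return math.exp(U(1.0 / t))        # Phi(psi(t)) = x(1/t) = y(t)

for x in (1.5, 2.0, math.e, 4.0):
    print(x, F(x))
```

Since \(\psi \) is an involution, so is F; numerically, applying `F` twice returns the starting value up to rounding.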

4 The Uniformisation of \(z^w=w^z\)

In some sense the uniformisation of \(z^w = w^z\) has already been achieved by Goldbach (although, of course, he did not consider it as such, nor did he consider complex variables). We recall that \({\varOmega }= {\mathbb {C}}\backslash (-\infty ,0]\), and \(\mathrm{Log}\,z\) is the principal value of the complex logarithm on \({\varOmega }\). As always, our interest centres on the different branches of the logarithm. Suppose that f is a function that is continuous and never zero in a region D in the complex plane. Then we say that a function g is a continuous branch of \(\log f\) in D if and only if g is continuous and satisfies \(\exp g = f\) throughout D. The existence of a continuous branch of \(\log f\) in D is given by the following well known basic result: if D is a simply connected subregion of \({\mathbb {C}}\), and f is continuous and non-zero in D, then there exists a continuous branch of \(\log f\) in D; see [1, p. 33] and [13, p. 8].

We shall now focus our attention on the functions

$$\begin{aligned} U(z) = \frac{\mathrm{Log}\,z}{z-1}, \quad V(z) = \frac{z\;\mathrm{Log}\,z}{z-1}, \quad z\in {\varOmega }, \end{aligned}$$

that are meromorphic on \({\varOmega }\), and we shall now show that

$$\begin{aligned} V(z)\exp U(z) = U(z)\exp V(z), \quad z \in {\varOmega }. \end{aligned}$$
(3)

As \(\mathrm{Log}\,z\) is single-valued in \({\varOmega }\) with a simple zero at \(z=1\), the functions U and V have a removable singularity at 1 (with \(U(1) = V(1) = 1\)), so U and V are both single-valued and holomorphic in \({\varOmega }\). Moreover, if \(t > 0\) then \(\mathrm{Log}\,t = \log _e t\), so that \(x(t) = \exp U(t)\) and \(y(t) = \exp V(t)\). As Euler showed that \(y(t)\log _e x(t) = x(t)\log _e y(t)\), we deduce that, for all positive t,

$$\begin{aligned} V(t)\exp U(t) = U(t)\exp V(t). \end{aligned}$$

This shows that (3) holds when \(z \in (0,+\infty )\), and as both sides of the equation are holomorphic in \({\varOmega }\), and the two sides are equal to each other on \((0,+\infty )\), we now see that (3) holds throughout \({\varOmega }\). This can also be proved by direct computation.
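The direct computation can be sketched as follows. As \(V(z) = zU(z)\) and \((z-1)U(z) = \mathrm{Log}\,z\) for every z in \({\varOmega }\) (this is trivially true when \(z=1\)), we have

$$\begin{aligned} \exp V(z) = \exp \big (U(z) + (z-1)U(z)\big ) = \exp U(z)\,\exp (\mathrm{Log}\,z) = z\exp U(z), \end{aligned}$$

so that \(U(z)\exp V(z) = zU(z)\exp U(z) = V(z)\exp U(z)\), which is (3).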

Now note that U and V are never zero in \({\varOmega }\). As U and V are holomorphic in \({\varOmega }\) we can define holomorphic functions X and Y on \({\varOmega }\) by

$$\begin{aligned} X(\zeta ) = \exp U(\zeta ), \quad Y(\zeta ) = \exp V(\zeta ), \qquad \zeta \in {\varOmega }, \end{aligned}$$

and these functions are also holomorphic, and never zero, in \({\varOmega }\). Now, by construction, U is a continuous branch of the logarithm of X in \({\varOmega }\), so we may write, say, \(U = \log _1 X\) there. Similarly, V is a continuous branch of the logarithm of Y, say \(\log _2 Y\), in \({\varOmega }\). It is now clear that

$$\begin{aligned} Y(\zeta )\log _1 X(\zeta ) = X(\zeta ) \log _2 Y(\zeta ) \end{aligned}$$
(4)

throughout \({\varOmega }\). Since (4) implies that

$$\begin{aligned} \exp \big [Y(\zeta )\log _1 X(\zeta ) \big ] = \exp \big [X(\zeta ) \log _2 Y(\zeta )\big ] \end{aligned}$$
(5)

throughout \({\varOmega }\), we have now proved the following result which provides the uniformisation of \(X^Y = Y^X\) that we are seeking.

Theorem 2

There are single-valued functions X and Y that are holomorphic and never zero in the simply connected region \({\varOmega }\), and single-valued branches \(\log _1 X\) and \(\log _2 Y\) of the logarithms of X and Y, respectively, such that (5) holds throughout \({\varOmega }\).
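Theorem 2 can be tested numerically at sample points of \({\varOmega }\). In the following Python sketch, which is an illustration rather than part of the proof, `cmath.log` plays the role of \(\mathrm{Log}\,\), and the function `residual` (a name of our own) measures the difference between the two sides of (4), with \(\log _1 X = U\) and \(\log _2 Y = V\).

```python
import cmath

def U(z):
    """U(z) = Log(z)/(z - 1) on the cut plane, with U(1) = 1."""
    return 1.0 + 0j if z == 1 else cmath.log(z) / (z - 1)

def V(z):
    """V(z) = z U(z)."""
    return z * U(z)

def residual(z):
    """|Y log_1 X - X log_2 Y| with X = exp U, Y = exp V, log_1 X = U, log_2 Y = V."""
    X, Y = cmath.exp(U(z)), cmath.exp(V(z))
    return abs(Y * U(z) - X * V(z))

for z in (2 + 0j, 1j, -1 + 2j, 0.3 - 0.7j, 5 + 5j):
    print(z, residual(z))
```

At each of these points the residual should be of the order of the rounding error.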

The basic ideas of analytic continuation, and an informal understanding of Riemann surfaces, are enough to show that if \(\mathrm{Log}\,\) is analytically continued over the set \({\mathbb {C}}^*\) of non-zero complex numbers then it is defined and single-valued on a Riemann surface which, informally, is an infinite ‘spiral-staircase’ which is constructed by taking an infinite number of copies of \({\varOmega }\) and identifying the edges in the usual way. As the relation (5) is preserved under analytic continuation, it will continue to hold on any region beyond \({\varOmega }\) (and lying on a suitable Riemann surface) that is obtained by a chain of analytic continuations starting with the functions U and V on \({\varOmega }\). It is clear from the forms of U and V that these functions can be continued analytically over the Riemann surface of \(\mathrm{Log}\,z\) except that for branches of the logarithm other than the principal branch, the functions U and V will have a pole at the points lying over \(z=1\); for example, for one such branch the continuation of U will give rise to the function \(z \mapsto (\mathrm{Log}\,z + 2\pi i)/(z-1)\) near the point \(z=1\), and the function \(\exp U\) will have an essential singularity at these points.

5 The Lambert Function and the Real Equation \(x^y=y^x\)

In this section we return to the positive solutions of \(x^y = y^x\) and show how they are related to two of the branches (which are customarily labelled \(W_0\) and \(W_{-1}\)) of the restriction of the Lambert function to \({\mathbb {R}}\). The Lambert function is the many-valued holomorphic inverse of the function \(E:z\mapsto z\exp z\), so we must consider the inverse of the restriction \(E_0\) of this function to \({\mathbb {R}}\). The function \(E_0\) defined by \(E_0(x)= x\exp x\) maps \({\mathbb {R}}\) onto \([-1/\mathrm{e},+\infty )\), and the graph of \(E_0\) is illustrated in Fig. 3.

Fig. 3 The graph of the function \(E_0(x) = x \exp x\)

As x increases from \(-\infty \) to \(+\infty \), so \(E_0(x)\) decreases strictly from 0 to \(-1/\mathrm{e}\) on \((-\infty ,-1]\), and then increases strictly from \(-1/\mathrm{e}\) to \(+\infty \) on \([-1,+\infty )\). It follows that there is

  • a branch \(W_0\) of the Lambert function W which is a strictly increasing map of \((-1/\mathrm{e},+\infty )\) onto \((-1,+\infty )\), and

  • a branch \(W_{-1}\) which is a strictly decreasing map of \((-1/\mathrm{e},0)\) onto \((-\infty ,-1)\).

This elementary discussion leads immediately to the next result which is often stated in the literature and, for the convenience of the reader, we include a formal proof.

Theorem 3

Let F be the function defined by \(y = F(x)\), where \(x^y=y^x\), \(x>0\) and \(y>0\), and, for \(x>1\), let \(\beta (x) = -(\log _e x)/x\). Then

$$\begin{aligned} y = F(x) = {\left\{ \begin{array}{ll} \dfrac{W_{-1}\big (\beta (x)\big )}{\beta (x)} &{}\quad \text {if } \, 1 < x\le \mathrm{e}; \\ \dfrac{W_{0}\big (\beta (x)\big )}{\beta (x)} &{}\quad \text {if } \, x \ge \mathrm{e}. \end{array}\right. } \end{aligned}$$
(6)

Proof

Suppose that \(x>0\), \(y>0\) and \(x^y=y^x\). Then \(y\log _e x = x\log _e y\) so that \(\beta (x)=\beta (y)\), and \(E(-\log _e x) = \beta (x) = \beta (y) = E(-\log _e y)\). Now suppose that \(x>\mathrm{e}\); then \(1< y < \mathrm{e}\). If we let \(a = -\log _e x\) and \(b = -\log _e y\), then \(a< -1< b < 0\), and

$$\begin{aligned} b = y \beta (x), \quad E(a)=\beta (x), \quad E(b)=\beta (y). \end{aligned}$$

As \(b \in (-1,0)\) we have \(W_0\big (E(b)\big )=b\). Thus, finally, we have

$$\begin{aligned} W_0\big (\beta (x)\big ) = W_0\big (\beta (y)\big ) = W_0\big (E(b)\big ) = b = y\beta (x), \end{aligned}$$

which is the second formula in (6). If \(1< x <\mathrm{e}\), the first formula follows in a similar way noting that, in this case,

$$\begin{aligned} W_{-1}\big (\beta (x)\big ) = W_{-1}\big (\beta (y)\big ) = W_{-1}\big (E(b)\big ) = b = y\beta (x), \end{aligned}$$

and we leave the details to the reader. Theorem 3 clarifies Lòczi’s remark (Remark 4 in [17, p. 222]) that (in our notation) the function F is not given by \(W_0\big (\beta (x)\big )/\beta (x)\) throughout the curve \({\mathscr {C}}\). \(\square \)
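The formula (6) is easily tested numerically. The following Python sketch computes the two real branches by bisection on \(E_0\), which is a simple stand-in of ours for a library implementation of the Lambert function, and then evaluates F from (6); the names `lambert_real` and `F`, the bracket \([-700,-1]\) and the iteration count are all our own choices.

```python
import math

def E0(x):
    """The real function E_0(x) = x exp x."""
    return x * math.exp(x)

def lambert_real(v, branch, iterations=200):
    """Solve E0(w) = v for -1/e <= v < 0 by bisection.

    branch 0 searches [-1, 0), where E0 increases;
    branch -1 searches [-700, -1], where E0 decreases.
    """
    if not -1 / math.e <= v < 0:
        raise ValueError("v must lie in [-1/e, 0)")
    if branch == 0:
        lo, hi, increasing = -1.0, 0.0, True
    elif branch == -1:
        lo, hi, increasing = -700.0, -1.0, False
    else:
        raise ValueError("branch must be 0 or -1")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if (E0(mid) < v) == increasing:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def F(x):
    """F(x) from (6) for x > 1."""
    if x <= 1:
        raise ValueError("x must exceed 1")
    beta = max(-math.log(x) / x, -1 / math.e)  # clamp against rounding at x = e
    branch = -1 if x <= math.e else 0
    return lambert_real(beta, branch) / beta

for x in (1.5, 2.0, 3.0, 4.0):
    print(x, F(x))
```

For example, \(x=2\) uses the branch \(W_{-1}\) and returns a value close to 4, while \(x=4\) uses \(W_0\) and returns a value close to 2.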

6 The Lambert function: An Exposition

In this section we give a brief exposition of the Lambert function and, although we shall not consider applications, it is perhaps worth mentioning that the Lambert function has many applications to physical problems (for example, to combinatorics, iterated exponentiation, fuel and combustion problems, enzyme kinetics, delay differential equations, atomic physics, population growth, epidemics, and many more); see [7] for more details. Our specific aim here is to give a short exposition on the branches \(W_0\) and \(W_{-1}\) of the Lambert function that appeared in Theorem 3.

The Lambert function W is the inverse of the entire function \(E:z \mapsto z\exp z\), so we begin by describing the properties of the function E; then we consider the analytic continuation of the different branches of its inverse. First, \(E({\overline{z}}) = \overline{E(z)}\); thus the action of E, and therefore of any branch of its inverse, is symmetric with respect to the real axis. Next, \(E'(z) = (z+1)\exp z\), so that \(E'(z)=0\) if and only if \(z=-1\); thus, provided that \(z_0 \ne -1\), the map E provides a conformal bijection of some open neighbourhood \({\mathscr {N}}_0\) of \(z_0\) onto some open neighbourhood \({\mathscr {N}}_1\) of \(E(z_0)\), and this has a holomorphic inverse of \({\mathscr {N}}_1\) onto \({\mathscr {N}}_0\). Finally, as E is an entire function (that is, it is holomorphic throughout \({\mathbb {C}}\)) with an essential singularity at \(\infty \), Picard’s great theorem (see, for example [5, p. 203] and [13, p. 45]) implies that, with at most two exceptional values of a in \({\mathbb {C}}_\infty \), the equation \(E(z) = a\) has infinitely many solutions in \({\mathbb {C}}\). Now \(E \ne \infty \) in \({\mathbb {C}}\), and \(E(z)=0\) if and only if \(z=0\) so that the two exceptional values for E are 0 and \(\infty \). Thus, for every non-zero complex number w, \(E(z) = w\) has infinitely many solutions in \({\mathbb {C}}\). In particular, this shows that E maps \({\mathbb {C}}\) onto itself, and as each non-zero w is ‘covered’ infinitely often, so the Lambert function W has infinitely many branches at every non-zero point of \({\mathbb {C}}\), and exactly one branch at 0. Thus, formally, the Lambert function W is a single-valued, holomorphic, branched covering map of some Riemann surface onto \({\mathbb {C}}\). For a discussion of the branches of the Lambert function, see [7, 8, 15].

A formal discussion of the Lambert function requires the notion of analytic continuation, and we shall assume that the reader is familiar with the ideas of the analytic continuation of a function element along a curve. Briefly, a function element is a pair (f, D), where D is a region (that is, an open and connected set) in \({\mathbb {C}}\), and f is holomorphic in D. The function elements \((f_1,D_1)\) and \((f_2,D_2)\) are said to be analytic continuations of each other, and we write \((f_1,D_1) \sim (f_2,D_2)\), if \(D_1 \cap D_2 \ne \emptyset \), and \(f_1 = f_2\) on \(D_1 \cap D_2\). Also, we shall write \((f_1,D_1) \approx (f_2,D_2)\) if \((f_1,D_1)\) and \((f_2,D_2)\) are the two ends of a finite sequence of function elements in which each pair of consecutive terms are analytic continuations of each other. Clearly \(\approx \) is an equivalence relation on the class \({\mathcal {F}}\) of all function elements, and each equivalence class is called a complete analytic function. Of course, the main theorem in analytic continuation is the monodromy theorem: Suppose that D is a simply connected region in \({\mathbb {C}}\). If a function element \((f,{\varDelta })\), where \({\varDelta }\subset D\), can be continued analytically along every curve in D, then f can be extended to a unique, single-valued, holomorphic function throughout D, and the map \(f:D \rightarrow f(D)\) is a holomorphic bijection with a holomorphic inverse \(f^{-1}:f(D) \rightarrow D\). For more details see, for example [1, Ch. 1] and [6, Ch. IX].

We shall now discuss the principal branch \(W_0\) of the Lambert function. As \(E(0)=0\) and \(E'(0)=1\), we see that E has a holomorphic inverse \(W_0\) that is defined in some neighbourhood of 0 and, moreover, this function \(W_0\) must agree at real points near 0 with the real function \(W_0\) that was defined earlier. Standard analytic arguments about power series (see [4, pp. 229–232] and [7, 8]) show that the Taylor expansion of \(W_0\) at the origin is given by

$$\begin{aligned} W_0(z) = \sum _{n=1}^\infty \frac{(-n)^{n-1}}{n\,!}z^n, \end{aligned}$$
(7)

and the ratio test shows that this series has radius of convergence \(1/\mathrm{e}\). A standard test for singular points on the circle of convergence (see [22]) shows that \(-1/\mathrm{e}\) is a singular point of the series in (7); thus \(W_0\), which is defined on the open disc \(\{|z| < 1/\mathrm{e}\}\), cannot be continued analytically to any neighbourhood of \(-1/\mathrm{e}\).
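The series (7) can be checked numerically inside its disc of convergence. The following Python sketch sums the terms with \(n \ge 1\) (the coefficient \((-n)^{n-1}/n\,!\) is not defined when \(n=0\), and \(W_0(0)=0\)) and then verifies that the partial sum w satisfies \(w\exp w \approx z\). The function name `w0_series` and the number of terms are our own choices, and the sketch is only meaningful when \(|z| < 1/\mathrm{e}\).

```python
import cmath
import math
from fractions import Fraction

def w0_series(z, terms=80):
    """Partial sum of the series (7), summed from n = 1, for |z| < 1/e."""
    total = 0j
    for n in range(1, terms + 1):
        # Exact rational coefficient (-n)^(n-1)/n!, converted to a float.
        coeff = Fraction((-n) ** (n - 1), math.factorial(n))
        total += float(coeff) * z ** n
    return total

for z in (0.2, -0.2, 0.1j, 0.15 + 0.15j):
    w = w0_series(z)
    print(z, w, abs(w * cmath.exp(w) - z))
```

The first few coefficients are \(1, -1, 3/2, -8/3\), and the terms decay roughly like \((\mathrm{e}|z|)^n\), so the convergence slows markedly as \(|z|\) approaches \(1/\mathrm{e}\).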

A more detailed analysis (see [7]) shows that \(W_0\) can be continued analytically beyond the open disc \(\{|z| < 1/\mathrm{e}\}\) to give a conformal map of the cut plane \({\mathbb {C}}\backslash (-\infty ,-1/\mathrm{e}]\), which we denote by \({\varOmega }_1\), onto the U-shaped region \({\mathscr {U}}\) which is illustrated in Fig. 4. Thus E is a conformal map of \({\mathscr {U}}\) onto \({\varOmega }_1\), and its inverse is the principal branch \(W_0\) of the Lambert function which is a conformal map of \({\varOmega }_1\) onto \({\mathscr {U}}\). In fact, we have the following more precise result.

Theorem 4

Let \({\mathscr {U}}\) be the subregion of \({\mathbb {C}}\) that is bounded by the curve given by \(x\sin y + y \cos y = 0\), and that contains 0. Then E is a conformal map of \({\mathscr {U}}\) onto \({\varOmega }_1\), and \(W_0\) is the inverse of this conformal map. Further, the image of the imaginary axis under \(W_0\) is the curve given by \(x = y\tan y\).

Fig. 4 The principal branch \(W_0:{\varOmega }_1 \rightarrow {\mathscr {U}}\) of the Lambert function

We shall not give a proof of Theorem 4; instead, we note that the proof uses the fact that if x and y are real, and \(E(x+iy) = u+iv\) then, by direct computation,

$$\begin{aligned} u = \big (x\cos y - y \sin y\big )\exp x, \quad v = \big (x\sin y + y \cos y\big )\exp x. \end{aligned}$$

The proof then proceeds by a long and detailed analysis of various curves which are defined in terms of the standard trigonometric functions. For example, to begin, it is clear that E maps the curve given by \(x\sin y + y \cos y = 0\) into the real axis.
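These formulae, and the claim about the curve \(x\sin y + y \cos y = 0\), can be checked numerically. In the following Python sketch (the function names are ours) we parametrise the part of the curve with \(0< |y| < \pi \) by \(x = -y\cot y\), and confirm that E maps such points to real numbers that do not exceed \(-1/\mathrm{e}\); this is consistent with, but of course does not prove, the statement that \(\partial {\mathscr {U}}\) corresponds to the cut \((-\infty ,-1/\mathrm{e}]\).

```python
import cmath
import math

def E(z):
    """E(z) = z exp z."""
    return z * cmath.exp(z)

def uv(x, y):
    """Real and imaginary parts of E(x + iy) from the formulae above."""
    u = (x * math.cos(y) - y * math.sin(y)) * math.exp(x)
    v = (x * math.sin(y) + y * math.cos(y)) * math.exp(x)
    return u, v

def boundary_point(y):
    """The point x + iy on x sin y + y cos y = 0 with 0 < |y| < pi."""
    return complex(-y / math.tan(y), y)

for y in (0.5, 1.0, 2.0, -2.5):
    w = E(boundary_point(y))
    print(y, w.real, w.imag)
```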

7 Concluding Remarks

In order to take this discussion further, we would need to have a careful discussion of the branch \(W_{-1}\), or even (perhaps) an alternative definition of \(W_{-1}\) to that given in [7], in which \(W_{-1}\) (in the alternative definition) is defined on a region that is symmetric with respect to the real axis. Once this has been done we could then consider the roles of \(W_0\) and \(W_{-1}\) in the uniformisation of the equation \(z^w=w^z\). However, it does seem that Goldbach’s parametrisation is the best way to uniformise the equation \(z^w=w^z\), whereas the Lambert function is more suited to writing the solutions in the form \(y=F(x)\).