5.1 Dynamical Systems

We start by clarifying notation and terminology. Let \(\textsf{M}\) be a topological manifold, \((\mathcal {T},\cdot )\) a commutative topological group with \(e\in \mathcal {T}\) denoting its identity element, i.e., \(s\cdot e = e\cdot s= s\) \(\forall s\in \mathcal {T}\) and let \(\varphi :\mathcal {T}\times \textsf{M}\rightarrow \textsf{M}\) be a continuous map. The triple \((\textsf{M},\mathcal {T},\varphi )\) is a dynamical system when the following axioms are satisfied

  1. (i)

    the identity property: \(\varphi (e,p)=p\) for all \(p\in \textsf{M}\); and

  2. (ii)

    the group property: \(\varphi (s\cdot t,p)=\varphi (s,\varphi (t,p))\) for all \(s,t\in \mathcal {T}\) and \(p\in \textsf{M}\).

When \(\mathcal {T}=\mathbb {R}\) the map \(\varphi \) is said to be a (global) flow. In that case, the axioms are conveniently written as \(\varphi ^0=\textrm{id}_{\textsf{M}}\) and \(\varphi ^s\circ \varphi ^t=\varphi ^{s+t}\) \(\forall s,t\in \mathbb {R}\), with \(\varphi ^t\) denoting the homeomorphism \(\varphi (t,\cdot ):\textsf{M}\rightarrow \textsf{M}\). Indeed, see that a flow naturally induces the homotopy equivalence \(\varphi ^t(U){\simeq }_h\varphi ^s(U)\) for any \(s,t\in \mathbb {R}\) and open set \(U\subseteq \textsf{M}\). See also that \((\textsf{M},\mathcal {T},\varphi )\) is time-invariant in the sense that \(\varphi (s,p)\) only depends on the current point \(p\in \textsf{M}\) and the “time to propagate”, i.e., \(\varphi (s, \varphi (s^{-1},\varphi (s,p)))=\varphi (s,p)\).

Let \(\textsf{M}\) be a smooth manifold and let \(X\in \mathfrak {X}^{r}(\textsf{M})\), with \(r\in \mathbb {N}\cup \{\infty \}\), denote a \(C^r\)-smooth vector field over \(\textsf{M}\), that is, \(X:\textsf{M}\rightarrow T\textsf{M}\) is a \(C^r\)-smooth map and \(\pi _p\circ X=\textrm{id}_{\textsf{M}}\) for \(\pi _p\) being the canonical projection \(\pi _p:T\textsf{M}\rightarrow \textsf{M}\) defined by \(\pi _p:(p,v)\mapsto p\), sometimes simply written as \(\pi \). Equivalently, \(C^r\) vector fields X on \(\textsf{M}\) can be understood as \(C^r\) sections of the tangent bundle, denoted \(X\in \Gamma ^r(T\textsf{M})\). The evaluation of \(X\in \mathfrak {X}^{r}(\textsf{M})\) at \(p\in \textsf{M}\) is a tangent vector \(X(p)=X_p\in T_p\textsf{M}\).

Then, a differentiable curve \(\xi :\mathcal {I}\subseteq \mathbb {R}\rightarrow \textsf{M}\), for some appropriate interval \(\mathcal {I}\), is called an integral curve of the vector field X when \(\dot{\xi }(t)=X(\xi (t))\) for all \(t\in \mathcal {I}\). A manifestation of a flow that will be of interest is as a map parametrizing all integral curves of a vector field. Given a vector field \(X\in \mathfrak {X}^r(\textsf{M})\), then via the relation

$$\begin{aligned} \left. \frac{\textrm{d}}{\textrm{d}t}\varphi ^t ( p )\right| _{t=t'} = X(\varphi ^{t'}(p)), \end{aligned}$$
(5.1)

we can define a \(C^r\) local flow \(\varphi :\textrm{dom}(\varphi )\rightarrow \textsf{M}\), where the regularity is commonly with respect to the second argument of the flow. See that the map \(t\mapsto \varphi ^t(p)\) is always at least \(C^{1}\) when the flow is generated by a continuous vector field. A flow is not necessarily well-defined for all \(t\in \mathbb {R}\) or even all \(t\ge 0\),Footnote 1 e.g., consider \(\dot{x}=x^2\) with \(x(0)=1\). When X does give rise to a globally well-defined flow, the vector field is said to be complete. Regarding the case study from Sect. 1.3, every left-invariant vector field on a Lie group is complete [43, Theorem 9.18]. Similarly, any smooth vector field over a compact manifold is complete [43, Theorem 9.16]. When a flow originates from a vector field X, we denote this by \(\varphi _X\).
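To make the finite escape time of \(\dot{x}=x^2\), \(x(0)=1\), concrete, the following minimal numerical sketch (assuming SciPy is available) integrates the equation and compares it with the exact solution \(x(t)=1/(1-t)\), which blows up at \(t=1\); hence the associated flow is not complete.

```python
# Minimal numerical sketch (assumes SciPy): finite escape time of x' = x^2, x(0) = 1.
from scipy.integrate import solve_ivp

sol = solve_ivp(lambda t, x: x**2, t_span=(0.0, 0.999), y0=[1.0],
                rtol=1e-10, atol=1e-12, dense_output=True)

for t in (0.5, 0.9, 0.99, 0.999):
    # the exact solution x(t) = 1/(1 - t) blows up as t -> 1
    print(f"x({t}) ~ {sol.sol(t)[0]:.3e}   (exact {1.0 / (1.0 - t):.3e})")
```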

We will follow the terminology from Chap. 3; given a smooth map G, we speak of the differential of G or the pushforward under G, denoted DG and \(G_{*}\), respectively. Now we formalize what this means in case G acts on a vector field or is a vector field itself. Following [43], given a smooth map \(G:\textsf{M}\rightarrow \textsf{N}\) the differential of G at \(p\in \textsf{M}\), that is, \(DG_p:T_p\textsf{M}\rightarrow T_{G(p)}\textsf{N}\), is defined pointwise by \(DG_p(X_p)g=X_p(g\circ G)\) for any smooth function \(g:\textsf{N}\rightarrow \mathbb {R}\) and any \((p,X_p)\in T\textsf{M}\). Here, the action of a tangent vector \(X_p\) on a \(C^1\) function g should be understood, in coordinates, as the directional derivative of g in the direction of \(X_p\). To turn this pointwise pushforward, that is, all the vectors \(DG_p(X_p)\), into a vector field on \(\textsf{N}\), we must be able to supply any point \(q\in \textsf{N}\) and get an element in \(T_q\textsf{N}\), that is, we need p such that \(G(p)=q\). Hence, we assume G to be a \(C^{r+1}\) diffeomorphism and compose with \(G^{-1}\); this defines the pushforward \(G_{*}X\) of a vector field \(X\in \mathfrak {X}^r(\textsf{M})\) under G as the vector field \(G_{*}X\in \mathfrak {X}^r(\textsf{N})\) satisfying \((DG(X)g)\circ G^{-1}=G_{*}Xg\) for any smooth \(g:\textsf{N}\rightarrow \mathbb {R}\), cf. Sect. 3.5, also understood via the following diagram

figure a

Moreover, we make frequent use of the differential of \(X\in \mathfrak {X}^{r\ge 1}(\textsf{M})\) at \(p\in \textsf{M}\) such that \(X(p)=0\), written as \(DX_p:T_p\textsf{M}\rightarrow T_{X(p)}T\textsf{M}\). With some abuse of notation this map is, however, commonly written as \(DX_p:T_p\textsf{M}\rightarrow T_p\textsf{M}\) such that, in coordinates, one can consider the eigenvalues of \(DX_p\). This simplification hinges on the fact that for \(p\in \textsf{M}\), \(X:p\mapsto (p,v)\) with \(v\in T_p\textsf{M}\), so the first component of \(DX_p:T_p\textsf{M}\rightarrow T_{X(p)}T\textsf{M}\) is just the identity map on \(T_p \textsf{M}\). Note, without imposing further assumptions, this “connection-free” construction of \(DX_p\) will only make sense when \(X_p=0\in T_p\textsf{M}\). To see this, recall that \(X\in \mathfrak {X}^{r\ge 1}(\textsf{M})\) induces a \(C^{r\ge 1}\) (local) flow \(\varphi ^t:\textsf{M}\rightarrow \textsf{M}\) [43, Theorem D.5]. Now, consider the time-derivative of \(D\varphi ^t_p:T_p\textsf{M}\rightarrow T_{\varphi ^t(p)}\textsf{M}\) and see that we work within the same tangent space when p is a fixed point of \(\varphi ^t\). In that case we do not need additional structure and can define \(DX_p:T_p\textsf{M}\rightarrow T_p\textsf{M}\) via

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}D\varphi ^t_p v = DX_p v,\qquad v\in T_p\textsf{M}. \end{aligned}$$

Via (5.1) one can observe that in coordinates this construction entails the Jacobian of X. One can also see this from \(T_{X(p)=0}T\textsf{M}\simeq T_p\textsf{M}\oplus T_p\textsf{M}\) [2, p. 72] or by observing that the non-Euclidean part of the covariant derivative vanishes [42, Chap. 4].
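As an illustration of the coordinate picture, the following sketch (our own example, not taken from the text) approximates \(DX_p\) at an equilibrium by the Jacobian of the coordinate expression of X, obtained here via central finite differences, and inspects its eigenvalues to check hyperbolicity.

```python
# Small sketch: at an equilibrium p with X(p) = 0, DX_p is in coordinates simply the
# Jacobian of X, whose eigenvalues determine hyperbolicity of the equilibrium.
import numpy as np

def X(z):
    x, y = z
    # damped pendulum vector field; (0, 0) is an equilibrium
    return np.array([y, -np.sin(x) - 0.5 * y])

def jacobian(f, p, h=1e-6):
    p = np.asarray(p, dtype=float)
    J = np.zeros((p.size, p.size))
    for i in range(p.size):
        e = np.zeros_like(p); e[i] = h
        J[:, i] = (f(p + e) - f(p - e)) / (2 * h)  # central differences
    return J

DXp = jacobian(X, [0.0, 0.0])
print(np.linalg.eigvals(DXp))  # both eigenvalues have negative real part: hyperbolic
```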

As most of this work will be centred around providing necessary conditions, we will not be further concerned with integrability. We point the reader to [43, 57, Appendix D], [62, Appendix C] and note that a uniquely integrable continuous vector field \(X\in \mathfrak {X}^{r\ge 0}(\textsf{M})\) under locally Lipschitz regularity conditions gives rise to a unique maximal flow \(\varphi _X:\mathcal {I}\times \textsf{M}\rightarrow \textsf{M}\), for some \(\mathcal {I}\subseteq \mathbb {R}\). To simplify the overall exposition, we will assume—unless stated otherwise—the following throughout.

Assumption 5.1

(Unique global integrability) Every vector field \(X\in \mathfrak {X}^{r\ge 0}(\textsf{M})\) considered in this work is complete and uniquely integrable, that is, X gives rise to a unique global continuous flow \(\varphi _X:\mathbb {R}\times \textsf{M}\rightarrow \textsf{M}\).

Nevertheless, notable results that hold under weaker assumptions will be highlighted. Next we introduce a notion due to Birkhoff [10, p. 197].

Definition 5.1

(The \(\omega \)-limit set [56, p. 148]) Given a flow \(\varphi \) on a topological manifold \(\textsf{M}\), the \(\mathbf {\omega }\) -limit set of \(p\in \textsf{M}\) is

$$\begin{aligned} \omega (\varphi ,p) = \textstyle \bigcap _{T\ge 0}\textrm{cl}\, \bigcup _{t\ge T}\{\varphi ^t(p)\}. \end{aligned}$$
(5.2)

Differently put, as we work with global flows (complete vector fields), \(y\in \omega (\varphi ,p)\) when there is a monotonically increasing sequence \(\{t_n\}_{n\in \mathbb {N}}\subset \mathbb {R}\) with \(\lim _{n\rightarrow \infty }t_n = +\infty \) such that \(\lim _{n\rightarrow \infty }\varphi ^{t_n}(p)=y\) [2, Proposition 6.1.2]. The \(\omega \)-limit set captures any type of asymptotic recurrent behaviour, like the convergence to equilibrium points but also limit cycles and so forth, as further detailed below. Analogously, one can define the \(\alpha \)-limit set by reversing time. When generalizing Definition 5.1 to sets \(P\subseteq \textsf{M}\) it is important to see that \(\omega (\varphi ,P)\) is not necessarily equal to \(\cup _{p\in P}\,\omega (\varphi ,p)\).
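The sequence characterization above suggests a simple numerical approximation of \(\omega (\varphi ,p)\): sample the flow along an increasing sequence of times after discarding a transient. The sketch below (an illustration of ours, assuming SciPy is available) does this for the Van der Pol oscillator, whose \(\omega \)-limit set for \(p\ne 0\) is a limit cycle.

```python
# Approximate omega(phi, p) by sampling the flow along t_n -> +infinity.
import numpy as np
from scipy.integrate import solve_ivp

def vdp(t, z, mu=1.0):
    x, y = z
    return [y, mu * (1.0 - x**2) * y - x]

t_n = np.linspace(50.0, 100.0, 500)          # discard the transient on [0, 50)
sol = solve_ivp(vdp, (0.0, 100.0), [0.1, 0.0], t_eval=t_n, rtol=1e-9)
omega_samples = sol.y.T                      # points accumulating on the limit cycle
print(omega_samples[:3])
```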

Recall, given a vector field \(X\in \mathfrak {X}^r(\textsf{M})\), we call \(p^{\star }\in \textsf{M}\) an equilibrium point of X when \(X_{p^{\star }}=0\in T_{p^{\star }}\textsf{M}\), equivalently, \(\omega (\varphi _X,p^{\star })=\{p^{\star }\}\).Footnote 2 The point \(p^{\star }\) is called isolated when there is an open neighbourhood \(U\subseteq \textsf{M}\) of \(p^{\star }\) such that for all \(p\in U\setminus \{p^{\star }\}\) one has \(X_p\ne 0\), equivalently, \(\omega (\varphi _X,p)\ne \{p\}\) for all \(p\in U\setminus \{p^{\star }\}\). Note, this is different from an isolated set in the sense of Conley cf. Sect. 8.5. Away from equilibrium points the flow is locally equivalent to a straight-line (constant) flow [53, Theorem 2.26]. This flow-box theorem indicates why the behaviour around equilibrium points is interesting to study, but also that periodic orbits are inherently hard to study locally.

Definition 5.2

(Hyperbolic equilibrium points [55, p. 58]) Let \(p^{\star }\in \textsf{M}^m\) be an equilibrium point of \(X\in \mathfrak {X}^{r\ge 1}(\textsf{M}^m)\). Then, \(p^{\star }\) is a hyperbolic equilibrium point if \(DX_{p^{\star }}:T_{p^{\star }}\textsf{M}^m\rightarrow T_{p^{\star }}\textsf{M}^m\) defines a hyperbolic linear vector field, that is, all eigenvalues (in coordinates) of \(DX_{p^{\star }}\) in \(\mathbb {R}^{m\times m}\) have a non-zero real part.

Hyperbolic equilibrium points are generic [55, p. 58] and isolated, cf. Proposition 3.5. Next, we define the qualitative behaviour of interest for time-invariant dynamical systems.

Fig. 5.1 Definition 5.3: (i) Lyapunov stability; (ii) attractivity; and (iii) asymptotic stability

Definition 5.3

(Time-invariant stability notions of equilibrium points) Given a vector field \(X\in \mathfrak {X}^r(\textsf{M})\) defining the system (5.1) we distinguish the following notions of stability for an equilibrium point \(p^{\star }\in \textsf{M}\) of X:

  1. (i)

    Lyapunov stability: for each neighbourhood \(U_{\epsilon }\) of \(p^{\star }\) there is a neighbourhood \(U_{\delta }\) of \(p^{\star }\) such that for all \(p_0\in U_{\delta }\) one has \(\varphi _X^t(p_0)\in U_{\epsilon }\) for all \(t\ge 0\), that is, \(\varphi ^t_X(U_{\delta })\subseteq U_{\epsilon }\) for all \(t\ge 0\);

  2. (ii)

    Local attractivity: there is a neighbourhood U of \(p^{\star }\) such that for all \(p_0\in U\) one has \(\lim _{t\rightarrow +\infty }\varphi _X^t(p_0)=p^{\star }\);

  3. (iii)

    Local asymptotic stability: \(p^{\star }\) satisfies both (i) and (ii).

If (ii) holds with \(U=\textsf{M}\), then the point \(p^{\star }\) is globally attractive and similarly, if \(p^{\star }\) is globally attractive and stable in the sense of Lyapunov, then, \(p^{\star }\) is globally asymptotically stable, see Fig. 5.1. If (i) fails to hold, \(p^{\star }\) is called unstable.

We will be mostly interested in studying asymptotic stability, however, not only regarding equilibrium points. A compact set \(A\subseteq \textsf{M}\) is called a local attractor of the flow \(\varphi \) when A is \(\varphi \) -invariant, that is, \(\varphi ^t(A)\subseteq A\) for all \(t\in \mathbb {R}\), and there is an open neighbourhood \(U\subseteq \textsf{M}\) of A such that \(\cap _{t\ge 0}\,\varphi ^t(U)=A\). In fact, this is equivalent to saying that the invariant set A is locally asymptotically stable [28, Lemma 1.6].Footnote 3 Attractivity is clear; for Lyapunov stability, consider for example Fig. 1.1(vi), where \(\cap _{t\ge 0}\,\varphi ^t(U)\) would correspond to a set which is neither closed nor open, e.g., of the form \([a,b)\). Indeed, Lyapunov stability and local attractivity are independent notions; a set can possess one while lacking the other, see also [26, Sect. 40].

As global asymptotic stability will often turn out to be impossible, we make a special distinction: we will consider local asymptotic multistability, e.g., the situation where all isolated equilibrium points of \(X\in \mathfrak {X}^r(\textsf{M})\) are locally asymptotically stable. The importance of this notion follows from the fact that having multiple attractors means that disturbances can qualitatively change the nominal behaviour, moving from one attractor to another. Multistability appears for example in the study of laser dynamics [6] and neural networks [14], with the importance of multistability being especially acknowledged in biology, e.g., see [5, 41, 49].

So far, everything was qualitative, yet, when imposing a metric on \(\textsf{M}\), stability (the rate of convergence) can be quantified [12, Sect. 6.1.5]. See [25] for the relation between asymptotic- and this quantitative notion called exponential stability.Footnote 4 Also, a metric enables handling non-compact attractors, e.g., see [73, Sect. 3].

When the dynamical system is time-varying, e.g., when \(X:\mathcal {I}\times \textsf{M}\rightarrow T\textsf{M}\) with \(\mathcal {I}\subseteq \mathbb {R}\) is a continuous time-varying vector field, we need to generalize our stability notions. First note that time-varying vector fields do not necessarily give rise to flows, however, time-dependent generalizations are possible [43, Theorem 9.48]. One still speaks of \(\xi :\mathcal {I}\rightarrow \textsf{M}\) as an integral curve of the time-varying vector field X when \(\dot{\xi }(t)=X(t,\xi (t))\) for all \(t\in \mathcal {I}\), however, one might have to settle for a weak solution to the differential equation. That is, one allows for \(\xi \) to be merely absolutely continuous and to satisfy the differential equation almost everywhere, with respect to the Lebesgue measure. In what follows, when a vector field is time-varying, we take this viewpoint, see [1, Chap. 4], [12, Appendix A], [62, Appendix C] and [30] for more on solutions of time-varying vector fields. In general, for a time-varying dynamical system the \(\epsilon \)-\(\delta \) definition of Lyapunov stability might depend on time, that is, \(\delta \) might depend on the initial time. When this is not the case, we speak of uniform stability, which coincides with stability in case the vector field is time-invariant. Similarly, one can extend the other notions of stability, see [36, Sect. 4.5], [18, Sect. 11.2] and Example 6.4 below. In particular, see [69, Sect. 5.1] for illustrative examples of the aforementioned stability notions due to Massera and Vidyasagar and see [66] for misconceptions when it comes to uniform stability. Notably, a lack of uniformity can compromise robustness, e.g., \(\delta \rightarrow 0\) for \(t\rightarrow +\infty \).

Now given an equilibrium point \(p^{\star }\in \textsf{M}\) for some vector field \(X\in \mathfrak {X}^r(\textsf{M})\), it is worthwhile to characterize the set of points that flow towards \(p^{\star }\) under X.

Definition 5.4

(Domain of attraction of an equilibrium point) Let \(p^{\star }\) be a locally asymptotically stable equilibrium point of \(X\in \mathfrak {X}^r(\textsf{M})\) defining the system (5.1). The domain of attraction of \(p^{\star }\) is the set

$$\begin{aligned} \mathcal {D}(\varphi _X,p^{\star }) =\{p\in \textsf{M} : \lim _{t\rightarrow +\infty } \varphi _X^t(p)=p^{\star }\}. \end{aligned}$$
(5.3)

With some abuse of notation one could also write \(\mathcal {D}(\varphi _X,p^{\star })\) as \(\omega ^{-1}(\{p^{\star }\})\). The domain of attraction is also called the basin- or region of attraction and is in other work occasionally denoted as \(B(p^{\star })\) or \(A(p^{\star })\). Estimating the domain of attraction has a variety of applications, for example in cancer treatments [51, 58]. See also the 1985 survey paper by Genesio, Tartaglia and Vicino for more historical context [23]. For general attractors \(A\subseteq \textsf{M}\), one generalizes Definition 5.4 consistently, that is, \(\mathcal {D}(\varphi _X,A)=\{p\in \textsf{M}:\lim _{n\rightarrow +\infty }\varphi ^{t_n}_X(p)= a\) for some \(a\in A\) and some monotonically increasing sequence \(\{t_n\}_{n\in \mathbb {N}}\subset \mathbb {R}\) with \(\lim _{n\rightarrow \infty }t_n=+\infty \}\).

In general, an equilibrium point \(p^{\star }\in \textsf{M}\) of some vector field \(X\in \mathfrak {X}^r(\textsf{M})\) is not locally asymptotically stable. In this case it might be of interest to split \(\textsf{M}\), locally, into stable and unstable parts. Assume that \(p^{\star }\) is a hyperbolic equilibrium point under the flow \(\varphi _X\) and define the so-called stable and unstable “manifolds” of \(p^{\star }\) by \(W^s(\varphi _X,p^{\star }) = \{p\in \textsf{M}:\omega (\varphi _X,p)=\{p^{\star }\}\}\) and \(W^u(\varphi _X,p^{\star }) = \{p \in \textsf{M}:\alpha (\varphi _X,p)=\{p^{\star }\}\}\). As \(p^{\star }\) is hyperbolic one can split \(T_{p^{\star }}\textsf{M}\) as \(T_{p^{\star }}\textsf{M}=T_{p^{\star }}W^s(\varphi _X,{p^{\star }})\oplus T_{p^{\star }} W^u(\varphi _X,p^{\star })\). In fact, \(T_{p^{\star }}\textsf{M}\) splits according to the generalized eigenvectors of \(DX_{p^\star }\). More can be said about these stable- and unstable manifolds, see [33, Chap. 6]. Also, when \(p^{\star }\) is not hyperbolic one can appeal to center manifold theory, e.g., see [13, 29].

This section ends with two explicit vector field examples from biology.

Fig. 5.2 Continuous transformation of Stepanova’s model, removing the malignant equilibrium

Example 5.1

(Tumor immune interactions) Returning to Chap. 3, in particular, Hopf’s result (Proposition 3.6) tells us that indices adding up to 0 can be effectively “homotoped away”. This observation is of use when one wants to know if a certain dynamical system can be “continuously deformed” into another dynamical system. Here, we look at an example that models tumor immune system interactions [58, Chap. 8]. Stepanova’s model is given by the following set of equations

$$\begin{aligned} \left\{ \begin{aligned} \dot{p}=&\xi p F(p) - \theta pr\\ \dot{r}=&\alpha (p-\beta p^2)r + \gamma -\delta r, \end{aligned}\right. \end{aligned}$$
(5.4)

where p represents tumor volume, r the immunocompetent cell density, F(p) a growth rate and all other parameters are constant coefficients. For common choices of parameters, (5.4) has two stable equilibrium points, a benign state \(b^{\star }\) and a malignant state \(m^{\star }\), separated by a saddle \(s^{\star }\), see Fig. 5.2(i). A question of interest is whether (5.4) can be deformed such that only the asymptotically stable benign equilibrium state prevails. Using the theory of vector field indices we see that \(s^{\star }\) and \(m^{\star }\) have indices of opposite sign such that they can be morphed into \(\widetilde{s}^{\star }\), see Fig. 5.2(ii). Similarly, as this intermediate equilibrium point has index 0, it can be removed completely, see Fig. 5.2(iii). Indeed, as \(b^{\star }\) has index 1, which equals the Euler characteristic of the rectangular domain, and the vector field is pointing inwards, the existence of this transformation was guaranteed from the start, see also Sect. 8.2.
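For readers who wish to experiment with (5.4), the following sketch integrates Stepanova's model numerically; the Gompertzian growth law F and all parameter values are illustrative placeholders chosen by us and are not taken from the text.

```python
# Simulation sketch of (5.4); growth law F and parameter values are hypothetical.
import numpy as np
from scipy.integrate import solve_ivp

xi, theta, alpha, beta, gamma, delta = 0.084, 1.0, 0.00484, 0.00264, 0.05, 0.37
F = lambda p: np.log(780.0 / p)        # hypothetical Gompertzian growth term

def stepanova(t, z):
    p, r = z
    return [xi * p * F(p) - theta * p * r,
            alpha * (p - beta * p**2) * r + gamma - delta * r]

# integrate from two initial tumor volumes and report the long-run state reached
for p0 in (30.0, 600.0):
    sol = solve_ivp(stepanova, (0.0, 300.0), [p0, 0.5], rtol=1e-8)
    print(f"p(0) = {p0:5.1f}  ->  (p, r) ~ {sol.y[:, -1].round(3)}")
```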

Fig. 5.3 Dynamical (control) systems: (i) a parametrization of a double helix DNA model on \(\textsf{SE}(3,\mathbb {R})\) corresponding to Example 5.2 and (ii) the circuit corresponding to Example 5.4

Example 5.2

(DNA conformation [37]) In this example we model the DNA double helix as an elastic rod that has a helical twist at its minimum energy conformation. One can parametrize this model by a coordinate frame moving along a curve in \(\mathbb {R}^3\). Moreover, this curve has a, possibly arc-length-dependent, matrix attached to it, describing its mechanical properties like stiffness. Now, to move along this curve, the instantaneous changes in orientation and position need to be supplied, i.e., a rotation and a translation. This parametrization naturally leads to the employment of the special Euclidean group, defined as

$$\begin{aligned} \textsf{SE}(3,\mathbb {R})=\left\{ A\in \mathbb {R}^{4\times 4}:A=\begin{bmatrix} R &{} t\\ 0_{1\times 3} &{} 1 \end{bmatrix},\, R\in \textsf{SO}(3,\mathbb {R}),\,t\in \mathbb {R}^3 \right\} . \end{aligned}$$

Here, for \(A\in \textsf{SE}(3,\mathbb {R})\), R describes the rotation and t the translation. For any point \(p\in \mathbb {R}^3\), the propagation under \(A\in \textsf{SE}(3,\mathbb {R})\) is as follows: append 1 to p, i.e., let \(\tilde{p}=(p,1)\in \mathbb {R}^4\), then \(A\tilde{p}=(Rp+t,1)\in \mathbb {R}^4\), i.e., first p is rotated to Rp, then, this point is translated to \(Rp+t\). Now, the potential function that characterizes the energy of the DNA conformation will only consider relative changes in the curve parametrizing the DNA model. As such, we will always look from the so-called “body frame”. To that end, see that for any differentiable curve \(s\mapsto A(s)\in \textsf{SE}(3,\mathbb {R})\)

$$\begin{aligned} A(s)^{-1}\frac{\textrm{d}}{\textrm{d}s}{A}(s) = \begin{bmatrix} R(s)^{\textsf{T}}\dot{R}(s) &{} R(s)^{\textsf{T}}\dot{t}(s)\\ 0_{1\times 3} &{} 0 \end{bmatrix}\in \mathfrak {se}(3,\mathbb {R})=T_{I_4}\textsf{SE}(3,\mathbb {R}). \end{aligned}$$

With this observation in mind, let \(\xi \) be the vectorization of \(A^{-1}\dot{A}\), denoted \(\xi =(A^{-1}\dot{A})^{\vee }\) with inverse \(\xi ^{\wedge }=A^{-1}\dot{A}\), details can be found in [52], but are irrelevant for this example. Now, one defines a quadratic elastic potential energy function, for a DNA double helix modelled as an unconstrained extensible rod, by \(U(\xi )=\tfrac{1}{2}\xi ^{\textsf{T}}K\xi - k^{\textsf{T}}\xi +\beta \), for some positive definite \(K\in \mathbb {R}^{6\times 6}\), \(k\in \mathbb {R}^6\) and \(\beta \in \mathbb {R}\). Minimizing U over \(\xi \) results in \(\xi ^{\star }(s)=K^{-1}k\) and therefore

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}s}{A}(s) = A(s)(K^{-1}k)^{\wedge }, \end{aligned}$$
(5.5)

is a vector field on \(\textsf{SE}(3,\mathbb {R})\), that describes, locally, a DNA double helix model corresponding to the pair (Kk). To appreciate the geometric construction, the reader is invited to find the curve related to (5.5) directly.
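As a hint towards that exercise, note that for the constant \(\xi ^{\star }=K^{-1}k\) the solution of (5.5) is \(A(s)=A(0)\exp (s\,(\xi ^{\star })^{\wedge })\). The sketch below (with an illustrative \(\xi ^{\star }\) and the convention, chosen here, that the angular part comes first in the vectorization) evaluates this curve with a matrix exponential.

```python
# Sketch of the curve generated by (5.5): A(s) = A(0) expm(s (xi*)^wedge).
# Ordering xi = (omega, v) and the value of xi* are illustrative choices.
import numpy as np
from scipy.linalg import expm

def wedge(xi):
    wx, wy, wz, vx, vy, vz = xi
    return np.array([[0.0, -wz,  wy, vx],
                     [ wz, 0.0, -wx, vy],
                     [-wy,  wx, 0.0, vz],
                     [0.0, 0.0, 0.0, 0.0]])

xi_star = np.array([0.0, 0.0, 1.85, 1.0, 0.0, 0.34])     # illustrative twist and rise
A = lambda s: expm(s * wedge(xi_star))                    # A(0) = I_4

centerline = np.array([A(s)[:3, 3] for s in np.linspace(0.0, 10.0, 200)])
print(centerline[:3])                                     # the translation part traces a helix
```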

5.2 Lyapunov Stability Theory

In this part we provide a brief overview of the stability theory as devised by Lyapunov [46]. The tools are fruitful in their own right, but as it turns out, the level sets of Lyapunov functions are intimately related to the domain of attraction of the attractor at hand. A fairly general result is the following.

Theorem 5.1

(Compact attractors and Lyapunov functions [8, Theorem 10.6]) Let \(\varphi \) be a continuous flow on a locally compact Hausdorff space \(\textsf{X}\). The compact set \(A\subseteq \textsf{X}\) is a local attractor under \(\varphi \) if and only if there is a continuous function \(V:\mathcal {D}(\varphi ,A)\rightarrow \mathbb {R}_{\ge 0}\) such that

  1. (i)

    \(V(x)=0\) for all \(x\in A\);

  2. (ii)

    \(V(x)>0\) for all \(x\in \mathcal {D}(\varphi ,A)\setminus A\), the sublevel set \(V^{-1}([0,c])=\{x\in \mathcal {D}(\varphi ,A):V(x)\le c\}\) is compact for all \(c\in [0,+\infty )\) and;

  3. (iii)

    \(V(\varphi ^t(x))<V(x)\) for all \(x\in \mathcal {D}(\varphi ,A)\setminus A\) and \(t>0\).

A function V as in Theorem 5.1 is called a strict continuous Lyapunov function. The word “strict” refers to item (iii), i.e., the inequality is strict. Without strictness, attractivity of A cannot be asserted, merely Lyapunov stability. As we almost exclusively consider attractors, the adjective “strict” is dropped in the remainder of the book. See that the sublevel set compactness assures that V is (weakly) coercive, e.g., \(V(x)\rightarrow +\infty \) for \(x\rightarrow \partial \mathcal {D}(\varphi ,A)\) or \(\Vert x\Vert \rightarrow +\infty \). See for example [26, p. 109] for more on the necessity of this condition, see also Example 5.3 for a geometric interpretation. Note that the results in [8] are with respect to (local) semi-dynamical systems, however, by Assumption 5.1 we do not need to concern ourselves with semi-dynamical system technicalities like “start points”. See also [9, Chap. V] for a variety of Lyapunov theory, albeit for locally compact metric spaces. Although outside of the scope of this work, Theorem 5.1 does not apply as is to infinite-dimensional systems; for infinite-dimensional Lyapunov theory, see for example [50].

Moreover, the results due to Kurzweil [38, Theorem 7], Massera [48] and Wilson [72] indicate that there should always be a smooth and properFootnote 5 Lyapunov function. However, in contrast to popular belief, it took until the work by Fathi and Pageault in 2019 to formally show this for flows generated by vector fields on smooth manifoldsFootnote 6 [21]. See the proof of Theorem 6.2 for an application.

The key benefit of a smooth Lyapunov function is that explicit knowledge of the flow in Theorem 5.1(iii) can be substituted by a simpler condition. Let X be a continuous complete vector field on a smooth manifold \(\textsf{M}\) and let \(A\subseteq \textsf{M}\) be an attractor under the flow \(\varphi _X\). Then, there is a smooth and proper function \(V:\mathcal {D}(\varphi _X,A)\rightarrow \mathbb {R}_{\ge 0}\) such that \(A=\{x\in \mathcal {D}(\varphi _X,A):V(x)=0\}\) and the Lie derivativeFootnote 7 \(L_XV(x)<0\) for all \(x\in \mathcal {D}(\varphi _X,A)\setminus A\). Such a function V is called a smooth Lyapunov function, again omitting “strict”. For example, for \(\dot{x}=-x^3\) one readily verifies that \(V(x)=\tfrac{1}{2}x^2\) is a smooth Lyapunov function that asserts global asymptotic stability of \(x^{\star }=0\). See [15, 65] for further generalizations and see [35] for a survey on converse Lyapunov theorems. We also point out that the global continuous converse problem was solved by Conley and coworkers and is often referred to as “Conley’s Fundamental Theorem of Dynamical Systems”. The theorem states that “Any flow on a compact metric space decomposes into a chain recurrent part and a gradient-like part.”, see [54]. We return to this in Sect. 8.5.
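The claim for \(\dot{x}=-x^3\) can be verified symbolically; the following short SymPy check computes the Lie derivative \(L_XV\) and confirms that it is negative away from the origin.

```python
# Symbolic check (SymPy) that V(x) = x^2/2 is a strict Lyapunov function for x' = -x^3.
import sympy as sp

x = sp.symbols('x', real=True)
X = -x**3                      # the vector field, in coordinates
V = x**2 / 2                   # candidate Lyapunov function
LXV = sp.diff(V, x) * X        # Lie derivative L_X V

print(sp.simplify(LXV))        # -x**4
print(sp.solve(LXV < 0, x))    # satisfied for all real x except 0
```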

Example 5.3

(Topology of a Lyapunov function [73, Theorem 1.2]) Consider a continuous vector field \(\dot{x}=f(x)\) on \(\mathbb {R}^n\), with \(f(0)=0\) such that 0 is globally asymptotically stable and let \(V:\mathbb {R}^n\rightarrow \mathbb {R}_{\ge 0}\) be the corresponding smooth Lyapunov function. By assumption, \(\langle \partial _x V(x),f(x) \rangle < 0\) on \(\mathbb {R}^n\setminus \{0\}\), such that for any \(c\in (0,+\infty )\) every trajectory in \(\mathbb {R}^n\setminus \{0\}\) crosses the level set \(V_c = V^{-1}(c)\) exactly once. Moreover, as we have the homeomorphism \(\varphi _f^t(V_c){\simeq }_t\varphi _f^s(V_c)\) for all \(t,s\in \mathbb {R}\), we find by exploiting the flow properties of \(\varphi _f\) that \(V_c\times \mathbb {R}\simeq _t\mathbb {R}^n\setminus \{0\}\). However, as the trivial bundle \(V_c\times \mathbb {R}\) is homotopy equivalent to \(V_c\) and \(\mathbb {R}^n\setminus \{0\}\simeq _h\mathbb {S}^{n-1}\) we can conclude that \(V_c{\simeq }_h\mathbb {S}^{n-1}\). Better yet, as \(V_c\) is compact, the resolution of the (a) Poincaré conjecture yields that \(V_c\simeq _t \mathbb {S}^{n-1}\). See also [38, Sect. 5] for earlier topological remarks and for example [23, Theorem 2] for early remarks on ramifications of \(V_c\simeq _t \mathbb {S}^{n-1}\).

Finding explicit Lyapunov functions is usually hard [4] and contradicts being a (direct) “method” [32], but as upcoming sections illustrate, their mere existence proves to be useful. Moreover, as Theorem 5.1 holds for locally compact Hausdorff spaces, and not just topological manifolds, one can indeed generalize a few upcoming Lyapunov-based results beyond topological manifolds.

5.3 Control Systems

We start by illustrating how the study of a control system on a manifold can emerge.

Example 5.4

(DC Converter [75, Sect. 3.5], [44]) We will consider a so-called DC-to-DC converter as employed in laptops, phones and so forth. An idealized model can be constructed as in Fig. 5.3; here the switch is used to charge either the left or the right capacitor by means of the inductor, and thereby one can control the potential (voltage) over these capacitors. The equations of motion follow from \(V_L(t)=L(\textrm{d}/\textrm{d}t)I_L(t)\), \(I_C(t)=C(\textrm{d}/\textrm{d}t)V_C(t)\) and Kirchhoff’s laws

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\begin{bmatrix} C_1 V_1(t)\\ C_2 V_2(t)\\ L_3 I_3(t) \end{bmatrix} = \begin{bmatrix} (1-u)I_3(t)\\ u I_3(t)\\ -(1-u)V_1(t) - uV_2(t) \end{bmatrix}, \end{aligned}$$

where \(C_1,C_2\) denote the capacitance, \(L_3\) the inductance and \(u\in \{0,1\}\) the switching signal, that is, the input to the system. Now, consider the change of coordinates \(x_1=C_1^{1/2}V_1\), \(x_2=C_2^{1/2}V_2\) and \(x_3=L_3^{1/2}I_3\) with \(x=(x_1,x_2,x_3)\in \mathbb {R}^3\). Then let \(E(x)=\tfrac{1}{2}\langle x, x\rangle \) and observe that \(\textrm{d}/\textrm{d}t E(x(t))=0\). This means that if the system starts from some initial condition \(x_0\in \mathbb {R}^3\), total energy is preserved and the system evolves on a 2-sphere with radius \((2E(x_0))^{1/2}\). Now, the reader is invited to draw qualitative conclusions from \(\chi (\mathbb {S}^2)\ne 0\). We also remark that although the switching signal is binary, in practice one does not use such an idealized switch but rather a type of transistor. As such, u is not binary but rather continuous.
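A quick numerical sanity check of the conservation of E is sketched below, with hypothetical component values and the switch held fixed; the total energy stays constant up to integration tolerance, consistent with the system evolving on a sphere.

```python
# Check that the converter dynamics preserve E(x) = |x|^2 / 2 in the coordinates
# x1 = C1^{1/2} V1, x2 = C2^{1/2} V2, x3 = L3^{1/2} I3. Component values are hypothetical.
import numpy as np
from scipy.integrate import solve_ivp

C1, C2, L3 = 1e-3, 2.2e-3, 4.7e-3   # hypothetical component values
u = 1.0                              # switch held in one position for this test

def converter(t, z):
    V1, V2, I3 = z
    return [(1 - u) * I3 / C1, u * I3 / C2, (-(1 - u) * V1 - u * V2) / L3]

sol = solve_ivp(converter, (0.0, 0.05), [5.0, 0.0, 0.1], rtol=1e-10, atol=1e-12)
E = 0.5 * (C1 * sol.y[0]**2 + C2 * sol.y[1]**2 + L3 * sol.y[2]**2)
print(E.max() - E.min())             # approximately 0, up to integration tolerance
```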

Example 5.5

(Quantum control [20]) Let \(|\psi (t)\rangle \in \mathcal {H}\) be the state of a quantum system at time t, for some complex Hilbert space \(\mathcal {H}\). Assume we work with a two-level system, that is, \(|\psi (t)\rangle = c_0(t)|0\rangle + c_1(t)|1\rangle \), for \(|0\rangle ,|1\rangle \in \mathcal {H}\) and both \(t \mapsto c_0(t)\) and \(t\mapsto c_1(t)\) are complex-valued functions satisfying \(|c_0(t)|^2+|c_1(t)|^2=1\) for all t, i.e., the state is normalized. Let \(H_0\) be the internal Hamiltonian and let the external part be of the form \(\sum ^m_{k=1}H_k \mu _k(t)\), with each \(H_k\) a Hermitian operator on \(\mathcal {H}\) and each \(t\mapsto \mu _k(t)\) a real-valued (input) function. Then, Schrödinger’s equation, i.e., \(i\hbar |\dot{\psi }(t)\rangle = H |\psi (t)\rangle \), becomes

$$\begin{aligned} i\hbar \frac{\textrm{d}}{\textrm{d}t}{\Psi }(t) = \left( \bar{H}_0 + \textstyle \sum \limits ^m_{k=1}\bar{H}_k \mu _k(t)\right) \Psi (t), \end{aligned}$$
(5.6)

for \(\Psi (t)\) the evolution operator and \(\bar{H}_i\) the Hamiltonian in \((c_0,c_1)\) coordinates. One might be interested in steering \(\Psi (0)=I_2\) to some desirable operator at some time \(T\ge 0\). With this goal in mind, we can without loss of generality assume that all \(\bar{H}_i\) are traceless. By doing so, we merely ignore a physically indistinguishable phase difference. Now see that (5.6) is of the form \(\dot{\Psi }(t)=\bar{A}\Psi (t)\) with \(\bar{A}=-(i/\hbar )\bar{H}\). However, now also consider the special unitary group \(\textsf{SU}(n,\mathbb {C})=\{Z\in \mathbb {C}^{n\times n}:Z^{\textsf{H}}Z=I_n,\,\det (Z)=1\}\) with Lie algebra \(\mathfrak {su}(n,\mathbb {C})=\{A\in \mathbb {C}^{n\times n}:A^{\textsf{H}}+A=0,\,\textrm{Tr}(A)=0\}\) and see that \(\bar{A}\in \mathfrak {su}(2,\mathbb {C})\). As \(\Psi (0)=I_2\) is the identity element of \(\textsf{SU}(2,\mathbb {C})\), we see that (5.6) is in fact a (right-invariant) control system on the Lie group \(\textsf{SU}(2,\mathbb {C})\) cf. Sect. 1.3, unlocking Lie group machinery.
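To see the group structure numerically, the sketch below propagates (5.6) for piecewise-constant inputs by multiplying matrix exponentials of elements of \(\mathfrak {su}(2,\mathbb {C})\); the Hamiltonians and input values are illustrative and not taken from [20]. The resulting propagator remains unitary with unit determinant.

```python
# Propagate (5.6) on SU(2, C) with piecewise-constant inputs; traceless Hermitian
# Hamiltonians give generators in su(2, C). Hamiltonians and inputs are illustrative.
import numpy as np
from scipy.linalg import expm

hbar = 1.0                                          # natural units
sx = np.array([[0, 1], [1, 0]], dtype=complex)      # Pauli matrices (traceless, Hermitian)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

H0, H1 = 0.5 * sz, sx                               # internal and controlled Hamiltonian
mu = [0.0, 1.0, -0.5, 0.3]                          # piecewise-constant control values
dt = 0.7                                            # duration of each control segment

Psi = np.eye(2, dtype=complex)                      # Psi(0) = I_2
for m in mu:
    Psi = expm(-1j / hbar * (H0 + m * H1) * dt) @ Psi

print(np.allclose(Psi.conj().T @ Psi, np.eye(2)))   # unitary
print(np.isclose(np.linalg.det(Psi), 1.0))          # det = 1, i.e. Psi in SU(2, C)
```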

The purpose of the control system paradigm is to study if and how dynamical behaviour can be prescribed, e.g., the stabilization problem is that of finding inputs such that some set is stabilized in some sense. Control systems defined on manifolds can either be studied locally, that is, in some operating region of interest, or globally.

The most common local continuous-time formulation of a continuous time-invariant nonlinear dynamical control system is of the form

$$\begin{aligned} \Sigma _{n,m}^{\textrm{loc}}:\left\{ \quad \frac{\textrm{d}}{\textrm{d}t}{x}(t) = f\left( x(t),u\right) ,\right. \end{aligned}$$
(5.7)

for \(x(t)\in \mathcal {X}\subseteq \mathbb {R}^n\) and \(u\in \mathcal {U}\subseteq \mathbb {R}^m\) the state and input, respectively, e.g., see [53, 57]. One can append (5.7) with (time-invariant) output functions of the form \(y(t)=h(x(t),u)\). These functions capture for example what one can measure. However, we will work directly with the state x(t) and in that sense we restrict ourselves to a class of open dynamical systems with identity (or trivial) outputs [71]. This assumption is not restrictive as we will look at obstructions to achieving certain control objectives. Working with outputs y(t) instead of all state variables x(t) directly will only lead to more complications. We come back to this in Chap. 8.

When the input is chosen as a function of x, e.g., as \(\mu (x)\in \mathcal {U}\), we speak of \(\mu \) as being a feedback or control law. Note, as with the curve \(\xi \) and the points p and x, we use \(\mu \) instead of u to differentiate between the function and the point. When \(\mu \) depends on time, the input is said to be time-varying. A description like (5.7) works in Euclidean spaces or by using local coordinates on a manifold. However, as we are mostly interested in global questions we are aided by the following coordinate-free description, often attributed to Brockett, which allows for state-dependent input constraints, e.g., see the early work by Willems and van der Schaft [70, Definition 6]. Note, here we will not appeal to knowledge of a metric, cf. [62, p. 123].

Definition 5.5

(Continuous control system) Given a smooth manifold \(\textsf{M}\), a continuous control system is the triple \(\Sigma =(\textsf{M},\mathcal {U},F)\), consisting of a topological space \(\mathcal {U}\), a continuous surjective map \(\pi _u:\mathcal {U}\rightarrow \textsf{M}\), the canonical projection \(\pi _x:T\textsf{M}\rightarrow \textsf{M}\) and a continuous fiber-preserving map \(F:\mathcal {U}\rightarrow T\textsf{M}\) such that the following diagram is commutative:

figure b

Definition 5.5 implies that the available inputs at \(x\in \textsf{M}\) are characterized by the sections \(\Gamma ^0(\mathcal {U})\), i.e., the continuous maps \(\sigma :\textsf{M}\rightarrow \mathcal {U}\) such that \(\pi _{u}\circ \sigma = \textrm{id}_{\textsf{M}}\). Indeed, let \(\mathcal {U}=\mathbb {R}^n\times \mathbb {R}^m\), then F relates directly to (5.7). More interestingly, consider \(\mathcal {U}\) to be for example a disk bundle (tubular neighbourhood of \(\textsf{M}\)), which corresponds in coordinates to the input being constrained to a topological disk. In the framework of Definition 5.5, a continuous control law (feedback) \(\mu \in \Gamma ^0(\mathcal {U})\) results in the continuous closed-loop system \(F\circ \mu :\textsf{M}\rightarrow T\textsf{M}\). Note that instead of assuming F and \(\mu \) to be continuous maps, one could consider a larger class of systems by only demanding that the vector field \(F\circ \mu \) is continuous. See also [24, 39, 53, 64].
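The following minimal sketch (our own illustration) spells out Definition 5.5 in the Euclidean case \(\mathcal {U}=\mathbb {R}^n\times \mathbb {R}^m\) with \(\pi _u\) the projection onto the first factor: the control system is a fiber-preserving map F and a feedback is a section, whose composition with F yields a vector field on \(\textsf{M}\).

```python
# Sketch of Definition 5.5 in coordinates: a control system is a fiber-preserving
# map F, a feedback is a section mu, and the closed loop is the vector field F o mu.
from dataclasses import dataclass
from typing import Callable
import numpy as np

@dataclass
class ControlSystem:
    F: Callable[[np.ndarray, np.ndarray], np.ndarray]   # (x, u) -> F(x, u) in T_x M

    def closed_loop(self, mu: Callable[[np.ndarray], np.ndarray]):
        # composing with the section x -> (x, mu(x)) yields a vector field on M
        return lambda x: self.F(x, mu(x))

# example: the single integrator x' = u with the linear feedback mu(x) = -x
sigma = ControlSystem(F=lambda x, u: u)
X_cl = sigma.closed_loop(lambda x: -x)
print(X_cl(np.array([2.0, -1.0])))                       # [-2.  1.]
```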

Example 5.6

(Case study Sect. 1.3: modelling) Consider the left-invariant vector field (1.1b) over a Lie group \(\textsf{G}\). This model is analogous to \(\dot{x}=u\) with \(u\in \mathbb {R}^n\), as such, the control system becomes \(\Sigma =(\textsf{G},T\textsf{G},\textrm{id}_{T\textsf{G}})\).

Example 5.7

(Affine control systems) Motivated by mechanics, one of the most studied nonlinear dynamical control system models is of the input-affine form, cf. [53], described by a set of vector fields \(\{f,g_1,\dots ,g_m\}\) and an input taking values in \(\mathbb {R}^m\)

$$\begin{aligned} \Sigma _{n,m}^{\textrm{aff}}:\left\{ \quad \frac{\textrm{d}}{\textrm{d}t}{x}(t)= f(x(t)) + \textstyle \sum \limits ^m_{i=1}g_i(x(t))u_i.\right. \end{aligned}$$
(5.8)

Globally, \(\Sigma ^{\textrm{aff}}_{n,m}\) corresponds to \(\pi _u:\mathcal {U}\rightarrow \textsf{M}\) being a vector bundle [53, p. 428].

To put the class of continuous feedback laws in perspective, we must be able to describe generic admissible behaviour, that is, we must be able to describe all trajectories that can result from applying an admissible—and potentially discontinuous—input to the control system. Despite its importance and the attention it has received, this task, which falls under the umbrella of “controllability”, is only partially solved. Loosely speaking, the control system \(\Sigma =(\textsf{M},\mathcal {U},F)\) is said to be controllable when for any \(x_1,x_2\in \textsf{M}\) there is a finite \(T\ge 0\) and an “admissible input” such that the “resulting integral curve” starting at \(x_1\) arrives at \(x_2\) in time T cf. [53, Definition 3.2], [57, Definition 11.1]. The notion of an admissible input depends evidently on the application at hand, but let us highlight common mathematical assumptions. To that end, consider momentarily a control system in local coordinates: when studying dynamical control systems of the form \(\dot{x}=f(x,u)\), the input might depend on time in a merely measurable (i.e., possibly discontinuous) way. As such, an integral curve that satisfies this differential equation cannot always be continuously differentiable.

To analyze this scenario, let \(f:\mathbb {R}\times \Omega \rightarrow \mathbb {R}^n\) be continuous with \(\Omega \) open in \(\mathcal {X}\) and rewrite the standard initial value problem, that is, finding a curve \(\xi \) such that

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}{\xi }(t) = f(t,\xi (t)),\quad \xi (t_0)=\xi _{t_0},\quad t_0\in \textrm{int}(\mathcal {I}),\quad \text {for}~t\in \mathcal {I}\subseteq \mathbb {R} \end{aligned}$$

as that of finding a curve \(\xi :\mathcal {I}\rightarrow \Omega \) such that \(\xi (t)=\xi _{t_0}+\int ^t_{t_0}f(\tau ,\xi (\tau ))\textrm{d}\tau \) for all \(t\in \mathcal {I}\). The integral representation of the initial value problem immediately reveals that demanding \(\xi \) to be \(C^1\)-smooth is not a necessity. Moreover, from a practical point of view, the integral problem is as relevant as the differential equation. Therefore, the rightFootnote 8 setting is that of absolutely continuous curves, that is, curves \(\xi :\mathcal {I}\rightarrow \Omega \) that, when restricted to compact subsets of \(\mathcal {I}\), admit an integral representation of the form \(\xi (t) = \xi (t_0) + \int ^t_{t_0} g(\tau )\textrm{d}\tau \) for some Lebesgue measurable function \(g:\mathcal {I}\rightarrow \mathbb {R}^n\) with \(t\in [t_0,t_1]\subseteq \mathcal {I}\). Then, one speaks of a (weak) solution to the initial value problem when one finds an absolutely continuous curve \(\xi :\mathcal {I}\rightarrow \Omega \), that passes through \(\xi _{t_0}\in \Omega \) at \(t_0\in \mathcal {I}\), such that \(\dot{\xi }(t)=f(t,\xi (t))\) holds for almost every \(t\in \mathcal {I}\) in the sense of Lebesgue. To assert existence one commonly appeals to Carathéodory’s sufficient conditions, that is, \(t\mapsto f(t,\xi )\) must be measurable for all \(\xi \) and \(\xi \mapsto f(t,\xi )\) must be continuous for (almost) all t. This implies in particular that \(t\mapsto f(t,\xi (t))\) is measurable. Moreover, for each compact set \(K\subset \mathcal {I}\times \Omega \) there must be an integrable function \(b_K(t)\) such that \(\Vert f(t,\xi )\Vert \le b_K(t)\) for all \((t,\xi )\in K\) [27, Sect. 1.5].
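The sketch below illustrates a Carathéodory-type solution: the right-hand side is discontinuous in time (a square-wave input), so the trajectory is merely absolutely continuous and satisfies the equation almost everywhere; integrating segment-by-segment sidesteps the discontinuities. The system and input are illustrative choices of ours.

```python
# Sketch of a weak (Caratheodory) solution: x' = -x + u(t) with a square-wave input
# that is merely measurable in t; the solution is absolutely continuous and satisfies
# the ODE for almost every t.
from scipy.integrate import solve_ivp

ts, xs, x0 = [0.0], [1.0], [1.0]
for k in range(8):                                   # eight unit-length segments
    uk = 1.0 if k % 2 == 0 else -1.0                 # input constant on [k, k + 1)
    sol = solve_ivp(lambda t, x: [-x[0] + uk], (k, k + 1.0), x0, rtol=1e-9)
    ts.extend(sol.t[1:]); xs.extend(sol.y[0, 1:]); x0 = [sol.y[0, -1]]

print(xs[-1])                                        # value of the weak solution at t = 8
```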

Now we return to dynamical control systems and briefly comment on classes of inputs that are frequently employed. The input \(t\mapsto \mu (t)\in \mathcal {U}\) with \(t\in \mathcal {I}\) is said to be locally integrable when \(\int _{\mathcal {J}}\Vert \mu (\tau )\Vert \textrm{d}\tau <\infty \) for all compact subsets \(\mathcal {J}\subseteq \mathcal {I}\) (with \(\Vert \cdot \Vert \) from \(\mathbb {R}^m\)). We denote this by \(\mu \in L^1_{\textrm{loc}}(\mathcal {I})\). On the other hand \(t\mapsto \mu (t)\in \mathcal {U}\) is said to be locally essentially bounded when for each compact set \(\mathcal {J}\subseteq \mathcal {I}\) there is a compact set \(K\subseteq \mathcal {U}\) such that \(\mu (t)\in K\) for almost every \(t\in \mathcal {J}\). This set is denoted by \(L^{\infty }_{\textrm{loc}}(\mathcal {I})\). When the control system is input-affine, we can select \(\mu \in L^1_{\textrm{loc}}(\mathcal {I})\) to assert, locally, the Carathéodory conditions, notably, the last condition. In general, when f(x, u) is jointly continuous in x and u one selects \(\mu \in L^{\infty }_{\textrm{loc}}(\mathcal {I})\) [62, Appendix C].

With this motivation and the control system \(\Sigma =(\textsf{M},\mathcal {U},F)\) in mind, we define the set of admissible inputs over the interval \(\mathcal {I}\), denoted by \(\mathscr {U}(\mathcal {I})\), as all maps \(\mu :(\mathcal {I}\subseteq \mathbb {R})\times \textsf{M}\rightarrow \mathcal {U}\) with \(t\mapsto \mu (t,x)\) being measurable for all \(x\in \textsf{M}\), \(x\mapsto \mu (t,x)\) being continuous for all \(t\in \mathcal {I}\) and in particular \(\mu (t,\cdot )\in \Gamma ^0(\mathcal {U})\). Controlled trajectories are then absolutely continuous curves \(\xi :\mathcal {I}\rightarrow \textsf{M}\) such that

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\xi (t) = (F\circ \mu )(t,\xi (t))\quad \text {for}~a.e.~t\in \mathcal {I} \text { and some }\mu \in \mathscr {U}(\mathcal {I}). \end{aligned}$$

With this in place we return to the question of controllability. Global controllability is unfortunately difficult to characterize. To study controllability and its ramifications, locally, we follow Lewis [45] and define the reachable set from \(x_1\in \textsf{M}\) in time \(T\ge 0\) as \(\mathcal {R}_{\Sigma }(x_1,T)=\{\xi (T):\) there exists a controlled trajectory \(\xi :\mathcal {I}\rightarrow \textsf{M}\) such that \(\xi (0)=x_1\) and \([0,T]\subseteq \mathcal {I}\}\). Let \(\mathcal {R}_{\Sigma }(x,\mathcal {T}) = \cup _{t\in \mathcal {T}}\mathcal {R}_{\Sigma }(x,t)\), then we say that \(\Sigma \) is accessible from \(x\in \textsf{M}\) if \(\textrm{int}(\mathcal {R}_{\Sigma }(x,\mathbb {R}_{\ge 0}))\ne \emptyset \). Similarly, \(\Sigma \) is locally controllable from \(x\in \textsf{M}\) when \(x\in \textrm{int}(\mathcal {R}_{\Sigma }(x,\mathbb {R}_{\ge 0}))\). The control system \(\Sigma \) is small-time-locally-controllable from \(x\in \textsf{M}\) if there is a \(T>0\) such that \(x\in \textrm{int}(\mathcal {R}_{\Sigma }(x,[0,s]))\) for all \(s\in (0,T]\). Based on the control system at hand we will spell out—as far as possible—which type of controllability can be asserted and how. When the goal is to stabilize a set \(A\subseteq \textsf{M}\) one might make a distinction between controllability on \(\textsf{M}\setminus A\) and on A itself. See [45] for a relatively recent overview of geometric nonlinear controllability and see [34] for an earlier survey paper by Kawski. See also [22] for the parallel development of exploiting so-called “flatness”.
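Reachable sets are rarely available in closed form; a crude, non-rigorous way to get a feel for \(\mathcal {R}_{\Sigma }(x_1,T)\) is to sample admissible inputs and record the endpoints of the corresponding controlled trajectories, as sketched below for a double integrator with \(|u|\le 1\) (an illustrative example of ours).

```python
# Crude sampling-based sketch (not a rigorous computation) of R_Sigma(x1, T) for the
# double integrator x' = (x2, u), |u| <= 1: draw piecewise-constant inputs at random
# and record the endpoints xi(T) of the controlled trajectories.
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)
T, segments, x1 = 1.0, 5, np.array([0.0, 0.0])

def endpoint(u_values):
    x = x1.copy()
    for j, u in enumerate(u_values):                 # hold each value for T / segments
        sol = solve_ivp(lambda t, z: [z[1], u],
                        (j * T / segments, (j + 1) * T / segments), x, rtol=1e-8)
        x = sol.y[:, -1]
    return x

samples = np.array([endpoint(rng.uniform(-1.0, 1.0, segments)) for _ in range(500)])
print(samples.min(axis=0), samples.max(axis=0))      # bounding box of sampled endpoints
```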

Example 5.8

(Linear controllability and continuous feedback) Consider the linear control system \(\Sigma ^{L}=(\mathbb {R}^n, \mathbb {R}^n\times \mathbb {R}^m, F\in L(\mathbb {R}^n\times \mathbb {R}^m;\mathbb {R}^n))\), succinctly given by

$$\begin{aligned} \dot{x}=Ax+Bu, \end{aligned}$$
(5.9)

for some \(A\in \mathbb {R}^{n\times n}\) and \(B\in \mathbb {R}^{n\times m}\). Note, here one exploits the identification \(T\mathbb {R}^n\simeq \mathbb {R}^n\times \mathbb {R}^n\). It readily follows that the integral curves \(t\mapsto \xi (t)\in \mathbb {R}^n\) corresponding to (5.9) are of the form \(\xi (T)=e^{TA}\xi (0)+\int ^T_0 e^{(T-s)A}B\mu (s)\textrm{d}s\) for some \(T>0\) and some choice of admissible input \(t\mapsto \mu (t)\), e.g., \(\mu \in L^1_{\textrm{loc}}([0,T])\). The linear control system \(\Sigma ^L\) is controllable if and only if \(\textrm{rank}(B,AB,\dots ,A^{n-1}B)=n\), which readily follows from comparing the rank of \((B,AB,\dots ,A^{n-1}B)\) to the rank of \(e^{tA}B\) cf. [62, Theorem 3]. Instead of referring to \(\Sigma ^L\), one frequently refers to the pair (A, B), which is said to be a controllable pair in case \(\Sigma ^L\) is controllable. Now, one can show that if (A, B) is a controllable pair, so is \((A+BK,B)\), for any \(K\in \mathbb {R}^{m\times n}\) [74, Lemma 2.1]. Better yet, one can show that there is a \(K_0\in \mathbb {R}^{m\times n}\) and a \(u_0\in \mathbb {R}^m\) such that \((A+BK_0,Bu_0)\) is a controllable pair [74, Lemma 2.2]. Note, with some abuse of notation, the resulting control system \(\dot{x}=(A+BK_0)x+(Bu_0)u\) is now a controllable single-input system. Then, for the moment assume \(m=1\), \(B=b\) and that (A, b) is a controllable pair, i.e., the matrix \(T=(b,Ab,\dots ,A^{n-1}b)\) has rank n. Due to the Cayley–Hamilton theorem, performing a linear change of coordinates \(Tz=x\) can be shown to result in

$$\begin{aligned} T^{-1}AT=\widetilde{A}=\begin{bmatrix} 0 &{} 0 &{} \cdots &{} -a_1 \\ 1 &{} 0 &{} \dots &{} -a_2\\ 0 &{} \ddots &{} &{}\vdots \\ 0 &{} 0 &{} 1 &{} -a_n \end{bmatrix},\quad T^{-1}b=\widetilde{b}=\begin{bmatrix} 1\\ 0 \\ \vdots \\ 0 \end{bmatrix}. \end{aligned}$$
(5.10)

for some tuple \((a_1,\dots ,a_n)\). Then, consider the pair \((\bar{A},\bar{b})\) given by

$$\begin{aligned} \bar{A}=\begin{bmatrix} -\bar{a}_1 &{} -\bar{a}_2 &{} \cdots &{} -\bar{a}_n \\ 1 &{} 0 &{} \dots &{} 0\\ 0 &{} \ddots &{} &{}\vdots \\ 0 &{} 0 &{} 1 &{} 0 \end{bmatrix},\quad \bar{b}=\begin{bmatrix} 1\\ 0 \\ \vdots \\ 0 \end{bmatrix}. \end{aligned}$$

A pair of the form \((\bar{A},\bar{b})\) is controllable for any tuple \((\bar{a}_1,\bar{a}_2,\dots ,\bar{a}_n)\) and is said to be in the control canonical form. Hence, as any controllable pair can be transformed as in (5.10), there must exist an invertible linear transformation S such that any controllable pair (A, b) can be brought into this control canonical form [67, Chap. 3]. To that end, assume that \((\bar{A},\bar{b})\) is a controllable pair in canonical form and let \(\bar{k}\in \mathbb {R}^{1\times n}\), then, by employing the feedback \(\mu (x)=\bar{k}x\) one finds that the characteristic polynomial of \(\bar{A}+\bar{b}\bar{k}\) is of the form \(\lambda ^n+(\bar{a}_1-\bar{k}_1)\lambda ^{n-1}+\cdots + (\bar{a}_n-\bar{k}_n)\). Therefore, one can “place” the eigenvalues of \(\bar{A}+\bar{b}\bar{k}\) anywhere desired by an appropriate selection of \(\bar{k}\). Concluding, we have shown that for any controllable linear dynamical system \(\Sigma ^L\) there is a tuple \((K_0,u_0,\bar{k},S^{-1})\) such that the linear feedback \(\mu (x)=Kx=(K_0+u_0\bar{k}S^{-1})x\) globally asymptotically stabilizes \(x^{\star }=0\).
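A numerical companion to this example (with an illustrative pair (A, B)) checks the Kalman rank condition and performs pole placement; note that scipy.signal.place_poles uses the convention u = -Kx, i.e., it places the spectrum of A - BK.

```python
# Numerical companion (illustrative A, B): Kalman rank test and pole placement.
import numpy as np
from scipy.signal import place_poles

A = np.array([[0.0, 1.0], [2.0, -1.0]])
B = np.array([[0.0], [1.0]])

ctrb = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(A.shape[0])])
print(np.linalg.matrix_rank(ctrb))                   # = n = 2, so (A, B) is controllable

K = place_poles(A, B, [-1.0, -2.0]).gain_matrix      # scipy convention: u = -Kx
print(np.linalg.eigvals(A - B @ K))                  # approximately {-1, -2}
```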

In general, however, controllability as is does not provide any information on how the input should be selected, e.g., the input space might contain any measurable function. To that end, we highlight a variety of continuous stabilization paradigms with respect to a continuous control system \(\Sigma =(\textsf{M},\mathcal {U},F)\).

  1. (i)

    Static feedback: Control laws are continuous sections \(\sigma :\textsf{M}\rightarrow \mathcal {U}\).

  2. (ii)

    Dynamic feedback: This is most easily described in coordinates by passing from static feedback in u, i.e., \(\dot{x}=f(x,u)\) to dynamic feedback in u and v, i.e., \(\dot{x}=f(x,u)\), \(\dot{z}=v\). Here, both inputs can depend on x and the auxiliary state-variable z. In the context of input-output systems one commonly encounters the more restrictive form \(\dot{x}=f(x,z)\), \(\dot{z}=v\).

  3. (iii)

    Time-varying controls: Similar to dynamic feedback, one can describe a time-varying vector field by lifting the state space, that is, by writing \(\dot{x}=f(t,x)\) as \(\dot{x}=f(s,x)\), \(\dot{s}=1\). In this case, the time variable is understood to live in a subset of \(\mathbb {R}\). If instead the time variable lives on \(\mathbb {S}^1\) the system is periodic, e.g., \(\dot{s}=-Js\) for \(J\in \textsf{Sp}(2,\mathbb {R})\) (the Symplectic group) corresponding to a clockwise rotation. Crucially, the introduction of the auxiliary variable s will require a different analysis as \(\dot{s}\ne 0\) and solutions generally diverge to \(+\infty \), see Example 6.4.

As highlighted in the introduction, in what follows we focus on continuous feedback laws due to implementation and robustness considerations, but also to investigate the limitations of this common assumption. As dynamic and time-varying controls can be modelled by means of vector fields on extended state spaces, this is a good initial vantage point. However, keeping these distinctive classes in mind is important. For example, Coron and Praly showed that there are systems that cannot be stabilized by static feedback while a stabilizing dynamic controller exists [19]. This result relates to a common observation in topology: extending state spaces can enable continuity.Footnote 9 To give a simplified example on \(\mathbb {R}^2\), for \(f(x_1,x_2)=(x_1,-x_2)\) one has \(\textrm{ind}_0(f)=-1\ne (-1)^n\), while for \(\bar{f}(x_1,x_2,x_3)=(x_1,-x_2,x_3)\) one has \(\textrm{ind}_0(\bar{f})=-1=(-1)^{n+1}\) cf. Example 3.4. Moreover, Coron showed that stabilizing time-varying controllers are significantly richer in the sense that they usually exist [16, 17]. We have more to say about this in the upcoming two sections. See also [68] for more on the application of dynamic feedback. Note, in all of the aforementioned we focus on state-feedback; meaningful extensions to non-trivial outputs are still open. In the past the focus was mostly on local stabilization of equilibrium points, i.e., a controller induces a well-behaved closed-loop system on a neighbourhood of the equilibrium point. We would like to understand what could happen outside of this operating region. Exactly then, the global topology of the underlying space plays a critical role.

Example 5.9

(Control-Lyapunov functions) Assume we work with an input-affine control system of the form (5.8) and let \(x^{\star }=0\) be the point to be stabilized under a feedback \(\mu \) that satisfies \(\mu (x^{\star })=0\). If we can find a smooth and proper function \(V:\mathbb {R}^n\rightarrow \mathbb {R}_{\ge 0}\) that only vanishes at 0 and is such that for all \(x\in \mathbb {R}^n\setminus \{0\}\) there is a \(u\in \mathbb {R}^m\) such that \(\langle \partial _x V(x), f(x)+\sum _i g_i(x)u_i\rangle < 0\), then V is a control-Lyapunov function (CLF) [62, Lemma 5.7.4]. Such a construction is particularly interesting when the inputs are known to be constrained. Although controllability does not imply the existence of a stabilizing continuous feedback, when V is a CLF and additionally satisfies the small control property,Footnote 10 the work by Artstein [7] and Sontag [60, 61] shows that an explicit continuous stabilizing feedback law can be found. As will be shown, the existence of CLFs is thereby easily obstructed based on topological grounds. See also Sect. 7.3.
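For completeness, a sketch of Sontag's universal formula in the scalar-input case is given below; the system \(\dot{x}=x^3+u\) and the CLF \(V(x)=x^2/2\) are illustrative choices of ours. Whenever \(b=\langle \partial _x V, g\rangle \ne 0\) the formula returns a control with \(\dot{V}=-\sqrt{a^2+b^4}<0\).

```python
# Sketch of Sontag's universal formula (scalar input); system and CLF are illustrative.
import numpy as np

f = lambda x: np.array([x[0]**3])           # drift
g = lambda x: np.array([1.0])               # input vector field
dV = lambda x: np.array([x[0]])             # gradient of the CLF V(x) = x^2 / 2

def sontag(x):
    a = float(dV(x) @ f(x))                 # <dV, f>
    b = float(dV(x) @ g(x))                 # <dV, g>
    if b == 0.0:
        return 0.0
    return -(a + np.sqrt(a**2 + b**4)) / b  # continuous away from the origin

x = np.array([0.5])
u = sontag(x)
print(u, float(dV(x) @ (f(x) + g(x) * u)))  # second value equals -sqrt(a^2 + b^4) < 0
```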

Remark 5.1

(From stability to continuous stabilization) As continuity is preserved under compositions, a common approach to provide topological obstructions to continuous stabilization is as follows. Assume one can show that there does not exist any continuous vector field \(X\in \mathfrak {X}^{r\ge 0}(\textsf{M})\) that satisfies property \(\textrm{P}\). In its turn, this result directly implies that for any control system \(\Sigma =(\textsf{M},\mathcal {U},F)\) in the sense of Definition 5.5, there is no continuous feedback \(\mu :\textsf{M}\rightarrow \mathcal {U}\) such that the closed-loop vector field \(F\circ \mu \) satisfies property \(\textrm{P}\), e.g., there is no continuous feedback law on \(\textsf{M}\) that can achieve this type of stabilization.

For more on dynamical systems, see [8, 9, 31, 33, 55, 56, 59], for more on Lyapunov theory, see [9, 36, 62] and for more on geometric control theory, see [3, 11, 12, 18, 29, 53, 57]. See [3, Sect. 8] for a discussion on the topology of attainable sets, also called reachable- or accessible sets. Although we focus on smoothness and continuity, in part due to converse Lyapunov theorems, see [63] for an overview of how real-analyticity presents itself as an important setting for the study of control theory. We also remark that the work by Lur’e and Postnikov [47] was one of the first that—amongst other things—linked Lyapunov stability theory to control theory, whereas the book (and invariance principles) by Lefschetz and LaSalle [40] was key in promoting the concept.