Abstract
As outlined in Chap. 1, the behavior of nonlinear systems is substantially richer than that of linear systems. To deal with them there is a set of techniques, each best suited to analyse particular aspects or particular classes of nonlinear systems. We target systems that are stable about an equilibrium point and that depend continuously on the input signal.
1 Introduction
As outlined in Chap. 1, the behavior of nonlinear systems is substantially richer than that of linear systems. To deal with them there is a set of techniques, each best suited to analyse particular aspects or particular classes of nonlinear systems. We target systems that are stable about an equilibrium point and that depend continuously on the input signal.
Before analysing this class of systems in more detail, we give a short overview, mostly by way of examples, of systems described by nonlinear ordinary differential equations of the form
with \(I\subset {\mathbb {R}}\), \(X\subset {\mathbb {R}}^n\) and initial conditions
We limit ourselves to the aspects that are helpful in better framing the concept of weakly nonlinear systems.
A first important difference compared to systems described by linear differential equations with constant coefficients is the fact that a solution may not exist for all \(t > 0\) or may not be unique.
Example 9.1: IVP with many Solutions
Consider the following initial value problem (IVP)
If \(y_0>0\) then the equation can be solved by the method of separation of variables, and we obtain the unique solution
If \(y_0=0\) then \(y(t) = 0\) is a solution. However, it is not the only one. For any constant \(c>0\) the function
is also a solution as one easily verifies by inserting it in the equation.
For \(y_0<0\) we can again use the method of separation of variables to find the solution
However, since the function \(1/\sqrt{|y|}\) is not continuous at \(y=0\) (indeed not even defined there), this solution is only valid as long as \(y(t)<0\). When \(y(t)\) reaches zero the equation can again be satisfied by multiple solutions
Therefore, for some initial conditions the equation has uncountably many solutions (Fig. 9.1).
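The displayed equation of this example is not reproduced in this excerpt; a canonical initial value problem exhibiting exactly this kind of non-uniqueness is \(Dy = \sqrt{|y|}\), \(y(0)=0\), which we take here as an illustrative assumption. Besides the zero solution, every "delayed parabola" that stays at zero until some time \(c>0\) also solves the equation, as the following sketch verifies numerically:

```python
import math

def f(y):
    # right-hand side of the assumed canonical IVP y' = sqrt(|y|)
    return math.sqrt(abs(y))

def y_branch(t, c):
    # stays at the equilibrium 0 until t = c, then follows a parabola
    return 0.0 if t <= c else ((t - c) / 2.0) ** 2

def dy_branch(t, c):
    # exact derivative of y_branch
    return 0.0 if t <= c else (t - c) / 2.0

# both the zero solution and every branch solution satisfy the ODE,
# all with the same initial condition y(0) = 0
for c in (0.5, 1.0, 2.0):
    for t in (0.0, 0.4, 1.5, 3.0):
        assert abs(dy_branch(t, c) - f(y_branch(t, c))) < 1e-12
assert f(0.0) == 0.0  # y ≡ 0 is a solution too
```

The failure of uniqueness is exactly the failure of the Lipschitz condition of the next paragraph: \(\sqrt{|y|}\) has unbounded difference quotients near \(y=0\).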
From the above example we see that continuity of f is not enough to guarantee the existence of a unique solution of the initial value problem (9.1a). To guarantee uniqueness of a solution the function f(t, y) must be more regular with respect to y.
Let \(I\subset {\mathbb {R}}\) and \(X\subset {\mathbb {R}}^n\). A function \(f\in C(I\times X,{\mathbb {R}}^n)\) is called locally Lipschitz continuous in x if every point \((t_0,y_0) \in I\times X\) has a neighborhood \(U\times V\) such that, for some constant \(M>0\)
If the function f(t, y) in (9.1b) is continuous in t and locally Lipschitz continuous in y, then the Picard–Lindelöf theorem guarantees the existence and uniqueness of the solution of the initial value problem (9.1a) [23].
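The Picard–Lindelöf theorem is constructive: the unique solution is the limit of the Picard iterates \(y_{n+1}(t) = y_0 + \int_{t_0}^t f(s, y_n(s))\,ds\), which converge under the Lipschitz condition. A minimal numerical sketch (the grid, the trapezoidal quadrature, and the test equation \(y' = y\) are illustrative choices, not from the chapter):

```python
import math

def picard(f, y0, t_grid, iterations):
    """Picard iteration y_{n+1}(t) = y0 + integral of f(s, y_n(s))."""
    y = [y0] * len(t_grid)                 # y_0(t) = y0, the constant guess
    for _ in range(iterations):
        new = [y0]
        acc = 0.0
        for i in range(1, len(t_grid)):
            dt = t_grid[i] - t_grid[i - 1]
            # trapezoidal quadrature of the previous iterate
            acc += 0.5 * dt * (f(t_grid[i - 1], y[i - 1]) + f(t_grid[i], y[i]))
            new.append(y0 + acc)
        y = new
    return y

ts = [i * 0.01 for i in range(101)]        # grid on [0, 1]
approx = picard(lambda t, y: y, 1.0, ts, 20)
assert abs(approx[-1] - math.e) < 1e-3     # y' = y, y(0) = 1  =>  y(1) = e
```

For \(f(t,y)=y\) the nth iterate is the degree-n Taylor polynomial of \(e^t\), so convergence is very fast; the remaining error is that of the quadrature.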
If the function f doesn’t depend explicitly on time, then the system is time invariant and the system equation becomes
A solution of the equation for which \(Dy = 0\) is called an equilibrium point of the system. When one investigates the stability of an equilibrium point \(y_e\) one can always assume it to be at the origin. In fact, by the change of variable \(u = y - y_e\) one can always transform the system differential equation in one whose equilibrium point of interest is \(u_e=0\)
An equilibrium point is stable if for each \(c > 0\) one can find an \(\epsilon > 0\) such that
It is asymptotically stable if it is stable and in addition \(\epsilon \) can be chosen such that
The set of all points \(y(t_0)\) such that \(\Vert y(t)\Vert \) converges to zero as t tends to infinity is called the domain of attraction of the equilibrium point. If an equilibrium point is not stable it is called unstable.
As already highlighted in Chap. 1, an important difference of time invariant nonlinear systems compared to LTI ones is the possibility of the existence of multiple isolated equilibrium points.
Example 9.2
Consider the system described by the following differential equation
with a and c positive constants. From
we see that the system has two equilibrium points:
We are interested in the dynamics of the system starting from the initial condition \(y(0) = y_0\), assuming that \(y_0\) doesn’t coincide with an equilibrium point. Since the function \(f(y) = -a y + c y^2\) is locally Lipschitz continuous, there is a unique solution and this solution doesn’t intersect the equilibrium points. The initial value problem can therefore be solved by separating the variables and integrating
The solution is found to be
If \(y_0\) is negative or \(0 < y_0c/a < 1\) the solution converges toward zero which therefore is an asymptotically stable equilibrium point (see Fig. 9.2). If \(y_0c/a > 1\) the solution diverges and reaches infinity in the finite time
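The displayed solution is elided in this excerpt; from \(Dy = -ay + cy^2\), separation of variables gives a closed form consistent with the described behavior, \(y(t) = a y_0 e^{-at}/\bigl(a - c y_0(1-e^{-at})\bigr)\), with escape time \(t^* = \tfrac{1}{a}\ln\tfrac{c y_0}{c y_0 - a}\) when \(y_0 c/a > 1\) (a derivation made here, to be checked against the chapter's formula). A numerical cross-check with a self-contained Runge–Kutta integrator:

```python
import math

def rk4(f, y0, t_end, n):
    """Classic fourth-order Runge-Kutta integration of y' = f(y)."""
    h, y = t_end / n, y0
    for _ in range(n):
        k1 = f(y)
        k2 = f(y + 0.5 * h * k1)
        k3 = f(y + 0.5 * h * k2)
        k4 = f(y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return y

a, c = 1.0, 1.0
f = lambda y: -a * y + c * y * y

# stable case: y0 c / a < 1, the solution decays to the equilibrium at 0
y0 = 0.5
exact = lambda t: a * y0 * math.exp(-a * t) / (a - c * y0 * (1 - math.exp(-a * t)))
assert abs(rk4(f, y0, 2.0, 2000) - exact(2.0)) < 1e-8

# unstable case: y0 c / a > 1, finite escape time t* = (1/a) ln(c y0 / (c y0 - a))
y0 = 2.0
t_escape = math.log(c * y0 / (c * y0 - a)) / a   # = ln 2 for these values
assert rk4(f, y0, 0.95 * t_escape, 4000) > 20.0  # already blowing up near t*
```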
From the above example we see that a nonlinear system can have multiple equilibrium points, some of which can be stable and some unstable. For a system to remain stable around a stable equilibrium point, the initial condition may have to remain within a limited region around that point. Also, solutions starting near unstable equilibrium points can diverge faster than exponentially and reach infinity in finite time (finite escape time).
One of the most useful tools in the study of the stability of equilibrium points is the Lyapunov stability theory [24]. In particular, Lyapunov’s linearization (or indirect) method states that
-
If the linear approximation of the system about an equilibrium point is asymptotically stable then, in a neighborhood U of the equilibrium point, the (nonlinear) system is asymptotically stable. The largest neighborhood U is the domain of attraction of the equilibrium point.
-
If the linear approximation of the system about an equilibrium point is unstable, then the (nonlinear) system is unstable.
If the linear approximation of the system is neither asymptotically stable nor unstable then this method is inconclusive and one must turn to other methods, for example, Lyapunov’s direct method [24].
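For smooth systems the indirect method amounts to inspecting the real parts of the eigenvalues of the Jacobian at the equilibrium. A small sketch for \(2\times 2\) linearizations; the damped pendulum \(\dot x_1 = x_2\), \(\dot x_2 = -\sin x_1 - b x_2\) is an illustrative choice, not an example from the chapter:

```python
import cmath

def eig2(J):
    """Eigenvalues of a 2x2 matrix from trace and determinant."""
    (a, b), (c, d) = J
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

def classify(J):
    """Lyapunov's indirect method applied to the linearization J."""
    re = [lam.real for lam in eig2(J)]
    if max(re) < 0:
        return "asymptotically stable"
    if max(re) > 0:
        return "unstable"
    return "inconclusive"       # eigenvalues on the imaginary axis

b = 0.2  # damping coefficient (illustrative value)
# Jacobians of the damped pendulum at its two equilibria:
assert classify([[0, 1], [-1, -b]]) == "asymptotically stable"  # about (0, 0)
assert classify([[0, 1], [1, -b]]) == "unstable"                # about (pi, 0)
# undamped case: purely imaginary eigenvalues, the method says nothing
assert classify([[0, 1], [-1, 0]]) == "inconclusive"
```

The "inconclusive" branch is exactly the situation of Example 9.3 below, where one must resort to other methods such as Lyapunov's direct method.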
Example 9.3
Consider the initial value problem described by the differential equation
with c a constant, and the initial condition
The only equilibrium point of the equation is the zero solution \(y_e(t) = 0\). As is immediately seen, the linearized equation is stable, but not asymptotically stable, about the equilibrium point.
The nonlinear equation can be solved by the method of separation of variables
Performing the integrations and solving for y we find
If \(c > 0\) the solution diverges and reaches infinity at
If \(c < 0\) the equation is asymptotically stable for any value of the initial value \(y_0\) (see Fig. 9.3).
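The displayed equation is elided in this excerpt; an equation matching every described property (linearization stable but not asymptotically stable, finite escape time for \(c>0\), asymptotic stability for every \(y_0\) when \(c<0\)) is \(Dy = c\,y^3\), whose solution by separation of variables is \(y(t) = y_0/\sqrt{1 - 2 c y_0^2 t}\). Taking this as an assumption, a quick numerical check:

```python
import math

def y(t, y0, c):
    # assumed closed-form solution of y' = c*y**3, y(0) = y0,
    # valid while the radicand stays positive
    return y0 / math.sqrt(1 - 2 * c * y0 * y0 * t)

def check_ode(t, y0, c, h=1e-6):
    # central finite difference of the closed form vs. the right-hand side
    lhs = (y(t + h, y0, c) - y(t - h, y0, c)) / (2 * h)
    rhs = c * y(t, y0, c) ** 3
    return abs(lhs - rhs) < 1e-6

assert all(check_ode(t, 1.0, -0.5) for t in (0.0, 1.0, 5.0))

# c > 0: finite escape time t* = 1 / (2 c y0^2)
c, y0 = 0.5, 1.0
t_star = 1 / (2 * c * y0 ** 2)               # = 1.0 for these values
assert y(0.999 * t_star, y0, c) > 20         # diverging as t -> t*

# c < 0: |y(t)| decays toward 0 for any y0 (asymptotic stability)
assert abs(y(100.0, 3.0, -0.5)) < 0.15
```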
Contrary to what the above examples may suggest, most nonlinear differential equations can’t be solved analytically. We are therefore interested in methods to find approximate solutions around asymptotically stable equilibrium points, in the spirit of a perturbation theory. Weakly nonlinear systems are a class of systems for which such a method exists and the solution is obtained in the form of a functional series.
Informally weakly nonlinear systems can be described as systems operated around an asymptotically stable equilibrium point and whose response depends continuously on the input signal x. They include systems described by a differential equation of the form
with C a linear function and f a function that, within the excursion range of interest of y, can be approximated to any desired accuracy by a Taylor expansion. Note that polynomials are locally Lipschitz continuous. For this reason weakly nonlinear systems are well-behaved and produce a well-defined and unique output response.
2 Graded Algebra of Test Functions
In the previous section we illustrated some aspects of weakly nonlinear systems based on examples of systems described by nonlinear differential equations. We now look for a description based on distributions. We’ll see that this allows reducing the problem of solving some classes of nonlinear differential equations to an essentially algebraic problem. However, before discussing systems, we need some preparation that we provide in this and the next section.
Let \(V_k, k \in {\mathbb {N}}\) be vector spaces on \({\mathbb {C}}\) such that \(V_k \cap V_j = \{0\}\) for \(k \ne j\). The direct sum
is the vector space whose elements are the sequences \((x_k)\) in \(\bigcup _{k=0}^\infty V_k\) with \(x_k \in V_k\) and \(x_k = 0\) for all but finitely many k. That is, it is the set of all finite sequences with \(x_k \in V_k\). The vector space structure of V is defined by the following addition and multiplication with scalars
Each \(V_k\) is evidently a sub-vector space of V.
If furthermore V is provided with a multiplication
such that it forms an algebra and in addition
then it is called a graded algebra.
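A concrete finite-dimensional model of a graded algebra is the polynomial algebra graded by degree. The sketch below stores an element of the direct sum as a map from grade to component (a finite sequence) and implements the defining rule \(V_k \cdot V_j \subset V_{k+j}\):

```python
# A minimal model of a graded algebra: polynomials in one variable,
# graded by degree and stored as {grade: coefficient} with finitely
# many nonzero entries -- a "finite sequence", as in the direct sum.

def add(u, v):
    """Componentwise addition in the direct sum."""
    w = dict(u)
    for k, c in v.items():
        w[k] = w.get(k, 0) + c
    return {k: c for k, c in w.items() if c != 0}

def mul(u, v):
    """Graded product: the grade of a product is the sum of the grades."""
    w = {}
    for k, a in u.items():
        for j, b in v.items():
            w[k + j] = w.get(k + j, 0) + a * b
    return w

u = {0: 1, 2: 3}        # 1 + 3x^2 : components in V_0 and V_2
v = {1: 2}              # 2x       : a component in V_1
assert mul(u, v) == {1: 2, 3: 6}          # grades add: 2x + 6x^3
assert add(u, v) == {0: 1, 1: 2, 2: 3}
```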
Let \(V_k = {\mathcal {D}}({\mathbb {R}}^k)\) be the vector space of test functions on \({\mathbb {R}}^k\) with \(V_0 = {\mathbb {C}}\). Then
with the tensor product as multiplication
is a graded algebra that we call the graded algebra of test functions. We write elements of \({\mathcal {D}}_{\oplus }\) as sums with indices denoting the grade of the element
In the graded algebra of test functions we define the following convergence criterion. A sequence \((\phi _m), \phi _m\in {\mathcal {D}}_{\oplus }\) with
converges to zero if
1.
There exist compact sets \(K_j\subset {\mathbb {R}}^j\), \(j=1,\ldots ,N\) with \(N = \max _{m \in {\mathbb {N}}}(N_m)\) such that for each j and m
$$\begin{aligned} \text {supp}(\phi _{j,m}) \subset K_j\,. \end{aligned}$$
2.
For every \(j>0\) and every j-tuple \(k \in {\mathbb {N}}^j\) the sequence \((D^k\phi _{j,m})_{m\in {\mathbb {N}}}\) converges uniformly to zero. For \(j=0\) the sequence of numbers \((\phi _{0,m})_{m\in {\mathbb {N}}}\) converges to zero.
3 Direct Product of Distributions
The direct product V of vector spaces \(V_k\) on \({\mathbb {C}}\) is the vector space whose elements are the sequences \((x_k)\) with \(x_k\in V_k, k\in {\mathbb {N}}\). The vector space structure is defined as for the direct sum by (9.4). It is denoted by
The key difference from the direct sum is that, in a direct product, the sequence does not have to be finite.
Let \(V_k={\mathcal {D}}'({\mathbb {R}}^k)\), with \(V_0={\mathbb {C}}\). Then the direct product
is the set of linear continuous functionals on \({\mathcal {D}}_{\oplus }\) defined by
with
Since \(\phi \) only has a finite number of terms different from zero, \(\langle h,\phi \rangle \) is well-defined. As \({\mathcal {D}}'({\mathbb {R}}^k) \cap {\mathcal {D}}'({\mathbb {R}}^j) = \{0\}\) for \(k \ne j\), here and in the following we denote elements of \({\mathcal {D}}_{\oplus }'\) by sums in a similar way as we do for elements of \({\mathcal {D}}_{\oplus }\).
Continuity in \({\mathcal {D}}'_\oplus \) is defined by the convergence that we defined for \({\mathcal {D}}_\oplus \) and follows from the continuity of distributions. Since \({\mathcal {D}}'_\oplus \) is a vector space, it’s enough to verify continuity at the origin. Let \(h \in {\mathcal {D}}'_\oplus \) and \(\phi \in {\mathcal {D}}_{\oplus }\), then there exists an \(N \in {\mathbb {N}}\) such that
and according to our definition of convergence, when \(\phi \) converges to zero, so does \(\sup _j|\langle h_j,\phi _j \rangle |\) and hence \(\langle h,\phi \rangle \).
In Sect. 3.1 we have introduced the tensor product of distributions and have seen that it is well defined between any pair of distributions. With it we can define a product \(g \cdot h\) between elements g and h of \({\mathcal {D}}'_\oplus \). Its kth component is defined by
with \(g_j\) and \(h_j\) the jth components of g and h respectively. With this product \(({\mathcal {D}}'_\oplus , +, \cdot )\) becomes an algebra. As is common practice, we will often denote \(g \cdot h\) simply by gh. Being based on an associative operation (the tensor product), the product that we just defined is associative.
Note the close similarity between the algebra of formal power series and the one that we have defined for \({\mathcal {D}}'_\oplus \). In both cases addition is defined componentwise and the product has the form of a convolution.
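The analogy can be made concrete: truncated to the first N components, the product on \({\mathcal {D}}'_\oplus\) has exactly the shape of the Cauchy product of power series coefficients, where the kth component collects all pairs whose grades sum to k:

```python
def cauchy(g, h):
    """Cauchy (convolution) product of two coefficient sequences,
    truncated to the shorter length: the k-th output component sums
    all products g[j] * h[k - j] whose indices add up to k."""
    n = min(len(g), len(h))
    return [sum(g[j] * h[k - j] for j in range(k + 1)) for k in range(n)]

geom = [1] * 6                        # 1/(1 - x) = 1 + x + x^2 + ...
square = cauchy(geom, geom)           # 1/(1 - x)^2
assert square == [1, 2, 3, 4, 5, 6]   # coefficient of x^k is k + 1
```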
4 Symmetric Distributions
Let \(\mathsf {S_{k}}\) denote the set of all permutations of \(\{1,\ldots ,k\}\). A distribution \(h_k\in {\mathcal {D}}'({\mathbb {R}}^k)\) is symmetric if
for all permutations \(\sigma \in \mathsf {S_{k}}\) and every \(\phi \in {\mathcal {D}}({\mathbb {R}}^k)\). Symmetric distributions are fully characterized by symmetric test functions for
and the sum of test functions on the right-hand side is a symmetric test function. The sum of symmetric distributions is a symmetric distribution. Therefore, they form a vector subspace of distributions that we denote by \({\mathcal {D}}'_\text {sym}({\mathbb {R}}^k)\). Similarly, we denote the vector subspace of all symmetric test functions on \({\mathbb {R}}^k\) by \({\mathcal {D}}_\text {sym}({\mathbb {R}}^k)\), the direct sum of symmetric test functions by \({\mathcal {D}}_{\oplus ,\text {sym}}\) and the direct product of symmetric distributions by \({\mathcal {D}}'_{\oplus ,\text {sym}}\).
A symmetric distribution can be constructed from an arbitrary distribution \(f \in {\mathcal {D}}'({\mathbb {R}}^k)\) by averaging over all permutations of the independent variables
with
Such an operation is called symmetrisation.
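For regular kernels, symmetrisation is simply the average over all permutations of the arguments; the following sketch mirrors the distributional definition (the kernel \(t_1 t_2^2\) is an arbitrary illustrative choice):

```python
from itertools import permutations
from math import factorial

def symmetrize(f, k):
    """Average a function of k variables over all k! permutations
    of its arguments."""
    def f_sym(*args):
        assert len(args) == k
        total = sum(f(*(args[i] for i in sigma))
                    for sigma in permutations(range(k)))
        return total / factorial(k)
    return f_sym

# a non-symmetric kernel of two variables and its symmetrisation
h = lambda t1, t2: t1 * t2 ** 2
h_sym = symmetrize(h, 2)
assert h_sym(1.0, 2.0) == h_sym(2.0, 1.0)        # now symmetric
assert h_sym(1.0, 2.0) == (1 * 4 + 2 * 1) / 2    # (t1 t2^2 + t2 t1^2) / 2
```

Applying `symmetrize` to an already symmetric function returns it unchanged, matching the projection property of the symmetrisation operator.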
The tensor product is a bilinear operation. Therefore, the power of an element of \({\mathcal {D}}'_\oplus \) composed of a finite number of distributions \(f_j \in {\mathcal {D}}'({\mathbb {R}}^{n_j})\), \(n_j \ge 1\), \(j=1,\ldots ,m\), \(m \ge 2\) can be expressed as a sum of tensor products
with the sum ranging over all possible combinations of the indexes \(j_1,\ldots ,j_k\). If the distributions \(f_1,\ldots ,f_m\) are symmetric then one can reorder the indexes \(j_1,\ldots ,j_k\) by any permutation \(\sigma \) without changing the value of the sum. Hence, the tensor products on the right-hand side can be replaced by symmetrized products
Inside the symmetrisation operator, the tensor product of symmetric distributions acts as a commutative operation. For this reason the sum includes summands that are equal and, by grouping them, we obtain an expression that is similar to the multinomial formula [21]
with \(\alpha \) an m-tuple in \({\mathbb {N}}^m\),
and where we made use of the multi-index notation introduced in Sect. 4.6.
In general the product that we defined on \({\mathcal {D}}'_{\oplus }\) applied to two elements of \({\mathcal {D}}'_{\oplus ,\text {sym}}\) does not result in an element of \({\mathcal {D}}'_{\oplus ,\text {sym}}\). This can be remedied by symmetrizing the product
Unless explicitly stated otherwise, when working in \({\mathcal {D}}'_{\oplus ,\text {sym}}\) we will always assume the use of this symmetrized product.
The last property of symmetric distributions that we want to mention is the fact that, in a convolution algebra, the inverse of a symmetric distribution is symmetric, for
5 Weakly Nonlinear Systems
We are looking for a representation, in the spirit of a perturbation theory, of a class of nonlinear systems including the ones described by differential equations of the form
with \(x\in {\mathcal {D}}'({\mathbb {R}})\) a given input signal, L a linear differential operator with constant coefficients
and where we assume that the linearized system is stable.
In Chap. 7 we saw that, in the language of distributions, a linear differential equation with constant coefficients becomes a convolution equation. If we want to apply the results obtained for convolution equations, we need to give a meaning to the nonlinear terms appearing in the above equation.
In general, it’s not possible to define a multiplication valid for arbitrary distributions. Therefore, the terms \(y^k, k > 1\) can’t be assumed to belong to \({\mathcal {D}}'({\mathbb {R}})\). To work around this problem we can assume y to belong to a direct product of distributions, \(y=(y_0, y_1, y_2, \dotsc )\), and use the product defined on that space. Since the product between functions with values in \({\mathbb {C}}\) is commutative, \(f \cdot g = g \cdot f\), we require y to belong to the direct product of symmetric distributions \({\mathcal {D}}'_{\oplus ,\text {sym}}\). Then, if \(y_1\) is the solution of the linearized equation, its powers become tensor powers
If \(y_1\) is a regular distribution, that is, a locally integrable function, then we can recover the meaning of the powers in the differential equation by evaluating \(y_1^{\otimes k}\) on the diagonal
The same remains true if we replace \(y_1\) by a sum of distributions.
To complete the interpretation of the differential equation in the language of distributions, it remains to clarify the effect of the one-dimensional differential operator \(D\) appearing in (9.13) on the components \(y_k \in {\mathcal {D}}'_\text {sym}({\mathbb {R}}^k)\) of y. To this end, suppose \(y_k\) to be a regular distribution. Then it is a locally integrable function
and we can associate with it a function of the single variable t by defining an operation that we call “evaluating on the diagonal”
If we assume this function to be differentiable, then the derivative with respect to t is well-defined
and, as a distribution, can be represented by
This last expression is symmetric and is valid for arbitrary distributions. Therefore, we can take it as the definition of the effect of the differential operator \(D\) on distributions \(y_k\in {\mathcal {D}}'_\text {sym}({\mathbb {R}}^k)\). For \(y\in {\mathcal {D}}'_{\oplus ,\text {sym}}\) and any \(\phi \in {\mathcal {D}}_{\oplus }\), \(\langle y,\phi \rangle \) only has a finite number of terms different from zero. For this reason the effect of \(D\) on y can be defined as acting on each component individually.
For \(y\in {\mathcal {D}}'_{\oplus ,\text {sym}}\) to be a solution of (9.13) in a convolution algebra, the equation must be satisfied by each component \(y_k\) of y individually. If y has to be compatible with our assumption of the system being described around the zero equilibrium point, then the 0th component \(y_0\) must always be zero
In analogy with the theory of formal power series we call distributions \(y \in {\mathcal {D}}'_{\oplus }\) with \(y_0 = 0\) nonunits [25].
For \(k=1\) the only terms belonging to \({\mathcal {D}}'({\mathbb {R}})\) appearing in the equation are \(y_1\) and x. Hence, \(y_1\) is the solution of the linearized equation and, as discussed in Sect. 8.1, can be represented by
For \(k = 2\) we have
and we see that, for the computation of \(y_2\), the tensor power of \(y_1\) plays the role of an input signal applied to a linear system. Assuming that \(L\delta \) has an inverse, we obtain
The above expression can be further manipulated by noting that
or
With this expression and the solution found for \(y_1\) we can express \(y_2\) as
where raising to a tensor power is assumed to have higher priority than convolution.
From this it is not difficult to see that every component \(y_k\) can be expressed as the convolution of a distribution \(h_k\) specific to the problem and the input signal x raised to the tensor power of k
We are therefore led to define a weakly nonlinear (or analytic) time-invariant (WNTI) system as a system \({\mathcal {H}}\) whose behavior around the zero equilibrium point can be described by an element h of \({\mathcal {D}}'_{\oplus ,\text {sym}}\) such that, when driven by the input signal x, its output is given by
with \({\mathcal {A}}'_1\) a convolution algebra in \({\mathcal {D}}'({\mathbb {R}})\) and \({\mathcal {A}}'_k\) a convolution algebra in \({\mathcal {D}}'_\text {sym}({\mathbb {R}}^k)\) compatible with \({\mathcal {A}}'_1\) and the tensor product. This means that, if \(x \in {\mathcal {A}}'_1\), then \(x^{\otimes k}\) must be an element of \({\mathcal {A}}'_k\). We denote such a set of convolution algebras by \({\mathcal {A}}'_{\oplus ,\text {sym}}\). The distribution \(h_k\) is called the kth order impulse response (or kernel) of the system. A block diagram representation of a weakly nonlinear system is shown in Fig. 9.4. Note that, if the input signal is multiplied by a constant \(c\in {\mathbb {C}}\), \(y_k\) is scaled by a factor of \(c^k\)
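For regular, finitely supported kernels, the structure \(y_k = h_k * x^{\otimes k}\) and the \(c^k\) scaling can be illustrated in discrete time. All kernels and signals below are hypothetical toy data; the series is truncated at order two:

```python
def volterra2(h1, h2, x):
    """Discrete-time sketch of y = sum_k h_k * x^{(tensor) k}, truncated
    at order 2.  h1 is a first-order kernel, h2 a symmetric matrix kernel."""
    n = len(x)
    def xs(i):                      # input with zero-padding outside range
        return x[i] if 0 <= i < n else 0.0
    y1 = [sum(h1[i] * xs(t - i) for i in range(len(h1))) for t in range(n)]
    y2 = [sum(h2[i][j] * xs(t - i) * xs(t - j)
              for i in range(len(h2)) for j in range(len(h2)))
          for t in range(n)]
    return y1, y2

h1 = [1.0, 0.5]                     # hypothetical 1st-order kernel
h2 = [[0.2, 0.1], [0.1, 0.0]]       # hypothetical symmetric 2nd-order kernel
x = [1.0, -2.0, 3.0, 0.0]

y1, y2 = volterra2(h1, h2, x)
c = 3.0
y1c, y2c = volterra2(h1, h2, [c * v for v in x])
# homogeneity of the components: scaling the input by c scales y_k by c^k
assert all(abs(a - c * b) < 1e-12 for a, b in zip(y1c, y1))
assert all(abs(a - c * c * b) < 1e-12 for a, b in zip(y2c, y2))
```

This degree-k homogeneity of each component is what makes the output, for scaled inputs, a power series in c, as discussed next.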
The interpretation of the output in our definition of a weakly nonlinear system requires some comment, as it doesn’t always represent a quantity that can be interpreted as a signal depending on time. Under the assumption that all involved distributions belong to a convolution algebra, one can distinguish the following cases
-
If the impulse responses \(h_k\) as well as the input signal x are regular distributions and the convolutions \(h_k*x^{\otimes k}\) are well-defined (see Sect. 3.2) then all output components \(y_k\) are locally integrable functions. In this case we can evaluate the \(y_k\) on the diagonal
$$\begin{aligned} & \mathrm {ev_{d}}(y_k)(t) = \mathrm {ev_{d}}(h_k *x^{\otimes k})(t) = \nonumber \\ &\qquad \int \limits _{-\infty }^\infty \cdots \int \limits _{-\infty }^\infty h_k(\tau _1,\ldots ,\tau _k)\, x(t - \tau _1)\cdots x(t - \tau _k)\, d\tau _1 \cdots d\tau _k \end{aligned}$$ (9.17)
and obtain an interpretation for the \(y_k\) as signals of time. If the input signal is scaled by the constant c, then, at each time t, the output \(\mathrm {ev_{d}}(y)(t)\) is seen to be a power series in c
$$\begin{aligned} \mathrm {ev_{d}}(y)(t) :=\sum _{k=1}^\infty c^k \mathrm {ev_{d}}(h_k *x^{\otimes k})(t)\,. \end{aligned}$$
If this series has a convergence radius greater than zero valid at all times, then \(\mathrm {ev_{d}}(y)\) represents a well-defined function of time and we have a clear procedure to interpret the output of the system.
-
If some or all of the impulse responses \(h_k\) are not regular, there is still a class of input signals for which all \(y_k\) are regular distributions. (Remember that the convolution of any distribution with a test function is an indefinitely differentiable function.) The system restricted to this class of input signals may still be evaluated on the diagonal to obtain a function of time \(\mathrm {ev_{d}}(y)\) as in the previous case.
-
If there is no input signal (different from zero) for which a constant \(c > 0\) exists such that \(\mathrm {ev_{d}}(y)(t)\) remains finite at all times, then the system can’t be represented using an element of \({\mathcal {D}}'_{\oplus ,\text {sym}}\).
Example 9.4: Polynomial System
In this example we consider a class of systems whose impulse responses are not regular.
Suppose that the output of a system \({\mathcal {H}}\) is represented by a nonlinear function f of the input signal x and that the function f can be adequately approximated by a Taylor polynomial around the origin
It is readily seen that such a system can be represented by the impulse responses
The response of the system to the input signal x as represented by these impulse responses is
If the input signal is not a regular distribution, for example if it is a Dirac pulse, then neither the initial representation given by (9.18) nor the evaluation on the diagonal \(\mathrm {ev_{d}}(h[\delta ])\) has a meaning. In spite of this, the impulse responses and their outputs \(y_k\) are mathematically well-defined.
If the class of input signals is restricted to regular distributions then the output obtained from the representation in terms of impulse responses by evaluating on the diagonal \(\mathrm {ev_{d}}(h[x])\) agrees with the original one.
If f is analytic, then it can be represented by a power series (\(K\rightarrow \infty \)). In this case the output of the system is only well defined if the magnitude of the input signal \(|x(t) |\) remains smaller than the convergence radius of the series at all times.
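For the memoryless case the diagonal evaluation collapses to \(\mathrm {ev_{d}}(y_k)(t) = a_k\, x(t)^k\), and for an analytic f the truncated series converges only while \(|x(t)|\) stays inside the convergence radius. A minimal sketch, with \(f(x) = x/(1-x) = x + x^2 + \cdots\) as an illustrative choice (convergence radius 1):

```python
def memoryless_components(a_coeffs, xt):
    """Diagonal evaluation of the Volterra components of a memoryless
    system: the k-th component at time t is a_k * x(t)**k (k >= 1)."""
    return [a * xt ** (k + 1) for k, a in enumerate(a_coeffs)]

# f(x) = x / (1 - x) = x + x^2 + x^3 + ...  has all Taylor coefficients 1
K = 30
a_coeffs = [1.0] * K

xt = 0.4                                  # |x(t)| < 1: inside the radius
series = sum(memoryless_components(a_coeffs, xt))
assert abs(series - xt / (1 - xt)) < 1e-10   # truncation converges

xt = 1.5                                  # |x(t)| > 1: partial sums blow up
assert sum(memoryless_components(a_coeffs, xt)) > 1e4
```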
Let \(h_k\) be the kth order impulse responses of the weakly nonlinear system \({\mathcal {H}}\) and x its input signal. In Sect. 3.3 we saw that an arbitrary distribution can be approximated to any desired accuracy by a finite sum of Dirac pulses. Hence, x can be approximated by
and the output of \(h_k\) by
This expression suggests interpreting \(h_k\) as the portion of the system that defines how the response depends on combinations of k simultaneous points in time of the input signal.
In addition, if we compare the expression representing the output at time t of the (causal) impulse response \(h_k\)
with the one of a polynomial system (see Example 9.4)
we see that the output at time t of the latter depends only on the kth power of the current value of the input signal. In contrast, the output at time t of the former depends on all combinations of products of k (past) values of the input signal. The impulse responses \(h_k\) can thus be interpreted as the memory of the system. The given representation of weakly nonlinear systems can be seen as a generalization of the Taylor approximation method for memory-less systems to systems with memory. It is called the Volterra functional series in honor of V. Volterra who first proposed it [5].
6 Nonlinear Transfer Functions
All impulse responses \(h_k\) of a causal weakly nonlinear system must vanish if any argument \(\tau _j\) is less than zero. This is most easily seen if we consider the case where the impulse responses as well as the input signal x are regular distributions, for then
As every distribution is the limit of smooth functions, this must then be true for arbitrary distributions. The impulse responses of all orders of causal systems are therefore right-sided distributions.
The Laplace transform of the kth-order impulse response \(h_k\) is called the nonlinear transfer function of order k
Due to the symmetry of \(h_k\), it is a symmetric function of the variables \(s_1\) to \(s_k\)
As the Laplace transform converts convolution products into ordinary products, the Laplace transform of \(y_k=h_k*x^{\otimes k}\) is
Just as with LTI systems, the many useful properties of the Laplace transform make it a very valuable tool for solving the convolution equations describing weakly nonlinear systems. In particular, on top of converting convolution products into ordinary multiplications, the Laplace transforms of distributions are, in their region of convergence, holomorphic functions.
Consider a system described by a differential equation with constant coefficients of the type considered before
The part of the corresponding convolution equation relevant for the calculation of \(y_k\), \(k>1\), is
As the Laplace transform of \(D\delta ^{\otimes k}\) is
(see (9.14)), the Laplace transform of \(L\delta ^{\otimes k}\) is a polynomial in \(s_1+\cdots +s_k\)
Note that the coefficients of this polynomial are the same for all k, including \(k=1\). The only difference between the various values of k is in the argument. If we factor it, we see that the denominator of \(H_k\) adds to the denominators of the lower order transfer functions \(H_j, j=1,\ldots ,k-1\) terms of the form
with \(p_j\) the jth pole and \(l_j\) its multiplicity. If we assume \(H_k\) to be a proper rational function, then its partial fraction expansion will include terms of the form
and similar ones where some of the variables \(s_1,\ldots ,s_{k-1}\) may be missing. If, in the calculation of the inverse Laplace transform, we start by inverse transforming with respect to \(s_k\), we obtain the expression
By using the shifting property of the Laplace transform and denoting by f the inverse transform of F, the complete inverse transform of the above expression is
If \(H_k\) is not a proper rational function, then it can be decomposed into a polynomial and a proper rational function. The inverse Laplace transform of the polynomial part results in Dirac pulses and their derivatives.
This shows that, if the system under consideration can be described by a differential equation with constant coefficients of the indicated type, then, similarly to the first order impulse response \(h_1\), the higher order impulse responses are sums of Dirac pulses, their derivatives and products of polynomials and exponential functions in the variables \(\tau _1,\ldots ,\tau _k\). In addition, it also shows that, if the linear transfer function \(H_1(s_1)\) has all its poles in the left half of the complex plane, then not only does the regular part of \(h_1\) (that is, discarding the Dirac pulses and their derivatives) decay exponentially as its argument tends to infinity, but so also do the regular parts of all higher order impulse responses \(h_k\). In particular, we see that all impulse responses are summable distributions
In the following, unless explicitly stated otherwise, we are always going to assume the systems to be of this type.
Example 9.5
We revisit Example 9.2 and find an approximate solution of the initial value problem
valid around its zero equilibrium point.
As we saw, in translating an initial value problem into the language of distributions, the initial conditions become part of the equation which, in this case, reads
We can think of this equation as an equation describing a system driven by the input signal \(x = y_0\delta \). The solution of the equation y is an element of \({\mathcal {D}}'_{\oplus ,\text {sym}}\) and has the form
The system is therefore fully characterized if we find the impulse responses \(h_k\). The solution of the original problem is then found by multiplying each impulse response \(h_k\) by \(y_0^k\)
To find the impulse responses we apply the input signal \(x = \delta \) and insert \(y = h\) into the equation. The equation is solved if it is satisfied by each component \(h_k\) of h individually. The component \(h_k\) can be computed from the equation and the impulse responses of lower order \(h_j, j=1,\ldots ,k-1\).
To find \(h_1\) we retain only terms of the equation belonging to \({\mathcal {D}}'({\mathbb {R}})\)
If we Laplace transform the equation we obtain
from which we immediately obtain the first order transfer function
and, by inverse Laplace transformation, the first order impulse response
The second order impulse response \(h_2\) is found by retaining in the equation only terms belonging to \({\mathcal {D}}'({\mathbb {R}}^2)\)
From the Laplace transformed equation
we immediately obtain the second order nonlinear transfer function
Note that it’s often convenient to write higher-order transfer functions in terms of the first-order one. In this example
To obtain the second order impulse response we can inverse Laplace transform, first with respect to one Laplace variable, then with respect to the other one, and finally by symmetrizing the result. We first inverse transform with respect to \(s_2\) the expression
Assuming \(s_1\ne 0\) and expanding in partial fractions we find
We then combine this expression with the other factors of \(H_2\)
and inverse transform with respect to \(s_1\). This can be done by expanding in partial fractions the first factor
and by using the shifting property of the Laplace transform to find
Note that this expression is not symmetric and that if we had first inverse transformed with respect to \(s_1\) and then to \(s_2\), we would have obtained an expression with \(\tau _1\) and \(\tau _2\) exchanged.
The second-order impulse response is obtained from the above expression by symmetrisation
where we have suppressed the explicit Heaviside step functions with the understanding that the expression is zero if \(\tau _1 < 0\) or \(\tau _2 < 0\). As \(h_2\) is a regular distribution, it can be evaluated on the diagonal and we obtain
The third order impulse response \(h_3\) is found by retaining only elements belonging to \({\mathcal {D}}'({\mathbb {R}}^3)\) in the equation. As a first step we write
for no other term can produce distributions belonging to \({\mathcal {D}}'({\mathbb {R}}^3)\). The right-hand side can be expanded with the help of (9.10) and, retaining only the terms of interest, we obtain
The Laplace transformed equation is
and with it the third order nonlinear transfer function is readily obtained
By expressing \(H_2\) in terms of \(H_1\) we can write \(H_3\) in terms of \(H_1\) alone
The computation of the third order impulse response proceeds along the same lines as the computation of \(h_2\). After some algebraic manipulations and exploiting the properties of the Laplace transform we obtain a rather long expression whose evaluation on the diagonal is
At this point it’s interesting to compare the first three elements of the approximate solution that we computed here with the exact solution that we calculated in Example 9.2 and that we reproduce here for convenience
If \(|y_0 c/a | < 1\) the exact solution can be expanded in a geometric power series
from which we see that the lowest order terms correspond to the calculated response components \(y_1, y_2\) and \(y_3\). Note also that the convergence radius of the power series derived from the exact solution corresponds to the radius of the largest open ball, centered at the origin and contained in the domain of attraction of the equilibrium point
Figure 9.5 compares the exact solution of the initial value problem with the approximation given by \(\mathrm {ev_{d}}(y_1 + y_2 + y_3)\) for \(a=1, c=1/2, y_0=1\).
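The comparison in Fig. 9.5 can be reproduced numerically. The sketch below assumes the initial value problem of Example 9.5 is \(\dot{y} = -ay + cy^2\), \(y(0)=y_0\), and uses closed forms for \(y_1, y_2, y_3\) consistent with the transfer functions derived above; both the equation and the explicit formulas are reconstructions, not quotations from the text.

```python
import math

# Assumed IVP for Example 9.5 (a reconstruction, not quoted from the
# text): y' = -a*y + c*y**2, y(0) = y0, with H1(s) = 1/(s + a).
a, c, y0 = 1.0, 0.5, 1.0

def exact(t):
    # Closed-form solution of the assumed IVP; expanding the denominator
    # as a geometric series in (c*y0/a)*(1 - exp(-a*t)) yields y1 + y2 + ...
    e = math.exp(-a * t)
    return y0 * e / (1.0 - (c * y0 / a) * (1.0 - e))

def series3(t):
    # First three diagonal-evaluated components (forms consistent with
    # H2 = c*H1(s1)*H1(s2)*H1(s1+s2)).
    e = math.exp(-a * t)
    y1 = y0 * e
    y2 = (c * y0**2 / a) * e * (1.0 - e)
    y3 = (c * y0 / a)**2 * y0 * e * (1.0 - e)**2
    return y1 + y2 + y3

for t in (0.5, 1.0, 2.0):
    print(t, exact(t), series3(t))
```

With \(|y_0c/a| = 1/2\) the truncated series tracks the exact solution to within a few percent over the whole decay.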
While for this particular example it was easier to compute the exact solution than to calculate the approximation, the latter allows us to obtain the output of the system described by the differential equation
for any input signal \(x \in {\mathcal {D_+'}}({\mathbb {R}})\) that maintains the system within the region of attraction of the equilibrium point
with
Here and in many problems, this amounts to limiting the magnitude of the input signal to sufficiently small values. Figure 9.6 shows the approximate solution for a sinusoidal input \(x(t) = \textsf{1}_{+}(t)\sin (t)\) and compares it to the solution obtained by numerical integration of the differential equation for \(a=1, c=1/2\).
This example shows how, by representing the solution of a nonlinear differential equation describing a weakly nonlinear system by a sequence of distributions \(y\in {\mathcal {D}}'_{\oplus ,\text {sym}}\), we have reduced the problem of solving a nonlinear differential equation to an essentially algebraic one. While some expressions are rather long, they are easily manipulated by modern computer algebra systems (CAS).
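As an illustration of the CAS remark, the following SymPy sketch performs the two manipulations used repeatedly above: partial fraction expansion and inverse Laplace transformation of a product of first order factors, with \(1/(s+a)\) playing the role of the assumed \(H_1\).

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)
a = sp.symbols('a', positive=True)

# The bookkeeping a CAS automates: partial fraction expansion and inverse
# Laplace transformation of the rational factors appearing in the
# nonlinear transfer functions; 1/(s + a) plays the role of the assumed H1.
F = 1 / ((s + a) * (s + 2 * a))
pf = sp.apart(F, s)                   # 1/(a*(s + a)) - 1/(a*(s + 2*a))
h = sp.inverse_laplace_transform(F, s, t)
print(pf)
print(sp.simplify(h))                 # (exp(-a*t) - exp(-2*a*t))/a for t > 0
```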
Example 9.6
We revisit Example 9.3 and try to find an approximate solution in \({\mathcal {D}}'_{\oplus ,\text {sym}}\) of the initial value problem
valid around its zero equilibrium point. Note that the linearized equation is stable, but not asymptotically stable.
As before we calculate the impulse responses by setting \(y_0 = 1\). The solution for an arbitrary \(y_0\) is then found by multiplying the kth order impulse response \(h_k\) by \(y_0^k\).
The first order impulse response \(h_1\) is found by writing the convolution equation corresponding to the above initial value problem and retaining only terms of first order
By Laplace transforming the equation, the first order transfer function \(H_1(s_1)\) is found to be
From it, the first order impulse response is
The equation doesn’t have second order nonlinearities. Therefore the second order impulse response and the second order transfer function are both zero
The third order impulse response is found by retaining all third order terms in the convolution equation
By Laplace transforming the equation we find for the third order transfer function
From this, the third order impulse response is obtained by inverse Laplace transforming with respect to one variable at a time and by symmetrizing the result
From the above results we could conclude that, to third order, the approximate solution of the initial value problem is
This is however only valid for sufficiently small values of t. The reason is best seen by comparing the above expression with the exact solution of the initial value problem that we obtained in Example 9.3 and that we repeat here for convenience
The Taylor expansion around zero of the function
is
and has a convergence radius of 1. Therefore, as long as \(|2cy_0^2 t | < 1\), the exact solution can be represented by the power series
whose first two terms coincide with \(y_0 h_1(t)\) and \(\mathrm {ev_{d}}(y_0^3 h_3)(t)\) respectively. However, as t increases, the higher order terms become more and more important and, when \(|2cy_0^2 t | = 1\), the Taylor expansion stops being a valid representation of the exact solution of the initial value problem.
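The breakdown of the truncated series can be checked numerically. The sketch below assumes the exact solution of Example 9.3 is \(y(t) = y_0/\sqrt{1 + 2cy_0^2t}\), a reconstruction consistent with the quoted expansion variable \(2cy_0^2t\) and its convergence radius of 1.

```python
import math

# Assumed closed form for Example 9.3 (a reconstruction consistent with
# the quoted expansion variable 2*c*y0**2*t): y(t) = y0/sqrt(1 + 2*c*y0**2*t)
c, y0 = 0.5, 1.0

def exact(t):
    return y0 / math.sqrt(1.0 + 2.0 * c * y0**2 * t)

def taylor2(t):
    # First two series terms (assumed): y0*h1(t) = y0 and
    # ev_d(y0**3*h3)(t) = -c*y0**3*t.
    u = 2.0 * c * y0**2 * t
    return y0 * (1.0 - u / 2.0)

# Inside the radius |2*c*y0**2*t| < 1 the truncation is accurate; beyond
# it the series representation breaks down.
for t in (0.1, 0.5, 1.0, 2.0):
    print(t, exact(t), taylor2(t))
```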
The last example shows that, in general, the solution of a nonlinear differential equation in terms of an element of \({\mathcal {D}}'_{\oplus ,\text {sym}}\) is only meaningful around an equilibrium point for which the linearized equation is asymptotically stable. The reason is that, if this is not the case, the response of the system to any part of the input signal can persist indefinitely without ever decaying to negligible levels. Since this is true for the response of every order, the output \(\mathrm {ev_{d}}(y)\) cannot in general be represented by a power series. We can say that the systems representable by a Volterra series are those whose output does not depend on the too distant past.
When the linearized system is asymptotically stable, all impulse responses are summable distributions. Their Fourier transforms are therefore continuous functions that can be obtained from the nonlinear transfer functions \(H_k\) by
As the nonlinear transfer functions are rational functions, the Fourier transforms \(\hat{h}_k\) are indefinitely differentiable and of polynomial growth, so they belong to \({\mathcal {O}}_M\).
7 Periodic Input Signals
In this section we investigate the response of weakly nonlinear systems to periodic input signals. Given a periodic input signal x, every tensor power \(x^{\otimes k}\) is evidently also a (higher dimensional) periodic distribution. Therefore, every component \(y_k\) of the system response y can be calculated in the convolution algebra of periodic distributions and represented by a Fourier series.
Let x be a \({\mathcal {T}}\)-periodic input signal with Fourier coefficients
Further, let \(m = (m_1,\ldots ,m_k) \in {\mathbb {Z}}^k\) be a multi-index and \(\omega _c = 2\pi /{\mathcal {T}}\), then the Fourier coefficients of the kth tensor power of x are
With this expression and a straightforward generalization of Eqs. (4.21) and (4.24) to higher dimensional distributions, the Fourier coefficients of \(y_k\) are readily seen to be
with \(\hat{h}_k\) the Fourier transform of the kth order impulse response of the system.
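A minimal numerical sketch of this recipe: for an assumed system \(\dot{y} = -ay + cy^2 + x\) (Example 9.5 with input, a reconstruction) driven by a single sinusoidal tone, the steady state up to second order is assembled from \(\hat{h}_1\) and \(\hat{h}_2\) evaluated at the mix frequencies and compared against direct numerical integration; the residual is of third order in the input amplitude.

```python
import cmath, math

# Assumed system (reconstruction of Example 9.5 with input):
# y' = -a*y + c*y**2 + x, driven by x(t) = A*cos(w*t).
a, c, A, w = 1.0, 0.5, 0.2, 1.0

def H1(s):
    return 1.0 / (s + a)

def H2(s1, s2):
    return c * H1(s1) * H1(s2) * H1(s1 + s2)

def volterra2(t):
    # Sum over frequency mixes of order 1 and 2; the input Fourier
    # coefficients are A/2 at +w and -w.
    j = 1j
    y1 = (A * H1(j * w) * cmath.exp(j * w * t)).real
    y2 = ((A / 2)**2 * (H2(j*w, j*w) * cmath.exp(2*j*w*t)
                        + 2 * H2(j*w, -j*w)
                        + H2(-j*w, -j*w) * cmath.exp(-2*j*w*t))).real
    return y1 + y2

# Reference: RK4 integration until the transient has died out.
def f(t, y):
    return -a * y + c * y * y + A * math.cos(w * t)

y, t, dt = 0.0, 0.0, 0.001
while t < 30.0:
    k1 = f(t, y)
    k2 = f(t + dt/2, y + dt/2 * k1)
    k3 = f(t + dt/2, y + dt/2 * k2)
    k4 = f(t + dt, y + dt * k3)
    y += dt/6 * (k1 + 2*k2 + 2*k3 + k4)
    t += dt

err = abs(y - volterra2(t))
print(err)
```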
8 Multi-tone Input Signals
In some applications, for example in the study of interference and distortion in communication systems, one is often interested in the response of a system to input signals consisting of sinusoidal tones. If the frequencies of the tones are commensurate, that is, if their ratios are rational numbers, then one can find a common period and the input signal is periodic. The system response can thus be obtained by using the results of the previous section. However, for multi-tone input signals the results are often more directly interpretable by using a different indexing scheme for the tones composing the output components \(y_k\) [13].
8.1 General Case
Let’s consider a system driven by an input consisting of N complex tones
initially assumed to have commensurate angular frequencies \(\omega _1,\ldots ,\omega _N\). Our objective is to calculate the system response of order k
Consider first the tensor power \(x^{\otimes k}\). It can be expanded with the help of (9.10)
with m the multi-index \(m=(m_1,\ldots ,m_N)\) whose elements range from 0 to k. Observe that this expression is the Fourier series representation of \(x^{\otimes k}\). With it and (9.23) the Fourier series representation of \(y_k\) is thus found to be
with \(\hat{h}_k\) the Fourier transform of the impulse response of order k. As this sum is finite and composed only of indefinitely differentiable functions, it is itself an indefinitely differentiable function that can be evaluated on the diagonal
The kth order response of the system is therefore a sum composed of
complex tones, each one uniquely determined by a specific multi-index m. In this context the multi-index m is also called a frequency mix and \(|m |\) its order.
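The number of frequency mixes can be checked by direct enumeration: the count of multi-indexes \(m \in {\mathbb {Z}}_{\ge 0}^N\) with \(|m| = k\) is the binomial coefficient \(\binom{N+k-1}{k}\), which we assume is the formula behind the displayed count.

```python
from itertools import product
from math import comb

# Enumerate frequency mixes m in Z_{>=0}^N with |m| = k and compare the
# count with the binomial coefficient C(N + k - 1, k) (assumed to be the
# formula behind the displayed count of complex tones).
def mixes(N, k):
    return [m for m in product(range(k + 1), repeat=N) if sum(m) == k]

for N in (1, 2, 3, 4):
    for k in (1, 2, 3, 5):
        assert len(mixes(N, k)) == comb(N + k - 1, k)

print(len(mixes(2, 3)))  # e.g. N = 2, k = 3 gives C(4, 3) = 4 mixes
```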
These results show several important properties of weakly nonlinear systems.
- In contrast to linear systems, weakly nonlinear systems generate tones at frequencies not present at their input.
- In general, tones at a specific frequency are generated by frequency mixes of various orders.
- To fully characterize \(\hat{h}_k\) (and hence \(h_k\)) one needs k input tones.
At the beginning of this section we assumed the input frequencies to be commensurate. If this is not the case then the input signal is not periodic, but almost periodic. For such signals one can still define a Fourier series [16, Sect. VI.9] and the obtained results remain valid.
8.2 Real Case
In this section we specialize the above results to the case of a real system driven by an input consisting of N sinusoidal signals
and where we assume \(\omega _1,\ldots ,\omega _N > 0\). To re-use previous results it’s convenient to represent the input signal in terms of complex exponentials and use separate indexes for positive and negative angular frequencies
The quantity \(A_n\) is called the phasor of the sinusoidal signal
With this notation and using the multi-index \(m=(m_{-N},\ldots ,m_{-1},m_1,\ldots ,m_N)\) the output component \(y_k\) is easily calculated with the help of (9.24)–(9.27)
To N sinusoidal input tones there correspond 2N complex tones. Therefore, in the real case, the sum is composed of
frequency mixes.
In the real case there is some extra structure that we can exploit. Consider a specific frequency mix \(m=(m_{-N},\ldots ,m_{-1},m_1,\ldots ,m_N)\). From the above expression, it’s apparent that the multi-index
obtained from m by reversing the order of the entries also appears in the Fourier series of \(y_k\). If \(m \ne \textrm{rv}(m)\) then from \(k!/\textrm{rv}(m)! = k!/m!\), \(\omega _{\textrm{rv}(m)} = -\omega _m\), \(A_{\textrm{rv}(m)} = \overline{A}_m\) and \(\hat{h}_{k,\textrm{rv}(m)} = \overline{\hat{h}}_{k,m}\) we deduce that the sum of \(y_{k,m}(t)\) and \(y_{k,\textrm{rv}(m)}(t)\) is a sinusoidal signal
with
If \(m = \textrm{rv}(m)\) then the multi-index \(\textrm{rv}(m)\) is not distinct from m and the Fourier series component described by \(\textrm{rv}(m)\) coincides with the one described by m. In this case \(\omega _m = 0\) and, as the system is assumed to be real, \(\hat{h}_{k,m}\) must be real. The response \(y_{k,m}\) therefore becomes
Note that m and \(\textrm{rv}(m)\) can only be equal for even values of k. Also, note that there can be multi-indexes m resulting in \(\omega _m = 0\) for which \(m = \textrm{rv}(m)\) doesn’t hold.
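The pairing of mixes with their reverses, and the existence of zero-frequency mixes with \(m \ne \textrm{rv}(m)\), can be illustrated by enumeration for \(N = 2\); the commensurate frequencies below are an illustrative assumption.

```python
from itertools import product

# Real case with N = 2 tones: m = (m_-2, m_-1, m_1, m_2). rv(m) reverses
# the entries; w_mix is the mix frequency with w_{-n} = -w_n. The
# commensurate frequencies below are an illustrative assumption.
w1, w2 = 1.0, 2.0
freqs = (-w2, -w1, w1, w2)

def rv(m):
    return m[::-1]

def w_mix(m):
    return sum(mi * wi for mi, wi in zip(m, freqs))

def mixes(k):
    return [m for m in product(range(k + 1), repeat=4) if sum(m) == k]

# A mix and its reverse always carry opposite frequencies...
for m in mixes(3):
    assert w_mix(rv(m)) == -w_mix(m)

# ...and with commensurate tones there are zero-frequency mixes of odd
# order with m != rv(m), as noted in the text:
zero_asym = [m for m in mixes(3) if w_mix(m) == 0 and m != rv(m)]
print(zero_asym)
```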
Example 9.7
Consider again the system described by the differential equation
that we analysed in Example 9.5. Here we are interested in the steady state response of the system when driven by the input signal
In our previous analysis of this system we calculated the first three nonlinear transfer functions \(H_1, H_2\) and \(H_3\). Using those results, the output components \(y_1, y_2\) and \(y_3\) are immediately obtained from (9.31) and (9.34) without having to calculate any inverse Laplace transform.
Concretely, as the input signal consists of a single sinusoidal tone, the frequency mixes are composed of two entries \(m=(m_{-1}, m_1)\). The output of first order \(y_1\) is obtained from the above equations by setting \(k=1\) and by summing over all multi-indexes satisfying the constraint \(|m| = m_{-1}+m_1 = 1\). There are only two such multi-indexes: (0, 1) and \(\textrm{rv}((0,1)) = (1,0)\). The first order output of the system is therefore given by
with \(A = |A |\textrm{e}^{-\jmath \pi /2}\).
The second order response of the system \(y_2\) is obtained by setting \(k=2\) and summing over all multi-indexes under the constraint \(|m|=2\). There are three of them: (2, 0), (0, 2) and (1, 1). The first one is the reverse of the second one. Therefore, the contribution of these two is obtained from (9.31)
Since the remaining multi-index is equal to its reverse \((1,1) = \textrm{rv}((1,1))\), its contribution is the constant given by (9.34)
The response of second order is thus
The third order response of the system \(y_3\) is obtained by setting \(k=3\) and summing over all multi-indexes for which \(|m|=3\). There are four of them: (3, 0), (2, 1), (1, 2) and (0, 3). Two of them are the reverse of the other two. For this reason the response of third order of the system \(y_3\) is given by
with
Example 9.8: Two Tones Input
Suppose that we would like to implement a causal real LTI system. However, due to unavoidable limitations of physical components, the implementation behaves as a real weakly nonlinear system characterized by the nonlinear transfer functions \(H_k\) (see Fig. 9.4). We are interested in its output when driven by an input signal consisting of two sinusoidal tones
We think of the two tones as closely spaced in frequency and denote the difference of their angular frequencies by \(\Delta \omega = \omega _2 - \omega _1\).
As the input is composed of two sinusoidal signals, the frequency mixes have four entries \(m = (m_{-2}, m_{-1}, m_1, m_2)\). From (9.29) we calculate that there are 4, 10 and 20 frequency mixes of order one, two and three, respectively. They are listed in Table 9.1.
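The quoted counts follow from \(\binom{2N+k-1}{k}\) with \(N = 2\) (assumed to be formula (9.29)) and can be verified by enumeration:

```python
from itertools import product
from math import comb

# For the two-tone input the mixes m = (m_-2, m_-1, m_1, m_2) have
# 2N = 4 entries; the text quotes 4, 10, 20 mixes of order 1, 2, 3
# and 56 of order 5.
def n_mixes(k):
    return sum(1 for m in product(range(k + 1), repeat=4) if sum(m) == k)

counts = {k: n_mixes(k) for k in (1, 2, 3, 5)}
print(counts)  # {1: 4, 2: 10, 3: 20, 5: 56}

for k, n in counts.items():
    assert n == comb(2 * 2 + k - 1, k)  # C(2N + k - 1, k) with N = 2
```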
The first order output \(y_1\) is the output that would be produced by a perfectly linear system. All other tones are undesired. In particular, while tones relatively distant in frequency from \(\omega _1\) and \(\omega _2\) are easily suppressed with filters, tones close to them are much more difficult to filter out. The tones closest in frequency to \(\omega _1\) and \(\omega _2\) listed in Table 9.1 are the tones associated with the frequency mixes (1, 0, 2, 0), (0, 1, 0, 2) and their reverses
and
both produced by nonlinearities of third order.
There are 56 frequency mixes of fifth order. Among them we can easily identify frequency mixes producing tones at every frequency generated by third order nonlinearities, in particular at \(\omega _1-\Delta \omega = 2\omega _1 - \omega _2\). To see this, start with a frequency mix m producing the frequency of interest and add the same number \(l > 0\) to \(m_n\) and \(m_{-n}\) for any n ranging from 1 to N (the number of input sinusoidal tones, here 2)
Then the order of the new frequency mix \(m'\) is 2l higher than the one of m and the angular frequencies \(\omega _m\) and \(\omega _{m'}\) associated with the two frequency mixes are identical (see (9.27)).
Using this construction starting from (1, 0, 2, 0), we see that the fifth order mixes (2, 0, 2, 1), (1, 1, 3, 0) and their reverses produce tones at \(\omega _1-\Delta \omega \)
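The construction can be checked mechanically; starting from (1, 0, 2, 0), the two bumps below reproduce exactly the fifth order mixes (1, 1, 3, 0) and (2, 0, 2, 1). The index order and the example frequencies are assumptions for illustration.

```python
# Adding the same number l to the entries m_n and m_{-n} raises the mix
# order by 2l but leaves the mix frequency unchanged. Index order and the
# example frequencies below are assumptions for illustration.
w1, w2 = 10.0, 11.0
freqs = (-w2, -w1, w1, w2)            # order (m_-2, m_-1, m_1, m_2)

def w_mix(m):
    return sum(mi * wi for mi, wi in zip(m, freqs))

def bump(m, n, l):
    # Add l to m_n and m_{-n}: positions (1, 2) for n = 1, (0, 3) for n = 2.
    pos = {1: (1, 2), 2: (0, 3)}[n]
    m = list(m)
    m[pos[0]] += l
    m[pos[1]] += l
    return tuple(m)

m3 = (1, 0, 2, 0)                      # third order mix at 2*w1 - w2
assert w_mix(m3) == 2 * w1 - w2

for m5 in (bump(m3, 1, 1), bump(m3, 2, 1)):
    assert sum(m5) == sum(m3) + 2
    assert w_mix(m5) == w_mix(m3)
    print(m5, w_mix(m5))
```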
The total response of the system at the frequency \(\omega _1 - \Delta \omega \) is therefore a possibly infinite sum composed of the above mixes and higher order ones
This sum can be represented graphically by drawing the phasor of each summand as a vector in the complex plane and summing them by vector addition. Figure 9.7 shows the phasor diagram for the above sum under the assumption that summands of order higher than fifth can be neglected.
Observe that summands of different order depend differently on the amplitudes \(|A_1 |\) and \(|A_2 |\) of the input signals. For small input amplitudes the third order summand is usually dominant. As the amplitude of the input tones grows, higher order summands first become significant and then dominant. This means that both the magnitude and the phase of the output tone change with the amplitude of the input signals. At some level of the input tones there may even be a canceling effect where the output tone becomes very small.
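The amplitude-dependent magnitude, phase rotation and possible cancellation can be illustrated with two made-up phasor coefficients; the values below are purely hypothetical and not derived from the system of the example.

```python
import cmath

# Purely hypothetical third and fifth order coefficients of the output
# tone at w1 - dw (made-up values, not derived from the example system):
c3 = cmath.rect(1.0, 0.3)
c5 = cmath.rect(0.9, 0.3 + 2.8)   # nearly opposite phase

def out_phasor(A):
    # The third order summand grows as A**3, the fifth order one as A**5;
    # their vector sum changes both magnitude and phase as A grows.
    return c3 * A**3 + c5 * A**5

for A in (0.5, 1.0, 1.054):
    p = out_phasor(A)
    print(A, abs(p), cmath.phase(p))
```

Near \(A \approx 1.05\) the two summands nearly cancel, while at small A the third order term dominates; the output phase drifts correspondingly.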
Among the 56 frequency mixes of fifth order, several generate tones at new frequencies. In particular, the closest in frequency to \(\omega _1\) and \(\omega _2\) (not generated by lower order mixes) are at \(\omega _1 - 2\Delta \omega \) and \(\omega _2 + 2\Delta \omega \). Similarly, higher odd order frequency mixes introduce tones at new frequencies spaced by \(\Delta \omega \) from the previous ones. Figure 9.8 illustrates a typical spectrum of the output signal. For simplicity of representation the figure only shows lines generated by nonlinearities of fifth or lower order.
Notes
1. The obtained expression is a continuous function of \(s_1\), which we extend by continuity to \(s_1 = 0\).
© 2024 The Author(s)
Beffa, F. (2024). Weakly Nonlinear Time Invariant Systems. In: Weakly Nonlinear Systems. Understanding Complex Systems. Springer, Cham. https://doi.org/10.1007/978-3-031-40681-2_9
Print ISBN: 978-3-031-40680-5
Online ISBN: 978-3-031-40681-2