This chapter is devoted to some basic definitions and results that are needed in this book. We essentially focus on the Newton interpolation theory and the higher order convexity and concavity properties.

Recall that, unless indicated otherwise, the symbol I always denotes an arbitrary real interval whose interior is nonempty.

2.1 Newton Interpolation Theory

In this first section, we recall some basic facts about Newton interpolation theory and divided differences. We also establish a result on the derivatives of interpolating polynomials. For background see, e.g., de Boor [32, Chapter 1], Gel’fond [39, Chapter 1], Quarteroni et al.[85, Section 8.2.2], and Stoer and Bulirsch [94, Section 2.1.3].

Let \(n\in \mathbb {N}\) and let x 0, x 1, …, x n be any (not necessarily distinct) points of I. Let also \(f\colon I\to \mathbb {R}\) be so that \(D^{m_i-1}f(x_i)\) exists for i = 0, …, n, where m i is the multiplicity of x i among the points x 0, x 1, …, x n.

We let

$$\displaystyle \begin{aligned} f[x_0,x_1,\ldots,x_n]{} \end{aligned}$$

denote the divided difference of f at the points x 0, x 1, …, x n, and we let the map

$$\displaystyle \begin{aligned} x ~\mapsto ~ P_n[f](x_0,x_1,\ldots,x_n;x){} \end{aligned}$$

denote the interpolating polynomial of f with nodes at x 0, x 1, …, x n, i.e., the unique polynomial P satisfying the equations

$$\displaystyle \begin{aligned} D^kP(x_i) ~=~ D^kf(x_i),\qquad 0\leq k \leq m_i-1,\qquad i=0,\ldots,n. \end{aligned}$$

This polynomial has degree at most n.

Recall that f[x 0, x 1, …, x n] is precisely the coefficient of x n in the interpolating polynomial P n[f](x 0, x 1, …, x n;x). More precisely, the Newton interpolation formula states that

$$\displaystyle \begin{aligned} P_n[f](x_0,x_1,\ldots,x_n;x) ~=~ \sum_{k=0}^nf[x_0,x_1,\ldots,x_k]\,\prod_{i=0}^{k-1}(x-x_i). \end{aligned} $$
(2.1)

Moreover, the corresponding interpolation error at any x ∈ I can take the following form

$$\displaystyle \begin{aligned} f(x) - P_n[f](x_0,x_1,\ldots,x_n;x) ~=~ f[x_0,x_1,\ldots,x_n,x]\,\prod_{i=0}^n(x-x_i). \end{aligned} $$
(2.2)

Recall also that the map

$$\displaystyle \begin{aligned} (z_0,z_1,\ldots,z_n) ~\mapsto ~ f[z_0,z_1,\ldots,z_n] \end{aligned}$$

is symmetric, i.e., invariant under any permutation of its arguments. Moreover, the divided differences of f can be computed via the following recurrence relation. For any k ∈{0, 1, …, n}, we have f[x k] = f(x k) and

$$\displaystyle \begin{aligned} f[x_0,\ldots,x_k] ~=~ \begin{cases} \displaystyle{\frac{f[x_1,\ldots,x_k]-f[x_0,\ldots,x_{k-1}]}{x_k-x_0}}{\,}, & \mbox{if }x_k\neq x_0,\\ \displaystyle{\frac{1}{k!}{\,}D^kf(x_0)}{\,}, & \mbox{if }x_0 =x_1 =\cdots =x_k. \end{cases} \end{aligned} $$
(2.3)

When the points x 0, x 1, …, x n are pairwise distinct, we also have the following explicit expression

$$\displaystyle \begin{aligned} f[x_0,x_1,\ldots,x_n] ~=~ \sum_{k=0}^n\frac{f(x_k)}{\prod_{j\neq k}(x_k-x_j)}{\,}. \end{aligned} $$
(2.4)

We now establish a proposition that shows how the derivative of an interpolating polynomial of a differentiable function f is related to the derivative of f.

Proposition 2.1

Suppose that I is an arbitrary nonempty open real interval. For any \(n\in \mathbb {N}^*\) , any system x 0 < x 1 < ⋯ < x n of n + 1 points in I, and any differentiable function \(f\colon I\to \mathbb {R}\) , there exist n points ξ 0, …, ξ n−1 in I such that, for i = 0, …, n − 1, we have x i < ξ i < x i+1 and

$$\displaystyle \begin{aligned} D_xP_n[f](x_0,\ldots,x_n;x)\big|{}_{x=\xi_i} ~=~ f'(\xi_i). \end{aligned} $$
(2.5)

Moreover, we have

$$\displaystyle \begin{aligned} D_xP_n[f](x_0,\ldots,x_n;x) ~=~ P_{n-1}[f'](\xi_0,\ldots,\xi_{n-1};x) \end{aligned} $$
(2.6)

and

$$\displaystyle \begin{aligned} n{\,}f[x_0,\ldots,x_n] ~=~ f'[\xi_0,\ldots,\xi_{n-1}]. \end{aligned} $$
(2.7)

Proof

The function \(g\colon I\to \mathbb {R}\) defined by the equation

$$\displaystyle \begin{aligned} g(x) ~=~ P_n[f](x_0,\ldots,x_n;x)-f(x)\qquad \mbox{for }x\in I \end{aligned}$$

vanishes at the n + 1 points x 0, x 1, …, x n. The first part of the proposition then follows from applying Rolle’s theorem in each interval (x i, x i+1). Now, identity (2.6) immediately follows from (2.5) and the very definition of the interpolating polynomial. Identity (2.7) then follows by equating the coefficients of x n−1 in (2.6). □

2.2 Higher Order Convexity and Concavity

Let us recall the definitions of p-convex and p-concave functions and present some related results. For background see, e.g., Kuczma [58], Kuczma [61, Chapter 15], Popoviciu [84], and Roberts and Varberg [87, pp. 237–240].

Definition 2.2 (p-Convexity and p-Concavity)

A function \(f\colon I\to \mathbb {R}\) is said to be convex of order p (resp. concave of order p) or simply p-convex (resp. p-concave) for some integer p ≥−1 if for any system x 0 < x 1 < ⋯ < x p+1 of p + 2 points in I it holds that

$$\displaystyle \begin{aligned} f[x_0,x_1,\ldots,x_{p+1}]~\geq ~0\qquad (\mbox{resp.}~f[x_0,x_1,\ldots,x_{p+1}]~\leq ~0). \end{aligned}$$

Thus defined, a function \(f\colon I\to \mathbb {R}\) is 1-convex if it is an ordinary convex function; it is 0-convex if it is increasing (in the wide sense); it is (−1)-convex if it is nonnegative.

Let us now introduce a practical notation to denote the set of p-convex functions and the set of p-concave functions.

Definition 2.3

Let p ≥−1 be an integer.

  • We let \(\mathcal {K}^p_+(I)\) (resp. \(\mathcal {K}^p_-(I)\)) denote the set of functions \(f\colon I\to \mathbb {R}\) that are p-convex (resp. p-concave).

  • We let \(\mathcal {K}^p_+\) (resp. \(\mathcal {K}^p_-\)) denote the set of functions \(f\colon \mathbb {R}_+\to \mathbb {R}\) that are eventually p-convex (resp. eventually p-concave), i.e., p-convex (resp. p-concave) in a neighborhood of infinity.

We also set

$$\displaystyle \begin{aligned} \mathcal{K}^p(I) ~=~ \mathcal{K}^p_+(I)\cup\mathcal{K}^p_-(I)\qquad \mbox{and}\qquad \mathcal{K}^p ~=~ \mathcal{K}^p_+\cup\mathcal{K}^p_-.{} \end{aligned}$$

The following proposition shows that both sets \(\mathcal {K}^p_+(I)\) and \(\mathcal {K}^p_-(I)\) are convex cones whose intersection is precisely the real linear space of all polynomials of degree less than or equal to p. A similar description of the sets \(\mathcal {K}^p_+\) and \(\mathcal {K}^p_-\) will be given in Corollary 4.6.

Proposition 2.4

For any \(p\in \mathbb {N}\) , the sets \(\mathcal {K}^p_+(I)\) and \(\mathcal {K}^p_-(I)\) are convex cones. These cones are opposite in the sense that f lies in \(\mathcal {K}^p_+(I)\) if and only if f lies in \(\mathcal {K}^p_-(I)\) . Moreover, the intersection \(\mathcal {K}^p_+(I)\cap \mathcal {K}^p_-(I)\) is the real linear space of all polynomials of degree less than or equal to p.

Proof

That the sets \(\mathcal {K}^p_+(I)\) and \(\mathcal {K}^p_-(I)\) are convex cones is trivial; indeed, if f 1 and f 2 lie in \(\mathcal {K}^p_+(I)\) for instance, then so does c 1 f 1 + c 2 f 2 for any c 1, c 2 ≥ 0. By definition of \(\mathcal {K}^p_+(I)\) and \(\mathcal {K}^p_-(I)\), these cones are clearly opposite. Now, let f lie in \(\mathcal {K}^p_+(I)\cap \mathcal {K}^p_-(I)\) and let x 0 < ⋯ < x p be p + 1 points in I. By (2.2), for any x ∈ I we must have

$$\displaystyle \begin{aligned} f(x) - P_p[f](x_0,x_1,\ldots,x_p;x) ~=~ 0, \end{aligned}$$

which shows that f is a polynomial of degree at most p. Conversely, using (2.2) again, we can readily see that any such polynomial lies in \(\mathcal {K}^p_+(I)\cap \mathcal {K}^p_-(I)\). □

We now present an important lemma. It is interesting in its own right and will be very useful in the subsequent chapters. A variant of this result can be found in Kuczma [61, Lemma 15.7.2].

Recall first that for any \(f\colon I\to \mathbb {R}\), any \(p\in \mathbb {N}\), and any x ∈ I such that x + p ∈ I, we have

$$\displaystyle \begin{aligned} \Delta^pf(x) ~=~ p!{\,}f[x,x+1,\ldots,x+p]. \end{aligned} $$
(2.8)

Lemma 2.5

Let \(p\in \mathbb {N}\) and let \(\mathcal {I}_{p+1}\) denote the set of tuples of I p+1 whose components are pairwise distinct. A function \(f\colon I\to \mathbb {R}\) lies in \(\mathcal {K}^p_+(I)\) (resp. \(\mathcal {K}^p_-(I)\) ) if and only if the restriction of the map

$$\displaystyle \begin{aligned} (z_0,\ldots,z_p) ~\mapsto ~ f[z_0,\ldots,z_p] \end{aligned}$$

to \(\mathcal {I}_{p+1}\) is increasing (resp. decreasing) in each place. In particular, if I is not upper bounded, then for any function f lying in \(\mathcal {K}^p_+(I)\) (resp. \(\mathcal {K}^p_-(I)\) ), the function Δ p f is increasing (resp. decreasing) on I.

Proof

Using the definition of p-convexity and the standard recurrence relation (2.3) for divided differences, we can see that f lies in \(\mathcal {K}^p_+(I)\) if and only if, for any pairwise distinct x 0, …, x p+1 ∈ I, we have

$$\displaystyle \begin{aligned} \frac{f[x_1,x_2\ldots,x_{p+1}]-f[x_0,x_2\ldots,x_{p+1}]}{x_1-x_0} ~\geq ~ 0. \end{aligned}$$

Equivalently, for any pairwise distinct x 0, …, x p+1 ∈ I, we have

$$\displaystyle \begin{aligned} x_1>x_0\quad \Rightarrow\quad f[x_1,x_2\ldots,x_{p+1}]-f[x_0,x_2\ldots,x_{p+1}] ~\geq ~ 0. \end{aligned}$$

The latter condition exactly means that the map defined in the statement is increasing in the first place. Since this map is symmetric, it must be increasing in each place. The second part of the lemma follows from (2.8). □

We end this section with a second lemma, which provides some important connections between higher order convexity and higher order differentiability. In fact, these connections can be derived (sometimes tediously) from various results given in the references mentioned in the beginning of this section, especially the book by Kuczma [61, Chapter 15]. However, for the sake of self-containment we provide a detailed proof in Appendix A.

Lemma 2.6

Let I be an nonempty open real interval and let \(p\in \mathbb {N}\) . Then the following assertions hold.

  1. (a)

    We have \(\mathcal {K}^{p+1}(I)\subset \mathcal {C}^p(I)\).

  2. (b)

    Assume that I is not upper bounded. If \(f\in \mathcal {K}^p_+(I)\) , then \(\Delta ^jf\in \mathcal {K}^{p-j}_+(I)\) for every j ∈{0, …, p + 1}.

  3. (c)

    If \(f\in \mathcal {C}^{j}(I)\cap \mathcal {K}^p_+(I)\) for some j ∈{0, …, p + 1}, then \(f^{(j)}\in \mathcal {K}^{p-j}_+(I)\).

  4. (d)

    If \(f\in \mathcal {C}^1(I)\) and \(f'\in \mathcal {K}^{p-1}_+(I)\) , then \(f\in \mathcal {K}^p_+(I)\).

Proof

See Appendix A. □

2.3 A Key Lemma

Let \(p\in \mathbb {N}\), a > 0, and \(f\colon \mathbb {R}_+\to \mathbb {R}\). Combining Newton’s interpolation formula (2.1) with identity (2.8), we can readily see that the unique interpolating polynomial of f with nodes at the p points a, a + 1, …, a + p − 1 takes the form

$$\displaystyle \begin{aligned} P_{p-1}[f](a,a+1,\ldots,a+p-1;x) ~=~ \sum_{j=0}^{p-1}{\textstyle{{{x-a}\choose{j}}}}\,\Delta^jf(a). \end{aligned} $$
(2.9)

If p = 0, then this polynomial is naturally the zero polynomial, which is assumed to have degree − 1. Moreover, using (2.2) we can immediately see that the corresponding interpolation error at any x > 0 is

$$\displaystyle \begin{aligned} f(x)-\sum_{j=0}^{p-1}{\textstyle{{{x-a}\choose{j}}}}\,\Delta^jf(a) ~=~ (x-a)^{\underline{p}}{\,}f[a,a+1,\ldots,a+p-1,x]. \end{aligned} $$
(2.10)

Now, the right side of (2.10) is actually the remainder of the (p − 1)th degree Newton expansion of f(x) about x = a (see, e.g., Graham et al. [41, Section 5.3]). Note also that formula (2.10) is a pure identity in the sense that it is valid without any restriction on the form of f(x).

Using (2.9) and (2.10) we see that, for any a > 0, any x ≥ 0, any \(p\in \mathbb {N}\), and any \(g\colon \mathbb {R}_+\to \mathbb {R}\), the quantity \(\rho ^p_a[g](x)\) defined in (1.7) is precisely the interpolation error at a + x when considering the interpolating polynomial of g with nodes at a, a + 1, …, a + p − 1. We then immediately derive the following identities:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \rho^p_a[g](x) & =&\displaystyle g(a+x)-P_{p-1}[g](a,a+1,\ldots,a+p-1;a+x){\,},{} \end{array} \end{aligned} $$
(2.11)
$$\displaystyle \begin{aligned} \begin{array}{rcl} \rho^p_a[g](x) & =&\displaystyle x^{\underline{p}}{\,}g[a,a+1,\ldots,a+p-1,a+x]{\,}.{} \end{array} \end{aligned} $$
(2.12)

We note that identity (2.12) also extends to the case when x ∈{0, 1, …, p − 1}, even if g is not differentiable. Indeed, in this case we must have \(\rho ^p_a[g](x)=0\) by (2.11).

We now end this chapter with a key lemma that will be used repeatedly in this book. Although this lemma is rather technical, it is at the root of various fundamental convergence results of our theory. Recall first that, for any \(k\in \mathbb {N}\), the symbol ε k(x) stands for the sign of \(x^{ \underline {k}}\).

Lemma 2.7

Let \(p\in \mathbb {N}\), \(f\in \mathcal {K}^p\) , and a > 0 be so that f is p-convex or p-concave on [a, ). Then, for any x ≥ 0, we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} 0 ~\leq ~ \pm{\,}\varepsilon_{p+1}(x){\,}\rho_a^{p+1}[f](x) & \leq &\displaystyle \pm{\,}\left|{\textstyle{{{x-1}\choose{p}}}}\right|\left(\Delta^pf(a+x)-\Delta^pf(a)\right)\\ & \leq &\displaystyle \pm\left|{\textstyle{{{x-1}\choose{p}}}}\right|\sum_{j=0}^{\lceil x\rceil -1}\Delta^{p+1}f(a+j), \end{array} \end{aligned} $$

where ± stands for 1 or − 1 according to whether f lies in \(\mathcal {K}^p_+\) or \(\mathcal {K}^p_-\) . Moreover, if x ∈{0, 1, …, p} (i.e., ε p+1(x) = 0), then \(\rho _a^{p+1}[f](x)=0\).

Proof

If x ∈{0, 1, …, p}, then we have that \(\rho _a^{p+1}[f](x)=0\) by (2.11), and then the inequalities hold trivially. Let us now assume that x∉{0, 1, …, p}, which means that ε p+1(x) ≠ 0. Negating f if necessary, we may assume that it lies in \(\mathcal {K}^p_+\). By (2.12) we then have

$$\displaystyle \begin{aligned} \varepsilon_{p+1}(x)\,\rho^{p+1}_a[f](x) ~=~ \varepsilon_{p+1}(x){\,}x^{\underline{p+1}}{\,}f[a,a+1,\ldots,a+p,a+x] ~\geq 0. \end{aligned}$$

Hence, using identities (2.3) and (2.8) and Lemma 2.5, we obtain

$$\displaystyle \begin{aligned} \begin{array}{rcl} 0 & \leq &\displaystyle \varepsilon_{p+1}(x)\,\rho^{p+1}_a[f](x)\\ & =&\displaystyle \varepsilon_{p+1}(x){\,}x^{\underline{p+1}}{\,}f[a,a+1,\ldots,a+p,a+x]\\ & =&\displaystyle \varepsilon_{p+1}(x){\,}(x-1)^{\underline{p}}{\,}(f[a+x,a+1,\ldots,a+p]-f[a,a+1,\ldots,a+p])\\ & \leq &\displaystyle \varepsilon_{p+1}(x)\,{\textstyle{{{x-1}\choose{p}}}}{\,}(\Delta^pf(a+x)-\Delta^pf(a))\\ & \leq &\displaystyle \varepsilon_{p+1}(x)\,{\textstyle{{{x-1}\choose{p}}}}{\,}(\Delta^pf(a+\lceil x\rceil)-\Delta^pf(a)), \end{array} \end{aligned} $$

which proves the first two inequalities. The third one can be immediately proved using a telescoping sum. □