Abstract
In this paper we introduce and analyze an iteratively re-weighted algorithm that allows one to approximate the weak solution of the p-Poisson problem for \(1 < p \leqslant 2\) by iteratively solving a sequence of linear elliptic problems. The algorithm can be interpreted as a relaxed Kačanov iteration, as it is called in the literature on the numerical solution of quasi-linear equations. The main contribution of the paper is a proof that the algorithm converges at least with an algebraic rate.
1 Introduction
In this paper we approach the numerical solution of the p-Poisson problem
where \(\Omega \subset \mathbb {R}^d\) is open and bounded and \(1<p<\infty \). The solution might be scalar or vector-valued.Footnote 1
Nonlinear problems of this type appear in many applications, e.g. non-Newtonian fluid theory [21], turbulent flow of a gas in porous media, glaciology or plastic modeling. Moreover, the p-Laplacian has a similar model character for nonlinear problems as the ordinary Laplace operator for linear problems; see [22] for an introduction.
As usual we are looking for the weak solution of (1.1). In particular, we are searching for a function \(u\in W_0^{1,p}(\Omega )\) such that
where in the most general case \(f\in (W_0^{1,p}(\Omega ))^*\). It is well-known that the solution is unique and coincides with the minimizer of the energy \(\mathcal {J}: W_0^{1,p}(\Omega )\rightarrow \mathbb {R}\) defined by
Due to the nonlinearity of the problem it is harder to obtain efficient numerical solutions with guaranteed performance. Our goal is to construct solutions of (1.2) by means of a numerically accessible algorithm. In particular, we construct an iterative algorithm that approximates solutions of (1.2), where in each step only a linear elliptic problem has to be solved. Primarily, we focus here on the iteration on the infinite-dimensional space \(W^{1,p}_0(\Omega )\). However, the same algorithm immediately applies also to discretized versions of the p-Poisson problem, e.g., by means of finite elements or wavelets. This approach coincides with the one adopted, for instance, in [5]: first find an iteration on the infinite-dimensional solution space and then discretize in space. We will consider the effect of the discretization and its adaptation to error estimators in subsequent work.
In this paper we also restrict ourselves to the case \(p\in (1,2]\), since we are particularly interested in relatively small values of p and since the case \(p>2\) is already addressed to a certain extent in [5]. We will see, e.g., in Example 20 that our algorithm in fact only works properly for the range \(p \in (1,2]\).
Coming from the weak formulation (1.2) one can interpret the problem as a weighted Poisson problem
for the given f, where \(a:\Omega \rightarrow \mathbb {R}\) with \(a=|\nabla u|^{p-2}\). This suggests iteratively calculating, for a given function \(v_n\), the new iterate \(v_{n+1}\) as the solution of
The advantage of this step is that the calculation of \(v_{n+1}\) only requires solving a linear problem. This allows invoking relatively standard approaches to discretize this step and solve it numerically with guaranteed performance. The problem with this approach, however, is that the weighted Poisson problem is only well posed if a is bounded from above and bounded from below away from zero. The weight \(|\nabla v_n|^{p-2}\), however, may degenerate at points where \({|{\nabla v_n}|}=0\) or \({|{\nabla v_n}|}=\infty \).
To overcome this problem we will use a relaxation argument. To this end, we introduce in our algorithm two relaxation parameters \(\epsilon _-, \epsilon _+ \in (0,\infty )\) with \(\epsilon _- \leqslant \epsilon _+\) that ensure that the weight is truncated properly from below and above. In particular, we replace a by its truncation
Note that this is just the (pointwise) closest point projection of a to the truncation interval \( [\epsilon _-,\epsilon _+]\). The limits \(\epsilon _- \searrow 0\) and \(\epsilon _+ \nearrow \infty \) will recover the unrelaxed or original problem. We also write \(\epsilon :=[\epsilon _-, \epsilon _+]\) and interpret \(\epsilon \) both as a pair \(\{\epsilon _-,\epsilon _+\}\) and as the truncation interval \([\epsilon _-,\epsilon _+]\). We will write \(\epsilon \rightarrow [0,\infty ]\) as a short version of \(\epsilon _- \searrow 0\) and \(\epsilon _+ \nearrow \infty \). We will see later, see Corollary 15, that for f in the Lorentz space \(L^{d,1}(\Omega )\) the lower parameter \(\epsilon _-\) is the crucial one.
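Numerically, this closest-point projection onto \([\epsilon _-,\epsilon _+]\) is just a pointwise clipping. A minimal sketch (the function name `truncate` is ours, purely for illustration):

```python
import numpy as np

def truncate(a, eps_minus, eps_plus):
    """Pointwise closest-point projection of the weight a onto [eps_minus, eps_plus]."""
    return np.clip(a, eps_minus, eps_plus)

# values below eps_- are raised, values above eps_+ are lowered, the rest are untouched
a = np.array([0.001, 0.5, 3.0, 100.0])
print(truncate(a, 0.01, 10.0))  # the projected weights 0.01, 0.5, 3.0, 10.0
```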
According to these considerations we propose the following algorithm:
Since \(0< \epsilon _{n,-} \leqslant \epsilon _{n,+}<\infty \), the equation for \(v_{n+1}\) in the algorithm is always well defined, as it is uniformly elliptic (with constants depending on \(\epsilon _n\)).
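To make the iteration concrete, here is a minimal one-dimensional finite-difference sketch of the relaxed Kačanov algorithm on \((0,1)\) with homogeneous Dirichlet data. This is our own illustration, not the discretization analyzed in the paper; the grid size, relaxation interval, and iteration count are arbitrary choices:

```python
import numpy as np

def kacanov_1d(f_vals, p=1.5, n_iter=50, N=100, eps=(1e-2, 1e2)):
    """Relaxed Kacanov iteration for -(|v'|^{p-2} v')' = f on (0,1), v(0)=v(1)=0.

    Each step solves the *linear* weighted problem -(w_n v_{n+1}')' = f with the
    truncated weight w_n = (eps_- v |v_n'| ^ eps_+)^{p-2}, assembled by finite
    differences on a uniform grid (illustrative sketch only).
    """
    h = 1.0 / N
    eps_minus, eps_plus = eps
    v = np.zeros(N + 1)                       # v_0 = 0
    for _ in range(n_iter):
        grad = np.diff(v) / h                 # |v_n'| at the cell midpoints
        w = np.clip(np.abs(grad), eps_minus, eps_plus) ** (p - 2)
        # tridiagonal stiffness matrix of -(w v')' at the interior nodes
        A = np.zeros((N - 1, N - 1))
        for i in range(N - 1):
            A[i, i] = (w[i] + w[i + 1]) / h**2
            if i > 0:
                A[i, i - 1] = -w[i] / h**2
            if i < N - 2:
                A[i, i + 1] = -w[i + 1] / h**2
        v[1:N] = np.linalg.solve(A, f_vals)   # linear solve; boundary values stay 0
    return v

v = kacanov_1d(np.ones(99), p=1.5)            # constant right-hand side f = 1
```

For \(f\equiv 1\) and \(p=3/2\) the exact p-Poisson solution satisfies \(v(1/2)=1/24\approx 0.0417\), which the iterates approach up to relaxation and discretization error.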
This algorithm is not completely new in the realm of quasi-linear equations. Such an iterative linearization approach is in fact called the Kačanov method in [17, 18] and we refer to those papers for additional references related to the history of this method for solving numerically quasi-linear equations. It was also proposed and analyzed to solve total variation minimization problems in image processing, which can be formally related to the 1-Laplace differential operator in [6, 27].
Unfortunately, the results obtained in these aforementioned papers cannot be applied straightforwardly to justify the convergence of the Kačanov iteration for equations involving the p-Laplace operator. In particular, to obtain quantitative estimates of convergence with precise rates, as we do in this paper, one needs to employ several finer tools, which have been explored in, e.g., [2, 11, 12, 24], precisely to handle singularities in nonlinear differential operators such as the p-Laplacian. In particular, the theory of N-functions, Orlicz spaces [19], shifted N-functions [11], and Lipschitz truncations, see [13] and [4], have been used systematically in the analysis of such nonlinear operators, allowing the development of a potential theory analogous to the one known for linear equations.
Besides these tools from nonlinear potential theory, the variational formulation of the algorithm, as introduced first in [6], and further used to analyze other related iteratively re-weighted least squares algorithms [10, 16], offers the right framework for the analysis also of the p-Kačanov iteration.
Taking inspiration from [6, 10], in Sect. 2 we provide the variational derivation of this algorithm based on the alternating minimization of a relaxed energy with two parameters.
If we apply the algorithm with a fixed relaxation parameter \(\epsilon \) independent of n, i.e. \(0<\epsilon _- \leqslant \epsilon _+ < \infty \), then our iterates \(v_n\) converge to the unique minimizer \(u_\epsilon \) of another one-parameter relaxed energy \(\mathcal {J}_\epsilon \). We study this limit in Sect. 4 and present (linear) exponential rates of convergence.
In Sect. 3 we study how the minimizers \(u_\epsilon \) of the relaxed energy \(\mathcal {J}_\epsilon \) converge to the minimizer u of the original problem. This convergence can also be interpreted as a limit in the sense of \(\Gamma \)-convergence [3, 9]. Differently, e.g., from [6], we use a novel argument based on the Lipschitz truncation technique to establish a recovery sequence for the \(\Gamma -\lim \sup \). In particular, thanks to the finer tools mentioned above, we can go beyond a pure compactness argument as provided by the \(\Gamma \)-limit and derive precise rates of convergence depending on \(\epsilon \).
Finally, in Sect. 5 we combine the estimates of the two previous sections to deduce an overall error analysis with algebraic rates.
2 Variational formulation of the algorithm
In this section we show that the algorithm can be deduced from an alternating minimization of a relaxed energy. Recall that \(1 < p \leqslant 2\) throughout this article. Since the case \(p=2\) is just the standard Laplace problem, it suffices in the following to consider the case \(1<p<2\) only.
Let us introduce some standard notation. We use \(W^{1,p}(\Omega )\) and \(W^{1,p}_0(\Omega )\) for the Sobolev space without and with zero boundary values. We use c for a generic positive constant whose value may change from line to line. We use \(f \lesssim g\) for \(f \leqslant c\, g\). We also write \(f \eqsim g\) for \(f \lesssim g\) and \(g \lesssim f\).
The most important feature of the algorithm is that it only needs to solve linear sub-problems, which carry their own energy depending on the weight. Therefore, very much inspired by the work [6, 10] and with appropriate adjustments, we extend the energy by an additional parameter \(a \,:\, \Omega \rightarrow [0,\infty )\) such that the new functional is quadratic with respect to v. In particular, we define
This energy is well-defined for all \(v \in W^{1,p}_0(\Omega )\) and measurable \(a\,:\, \Omega \rightarrow [0,\infty )\) but might take the value \(\infty \).
This relaxed energy is convex with respect to (v, a). This follows from the fact that \(\beta (t,a) := \frac{1}{2} a^{p-2}t^2\) is convex on \([0,\infty )^2\), since
is positive semidefinite, as \(a^{p-2}\geqslant 0\) and \(\det ((\nabla ^2 \beta )(t,a))=\tfrac{1}{2} a^{2p-6} t^2 (2-p)(p-1)\geqslant 0\). Notice that in the latter lower bound we specifically used \(1 < p \leqslant 2\).
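The Hessian computation can be checked symbolically; a quick sympy verification of the determinant of \(\nabla ^2\beta \) for \(\beta (t,a)=\tfrac12 a^{p-2}t^2\):

```python
import sympy as sp

t, a, p = sp.symbols('t a p', positive=True)
beta = sp.Rational(1, 2) * a**(p - 2) * t**2

# Hessian of beta with respect to (t, a) and its determinant
H = sp.hessian(beta, (t, a))
det_H = sp.det(H)

# det H = (1/2) a^(2p-6) t^2 (2-p)(p-1), nonnegative exactly for 1 <= p <= 2
target = sp.Rational(1, 2) * a**(2*p - 6) * t**2 * (2 - p) * (p - 1)
assert sp.simplify(det_H - target) == 0
```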
Remark 1
If \(p>2\), then the relaxed energy \(\mathcal {J}(v,a)\) is neither bounded from below nor convex with respect to a. Therefore, the algorithm derived below using the minimization with respect to a does not lead to a feasible problem for \(p>2\). See also Remark 21.
Note that \(\mathcal {J}(v,a)\) (for fixed a) is quadratic with respect to v and a minimization with respect to v leads formally to the elliptic equation
see (1.4) for its weak form.
Unfortunately, the ellipticity of this system degenerates for \(a(x)\rightarrow 0\) and \(a(x) \rightarrow \infty \). To overcome this problem we restrict the minimization with respect to a (for fixed v) to functions with values within a relaxation interval \([\epsilon _-, \epsilon _+] \subset (0,\infty )\), i.e. \(\epsilon _- \leqslant a(x) \leqslant \epsilon _+\). This minimization with respect to a (for fixed v) has a simple solution, namely
where \(\vee \) denotes the maximum and \(\wedge \) the minimum, since
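As a sanity check, one can verify numerically that the clipped value \(\varepsilon _-\vee (t\wedge \varepsilon _+)\) minimizes the weight-dependent part of the relaxed integrand over \(a\in [\varepsilon _-,\varepsilon _+]\). Here we take the density \(a\mapsto \tfrac12 a^{p-2}t^2 + (\tfrac1p-\tfrac12)a^p\), as in the relaxed energies of [6, 10] (an assumption of this sketch), and compare a brute-force grid argmin with the projection:

```python
import numpy as np

p, eps_minus, eps_plus = 1.5, 0.1, 10.0

def density(a, t):
    # weight-dependent part of the relaxed integrand (assumed form, cf. [6, 10])
    return 0.5 * a**(p - 2) * t**2 + (1/p - 0.5) * a**p

a_grid = np.linspace(eps_minus, eps_plus, 200001)
for t in [0.01, 0.5, 2.0, 50.0]:
    a_star = a_grid[np.argmin(density(a_grid, t))]   # brute-force minimizer in a
    a_proj = np.clip(t, eps_minus, eps_plus)         # eps_- v (t ^ eps_+)
    assert abs(a_star - a_proj) < 1e-3, (t, a_star, a_proj)
```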
This allows us to define for fixed \(\epsilon = [\varepsilon _-,\varepsilon _+] \subset [0,\infty ]\) another relaxed energy
This immediately implies that the relaxed energy \(\mathcal {J}_\epsilon (v)\) is monotonically decreasing with respect to \(\epsilon \): enlarging the interval \(\epsilon \) in the sense of inclusion decreases the energy \(\mathcal {J}_\epsilon (v)\).
This new relaxed energy \(\mathcal {J}_\epsilon \) somehow “hides” the constrained minimization with respect to a. We can write \(\mathcal {J}_\epsilon \,:\, W^{1,p}_0(\Omega ) \rightarrow \mathbb {R}\cup {\{{\infty }\}}\) explicitly as
with \(\kappa _\epsilon \,:\,\mathbb {R}_{\geqslant 0}\rightarrow \mathbb {R}\) given by
Note that \(\frac{1}{p} t^p \leqslant \kappa _\epsilon (t)\) for all \(t \geqslant 0\) and \(\frac{1}{p} t^p = \lim _{\epsilon \rightarrow [0,\infty ]} \kappa _\epsilon (t)\) for all \(t \geqslant 0\). Since \(\kappa _\epsilon (t) \eqsim \epsilon _+^{p-2} t^2\) for large t, we see that \(\mathcal {J}_\epsilon (v) < \infty \) if and only if \(v \in W^{1,2}_0(\Omega )\). Moreover, \(\lim _{\epsilon \rightarrow [0,\infty ]} \mathcal {J}_\epsilon (v) = \mathcal {J}(v)\) for all \(v \in W^{1,2}_0(\Omega )\) and \(\mathcal {J}(v) \leqslant \liminf _{\epsilon \rightarrow [0,\infty ]} \mathcal {J}_\epsilon (v)\) for all \(v \in W^{1,p}_0(\Omega )\).
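For the reader's convenience, plugging the truncated weight into the relaxed density \(\tfrac12 a^{p-2}t^2+(\tfrac1p-\tfrac12)a^p\) yields the following explicit piecewise formula for \(\kappa _\epsilon \) (our reconstruction; it is consistent with the properties listed above, in particular \(\kappa _\epsilon (t)=\tfrac1p t^p\) on \([\epsilon _-,\epsilon _+]\), \(\kappa _\epsilon (0)\ne 0\), and \(\kappa _\epsilon (t)\eqsim \epsilon _+^{p-2}t^2\) for large t):

```latex
\kappa_\epsilon(t) =
\begin{cases}
  \tfrac12 \epsilon_-^{p-2} t^2 + \bigl(\tfrac1p - \tfrac12\bigr)\epsilon_-^p,
    & 0 \leqslant t \leqslant \epsilon_-,\\[2pt]
  \tfrac1p t^p,
    & \epsilon_- \leqslant t \leqslant \epsilon_+,\\[2pt]
  \tfrac12 \epsilon_+^{p-2} t^2 + \bigl(\tfrac1p - \tfrac12\bigr)\epsilon_+^p,
    & t \geqslant \epsilon_+.
\end{cases}
```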
Based on the above observations it is natural to iteratively minimize \(\mathcal {J}(v,a)\) alternating between v and a. Certainly, we have also to increase the relaxation interval \(\epsilon \). Thus our algorithm reads as follows:
This is just the algorithm given in the introduction written in different form.
3 Convergence in the relaxation parameter
In this section we show that the minimizers \(u_\epsilon \) of the relaxed energy \(\mathcal {J}_\epsilon \) converge to the minimizer u of \(\mathcal {J}\) for \(\epsilon \rightarrow [0,\infty ]\) and derive an upper bound for the relaxation error.
Since \(\mathcal {J}_\epsilon (v) \geqslant \mathcal {J}(v)\) and \(\mathcal {J}\) is \(W^{1,p}_0(\Omega )\) coercive, it follows that \(\mathcal {J}_\epsilon \) is also coercive in \(W^{1,p}_0(\Omega )\). However, \(\mathcal {J}_\epsilon (v)<\infty \) requires \(v \in W^{1,2}_0(\Omega )\), as we have seen above. Certainly, there is a gap between the spaces \(W^{1,p}_0(\Omega )\) and \(W^{1,2}_0(\Omega )\). To close this gap we need a finer analysis of the energies, which requires the use of Orlicz spaces. We state in the following some standard results for these spaces; see, for example, [19].
A function \(\phi :\mathbb {R}_{\geqslant 0}\rightarrow \mathbb {R}\) is called an N-function if and only if there is a right-continuous, positive on the positive real line, and non-decreasing function \(\phi ':\mathbb {R}_{\geqslant 0}\rightarrow \mathbb {R}\) with \(\phi '(0)=0\) and \(\lim _{t \rightarrow \infty } \phi '(t)=\infty \) such that \(\phi (t) = \int _0^{t}\phi '(\tau )\, d\tau \). An N-function is said to satisfy the \(\Delta _2\)-condition if and only if there is a constant \(c>1\) such that \(\phi (2t)\leqslant c\,\phi (t)\). For an N-function satisfying the \(\Delta _2\)-condition we define the Orlicz space to consist of those functions \(v \in L_{{\mathrm {loc}}}^1(\Omega )\) with \(\int _\Omega \phi (|v|)\, dx<\infty \). It becomes a Banach space with the norm \({\Vert {f}\Vert }_\phi := \inf {\{{\gamma >0\,:\, \int _\Omega \phi ({|{v}|}/\gamma )\,dx\leqslant 1}\}}\). The Orlicz-Sobolev space \(W^{1,\phi }(\Omega )\) then consists of those \(v \in L^\phi \) such that the weak derivative \(\nabla v\) is also in \(L^\phi \), equipped with the norm \({\Vert {v}\Vert }_\phi + {\Vert {\nabla v}\Vert }_\phi \). The space \(W^{1,\phi }_0(\Omega )\) denotes the subspace of those functions from \(W^{1,\phi }(\Omega )\) with zero boundary trace, which coincides with the closure of \(C^\infty _0(\Omega )\) in \(W^{1,\phi }(\Omega )\). For example choosing \(\phi (t):=\tfrac{1}{p} t^p\) we have \(L^\phi (\Omega ) = L^p(\Omega )\) and \(W^{1,\phi }_0(\Omega ) = W^{1,p}_0(\Omega )\).
The function \(\kappa _\epsilon \) cannot be an N-function, since \(\kappa _\epsilon (0)\ne 0\). However, if we define
then \(\phi _\epsilon \) is actually an N-function. It can be verified that \(\phi _\epsilon \) satisfies the \(\Delta _2\)-condition with a constant independent of \(\epsilon \).
Since \(\phi _\epsilon (t) \eqsim \epsilon _+^{p-2} t^2\) for large t and \(\Omega \) is bounded, we have \(L^{\phi _\epsilon }(\Omega ) \eqsim L^2(\Omega )\). However, the constant of the embedding \(L^{\phi _\epsilon }(\Omega ) \hookrightarrow L^2(\Omega )\) depends on \(\epsilon \), so this equivalence is not of much use. Instead we use the chain of embeddings
with constants independent of \(\epsilon \). This follows from the fact that the Simonenko indices of \(\phi _\epsilon \) are within [p, 2]. We refer the reader to, e.g., [28, Chapter 2] for the details.
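The index bounds can be checked numerically. Using \(\phi _\epsilon (t)=\kappa _\epsilon (t)-\kappa _\epsilon (0)\) (cf. the relation \(\kappa _\epsilon (t)=\phi _\epsilon (t)+\kappa _\epsilon (0)\) below), the quotient \(t\phi _\epsilon '(t)/\phi _\epsilon (t)\), which controls the Simonenko indices, stays in [p, 2] uniformly in \(\epsilon \). A sketch with our own naming:

```python
import numpy as np

def phi_prime(t, p, em, ep):
    # phi_eps'(t) = (em v t ^ ep)^{p-2} t
    return np.clip(t, em, ep) ** (p - 2) * t

def phi(t, p, em, ep):
    # piecewise antiderivative of phi_prime with phi(0) = 0
    return np.where(t <= em, 0.5 * em**(p - 2) * t**2,
           np.where(t <= ep, t**p / p - (1/p - 0.5) * em**p,
                    0.5 * ep**(p - 2) * t**2 + (1/p - 0.5) * (ep**p - em**p)))

p = 1.3
for em, ep in [(1e-3, 1e3), (1e-1, 1e1), (1e-6, 1e6)]:
    t = np.logspace(-8, 8, 2000)
    q = t * phi_prime(t, p, em, ep) / phi(t, p, em, ep)
    # the quotient lies in [p, 2] with constants independent of the interval
    assert np.all(q >= p - 1e-9) and np.all(q <= 2 + 1e-9)
```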
Since \(\phi _\epsilon \) is strictly convex and \(\kappa _\epsilon (t) = \phi _\epsilon (t) + \kappa _\epsilon (0)\), the energy \(\mathcal {J}_\epsilon \) admits a unique minimizer \(u_\epsilon \in W_0^{1,\phi _\epsilon }(\Omega )\) whose Euler-Lagrange equation is
Here we used that
Remark 2
Let us consider the special case \(\varepsilon _+=\infty \). Then, the derivative of the truncated function reads \(\phi _\epsilon '(t) = (\varepsilon _-\vee t)^{p-2} t\). A different modification, namely \((\varepsilon _-+ t)^{p-2} t\), would lead to the so-called shifted N-function of \(\frac{1}{p}t^p\), as introduced in more generality in [11], which has properties similar to those of our truncated functions. However, the version from this paper is more suitable for our energy relaxation, since it is closer to the original function \(\frac{1}{p} t^p\) on the truncation interval \(\epsilon \) (the derivatives agree there). See the Appendix for more information on uniformly convex Orlicz functions and their shifted versions.
Lemma 3
The functions \(\phi \) and \(\phi _\epsilon \) are uniformly convex Orlicz functions in the sense of Sect. B from the Appendix. The convexity constants are independent of \(\epsilon \).
Proof
The uniform convexity of \(\phi \) follows from \(\frac{\phi ''(t)\,t}{\phi '(t)}= (p-1)\) and Lemma B.2. Now, for \(t \in (\epsilon _-,\epsilon _+)\) we have \(\frac{\phi _\epsilon ''(t)\,t}{\phi _\epsilon '(t)}= (p-1)\), while for \(t \in (0,\infty ) \setminus [\epsilon _-,\epsilon _+]\) we have \(\frac{\phi _\epsilon ''(t)\,t}{\phi _\epsilon '(t)} = 1\). Hence, the claim for \(\phi _\epsilon \) follows again by Lemma B.2. \(\square \)
Since \(W^{1,p}_0(\Omega )\) is the largest space, see (3.2), which contains both \(u_\epsilon \) and u, it is natural to consider all energies \(\mathcal {J}\) and \(\mathcal {J}_\epsilon \) as functionals on \(W^{1,p}_0(\Omega )\) with possible value \(\infty \).
Let us recall that the goal of this section is to show that \(u_\epsilon \) converges to u in \(W^{1,p}_0(\Omega )\). Since \(W^{1,p}_0(\Omega )\) is uniformly convex, strong convergence is a consequence of weak convergence and norm convergence, or equivalently, in this case, energy convergence \(\mathcal {J}(u_\epsilon ) \rightarrow \mathcal {J}(u)\). It is possible to show the weak convergence as well as that of the energy by means of \(\Gamma \)-convergence. Indeed, we will see in Remark 11 that \(\mathcal {J}_\epsilon \rightarrow \mathcal {J}\) in the sense of \(\Gamma \)-convergence. However, we will derive in the following much stronger results that provide us with a precise rate of convergence for the energies. This energy convergence implies strong convergence of the sequence, see the proof of Corollary 10.
Let us turn to the convergence of the energies \(\mathcal {J}(u_\epsilon ) \rightarrow \mathcal {J}(u)\) for \(\epsilon \rightarrow [0,\infty ]\). Since \(\mathcal {J}_\epsilon \) is monotonically decreasing with respect to \(\epsilon \), it follows from the minimizing properties of u and \(u_\epsilon \) that
Therefore, it suffices to prove the stronger claim
In fact, we will later need this stronger estimate in the other sections.
It follows from the minimizing property of \(u_\epsilon \) that
So it would be natural to estimate \(\mathcal {J}_\epsilon (u) - \mathcal {J}(u)\) in terms of \(\epsilon \) and u. However, the solution u is a priori only a \(W^{1,p}_0\)-function, so \(\mathcal {J}_\epsilon (u)\) might be infinite. Hence, we cannot ensure that this difference is small; this is only possible if we assume higher regularity of u. In order to treat arbitrary right-hand sides \(f \in (W^{1,p}_0(\Omega ))^*\) at this point, we have to use a much more subtle argument. For this we need a result from [13, Subsection 3.5] and [4, Theorem 2.7], which allows one to change u on a small set such that it becomes a Lipschitz function. This technique is known as the Lipschitz truncation technique; its origin goes back to [1]. As a tool we need the Hardy-Littlewood maximal operator, see e.g. [25],
where \({|{B_r(x)}|}\) denotes the Lebesgue measure of \(B_r(x)\).
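A discrete analogue on a one-dimensional grid illustrates how M and the super-level sets \(\{M(\nabla v)>\lambda \}\) used below are formed. This is a brute-force sketch with our own naming, not an efficient implementation:

```python
import numpy as np

def maximal_function(g):
    """Discrete centered Hardy-Littlewood maximal function of |g| on a uniform
    1-D grid: Mg(x_i) = max over radii r of the average of |g| on the ball
    B_r(x_i), clipped to the grid."""
    n = len(g)
    Mg = np.zeros(n)
    for i in range(n):
        Mg[i] = max(np.mean(np.abs(g[max(0, i - r):min(n, i + r + 1)]))
                    for r in range(n))
    return Mg

g = np.array([0.0, 0.0, 5.0, 0.0, 0.0])   # a single spike
Mg = maximal_function(g)
# Mg >= |g| pointwise, and Mg "smears out" the spike over neighboring points,
# so the super-level set {Mg > lam} is larger than {|g| > lam}
lam = 1.0
O_lam = Mg > lam
```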
Theorem 4
[Lipschitz truncation [4, 13]] Let \(v\in W_0^{1,p}(\Omega )\) and for all \(\lambda >0\) define
Then, there exists an approximation \(T_\lambda v\in W_0^{1,\infty }(\Omega )\) of v with the following properties:
- (a)
\(\{v\ne T_\lambda v\}\subset \mathcal {O}_\lambda \).
- (b)
\(\Vert T_\lambda v\Vert _{L^p(\Omega )}\lesssim \Vert v\Vert _{L^p(\Omega )}\).
- (c)
\(\Vert \nabla T_\lambda v\Vert _{L^p(\Omega )}\lesssim \Vert \nabla v\Vert _{L^p(\Omega )}\).
- (d)
\(\vert \nabla T_\lambda v\vert \lesssim \lambda \chi _{\mathcal {O}_\lambda } + \vert \nabla v\vert \chi _{\Omega \setminus \mathcal {O}_\lambda } \leqslant \lambda \) almost everywhere.
- (e)
\({\Vert {\nabla (v-T_\lambda v)}\Vert }_{L^p(\Omega )} \lesssim {\Vert {\nabla v}\Vert }_{L^p(\mathcal {O}_\lambda )}\).
- (f)
\(T_\lambda v\rightarrow v\) in \(W^{1,p}_0(\Omega )\) as \(\lambda \rightarrow \infty \).
All our convergence results concerning the relaxation parameter \(\varepsilon \) are based on the following result, which shows how the energy relaxation depends on the truncation interval \(\varepsilon \).
Theorem 5
The estimate
holds for all \(\lambda \leqslant \varepsilon _+/c_1\), where \(c_1\) is the (hidden) constant from Theorem 4 (d).
Proof
Let \(\lambda \leqslant \varepsilon _+/c_1\) and let \(T_\lambda u\) be the Lipschitz truncation of u. Then
Using the minimizing property of \(u_\varepsilon \) and the equation for u we get
Using \({|{\nabla T_\lambda u}|} \leqslant \varepsilon _+\), \(\kappa _\epsilon (t)=\tfrac{1}{p} t^p\) for \(t \in [\varepsilon _-,\varepsilon _+]\), \(\kappa _\epsilon (t) \leqslant \tfrac{1}{p} \varepsilon _-^p\) for \(t \in [0, \varepsilon _-]\), and \(T_\lambda u=u\) outside of \(\mathcal {O}_\lambda \) we get
This, the previous estimate and Theorem 4 (e) imply
This proves the claim. \(\square \)
Corollary 6
\(\mathcal {J}_\epsilon (u_\varepsilon ) \rightarrow \mathcal {J}(u)\) and \(\mathcal {J}(u_\varepsilon ) \rightarrow \mathcal {J}(u)\) as \(\varepsilon \rightarrow [0,\infty ]\).
Proof
Due to (3.5) it suffices to prove \(\mathcal {J}_\epsilon (u_\varepsilon ) \rightarrow \mathcal {J}(u)\). Consider the right-hand side of (3.7) with \(\lambda := \varepsilon _+/c_1\). The first term goes to zero as \(\varepsilon _-\rightarrow 0\). Now consider the second term. Since \(\mathcal {O}_\lambda (u) \subset {\{{M(\nabla u) > \lambda }\}}\) and \(\nabla u\in L^p(\Omega )\) we get by the weak \(L^p\)-estimate of the maximal operator \(|\mathcal {O}_\lambda (u)| \lesssim \lambda ^{-p} {\Vert {\nabla u}\Vert }_p^p\). Therefore \(|\mathcal {O}_\lambda (u)|\rightarrow 0\) as \(\varepsilon _+\rightarrow \infty \), which implies \(\int _{\mathcal {O}_\lambda (u)} |\nabla u|^p \, dx \rightarrow 0\) as \(\varepsilon _+\rightarrow \infty \). \(\square \)
Before we continue we need the following natural quantities, see [11].
Definition 7
For \(P\in \mathbb {R}^d\) we define
Moreover, by \(A := A_{[0,\infty ]}\) and \(V := V_{[0,\infty ]}\) we denote the unrelaxed versions.
The following two lemmas are modifications of similar results of [11, Lemma 3] and [12, Lemma 16]. In fact, they follow from the properties of uniformly convex Orlicz functions; see Sect. B from the Appendix for more details.
Lemma 8
For \(P,Q\in \mathbb {R}^d\)
where the constants can be chosen independently of \(\varepsilon \).
Proof
This follows directly from the uniform convexity of \(\phi _\epsilon \), see Lemma 3, Lemma 41 and Lemma 40. \(\square \)
Lemma 9
The following estimates hold for arbitrary \(v\in W_0^{1,\phi _\epsilon }(\Omega )\) and \(u_\varepsilon \) being the minimizer of \(\mathcal {J}_\epsilon \):
In particular, for the case where \(\varepsilon =[0,\infty ]\) the statement actually implies also
Proof
This is just Lemma 42 applied to \(\phi _\epsilon \). \(\square \)
We are now prepared to show the convergence of minimizers \(u_\epsilon \) of \(\mathcal {J}_\epsilon \) to u.
Corollary 10
\(u_\epsilon \rightarrow u\) in \(W^{1,p}_0(\Omega )\) as \(\varepsilon \rightarrow [0,\infty ]\).
Proof
Due to Corollary 6 we have \(\mathcal {J}(u_\epsilon ) - \mathcal {J}(u) \rightarrow 0\) as \(\epsilon \rightarrow [0,\infty ]\). Now Lemma 9 for the case where \(\epsilon =[0,\infty ]\) and \(v =u_\varepsilon \) implies \(V(\nabla u_\epsilon ) \rightarrow V(\nabla u)\) in \(L^2(\Omega )\). It follows from the shift-change lemma, see Corollary 44 or [12, Corollary 26], that for all \(\delta >0\) there exists \(c_\delta >0\) such that
This and \(V(\nabla u_\epsilon ) \rightarrow V(\nabla u)\) in \(L^2(\Omega )\) implies \(\nabla u_\epsilon \rightarrow \nabla u\) in \(L^p(\Omega )\). \(\square \)
Remark 11
It is also possible to deduce \(\mathcal {J}_\epsilon (u_\epsilon ) \rightarrow \mathcal {J}(u)\) and \(u_\epsilon \rightarrow u\) in \(W^{1,p}_0(\Omega )\) by means of \(\Gamma \)-convergence: As the underlying topological space we choose \(W^{1,p}_0(\Omega )\) equipped with the weak topology. Then the Lipschitz truncation provides a recovery sequence for \(v \in W^{1,p}_0(\Omega )\) implying \(\mathop {\Gamma \text {-}\lim }\nolimits _{\epsilon \rightarrow [0,\infty ]} \mathcal {J}_\epsilon = \mathcal {J}\). Indeed, it follows as in the proof of Theorem 5 that for all \(v \in W^{1,p}_0(\Omega )\)
So the properties of the Lipschitz truncation, see Theorem 4 (f), imply that the right-hand side goes to zero as \(\epsilon \rightarrow [0,\infty ]\). Hence, \(T_{\varepsilon _+/c_1} v\) is a recovery sequence of v.
Moreover, \(\mathcal {J}_\epsilon \geqslant \mathcal {J}\), so the standard theory of \(\Gamma \)-convergence [3, 9] proves \(u_\epsilon \rightharpoonup u\) in \(W^{1,p}_0(\Omega )\) and \(\mathcal {J}_\epsilon (u_\epsilon ) \rightarrow \mathcal {J}(u)\) for \(\epsilon \rightarrow [0,\infty ]\). The uniform convexity of \(W^{1,p}_0(\Omega )\) implies \(u_\epsilon \rightarrow u\) in \(W^{1,p}_0(\Omega )\).
To our knowledge this is the first time that the Lipschitz truncation is used to construct a recovery sequence for the \(\Gamma -\lim \sup \) in a \(\Gamma \)-convergence argument related to energies on \(W^{1,p}_0(\Omega )\).
Up to now, we discussed the convergence of \(u_\epsilon \rightarrow u\) without any additional assumptions on the data \(f \in (W_0^{1,p}(\Omega ))^*\) and the domain \(\Omega \). If f is more regular and \(\partial \Omega \) is suitably smooth, then we obtain specific rates for the convergence. The rates of convergence will follow from the regularity of \(\nabla u\) in terms of the weak-\(L^q\) spaces \(L^{q,\infty }(\Omega )\), which consist of all functions v such that
Lemma 12
Let \(\nabla u\in L^{q,\infty }(\Omega )\) for some \(q>p\). Then,
Proof
First note that \(M:L^{q,\infty }(\Omega )\rightarrow L^{q,\infty }(\Omega )\) is bounded. This follows for example by extrapolation theory, see [8, Theorem 1.1]. In particular,
Moreover, let \(L^{q,1}(\Omega )\) denote the usual Lorentz space, which consists of functions v such that
Since \((L^{s',1})^* =L^{s,\infty }\) for \(1<s<\infty \) and \(\frac{1}{s}+\frac{1}{s'}=1\) we obtain
where \({|{\mathcal {O}_\lambda }|}\) denotes the Lebesgue measure of \(\mathcal {O}_\lambda \). Applying Theorem 5 with \(\lambda := \varepsilon _+/c_1\) yields the statement. \(\square \)
To exemplify the consequences of Lemma 12 we combine it with the regularity results of [7, 14]:
Theorem 13
([7], Theorems 1.3 and 1.4) Let \(\Omega \subset \mathbb {R}^d\) be convex or let its boundary \(\partial \Omega \in W^2 L^{d-1,1}\) (for example \(\partial \Omega \in C^2\) sufficesFootnote 2) and additionally \(f\in L^{d,1}(\Omega )\). Then \(\nabla u \in L^\infty (\Omega )\).
Theorem 14
([14], (4.3)) Let \(\Omega \) be a polyhedral domain where the inner angle is strictly less than \(2\pi \) and \(f\in L^{p'}(\Omega )\) and \(\frac{1}{p} + \frac{1}{p'}=1\). Then \(\nabla u \in L^{\frac{pd}{d-1},\infty }(\Omega )\).
Proof
Actually, it is proven in [14] (4.3) that \({|{\nabla u}|}^{\frac{p}{2}} \in \mathcal {N}^{\frac{1}{2},2}(\Omega )\) (Nikolskiĭ space). Now, one can use the embedding \(\mathcal {N}^{\frac{1}{2},2}(\Omega )\hookrightarrow L^{\frac{2d}{d-1},\infty }(\Omega )\) of Lemma 26 in the Appendix. \(\square \)
Corollary 15
Under the assumptions of Theorem 13 we have
Proof
Since \(\nabla u \in L^\infty (\Omega )\), we have \(M(\nabla u)\in L^\infty (\Omega )\) and so for \(\lambda :=\varepsilon _+/c_1\) and \(\varepsilon _+\) large enough, \(\mathcal {O}_\lambda (u)=\emptyset \). Hence, Theorem 5 implies the estimate. \(\square \)
Corollary 16
Under the assumptions of Theorem 14 we have
Proof
Since \(\nabla u\in L^{\frac{pd}{d-1},\infty }(\Omega )\), an application of Lemma 12 finishes the proof. \(\square \)
Remark 17
The choice \(f=0\) and hence \(u=0\) gives \(\mathcal {J}_\epsilon (u) = \kappa _\epsilon (0) {|{\Omega }|} \eqsim \varepsilon _-^p\). This shows that the estimate in Corollary 15 is sharp.
4 Convergence of the Kačanov-iteration
In this section we study the convergence of the Kačanov-iteration for fixed relaxation parameter \(\epsilon =[\varepsilon _-,\varepsilon _+]\). In particular, for \(v_0\in W_0^{1,2}(\Omega )\) arbitrary we calculate recursively \(v_{n+1}\) by
We will show that \(v_n\) converges to the minimizer \(u_\epsilon \) of the relaxed energy \(\mathcal {J}_\epsilon \). In particular, we show exponential decay of the energy error \(\mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\epsilon )\). The proof is based on the following estimate, proved below.
Theorem 18
There is a constant \(c_K>1\) such that
holds for \(\delta := \tfrac{1}{c_K} (\tfrac{\varepsilon _-}{\varepsilon _+})^{2-p}\).
This theorem says that in each iteration we reduce the energy by a certain part of the remaining energy error. This implies
As a direct consequence we will obtain the following exponential convergence result.
Corollary 19
There is a constant \(c_K>1\) such that
holds for \(\delta := \tfrac{1}{c_K} (\tfrac{\varepsilon _-}{\varepsilon _+})^{2-p}\).
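The reduction factor translates into an explicit iteration count: to bring the energy error from \(e_0\) below a tolerance one needs roughly \(n \geqslant \log (\mathrm{tol}/e_0)/\log (1-\delta )\) steps. A quick sketch (names are ours; the constant \(c_K\) is unspecified in the paper, so the value below is a placeholder):

```python
import math

def iterations_needed(err0, tol, eps_minus, eps_plus, p, c_K=10.0):
    """Number of Kacanov steps guaranteed by the geometric decay
    err_n <= (1 - delta)^n * err_0 with delta = (eps_-/eps_+)^{2-p} / c_K.
    c_K > 1 is the unspecified constant from Corollary 19 (placeholder value)."""
    delta = (eps_minus / eps_plus) ** (2 - p) / c_K
    return math.ceil(math.log(tol / err0) / math.log(1 - delta))

# widening the relaxation interval shrinks delta and costs iterations:
n1 = iterations_needed(1.0, 1e-6, 1e-1, 1e1, p=1.5)
n2 = iterations_needed(1.0, 1e-6, 1e-3, 1e3, p=1.5)
assert n2 > n1
```

This quantifies why the rate "depends badly" on \(\epsilon \): the guaranteed step count grows like \((\varepsilon _+/\varepsilon _-)^{2-p}\).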
Let us get to the proof of Theorem 18.
Proof (Proof of Theorem 18)
Using Lemma 9, the equation (3.3) for \(u_\varepsilon \), the equation (4.1) for \(v_{n+1}\), and Young’s inequality (see Remark 32) we get, for arbitrary \(\gamma > 0\),
Let us define
For the first term I we calculate with the equation (4.1) for \(v_{n+1}\)
To establish the inequality above, we used the fact that
and
For the second term II we use \(\varepsilon _+^{p-2}\leqslant \tfrac{\phi _\epsilon '(t)}{t} \leqslant \varepsilon _-^{p-2}\), implying \(\tfrac{\phi _\epsilon '(t)}{t} \leqslant (\tfrac{\varepsilon _+}{\varepsilon _-})^{2-p}\tfrac{\phi _\epsilon '(s)}{s}\), for any \(s,t \geqslant 0\), Lemma 8 and Lemma 9 to get
Putting all estimates together we get
Now, \(\max _{\gamma >0} \gamma (1-c\gamma (\tfrac{\varepsilon _+}{\varepsilon _-})^{2-p}) = \tfrac{1}{4c} (\tfrac{\varepsilon _-}{\varepsilon _+})^{2-p}\) yields the statement. \(\square \)
Example 20
(Peak function) Let \(\Omega := B_1(0)\) and \(f(x)=-{\text {div}}(\tfrac{x}{|x|})\). Then \(f\notin L^1(\Omega )\), but still \(f \in (W_0^{1,1}(\Omega ))^*\subset (W_0^{1,p}(\Omega ))^*\). The minimizer of \(\mathcal {J}\) is given by \(u(x) = 1-{|{x}|}\), which looks like a peak. Since \(|\nabla u|\equiv 1\), the factor \({|{\nabla u}|}^{p-2}\) in the p-Laplace operator does not appear for the minimizer. So in this case u also minimizes every \(\mathcal {J}_\epsilon \) as long as \(\varepsilon _-\leqslant 1\) and \(\varepsilon _+\geqslant 1\). This follows from
for all \(v\in W_0^{1,\phi _\epsilon }(\Omega )\).
Let us see how our algorithm performs with the starting value \(v_0:=0\). It is easy to see that \(v_n = \alpha _n u\) with
Since \(p\in (1,2)\) one can show \(\alpha _ n = \varepsilon _-^{(2-p)^n}\) by induction and
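The induction can be checked numerically: starting from \(\alpha _0 = 0\), each Kačanov step for the peak function solves a linear problem with constant weight \((\varepsilon _-\vee \alpha _n\wedge \varepsilon _+)^{p-2}\), which maps \(\alpha _n\) to \((\varepsilon _-\vee \alpha _n\wedge \varepsilon _+)^{2-p}\) (our reading of the recursion), and indeed reproduces \(\alpha _n = \varepsilon _-^{(2-p)^n}\):

```python
import numpy as np

p, eps_minus, eps_plus = 1.5, 0.01, 100.0

alpha = 0.0                        # v_0 = 0, so alpha_0 = 0
for n in range(1, 15):
    # one Kacanov step: constant weight makes the linear solve explicit
    alpha = np.clip(alpha, eps_minus, eps_plus) ** (2 - p)
    # compare with the closed form alpha_n = eps_-^{(2-p)^n}
    assert abs(alpha - eps_minus ** ((2 - p) ** n)) < 1e-12
```

Since \((2-p)^n \rightarrow 0\), the iterates \(\alpha _n\) creep toward 1 only at the rate \((2-p)^n\), matching the asymptotics derived below.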
Note that
Moreover,
This estimate with \(s:=(2-p)^n \in (0,1]\) and \(t := \varepsilon _-^s \in (0,1]\) gives
is sharp. Indeed, the energy differences \(\mathcal {J}(v_n)-\mathcal {J}(u) = \mathcal {J}_\epsilon (v_n)-\mathcal {J}(u)\) asymptotically behave like \(\tfrac{1}{2}|B_1(0)|(p-1)\ln (\varepsilon _-)^2(2-p)^{2n}\) for large n, in view of (4.4).
This shows that it is impossible to get an energy reduction as in (4.2) with \(\delta \) independent of \(\epsilon \). Indeed, Corollary 19 would imply
which contradicts the above asymptotic estimate (4.5).
Nevertheless, our asymptotic shows that in this particular case
with \(1-\delta = (2-p)^2 <1\) independent of \(\epsilon \). Therefore, it remains open whether such an estimate holds in the general case.
Remark 21
Although our considerations are all under the assumption \(1<p\leqslant 2\), it is interesting to check how the algorithm performs in the case \(p> 2\) for our Example 20.
If \(p\geqslant 3\) and \(\varepsilon _+:= \tfrac{1}{\varepsilon _-}\) for some \(\varepsilon _-<1\), then it follows from (4.3) that
Therefore, the Kačanov iteration does not converge for \(p\geqslant 3\).
If \(p \in (2,3)\) and \(\varepsilon _+:= \tfrac{1}{\varepsilon _-}\), then it follows from (4.3) that
and \(v_n\) still converges to u.
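The same scalar recursion makes this dichotomy easy to observe numerically: each unclamped step multiplies \(\log \alpha _n\) by \(2-p\), which is a contraction precisely for \({|{2-p}|}<1\). A sketch with illustrative parameter values:

```python
# For Example 20 the recursion alpha_{n+1} = clamp(alpha_n)^(2-p) contracts on
# a log scale iff |2-p| < 1. We check both regimes with the choice
# eps_plus = 1/eps_minus (illustrative values, not from the paper).

def iterate(p, eps_minus, steps=60):
    eps_plus = 1.0 / eps_minus
    alpha = 0.0                              # starting value v_0 = 0
    history = []
    for _ in range(steps):
        alpha = min(max(alpha, eps_minus), eps_plus) ** (2.0 - p)
        history.append(alpha)
    return history

h1 = iterate(2.5, 0.5)   # p in (2,3): |log alpha| is halved each step, alpha -> 1
assert abs(h1[-1] - 1.0) < 1e-9

h2 = iterate(3.5, 0.5)   # p >= 3: the iterates lock into a 2-cycle, no convergence
assert abs(h2[-1] - h2[-3]) < 1e-12 and abs(h2[-1] - h2[-2]) > 0.1
```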
5 Algebraic rate
As we learned in the last section, the Kačanov iteration converges for fixed \(\epsilon \), but the rate depends badly on the choice of the relaxation interval \(\epsilon =[\varepsilon _-,\varepsilon _+]\). Furthermore, we have algebraic convergence of the error \(\mathcal {J}_\epsilon (u_\varepsilon )- \mathcal {J}(u)\) induced by the relaxation. We will combine these results to deduce an algebraic rate of the full error \(\mathcal {J}_{\epsilon _n}(v_n) -\mathcal {J}(u)\) in terms of n for a specific predefined choice of \(\epsilon _n\). To achieve our goal we will use that \(|\nabla u|\in L^{q,\infty }(\Omega )\) for some \(q>p\), which is justified by Theorems 13 and 14.
Let us consider a sequence of solutions created by our relaxed p-Kačanov algorithm. In particular, \(\epsilon _n=[\epsilon _{n,-},\epsilon _{n,+}]\) is now an increasing sequence of intervals. Then exactly as in Theorem 18 we get the following estimate.
Theorem 22
There is a constant \(c_K>1\) such that
holds for \( \delta _n := \tfrac{1}{c_K} (\tfrac{\epsilon _{n,-}}{\epsilon _{n,+}})^{2-p}\).
Since \(\epsilon _n \subset \epsilon _{n+1}\), we have \(\mathcal {J}_{\epsilon _{n+1}} \leqslant \mathcal {J}_{\epsilon _n}\). This and Theorem 22 imply
Now, Lemma 12 and \(|\nabla u|\in L^{q,\infty }(\Omega )\) ensure the existence of \(c_R>0\) such that
This and the previous estimate therefore imply
Without the last term \(\delta _{n+1} c_R(\epsilon _{n,-}^p + \epsilon _{n,+}^{-(q-p)})\) we would get a reduction of the error \(\mathcal {J}_{\epsilon _n}(v_n) -\mathcal {J}(u)\) by the factor \((1-\delta _n)\). On the other hand, this last term is small if \(\epsilon _{n,-} \rightarrow 0\) and \(\epsilon _{n,+} \rightarrow \infty \), so it should not cause much harm. Nevertheless, the reduction factor \((1-\delta _n)\) tends to 1 as \(\epsilon _{n,-} \rightarrow 0\) and \(\epsilon _{n,+} \rightarrow \infty \). The idea is therefore the following: if \(\delta _n\) goes to zero slowly, then the product \(\prod _{i=1}^n (1-\delta _i)\) still tends to zero algebraically.
Let us be more precise. We define another relaxed energy \(\mathcal {G}_n\) by
where \(K_1>0\) will be determined below. Moreover, choose
and define
Then
In particular, the algorithm with this choice of \(\epsilon _n\) reads as follows:
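A minimal one-dimensional finite-difference sketch of this iteration, using the choice \(\epsilon _n = [(n+1)^{-\alpha }, (n+1)^{\beta }]\) discussed in Remark 25, may help to fix ideas; all concrete values (p, \(\alpha \), \(\beta \), the grid, and the right-hand side) are our own illustrative assumptions:

```python
import numpy as np

# A 1d finite-difference sketch of the relaxed p-Kacanov iteration with the
# increasing relaxation intervals eps_n = [(n+1)^(-alpha), (n+1)^beta].
# Each step only solves one linear (tridiagonal) system.

p, alpha, beta = 1.5, 1.0, 1.0
N = 200                          # interior grid points on (0, 1)
h = 1.0 / (N + 1)
f = np.ones(N)                   # right-hand side f = 1

def grad(v):
    # gradients on the N+1 cells, homogeneous Dirichlet boundary values
    return np.diff(np.concatenate(([0.0], v, [0.0]))) / h

def phi_eps(s, lo, hi):
    # phi_eps(s) = int_0^s clamp(t, lo, hi)^(p-2) t dt in closed form
    return (lo**(p - 2) * np.minimum(s, lo)**2 / 2
            + (np.clip(s, lo, hi)**p - lo**p) / p
            + hi**(p - 2) * (np.maximum(s, hi)**2 - hi**2) / 2)

def relaxed_energy(v, lo, hi):
    return h * phi_eps(np.abs(grad(v)), lo, hi).sum() - h * (f * v).sum()

v = np.zeros(N)
for n in range(30):
    lo, hi = (n + 1.0)**(-alpha), (n + 1.0)**beta
    w = np.clip(np.abs(grad(v)), lo, hi)**(p - 2)     # frozen cell weights
    # assemble -(w v')' = f as a tridiagonal system
    A = (np.diag(w[:-1] + w[1:])
         - np.diag(w[1:-1], 1) - np.diag(w[1:-1], -1)) / h**2
    v = np.linalg.solve(A, f)

# sanity checks: v is positive, symmetric, and has negative relaxed energy
# (lo, hi still hold the last interval after the loop)
assert v.min() > 0 and np.allclose(v, v[::-1])
assert relaxed_energy(v, lo, hi) < 0
```

Here each pass through the loop is one relaxed Kačanov step: the weights are frozen at the previous iterate (clamped to \(\epsilon _n\)), and only a linear system is solved.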
We continue to derive a decay estimate for \(\mathcal {G}_n - \mathcal {G}_\infty \).
Lemma 23
There exists \(K=K(\alpha ,\beta ,p,q)\) (which appears in the definition of \(\mathcal {G}_n\)) and some \(c_3=c_3(\alpha ,\beta ,p,q)\geqslant 1\), such that for all \(n \in \mathbb {N}\)
Proof
Define
Hence it follows by Lemma 27 in the Appendix that there exists \(c_2 = c_2(\alpha ,\beta ,p,q) \geqslant 1\) with
In particular, \(\rho _n\) satisfies a decay estimate!
On the other hand it follows from Theorem 22 that
We deduce from (5.1), the definition of \(\epsilon _n\) and \(\rho _n\) that
This and the previous estimate prove
Since \(\mathcal {G}_n = \mathcal {J}_{\epsilon _n}(v_n) + K_1\rho _n\), it follows together with (5.7) that
We finally fix \(K_1\): we choose \(K_1\) so large that
which is always possible. Combining this with our previous estimates we deduce
This proves the theorem with \(c_3 = 2\, \max {\{{c_K,c_2}\}}\). \(\square \)
We are now able to present our convergence result.
Theorem 24
Let \(\nabla u\in L^{q,\infty }(\Omega )\) for some \(q>p\) (as given for example in Theorem 13 or Theorem 14). Then the sequence \((v_n)_{n\in \mathbb N}\) produced by the algorithm described above satisfies
where \(c_3\) is the constant of Lemma 23. In particular, the energy error decreases at least algebraically.
Proof
The estimate \(\mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}(u) \leqslant \mathcal {G}_n-\mathcal {G}_\infty \) is obvious, so it remains to prove the decay of \(\mathcal {G}_n - \mathcal {G}_\infty \). It follows from Lemma 23 that, for \(n \in \mathbb {N}\),
Now,
This proves the theorem. \(\square \)
Remark 25
We have seen that the choice \(\epsilon _n = [(n+1)^{-\alpha }, (n+1)^\beta ]\) ensures that the error decreases at least with an algebraic rate. However, the decay of the relaxed energy error \(\mathcal {G}_n- \mathcal {G}_\infty \) can never be faster than algebraic with this choice of \(\epsilon _n\). Hence, this choice is also very restrictive. From the numerical experiments we performed, we have seen that it is possible to decrease \(\epsilon _{n,-}\) and increase \(\epsilon _{n,+}\) much faster and still obtain convergence. Moreover, the observed convergence is much faster than algebraic, rather of exponential type. We will present the details of such numerical experiments in a subsequent work. Let us summarize: the algorithm of this section ensures an algebraic convergence rate, but in practice we expect a better behavior for other, perhaps adaptive, choices of \(\epsilon _n\), still to be fully investigated.
6 Numerical experiments
We have performed numerous experiments on the basis of the adaptive finite element method with piecewise linear elements. We developed preliminary versions of error estimators that capture the effect of the truncation, the adaptivity of the mesh and the fixpoint iteration. Let \(v_n\) denote the iterated solution generated by the algorithm; then we used the following ad hoc estimators:
We use
$$\begin{aligned} \eta ^2_{\epsilon ^+}(v_n)&:= \mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}_{(\epsilon _{n,-},\infty )}(v_n) \end{aligned}$$to measure the effect of the upper truncation bound \(\epsilon _{n,+}\) and
$$\begin{aligned} \eta ^2_{\epsilon ^-}(v_n)&:= \mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}_{(0,\epsilon _{n,+})}(v_n) \end{aligned}$$to measure the effect of the lower truncation bound \(\epsilon _{n,-}\).
We use the optimal estimators of [2, 12] with the Orlicz function \(\phi _{\epsilon _n}\) to estimate the error due to mesh refinement, i.e. on elements T we use the estimators
$$\begin{aligned} \eta ^2_h(v_n)&:= \int _T (\phi _{\epsilon _n, {|{\nabla v_n}|}})^*(h_T {|{f}|})\,dx + \sum _{\gamma \subset \partial T} h_\gamma \! \int _\gamma {\big |{\llbracket {V_{\epsilon _n}(\nabla v_n)}\rrbracket _\gamma }\big |}^2\,ds. \end{aligned}$$To measure the error due to the fixpoint iteration (for \(n \geqslant 1\))
(6.1) which is in fact an upper bound for \(\mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon )\) and \(\mathcal {J}_\epsilon (v_{n-1})-\mathcal {J}_\epsilon (u_\varepsilon )\).
We used these estimators to implement a fully adaptive version of our relaxed p-Kačanov iteration.
Let us present three experiments that measure different aspects. We have chosen quite critical situations and small exponents p in order to see how the algorithm behaves in particularly bad situations. In particular, we have chosen the quite small exponent \(p= \frac{16}{15}\) for all of our experiments presented here. The results for exponents closer to 2 are in fact much nicer.
The bump On \(\Omega = [-1,1]^2\) choose f and the boundary values of u such that \(u(x)=(x_1^2\!-\!1)(x_2^2\!-\!1)\). There is only one critical point at (0, 0), where \(\nabla u(0,0)=0\). This example is chosen to see how the algorithm behaves for smooth functions with isolated extrema.Footnote 3
The needle On \(\Omega = [-1,1]^2\) choose f and the boundary values of u such that \(u(x) = {|{x}|}^{1-\frac{1}{p}} -1\). In this case \(\nabla u(x) = {|{x}|}^{-\frac{1}{p}} \frac{x}{{|{x}|}}\), \(A(\nabla u) = {|{x}|}^{-\frac{1}{p'}} \frac{x}{{|{x}|}}\) and \(V(\nabla u) = {|{x}|}^{-\frac{1}{2}} \frac{x}{{|{x}|}}\). This example is chosen such that \(V(\nabla u) \in W^{\frac{1}{2}}L^{2,\infty }(\Omega )\) meaning that half a derivativeFootnote 4 of \(V(\nabla u)\) is still in the Lorentz space \(L^{2,\infty }\). This corresponds to the regularity of a p-harmonic function on a slit domain. It is the critical regularity that allows optimal estimates for the adaptive finite element method, while uniform refinement only yields a rate of \(O(h^{1/2})\) instead of the optimal rate O(h).
Constant force in slit domain Choose \(f=2\) on the slit domain \(\Omega = (-1,1)^2 \setminus [0,1]^2\) and \(u=0\) on its boundary. In contrast to the linear case \(p=2\), there is no explicit formula for the exact solution. The reentrant corner reduces the regularity of the solution.
There is a noticeable gap between our theory and the numerical experiments that we performed: in fact, our experiments show a significantly faster convergence rate. This shows that we are on a good track and have developed a good algorithm.
Figures 1, 2 and 3 show the results of our experiments. The pictures show a log-log plot of energy accuracy vs. the computed costs. Let us explain the picture from Fig. 1 in more detail; the other figures are analogous. The dotted black line is the most important line and shows the energy difference \(\mathcal {J}_\epsilon (u_n)-\mathcal {J}(u)\), which is the measure of the error between the computational solution and the exact solution. The lines \(\epsilon _+\) and \(\epsilon _-\) represent the truncation parameters \((\epsilon _-,\epsilon _+)\). The other lines represent the error indicators \(\eta ^2_{\epsilon ^+}\) and \(\eta ^2_{\epsilon ^-}\) for \((\epsilon _-,\epsilon _+)\), the fixpoint iteration, and the refinement \(\eta _h^2(u_h)\). The straight black line \(\text {costs}^{-1}\) is the optimal convergence rate (1/costs), where the costs are the accumulated sum of the degrees of freedom over all steps that require solving a linear system. (Here we have assumed a linear cost for solving the linear system, which might be achievable with a multigrid method.) It is important that we use the accumulated cost instead of the degrees of freedom, since only this truly measures the effort, in particular if the number of fixpoint iterations increases.
Let us explain the numerical results in more detail.
Bump The algorithm converges optimally with respect to the accumulated costs. The exact solution has a bounded gradient, so the upper truncation bound \(\epsilon _{n,+}\) does not increase much. There is only one critical point at \(x=0\), where \(\nabla u(0)=0\). Thus, the algorithm decreases the lower truncation bound \(\epsilon _{n,-}\) quite moderately, since the error from truncation appears only in a small neighborhood around zero. All estimators decrease nicely with the optimal rate (1/cost). The number of fixpoint iterations remains bounded, so the cost is proportional to the current number of degrees of freedom.
The needle The algorithm converges optimally with respect to the accumulated costs. The exact solution has an unbounded gradient at the isolated point zero. Therefore, the upper truncation bound \(\epsilon _{n,+}\) is increased by the algorithm. This upper truncation starts quite late, since it is only necessary in a set of small measure. The number of fixpoint iterations remains bounded.
Constant force in slit domain The gradient of the exact solution is bounded, so the upper truncation bound \(\epsilon _{n,+}\) does not increase much. The energy error still decreases very fast. It decreases almost optimally with respect to the accumulated cost, but the slope seems slightly worse. It is however still much faster than our worst-case theory predicts.
Overall, we see that our algorithm converges with a rate which is optimal in many cases.
Notes
All our results also hold for the p-Poisson system, where the functions are vector-valued. To this end sometimes \(\mathbb {R}^d\) has to be replaced by \(\mathbb {R}^{N \times d}\).
The condition \(\partial \Omega \in W^2 L^{d-1,1}\) means that the weak Hessians of the local maps characterizing the boundary are in the Lorentz space \(L^{d-1,1}\).
Note that \({|{f(x)}|}\) behaves like \({|{x}|}^{p-2}\). Thus, \(f \in L^2\) for all \(p>1\) but \(f \in L^{p'}\) only for \(p > \sqrt{2}\). This potentially causes trouble with the error estimator used, but since the error estimator is also truncated with \(\epsilon \), the effect is manageable.
This is understood in a heuristic way only: half a derivative of \(V(\nabla u)\) grows like the function \({|{x}|}^{-1}\), which is in the Lorentz space \(L^{2,\infty }(\Omega )\).
We thank W. Sickel for explaining the details to us.
See the beginning of Sect. 3 for the definition of an N-function.
References
Acerbi, E., Fusco, N.: An approximation lemma for \(W^{1,p}\) functions. In: Material Instabilities in Continuum Mechanics (Edinburgh, 1985–1986), Oxford Univ. Press, New York, pp. 1–5 (1988)
Belenki, L., Diening, L., Kreuzer, C.: Optimality of an adaptive finite element method for the \(p\)-Laplacian equation. IMA J. Numer. Anal. 32(2), 484–510 (2012)
Braides, A.: \(\Gamma \)-Convergence for Beginners. Oxford Lecture Series in Mathematics and its Applications, vol. 22. Oxford University Press, Oxford (2002)
Bulíček, M., Diening, L., Schwarzacher, S.: Existence, uniqueness and optimal regularity results for very weak solutions to nonlinear elliptic systems. Anal. PDE 9(5), 1115–1151 (2016)
Canuto, C., Urban, K.: Adaptive optimization of convex functionals in Banach spaces. SIAM J. Numer. Anal. 42(5), 2043–2075 (2005)
Chambolle, A., Lions, P.L.: Image recovery via total variation minimization and related problems. Numer. Math. 76(2), 167–188 (1997)
Cianchi, A., Maz’ya, V.: Global Lipschitz regularity for a class of quasilinear elliptic equations. Commun. Partial Differ. Equ. 36(1), 100–133 (2010). https://doi.org/10.1080/03605301003657843
Cruz-Uribe, D., Martell, J.M., Pérez, C.: Extrapolation from \(A_\infty \) weights and applications. J. Funct. Anal. 213(2), 412–439 (2004)
Dal Maso, G.: An Introduction to \(\Gamma \)-Convergence. Progress in Nonlinear Differential Equations and their Applications, vol. 8. Birkhäuser Boston Inc, Boston (1993)
Daubechies, I., DeVore, R., Fornasier, M., Güntürk, C.S.: Iteratively reweighted least squares minimization for sparse recovery. Commun. Pure Appl. Math. 63(1), 1–38 (2010)
Diening, L., Ettwein, F.: Fractional estimates for non-differentiable elliptic systems with general growth. Forum Mathematicum 20(3), 523–556 (2008). https://doi.org/10.1515/FORUM.2008.027
Diening, L., Kreuzer, C.: Linear convergence of an adaptive finite element method for the \(p\)-Laplacian equation. SIAM J. Numer. Anal. 46, 614–638 (2008)
Diening, L., Kreuzer, C., Süli, E.: Finite element approximation of steady flows of incompressible fluids with implicit power-law-like rheology. SIAM J. Numer. Anal. 51(2), 984–1015 (2013). https://doi.org/10.1137/120873133
Ebmeyer, C.: Mixed boundary value problems for nonlinear elliptic systems with \(p\)-structure in polyhedral domains. Mathematische Nachrichten 236(1), 91–108 (2002)
Edmunds, D., Evans, W., Karadzhov, G.: Sharp estimates of the embedding constants for Besov spaces. Rev. Mat. Complut. 19(1), 161–182 (2006)
Fornasier, M., Rauhut, H., Ward, R.: Low-rank matrix recovery via iteratively reweighted least squares minimization. SIAM J. Optim. 21(4), 1614–1640 (2011)
Garau, E.M., Morin, P., Zuppa, C.: Convergence of an adaptive Kačanov FEM for quasi-linear problems. Appl. Numer. Math. 61(4), 512–529 (2011)
Han, W., Jensen, S., Shimansky, I.: The Kačanov method for some nonlinear problems. Appl. Numer. Math. 24(1), 57–79 (1997)
Krasnosel’skiĭ, M., Rutickiĭ, Y.: Convex Functions and Orlicz Spaces. Cambridge Univ Press, Cambridge (1961)
Kufner, A., Oldrich, J., Fučík, S.: Function Spaces. Noordhoff International Pub, Leyden (1977)
Ladyzhenskaya, O.: New equations for the description of motion of viscous incompressible fluids and solvability in the large of boundary value problems for them. Proc. Stek. Inst. Math. 102, 95–118 (1967)
Lindqvist, P.: Notes on the \(p\)-Laplace equation. http://www.math.ntnu.no/lqvist/p-laplace.pdf (2006). Accessed 28 Nov 2017
Peetre, J.: Espaces d’interpolation et théorème de Soboleff. Ann. Inst. Fourier (Grenoble) 16(1), 279–317 (1966)
Ružička, M., Diening, L.: Non-Newtonian Fluids and Function Spaces. NAFSA 8–Nonlinear Analysis. Function Spaces and Applications, vol. 8, pp. 94–143. Czech. Acad. Sci, Prague (2007)
Stein, E., Murphy, T.: Harmonic Analysis: Real-Variable Methods, Orthogonality, and Oscillatory Integrals. Princeton Mathematical Series. Princeton University Press, Princeton (1993)
Triebel, H.: Interpolation Theory, Function Spaces, Differential Operators. North-Holland Pub. Co, Amsterdam (1978)
Vogel, C.R., Oman, M.: Fast, robust total variation-based reconstruction of noisy, blurred images. IEEE Trans. Image Process. 7(6), 813–824 (1998)
Wank, M.: A Kačanov Type Iteration for the \(p\)-Poisson Problem. Doctoral thesis, University of Osnabrück (2016)
Acknowledgements
Open Access funding provided by Projekt DEAL. Many thanks to Johannes Storn who did the final numerical experiments of this article. Finally, we thank the anonymous reviewer for the careful reading of the manuscript.
We acknowledge support for the Article Processing Charge by the Deutsche Forschungsgemeinschaft and the Open Access Publication Fund of Bielefeld University. This research was supported by the DFG project “Optimal Adaptive Numerical Methods for p-Poisson Elliptic Equations”.
Appendices
Auxiliaries
The following embedding is probably well known. However, since we could not find a reference for it and we need it for the proof of Theorem 14, we include a short proof of it.Footnote 5 In the following \(\mathcal {N}^{\frac{1}{2},2}\) denotes the usual Nikolskiĭ space, see e.g. [20].
Lemma 26
\(\mathcal {N}^{\frac{1}{2},2}(\Omega )\hookrightarrow L^{\frac{2d}{d-1},\infty }(\Omega )\).
Proof
We will not recapitulate the definitions of the occurring spaces. First of all, we use the identity
as stated in [20, Remark 8.4.5], where \(B_{p,q}^s(\Omega )\) denotes the standard Besov spaces. In [26, Theorems 1 and 2 in 4.3.1] we find the interpolation pair \(\{B^{\frac{1}{4}}_{2,1} (\Omega ), B^{\frac{3}{4}}_{2,1} (\Omega )\}\) such that
holds. The embeddings (see [15] respectively [23])
yield
Finally, by [26, Theorem 2 in 1.18.6] we get
This proves the claim. \(\square \)
Moreover, in the proof of Lemma 23 we used the following algebraic estimate:
Lemma 27
Let \(\gamma >0\). Then for all \(n\geqslant 1\) we have
Proof
We define \(h:[0,\tfrac{1}{2}]\rightarrow \mathbb {R}\) via \(h(t):=1-(1-t)^\gamma \). Note that \(h'(t) = \gamma (1-t)^{\gamma -1}\) and \(h''(t) = \gamma (1-\gamma )(1-t)^{\gamma -2}\). For \(\gamma \geqslant 1\) this implies that h is concave, so
On the other hand, if \(\gamma \in (0,1)\), the function h is convex. Therefore,
This implies \(h(t)\geqslant \min \{\gamma , 2(1-2^{-\gamma })\}t\). Therefore, we get
This proves the claim. \(\square \)
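The pointwise bound behind this proof, \(1-(1-t)^\gamma \geqslant \min \{\gamma ,\, 2(1-2^{-\gamma })\}\,t\) for \(t\in [0,\tfrac{1}{2}]\), is easy to confirm numerically; the sampled values of \(\gamma \) below are arbitrary illustrative choices:

```python
import numpy as np

# Check of the pointwise inequality 1 - (1-t)^gamma >= min(gamma, 2(1-2^-gamma)) t
# on [0, 1/2], for convex (gamma < 1) and concave (gamma >= 1) cases alike:
# the tangent at 0 works in the convex case, the chord through (1/2, h(1/2))
# in the concave case.

t = np.linspace(0.0, 0.5, 1001)
for gamma in (0.3, 0.7, 1.0, 2.5, 7.0):
    c = min(gamma, 2.0 * (1.0 - 2.0**(-gamma)))
    assert np.all(1.0 - (1.0 - t)**gamma >= c * t - 1e-12)
```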
On uniformly convex Orlicz functions
In this appendix we introduce the concept of uniformly convex Orlicz functions and their shifted versions. The results presented below are modifications of [11, Lemma 3], [12, Lemma 16], and [12, Corollary 26]. However, since we use here a slightly different notion of shifted functions and less regularity for our Orlicz functions, we decided to include a proof and keep our paper self-contained. Throughout this section we assume that our Orlicz function satisfies the following assumptions.
Definition 28
Let \(\phi \) be an N-function.Footnote 6 We say that \(\phi \) is uniformly convex if there exist \(c_4,c_5 >0\) with
The choice \(t=0\) implies that \(c_4 \leqslant 1 \leqslant c_5\). Moreover, (B.1) implies that \(\phi '\) is strictly increasing.
For any N-function \(\rho \) the complementary (or dual) N-function \(\rho ^*\) is given by \(\rho ^*(u) = \sup _{t \geqslant 0} (ut - \rho (t))\). This is equivalent to \((\rho ^*)' = (\rho ')^{-1}\). Note that \((\rho ^*)^*=\rho \).
Lemma 29
If \(\phi \) is a uniformly convex N-function with constants \(c_4\) and \(c_5\), then \(\phi ^*\) is a uniformly convex N-function with constants \(1/c_5\) and \(1/c_4\).
Proof
Let \(s=(\phi ^*)'(u)\) and \(t=(\phi ^*)'(v)\). Then by strict monotonicity of \(\phi '\) and \((\phi ^*)'\) condition (B.1) is equivalent to
Taking the reciprocal proves the uniform convexity of \(\phi ^*\). The reverse conclusion follows by duality, i.e. \(\phi =(\phi ^*)^*\). \(\square \)
Lemma 30
Let \(\phi \) be an N-function which is piecewise \(C^2\) on \((0,\infty )\). Then \(\phi \) is uniformly convex if and only if
for some \(c_6,c_7>0\). This is in fact the uniform convexity condition in [11].
Proof
Note that \(s \searrow t\) in (B.1) implies (B.2) with \(c_6=c_4\) and \(c_7=c_5\). So assume now that (B.2) holds.
If \(s \geqslant 2t\), then \(s-t \geqslant s/2\) and the upper bound of (B.1) is obvious with \(c_7=2\). So let us assume \(t < s \leqslant 2t\). Then
Now \(\log (1+a) \leqslant a\) for \(a\geqslant 0\) and \(s \leqslant 2t\) imply
This proves the upper bound of (B.1). The lower bound follows by duality: Indeed, it follows from
that \(\phi ^*\) satisfies (B.2) with constants \(1/c_7\) and \(1/c_6\). Thus, by what is already proven, \(\phi ^*\) satisfies the upper estimate of (B.1). By duality, see Lemma 29, we get the lower estimate of \(\phi \) in (B.1). \(\square \)
Lemma 31
Let \(\phi \) be a uniformly convex N-function. Then the functions \(\phi \) and \(\phi ^*\) satisfy the \(\Delta _2\)-condition, with constants only depending on \(c_4\) and \(c_5\).
Proof
Using \(s=\lambda t\) with \(\lambda >1\) in (B.1) we obtain
Now, we can choose \(\lambda _0 > 1\) such that \(\mu := (1-c_4 \frac{\lambda _0-1}{\lambda _0}) > \frac{1}{2}\). So, we have \(\phi '(\lambda _0 t) \leqslant 2 \phi '(t)\). From this it follows by iteration (also using the monotonicity of \(\phi '\)) that \(\phi '(2t) \leqslant \tilde{c}\, \phi '(t)\), where \(\tilde{c}\) only depends on \(\lambda _0\) and \(\mu \), and therefore only on \(c_4\). Thus, it follows that
Hence, \(\phi (2t) \leqslant (1+2\tilde{c}) \phi (t)\), which proves the claim for \(\phi \). The claim for \(\phi ^*\) follows by duality with Lemma 29. \(\square \)
Remark 32
Let \(\phi \) be an N-function such that \(\phi \) and \(\phi ^*\) satisfy the \(\Delta _2\)-condition. Then for every \(s,t \geqslant 0\) we have by Young’s inequality
This and \(\phi (t) \leqslant \phi '(t)\,t \leqslant \phi (2t)\) imply that for all \(s,t\geqslant 0\)
where \(c_\delta \) depends only on \(\delta \) and the \(\Delta _2\)-constants.
For each \(a\geqslant 0\) we define the shifted N-function \(\phi _a\) by its derivative
and \(\phi _a(t) = \int _0^t \phi _a'(\tau )\,d\tau \). In the notation of Section 3 this is just \(\phi _\epsilon \) with \(\epsilon =(a,\infty )\), see also Remark 2.
Lemma 33
There holds \((\phi _a)^* = (\phi ^*)_{\phi '(a)}\).
Proof
Note that \(((\phi ^*)_{\phi '(a)})'(u) = \frac{(\phi ^*)'(u \vee \phi '(a))}{u \vee \phi '(a)} u\) is the inverse of \(\phi _a'(t)\). Thus \((\phi ^*)_{\phi '(a)}\) and \((\phi _a)^*\) are conjugate to each other. \(\square \)
Remark 34
The shifted N-functions were originally introduced in [11] with the modified definition \(\phi _a'(t) = \frac{\phi '(t + a)}{t+a}\,t\). This original version shares almost all properties with the version of this paper. However, our exact formula \((\phi _a)^* = (\phi ^*)_{\phi '(a)}\) of Lemma 33 only holds up to equivalence in [11]. This is one of the advantages of our new definition.
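The exact conjugation formula \((\phi _a)^* = (\phi ^*)_{\phi '(a)}\) can be tested numerically for the model N-function \(\phi (t)=t^p/p\), for which both sides have closed forms. The grid-based Legendre transform below is a sketch; the values of p and a are illustrative:

```python
import numpy as np

# Numerical check of Lemma 33, (phi_a)^* = (phi^*)_{phi'(a)}, for the model
# N-function phi(t) = t^p / p (so phi^*(u) = u^q / q with q = p/(p-1)).

p, a = 1.5, 0.7
q = p / (p - 1.0)
b = a ** (p - 1.0)                       # b = phi'(a)

def shifted(r, sh, s):
    # closed form of the shifted function for f(t) = t^r / r with shift sh:
    # quadratic below the shift, the original power function above it
    return np.where(s <= sh,
                    sh**(r - 2) * s**2 / 2,
                    sh**r / 2 + (np.maximum(s, sh)**r - sh**r) / r)

u = np.linspace(0.0, 3.0, 61)
t = np.linspace(0.0, 12.0, 240001)       # fine grid for the Legendre transform
phi_a = shifted(p, a, t)
legendre = np.array([np.max(ui * t - phi_a) for ui in u])   # (phi_a)^*(u)
target = shifted(q, b, u)                # (phi^*)_{phi'(a)}(u)
assert np.max(np.abs(legendre - target)) < 1e-4
```

The discrete supremum matches the closed form of \((\phi ^*)_{\phi '(a)}\) up to grid resolution, consistent with the exactness claimed in Lemma 33.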
Given our uniformly convex N-function \(\phi \) we define the stress A and the auxiliary function V as in Definition 7 by
Definition 35
For \(P\in \mathbb {R}^d\) we define
Lemma 36
Let \(\phi \) be a uniformly convex N-function with constants \(c_4\) and \(c_5\). Define \(\psi \) via its derivative by \(\frac{\psi '(s)}{s} := \sqrt{\frac{\phi '(s)}{s}}\) and let \(\psi (t) := \int _0^t \psi '(s)\,ds\). Then \(\psi \) is a uniformly convex N-function with constants \(\frac{1}{2}\) and \(1+c_5\). Moreover, \(V(P) = \frac{\psi '({|{P}|})}{{|{P}|}} P\).
Proof
For \(s>t\) we calculate
Now, with (B.1)
and
Overall, we get \(\frac{1}{2} \frac{\psi '(s)}{s} \leqslant I \leqslant (1+c_5) \frac{\psi '(s)}{s}\), which proves the claim. \(\square \)
Lemma 37
Let \(\phi \) be a uniformly convex N-function. Then \(\phi _a\) is also uniformly convex with constants \(c_4\) and \(c_5\) replaced by \(c_4\) and \(c_5+1\).
Proof
For \(s>t\) define
If \(s,t \geqslant a\), then \(\phi _a'(s)=\phi '(s)\) and \(\phi _a'(t)=\phi '(t)\). Hence \(c_4 II \leqslant I \leqslant c_5 II\) by assumptions on \(\phi \).
If \(s,t \leqslant a\), then \(\phi _a'(s) = \frac{\phi '(a)}{a}\, s\) and \(\phi _a'(t) = \frac{\phi '(a)}{a}\, t\), so \(I= \frac{\phi '(a)}{a} =II\). Since \(c_4 \leqslant 1 \leqslant c_5\), the claim follows in this case.
It remains to consider \(s> a > t\). Then \(\phi _a'(s)=\phi '(s)\) and \(\phi _a'(t) = \frac{\phi '(a)}{a}\, t\), so
Thus,
On the other hand, using that \(a \mapsto \frac{a-t}{a}\) is increasing in a, we get
This proves the claim. \(\square \)
Remark 38
It follows from Lemmas 37 and 31 that the families \({\{{\phi _a}\}}_{a \geqslant 0}\) and \({\{{(\phi _a)^*}\}}_{a \geqslant 0}\) satisfy the \(\Delta _2\)-condition with constants independent of a.
Lemma 39
Let \(\phi \) be a uniformly convex N-function with constants \(c_4\) and \(c_5\). Then for all P, Q we have
Proof
We define \(\hat{P} := \frac{P}{{|{P}|}}\), \(\hat{Q} := \frac{Q}{{|{Q}|}}\), \(\theta := \hat{P} : \hat{Q}\),
Then
and
We need to estimate \(f({|{P}|},{|{Q}|},\theta )/g({|{P}|},{|{Q}|},\theta )\). Both f and g are non-negative and linear in \(\theta \). Hence \(\frac{f}{g}\) is monotone in \(\theta \). In particular, it suffices to control the cases \(\theta =1\) and \(\theta =-1\). We begin with the simple case \(\theta =-1\).
Since \(a \vee b \leqslant a+b \leqslant 2 (a \vee b)\), this implies
Now consider the case \(\theta =1\). Without loss of generality we can assume \({|{P}|} \geqslant {|{Q}|}\).
Now, (B.1) implies
Combining the two cases proves the first claim. To prove the second one we assume again \({|{P}|} \geqslant {|{Q}|}\) and estimate
This proves the second claim. \(\square \)
Lemma 40
Let \(\phi \) be a uniformly convex N-function with constants \(c_4\) and \(c_5\). Then for all \(P,Q \in \mathbb {R}^d\)
where the constants only depend on \(c_4\) and \(c_5\).
Proof
The estimates follow from \(\frac{1}{2} ({|{Q}|} + {|{P-Q}|}) \leqslant {|{P}|} \vee {|{Q}|} \leqslant 2 ({|{Q}|} + {|{P-Q}|})\), the \(\Delta _2\)-property of \(\phi \) (see Lemma 31) and \(\phi _{{|{Q}|}}'(t)\,t \eqsim \phi _{{|{Q}|}}(t)\). \(\square \)
Lemma 41
Let \(\phi \) be a uniformly convex N-function with constants \(c_4\) and \(c_5\) and let A and V be as in Definition 35. Then for all \(P,Q\in \mathbb {R}^d\)
where the constants only depend on \(c_4\) and \(c_5\).
Proof
The first equivalence in the first claim and the second claim follow immediately from Lemmas 39 and 40. It remains to prove the second equivalence of the first claim. For this we recall \(\psi \) from Lemma 36 and observe that V is induced by \(\psi \) exactly as A is induced by \(\phi \), see Definition 35. In particular, we have \({|{V(P)-V(Q)}|} \eqsim \psi _{{|{Q}|}}'({|{P-Q}|})\). This proves
This proves the claim. \(\square \)
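For the model function \(\phi (t)=t^p/p\) one has \(A(P)={|{P}|}^{p-2}P\) and \(V(P)={|{P}|}^{(p-2)/2}P\), and the equivalence between \((A(P)-A(Q))\cdot (P-Q)\) and \({|{V(P)-V(Q)}|}^2\) can be explored by random sampling. In the sketch below, p and the checked bounds are generous illustrative choices, not the sharp constants:

```python
import numpy as np

# Random-sampling check of the equivalence in Lemma 41 for phi(t) = t^p / p,
# where A(P) = |P|^(p-2) P and V(P) = |P|^((p-2)/2) P. The ratio of
# (A(P)-A(Q)).(P-Q) to |V(P)-V(Q)|^2 should stay within p-dependent bounds.

rng = np.random.default_rng(0)
p = 1.5

def A(P):
    return np.linalg.norm(P) ** (p - 2) * P

def V(P):
    return np.linalg.norm(P) ** ((p - 2) / 2) * P

ratios = []
for _ in range(2000):
    P, Q = rng.normal(size=2), rng.normal(size=2)
    num = np.dot(A(P) - A(Q), P - Q)
    den = np.linalg.norm(V(P) - V(Q)) ** 2
    if den > 1e-12:
        ratios.append(num / den)

ratios = np.array(ratios)
assert ratios.min() > 0.2 and ratios.max() < 5.0   # generous bounds for p = 1.5
```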
Lemma 42
The following estimates hold for arbitrary \(v\in W_0^{1,\phi }(\Omega )\) and u being the minimizer of \(\mathcal {J}(w) := \int _\Omega \phi ({|{\nabla w}|}) - fw\,dx\):
Proof
It follows by convexity and \(\mathcal {J}'(u)=0\) that
Since \(\mathcal {J}'(w)(\xi ) = \int _\Omega A(\nabla w) \nabla \xi - f\,\xi \,dx\), this proves the first estimate. The equivalence in the claim then follows by Lemma 41. For the last estimate we calculate, using a Taylor expansion, \(\mathcal {J}'(u)=0\), Lemma 41 and the uniform \(\Delta _2\)-condition of the family \(\phi _a\), that
This proves the claim. \(\square \)
The following lemma is a sharper version of Lemma 25 and Lemma 27 of [12].
Lemma 43
(Shift-change) For all \(a,b \geqslant 0\) and \(t \geqslant 0\), there holds
Proof
We begin with the proof of (B.5). The equivalence in (B.5) follows from Lemma 40. Thus the claim is symmetric in a and b and we can assume that \(a \leqslant b\). If \(a \leqslant b \leqslant t\), then \(\phi _a'(t)=\phi _b'(t)=0\) and (B.5) follows. If \(t \leqslant a \leqslant b\), then with Lemma 41
This proves (B.5). It remains to consider the case \(a \leqslant t \leqslant b\). In this situation we estimate using the previous two cases
This proves the remaining case of (B.5). To prove (B.6) we estimate with Lemma 41 and \((\phi _a^*)' = (\phi _a')^{-1}\)
This proves the claim. \(\square \)
With Lemma 43 we deduce the following estimates similar to Corollary 26 and Corollary 28 of [12].
Corollary 44
(Shift-change) For all P, Q and all \(t \geqslant 0\) there holds
Proof
We estimate with Lemma 43, Young’s inequality (see Remark 32) with \(\phi _{{|{Q}|}}\) and Lemma 41
This proves the first inequality. We can exchange \(\delta \) and \(c_\delta \) within the proof to get the second inequality (see Remark 32). The other estimates follow analogously using \((\phi _{{|{Q}|}})^* = (\phi ^*)_{\phi '({|{Q}|})}\). \(\square \)
Diening, L., Fornasier, M., Tomasi, R. et al.: A relaxed Kačanov iteration for the p-Poisson problem. Numer. Math. 145, 1–34 (2020). https://doi.org/10.1007/s00211-020-01107-1