1 Introduction

In this article, we consider a random d-regular graph on N vertices, under the uniform probability measure. Let \(\mathcal A \in \mathbb R^{N\times N}\) be the adjacency matrix of the graph, and denote its eigenvalues by \(\lambda _1\geqslant \cdots \geqslant \lambda _N\). It is easy to see that \(\lambda _1=d\) with corresponding eigenvector \({{\textbf {e}}}{:}{=}N^{-1/2}(1,1,\ldots ,1)^*\). The behavior of the nontrivial extreme eigenvalues of \(\mathcal {A}\) is of particular interest in graph theory and computer science. For instance, the gap between the first and second eigenvalues measures the expansion property of the graph. For a deterministic d-regular graph on N vertices, the Alon-Boppana bound [2] states that

$$\begin{aligned} \lambda _2, |\lambda _N| \geqslant 2\sqrt{d-1}(1-o(1)) \end{aligned}$$

for d fixed and N large enough. A Ramanujan graph is a d-regular graph whose nontrivial eigenvalues are bounded in absolute value by \(2\sqrt{d-1},\) i.e. it is a graph that essentially saturates the Alon-Boppana bound. Ramanujan graphs were first constructed by Lubotzky, Phillips and Sarnak [34], and by Margulis [37] for some values of d. The construction of Ramanujan graphs in the bipartite case for all degrees was given by Marcus, Spielman and Srivastava [35, 36]. For the random d-regular graph \(\mathcal {A}\), when d is fixed, Friedman [21] showed that, for sufficiently large N,

$$\begin{aligned} \lambda _2, |\lambda _N| \leqslant 2\sqrt{d-1}(1+o(1)) \end{aligned}$$

with high probability (the proof was later substantially simplified by Bordenave [8]). This means that a random d-regular graph is typically “almost Ramanujan”. More recently, Huang, McKenzie and Yau [28] (following Huang and Yau [29]) extended this result by showing the near-optimal rate

$$\begin{aligned} \lambda _2, |\lambda _N| \leqslant 2\sqrt{d-1}(1+O(N^{-2/3+o(1)})) \end{aligned}$$

with probability \(1-N^{-1+o(1)}\).

For \(d \leqslant N/2\) tending to infinity with N, Vu [43] conjectured that

$$\begin{aligned} \lambda _2, |\lambda _N| =2\sqrt{d(N-d)/N}\,(1+o(1)) \end{aligned}$$
(1.1)

with high probability. The magnitude bound \(\lambda _2+|\lambda _N|=O(\sqrt{d})\) with high probability was proved by Broder, Frieze, Suen and Upfal [11] for \(d=o(\sqrt{N})\); by Cook, Goldstein and Johnson [12] for \(d=O(N^{2/3})\); by Tikhomirov and Youssef [42] for all \(d \leqslant N/2\). The eigenvalue locations were proved to satisfy \(\lambda _2, |\lambda _N| =2\sqrt{d-1}(1+o(1))\) in the regime \(N^{o(1)}\leqslant d \leqslant N^{2/3-o(1)}\), by Bauerschmidt, Huang, Knowles and Yau [4]. Very recently, Sarid [39] proved (1.1) for \(1\ll d\leqslant cN\), where c is a small constant.

Our first main result determines the extreme eigenvalue locations in the regime \(N^{2/3+o(1)}\leqslant d \leqslant N/2\), with optimal error bounds. Together with [4, 39], we settle the conjecture (1.1) in the whole regime \(1 \ll d \leqslant N/2\). We may now state our first main result.

Theorem 1.1

Fix \(\tau >0\), \(k\geqslant 2\), and assume \(N^{2/3+\tau }\leqslant d\leqslant N/2\). For any fixed \(\varepsilon ,D>0\), we have

$$\begin{aligned} \lambda _2,\ldots ,\lambda _k=(2\sqrt{d(N-d)/N}-d/N)(1+O(N^{-2/3+\varepsilon })) \end{aligned}$$
(1.2)

as well as

$$\begin{aligned} \lambda _N,\ldots ,\lambda _{N-k}=(-2\sqrt{d(N-d)/N}-d/N)(1+O(N^{-2/3+\varepsilon })) \end{aligned}$$

with probability \(1-O(N^{-D})\).

The negative shift \(-d/N\) in Theorem 1.1 is only relevant if we want the optimal error bound \(O(N^{-2/3+\varepsilon })\). As \(\sqrt{d(N-d)/N}\asymp d^{1/2}\) for \(d\leqslant N/2\), Theorem 1.1 implies

$$\begin{aligned} \lambda _2, |\lambda _N| =2\sqrt{d(N-d)/N}\,(1+O(N^{-1/2})) \end{aligned}$$

with very high probability. In addition, Theorem 1.1 implies that for \(N^{2/3+o(1)} \leqslant d \leqslant N/2\), almost all d-regular graphs on N vertices are Ramanujan. Indeed, by (1.2) we have

$$\begin{aligned} \lambda _2-2\sqrt{d-1}=\frac{2-2d^2/N}{\sqrt{d(N-d)/N}+\sqrt{d-1}}-\frac{d}{N}+O(d^{1/2}N^{-2/3+\varepsilon }) \end{aligned}$$

with very high probability. As \(N^{2/3+o(1)} \leqslant d \leqslant N/2\), the above is negative with very high probability. The analogue also holds for \(-\lambda _N-2\sqrt{d-1}\). This yields the following result.

Corollary 1.2

Fix \(D,\tau >0\). For d large enough and \(2d \leqslant N \leqslant d^{3/2+\tau }\),

$$\begin{aligned} {\mathbb {P}}\Big (\max \{|\lambda _N|,\lambda _2\} <2\sqrt{d-1}\Big ) \geqslant 1-N^{-D}. \end{aligned}$$
(1.3)
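The algebra behind Corollary 1.2 can be sanity-checked numerically. The following minimal Python sketch evaluates only the deterministic part of \(\lambda _2-2\sqrt{d-1}\) suggested by (1.2), once directly and once through the rationalized expression above, dropping the fluctuation term \(O(d^{1/2}N^{-2/3+\varepsilon })\). The function name and the sample pairs (N, d) are illustrative assumptions, not taken from the paper.

```python
import math

def edge_minus_ramanujan(N, d):
    """Deterministic part of lambda_2 - 2*sqrt(d-1) suggested by (1.2).

    The O(d^{1/2} N^{-2/3+eps}) fluctuation term is deliberately dropped.
    Returns the value computed directly and via the rationalized formula.
    """
    q = math.sqrt(d * (N - d) / N)
    direct = 2 * q - d / N - 2 * math.sqrt(d - 1)
    rationalized = (2 - 2 * d * d / N) / (q + math.sqrt(d - 1)) - d / N
    return direct, rationalized

# Hypothetical sample sizes with N^{2/3} << d <= N/2.
samples = [(10**4, 10**3), (10**6, 10**5), (10**6, 5 * 10**5)]
results = [edge_minus_ramanujan(N, d) for N, d in samples]
for (N, d), (direct, rationalized) in zip(samples, results):
    print(N, d, direct, rationalized)
```

In each sample the two expressions agree up to rounding and are negative, which is the sign needed for the Ramanujan property once the fluctuation term is shown to be of smaller order.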

Beyond the law of large numbers, the distributions of the extreme eigenvalues of \(\mathcal {A}\) were conjectured in [38] to satisfy edge universality, i.e. after normalization, their joint distribution is the same as that of the extreme eigenvalues of the Gaussian Orthogonal Ensemble. Edge universality was proved by Bauerschmidt, Huang, Knowles and Yau [4] for \(\mathcal {A}\) in the intermediate regime \(N^{2/9+o(1)} \leqslant d \ll N^{1/3-o(1)}\). The authors showed that

$$\begin{aligned} N^{2/3}\bigg (\frac{\lambda _2}{\sqrt{d-1}}-2\bigg ) \overset{d}{\longrightarrow }\ \textrm{TW}_1, \end{aligned}$$
(1.4)

together with analogous results for other extreme eigenvalues. Recently, Huang and Yau [30] extended (1.4) to \(N^{o(1)}\leqslant d\leqslant N^{1/3-o(1)}\). Our second main result is the edge universality of \(\mathcal {A}\) in the dense regime \(N^{2/3+o(1)}\leqslant d \leqslant N/2\).

Theorem 1.3

Fix \(\tau >0\) and assume \(N^{2/3+\tau }\leqslant d\leqslant N/2\). Let \(\mu _1\geqslant \cdots \geqslant \mu _N\) denote the eigenvalues of a Gaussian Orthogonal Ensemble. Fix \(k \in \mathbb N_+\). We have

$$\begin{aligned} &\lim _{N\rightarrow \infty } \mathbb P_{\mathcal {A}}\bigg (N^{2/3}\bigg (\frac{\lambda _{i+1}+d/N}{\sqrt{d(N-d)/N}}-2\bigg )\geqslant s_i,\ N^{2/3}\bigg (\frac{\lambda _{N-i+1}+d/N}{\sqrt{d(N-d)/N}}+2\bigg )\geqslant r_i,\ 1\leqslant i \leqslant k\bigg )\\ &\quad =\lim _{N\rightarrow \infty } \mathbb P_{\textrm{GOE}}\Big (N^{2/3}(\mu _{i}-2)\geqslant s_i,\ N^{2/3}(\mu _{N-i+1}+2)\geqslant r_i,\ 1\leqslant i \leqslant k\Big ) \end{aligned}$$

uniformly for all \(s_1,r_1,\ldots ,s_k,r_k \in \mathbb R\).

To prove the main results, we analyze the Stieltjes transform of \(\mathcal {A}\) near the spectral edge, on all mesoscopic spectral scales. This Green function method is widely used in the random matrix community. It was first applied to Wigner matrices, in particular in [9, 16, 17, 18, 19, 20, 40, 41]. It was then applied in [10, 14, 15, 23, 24, 26, 27, 32, 33] to sparse matrices, which include the adjacency matrices of sparse Erdős-Rényi graphs \(\mathcal G(N,p)\) for \(p \gg N^{-1}\). These works rely on the fact that the matrix entries are independent (subject to the symmetry constraint), which is not the case for \(\mathcal {A}\). In the work [6], the authors developed a technique based on local switching, which opens the door to studying random regular graphs through the Green function method. For \(N^{o(1)} \leqslant d \leqslant N^{2/3-o(1)}\), they proved that the eigenvalues of \(\mathcal {A}\) satisfy the local semicircle law. The idea of switching was then applied to prove various results for \(\mathcal {A}\) in the regime \(N^{o(1)}\leqslant d \leqslant N^{2/3-o(1)}\) [3, 4], and for d fixed [5, 29]. All these works require the degree upper bound \(d \ll N^{2/3}\), which is essentially due to the approximation \(1-\mathcal {A}_{ij} \approx 1\). In other words, due to the sparsity of the graph in the regime \(d \ll N^{2/3}\), in many situations one can take two vertices of the graph and, with an affordable error, assume that they are disconnected.

In order to deal with the dense case \(N^{2/3+o(1)}\leqslant d \leqslant N/2\), we develop an algorithm which is insensitive to the increasing density of the graph. Compared to [4, 6], the integration by parts formula used in this paper (see Lemma 2.2) comes with an error term that does not explicitly depend on d. Another ingredient of the proof is a large deviation result on the powers of \(\mathcal {A}\) (see Proposition 3.1), which essentially counts the number of short cycles of the graph. This enables us to replace the entries of \(\mathcal {A}^r\) (\(r\geqslant 2\)) by their expectations, with affordable errors.

Our first step is to prove a weak local semicircle law for all \(N^{o(1)} \leqslant d \leqslant N/2\), which is stated in terms of Green functions (see Theorem 4.2). A standard consequence of Theorem 4.2 is the following complete eigenvector delocalization.

Corollary 1.4

Fix \(\tau >0\) and assume \(N^{\tau }\leqslant d \leqslant N/2\). Let \({{\textbf {u}}}_i\in {\mathbb {S}}^{N-1}\) denote the i-th eigenvector of \(\mathcal {A}\). For any fixed \(\varepsilon ,D>0\), we have

$$\begin{aligned} \max _i \Vert {{\textbf {u}}}_i\Vert _\infty =O(N^{-1/2+\varepsilon }) \end{aligned}$$

with probability \(1-O(N^{-D})\).

After obtaining the weak local law, we perform a refined analysis of the averaged self-consistent equations near the spectral edge (see Proposition 5.1). This leads to a strong estimate on the traces of the Green functions in the regime \(N^{2/3+o(1)}\leqslant d \leqslant N/2\) (see Proposition 6.1), from which Theorem 1.1 follows. Given optimal edge rigidity, the edge universality Theorem 1.3 is proved based on the usual three-step approach of random matrix theory [13]. The same strategy was also used in [4].

From Theorems 1.1 and 1.3, we see a shift of \(-d/N\) on both spectral edges of \(\mathcal A\). This is due to the fact that the diagonal entries of the adjacency matrix are 0. More precisely, observe that

$$\begin{aligned} N{{\textbf {e}}} {{\textbf {e}}}^*-\mathcal A-I \end{aligned}$$

is the adjacency matrix of a random \((N-1-d)\)-regular graph on N vertices, and we denote its eigenvalues by \(N-1-d=\widehat{\lambda }_1\geqslant \cdots \geqslant \widehat{\lambda }_N\). Thus for \( 2\leqslant k \leqslant N\), we have the relation

$$\begin{aligned} \lambda _k+\widehat{\lambda }_{N+2-k}=-1. \end{aligned}$$

Our main results suggest that the shift \(-1\) is shared between \(\lambda \) and \(\widehat{\lambda }\), with the amount proportional to the graph degree. The shift is essential to getting the Tracy–Widom limit, as

$$\begin{aligned} \frac{1}{\sqrt{d(N-d)/N}}\cdot \frac{d}{N} \gg N^{-2/3} \end{aligned}$$

for \( N^{2/3}\ll d \leqslant N/2\).
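The relation \(\lambda _k+\widehat{\lambda }_{N+2-k}=-1\) rests on the fact that \(N{{\textbf {e}}}{{\textbf {e}}}^*-\mathcal A-I\) acts as \(-(\mathcal A+I)\) on vectors orthogonal to \({{\textbf {e}}}\). The following minimal Python sketch checks this on a small deterministic example. The circulant graph and all helper names are illustrative assumptions (a circulant graph is merely a convenient d-regular graph, not a sample from the uniform measure).

```python
def circulant(N, d):
    """Adjacency matrix of a d-regular circulant graph (d even, d/2 < N/2)."""
    A = [[0] * N for _ in range(N)]
    for i in range(N):
        for s in range(1, d // 2 + 1):
            A[i][(i + s) % N] = 1
            A[i][(i - s) % N] = 1
    return A

def complement(A):
    """Entries of N e e^* - A - I: all-ones matrix minus A minus the identity."""
    N = len(A)
    return [[1 - A[i][j] - (1 if i == j else 0) for j in range(N)]
            for i in range(N)]

def matvec(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

N, d = 11, 4
A = circulant(N, d)
B = complement(A)

# B is a 0/1 matrix with zero diagonal and constant row sums N-1-d.
row_sums = [sum(row) for row in B]

# On a vector v orthogonal to (1,...,1), B acts as -(A + I).
v = [1, -1] + [0] * (N - 2)
Bv = matvec(B, v)
minus_A_plus_I_v = [-(x + y) for x, y in zip(matvec(A, v), v)]
print(row_sums[0], Bv == minus_A_plus_I_v)
```

Consequently, an eigenvector of \(\mathcal A\) orthogonal to \({{\textbf {e}}}\) with eigenvalue \(\lambda \) is an eigenvector of the complement with eigenvalue \(-\lambda -1\), which reverses the ordering of the nontrivial eigenvalues.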

Comparing the parameters of (1.4) and Theorem 1.3, together with the degree symmetry \(d \longleftrightarrow N-d-1\) for d-regular graphs, one could propose that edge universality holds for all non-trivial random d-regular graphs, in the following scaling.

Conjecture 1.5

Assume \(3 \leqslant d \leqslant N-4\). There exists a constant \(c_{N,d}\) such that

$$\begin{aligned} N^{2/3}\bigg (\frac{\lambda _2+d/N}{\sqrt{(d-1)(N-d-2)/N}}-2\bigg )-c_{N,d} \overset{d}{\longrightarrow }\ \textrm{TW}_1, \end{aligned}$$

where \(\textrm{TW}_1\) is the Tracy–Widom distribution for GOE. Analogous results also hold for other non-trivial extreme eigenvalues.

Although the proof of Conjecture 1.5 in the regime where d is fixed is apparently difficult, it is probable that, combining the techniques of [4, 30] and the current paper, one can prove optimal edge rigidity and universality for \(N^{o(1)}\leqslant d \leqslant N/2\). Provided this is the case, the following will also hold.

Conjecture 1.6

For d large enough and \(N \geqslant 2d\), (1.3) holds if and only if \(N \ll d^3\).

The rest of this paper is organized as follows. In Sect. 2 we recall the local switching, and prove an integration by parts formula which is insensitive to the degree d. In Sect. 3 we prove a large deviation result on the powers of \(\mathcal {A}\). In Sect. 4 we prove the weak local semicircle law for all \( N^{o(1)}\leqslant d \leqslant N/2\). In Sect. 5 we prove a strong self-consistent equation near the spectral edge. Finally in Sect. 6 we use the results in Sects. 4 and 5 to conclude the proof of our main results.

1.1 Conventions

Unless stated otherwise, all quantities depend on the fundamental large parameter N, and we omit this dependence from our notation. We use the usual big O notation \(O(\cdot )\), and if the implicit constant depends on a parameter \(\alpha \) we indicate it by writing \(O_\alpha (\cdot )\). Let

$$\begin{aligned} X=(X^{(N)}(u): N \in \mathbb N, u \in U^{(N)}), \quad Y=(Y^{(N)}(u): N \in \mathbb N, u \in U^{(N)}) \end{aligned}$$

be two families of nonnegative random variables, where \(U^{(N)}\) is a possibly N-dependent parameter set, and \(Y\geqslant 0\). We say that X is stochastically dominated by Y, uniformly in u, if for any fixed \(\varepsilon ,D>0\),

$$\begin{aligned} \sup _{u \in U^{(N)}}\mathbb P(|X|\geqslant Y N^{\varepsilon }) =O_{\varepsilon ,D} (N^{-D}). \end{aligned}$$

We write \(X\asymp Y\) if \(X =O(Y)\) and \(Y=O(X)\). If X is stochastically dominated by Y, we use the notation \(X \prec Y\), or equivalently \(X=O_{\prec }(Y)\). We say an event \(\Omega \) holds with very high probability if for any \(D>0\), \(1-\mathbb P(\Omega )=O_D(N^{-D})\).

2 Local Switchings

As in [4, 6], we rely on switchings for regular graphs and the invariance under the permutation of vertices. For indices ijkl, we define the signed adjacency matrices

$$\begin{aligned} (\Delta _{ij})_{xy}{:}{=}\delta _{ix}\delta _{jy}+\delta _{iy}\delta _{jx}, \quad \xi _{ij}^{kl}=\Delta _{ij}+\Delta _{kl}-\Delta _{ik}-\Delta _{jl}. \end{aligned}$$
(2.1)

In addition, we denote the indicator function that the edges ij and kl are switchable by

$$\begin{aligned} \chi _{ij}^{kl}(\mathcal A)=\mathcal A_{ij}(1-\mathcal A_{ik})\mathcal A_{kl}(1-\mathcal A_{jl}). \end{aligned}$$
(2.2)
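To make the definitions (2.1) and (2.2) concrete, here is a minimal Python sketch that starts from a deterministic d-regular graph and repeatedly applies switchings whenever the indicator \(\chi _{ij}^{kl}\) equals 1. The circulant starting graph, the helper names and the fixed seed are illustrative assumptions, and no claim is made here about how close the resulting graph is to a uniform sample.

```python
import random

def circulant(N, d):
    """Adjacency matrix of a d-regular circulant graph (d even, d/2 < N/2)."""
    A = [[0] * N for _ in range(N)]
    for i in range(N):
        for s in range(1, d // 2 + 1):
            A[i][(i + s) % N] = 1
            A[i][(i - s) % N] = 1
    return A

def chi(A, i, j, k, l):
    """Indicator (2.2): edges ij and kl present, edges ik and jl absent."""
    return A[i][j] * (1 - A[i][k]) * A[k][l] * (1 - A[j][l])

def add_xi(A, i, j, k, l, sign=1):
    """Replace A by A + sign * xi_{ij}^{kl} from (2.1), in place."""
    for (a, b), s in (((i, j), 1), ((k, l), 1), ((i, k), -1), ((j, l), -1)):
        A[a][b] += sign * s
        A[b][a] += sign * s

def random_switchings(A, steps, rng):
    """Attempt `steps` switchings; each accepted move keeps A d-regular."""
    N = len(A)
    accepted = 0
    for _ in range(steps):
        i, j, k, l = rng.sample(range(N), 4)  # four distinct indices
        if chi(A, i, j, k, l):
            add_xi(A, i, j, k, l, sign=-1)  # remove ij, kl; add ik, jl
            accepted += 1
    return accepted

rng = random.Random(0)  # fixed seed, only for reproducibility
N, d = 30, 6
A = circulant(N, d)
accepted = random_switchings(A, 2000, rng)
print(accepted, {sum(row) for row in A})
```

Each accepted move changes exactly four edges and leaves every row sum equal to d, which is why switchings preserve the set of d-regular graphs.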

The following identity is a consequence of the uniform probability measure on \(\mathcal A\). The proof is given in [4, Proposition 3.1].

Lemma 2.1

Let ijkl be distinct indices. Let F be a function which depends on the random graph \(\mathcal A\), and possibly on the indices ijkl. We have

$$\begin{aligned} \mathbb E F(\mathcal A)\chi _{ij}^{kl}(\mathcal A)=\mathbb E F(\mathcal A+\xi _{ij}^{kl})\chi _{ik}^{jl}(\mathcal A), \end{aligned}$$

where \(\xi \) and \(\chi \) are defined in (2.1) and (2.2) respectively.
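Lemma 2.1 can be verified by brute force on a tiny example, since under the uniform measure both expectations are plain averages over all d-regular graphs. The minimal Python sketch below enumerates all 2-regular graphs on 6 labeled vertices and compares the two sums for one choice of distinct indices. The test function F and all helper names are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def regular_graphs(N, d):
    """All d-regular simple graphs on {0,...,N-1}, as frozensets of edges (u < v)."""
    pairs = list(combinations(range(N), 2))
    graphs = []
    for edges in combinations(pairs, N * d // 2):
        deg = [0] * N
        for u, v in edges:
            deg[u] += 1
            deg[v] += 1
        if all(x == d for x in deg):
            graphs.append(frozenset(edges))
    return graphs

def has(G, u, v):
    """True if the edge uv belongs to the edge set G."""
    return (min(u, v), max(u, v)) in G

def chi(G, i, j, k, l):
    """Indicator (2.2) for the edge set G."""
    return has(G, i, j) and not has(G, i, k) and has(G, k, l) and not has(G, j, l)

def plus_xi(G, i, j, k, l):
    """Edge set of A + xi_{ij}^{kl}: add ij and kl, remove ik and jl."""
    edge = lambda u, v: (min(u, v), max(u, v))
    return (G - {edge(i, k), edge(j, l)}) | {edge(i, j), edge(k, l)}

def F(G):
    """An arbitrary (hypothetical) test function of the graph."""
    return sum((u + 1) * (v + 3) ** 2 for u, v in G)

graphs = regular_graphs(6, 2)
i, j, k, l = 0, 1, 2, 3
# Left side: sum of F(A) chi_{ij}^{kl}(A); right side: sum of F(A + xi) chi_{ik}^{jl}(A).
lhs = sum(F(G) for G in graphs if chi(G, i, j, k, l))
rhs = sum(F(plus_xi(G, i, j, k, l)) for G in graphs if chi(G, i, k, j, l))
print(len(graphs), lhs, rhs)
```

The equality reflects that the map \(\mathcal A \mapsto \mathcal A+\xi _{ij}^{kl}\) is a bijection from the graphs with \(\chi _{ik}^{jl}=1\) onto the graphs with \(\chi _{ij}^{kl}=1\).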

Let us abbreviate

$$\begin{aligned} \mathcal M_{ij}(F(\mathcal {A})){:}{=}\max _{kl} |F(\mathcal {A}+\xi _{ij}^{kl})|. \end{aligned}$$
(2.3)

The next result improves [4, Corollary 3.2] to adapt to the dense graph setting. This is the main formula we use in generating non-trivial self-consistent equations.

Lemma 2.2

Let ij be distinct indices. Let F be a function which depends on the random graph \(\mathcal A\), and possibly on the indices ij. We have

$$\begin{aligned} \begin{aligned} \mathbb E \mathcal A_{ij} F(\mathcal A)=&\,\frac{1}{(N-d)d}\sum _{kl}\mathbb E \chi _{ik}^{jl}(\mathcal A)\big (F(\mathcal A+\xi _{ij}^{kl})-F(\mathcal {A})\big ) +\frac{d}{N-d}\mathbb E F(\mathcal {A})\\&-\frac{1}{(N-d)d}\mathbb E (\mathcal {A}^3)_{ij}F(\mathcal {A})+O(N^{-1})\cdot \mathbb E\mathcal M_{ij}(F(\mathcal {A})). \end{aligned} \end{aligned}$$

We often refer to the last term above as the remainder term.

Proof

Since \(\sum _{k} \mathcal A_{ik}=\sum _{l}\mathcal A_{kl}=d\), we have

$$\begin{aligned} \begin{aligned} \mathbb E \mathcal A_{ij}F(\mathcal A )&=\frac{1}{(N-d)d}\sum _{kl}\mathbb E \mathcal A_{ij}(1-\mathcal A_{ik})\mathcal A_{kl}F(\mathcal A)\\&=\frac{1}{(N-d)d}\sum _{kl}\mathbb E \chi _{ij}^{kl}(\mathcal A)F(\mathcal A)\\&\quad +\frac{1}{(N-d)d}\sum _{kl}\mathbb E \mathcal A_{ij}(1-\mathcal A_{ik})\mathcal A_{kl}\mathcal A_{jl}F(\mathcal A), \end{aligned} \end{aligned}$$
(2.4)

where in the second step we used \(1=(1-\mathcal A_{jl})+\mathcal A_{jl}\). By Lemma 2.1 and \(\chi _{ij}^{kl}(\mathcal {A})\leqslant \mathcal {A}_{ij}\mathcal {A}_{kl}\),

$$\begin{aligned}&\frac{1}{(N-d)d}\sum _{kl}\mathbb E \chi _{ij}^{kl}(\mathcal A)F(\mathcal A)\nonumber \\&\quad =\frac{1}{(N-d)d}\sum _{\begin{array}{c} kl: i,j,k,l\\ distinct \end{array}}\mathbb E \chi _{ij}^{kl}(\mathcal A)F(\mathcal A)+O\Big (\frac{1}{Nd}\Big )\cdot \hspace{-0.2cm}\sum _{\begin{array}{c} kl: i,j,k,l\\ not distinct \end{array}} \mathbb E |\mathcal A_{ij}\mathcal {A}_{kl}F(\mathcal {A})|\nonumber \\&\quad =\frac{1}{(N-d)d}\sum _{\begin{array}{c} kl: i,j,k,l\\ distinct \end{array}}\mathbb E \chi _{ik}^{jl}(\mathcal A)F(\mathcal A+\xi _{ij}^{kl})+O(N^{-1})\cdot \mathbb E|F(\mathcal {A})|\nonumber \\&\quad =\frac{1}{(N-d)d}\sum _{kl}\mathbb E \chi _{ik}^{jl}(\mathcal A)F(\mathcal A+\xi _{ij}^{kl})+O(N^{-1})\cdot \mathbb E \mathcal M_{ij}(F(\mathcal {A}))\,. \end{aligned}$$
(2.5)

Moreover, note that

$$\begin{aligned}{} & {} \frac{1}{(N-d)d}\sum _{kl}\mathbb E \chi _{ik}^{jl}(\mathcal A)F(\mathcal A)=\frac{1}{(N-d)d}\sum _{kl}\mathbb E (1-\mathcal {A}_{ij})\mathcal {A}_{ik}(1-\mathcal {A}_{kl})\mathcal {A}_{jl}F(\mathcal A)\nonumber \\{} & {} \quad =\,-\frac{1}{(N-d)d}\sum _{kl}\mathbb E (1-\mathcal {A}_{ij})(1-\mathcal {A}_{ik})(1-\mathcal {A}_{kl})\mathcal {A}_{jl}F(\mathcal A)+\mathbb E (1-\mathcal {A}_{ij})F(\mathcal {A})\nonumber \\{} & {} \quad =\,\frac{1}{(N-d)d}\sum _{kl}\mathbb E (1-\mathcal {A}_{ij})(1-\mathcal {A}_{ik})\mathcal {A}_{kl}\mathcal {A}_{jl}F(\mathcal A), \end{aligned}$$
(2.6)

where in the second and third steps we used \(\sum _{kl}(1-\mathcal {A}_{kl})\mathcal {A}_{jl}=\sum _{kl}(1-\mathcal {A}_{ik})\mathcal {A}_{jl}=(N-d)d\). Combining (2.4)–(2.6) we get

$$\begin{aligned} \begin{aligned} \mathbb E \mathcal A_{ij}F(\mathcal A )=&\ \frac{1}{(N-d)d}\sum _{kl}\mathbb E \chi _{ik}^{jl}(\mathcal A)\big (F(\mathcal A+\xi _{ij}^{kl})-F(\mathcal {A})\big )\\&+\frac{1}{(N-d)d}\sum _{kl}\mathbb E (1-\mathcal {A}_{ij})(1-\mathcal {A}_{ik})\mathcal {A}_{kl}\mathcal {A}_{jl}F(\mathcal A) \\&+\frac{1}{(N-d)d}\sum _{kl}\mathbb E \mathcal A_{ij}(1-\mathcal A_{ik})\mathcal A_{kl}\mathcal A_{jl}F(\mathcal A)+O(N^{-1})\cdot \mathbb E \mathcal M_{ij}(F(\mathcal {A}))\\ =&\ \frac{1}{(N-d)d}\sum _{kl}\mathbb E \chi _{ik}^{jl}(\mathcal A)\big (F(\mathcal A+\xi _{ij}^{kl})-F(\mathcal {A})\big )\\&+\frac{1}{(N-d)d}\sum _{kl}\mathbb E (1-\mathcal A_{ik})\mathcal A_{kl}\mathcal A_{jl}F(\mathcal A)+O(N^{-1})\cdot \mathbb E \mathcal M_{ij}(F(\mathcal {A})). \end{aligned} \end{aligned}$$

Applying \(\sum _{kl}(1-\mathcal A_{ik})\mathcal A_{kl}\mathcal A_{jl}=d^2-(\mathcal {A}^3)_{ij}\) to the second term on the RHS, we get the desired result. \(\square \)
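The two deterministic counting identities used in this proof, the normalization \(\sum _{kl}(1-\mathcal A_{ik})\mathcal A_{kl}=(N-d)d\) and the final step \(\sum _{kl}(1-\mathcal A_{ik})\mathcal A_{kl}\mathcal A_{jl}=d^2-(\mathcal {A}^3)_{ij}\), hold for every d-regular graph. The minimal Python sketch below checks them on a small circulant graph; the graph and helper names are illustrative assumptions.

```python
def circulant(N, d):
    """Adjacency matrix of a d-regular circulant graph (d even, d/2 < N/2)."""
    A = [[0] * N for _ in range(N)]
    for i in range(N):
        for s in range(1, d // 2 + 1):
            A[i][(i + s) % N] = 1
            A[i][(i - s) % N] = 1
    return A

def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[a][c] * Y[c][b] for c in range(n)) for b in range(n)]
            for a in range(n)]

N, d = 12, 4
A = circulant(N, d)
A3 = matmul(matmul(A, A), A)

def normalization(i):
    """sum_{kl} (1 - A_ik) A_kl, expected to equal (N - d) d."""
    return sum((1 - A[i][k]) * A[k][l] for k in range(N) for l in range(N))

def triple_sum(i, j):
    """sum_{kl} (1 - A_ik) A_kl A_jl, expected to equal d^2 - (A^3)_ij."""
    return sum((1 - A[i][k]) * A[k][l] * A[j][l]
               for k in range(N) for l in range(N))

checks = [
    normalization(i) == (N - d) * d and triple_sum(i, j) == d * d - A3[i][j]
    for i in range(N) for j in range(N) if i != j
]
print(all(checks))
```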

3 Powers of \(\mathcal {A}\): Large Deviations

Let us abbreviate the discrete derivative for any indices ijkl by

$$\begin{aligned} \textrm{D}_{ij}^{kl}F(\mathcal A){:}{=}F(\mathcal {A}+\xi _{ij}^{kl})-F(\mathcal {A}) \end{aligned}$$
(3.1)

where \(\xi _{ij}^{kl}\) is defined in (2.1). It satisfies the discrete product rule

$$\begin{aligned} \textrm{D}_{ij}^{kl}(FK)=\textrm{D}_{ij}^{kl}(F)K+F\textrm{D}_{ij}^{kl}(K)+\textrm{D}_{ij}^{kl}(F)\textrm{D}_{ij}^{kl}(K), \end{aligned}$$
(3.2)

and

$$\begin{aligned} \textrm{D}_{ij}^{kl}(\mathcal {A})=\xi _{ij}^{kl}. \end{aligned}$$
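The product rule (3.2) is a purely algebraic identity and holds for any pair of functions and any indices. As a quick illustration, the minimal Python sketch below checks it for two polynomial functions of the matrix; the particular functions F and K, the circulant matrix and the helper names are hypothetical choices. Note that \(\mathcal A+\xi _{ij}^{kl}\) need not be an adjacency matrix here, which does not affect the identity.

```python
def circulant(N, d):
    """Adjacency matrix of a d-regular circulant graph (d even, d/2 < N/2)."""
    A = [[0] * N for _ in range(N)]
    for i in range(N):
        for s in range(1, d // 2 + 1):
            A[i][(i + s) % N] = 1
            A[i][(i - s) % N] = 1
    return A

def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[a][c] * Y[c][b] for c in range(n)) for b in range(n)]
            for a in range(n)]

def plus_xi(A, i, j, k, l):
    """Return a copy of A + xi_{ij}^{kl}, with xi as in (2.1)."""
    B = [row[:] for row in A]
    for (a, b), s in (((i, j), 1), ((k, l), 1), ((i, k), -1), ((j, l), -1)):
        B[a][b] += s
        B[b][a] += s
    return B

def D(func, A, idx):
    """Discrete derivative (3.1) of func at A in the direction given by idx."""
    return func(plus_xi(A, *idx)) - func(A)

F = lambda A: matmul(A, A)[0][1]             # (A^2)_{01}
K = lambda A: matmul(matmul(A, A), A)[2][2]  # (A^3)_{22}
FK = lambda A: F(A) * K(A)

A = circulant(8, 4)
idx = (0, 1, 2, 3)
lhs = D(FK, A, idx)
rhs = (D(F, A, idx) * K(A) + F(A) * D(K, A, idx)
       + D(F, A, idx) * D(K, A, idx))
print(lhs, rhs)
```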

We have the following result.

Proposition 3.1

Fix \(\tau >0\) and assume \(N^{\tau } \leqslant d \leqslant N/2\). We have

$$\begin{aligned} (\mathcal {A}^2)_{ij} -d^2N^{-1} \prec 1+dN^{-1/2} \end{aligned}$$
(3.3)

uniformly for \(i \ne j\), and

$$\begin{aligned} (\mathcal {A}^3)_{ij} -d^3N^{-1} \prec d+d^2N^{-1/2} \end{aligned}$$
(3.4)

uniformly in ij. For fixed integer \(r\geqslant 4\), we have

$$\begin{aligned} (\mathcal {A}^r)_{ij} -d^rN^{-1} \prec d^{r-2}+Nd^{r-4} \end{aligned}$$
(3.5)

uniformly in ij.

Proof

(i) Fix an integer \(r \geqslant 2\). In this step we shall prove

$$\begin{aligned} (\mathcal {A}^{2r})_{ii}-d^{2r}N^{-1} \prec d^{2r-2}+d^{2r-1}N^{-1/2} \end{aligned}$$
(3.6)

uniformly in i. By \(\sum _{j}(\mathcal {A}^r)_{ij}=d^r\), we have

$$\begin{aligned} 0\leqslant \sum _{j}\big ((\mathcal {A}^r)_{ij}-d^rN^{-1}\big )^2=(\mathcal {A}^{2r})_{ii}-d^{2r}N^{-1}, \end{aligned}$$
(3.7)

thus \(\mathcal R_r{:}{=}(\mathcal {A}^{2r})_{ii}-d^{2r}N^{-1}\geqslant 0\). Similarly, \(\mathcal R_{r+1}{:}{=}(\mathcal {A}^{2r+2})_{ii}-d^{2r+2}N^{-1}\geqslant 0\). Fix \(n \geqslant 1\). As \(\mathcal {A}_{ii}=0\), we have

$$\begin{aligned} \mathbb E \mathcal R_r^n= & {} -\frac{d^{2r}}{N}\mathbb E \mathcal R_r^{n-1}+\sum _{j} \mathbb E \mathcal {A}_{ij}(\mathcal {A}^{2r-1})_{ji}\mathcal R_r^{n-1}\nonumber \\= & {} -\frac{d^{2r}}{N}\mathbb E \mathcal R_r^{n-1}+\sum _{j:j\ne i} \mathbb E \mathcal {A}_{ij}(\mathcal {A}^{2r-1})_{ji}\mathcal R_r^{n-1}. \end{aligned}$$
(3.8)

Applying Lemma 2.2 to the last term on RHS of (3.8), with \(F(\mathcal {A})=(\mathcal {A}^{2r-1})_{ji}\mathcal R_r^{n-1}\), we get

$$\begin{aligned} \mathbb E \mathcal R_r^n&=\,-\frac{d^{2r}}{N}\mathbb E \mathcal R_r^{n-1}+\frac{1}{(N-d)d}\sum _{jkl:j \ne i}\mathbb E \chi _{ik}^{jl}(\mathcal A)\textrm{D}_{ij}^{kl}((\mathcal {A}^{2r-1})_{ji}\mathcal R_r^{n-1})\\&\quad +\frac{d}{N-d}\sum _{j:j\ne i}\mathbb E (\mathcal {A}^{2r-1})_{ji}\mathcal R_r^{n-1} -\frac{1}{(N-d)d}\sum _{j:j\ne i}\mathbb E (\mathcal {A}^3)_{ij}(\mathcal {A}^{2r-1})_{ji}\mathcal R_r^{n-1}\\&\quad +O(N^{-1})\cdot \sum _{j:j\ne i}\mathbb E\mathcal M_{ij}((\mathcal {A}^{2r-1})_{ji}\mathcal R_r^{n-1})\,. \end{aligned}$$

As \(\max _{ij}(\mathcal A^{r})_{ij}\leqslant \max _i \sum _k (\mathcal A^{r-1})_{ik}=d^{r-1}\) for all \(r \geqslant 2\), we can easily remove the restriction \(j\ne i\) in the third and fourth terms on the RHS of the above, by observing that

$$\begin{aligned}{} & {} \frac{d}{N-d}\mathbb E (\mathcal A^{2r-1})_{ii}\mathcal R_r^{n-1}=O( d^{2r-2})\cdot \mathbb E \mathcal R_r^{n-1} \quad \hbox {and} \quad \\{} & {} \frac{1}{(N-d)d}\mathbb E (\mathcal {A}^3)_{ii}(\mathcal {A}^{2r-1})_{ii}\mathcal R_r^{n-1}=O( d^{2r-2})\cdot \mathbb E \mathcal R^{n-1}_r. \end{aligned}$$

As a result,

$$\begin{aligned} \mathbb E \mathcal R_r^n&=\,-\frac{d^{2r}}{N}\mathbb E \mathcal R_r^{n-1}+\frac{1}{(N-d)d}\sum _{jkl:j \ne i}\mathbb E \chi _{ik}^{jl}(\mathcal A)\textrm{D}_{ij}^{kl}((\mathcal {A}^{2r-1})_{ji}\mathcal R_r^{n-1})\nonumber \\&\quad +\frac{d}{N-d}\sum _{j}\mathbb E (\mathcal {A}^{2r-1})_{ji}\mathcal R_r^{n-1} -\frac{1}{(N-d)d}\sum _{j}\mathbb E (\mathcal {A}^3)_{ij}(\mathcal {A}^{2r-1})_{ji}\mathcal R_r^{n-1}\nonumber \\&\quad +O(N^{-1})\cdot \sum _{j:j\ne i}\mathbb E\mathcal M_{ij}((\mathcal {A}^{2r-1})_{ji}\mathcal R_r^{n-1})+O( d^{2r-2})\cdot \mathbb E \mathcal R^{n-1}_r\nonumber \\&\leqslant \,-\frac{d^{2r}}{N}\mathbb E \mathcal R_r^{n-1}+\frac{1}{(N-d)d}\sum _{jkl:j\ne i}\mathbb E \chi _{ik}^{jl}(\mathcal A)\textrm{D}_{ij}^{kl}((\mathcal {A}^{2r-1})_{ji}\mathcal R_r^{n-1})\nonumber \\&\quad +\frac{d^{2r}}{N-d}\mathbb E \mathcal R_r^{n-1} -\frac{d^{2r+1}}{(N-d)N}\mathbb E \mathcal R_r^{n-1}\nonumber \\&\quad +O(N^{-1})\cdot \sum _{j:j\ne i}\mathbb E\mathcal M_{ij}((\mathcal {A}^{2r-1})_{ji}\mathcal R_r^{n-1}) +O( d^{2r-2})\cdot \mathbb E \mathcal R^{n-1}_r\nonumber \\&=\,\frac{1}{(N-d)d}\sum _{jkl:j\ne i}\mathbb E \chi _{ik}^{jl}(\mathcal A)\textrm{D}_{ij}^{kl}((\mathcal {A}^{2r-1})_{ji}\mathcal R_r^{n-1})\nonumber \\&\quad +O(N^{-1})\cdot \sum _{j:j\ne i}\mathbb E\mathcal M_{ij}((\mathcal {A}^{2r-1})_{ji}\mathcal R_r^{n-1})\nonumber \\&\quad +O( d^{2r-2})\cdot \mathbb E \mathcal R^{n-1}_r{=}{:}R_1+R_2+O( d^{2r-2})\cdot \mathbb E \mathcal R^{n-1}_r\,, \end{aligned}$$
(3.9)

where in the second step we used

$$\begin{aligned} \sum _j (\mathcal {A}^3)_{ij}(\mathcal {A}^{2r-1})_{ji}=(\mathcal {A}^{2r+2})_{ii} =\mathcal R_{r+1}+d^{2r+2}N^{-1}\geqslant d^{2r+2}N^{-1} \end{aligned}$$

and \(\mathcal R_r,\mathcal R_{r+1} \geqslant 0\), and in the third step we have a cancellation among the three terms involving \(\mathbb E \mathcal R_{r}^{n-1}\). To estimate the RHS of (3.9), note that

$$\begin{aligned} (\mathcal {A}^{2r-1})_{ij} \leqslant d^{2r-2}, \quad \max _{jkl}\textrm{D}_{ij}^{kl} (\mathcal {A}^{2r-1})_{ji} =O(d^{2r-3}), \quad \hbox {and} \quad \textrm{D}_{ij}^{kl} \mathcal R_r =O( d^{2r-2}), \end{aligned}$$
(3.10)

and together with the product rule (3.2) and \((N-d)^{-1}\leqslant 2N^{-1}\), the term \(R_1\) can be bounded (up to a constant factor) by

$$\begin{aligned} \begin{aligned}&\frac{1}{Nd}\sum _{jkl}\mathbb E \chi _{ik}^{jl}(\mathcal A)(\mathcal {A}^{2r-1})_{ji} |\textrm{D}_{ij}^{kl}\mathcal R_r^{n-1}|\\&\qquad + \frac{1}{Nd}\sum _{jkl}\mathbb E \chi _{ik}^{jl}(\mathcal A)|\textrm{D}_{ij}^{kl}(\mathcal {A}^{2r-1})_{ji}|( |\textrm{D}_{ij}^{kl}\mathcal R_r^{n-1}|+\mathcal R_r^{n-1})\\&\quad \prec \,\frac{1}{Nd}\sum _{jkl}\mathbb E\bigg [ \chi _{ik}^{jl}(\mathcal {A})(\mathcal {A}^{2r-1})_{ji} \sum _{s=2}^n d^{(2r-2)(s-1)}\mathcal R_{r}^{n-s}\bigg ]\\&\qquad +\frac{1}{Nd}\sum _{jkl}\mathbb E\bigg [ \chi _{ik}^{jl}(\mathcal {A})d^{2r-3} \sum _{s=1}^n d^{(2r-2)(s-1)}\mathcal R_{r}^{n-s}\bigg ] \end{aligned} \end{aligned}$$

Note that

$$\begin{aligned}{} & {} \sum _{jkl}\chi _{ik}^{jl}(\mathcal {A})(\mathcal {A}^{2r-1})_{ji} \leqslant \sum _{jkl}\mathcal {A}_{ik}\mathcal {A}_{jl}(\mathcal {A}^{2r-1})_{ji} = d^2\sum _{j}(\mathcal {A}^{2r-1})_{ji}=d^{2r+1}\ \ \hbox {and} \\{} & {} \sum _{jkl}\chi _{ik}^{jl}(\mathcal {A})\leqslant d^2N, \end{aligned}$$

which implies

$$\begin{aligned} R_1\prec \sum _{s=1}^n\big (d^{(2r-2)}+d^{2r-1}N^{-1/2}\big )^s\mathbb E \mathcal R_r^{n-s}. \end{aligned}$$
(3.11)

Moreover, by (3.2) and (3.10), it is easy to check that

$$\begin{aligned} \mathcal M_{ij}((\mathcal {A}^{2r-1})_{ij}\mathcal R_{r}^{n-1})\leqslant & {} |(\mathcal {A}^{2r-1})_{ij}\mathcal R_{r}^{n-1}| +\max _{kl}|\textrm{D}_{ij}^{kl} ((\mathcal {A}^{2r-1})_{ij}\mathcal R_{r}^{n-1})| \nonumber \\\prec & {} \sum _{s=1}^nd^{(2r-2)s}\mathbb E \mathcal R_r^{n-s}. \end{aligned}$$
(3.12)

Hence we have

$$\begin{aligned} R_2 \prec \sum _{s=1}^nd^{(2r-2)s}\mathbb E \mathcal R_r^{n-s}. \end{aligned}$$
(3.13)

Combining (3.9), (3.11) and (3.13), we get

$$\begin{aligned} \mathbb E \mathcal R_r^n\prec & {} \sum _{s=1}^n\big (d^{(2r-2)}+d^{2r-1}N^{-1/2}\big )^s\mathbb E \mathcal R_r^{n-s}\\\leqslant & {} \sum _{s=1}^n\big (d^{(2r-2)}+d^{2r-1}N^{-1/2}\big )^s(\mathbb E \mathcal R_r^{n})^{(n-s)/n} \end{aligned}$$

which implies (3.6) as desired.

(ii) Fix an integer \(r\geqslant 4\). In this step we shall show that

$$\begin{aligned} (\mathcal A^{r})_{ij}-d^rN^{-1}\prec d^{r-2}+d^{r-1}N^{-1/2} \end{aligned}$$
(3.14)

uniformly in ij. More precisely, by \(\sum _k (\mathcal {A}^2)_{ik}=d^2\) and \(\sum _k (\mathcal {A}^{r-2})_{kj}=d^{r-2}\) we get

$$\begin{aligned} \begin{aligned} \Big | (\mathcal A^{r})_{ij}-d^rN^{-1}\Big |&=\Big |\sum _k \big ((\mathcal {A}^2)_{ik}-d^2N^{-1}\big )\big ((\mathcal {A}^{r-2})_{kj}-d^{r-2}N^{-1}\big )\Big |\\&\leqslant \Big (\sum _k \big ((\mathcal {A}^2)_{ik}-d^2N^{-1}\big )^2\Big )^{1/2}\Big (\sum _k\big ((\mathcal {A}^{r-2})_{kj}-d^{r-2}N^{-1}\big )^2\Big )^{1/2}\\&=\big ((\mathcal {A}^4)_{ii}-d^4N^{-1}\big )^{1/2}\big ((\mathcal {A}^{2r-4})_{jj}-d^{2r-4}N^{-1}\big )^{1/2}, \end{aligned} \end{aligned}$$

and (3.14) follows from (3.6).

(iii) In this step we prove (3.3); the proof of (3.4) follows in a similar fashion. Let us denote \(\mathcal S{:}{=}(\mathcal {A}^2)_{ij}-d^2N^{-1}\) for some \(i \ne j\). Fix \(n \geqslant 1\). Using Lemma 2.2 with \(F(\mathcal A)=\mathcal {A}_{kj}\mathcal S^{2n-1}\), we have

$$\begin{aligned} \begin{aligned} {\mathbb {E}} \mathcal S^{2n}&=-d^2N^{-1}{\mathbb {E}} \mathcal S^{2n-1}+\sum _{k} \mathbb E \mathcal {A}_{ik}\mathcal {A}_{kj} \mathcal S^{2n-1}\\&=-d^2N^{-1}{\mathbb {E}} \mathcal S^{2n-1}+\frac{1}{(N-d)d}\sum _{klu:k\ne i}\mathbb E\chi _{il}^{ku}(\mathcal A)\textrm{D}_{ik}^{lu}(\mathcal {A}_{kj}\mathcal S^{2n-1})\\&\quad +\frac{d}{N-d}\sum _{k:k\ne i}\mathbb E\mathcal {A}_{kj} \mathcal S^{2n-1}-\frac{1}{(N-d)d}\sum _{k:k\ne i}\mathbb E (\mathcal {A}^3)_{ik}\mathcal {A}_{kj}\mathcal S^{2n-1}\\&\quad +O(N^{-1})\cdot \sum _{k:k\ne i}\mathbb E\mathcal M_{ik}(\mathcal {A}_{kj} \mathcal S^{2n-1}). \end{aligned} \end{aligned}$$

Similarly to (3.9) and (3.12), we can remove the restriction \(k\ne i\) and estimate the error term above to get

$$\begin{aligned} \begin{aligned} \mathbb E \mathcal S^{2n}&=-d^2N^{-1}\mathbb E \mathcal S^{2n-1}+\frac{1}{(N-d)d}\sum _{klu}\mathbb E\chi _{il}^{ku}(\mathcal A)\textrm{D}_{ik}^{lu}(\mathcal {A}_{kj}\mathcal S^{2n-1})+\frac{d^2}{N-d}\mathbb E \mathcal S^{2n-1}\\&\quad \,-\frac{1}{(N-d)d}\mathbb E (\mathcal {A}^4)_{ij}\mathcal S^{2n-1}+\sum _{s=1}^{2n}O_{\prec }(1)\cdot \mathbb E |\mathcal S^{2n-s}|\,. \end{aligned} \end{aligned}$$
(3.15)

The second term on RHS of (3.15) can be bounded (up to a constant factor) by

$$\begin{aligned} &\frac{1}{Nd}\sum _{klu} \mathbb E \mathcal {A}_{ku}\mathcal {A}_{il} \mathcal {A}_{kj}|\textrm{D}_{ik}^{lu}\mathcal S^{2n-1}|\\ &\quad +\frac{1}{Nd}\sum _{klu} \mathbb E |\mathcal {A}_{ku}\mathcal {A}_{il} \textrm{D}_{ik}^{lu}(\mathcal {A}_{kj})| \big (|\textrm{D}_{ik}^{lu}\mathcal S^{2n-1}|+|\mathcal S^{2n-1}|\big ). \end{aligned}$$

Since \(i \ne j\), we have \(|\textrm{D}_{ik}^{lu}(\mathcal {A}_{kj})|=O( \delta _{uj}+\delta _{lj}+\delta _{lk}+\delta _{kj})\). Together with the trivial bound \(\textrm{D}_{ik}^{lu}\mathcal S =O(1)\), we can get the estimate

$$\begin{aligned} \frac{1}{(N-d)d}\sum _{klu}\mathbb E\chi _{il}^{ku}(\mathcal A)\textrm{D}_{ik}^{lu}(\mathcal {A}_{kj}\mathcal S^{2n-1})\prec \sum _{s=1}^{2n} (1+dN^{-1/2})^s \cdot \mathbb E |\mathcal S^{2n-s}|. \end{aligned}$$
(3.16)

Using (3.14) with \(r=4\), we get

$$\begin{aligned}{} & {} -\frac{1}{(N-d)d}\mathbb E (\mathcal {A}^4)_{ij}\mathcal S^{2n-1}\nonumber \\{} & {} \quad =-\frac{d^3}{(N-d)N} \mathbb E \mathcal S^{2n-1}+O_\prec (1+dN^{-1/2})\cdot \mathbb E |\mathcal S^{2n-1}|. \end{aligned}$$
(3.17)

Combining (3.15)–(3.17) we get

$$\begin{aligned} {\mathbb {E}} \mathcal S^{2n}\prec \sum _{s=1}^{2n} (1+dN^{-1/2})^s \cdot \mathbb E |\mathcal S^{2n-s}| \leqslant \sum _{s=1}^{2n} (1+dN^{-1/2})^s \cdot \big (\mathbb E \mathcal S^{2n}\big )^{\frac{2n-s}{2n}}, \end{aligned}$$

which implies the desired result.

(iv) The proof of (3.5) follows from the relation

$$\begin{aligned} \begin{aligned} \big |(\mathcal A^{r})_{ij}-d^rN^{-1}\big |&=\,\Big |\sum _k \big ((\mathcal {A}^2)_{ik}-d^2N^{-1}\big )\big ((\mathcal {A}^{r-2})_{kj}-d^{r-2}N^{-1}\big )\Big |\\&\leqslant \, \Big |\sum _{k:k\ne i,j} \big ((\mathcal {A}^2)_{ik}-d^2N^{-1}\big )\big ((\mathcal {A}^{r-2})_{kj}-d^{r-2}N^{-1}\big )\Big |\\&\quad +\Big |\big ((\mathcal {A}^2)_{ii}-d^2N^{-1}\big )\big ((\mathcal {A}^{r-2})_{ij}-d^{r-2}N^{-1}\big )\Big |\\&\quad +\Big |\big ((\mathcal {A}^2)_{ij}-d^2N^{-1}\big )\big ((\mathcal {A}^{r-2})_{jj}-d^{r-2}N^{-1}\big )\Big | \end{aligned} \end{aligned}$$

and (3.3), (3.4), as well as the trivial bounds \(\max _{xy}(\mathcal {A}^2)_{xy}\leqslant d\), \(\max _{xy}(\mathcal {A}^{r-2})_{xy}\leqslant d^{r-3}\). \(\square \)
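Two exact facts drive step (i) of the proof: the identity (3.7), and the cancellation among the three coefficients of \(\mathbb E \mathcal R_r^{n-1}\) in (3.9). Both can be checked in exact rational arithmetic. The minimal Python sketch below does so for a small circulant graph; the graph, the parameter values and the helper names are illustrative assumptions.

```python
from fractions import Fraction

def circulant(N, d):
    """Adjacency matrix of a d-regular circulant graph (d even, d/2 < N/2)."""
    A = [[0] * N for _ in range(N)]
    for i in range(N):
        for s in range(1, d // 2 + 1):
            A[i][(i + s) % N] = 1
            A[i][(i - s) % N] = 1
    return A

def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[a][c] * Y[c][b] for c in range(n)) for b in range(n)]
            for a in range(n)]

def matpow(A, r):
    """r-th power of A for an integer r >= 1."""
    P = A
    for _ in range(r - 1):
        P = matmul(P, A)
    return P

N, d, r = 10, 4, 2
A = circulant(N, d)
Ar, A2r = matpow(A, r), matpow(A, 2 * r)

# Identity (3.7): sum_j ((A^r)_ij - d^r/N)^2 = (A^{2r})_ii - d^{2r}/N.
i = 0
lhs = sum((Fraction(Ar[i][j]) - Fraction(d ** r, N)) ** 2 for j in range(N))
rhs = Fraction(A2r[i][i]) - Fraction(d ** (2 * r), N)

# Cancellation in (3.9): -d^{2r}/N + d^{2r}/(N-d) - d^{2r+1}/((N-d)N) = 0.
cancel = (-Fraction(d ** (2 * r), N) + Fraction(d ** (2 * r), N - d)
          - Fraction(d ** (2 * r + 1), (N - d) * N))
print(lhs == rhs, cancel == 0)
```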

4 Green Function and Local Semicircle Law

For the rest of this paper, we shall use the parameter

$$\begin{aligned} q{:}{=}\sqrt{d(N-d)/N}. \end{aligned}$$
(4.1)

Note that \(q\asymp \sqrt{d}\) for \(d \leqslant N/2\). Let us define the projection \(P_{\bot }{:}{=}I-{{\textbf {e}}} {{\textbf {e}}}^*\), where \({{\textbf {e}}}=N^{-1/2}(1,\ldots ,1)^*\). For \(z \in \mathbb C\) with \({{\,\textrm{Im}\,}}z>0\), we define the Green function by

$$\begin{aligned} G\equiv G(z){:}{=}P_\bot (q^{-1}\mathcal {A}-z)^{-1} P_\bot . \end{aligned}$$

The projection \(P_\bot \) was introduced in [4] to eliminate the large, though trivial impact of \(\lambda _1\) in the computations. As a result, the eigenvalues of G are \((q^{-1}\lambda _2-z)^{-1}\),\((q^{-1}\lambda _3-z)^{-1}\),...,\((q^{-1}\lambda _N-z)^{-1}\) and 0. It is easy to check that

$$\begin{aligned} P_\bot \mathcal {A}=\mathcal {A}P_\bot ,\ G\mathcal {A}=\mathcal {A}G=q(zG+I-{{\textbf {e}}} {{\textbf {e}}}^*), \ \hbox {and}\ \sum _i G_{ij}=\sum _j G_{ij}=0. \end{aligned}$$
(4.2)

For \(M \in \mathbb C^{N\times N}\), we denote its normalized trace by \(\underline{M} \!\,{:}{=}N^{-1}{{\,\textrm{Tr}\,}}M\). For \(x \in \mathbb R\) and \(z \in \mathbb C\) with \({{\,\textrm{Im}\,}}z>0\), we denote the semicircle law and its Stieltjes transform by

$$\begin{aligned} \varrho (x)=\frac{1}{2\pi }\sqrt{(4-x^2)_+} \quad \hbox {and} \quad m(z){:}{=}\int \frac{\varrho (x)}{x-z} \,\textrm{d}x \end{aligned}$$

respectively. The quantity \(m\equiv m(z)\) satisfies \(1+zm+m^2=0\). In addition, writing \(\eta {:}{=}{{\,\textrm{Im}\,}}z\), we have the Ward identity

$$\begin{aligned} \sum _{i}|G_{ij}|^2=\sum _{i}G_{ij}G^*_{ji}=(GG^*)_{jj}=\frac{{{\,\textrm{Im}\,}}G_{jj}}{\eta }\leqslant \frac{|G_{jj}- m|+{{\,\textrm{Im}\,}}m}{\eta }. \end{aligned}$$
(4.3)
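For completeness, here is a sketch of why (4.3) holds, writing \(R{:}{=}(q^{-1}\mathcal {A}-z)^{-1}\) (a shorthand introduced only for this check). The resolvent identity gives \(R-R^*=(z-\bar{z})RR^*=2\textrm{i}\eta RR^*\), and since \(P_\bot \) commutes with R and \(P_\bot ^2=P_\bot \), we have

$$\begin{aligned} G-G^*=P_\bot (R-R^*)P_\bot =2\textrm{i}\eta \,P_\bot RR^*P_\bot =2\textrm{i}\eta \,GG^*. \end{aligned}$$

Taking the (j, j) entry yields \({{\,\textrm{Im}\,}}G_{jj}=\eta (GG^*)_{jj}\). The final inequality in (4.3) is the triangle inequality \({{\,\textrm{Im}\,}}G_{jj}\leqslant |{{\,\textrm{Im}\,}}(G_{jj}-m)|+{{\,\textrm{Im}\,}}m\leqslant |G_{jj}-m|+{{\,\textrm{Im}\,}}m\).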

In the sequel, it is convenient to use the following continuous derivative for any indices ijkl,

$$\begin{aligned} \partial _{ij}^{kl}F(\mathcal A){:}{=}q\,\partial _t F(\mathcal {A}+t\xi _{ij}^{kl}) \big |_{t=0} , \end{aligned}$$
(4.4)

and by Taylor expansion we have

$$\begin{aligned} \textrm{D}_{ij}^{kl}F(\mathcal {A})=\sum _{s=1}^\ell \frac{1}{s!q^s}\big (\partial _{ij}^{kl}\big )^{s} F(\mathcal {A})+\frac{1}{(\ell +1)!q^{\ell +1}}\big (\partial _{ij}^{kl}\big )^{\ell +1}F(\mathcal {A}+\theta \xi _{ij}^{kl}) \end{aligned}$$
(4.5)

for some \(\theta \in [0,1]\). We have the differential rule

$$\begin{aligned}{} & {} \partial _{ij}^{kl}G_{xy}=-G_{xi}G_{jy}-G_{xj}G_{iy}-G_{xk}G_{ly}-G_{xl}G_{ky}\nonumber \\{} & {} \quad +G_{xi}G_{ky}+G_{xk}G_{iy}+G_{xj}G_{ly}+G_{xl}G_{jy}. \end{aligned}$$
(4.6)

The following lemma will be useful in our estimates.

Lemma 4.1

Fix \(r \in {\mathbb {N}}_+\). Suppose \(z=O(1)\). Then

$$\begin{aligned} \max _{ij}|(\mathcal A^rG(z))_{ij}| \prec (d^{r/2}+d^{r-3/2})\Big (\max _{ij}|G(z)_{ij}|+1\Big ). \end{aligned}$$

Proof

The result follows by repeatedly applying the second relation of (4.2) r times, and estimating the result using the trivial bound \(\max _{ij}(\mathcal {A}^n)_{ij}=O(1+d^{n-1})\) and (4.1). \(\square \)

Fix \(\delta >0\), and we define the spectral domain

$$\begin{aligned} {{\textbf {D}}}\equiv {{\textbf {D}}}_{\delta }{:}{=}\{z=E+\textrm{i}\eta : |E|\leqslant \delta ^{-1}, N^{-1+\delta }\leqslant \eta \leqslant \delta ^{-1}\}. \end{aligned}$$
(4.7)

The random graph \(\mathcal {A}\) satisfies the following local semicircle law.

Theorem 4.2

Assume \(N^{\tau } \leqslant d \leqslant N/2\) for some fixed \(\tau >0\). Fix \(\delta \in (0,\tau /10)\). We have

$$\begin{aligned} \max _{ij}\big | G_{ij}(z)-\delta _{ij}m(z) \big |\prec \frac{1}{(N\eta )^{1/4}}+\frac{1}{d^{1/4}} \end{aligned}$$

uniformly for \(z \in {{\textbf {D}}}\).

For the rest of this section we prove the next result; Theorem 4.2 then follows through a standard stability analysis argument, see e.g. [25, Section 4].

Proposition 4.3

Assume \(N^{\tau } \leqslant d \leqslant N/2\) for some fixed \(\tau >0\). Fix \(\delta \in (0,\tau /10)\) and \(\nu \in (0,\delta /10)\). Let \(z\in {{\textbf {D}}}\), and suppose that \(\max _{ij}|G_{ij}-\delta _{ij}m|\prec \phi \) for some deterministic \(\phi \in [N^{-1},N^{\nu }]\) at z. Then at z we have

$$\begin{aligned} \max _{ij}\big |\delta _{ij}+zG_{ij}+\underline{G} \!\,G_{ij} \big |\prec (1+\phi )^3\cdot \sqrt{\frac{\phi +{{\,\textrm{Im}\,}}m}{N\eta }+\frac{1}{d}}{=}{:}\widetilde{\mathcal E}. \end{aligned}$$

Suppose that

$$\begin{aligned} \max _{ij}\big |\delta _{ij}+zG_{ij}+\underline{G} \!\,G_{ij} \big |\prec \Phi \end{aligned}$$
(4.8)

for some deterministic \(\Phi \in [\widetilde{\mathcal E},N^2]\). Fix \(n\in \mathbb N_+\). Fix indices ij and denote \(P\equiv P_{ij}{:}{=}\delta _{ij}+zG_{ij}+\underline{G} \!\,G_{ij}\). Proposition 4.3 is an easy consequence of

$$\begin{aligned} \mathbb E |P|^{2n}\prec \Phi ^n\widetilde{\mathcal E}^{n}. \end{aligned}$$
(4.9)

More precisely, since n is an arbitrary fixed integer, we obtain from Markov’s inequality that \(P\prec (\Phi \widetilde{\mathcal E})^{1/2}\). Taking a union bound over indices ij, we get

$$\begin{aligned} \max _{ij}\big |\delta _{ij}+zG_{ij}+\underline{G} \!\,G_{ij} \big |\prec (\Phi \widetilde{\mathcal E})^{1/2}, \end{aligned}$$

provided that (4.8) holds. Iterating the above, we get Proposition 4.3 as desired.
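The iteration can be made explicit as follows; this is a sketch in which the sequence \(\Phi _k\) and the parameter \(\varepsilon \) are notation introduced here. By the trivial bound \(|G_{ij}|\leqslant \eta ^{-1}\leqslant N\) on \({{\textbf {D}}}\), the estimate (4.8) holds with \(\Phi _0{:}{=}N^2\). Setting \(\Phi _{k+1}{:}{=}(\Phi _k\widetilde{\mathcal E})^{1/2}\), an induction gives

$$\begin{aligned} \Phi _k=\widetilde{\mathcal E}^{\,1-2^{-k}}\Phi _0^{2^{-k}}=\widetilde{\mathcal E}\cdot \big (\Phi _0/\widetilde{\mathcal E}\big )^{2^{-k}}\leqslant \widetilde{\mathcal E}\cdot N^{3\cdot 2^{-k}}, \end{aligned}$$

where we used \(\widetilde{\mathcal E}\geqslant d^{-1/2}\geqslant N^{-1}\). In particular \(\Phi _k\geqslant \widetilde{\mathcal E}\) for every k, so each step is admissible. For any fixed \(\varepsilon >0\), choosing k with \(3\cdot 2^{-k}\leqslant \varepsilon \) shows that the left-hand side of (4.8) is \(\prec N^{\varepsilon }\widetilde{\mathcal E}\) after finitely many steps, which is the claim of Proposition 4.3.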

Let us look into the proof of (4.9). By (4.2), we get \(P=q^{-1}(\mathcal {A}G)_{ij}+\underline{G} \!\,G_{ij}+N^{-1}\), and thus

$$\begin{aligned} \begin{aligned} \mathbb E |P|^{2n}&=\frac{1}{q}\sum _{k}\mathbb E\mathcal {A}_{ik}G_{kj}P^{n-1}\overline{P}^n+\mathbb E \underline{G} \!\,G_{ij}P^{n-1}\overline{P}^{n}\\&\quad +O(N^{-1})\cdot \mathbb E |P|^{2n-1}\\&{=}{:}\hbox {(I)+(II)}+O(N^{-1})\cdot \mathbb E |P|^{2n-1}. \end{aligned} \end{aligned}$$
(4.10)
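The rewriting of P used in (4.10) is the (i, j) entry of the second relation in (4.2). As a quick check, since \(({{\textbf {e}}} {{\textbf {e}}}^*)_{ij}=N^{-1}\),

$$\begin{aligned} q^{-1}(\mathcal {A}G)_{ij}=zG_{ij}+\delta _{ij}-N^{-1}, \quad \hbox {so that} \quad P=\delta _{ij}+zG_{ij}+\underline{G} \!\,G_{ij}=q^{-1}(\mathcal {A}G)_{ij}+\underline{G} \!\,G_{ij}+N^{-1}. \end{aligned}$$

Replacing one factor of P in \(|P|^{2n}=P\cdot P^{n-1}\overline{P}^n\) by this expression, and writing \((\mathcal {A}G)_{ij}=\sum _k\mathcal {A}_{ik}G_{kj}\), produces the three terms in (4.10).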

We denote \(\mathcal P{:}{=}(\mathbb E|P|^{2n})^{\frac{1}{2n}}\) and \(\mathcal E{:}{=}(\Phi \widetilde{\mathcal E})^{1/2}\). It suffices to show that

$$\begin{aligned} \hbox {(I)+(II)}\prec \sum _{r=1}^{2n} \mathcal E^{r}\mathcal P^{2n-r}. \end{aligned}$$
(4.11)
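To see why (4.11) suffices, here is a sketch of the standard argument; the parameters \(\varepsilon \) and K are introduced only for this explanation. By Hölder's inequality, \(\mathbb E|P|^{2n-1}\leqslant \mathcal P^{2n-1}\), and \(N^{-1}\leqslant \widetilde{\mathcal E}\leqslant \mathcal E\) because \(\widetilde{\mathcal E}\geqslant d^{-1/2}\) and \(\Phi \geqslant \widetilde{\mathcal E}\). Hence (4.10) and (4.11) give, for every fixed \(\varepsilon >0\) and N large enough,

$$\begin{aligned} \mathcal P^{2n}\leqslant N^{\varepsilon }\sum _{r=1}^{2n}\mathcal E^{r}\mathcal P^{2n-r}. \end{aligned}$$

If \(\mathcal P>K\mathcal E\) with \(K{:}{=}N^{\varepsilon }+2\), then \(\mathcal E^{r}\mathcal P^{2n-r}<K^{-r}\mathcal P^{2n}\), and the right-hand side would be less than \(N^{\varepsilon }\mathcal P^{2n}\sum _{r\geqslant 1}K^{-r}=N^{\varepsilon }\mathcal P^{2n}/(K-1)<\mathcal P^{2n}\), a contradiction. Therefore \(\mathcal P\leqslant (N^{\varepsilon }+2)\mathcal E\), that is, \(\mathbb E|P|^{2n}\prec \mathcal E^{2n}=\Phi ^n\widetilde{\mathcal E}^{n}\), which is (4.9).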

To simplify notation, we shall drop the complex conjugates in (I)+(II) (which play no role in the subsequent analysis), and prove

$$\begin{aligned} \hbox {(I)'+(II)'}{:}{=}\frac{1}{q}\sum _{k}\mathbb E\mathcal {A}_{ik}G_{kj}P^{2n-1}+\mathbb E \underline{G} \!\,G_{ij}P^{2n-1}\prec \sum _{r=1}^{2n} \mathcal E^{r}\mathcal P^{2n-r} \end{aligned}$$
(4.12)

instead of (4.11). By the triangle inequality, the fact that \(|m(z)|=O(1)\), and the assumption \(\max _{ij}|G_{ij}-\delta _{ij}m|\prec \phi \) of Proposition 4.3, we have

$$\begin{aligned} \max _{ij}|G_{ij}| \prec 1+\phi . \end{aligned}$$
(4.13)

By Lemma 2.2, we have

$$\begin{aligned} \hbox {(I)'}&=\,\frac{1}{(N-d)dq}\sum _{klx:k\ne i}\mathbb E \chi _{il}^{kx}(\mathcal A)\textrm{D}_{ik}^{lx}(G_{kj}P^{2n-1}) +\frac{d}{(N-d)q}\sum _{k:k\ne i}\mathbb E G_{kj}P^{2n-1}\nonumber \\&\quad -\frac{1}{(N-d)dq}\sum _{k:k\ne i}\mathbb E (\mathcal {A}^3)_{ik}G_{kj}P^{2n-1}+O(N^{-1}q^{-1})\cdot \sum _{k:k \ne i}\mathbb E\mathcal M_{ik}(G_{kj}P^{2n-1})\nonumber \\&=\,\,\frac{1}{(N-d)dq}\sum _{klx}\mathbb E \chi _{il}^{kx}(\mathcal A)\textrm{D}_{ik}^{lx}(G_{kj}P^{2n-1}) +\frac{d}{(N-d)q}\sum _{k}\mathbb E G_{kj}P^{2n-1}\nonumber \\&\quad -\frac{1}{(N-d)dq}\sum _{k}\mathbb E (\mathcal {A}^3)_{ik}G_{kj}P^{2n-1}+O(N^{-1}q^{-1})\cdot \sum _{k:k \ne i}\mathbb E\mathcal M_{ik}(G_{kj}P^{2n-1})\nonumber \\&\quad -\frac{1}{(N-d)dq}\sum _{lx}\mathbb E \chi _{il}^{ix}(\mathcal A)\textrm{D}_{ii}^{lx}(G_{ij}P^{2n-1}) -\frac{d}{(N-d)q}\mathbb E G_{ij}P^{2n-1}\nonumber \\&\quad +\frac{1}{(N-d)dq}\mathbb E (\mathcal {A}^3)_{ii}G_{ij}P^{2n-1} {=}{:}T_1+\cdots +T_7. \end{aligned}$$
(4.14)

By the last relation of (4.2), it is easy to see that \(T_2=0\). Applying Lemma 4.1 for \(r=3\), we have \((\mathcal {A}^3\,G)_{ij}\prec (1+\phi )\,d^{3/2}\). Thus

$$\begin{aligned}{} & {} T_3=O({N^{-1}d^{-3/2}})\cdot \mathbb E |(\mathcal A^3G)_{ij}P^{2n-1}| \\{} & {} \quad \prec (1+\phi )N^{-1} \mathbb E |P^{2n-1}|\leqslant (1+\phi )N^{-1}\mathcal P^{2n-1}. \end{aligned}$$

Let us estimate the remainder term \(T_4\). By (3.2), (4.5), (4.6) and (4.13), we see that

$$\begin{aligned}{} & {} \mathcal M_{ik}(G_{kj}P^{2n-1}) \prec |G_{kj}P^{2n-1}|+\max _{xy}|\textrm{D}_{ik}^{xy}G_{kj}P^{2n-1}| \\{} & {} \quad \prec (1+\phi ) \sum _{r=1}^{2n} ((1+\phi )^3q^{-1})^{r-1}P^{2n-r}, \end{aligned}$$

which implies

$$\begin{aligned} T_4 \prec N^{-1}q^{-1}\cdot N \cdot (1+\phi )\sum _{r=1}^{2n} ((1+\phi )^3q^{-1})^{r-1}\mathbb EP^{2n-r} \prec \sum _{r=1}^{2n} \mathcal E^{r}\mathcal P^{2n-r}.\nonumber \\ \end{aligned}$$
(4.15)

In the sequel, the remainder terms from Lemma 2.2 will always be small enough for our purposes, and we shall omit their estimates. Moreover,

$$\begin{aligned} T_5 \prec N^{-1}d^{-3/2}(1+\phi )\mathbb E\sum _{lx} \mathcal A_{il}\mathcal A_{ix} \sum _{r=1}^{2n} ((1+\phi )^3q^{-1})^{r-1}P^{2n-r}\prec \sum _{r=1}^{2n} \mathcal E^{r}\mathcal P^{2n-r} \end{aligned}$$

and

$$\begin{aligned} T_6,\,T_7 \prec \mathcal E \mathcal P^{2n-1}. \end{aligned}$$

As a result, (4.14) simplifies to

$$\begin{aligned} \hbox {(I)'}=T_1+\sum _{r=1}^{2n} O_{\prec }(\mathcal E^{r})\cdot \mathcal P^{2n-r}. \end{aligned}$$
(4.16)

To examine the terms in \(T_1\), we split according to (3.2)

$$\begin{aligned} \begin{aligned}&T_1=\,\frac{1}{(N-d)dq}\sum _{klx}\mathbb E \chi _{il}^{kx}(\mathcal A)\textrm{D}_{ik}^{lx}(G_{kj})P^{2n-1}\\&\qquad +\frac{1}{(N-d)dq}\sum _{klx}\mathbb E \chi _{il}^{kx}(\mathcal A)G_{kj}\textrm{D}_{ik}^{lx}(P^{2n-1})\\&\qquad +\frac{1}{(N-d)dq}\sum _{klx}\mathbb E \chi _{il}^{kx}(\mathcal A)\textrm{D}_{ik}^{lx}(G_{kj})\textrm{D}_{ik}^{lx}(P^{2n-1})\\&\quad {=}{:}T_{1,1}+T_{1,2}+T_{1,3}. \end{aligned} \end{aligned}$$
(4.17)

4.1 Computation of \(T_{1,1}\)

Applying (4.5) with \(\ell =1\), we get

$$\begin{aligned}&T_{1,1}=\frac{1}{(N-d)dq^2}\sum _{klx}\mathbb E \chi _{il}^{kx}(\mathcal A)\partial _{ik}^{lx}(G_{kj})P^{2n-1}\nonumber \\&\qquad +\frac{1}{2(N-d)dq^3}\sum _{klx}\mathbb E \chi _{il}^{kx}(\mathcal A)((\partial _{ik}^{lx})^2G_{kj}(\mathcal A+\theta \xi _{ik}^{lx}))P^{2n-1}\nonumber \\&\quad {=}{:}T_{1,1,1}+T_{1,1,2} \end{aligned}$$
(4.18)

for some \(\theta \in [0,1]\). By (4.6), we get

$$\begin{aligned} T_{1,1,1}= & {} \frac{1}{(N-d)dq^2}\sum _{klx}\mathbb E \chi _{il}^{kx}(\mathcal A)P^{2n-1}\big (-G_{kk}G_{ij}-G_{ki}G_{kj}-G_{kl}G_{xj}-G_{kx}G_{lj}\nonumber \\{} & {} +G_{ki}G_{xj}+G_{kx}G_{ij}+G_{kl}G_{kj}+G_{kk}G_{lj}\big ){=}{:}\sum _{s=1}^{8}T_{1,1,1,s}. \end{aligned}$$
(4.19)

The term \(T_{1,1,1,1}\) is the leading term in our computation. Recall the definition of \(\chi \) in (2.2). We have

$$\begin{aligned} \begin{aligned} T_{1,1,1,1}=&-\frac{1}{(N-d)dq^2}\sum _k\mathbb E\big (\mathcal {A}_{ik}(\mathcal {A}^3)_{ik}+d^2-d^2\mathcal {A}_{ik}-(\mathcal {A}^3)_{ik}\big )G_{kk}G_{ij}P^{2n-1}\\&=-\frac{1}{(N-d)dq^2}\sum _k\mathbb E\big (\mathcal {A}_{ik}d^3N^{-1}+d^2-d^2\mathcal {A}_{ik}-d^3N^{-1}\big )G_{kk}G_{ij}P^{2n-1}\\&\quad +O_\prec \big ((1+\phi )^2 N^{-1/2} \big )\mathbb E |P|^{2n-1}\\&=\frac{d}{Nq^2}\sum _{k}\mathbb E \mathcal {A}_{ik}G_{kk}G_{ij}P^{2n-1}-\frac{d}{q^2}\mathbb E \underline{G} \!\,G_{ij}P^{2n-1}+O_\prec (\mathcal E)\mathbb E |P|^{2n-1}. \end{aligned} \end{aligned}$$
(4.20)
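The first line of (4.20) comes from summing \(\chi \) over l and x. As a sketch, assume the form \(\chi _{il}^{kx}(\mathcal {A})=\mathcal {A}_{il}\mathcal {A}_{kx}(1-\mathcal {A}_{ik})(1-\mathcal {A}_{lx})\), which is how (2.2) is read here (it agrees with the four-term expansion used for \(T_{1,1,1,4}\)). Using \(\sum _l\mathcal {A}_{il}=\sum _x\mathcal {A}_{kx}=d\) and \(\sum _{lx}\mathcal {A}_{il}\mathcal {A}_{lx}\mathcal {A}_{xk}=(\mathcal {A}^3)_{ik}\), we get

$$\begin{aligned} \sum _{lx}\chi _{il}^{kx}(\mathcal {A})=(1-\mathcal {A}_{ik})\big (d^2-(\mathcal {A}^3)_{ik}\big )=\mathcal {A}_{ik}(\mathcal {A}^3)_{ik}+d^2-d^2\mathcal {A}_{ik}-(\mathcal {A}^3)_{ik}. \end{aligned}$$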

Here in the second step we used (3.4), which implies

$$\begin{aligned} \begin{aligned}&\,\frac{1}{(N-d)dq^2}\sum _{k}\mathbb E \mathcal {A}_{ik} ((\mathcal {A}^3)_{ik}-d^3N^{-1})G_{kk}G_{ij} P^{2n-1} \\&\quad \prec \ \frac{(1+\phi )^2 }{Nd^2} \cdot (d+d^2N^{-1/2}) \cdot N\mathbb E |P|^{2n-1} \prec \mathcal E\,\mathbb E |P|^{2n-1} \end{aligned} \end{aligned}$$

and similarly

$$\begin{aligned} \frac{1}{(N-d)dq^2}\sum _{k}\mathbb E ((\mathcal {A}^3)_{ik}-d^3N^{-1})G_{kk}G_{ij} P^{2n-1} \prec \mathcal E\,\mathbb E |P|^{2n-1}. \end{aligned}$$

For the first term on the RHS of (4.20), we again apply Lemma 2.2, this time with \(F(\mathcal {A})=G_{kk}G_{ij}P^{2n-1}\), and get

$$\begin{aligned}{} & {} \frac{d}{Nq^2}\sum _{k}\mathbb E \mathcal {A}_{ik}G_{kk}G_{ij}P^{2n-1}\nonumber \\{} & {} \quad =\,\frac{1}{(N-d)Nq^2}\sum _{kxy}\mathbb E \chi _{ix}^{ky}(\mathcal {A})\textrm{D}_{ik}^{xy}(G_{kk}G_{ij}P^{2n-1}) +\frac{d^2}{(N-d)q^2}\mathbb E \underline{G} \!\,G_{ij}P^{2n-1}\nonumber \\{} & {} \qquad -\frac{1}{(N-d)Nq^2}\sum _k\mathbb E (\mathcal {A}^3)_{ik}G_{kk}G_{ij}P^{2n-1}\nonumber \\{} & {} \qquad +O\bigg (\frac{d}{N^2q^2}\bigg )\cdot \sum _k\mathbb E\mathcal M_{ik}(G_{kk}G_{ij}P^{2n-1}) +\sum _{r=1}^{2n} O_{\prec }(\mathcal E^{r}) \mathcal P^{2n-r} \end{aligned}$$
(4.21)

By (4.6) and (4.13) one can check that

$$\begin{aligned} \begin{aligned} \textrm{D}_{ik}^{xy}(G_{kk}G_{ij}P^{2n-1})&\prec \,\frac{(1+\phi )^{3}}{q}\sum _{r=1}^{2n}\frac{(1+\phi )^{3r-1}}{q^{r-1}}P^{2n-r} +\sum _{r=2}^{2n}\frac{(1+\phi )^{3r-1}}{q^{r-1}}P^{2n-r}\\&\quad \big (|G_{ik}|+|G_{ix}|+|G_{iy}|+|G_{kj}|+|G_{xj}|+|G_{yj}|\big ). \end{aligned} \end{aligned}$$

By (4.3), we have

$$\begin{aligned} \sum _{kxy}\chi _{ix}^{ky}(\mathcal {A})\big (|G_{ik}|+|G_{ix}|+|G_{iy}|+|G_{kj}|+|G_{xj}|+|G_{yj}|\big ) \prec dN^2(1+\phi ) \sqrt{\frac{\phi +{{\,\textrm{Im}\,}}m}{N\eta }}, \end{aligned}$$

and together with \(\sum _{kxy}\chi _{ix}^{ky}(\mathcal {A})\leqslant d^2N\) and \(q\asymp \sqrt{d}\), we get

$$\begin{aligned} \frac{1}{(N-d)Nq^2}\sum _{kxy}\mathbb E \chi _{ix}^{ky}(\mathcal {A})\textrm{D}_{ik}^{xy}(G_{kk}G_{ij}P^{2n-1}) \prec \sum _{r=1}^{2n} \mathcal E^{r} \mathcal P^{2n-r}. \end{aligned}$$
(4.22)

Applying (3.4), we get

$$\begin{aligned}{} & {} -\frac{1}{(N-d)Nq^2}\sum _k\mathbb E (\mathcal {A}^3)_{ik}G_{kk}G_{ij}P^{2n-1}\nonumber \\{} & {} \quad =-\frac{d^3}{(N-d)Nq^2}\mathbb E \underline{G} \!\,G_{ij}P^{2n-1}+O_\prec (\mathcal E)\cdot \mathcal P^{2n-1}. \end{aligned}$$
(4.23)

In addition, similar to (4.15), it can be shown that

$$\begin{aligned} O\bigg (\frac{d}{N^2q^2}\bigg )\cdot \sum _k\mathbb E\mathcal M_{ik}(G_{kk}G_{ij}P^{2n-1}) \prec \sum _{r=1}^{2n} \mathcal E^{r} \mathcal P^{2n-r}. \end{aligned}$$

Combining the above and (4.21)–(4.23), we get

$$\begin{aligned} \frac{d}{Nq^2}\sum _{k}\mathbb E \mathcal {A}_{ik}G_{kk}G_{ij}P^{2n-1}=\frac{d^2}{Nq^2}\mathbb E \underline{G} \!\,G_{ij}P^{2n-1}+\sum _{r=1}^{2n} O_\prec (\mathcal E^{r}\mathcal P^{2n-r}).\nonumber \\ \end{aligned}$$
(4.24)

Hence (4.1), (4.20) and (4.24) imply

$$\begin{aligned} T_{1,1,1,1}=-\mathbb E \underline{G} \!\,G_{ij}P^{2n-1}+\sum _{r=1}^{2n} O_\prec (\mathcal E^{r} \mathcal P^{2n-r}). \end{aligned}$$
(4.25)
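The coefficient \(-1\) in (4.25) is a one-line computation. Inserting (4.24) into the last line of (4.20), the two terms containing \(\mathbb E \underline{G} \!\,G_{ij}P^{2n-1}\) combine with the coefficient

$$\begin{aligned} \frac{d^2}{Nq^2}-\frac{d}{q^2}=-\frac{d(N-d)}{Nq^2}=-1, \end{aligned}$$

since \(q^2=d(N-d)/N\) by (4.1).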

The other terms on the RHS of (4.19) are error terms; let us estimate them one by one. By \(\chi _{il}^{kx}(\mathcal {A}) \leqslant \mathcal {A}_{il}\mathcal {A}_{kx}\), we have

$$\begin{aligned}{} & {} T_{1,1,1,2} \prec N^{-1}d^{-2}\sum _{klx} \mathbb E \mathcal {A}_{il}\mathcal {A}_{kx} |P|^{2n-1} |G_{ki}G_{kj}| \\{} & {} \quad \prec N^{-1}\sum _{k} \mathbb E |P|^{2n-1} |G_{ki}G_{kj}| \prec \mathcal E \mathcal P^{2n-1}, \end{aligned}$$

where in the last step we used (4.3) and Jensen’s inequality. In addition,

$$\begin{aligned}{} & {} |T_{1,1,1,3}| \prec N^{-1}d^{-2}\sum _{klx} \mathbb E \mathcal {A}_{il}\mathcal {A}_{kx} |P|^{2n-1} |(1+\phi )G_{xj}| \\{} & {} \quad \prec N^{-1}\sum _{x} \mathbb E |P|^{2n-1} |(1+\phi )G_{xj}| \prec \mathcal E \mathcal P^{2n-1}. \end{aligned}$$
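The role of (4.3) in the last steps of the two estimates above can be spelled out as follows; this is a sketch in which \(a{:}{=}\big ((\phi +{{\,\textrm{Im}\,}}m)/(N\eta )\big )^{1/2}\) is shorthand introduced here. By (4.3) and the assumption of Proposition 4.3, \(\sum _k|G_{kj}|^2\prec Na^2\) for every j, so the Cauchy–Schwarz inequality gives

$$\begin{aligned} \frac{1}{N}\sum _k|G_{ki}G_{kj}|\leqslant \frac{1}{N}\Big (\sum _k|G_{ki}|^2\Big )^{1/2}\Big (\sum _k|G_{kj}|^2\Big )^{1/2}\prec a^2 \quad \hbox {and} \quad \frac{1}{N}\sum _x|G_{xj}|\leqslant \frac{1}{\sqrt{N}}\Big (\sum _x|G_{xj}|^2\Big )^{1/2}\prec a. \end{aligned}$$

Since \(a\leqslant 1\) on \({{\textbf {D}}}\) for N large (because \(\phi \leqslant N^{\nu }\) and \(N\eta \geqslant N^{\delta }\) with \(\nu <\delta /10\)), and \((1+\phi )a\leqslant \widetilde{\mathcal E}\leqslant \mathcal E\), the first quantity and \((1+\phi )\) times the second are both \(\prec \mathcal E\).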

Similarly, we can show that \(|T_{1,1,1,5}|+|T_{1,1,1,7}|\prec \mathcal E \mathcal P^{2n-1}\). Next, we have

$$\begin{aligned} \begin{aligned} T_{1,1,1,4}&=-\frac{1}{(N-d)dq^2}\sum _{klx}\mathbb E(\mathcal {A}_{il}\mathcal {A}_{kx}+\mathcal {A}_{il}\mathcal {A}_{ik}\mathcal {A}_{kx}\mathcal {A}_{lx}\\&\quad -\mathcal {A}_{il}\mathcal {A}_{kx}\mathcal {A}_{lx}-\mathcal {A}_{il}\mathcal {A}_{ik}\mathcal {A}_{kx})G_{kx}G_{lj}P^{2n-1}\\&=-\frac{1}{(N-d)dq^2}\mathbb E {{\,\textrm{Tr}\,}}(\mathcal {A}G) (\mathcal {A}G)_{ij} P^{2n-1}+O(N^{-1}d^{-2})\\&\quad \times \sum _{klx} \mathbb E (\mathcal {A}_{kx}\mathcal {A}_{lx}+\mathcal {A}_{ik}\mathcal {A}_{kx})(1+\phi )|G_{lj}| |P|^{2n-1}\\&\prec d^{-1}\mathcal P^{2n-1}+N^{-1}\sum _{l}(1+\phi )|G_{lj}| |P|^{2n-1}\prec \mathcal E\mathcal P^{2n-1}, \end{aligned} \end{aligned}$$

where in the third step we used Lemma 4.1. Similarly, we can show that \(|T_{1,1,1,6}|+|T_{1,1,1,8}|\prec \mathcal E \mathcal P^{2n-1}\). As a result,

$$\begin{aligned} \sum _{s=2}^{8}T_{1,1,1,s} \prec \mathcal E \mathcal P^{2n-1}. \end{aligned}$$
(4.26)

By the resolvent identity and (4.13), it is easy to see that \(((\partial _{ik}^{lx})^2G_{kj}(\mathcal A+\theta \xi _{ik}^{lx})) \prec (1+\phi )^3\), and thus

$$\begin{aligned} T_{1,1,2}\prec & {} \frac{(1+\phi )^3}{(N-d)dq^3}\sum _{klx}\mathbb E | \chi _{il}^{kx}(\mathcal A)P^{2n-1}| \prec \frac{(1+\phi )^3}{Nd^{5/2}}\sum _{klx}\mathbb E | \mathcal {A}_{il}\mathcal {A}_{kx}P^{2n-1}|\\\prec & {} \frac{(1+\phi )^3}{d^{1/2}} \mathcal P^{2n-1}, \end{aligned}$$

where in the second step we used \(q\asymp \sqrt{d}\). Combining the above with (4.18), (4.19), (4.25) and (4.26), we finish the computation of \(T_{1,1}\) by getting

$$\begin{aligned} T_{1,1}=-\mathbb E \underline{G} \!\,G_{ij} P^{2n-1}+\sum _{r=1}^{2n} O_\prec (\mathcal E^{r}\mathcal P^{2n-r}). \end{aligned}$$
(4.27)

4.2 Estimate of \(T_{1,2}\)

Case 1. Let us first illustrate the steps in the dense regime \(d \asymp N\). In this case, \(q\asymp \sqrt{d}\asymp \sqrt{N}\). Trivially, we have \(\textrm{D}_{ik}^{lx}P \prec \mathcal E\). By (4.3), (4.6) and (4.13), we have

$$\begin{aligned}{} & {} q^{-1}\partial _{ik}^{lx} P \prec q^{-1}(1+\phi )^2(|G_{kj}|+|G_{xj}|+|G_{lj}|+|G_{ik}|+|G_{il}|)+q^{-1}\mathcal E \ \ \hbox {and}\ \ \\{} & {} q^{-s}(\partial _{ik}^{lx})^s P \prec q^{-1}\mathcal E \end{aligned}$$

for \(s \geqslant 2\). Together with (3.2) and (4.5), it is not hard to see that

$$\begin{aligned} \textrm{D}_{ik}^{lx}(P^{2n-1})\prec & {} q^{-1}(1+\phi )^2(|G_{kj}|+|G_{xj}|+|G_{lj}|+|G_{ik}|+|G_{il}|)\\{} & {} \times \sum _{r=2}^{2n} \mathcal E^{r-2}|P|^{2n-r}+q^{-1}\sum _{r=2}^{2n} \mathcal E^{r-1}|P|^{2n-r}. \end{aligned}$$

Together with the trivial bound \(\chi _{il}^{kx}(\mathcal {A})\leqslant 1\) and (4.3), we conclude that

$$\begin{aligned} \begin{aligned} T_{1,2}&\prec N^{-5/2}\sum _{klx}\mathbb E |G_{kj}\textrm{D}_{ik}^{lx}(P^{2n-1})|\\&\prec N^{-3} (1+\phi )^2 \sum _{klx}\mathbb E\bigg ( |G_{kj}|(|G_{kj}|+|G_{xj}|+|G_{lj}|\\&\quad +|G_{ik}|+|G_{il}|)\sum _{r=2}^{2n} \mathcal E^{r-2}|P|^{2n-r}\bigg )+\sum _{r=2}^{2n} \mathcal E^{r}\mathcal P^{2n-r}\\&\prec \sum _{r=2}^{2n} \mathcal E^{r}\mathcal P^{2n-r}. \end{aligned} \end{aligned}$$

Case 2. Now let us examine the general case. The added complexity is largely due to the fact that we are including the sparse regime \(d \ll N\), and as a result we cannot estimate the entries of \(\mathcal {A}\) by 1; instead, they have to be exploited in the summations. By (4.2), we can rewrite P as \(P=q^{-1}(\mathcal {A}G)_{ij}+\underline{G} \!\,G_{ij}+N^{-1}\). Using (3.2), (4.3), (4.5), (4.6) and (4.13), we get

$$\begin{aligned} \begin{aligned} \textrm{D}_{ik}^{lx}P&=\,q^{-1}\sum _a (\textrm{D}_{ik}^{lx}\mathcal {A}_{ia}) G_{aj} +q^{-1}\sum _a (\textrm{D}_{ik}^{lx}\mathcal {A}_{ia})( \textrm{D}_{ik}^{lx}G_{aj})+q^{-1}\sum _a \mathcal {A}_{ia}( \textrm{D}_{ik}^{lx}G_{aj})\\&\quad +(\textrm{D}_{ik}^{lx}\underline{G} \!\,)G_{ij} +(\textrm{D}_{ik}^{lx} \underline{G} \!\,)(\textrm{D}_{ik}^{lx} G_{ij})+ \underline{G} \!\,(\textrm{D}_{ik}^{lx} G_{ij})\\&=\, q^{-1}\sum _a (\textrm{D}_{ik}^{lx}\mathcal {A}_{ia}) G_{aj}+q^{-1}\sum _a \mathcal {A}_{ia}( \textrm{D}_{ik}^{lx}G_{aj})\\&\quad + \underline{G} \!\,(\textrm{D}_{ik}^{lx} G_{ij})+O_\prec ((1+\phi )^3)\cdot \bigg ( \frac{1}{q^2}+\frac{{{\,\textrm{Im}\,}}m +\phi }{N\eta q}\bigg )\\&=\,q^{-1}\sum _a (\textrm{D}_{ik}^{lx}\mathcal {A}_{ia}) G_{aj}+q^{-2}\sum _a \mathcal {A}_{ia}( \partial _{ik}^{lx}G_{aj})\\&\quad + q^{-1}\underline{G} \!\,(\partial _{ik}^{lx} G_{ij})+O_{\prec }(\mathcal Eq^{-1}). \end{aligned} \end{aligned}$$

Let us denote \(P_*{:}{=}\max _{ij}|\delta _{ij}+zG_{ij}+\underline{G} \!\,G_{ij}|=\max _{ij}|q^{-1}(\mathcal {A}G)_{ij}+\underline{G} \!\,G_{ij}+N^{-1}|\). By (4.6) and (4.13), it is not hard to see that

$$\begin{aligned} q^{-2}\sum _a \mathcal {A}_{ia}( \partial _{ik}^{lx}G_{aj})+ q^{-1}\underline{G} \!\,(\partial _{ik}^{lx} G_{ij}) \prec (1+\phi )q^{-1} P_*+N^{-1}\prec (1+\phi )q^{-1} \Phi +N^{-1}, \end{aligned}$$

where in the last step we also used our assumption (4.8). The above shows that heuristically, \(\textrm{D}_{ik}^{lx}\) on P generates some self-similar terms. Hence

$$\begin{aligned} \textrm{D}_{ik}^{lx}P&=\,q^{-1}\sum _a (\textrm{D}_{ik}^{lx}\mathcal {A}_{ia}) G_{aj}+O_{\prec }((1+\phi )\Phi q^{-1}+\mathcal Eq^{-1})\nonumber \\&=\,q^{-1}(G_{kj}+\delta _{ik}G_{ij}+\delta _{il}G_{xj}+\delta _{ix}G_{lj} -G_{lj}-\delta _{il}G_{ij}-\delta _{ik}G_{xj}-\delta _{ix}G_{kj})\nonumber \\&\quad +O_{\prec }((1+\phi )\Phi q^{-1}+\mathcal Eq^{-1}). \end{aligned}$$
(4.28)

Let us define \(X{:}{=}(2n-1)P^{2n-2}+(2n-2)P \textrm{D}_{ik}^{lx}(P^{2n-3})+(2n-3)P^2 \textrm{D}_{ik}^{lx}(P^{2n-4})+\cdots + 2P^{2n-3} \textrm{D}_{ik}^{lx}(P)\). Note that the trivial estimate \(\textrm{D}_{ik}^{lx}P \prec \mathcal E\) implies

$$\begin{aligned} X\prec \sum _{r=2}^{2n}\mathcal E^{r-2} |P|^{2n-r}. \end{aligned}$$
(4.29)

By (4.28) and (4.29), we get

$$\begin{aligned}&\textrm{D}_{ik}^{lx}(P^{2n-1})=(\textrm{D}_{ik}^{lx}P) X \nonumber \\&\quad =\ q^{-1}(G_{kj}+\delta _{ik}G_{ij}+\delta _{il}G_{xj}\nonumber \\&\qquad +\delta _{ix}G_{lj}-G_{lj}-\delta _{il}G_{ij}-\delta _{ik}G_{xj}-\delta _{ix}G_{kj})X\nonumber \\&\qquad +\sum _{r=2}^{2n} O_{\prec }(\mathcal E^r)\cdot |P|^{2n-r} \,. \end{aligned}$$
(4.30)

By (4.1), (4.3) and (4.13), it is easy to see that

$$\begin{aligned}{} & {} \frac{1}{(N-d)dq^2}\sum _{klx}\mathbb E |\chi _{il}^{kx}(\mathcal A)G_{kj}q^{-1}(\delta _{ik}G_{ij}+\delta _{il}G_{xj}\nonumber \\{} & {} \quad +\delta _{ix}G_{lj}-\delta _{il}G_{ij}-\delta _{ik}G_{xj}-\delta _{ix}G_{kj})| \prec \mathcal E^2. \end{aligned}$$
(4.31)

Inserting (4.29)–(4.31) into (4.17), we get

$$\begin{aligned} \begin{aligned} T_{1,2}=&\ \frac{1}{(N-d)dq}\sum _{klx} \mathbb E \chi _{il}^{kx}(\mathcal {A})G_{kj} (\textrm{D}_{ik}^{lx}P) X\\ =&\ \frac{1}{(N-d)dq^2}\sum _{klx} \mathbb E \chi _{il}^{kx}(\mathcal {A})G_{kj}(G_{kj}-G_{lj})X+\sum _{r=2}^{2n}O_\prec (\mathcal E^{r})\cdot \mathcal P^{2n-r}. \end{aligned} \end{aligned}$$
(4.32)

By (4.3) and (4.29), we have

$$\begin{aligned} \begin{aligned}&\frac{1}{(N-d)dq^2}\sum _{klx} \mathbb E \chi _{il}^{kx}(\mathcal {A})G^2_{kj}X \\&\quad \prec \frac{1}{Nd^{2}}\sum _{klx} \mathbb E\bigg ( |\mathcal {A}_{il}\mathcal {A}_{kx}G^2_{kj}| \cdot \sum _{r=2}^{2n} \mathcal E^{r-2} |P|^{2n-r} \bigg )\\&\quad \prec \sum _{r=2}^{2n} \mathcal E^{r} \mathcal P^{2n-r} , \end{aligned} \end{aligned}$$

and combining the above with (4.32) yields

$$\begin{aligned} \begin{aligned} T_{1,2}&=-\frac{1}{(N-d)dq^2}\sum _{klx} \mathbb E \chi _{il}^{kx}(\mathcal {A})G_{kj}G_{lj}X+\sum _{r=2}^{2n}O_\prec (\mathcal E^{r})\cdot \mathcal P^{2n-r}\\&{=}{:}T_{1,2,1}+ \sum _{r=2}^{2n}O_\prec (\mathcal E^{r})\cdot \mathcal P^{2n-r}. \end{aligned} \end{aligned}$$
(4.33)

The term \(T_{1,2,1}\) contains the factor \(\mathcal {A}_{il}G_{lj}\), so we cannot use the smallness of \(\sum _l \mathcal {A}_{il}\) and the Ward identity at the same time. We (unfortunately) have to apply Lemma 2.2 again. Let us abbreviate \(F(\mathcal {A})=(1-\mathcal {A}_{ik})\mathcal {A}_{kx}(1-\mathcal {A}_{lx})G_{kj}G_{lj}X\). Lemma 2.2 implies

$$\begin{aligned} T_{1,2,1}=&\,-\frac{1}{(N-d)^2d^2q^2}\sum _{klxyz}\mathbb E \chi _{iy}^{lz}(\mathcal A)\textrm{D}_{il}^{yz} F(\mathcal {A})-\frac{1}{(N-d)^2q^2}\sum _{klx}\mathbb E F(\mathcal {A})\nonumber \\&+\frac{1}{(N-d)^2d^2q^2}\sum _{klx}\mathbb E (\mathcal {A}^3)_{il}F(\mathcal {A})+O(N^{-2}d^{-2})\cdot \sum _{klx}\mathbb E\mathcal M_{il}(F(\mathcal {A}))\nonumber \\&+\sum _{r=1}^{2n}O_\prec (\mathcal E^{r})\cdot \mathcal P^{2n-r}\nonumber \\ {=}{:}&\, T_{1,2,1,1}+\cdots +T_{1,2,1,4}+\sum _{r=1}^{2n}O_\prec (\mathcal E^{r})\cdot \mathcal P^{2n-r}\,. \end{aligned}$$
(4.34)

By (4.3) and (4.29) we have

$$\begin{aligned} T_{1,2,1,2}\prec N^{-2}d^{-1}\cdot \sum _{klx}\mathbb E\bigg ( |\mathcal {A}_{kx}G_{kj}G_{lj}|\cdot \sum _{r=2}^{2n} \mathcal E^{r-2} |P|^{2n-r} \bigg )\prec \sum _{r=2}^{2n}\mathcal E^r \mathcal P^{2n-r}.\nonumber \\ \end{aligned}$$
(4.35)

Note that the above estimate works because of the absence of \(\mathcal {A}_{il}\). Similarly, we can use (4.5) and the resolvent identity to show that

$$\begin{aligned} T_{1,2,1,1}+T_{1,2,1,4} \prec \sum _{r=2}^{2n}\mathcal E^r \mathcal P^{2n-r}. \end{aligned}$$
(4.36)

With the help of (3.4), (4.3) and (4.13), we have

$$\begin{aligned} \begin{aligned} T_{1,2,1,3}&\prec N^{-2}d^{-3}\sum _{klx} \mathbb E\bigg ( |(\mathcal {A}^3)_{il}\mathcal {A}_{kx}G_{kj}G_{lj}| \cdot \sum _{r=2}^{2n} \mathcal E^{r-2} |P|^{2n-r} \bigg )\\&\prec N^{-1}d^{-2}\sum _{l} \mathbb E\bigg ( |(\mathcal {A}^3)_{il}G_{lj}|\cdot \sum _{r=2}^{2n}\mathcal E^{r-1}\mathcal P^{2n-r}\bigg ) \\&\prec (1+\phi ) N^{-1}d^{-2}\sum _{l} \mathbb E\bigg ( |(\mathcal {A}^3)_{il}-N^{-1}d^3|\cdot \sum _{r=2}^{2n}\mathcal E^{r-1}\mathcal P^{2n-r}\bigg ) \\&\ + N^{-2}d \sum _{l} \mathbb E\bigg ( |G_{lj}|\cdot \sum _{r=2}^{2n}\mathcal E^{r-1}\mathcal P^{2n-r}\bigg )\prec \sum _{r=2}^{2n}\mathcal E^r \mathcal P^{2n-r}. \end{aligned} \end{aligned}$$
(4.37)

Combining (4.33)–(4.37) we get

$$\begin{aligned} T_{1,2}\prec \sum _{r=2}^{2n}\mathcal E^r \mathcal P^{2n-r}. \end{aligned}$$
(4.38)

4.3 Estimate of \(T_{1,3}\)

The estimates of \(T_{1,3}\) are very similar to those of \(T_{1,2}\). In \(T_{1,3}\), the factor \(G_{kj}\) is replaced by \(\textrm{D}_{ik}^{lx}G_{kj}\), which means we cannot use the Ward identity over the summation index k. However, we are compensated by the fact that \(\textrm{D}_{ik}^{lx}G_{kj}\) generates at least one factor of \(q^{-1}\), which is equally good in our estimates. Thus, by steps very similar to those used to estimate \(T_{1,2}\), it can be shown that

$$\begin{aligned} T_{1,3} \prec \sum _{r=2}^{2n} \mathcal E^{r}\mathcal P^{2n-r}. \end{aligned}$$
(4.39)

Combining (4.16), (4.17), (4.27), (4.38) and (4.39) yields

$$\begin{aligned} \hbox {(I)'}=-\mathbb E \underline{G} \!\,G_{ij}P^{2n-1}+\sum _{r=1}^{2n} O_{\prec }(\mathcal E^{r})\cdot \mathcal P^{2n-r}, \end{aligned}$$

and thus we have (4.12) as desired. This finishes the proof of Proposition 4.3.

5 Strong Self-Consistent Equation Near the Edge

To get a more precise description of the spectrum, let us define the shifted Stieltjes transform

$$\begin{aligned} \widehat{m}(z){:}{=}m\Big (z+\frac{d}{Nq}\Big ). \end{aligned}$$

We have

$$\begin{aligned} \widehat{m}^2+\Big (z+\frac{d}{Nq}\Big )\widehat{m}+1=0, \quad \hbox {and} \quad \widehat{m}(z)-m(z)=O(N^{-1/4}) \end{aligned}$$
(5.1)

uniformly for \(z \in {{\textbf {D}}}\). Let us write \(z=E+\textrm{i}\eta \) and \(\kappa {:}{=}|(E+\frac{d}{Nq})^2-4|\). It is easy to see that

$$\begin{aligned} {{\,\textrm{Im}\,}}\widehat{m}(z) \asymp {\left\{ \begin{array}{ll} \sqrt{\kappa +\eta } \quad &{}\hbox {if} \quad |E|\leqslant 2\\ \frac{\eta }{\sqrt{\kappa +\eta }} \quad &{}\hbox {if} \quad |E|>2. \end{array}\right. } \end{aligned}$$

Having the weak local law at hand, we can relate the entrywise law to the average law in the following sense. A standard consequence of Theorem 4.2 is the eigenvector delocalization Corollary 1.4, which together with (4.3) implies

$$\begin{aligned} \sum _{i}|G_{ij}|^2 =\frac{{{\,\textrm{Im}\,}}G_{jj}}{\eta }\prec \frac{{{\,\textrm{Im}\,}}\underline{G} \!\,}{\eta } \prec \frac{|\underline{G} \!\,-\widehat{m}|+{{\,\textrm{Im}\,}}\widehat{m}}{\eta }. \end{aligned}$$
(5.2)

Comparing (5.2) with (4.3), we see that the improved Ward identity (5.2) contains the term \(|\underline{G} \!\,-\widehat{m}|\) instead of \(|G_{jj}-m|\), and \(|\underline{G} \!\,-\widehat{m}|\) is expected to fluctuate on a smaller scale. In addition, by Theorem 4.2, the triangle inequality and the fact that \(m(z)=O(1)\), we have

$$\begin{aligned} \max _{ij}|G_{ij}| \prec 1 \end{aligned}$$
(5.3)

for all \(z \in {{\textbf {D}}}\). Using (5.2) and (5.3), instead of (4.3) and (4.13), we can redo the proof of Proposition 4.3 and show that

$$\begin{aligned} \max _{i,j}|\delta _{ij}+zG_{ij}+\underline{G} \!\,G_{ij}|\prec \sqrt{\frac{|\underline{G} \!\,-\widehat{m}|+{{\,\textrm{Im}\,}}\widehat{m}}{N\eta }+\frac{1}{d}}, \end{aligned}$$

and thus

$$\begin{aligned} \max _{i,j}\Big |\delta _{ij}+\Big (z+\frac{d}{Nq}\Big )G_{ij}+\underline{G} \!\,G_{ij}\Big |\prec \sqrt{\frac{|\underline{G} \!\,-\widehat{m}|+{{\,\textrm{Im}\,}}\widehat{m}}{N\eta }+\frac{1}{d}}. \end{aligned}$$

The above and (5.1) imply

$$\begin{aligned} \max _{ij}|G_{ij}-\widehat{m}\delta _{ij}| \prec |\underline{G} \!\,-\widehat{m}|+\sqrt{\frac{|\underline{G} \!\,-\widehat{m}|+{{\,\textrm{Im}\,}}\widehat{m}}{N\eta }+\frac{1}{d}}. \end{aligned}$$
(5.4)

In this section we shall prove the following result.

Proposition 5.1

Assume \(N^{\tau } \leqslant d \leqslant N/2\) for some fixed \(\tau >0\). Fix \(\delta \in (0,\tau /10)\). Let \(z\in {{\textbf {D}}}\), and suppose that \(|\underline{G} \!\,-\widehat{m}|\prec \psi \) and \(\max _{ij}|G_{ij}-\delta _{ij}\widehat{m}|\prec \psi +\sqrt{\mathcal E_1}\) for some deterministic \(\psi \in [N^{-1},1]\) at z, where

$$\begin{aligned} \mathcal E_1{:}{=}\mathcal E_2+\frac{1}{d}, \quad \hbox {and} \quad \mathcal E_2 {:}{=}\frac{\psi +{{\,\textrm{Im}\,}}\widehat{m}}{N\eta }. \end{aligned}$$

Then at z we have

$$\begin{aligned} 1+(z+d/(Nq))\underline{G} \!\,+\underline{G} \!\,^2\prec \mathcal E_1+\mathcal E_1^{1/4}\mathcal E_2^{1/2} (\psi +|z+d/(Nq)+2\widehat{m}|)^{1/2}+d^{-1/2}\psi {=}{:}\widehat{\mathcal E}. \end{aligned}$$

Fix \(n \geqslant 1\). Let us denote \(Q {:}{=}1+(z+d/(Nq))\underline{G} \!\,+\underline{G} \!\,^2\). By (4.2), we have \(Q=q^{-1}\underline{\mathcal {A}G} \!\,+d/(Nq) \cdot \underline{G} \!\,+\underline{G} \!\,^2+N^{-1}\), and thus

$$\begin{aligned} \begin{aligned} \mathbb E |Q|^{2n}&=\frac{1}{Nq}\sum _{ij}\mathbb E\mathcal {A}_{ij}G_{ji}Q^{n-1}\overline{Q}^n+\mathbb E (d/(Nq)\cdot \underline{G} \!\,+ \underline{G} \!\,^2)Q^{n-1}\overline{Q}^{n}\\&\quad +O(N^{-1})\cdot \mathbb E |Q|^{2n-1}\\&{=}{:}\hbox {(III)+(IV)}+O(N^{-1})\cdot \mathbb E |Q|^{2n-1}. \end{aligned} \end{aligned}$$

We denote \(\mathcal Q{:}{=}(\mathbb E|Q|^{2n})^{\frac{1}{2n}}\). It suffices to show that

$$\begin{aligned} \hbox {(III)+(IV)}\prec \sum _{r=1}^{2n} \widehat{\mathcal E}^{r}\mathcal Q^{2n-r}. \end{aligned}$$
(5.5)

To simplify notation, we shall drop the complex conjugates in (III)+(IV) (which play no role in the subsequent analysis), and prove

$$\begin{aligned} \hbox {(III)'+(IV)'}{:}{=}\frac{1}{Nq}\sum _{ij}\mathbb E\mathcal {A}_{ij}G_{ji}Q^{2n-1}+\mathbb E (d/(Nq)\cdot \underline{G} \!\,+ \underline{G} \!\,^2)Q^{2n-1}{\prec } \sum _{r=1}^{2n} \widehat{\mathcal E}^{r}\mathcal Q^{2n-r}\nonumber \\ \end{aligned}$$
(5.6)

instead of (5.5). By Lemma 2.2, we have

$$\begin{aligned} \hbox {(III)'}&=\frac{1}{(N-d)Ndq}\sum _{ijkl}\mathbb E \chi _{ik}^{jl}(\mathcal A)\textrm{D}_{ij}^{kl}(G_{ij}Q^{2n-1}) +\frac{d}{(N-d)Nq}\sum _{ij}\mathbb E G_{ij}Q^{2n-1}\nonumber \\&\quad -\frac{1}{(N-d)Ndq}\sum _{ij}\mathbb E (\mathcal {A}^3)_{ij}G_{ji}Q^{2n-1}+O(N^{-2}q^{-1})\cdot \sum _{ij:i\ne j}\mathbb E\mathcal M_{ij}(G_{ji}Q^{2n-1})\nonumber \\&\quad -\frac{1}{(N-d)Ndq}\sum _{ikl}\mathbb E \chi _{ik}^{il}(\mathcal A)\textrm{D}_{ii}^{kl}(G_{ii}Q^{2n-1})-\frac{d}{(N-d)Nq}\sum _{i}\mathbb E G_{ii}Q^{2n-1}\nonumber \\&\quad +\frac{1}{(N-d)Ndq}\sum _{i}\mathbb E (\mathcal {A}^3)_{ii}G_{ii}Q^{2n-1}{=}{:}\, S_1+\cdots +S_7. \end{aligned}$$
(5.7)

By the last relation of (4.2), it is easy to see that \(S_2=0\). Using Lemma 4.1 with \(r=3\), we have \(\sum _{ij}(\mathcal {A}^3)_{ij}G_{ij}={{\,\textrm{Tr}\,}}(\mathcal {A}^3G)\prec Nd^{3/2}\). Thus

$$\begin{aligned} S_3=O({N^{-2}d^{-3/2}})\cdot \mathbb E |{{\,\textrm{Tr}\,}}(\mathcal A^3G)Q^{2n-1}| \prec N^{-1}\mathcal Q^{2n-1}. \end{aligned}$$

Using the resolvent identity and (5.3), it can be easily shown that \(S_4 \prec \sum _{r=1}^{2n} \widehat{\mathcal E}^{r}\mathcal Q^{2n-r}\). In addition, we have \( S_5\prec \sum _{r=1}^{2n} \widehat{\mathcal E}^{r}\mathcal Q^{2n-r}, \) and

$$\begin{aligned} S_7= & {} \frac{1}{(N-d)Ndq}\sum _{ij}\mathbb E (\mathcal {A}^2)_{ij}\mathcal A_{ji}G_{ii}Q^{2n-1}\\= & {} \frac{d}{(N-d)N^2q}\sum _{ij}\mathbb E \mathcal A_{ji}G_{ii}Q^{2n-1}+O_{\prec }(\widehat{\mathcal E}) \mathcal Q^{2n-1}\\= & {} \frac{d^2}{(N-d)Nq}\mathbb E\underline{G} \!\,Q^{2n-1}+O_{\prec }(\widehat{\mathcal E}) \mathcal Q^{2n-1}, \end{aligned}$$

where in the second step we used (3.3). Thus \(S_6+S_7=-d/(Nq)\mathbb E\underline{G} \!\,Q^{2n-1}+O_{\prec }(\widehat{\mathcal E}) \mathcal Q^{2n-1}\). As a result, (5.7) simplifies to

$$\begin{aligned} \hbox {(III)'}=S_1-d/(Nq)\mathbb E\underline{G} \!\,Q^{2n-1}+\sum _{r=1}^{2n} O_{\prec }(\widehat{\mathcal E}^{r})\cdot \mathcal Q^{2n-r}. \end{aligned}$$
(5.8)
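The coefficient \(-d/(Nq)\) in (5.8) can be checked directly. Since \(\sum _i G_{ii}=N\underline{G} \!\,\), we have \(S_6=-\frac{d}{(N-d)q}\mathbb E\underline{G} \!\,Q^{2n-1}\), and adding the leading term of \(S_7\) gives the coefficient

$$\begin{aligned} -\frac{d}{(N-d)q}+\frac{d^2}{(N-d)Nq}=-\frac{d(N-d)}{(N-d)Nq}=-\frac{d}{Nq}. \end{aligned}$$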

To examine the terms in \(S_1\), we split

$$\begin{aligned} \begin{aligned}&S_1=\,\frac{1}{(N-d)Ndq}\sum _{ijkl}\mathbb E \chi _{ik}^{jl}(\mathcal A)\textrm{D}_{ij}^{kl}(G_{ij})Q^{2n-1}\\&\qquad +\frac{1}{(N-d)Ndq}\sum _{ijkl}\mathbb E \chi _{ik}^{jl}(\mathcal A)G_{ij}\textrm{D}_{ij}^{kl}(Q^{2n-1})\\&\qquad +\frac{1}{(N-d)Ndq}\sum _{ijkl}\mathbb E \chi _{ik}^{jl}(\mathcal A)\textrm{D}_{ij}^{kl}(G_{ij})\textrm{D}_{ij}^{kl}(Q^{2n-1}){=}{:}S_{1,1}+S_{1,2}+S_{1,3}. \end{aligned} \end{aligned}$$
(5.9)

5.1 Estimates of \(S_{1,2}\) and \(S_{1,3}\)

Let us first look at the interaction terms. As we shall see, the steps are much easier compared to those in Sect. 4.2, due to the smallness of \(\textrm{D}_{ij}^{kl}Q\). By (4.6) and (5.2), we have

$$\begin{aligned} q^{-1}\partial _{ij}^{kl}Q \prec q^{-1}|z+d/(Nq)+2\underline{G} \!\,|\cdot \frac{\psi +{{\,\textrm{Im}\,}}\widehat{m}}{N\eta } \prec q^{-1}(|z+d/(Nq)+2\widehat{m}|+\psi ) \mathcal E_2, \end{aligned}$$

and \(q^{-s}(\partial _{ij}^{kl})^sQ \prec q^{-s}\mathcal E_2\) for \(s \geqslant 2\). Together with (3.2), (4.5) and \(\textrm{D}_{ij}^{kl}Q \prec \widehat{\mathcal E}\) we get

$$\begin{aligned} \begin{aligned} \textrm{D}_{ij}^{kl}(Q^{2n-1})&\prec |\textrm{D}_{ij}^{kl} Q| \cdot \sum _{r=2}^{2n} \widehat{\mathcal E}^{r-2} |Q|^{2n-r}\\&\prec (q^{-1}(|z+d/(Nq)+2\widehat{m}|+\psi ) \mathcal E_2+q^{-2}\mathcal E_2\big ) \sum _{r=2}^{2n} \widehat{\mathcal E}^{r-2} |Q|^{2n-r}. \end{aligned} \end{aligned}$$
(5.10)

By (5.10) and \(\chi _{ik}^{jl}(\mathcal {A})\leqslant \mathcal {A}_{ik}\mathcal {A}_{jl}\), we get

$$\begin{aligned} S_{1,2}&\prec \frac{1}{N^2d^{3/2}}\sum _{ijkl} \mathbb E \bigg ( |\mathcal {A}_{ik}\mathcal {A}_{jl}G_{ij}| (q^{-1}(|z+d/(Nq)\nonumber \\&\quad +2\widehat{m}|+\psi )\mathcal E_2 +q^{-2}\mathcal E_2\big ) \sum _{r=2}^{2n} \widehat{\mathcal E}^{r-2} |Q|^{2n-r}\bigg )\nonumber \\&\prec \frac{(|z+d/(Nq)+2\widehat{m}|+\psi ) +q^{-1}}{N^2d^2}\sum _{i,j,k,l}\mathbb E\bigg (|\mathcal {A}_{ik}\mathcal {A}_{jl}G_{ij} \mathcal E_2| \sum _{r=2}^{2n} \widehat{\mathcal E}^{r-2} |Q|^{2n-r}\bigg )\nonumber \\&\prec \frac{(|z+d/(Nq)+2\widehat{m}|+\psi ) +q^{-1}}{N^2}\sum _{i,j}\mathbb E\bigg ( |G_{ij} \mathcal E_2|\sum _{r=2}^{2n} \widehat{\mathcal E}^{r-2} |Q|^{2n-r}\bigg )\nonumber \\&\quad \prec \sum _{r=2}^{2n} \widehat{\mathcal E}^{r} \mathcal Q^{2n-r}\,. \end{aligned}$$
(5.11)

Here in the last step we used (5.2) and Jensen’s inequality. Similarly, by (5.10), \(\chi _{ik}^{jl}(\mathcal {A})\leqslant \mathcal {A}_{ik}\mathcal {A}_{jl}\) and \(\textrm{D}_{ij}^{kl}G_{ij}\prec q^{-1}\), we have

$$\begin{aligned} S_{1,3}&\prec \frac{1}{N^2d^{3/2}}\sum _{ijkl} \mathbb E \bigg ( \mathcal {A}_{ik}\mathcal {A}_{jl}q^{-1} \big (q^{-1}(|z+d/(Nq)+2\widehat{m}|+\psi )\mathcal E_2\nonumber \\&\quad +q^{-2} \mathcal E_2\big ) \sum _{r=2}^{2n}\widehat{\mathcal E}^{r-2} |Q|^{2n-r}\bigg )\nonumber \\&\prec \frac{(|z+d/(Nq)+2\widehat{m}|+\psi ) +q^{-1}}{N^2d^{5/2}}\sum _{i,j,k,l}\mathbb E\bigg ( |\mathcal {A}_{ik}\mathcal {A}_{jl} \mathcal E_2|\sum _{r=2}^{2n} \widehat{\mathcal E}^{r-2} |Q|^{2n-r}\bigg )\nonumber \\&\prec \sum _{r=2}^{2n} \widehat{\mathcal E}^{r} \mathcal Q^{2n-r}\,. \end{aligned}$$
(5.12)

5.2 Computation of \(S_{1,1}\)

The computation of \(S_{1,1}\) is similar to that of \(T_{1,1}\) in Sect. 4.1. Applying (4.5) with \(\ell =2\), we get

$$\begin{aligned}&S_{1,1}=\ \frac{1}{(N-d)Ndq^2}\sum _{ijkl}\mathbb E \chi _{ik}^{jl}(\mathcal A)\partial _{ij}^{kl}(G_{ij})Q^{2n-1}\\&\qquad +\frac{1}{2(N-d)Ndq^3}\sum _{ijkl}\mathbb E \chi _{ik}^{jl}(\mathcal A)((\partial _{ij}^{kl})^2 G_{ij})Q^{2n-1}\\&\qquad +\frac{1}{6(N-d)Ndq^4}\sum _{ijkl}\mathbb E \chi _{ik}^{jl}(\mathcal A)((\partial _{ij}^{kl})^3G_{ij}(\mathcal A+\theta \xi _{ij}^{kl}))Q^{2n-1}\\&\quad {=}{:}S_{1,1,1}+S_{1,1,2}+S_{1,1,3} \end{aligned}$$

for some \(\theta \in [0,1]\).

Let us first compute \(S_{1,1,1}\). By (4.6), we get

$$\begin{aligned} S_{1,1,1}= & {} \frac{1}{(N-d)Ndq^2}\sum _{ijkl}\mathbb E \chi _{ik}^{jl}(\mathcal A)\big (-G_{ii}G_{jj}-G_{ij}G_{ji}-G_{ik}G_{lj}-G_{il}G_{kj}\nonumber \\{} & {} +G_{ii}G_{kj}+G_{ik}G_{ij}+G_{ij}G_{lj}+G_{il}G_{jj}\big )Q^{2n-1}{=}{:}\sum _{s=1}^{8}S_{1,1,1,s}. \end{aligned}$$
(5.13)

Recall the definition of \(\chi \) in (2.2). We have

$$\begin{aligned} \begin{aligned} S_{1,1,1,1}&=-\frac{1}{(N-d)Ndq^2}\sum _{ij}\mathbb E\big (\mathcal {A}_{ij}(\mathcal {A}^3)_{ij}+d^2-d^2\mathcal {A}_{ij}-(\mathcal {A}^3)_{ij}\big )G_{ii}G_{jj}Q^{2n-1}\\&=-\frac{1}{(N-d)Ndq^2}\sum _{ij}\mathbb E\Big [\big (\mathcal {A}_{ij}(\mathcal {A}^3)_{ij}+d^2-d^2\mathcal {A}_{ij}-(\mathcal {A}^3)_{ij}\big )\\&\quad \cdot \big ((G_{ii}-\underline{G} \!\,)(G_{jj}-\underline{G} \!\,)+\underline{G} \!\,(G_{ii}-\underline{G} \!\,)+\underline{G} \!\,(G_{jj}-\underline{G} \!\,)+\underline{G} \!\,^2\big )Q^{2n-1}\Big ]\\&{=}{:}\, S_{1,1,1,1,1}+\cdots +S_{1,1,1,1,4}. \end{aligned} \end{aligned}$$
(5.14)
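The first line of (5.14) comes from summing \(\chi \) over k and l. As a sketch, assume that \(\chi _{ik}^{jl}(\mathcal A)=\mathcal {A}_{ik}\mathcal {A}_{jl}(1-\mathcal {A}_{ij})(1-\mathcal {A}_{kl})\); this is the form suggested by the expansion of \(\chi \) used for \(S_{1,1,1,3}\) later in this section, but the definition (2.2) itself is not restated here. Using \(\sum _k\mathcal {A}_{ik}=d\), one then finds

$$\begin{aligned} \sum _{kl}\chi _{ik}^{jl}(\mathcal A)&=(1-\mathcal {A}_{ij})\Big (\sum _{k}\mathcal {A}_{ik}\sum _{l}\mathcal {A}_{jl}-\sum _{kl}\mathcal {A}_{ik}\mathcal {A}_{kl}\mathcal {A}_{lj}\Big )\\&=(1-\mathcal {A}_{ij})\big (d^2-(\mathcal {A}^3)_{ij}\big )\\&=\mathcal {A}_{ij}(\mathcal {A}^3)_{ij}+d^2-d^2\mathcal {A}_{ij}-(\mathcal {A}^3)_{ij}. \end{aligned}$$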

Similarly to (4.20), using (3.4), we get

$$\begin{aligned} S_{1,1,1,1,1}&=-\frac{1}{(N-d)Ndq^2}\sum _{ij}\mathbb E \cdot \Big [\big (\mathcal {A}_{ij}d^3N^{-1}+d^2-d^2\mathcal {A}_{ij}-d^3N^{-1}\big )(G_{ii}-\underline{G} \!\,)\nonumber \\&\quad \times (G_{jj}-\underline{G} \!\,)Q^{2n-1}\Big ]+O_{\prec }(N^{-1/2}+d^{-1})(\psi ^2+\mathcal E_1)\mathcal Q^{2n-1} \nonumber \\&=\ \frac{d}{N^2q^2}\sum _{ij}\mathbb E \mathcal {A}_{ij} (G_{ii}-\underline{G} \!\,)(G_{jj}-\underline{G} \!\,)Q^{2n-1}+O_{\prec }(\widehat{\mathcal E})\mathcal Q^{2n-1}\,. \end{aligned}$$
(5.15)

Here in the last step we used \(\sum _i(G_{ii}-\underline{G} \!\,)=0\). Let us denote \(\widetilde{F}(\mathcal {A}){:}{=}(G_{ii}-\underline{G} \!\,)(G_{jj}-\underline{G} \!\,)Q^{2n-1}\). Applying Lemma 2.2 to the first term on RHS of (5.15), we get

$$\begin{aligned} S_{1,1,1,1,1}&=\ \frac{1}{(N-d)N^2q^2}\sum _{ijkl}\mathbb E \chi _{ik}^{jl}(\mathcal A)\textrm{D}_{ij}^{kl} \widetilde{F}(\mathcal {A}) +\frac{d^2}{(N-d)N^2q^2}\sum _{ij}\mathbb E \widetilde{F}(\mathcal {A})\nonumber \\&\quad -\frac{1}{(N-d)N^2q^2}\sum _{ij}\mathbb E (\mathcal {A}^3)_{ij}\widetilde{F}(\mathcal {A})+O(dN^{-3}q^{-2})\cdot \sum _{ij}\mathbb E\mathcal M_{ij}(\widetilde{F}(\mathcal {A}))\nonumber \\&\quad +\sum _{r=1}^{2n}O_{\prec }(\widehat{\mathcal E}^r)\mathcal Q^{2n-r}\nonumber \\&= \frac{1}{(N-d)N^2q^2}\sum _{ijkl}\mathbb E \chi _{ik}^{jl}(\mathcal A)\textrm{D}_{ij}^{kl} \widetilde{F}(\mathcal {A})\nonumber \\&\quad -\frac{1}{(N-d)N^2q^2}\sum _{ij}\mathbb E (\mathcal {A}^3)_{ij}\widetilde{F}(\mathcal {A})+\sum _{r=1}^{2n}O_{\prec }(\widehat{\mathcal E}^r)\mathcal Q^{2n-r}\,, \end{aligned}$$
(5.16)

where in the second step we used \(\sum _{ij}\widetilde{F}(\mathcal {A})=0\). By (3.2), (4.5), (4.6), and \(\textrm{D}_{ij}^{kl}Q \prec q^{-1}\widehat{\mathcal E}\), we have

$$\begin{aligned} \textrm{D}_{ij}^{kl} \widetilde{F}(\mathcal {A})\prec & {} ((\psi +\sqrt{\mathcal E_1})q^{-1}+q^{-2})\sum _{r=1}^{2n} \widehat{\mathcal E}^{r-1}|Q|^{2n-r}\\{} & {} +(\psi +\sqrt{\mathcal E_1})^2q^{-1}\sum _{r=2}^{2n} \widehat{\mathcal E}^{r-1}|Q|^{2n-r} \prec \sum _{r=1}^{2n}\widehat{\mathcal E}^{r}|Q|^{2n-r}, \end{aligned}$$

which implies

$$\begin{aligned} \frac{1}{(N-d)N^2q^2}\sum _{ijkl}\mathbb E \chi _{ik}^{jl}(\mathcal A)\textrm{D}_{ij}^{kl} \widetilde{F}(\mathcal {A})\prec & {} \frac{1}{N^3d}\sum _{ijkl} \mathbb E\bigg ( \mathcal {A}_{ik}\mathcal {A}_{jl}\sum _{r=1}^{2n} \widehat{\mathcal E}^{r}|Q|^{2n-r}\bigg ) \nonumber \\\prec & {} \sum _{r=1}^{2n}\widehat{\mathcal E}^{r} \mathcal Q^{2n-r}. \end{aligned}$$
(5.17)

In addition, (3.4) and \(\sum _{ij}\widetilde{F}(\mathcal {A})=0\) imply

$$\begin{aligned} -\frac{1}{(N-d)N^2q^2}\sum _{ij}\mathbb E (\mathcal {A}^3)_{ij}\widetilde{F}(\mathcal {A})+O_{\prec }(\widehat{\mathcal E})\mathcal Q^{2n-1} \prec \widehat{\mathcal E} \mathcal Q^{2n-1}. \end{aligned}$$
(5.18)

Combining (5.16)–(5.18) we get

$$\begin{aligned} S_{1,1,1,1,1} \prec \sum _{r=1}^{2n}\widehat{\mathcal E}^{r}\mathcal Q^{2n-r}. \end{aligned}$$
(5.19)

Similarly to (5.15), we can show that

$$\begin{aligned} S_{1,1,1,1,2}=\frac{d}{N^2q^2}\sum _{ij}\mathbb E \mathcal {A}_{ij} \underline{G} \!\,(G_{ii}-\underline{G} \!\,)Q^{2n-1}+O_{\prec }(\widehat{\mathcal E})\mathcal Q^{2n-1}. \end{aligned}$$

By first summing over j and then summing over i, the first term on RHS of the above vanishes, and thus

$$\begin{aligned} S_{1,1,1,1,2}\prec \sum _{r=1}^{2n} \widehat{\mathcal E}^{r}\mathcal Q^{2n-r}. \end{aligned}$$
(5.20)

Similarly,

$$\begin{aligned} S_{1,1,1,1,3} \prec \sum _{r=1}^{2n} \widehat{\mathcal E}^{r}\mathcal Q^{2n-r}. \end{aligned}$$
(5.21)

Moreover, applying (3.5) with \(r=4\), we have

$$\begin{aligned} \begin{aligned}&S_{1,1,1,1,4}\\&\quad =-\frac{1}{(N-d)Ndq^2}\mathbb E( {{\,\textrm{Tr}\,}}\mathcal {A}^4+N^2d^2-Nd^3-Nd^3)\underline{G} \!\,^2 Q^{2n-1}\\&\quad =-\mathbb E \underline{G} \!\,^2 Q^{2n-1}\\&\qquad +\sum _{r=1}^{2n} O_{\prec }(\widehat{\mathcal E}^{r})\cdot \mathcal Q^{2n-r}. \end{aligned} \end{aligned}$$
(5.22)
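The second equality in (5.22) can be checked as follows. This is a sketch under two assumptions that are not restated in this section: the normalization \(q^2=d(N-d)/N\), so that \((N-d)Ndq^2=d^2(N-d)^2\), and the bound \({{\,\textrm{Tr}\,}}\mathcal {A}^4-d^4\prec d^2N+N^2\), which appears later in this section as a consequence of (3.5). Using also \(N-d\geqslant N/2\), we get

$$\begin{aligned} \frac{d^4+N^2d^2-2Nd^3}{(N-d)Ndq^2}=\frac{d^2(N-d)^2}{d^2(N-d)^2}=1 \quad \hbox {and} \quad \frac{{{\,\textrm{Tr}\,}}\mathcal {A}^4-d^4}{(N-d)Ndq^2}\prec \frac{d^2N+N^2}{d^2N^2}=\frac{1}{N}+\frac{1}{d^2}. \end{aligned}$$

Hence the prefactor of \(-\mathbb E\underline{G} \!\,^2Q^{2n-1}\) equals 1 up to a relative error of order \(N^{-1}+d^{-2}\).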

Inserting (5.19)–(5.22) into (5.14), we get

$$\begin{aligned} S_{1,1,1,1}=-\mathbb E \underline{G} \!\,^2 Q^{2n-1}+\sum _{r=1}^{2n} O_{\prec }(\widehat{\mathcal E}^{r})\cdot \mathcal Q^{2n-r}. \end{aligned}$$
(5.23)

When \(s=2,\ldots ,8\), the estimates of \(S_{1,1,1,s}\) are relatively simple. By \(\chi _{ik}^{jl}\leqslant \mathcal {A}_{ik}\mathcal {A}_{jl}\) and first summing over indices kl, it is not hard to see that \(S_{1,1,1,2}\prec \widehat{\mathcal E}\mathcal Q^{2n-1}\). For the next term, we have

$$\begin{aligned}&S_{1,1,1,3}=-\frac{1}{(N-d)Ndq^2}\sum _{ijkl}\mathbb E (\mathcal {A}_{ik}\mathcal {A}_{lj}-\mathcal {A}_{ik}\mathcal {A}_{jl}\mathcal {A}_{lk}\\&\qquad -\mathcal {A}_{ik}\mathcal {A}_{jl}\mathcal {A}_{ij}+\mathcal {A}_{ik}\mathcal {A}_{jl}\mathcal {A}_{ij}\mathcal {A}_{lk})G_{ik}G_{lj}Q^{2n-1}\\&\quad =-\frac{1}{(N-d)Ndq^2}\sum _{ijkl}\mathbb E \mathcal {A}_{ik}\mathcal {A}_{jl}\mathcal {A}_{ij}\mathcal {A}_{lk}G_{ik}G_{lj}Q^{2n-1}\\&\qquad +O(d^{-1})\mathcal Q^{2n-1}\,, \end{aligned}$$

where in the second step we used Lemma 4.1 with \(r=1\). The first term on RHS of the above can be bounded by

$$\begin{aligned}{} & {} O(N^{-2}d^{-2}) \cdot \mathbb E \Big (\sum _{ijkl}|\mathcal {A}_{ik}\mathcal {A}_{lk}G^2_{jl}|\Big )^{1/2}\\{} & {} \quad \Big (\sum _{ijkl}|\mathcal {A}_{jl}\mathcal {A}_{ij}G^2_{ik}|\Big )^{1/2}|Q|^{2n-1}\prec \widehat{\mathcal E} \mathcal Q^{2n-1}. \end{aligned}$$

Hence \(S_{1,1,1,3}\prec \widehat{\mathcal E}\mathcal Q^{2n-1}\). Similarly \(S_{1,1,1,4}\prec \widehat{\mathcal E}\mathcal Q^{2n-1}\). We have

$$\begin{aligned} S_{1,1,1,5}= & {} \frac{1}{(N-d)Ndq^2}\sum _{ijkl}\mathbb E \chi _{ik}^{jl}(\mathcal {A}) (G_{ii}-\underline{G} \!\,)G_{kj}Q^{2n-1}\\{} & {} +\frac{1}{(N-d)Ndq^2}\sum _{ijkl}\mathbb E \chi _{ik}^{jl}(\mathcal {A}) \underline{G} \!\,G_{kj}Q^{2n-1}. \end{aligned}$$

By (5.2), the first term on the RHS of the above can be bounded by

$$\begin{aligned}{} & {} O_{\prec }(N^{-2}d^{-2})(\psi +\sqrt{\mathcal E_1})\sum _{ijkl} \mathbb E \mathcal {A}_{ik}\mathcal {A}_{jl} |G_{kj}||Q|^{2n-1} \\{} & {} \quad \prec N^{-2}(\psi +\sqrt{\mathcal E_1})\sum _{jk} \mathbb E |G_{kj}||Q|^{2n-1} \prec \widehat{\mathcal E}\mathcal Q^{2n-1}; \end{aligned}$$

the second term on RHS can be estimated by

$$\begin{aligned} \begin{aligned}&\,\frac{1}{(N-d)Ndq^2}\sum _{kj} \mathbb E( d^2-2d (\mathcal {A}^2)_{jk}+ (\mathcal {A}^2)_{jk}^2 )\underline{G} \!\,G_{kj}Q^{2n-1}\\&\quad =\,\frac{1}{(N-d)Ndq^2} \sum _{kj} \mathbb E (\mathcal {A}^2)_{jk}^2 \underline{G} \!\,G_{kj}Q^{2n-1}+O(N^{-1})\mathcal Q^{2n-1}\\&\quad =\,\frac{1}{(N-d)Ndq^2} \sum _{kj} \mathbb E \big (((\mathcal {A}^2)_{jk}-d^2N^{-1})^2+2d^2N^{-1}((\mathcal {A}^2)_{jk}-d^2N^{-1})\big )\\&\qquad \times \underline{G} \!\,G_{kj}Q^{2n-1}+O(N^{-1})\mathcal Q^{2n-1}\\&\quad \prec \, N^{-2}d^{-2}\mathbb E \sum _{kj}((\mathcal {A}^2)_{kj}-d^2N^{-1})^2 |Q|^{2n-1}+N^{-3} \mathbb E \Big (\sum _{kj}((\mathcal {A}^2)_{kj}-d^2N^{-1})^2\\&\qquad \times \sum _{kj} |G^2_{kj}| \Big )^{1/2} |Q|^{2n-1} +O(N^{-1})\mathcal Q^{2n-1} \prec \widehat{\mathcal E} \mathcal Q^{2n-1}. \end{aligned} \end{aligned}$$

Here in the first step we used \(\sum _{k}G_{kj}=0\) and Lemma 4.1, in the second step we used \(\sum _{k}G_{kj}=0\), and in the last step we used

$$\begin{aligned} \sum _{kj}( (\mathcal {A}^2)_{kj}-d^2N^{-1})^2={{\,\textrm{Tr}\,}}\mathcal {A}^4 -d^4 \prec d^2 N+N^2 \end{aligned}$$

which is a consequence of (3.5). Thus \(S_{1,1,1,5}\prec \widehat{\mathcal E}\mathcal Q^{2n-1}\), and similarly we have \(S_{1,1,1,8}\prec \widehat{\mathcal E}\mathcal Q^{2n-1}\). Next, we have

$$\begin{aligned} S_{1,1,1,6}&=\frac{1}{(N-d)Ndq^2}\sum _{ijk}\mathbb E (d\mathcal {A}_{ik}-\mathcal {A}_{ik}(\mathcal {A}^2)_{jk}-d\mathcal {A}_{ik}\mathcal {A}_{ij}\\&\quad +(\mathcal {A}^2)_{jk}\mathcal {A}_{ik}\mathcal {A}_{ij})G_{ik}G_{ij}Q^{2n-1}\\&=\frac{1}{(N-d)Ndq^2}\mathbb E \Big (-\sum _{ik} \mathcal {A}_{ik}G_{ik}(\mathcal {A}^2G)_{ik}-d\sum _i (\mathcal {A}G)_{ii}^2\\&\quad +\sum _{ijk}(\mathcal {A}^2)_{jk}\mathcal {A}_{ik}\mathcal {A}_{ij}G_{ik}G_{ij}\Big )Q^{2n-1}\\&=\frac{1}{(N-d)Ndq^2}\sum _{ijk}\mathbb E (\mathcal {A}^2)_{jk}\mathcal {A}_{ik}\mathcal {A}_{ij}G_{ik}G_{ij}Q^{2n-1} +O_{\prec }(\widehat{\mathcal E})\mathcal Q^{2n-1}\\&\prec N^{-2}d^{-2} \mathbb E \Big [|Q|^{2n-1} \sum _{jk}(\mathcal {A}^2)_{jk}\sum _{i}|G_{ik}G_{ij}|\Big ]+\widehat{\mathcal E}\mathcal Q^{2n-1} \prec \widehat{\mathcal E}\mathcal Q^{2n-1}, \end{aligned}$$

where in the second step we used \(\sum _j G_{ij}=0\), and in the third step we used Lemma 4.1. Similarly, we also have \(S_{1,1,1,7}\prec \widehat{\mathcal E}\mathcal Q^{2n-1}\).

We have now finished the estimates of \(S_{1,1,1,s}\) for all \(s =2,\ldots ,8\). Together with (5.13) and (5.23) we get

$$\begin{aligned} S_{1,1,1}=-\mathbb E \underline{G} \!\,^2 Q^{2n-1}+\sum _{r=1}^{2n} O_{\prec }(\widehat{\mathcal E}^{r})\cdot \mathcal Q^{2n-r}. \end{aligned}$$
(5.24)

The estimate of \(S_{1,1,2}\) is very similar to those of \(S_{1,1,1,2},\ldots ,S_{1,1,1,8}\): by (4.6), there is at least one off-diagonal factor of G in every term of \(S_{1,1,2}\). In addition, compared to \(S_{1,1,1,2},\ldots ,S_{1,1,1,8}\), there is an extra factor of \(q^{-1}\prec \widehat{\mathcal E}^{1/2}\) in \(S_{1,1,2}\). Thus we can show that

$$\begin{aligned} S_{1,1,2}\prec \widehat{\mathcal E} \mathcal Q^{2n-1}. \end{aligned}$$
(5.25)

By the resolvent identity, (4.6) and \(\max _{ij}|G_{ij}| \prec 1\), it is not hard to see that \((\partial _{ij}^{kl})^3G_{ij}(\mathcal A+\theta \xi _{ij}^{kl})\prec 1\), hence

$$\begin{aligned} S_{1,1,3}\prec N^{-2}d^{-3}\sum _{ijkl} \mathbb E \mathcal {A}_{ik}\mathcal {A}_{jl}|Q|^{2n-1}\prec \widehat{\mathcal E}\mathcal Q^{2n-1}. \end{aligned}$$
(5.26)

Combining (5.24)–(5.26) we have

$$\begin{aligned} S_{1,1}=-\mathbb E \underline{G} \!\,^2 Q^{2n-1}+\sum _{r=1}^{2n} O_{\prec }(\widehat{\mathcal E}^{r})\cdot \mathcal Q^{2n-r}. \end{aligned}$$
(5.27)

Inserting (5.9), (5.11), (5.12) and (5.27) into (5.8), we get

$$\begin{aligned} \hbox {(III)'}=-\mathbb E \underline{G} \!\,^2 Q^{2n-1}-d/(Nq)\mathbb E\underline{G} \!\,Q^{2n-1}+\sum _{r=1}^{2n} O_{\prec }(\widehat{\mathcal E}^{r})\cdot \mathcal Q^{2n-r}. \end{aligned}$$

Since \(\hbox {(IV)'}{:}{=}\mathbb E (d/(Nq)\cdot \underline{G} \!\,+ \underline{G} \!\,^2)Q^{2n-1}\), we have finished the proof of (5.6). This concludes the proof of Proposition 5.1.

6 Edge Rigidity and Universality

Throughout this section we assume

$$\begin{aligned} N^{2/3+\tau }\leqslant d \leqslant N/2 \end{aligned}$$
(6.1)

for some fixed \(\tau >0\), and fix parameters

$$\begin{aligned} \mu \in (0,\tau /100), \quad \delta \in (0,\mu /10), \quad \nu \in (0,\delta /10). \end{aligned}$$
(6.2)

We abbreviate

$$\begin{aligned} A {:}{=}q^{-1}\mathcal {A}. \end{aligned}$$

We shall prove Theorems 1.1 and 1.3 at the right edge of the spectrum; the left edge case follows analogously.

6.1 Improved estimate of averaged Green function

Recall the notion of \({{\textbf {D}}}\) in (4.7). Let us define the regime

$$\begin{aligned} {{\textbf {S}}}\equiv {{\textbf {S}}}_\delta {:}{=}\{z=E+\textrm{i}\eta : 2-d/(Nq)+N^{-2/3+\delta }\leqslant E \leqslant \delta ^{-1}, N^{-2/3}\leqslant \eta \leqslant \delta ^{-1}\}\subset {{\textbf {D}}}, \end{aligned}$$

and we use \(\kappa \equiv \kappa (E){:}{=}|(E+d/(Nq))^2-4|\) to denote the distance to the edge. We first prove the following consequence of Theorem 4.2 and Proposition 5.1.

Proposition 6.1

We have

$$\begin{aligned} \begin{aligned} |\underline{G} \!\,-\widehat{m}| \prec&\ \frac{1}{N(\kappa +\eta )}+\frac{1}{d(\kappa +\eta )^{1/2}}+\frac{1}{N^2(\kappa +\eta )^{5/2}}+\frac{1}{(N\eta )^2(\kappa +\eta )^{1/2}}\\&+\frac{1}{N^{2/3}(\kappa +\eta )^{1/2}} \end{aligned} \end{aligned}$$
(6.3)

for \(z \in {{\textbf {S}}}\), and

$$\begin{aligned} |\underline{G} \!\,-\widehat{m}|\prec \frac{1}{N\eta }+\frac{1}{d^{1/2}}+\frac{(\kappa +\eta )^{1/6}}{(N\eta )^{2/3}} \end{aligned}$$
(6.4)

for all \(z \in {{\textbf {D}}}\). In addition, we have

$$\begin{aligned} \max _{ij} |G_{ij}-\delta _{ij}\widehat{m}|\prec \frac{1}{(N\eta )^{1/2}}+\frac{1}{d^{1/2}} \end{aligned}$$
(6.5)

for all \(z \in {{\textbf {D}}}\).

Proof

Since for each fixed E, the function \(\eta \mapsto \widehat{\mathcal E}(E+\textrm{i}\eta )\) is non-increasing for \(\eta >0\), a standard stability analysis (see e.g. [7, Lemma 5.4]) and Proposition 5.1 imply

$$\begin{aligned} |\underline{G} \!\,-\widehat{m}| \prec \frac{\widehat{\mathcal E}}{\sqrt{\widehat{\mathcal E}+\kappa +\eta }} \end{aligned}$$
(6.6)

for all \(z \in {{\textbf {D}}}\).

  1. (i)

    Let \(z \in {{\textbf {S}}}\). Recall the definition of \(\widehat{\mathcal E}\) in Proposition 5.1. We have

    $$\begin{aligned} \kappa \asymp E+d/(Nq)-2, \quad {{\,\textrm{Im}\,}}\widehat{m}\asymp \frac{\eta }{(\kappa +\eta )^{1/2}}\quad \hbox {and} \quad |z+d/(Nq)+2\widehat{m}|\asymp (\kappa +\eta )^{1/2}. \end{aligned}$$

    Together with Young’s inequality, this gives

    $$\begin{aligned} \widehat{\mathcal E}&\prec \mathcal E_1+\mathcal E_2^{2/3}(\psi +|z+d/(Nq) +2\widehat{m}|)^{2/3}+d^{-1/2}\psi \nonumber \\&\prec \frac{\psi }{N\eta }+\frac{1}{N(\kappa +\eta )^{1/2}}+\frac{1}{d} +\Big (\frac{\psi }{N\eta }+\frac{1}{N(\kappa +\eta )^{1/2}}\Big )^ {2/3}\nonumber \\&\quad \times (\psi +(\kappa +\eta )^{1/2})^{2/3}+\frac{\psi }{d^{1/2}} \nonumber \\&\prec \frac{\psi }{N\eta }+\frac{1}{N(\kappa +\eta )^{1/2}}+\frac{1}{d}+\frac{\psi ^{4/3}}{(N\eta )^{2/3}}+\frac{\psi ^{2/3}}{N^{2/3} (\kappa +\eta )^{1/3}}\nonumber \\&\quad +\frac{\psi ^{2/3}(\kappa +\eta )^{1/3}}{(N\eta )^{2/3}} +\frac{1}{N^{2/3}}+\frac{\psi }{d^{1/2}}\,. \end{aligned}$$
    (6.7)

    By (6.6) and the fact that \(x \mapsto x/\sqrt{x+\kappa +\eta }\) is increasing, we know that

    $$\begin{aligned} |\underline{G} \!\,-\widehat{m} |&\prec \ \frac{\psi }{N\eta (\kappa +\eta )^{1/2}}+\frac{1}{N(\kappa +\eta )} +\frac{1}{d(\kappa +\eta )^{1/2}}\nonumber \\&\quad +\frac{\psi }{(N\eta )^{1/2}(\kappa +\eta )^{1/4}}+\frac{\psi ^{2/3}}{N^{2/3}(\kappa +\eta )^{5/6}} +\frac{\psi ^{2/3}}{(N\eta )^{2/3}(\kappa +\eta )^{1/6}}\nonumber \\&\quad +\frac{1}{N^{2/3}(\kappa +\eta )^{1/2}} +\frac{\psi }{d^{1/2}(\kappa +\eta )^{1/2}} \nonumber \\&\prec \ \frac{1}{N(\kappa +\eta )}+\frac{1}{d(\kappa +\eta )^{1/2}}+\frac{1}{N^2(\kappa +\eta )^{5/2}}\nonumber \\&\quad +\frac{1}{(N\eta )^2(\kappa +\eta )^{1/2}}+\frac{1}{N^{2/3}(\kappa +\eta )^{1/2}}+N^{-\nu }\psi \end{aligned}$$
    (6.8)

    provided that \(|\underline{G} \!\,-\widehat{m}|\prec \psi \). Here in the first step the fourth term is obtained through

    $$\begin{aligned}{} & {} \frac{\psi ^{4/3}}{(N\eta )^{2/3}} \cdot (\widehat{\mathcal E}+\kappa +\eta )^{-1/2}\leqslant \frac{\psi ^{4/3}}{(N\eta )^{2/3}} \cdot \\{} & {} \bigg (\frac{\psi ^{4/3}}{(N\eta )^{2/3}}\bigg )^{-1/4} \cdot (\kappa +\eta )^{-1/4}=\frac{\psi }{(N\eta )^{1/2}(\kappa +\eta )^{1/4}}, \end{aligned}$$

    and in the last step we used \(\kappa +\eta \geqslant N^{-2/3+\delta }\), \(\eta \geqslant N^{-2/3}\) and \(d \geqslant N^{2/3+\tau }\). Iterating (6.8), we obtain (6.3).

  2. (ii)

    Let \(z \in {{\textbf {D}}}\). We have

    $$\begin{aligned} {{\,\textrm{Im}\,}}\widehat{m} =O(\sqrt{\kappa +\eta })\quad \hbox {and} \quad \quad |z+d/(Nq)+2\widehat{m}| \asymp (\kappa +\eta )^{1/2}. \end{aligned}$$

    Similar to (6.7), we get

    $$\begin{aligned} \begin{aligned} \widehat{\mathcal E}&\prec \frac{\psi }{N\eta }+\frac{(\kappa +\eta )^{1/2}}{N\eta }+\frac{1}{d}+\Big (\frac{\psi }{N\eta }+\frac{(\kappa +\eta )^{1/2}}{N\eta }\Big )^{2/3}(\psi +(\kappa +\eta )^{1/2})^{2/3}+\frac{\psi }{d^{1/2}}\\&\prec \frac{\psi }{N\eta }+\frac{(\kappa +\eta )^{1/2}}{N\eta }+\frac{1}{d}+\frac{\psi ^{4/3}}{(N\eta )^{2/3}}+\frac{(\kappa +\eta )^{2/3}}{(N\eta )^{2/3}}+\frac{\psi }{d^{1/2}}. \end{aligned} \end{aligned}$$

    By (6.6) and the fact that \(x \mapsto x/\sqrt{x+\kappa +\eta }\) is increasing, we get

    $$\begin{aligned} \begin{aligned} |\underline{G} \!\,-\widehat{m}|&\prec \Big (\frac{\psi }{N\eta }\Big )^{1/2}+\frac{1}{N\eta }+\frac{1}{d^{1/2}}+\frac{\psi ^{2/3}}{(N\eta )^{1/3}}+\frac{(\kappa +\eta )^{1/6}}{(N\eta )^{2/3}}+\frac{\psi ^{1/2}}{d^{1/4}} \end{aligned} \end{aligned}$$

    provided that \(|\underline{G} \!\,-\widehat{m}| \prec \psi \). Iterating the above yields (6.4) as desired.

  3. (iii)

    The estimate (6.5) is a direct consequence of (5.4) and (6.4).

\(\square \)
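The last step of (6.7) in the proof above rests only on the elementary inequality \((a+b)^{2/3}\leqslant a^{2/3}+b^{2/3}\) for \(a,b\geqslant 0\). Written out as a sketch, with no input beyond the bounds already displayed in (6.7):

$$\begin{aligned}&\Big (\frac{\psi }{N\eta }+\frac{1}{N(\kappa +\eta )^{1/2}}\Big )^{2/3}\big (\psi +(\kappa +\eta )^{1/2}\big )^{2/3}\\&\quad \leqslant \Big (\frac{\psi ^{2/3}}{(N\eta )^{2/3}}+\frac{1}{N^{2/3}(\kappa +\eta )^{1/3}}\Big )\big (\psi ^{2/3}+(\kappa +\eta )^{1/3}\big )\\&\quad =\frac{\psi ^{4/3}}{(N\eta )^{2/3}}+\frac{\psi ^{2/3}(\kappa +\eta )^{1/3}}{(N\eta )^{2/3}}+\frac{\psi ^{2/3}}{N^{2/3}(\kappa +\eta )^{1/3}}+\frac{1}{N^{2/3}}. \end{aligned}$$

These are exactly the fourth to seventh terms in the last line of (6.7).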

6.2 Proof of Theorem 1.1

We shall need the following bound on the magnitude of \(\lambda _2,\lambda _N\) as an input, which follows from [42, Theorem A].

Theorem 6.2

For any fixed \(D>0\), there exists a constant \(L\equiv L(D)>0\) such that

$$\begin{aligned} {\mathbb {P}}(|\lambda _N/q|\geqslant L)+{\mathbb {P}}(|\lambda _2/q|\geqslant L) =O_D(N^{-D}). \end{aligned}$$

The upper bound. Let \(z=E+\textrm{i}N^{-2/3} \in {{\textbf {S}}}\). By (6.3) and \(\kappa (E) \geqslant N^{-2/3+\delta }\), we get

$$\begin{aligned} {{\,\textrm{Im}\,}}\underline{G} \!\,(z) \leqslant |\underline{G} \!\,(z)-\widehat{m}(z)|+{{\,\textrm{Im}\,}}\widehat{m}(z) \prec \frac{1}{N^{1+\nu }\eta }+\frac{\eta }{\sqrt{\kappa +\eta }} \prec \frac{1}{N^{1+\nu }\eta }. \end{aligned}$$

This implies that whenever \(E\in [2-d/(Nq)+N^{-2/3+\delta },\delta ^{-1}]\), with very high probability, there is no eigenvalue of A in the interval \([E-N^{-2/3},E+N^{-2/3}]\). Together with Theorem 6.2, we get

$$\begin{aligned} (\lambda _2/q+d/(Nq)-2)_+\prec N^{-2/3+\delta }. \end{aligned}$$
(6.9)
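The step from the bound on \({{\,\textrm{Im}\,}}\underline{G} \!\,\) to the absence of eigenvalues can be made explicit. The following is a sketch, assuming the spectral representation of \(\underline{G} \!\,\) over the nontrivial eigenvalues \(\lambda _2/q,\ldots ,\lambda _N/q\) of A (the precise definition of G is given earlier in the paper and is not restated here). If \(|\lambda _i/q-E|\leqslant \eta \) for some \(i\geqslant 2\), then

$$\begin{aligned} {{\,\textrm{Im}\,}}\underline{G} \!\,(E+\textrm{i}\eta )=\frac{1}{N}\sum _{i=2}^N\frac{\eta }{(\lambda _i/q-E)^2+\eta ^2}\geqslant \frac{1}{N}\cdot \frac{\eta }{2\eta ^2}=\frac{1}{2N\eta }. \end{aligned}$$

With \(\eta =N^{-2/3}\), this lower bound is incompatible with \({{\,\textrm{Im}\,}}\underline{G} \!\,\prec N^{-\nu }/(N\eta )\) on an event of very high probability, which is how the eigenvalue-free interval arises.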

The lower bound. Let \(\widehat{{{\textbf {S}}}}{:}{=}\{z=E-d/(Nq)+\textrm{i}\eta :2- N^{-2/3+\delta }\leqslant E\leqslant 2+N^{-2/3+\delta }, N^{-2/3-\delta /3} \leqslant \eta \leqslant N^{-2/3} \}\subset {{\textbf {D}}}\). One can easily deduce from (6.1) and (6.4) that

$$\begin{aligned} |\underline{G} \!\,-\widehat{m}| \prec N^{-1/3+7\delta /18} \end{aligned}$$
(6.10)

for all \(z \in \widehat{{{\textbf {S}}}}\). Thus

$$\begin{aligned} {{\,\textrm{Im}\,}}\underline{G} \!\,\leqslant |\underline{G} \!\,-\widehat{m}|+{{\,\textrm{Im}\,}}\widehat{m} \prec N^{-1/3+7\delta /18}+(\eta +\kappa )^{1/2}\prec N^{-1/3+\delta /2}. \end{aligned}$$
(6.11)
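The exponent in (6.10) follows from simple arithmetic on the three terms of (6.4). As a sketch: on \(\widehat{{{\textbf {S}}}}\) we have \(N\eta \geqslant N^{1/3-\delta /3}\) and \(\kappa +\eta =O(N^{-2/3+\delta })\), while (6.1) gives \(d\geqslant N^{2/3+\tau }\), so that

$$\begin{aligned} \frac{1}{N\eta }\leqslant N^{-1/3+\delta /3}, \quad \frac{1}{d^{1/2}}\leqslant N^{-1/3-\tau /2}, \quad \frac{(\kappa +\eta )^{1/6}}{(N\eta )^{2/3}}=O\big (N^{-1/9+\delta /6}\cdot N^{-2/9+2\delta /9}\big )=O\big (N^{-1/3+7\delta /18}\big ). \end{aligned}$$

Since \(\delta /3=6\delta /18<7\delta /18\), the third term dominates.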

Let \(f: \mathbb R \rightarrow [0,1]\) be a smooth function such that \(f(x)=1\) for \(|x+d/(Nq)-2|\leqslant N^{-2/3+\delta }-N^{-2/3}\), \(f(x)=0\) for \(|x+d/(Nq)-2|\geqslant N^{-2/3+\delta }\) and \(\Vert f^{(j)}\Vert _{\infty }=O(N^{2j/3})\) for all fixed \(j\in \mathbb N_+\). We see that

$$\begin{aligned}&|\varrho _A([2-d/(Nq)-N^{-2/3+\delta },2-d/(Nq)+ N^{-2/3+\delta }])-N^{-1}{{\,\textrm{Tr}\,}}f(A)|\nonumber \\&\quad \leqslant \,\varrho _A([2-d/(Nq)-N^{-2/3+\delta },2-d/(Nq)- N^{-2/3+\delta }+N^{-2/3}])\nonumber \\&\qquad +\varrho _A([2-d/(Nq)+N^{-2/3+\delta }-N^{-2/3},2-d/(Nq)+ N^{-2/3+\delta }]) \nonumber \\&\quad \leqslant \,2N^{-2/3}\big ( {{\,\textrm{Im}\,}}\underline{G} \!\,(2-d/(Nq)-N^{-2/3+\delta }+\textrm{i}N^{-2/3})\nonumber \\&\qquad +{{\,\textrm{Im}\,}}\underline{G} \!\,(2-d/(Nq)+N^{-2/3+\delta }+\textrm{i}N^{-2/3})\big ) \nonumber \\&\quad \prec \, N^{-1+\delta /2}\,, \end{aligned}$$
(6.12)

where in the last step we used (6.11). Now we compute \(N^{-1}{{\,\textrm{Tr}\,}}f(A)\). Set \(l {:}{=}\lceil {3\delta ^{-1}} \rceil \), and let \(\tilde{f}\) be the almost analytic extension of f, defined by

$$\begin{aligned} \tilde{f}(x+\textrm{i}y)=f(x)+\sum _{j=1}^l\frac{1}{j!}(\textrm{i}y)^j f^{(j)}(x). \end{aligned}$$

We define the regime \(D {:}{=}\{w=x+\textrm{i}y: x \in \mathbb R, |y|\leqslant N^{-2/3-\delta /3}\}\). Note that \(\lambda _1/q=d/q \notin {{\,\textrm{supp}\,}}f\). By [22, Lemma 3.5], we have

$$\begin{aligned} \begin{aligned}&N^{-1}{{\,\textrm{Tr}\,}}f(A)-\int _{\mathbb R} f(x)\varrho (x+d/(Nq))\textrm{d}x\\&\quad =-\frac{\textrm{i}}{2\pi }\oint _{\partial D} \tilde{f}(w) (\underline{G} \!\,(w)-\widehat{m}(w))\, \textrm{d}w+ \frac{1}{\pi }\int _{D} \partial _{\bar{w}}\tilde{f}(w) (\underline{G} \!\,(w)-\widehat{m}(w))\textrm{d}^2 w. \end{aligned} \end{aligned}$$

By the trivial bound \(|\underline{G} \!\,(w)| \leqslant |y|^{-1}\), we see that

$$\begin{aligned}{} & {} \bigg |\frac{1}{\pi }\int _{D} \partial _{\bar{w}}\tilde{f}(w) (\underline{G} \!\,(w)-\widehat{m}(w))\textrm{d}^2 w \bigg | \\{} & {} \quad = O(1)\cdot \int _D |y^{l-1}f^{(l+1)}(x)|\, \textrm{d}^2 w =O\big (N^{-(2/3+\delta /3)l}\cdot N^{2l/3}\big )=O(N^{-1}). \end{aligned}$$

By (6.10) and \(\Vert f\Vert _1=O(N^{-2/3+\delta })\), we have

$$\begin{aligned} \bigg |-\frac{\textrm{i}}{2\pi }\oint _{\partial D} \tilde{f}(w) (\underline{G} \!\,(w)-\widehat{m}(w))\, \textrm{d}w\bigg |\prec N^{-1/3+7\delta /18} \oint _{\partial D} |\tilde{f}(w)| \, \textrm{d}w\prec N^{-1+25\delta /18}. \end{aligned}$$
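The powers of N in the last two displays can be tracked as follows. This is a sketch that uses only the properties of f stated above: f takes values in [0,1] and is supported on an interval of length \(2N^{-2/3+\delta }\), its derivatives are supported on two intervals of length \(N^{-2/3}\) each, and \(\Vert f^{(j)}\Vert _{\infty }=O(N^{2j/3})\). Recalling \(l\geqslant 3\delta ^{-1}\), we find

$$\begin{aligned} \int _D |y^{l-1}f^{(l+1)}(x)|\, \textrm{d}^2 w&=O\big (N^{-(2/3+\delta /3)l}\big )\cdot O\big (N^{2(l+1)/3}\cdot N^{-2/3}\big )=O(N^{-\delta l/3})=O(N^{-1}),\\ N^{-1/3+7\delta /18}\cdot N^{-2/3+\delta }&=N^{-1+25\delta /18}. \end{aligned}$$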

As a result, we get

$$\begin{aligned} N^{-1}{{\,\textrm{Tr}\,}}f(A)= & {} \int _{\mathbb R} f(x)\varrho (x+d/(Nq))\textrm{d}x+O_{\prec }(N^{-1+25\delta /18})\nonumber \\= & {} \frac{2}{3}N^{-1+3\delta /2}+O_{\prec }(N^{-1+25\delta /18}). \end{aligned}$$
(6.13)

Combining (6.12) and (6.13) yields

$$\begin{aligned} \varrho _A([2-d/(Nq)-N^{-2/3+\delta },2-d/(Nq)+ N^{-2/3+\delta }])=\frac{2}{3}N^{-1+3\delta /2}+O_{\prec }(N^{-1+25\delta /18}), \end{aligned}$$

and thus \((2-d/(Nq)-\lambda _k/q)_+ \prec N^{-2/3+\delta }\) for any fixed k. Together with (6.9), this finishes the proof of Theorem 1.1 on the right side of the spectrum.

Remark 6.3

In Theorem 1.1 we restrict ourselves to the regime \(N^{2/3+\tau }\leqslant d \leqslant N/2\), where we have the optimal rigidity estimate. It can be deduced from Theorem 4.2 and Proposition 5.1 that for all \(N^{\tau }\leqslant d \leqslant N/2\), we have

$$\begin{aligned} \lambda _2 =2\sqrt{d(N-d)/N}(1+o(1)) \end{aligned}$$

with very high probability. We do not pursue this here.

6.3 Proof of Theorem 1.3

Let us define the spectral domain

$$\begin{aligned} \widetilde{{{\textbf {D}}}}{:}{=}\{z=E+\textrm{i}\eta : 1\leqslant E\leqslant 4, N^{-2/3}\leqslant \eta \leqslant 1\}. \end{aligned}$$

The next result follows from Theorem 1.1 and Proposition 6.1.

Corollary 6.4

For all \(z \in \widetilde{{{\textbf {D}}}}\), we have

$$\begin{aligned} |\underline{G} \!\,-\widehat{m}| \prec \frac{1}{N\eta }+\frac{1}{d^{1/2}}+\frac{(\kappa +\eta )^{1/6}}{(N\eta )^{2/3}} \end{aligned}$$

and

$$\begin{aligned} \max _{ij} |G_{ij}-\delta _{ij}\widehat{m}|\prec \frac{1}{(N\eta )^{1/2}}+\frac{1}{d^{1/2}}. \end{aligned}$$

In addition, we have

$$\begin{aligned} |\lambda _2/q+d/(Nq)-2| \prec N^{-2/3} \end{aligned}$$

and

$$\begin{aligned} {{\,\textrm{Im}\,}}\underline{G} \!\,\asymp {{\,\textrm{Im}\,}}\widehat{m} \end{aligned}$$

for all \(z=E+\textrm{i}\eta \) satisfying \(1\leqslant E \leqslant 4\) and \(N^{-2/3+\delta }\leqslant \eta \leqslant 1\).

With the help of Corollary 6.4, one can now obtain Theorem 1.3 (at the right spectral edge) using a strategy very similar to that of [4, Section 9].

More precisely, by [1, 31] and Corollary 6.4, one immediately gets that, near the right edge of the spectrum, a Dyson Brownian motion starting at A reaches local equilibrium at time \(t_*\gg N^{-1/3}\). Theorem 1.3 then follows by comparing the edge statistics of the Dyson Brownian motion at times 0 and \(t_*\). The main difference in the comparison argument is that one needs to use Lemma 2.2 instead of [4, Corollary 3.2]. We shall sketch the steps, with emphasis on this difference.

Let us adopt the conventions in [4], i.e. we consider the constrained GOE W satisfying

$$\begin{aligned} \mathbb EW_{ij}W_{kl}=\frac{1}{N}\Big (\delta _{ik}-\frac{1}{N}\Big )\Big (\delta _{jl}-\frac{1}{N}\Big )+\frac{1}{N}\Big (\delta _{il}-\frac{1}{N}\Big )\Big (\delta _{jk}-\frac{1}{N}\Big ). \end{aligned}$$

We have the integration by parts formula

$$\begin{aligned} \mathbb E W_{ij}F(W)=\frac{1}{N^3}\sum _{k,l}\mathbb E \big [\partial _tF(W+t\xi _{ij}^{kl})\big |_{t=0}\big ]. \end{aligned}$$
(6.14)

The matrix-valued process is defined by

$$\begin{aligned} A(t){:}{=}\textrm{e}^{-t/2}A+\sqrt{1-\textrm{e}^{-t}}W, \end{aligned}$$
(6.15)

and we denote its eigenvalues by \(\xi _1(t)\geqslant \cdots \geqslant \xi _N(t)\). We define the parameter \(s{:}{=}1-\textrm{e}^{-t}\). The Green function is defined by \(G(t)\equiv G(t;z){:}{=}P_\bot (A(t)-z)^{-1}P_\bot \). Recall that we use \(\varrho (x)\) to denote the semicircle distribution on \([-2,2]\). As A and W have asymptotic eigenvalue densities \(\varrho (x+d/(Nq))\) and \(\varrho (x)\) respectively, A(t) has asymptotic eigenvalue density

$$\begin{aligned} \varrho (t;x){:}{=}\varrho (x+d/(e^{t/2}Nq)), \end{aligned}$$

and we define its Stieltjes transform by

$$\begin{aligned} m(t;z){:}{=}m(z+d/(e^{t/2}Nq)). \end{aligned}$$

As in [4, Section 9.2], the next result follows from Corollary 6.4 and [1, 10, 31].

Lemma 6.5

  1. (i)

    Let \(0 \leqslant t \ll 1\). We have

    $$\begin{aligned} |\xi _2(t)+d/(e^{t/2}Nq)-2| \prec N^{-2/3}. \end{aligned}$$
  2. (ii)

    Let \(0 \leqslant t \ll 1\). Uniformly for any \(z \in \widetilde{{{\textbf {D}}}}\), we have

    $$\begin{aligned} \big | \underline{G} \!\,(t;z)-m(t;z) \big |\prec \frac{1}{N\eta }+\frac{1}{d^{1/2}}+\frac{(\kappa +\eta )^{1/6}}{(N\eta )^{2/3}} \end{aligned}$$

    and

    $$\begin{aligned} \max _{ij} |G_{ij}(t;z)-\delta _{ij}m(t;z)|\prec \frac{1}{(N\eta )^{1/2}}+\frac{1}{d^{1/2}}. \end{aligned}$$
  3. (iii)

    Recall the definition of \(\mu \) from (6.2) and set \(t_*=N^{-1/3+\mu }\). Fix \(s \in {\mathbb {R}}\). We have

    $$\begin{aligned} \lim _{N \rightarrow \infty }{\mathbb {P}}_{A(t_*)}\big (N^{2/3}(\xi _2(t_*)+d/(e^{t_*/2}Nq)\!-\!2)\!\geqslant \! s\big )=\lim _{N \rightarrow \infty }{\mathbb {P}}_{GOE }\big (N^{2/3}(\mu _1\!-\!2)\!\geqslant \! s\big ). \end{aligned}$$

The limiting distribution of \(\lambda _2\) can be obtained through the following estimate.

Lemma 6.6

Let \(t_*=N^{-1/3+\mu }\), \(\eta =N^{-2/3-\mu }\). For \(\kappa \asymp N^{-2/3}\), we define

$$\begin{aligned} X_{t}{:}{=}{{\,\textrm{Im}\,}}\bigg [N \int _{\kappa }^{N^{-2/3+\mu }} \underline{G} \!\,(t;2-d/(e^{t/2}Nq)+x+\textrm{i}\eta ) \textrm{d}x \bigg ]. \end{aligned}$$

Let \(L: \mathbb R \rightarrow \mathbb R\) be a fixed smooth test function with bounded derivatives. We have

$$\begin{aligned} \big |\mathbb E L(X_{t_*})-\mathbb E L(X_0)\big | =O( N^{-\tau /4}). \end{aligned}$$

By Lemma 6.6 and an analogue of (9.33) in [4], we get

$$\begin{aligned} \lim _{N \rightarrow \infty }{\mathbb {P}}_{A}\big (N^{2/3}(\lambda _2/q+d/(Nq)-2)\geqslant s\big )=\lim _{N \rightarrow \infty }{\mathbb {P}}_{A(t_*)}\big (N^{2/3}(\xi _2(t_*)+d/(e^{t_*/2}Nq)-2)\geqslant s\big ) \end{aligned}$$

for any fixed \(s \in {\mathbb {R}}\). Together with Lemma 6.5 (iii) we conclude the universality of \(\lambda _2\). Analogous results for other non-trivial eigenvalues of \(\mathcal {A}\) can be proved in the same way. We omit the details.

Proof of Lemma 6.6

Let us abbreviate \(G\equiv G(t)\). We have

$$\begin{aligned} \begin{aligned} \frac{\textrm{d}}{\textrm{d}t} {\mathbb {E}} L(X_t)=\mathbb E \bigg [L'(X_t){{\,\textrm{Im}\,}}\int _\kappa ^{N^{-2/3+\mu }}\Big (-\sum _{ij} \dot{A}_{ij}(t)(G^2)_{ij}+Nd/(2e^{t/2}Nq)\underline{G^2} \!\,\Big )\, \textrm{d}x\bigg ]. \end{aligned}\nonumber \\ \end{aligned}$$
(6.16)
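The integrand in (6.16) comes from the chain rule. As a sketch, write \(z_t{:}{=}2-d/(e^{t/2}Nq)+x+\textrm{i}\eta \) (a shorthand introduced only for this remark), so that \(\dot{z}_t=d/(2e^{t/2}Nq)\), and assume that \(N\underline{G} \!\,={{\,\textrm{Tr}\,}}G\), \(N\underline{G^2} \!\,={{\,\textrm{Tr}\,}}G^2\) and that \(P_\bot \) commutes with A(t). Then

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}{{\,\textrm{Tr}\,}}G(t;z_t)=-{{\,\textrm{Tr}\,}}\big (G\dot{A}(t)G\big )+\dot{z}_t{{\,\textrm{Tr}\,}}G^2=-\sum _{ij}\dot{A}_{ij}(t)(G^2)_{ij}+\frac{d}{2e^{t/2}Nq}\cdot N\underline{G^2} \!\,. \end{aligned}$$

Integrating in x, taking the imaginary part and applying the chain rule to \(L(X_t)\) then gives the right-hand side of (6.16).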

By (4.4), (6.14), and (6.15), we have

$$\begin{aligned} \begin{aligned} -\sum _{ij} \mathbb E\dot{A}_{ij}(t)L'(X_t)(G^2)_{ij}&=\frac{1}{2}\sum _{ij}\mathbb E \bigg [\bigg (e^{-t/2}A_{ij}-\frac{\textrm{e}^{-t}}{\sqrt{1-e^{-t}}}W_{ij}\bigg )L'(X_t)(G^2)_{ij}\bigg ]\\&=\frac{\textrm{e}^{-t/2}}{2q}\sum _{ij} \mathbb E \mathcal {A}_{ij} L'(X_t) (G^2)_{ij}\\ {}&\quad -\frac{\textrm{e}^{-t/2}}{2N^3}\sum _{ijkl}\mathbb E \partial _{ij}^{kl}(L'(X_t)(G^2)_{ij}). \end{aligned} \end{aligned}$$
(6.17)

By Lemma 2.2, the first term on the RHS of (6.17) can be written as

$$\begin{aligned} \begin{aligned}&\,\frac{ \textrm{e}^{-t/2}}{2(N-d)dq}\sum _{ijkl}\mathbb E \chi _{ik}^{jl}(\mathcal {A})\textrm{D}_{ij}^{kl}(L'(X_t)(G^2)_{ij}) +\frac{\textrm{e}^{-t/2}d}{2(N-d)q}\sum _{ij}\mathbb E L'(X_t)(G^2)_{ij}\\&-\frac{e^{-t/2}}{2(N-d)dq}\sum _{ij}\mathbb E (\mathcal {A}^3)_{ij}L'(X_t)(G^2)_{ij}+O(N^{-1}q^{-1})\cdot \sum _{ij}\mathbb E\mathcal M_{ij}(L'(X_t)(G^2)_{ij})\\&-\frac{\textrm{e}^{-t/2}}{2(N-d)dq}\sum _{ikl}\mathbb E \chi _{ik}^{il}(\mathcal {A})\textrm{D}_{ii}^{kl}(L'(X_t)(G^2)_{ii}) -\frac{\textrm{e}^{-t/2}d}{2(N-d)q}\sum _{i}\mathbb E L'(X_t)(G^2)_{ii}\\&+\frac{e^{-t/2}}{2(N-d)dq}\sum _{i}\mathbb E (\mathcal {A}^3)_{ii}L'(X_t)(G^2)_{ii} {=}{:}Y_1+\cdots +Y_7 \end{aligned} \end{aligned}$$

By \(\sum _{i} G_{ij}=0\), we have \(Y_2=0\). By Lemma 6.5 (ii), one can deduce that

$$\begin{aligned} {{\,\textrm{Im}\,}}\underline{G} \!\,(2+x+\textrm{i}N^{-2/3}) \prec N^{-1/3+\mu } \quad \hbox {and} \quad \max _{ij}|G_{ij}(2+x+\textrm{i}N^{-2/3})| \prec 1 \end{aligned}$$

for \(\kappa \leqslant x\leqslant N^{-2/3+\mu }\). Since \(y {{\,\textrm{Im}\,}}[\underline{G} \!\,(2+x+\textrm{i}y)]\) is a monotone increasing function of y, we get

$$\begin{aligned} {{\,\textrm{Im}\,}}\underline{G} \!\,(2+x+\textrm{i}\eta ) \prec N^{-1/3+2\mu } \quad \hbox {and} \quad \max _{ij}|G_{ij}(2+x+\textrm{i}\eta )| \prec N^{\mu } \end{aligned}$$
(6.18)

for \(\kappa \leqslant x\leqslant N^{-2/3+\mu }\). From the above and (4.3) we can deduce that

$$\begin{aligned} Y_4 \prec Nq^{-1} \frac{N^{-1/3+2\mu }}{\eta }=N^{4/3+3\mu }q^{-1}. \end{aligned}$$
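
The passage from the scale \(N^{-2/3}\) to \(\eta \) in (6.18), and the exponent in the bound on \(Y_4\), can be made explicit. As a sketch, assume the spectral representation of \(\underline{G} \!\,\) in terms of the eigenvalues \(\xi _i\) of \(A(t)\) (the letter E for the real part of the spectral parameter is notation introduced only here):

$$\begin{aligned} y{{\,\textrm{Im}\,}}\underline{G} \!\,(E+\textrm{i}y)=\frac{1}{N}\sum _{i}\frac{y^2}{(\xi _i-E)^2+y^2}, \end{aligned}$$

and each summand is nondecreasing in \(y>0\). Hence, for \(\eta =N^{-2/3-\mu }\leqslant N^{-2/3}\),

$$\begin{aligned} {{\,\textrm{Im}\,}}\underline{G} \!\,(E+\textrm{i}\eta )\leqslant \frac{N^{-2/3}}{\eta }{{\,\textrm{Im}\,}}\underline{G} \!\,(E+\textrm{i}N^{-2/3})\prec N^{\mu }\cdot N^{-1/3+\mu }=N^{-1/3+2\mu }, \qquad \frac{N^{-1/3+2\mu }}{\eta }=N^{1/3+3\mu }, \end{aligned}$$

so that \(Nq^{-1}\cdot N^{1/3+3\mu }=N^{4/3+3\mu }q^{-1}\), in agreement with the displayed bound.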

As in Lemma 4.1, we can apply the second relation of (4.2) and show that

$$\begin{aligned} (\mathcal {A}^3G^2)_{ii}\prec d^{3/2} \frac{{{\,\textrm{Im}\,}}\underline{G} \!\,}{\eta } \prec d^{3/2} N^{1/3+3\mu }, \end{aligned}$$

where in the last step we used (6.18). This implies \(Y_3 \prec N^{1/3+3\mu }\). Similar to the estimates of \(S_5,S_6,S_7\) in (5.7), we can show that \(Y_5\prec N^{1/3+3\mu }\) and

$$\begin{aligned} Y_6+Y_7=-\frac{\textrm{e}^{-t/2}Nd}{2(N-d)q}\mathbb E\big [L'(X_t)\underline{G^2} \!\,\big ]+O_{\prec }(N^{5/6+10\mu }d^{-1/2}). \end{aligned}$$
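
The claims \(Y_2=0\) and \(Y_3\prec N^{1/3+3\mu }\) above rest on short computations, sketched here under the assumptions that \(G\) is symmetric, that \(L'\) is bounded, and that \(q\asymp d^{1/2}\) for \(d\leqslant N/2\) (this is what the identity \(q^2=d(N-d)/N\), implicit in the last equality for \(Y_{1,1}\) in this proof, would give). For \(Y_2\),

$$\begin{aligned} \sum _{ij}(G^2)_{ij}=\sum _{k}\Big (\sum _{i}G_{ik}\Big )\Big (\sum _{j}G_{kj}\Big )=0. \end{aligned}$$

For \(Y_3\), ignoring the routine passage from \(\prec \) bounds to expectations,

$$\begin{aligned} \sum _{ij}(\mathcal {A}^3)_{ij}(G^2)_{ij}=\sum _{i}(\mathcal {A}^3G^2)_{ii}\prec N d^{3/2}N^{1/3+3\mu }, \qquad \frac{1}{(N-d)dq}\asymp \frac{1}{Nd^{3/2}}, \end{aligned}$$

and the product of the two is of order \(N^{1/3+3\mu }\).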

Next, by (4.5), we get

$$\begin{aligned} Y_1=\frac{\textrm{e}^{-t/2}}{{2(N-d)dq^2}}\sum _{ijkl}\mathbb E \chi _{ik}^{jl}(\mathcal {A})\partial _{ij}^{kl}(L'(X_t)(G^2)_{ij})+O_\prec (N^{4/3+10\mu }d^{-1/2}). \end{aligned}$$
(6.19)

Let us denote the first term on the RHS of (6.19) by \(Y_{1,1}\). Using Lemma 2.2 with \(F(\mathcal {A})=(1-\mathcal {A}_{ij})\mathcal {A}_{jl}(1-\mathcal {A}_{kl})\partial _{ij}^{kl}(L'(X_t)(G^2)_{ij})\), we get

$$\begin{aligned} \begin{aligned} Y_{1,1}=&\,\frac{\textrm{e}^{-t/2}}{2(N-d)^2d^2q^2}\sum _{ijklab}\mathbb E \chi _{ia}^{kb}(\mathcal {A})\textrm{D}_{ik}^{ab}F(\mathcal {A}) +\frac{\textrm{e}^{-t/2}}{2(N-d)^2q^2}\sum _{ijkl}\mathbb E F(\mathcal {A})\\&-\frac{\textrm{e}^{-t/2}}{2(N-d)^2d^2q^2} \sum _{ijkl}\mathbb E (\mathcal {A}^3)_{ik}F(\mathcal {A})+O(N^{-2}d^{-2})\cdot \sum _{ijkl}\mathbb E\mathcal M_{ik}(F(\mathcal {A}))\\ =&\,\frac{\textrm{e}^{-t/2}}{2(N-d)^2q^2}\sum _{ijkl}\mathbb E F(\mathcal {A})-\frac{\textrm{e}^{-t/2}}{2(N-d)^2d^2q^2} \sum _{ijkl}\mathbb E (\mathcal {A}^3)_{ik}F(\mathcal {A})+O_{\prec }(N^{1-\tau /3}), \end{aligned} \end{aligned}$$

where in the second step we used (3.4), (4.5) and (6.18). By Proposition 3.1, the second term on the RHS of the above can be estimated by

$$\begin{aligned} \begin{aligned}&-\frac{\textrm{e}^{-t/2}d}{2N(N-d)^2q^2} \sum _{ijkl} \mathbb EF(\mathcal {A})+O_\prec (N^{-2}d^{-3})\sum _{ijkl} \mathbb E |(\mathcal {A}^3)_{ik}-d^3N^{-1}| |F(\mathcal {A})|\\ =&-\frac{\textrm{e}^{-t/2}d}{2N(N-d)^2q^2} \sum _{ijkl} \mathbb EF(\mathcal {A}) +O_\prec (N^{-2}d^{-3}\cdot N^{4/3+10\mu }d)\sum _{ik} \mathbb E |(\mathcal {A}^3)_{ik}-d^3N^{-1}| \\ =&-\frac{\textrm{e}^{-t/2}d}{2N(N-d)^2q^2} \sum _{ijkl} \mathbb EF(\mathcal {A})+O_{\prec }(N^{5/6+10\mu }). \end{aligned} \end{aligned}$$

Since \(N^{2/3+\tau }\leqslant d\leqslant N/2\), we have

$$\begin{aligned} \begin{aligned} Y_{1,1}&=\frac{\textrm{e}^{-t/2}}{2(N-d)^2q^2}\sum _{ijkl}\mathbb E F(\mathcal {A}) -\frac{\textrm{e}^{-t/2}d}{2N(N-d)^2q^2} \sum _{ijkl} \mathbb EF(\mathcal {A})+O_{\prec }(N^{1-\tau /3})\\&=\frac{\textrm{e}^{-t/2}}{2(N-d)Nq^2}\sum _{ijkl} \mathbb E (1-\mathcal {A}_{ij})\mathcal {A}_{jl}(1-\mathcal {A}_{kl})\partial _{ij}^{kl}(L'(X_t)(G^2)_{ij})+O_{\prec }(N^{1-\tau /3}). \end{aligned}\nonumber \\ \end{aligned}$$
(6.20)
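
The coefficients in (6.20) can be verified directly. Replacing \((\mathcal {A}^3)_{ik}\) by its typical size \(d^3N^{-1}\) in the second term of \(Y_{1,1}\) gives the prefactor

$$\begin{aligned} \frac{1}{2(N-d)^2d^2q^2}\cdot \frac{d^3}{N}=\frac{d}{2N(N-d)^2q^2}, \end{aligned}$$

and combining it with the first term,

$$\begin{aligned} \frac{1}{2(N-d)^2q^2}-\frac{d}{2N(N-d)^2q^2}=\frac{N-d}{2N(N-d)^2q^2}=\frac{1}{2(N-d)Nq^2}. \end{aligned}$$

The error \(O_{\prec }(N^{5/6+10\mu })\) from the previous display is absorbed into \(O_{\prec }(N^{1-\tau /3})\) provided \(10\mu +\tau /3\leqslant 1/6\); this smallness condition on \(\mu \) is an assumption made here for the check, not a statement taken from the text.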

Comparing with (6.19), we see that, heuristically, (6.20) replaces the factor \(\mathcal {A}_{ik}\) in \(Y_{1,1}\) by \(dN^{-1}\), up to a small error. Repeating the argument of (6.20) three times, we get

$$\begin{aligned} \begin{aligned} Y_{1,1}&=\frac{\textrm{e}^{-t/2}d(N-d)}{2q^2N^4}\sum _{ijkl}\mathbb E \partial _{ij}^{kl}(L'(X_t)(G^2)_{ij})+O_{\prec }(N^{1-\tau /3})\\&=\frac{\textrm{e}^{-t/2}}{2N^3}\sum _{ijkl}\mathbb E \partial _{ij}^{kl}(L'(X_t)(G^2)_{ij})+O_{\prec }(N^{1-\tau /3}), \end{aligned} \end{aligned}$$

which together with (6.19) yields

$$\begin{aligned} Y_1=\frac{\textrm{e}^{-t/2}}{2N^3}\sum _{ijkl}\mathbb E \partial _{ij}^{kl}(L'(X_t)(G^2)_{ij})+O_{\prec }(N^{1-\tau /3}). \end{aligned}$$
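
The prefactors in the last two displays can be traced as follows. This is a heuristic bookkeeping sketch: it assumes that each of the three remaining factors \((1-\mathcal {A}_{ij})\), \(\mathcal {A}_{jl}\), \((1-\mathcal {A}_{kl})\) of \(F(\mathcal {A})\) is replaced by its average \((N-d)/N\), \(d/N\), \((N-d)/N\) respectively, in the same way that \(\mathcal {A}_{ik}\) was replaced by \(dN^{-1}\) in (6.20).

$$\begin{aligned} \frac{1}{2(N-d)Nq^2}\cdot \frac{N-d}{N}\cdot \frac{d}{N}\cdot \frac{N-d}{N}=\frac{d(N-d)}{2q^2N^4}, \qquad \frac{d(N-d)}{2q^2N^4}=\frac{1}{2N^3}\ \hbox {if}\ q^2=\frac{d(N-d)}{N}. \end{aligned}$$

For the error in (6.19), the lower bound \(d\geqslant N^{2/3+\tau }\) gives

$$\begin{aligned} N^{4/3+10\mu }d^{-1/2}\leqslant N^{4/3+10\mu -1/3-\tau /2}=N^{1-\tau /2+10\mu }\leqslant N^{1-\tau /3} \end{aligned}$$

as soon as \(\mu \leqslant \tau /60\); this relation between \(\mu \) and \(\tau \) is an assumption made here to close the arithmetic.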

Inserting the above estimates for \(Y_1,\dots ,Y_7\) into (6.17), we get

$$\begin{aligned} -\sum _{ij} \mathbb E\dot{A}_{ij}(t)L'(X_t)(G^2)_{ij}=-\frac{\textrm{e}^{-t/2}Nd}{2(N-d)q}\mathbb E\big [L'(X_t)\underline{G^2} \!\,\big ] +O_{\prec }( N^{1-\tau /3}). \end{aligned}$$

Together with (6.16) we conclude the proof. \(\square \)
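
The final power counting can be sketched as follows. Assume that, after the cancellation of the \(\underline{G^2} \!\,\) terms, the integrand in (6.16) is \(O_{\prec }(N^{1-\tau /3})\) uniformly in \(x\in [\kappa ,N^{-2/3+\mu }]\) and \(t\in [0,t_*]\), that \(L'\) is bounded, and that \(\mu \leqslant \tau /24\) (the last condition is an assumption introduced here to match the exponents). Then

$$\begin{aligned} \big |\mathbb E L(X_{t_*})-\mathbb E L(X_0)\big |\leqslant \int _0^{t_*}\Big |\frac{\textrm{d}}{\textrm{d}t}\mathbb E L(X_t)\Big |\,\textrm{d}t\prec t_*\cdot N^{-2/3+\mu }\cdot N^{1-\tau /3}=N^{-\tau /3+2\mu }\leqslant N^{-\tau /4}, \end{aligned}$$

where \(t_*=N^{-1/3+\mu }\) and \(N^{-2/3+\mu }\) bounds the length of the interval of integration in \(x\).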