The Bandwidth Theorem in Sparse Graphs

The bandwidth theorem [Mathematische Annalen, 343(1):175–205, 2009] states that any n-vertex graph G with minimum degree $\left(\frac{k-1}{k} + o(1)\right)n$ contains all n-vertex k-colourable graphs H with bounded maximum degree and bandwidth o(n). We provide sparse analogues of this statement in random graphs as well as pseudorandom graphs. More precisely, we show that for $p \gg \left(\frac{\log n}{n}\right)^{1/\Delta}$ a.a.s. each spanning subgraph G of G(n, p) with minimum degree $\left(\frac{k-1}{k} + o(1)\right)pn$ contains all n-vertex k-colourable graphs H with maximum degree ∆, bandwidth o(n), and at least $Cp^{-2}$ vertices not contained in any triangle. A similar result is shown for sufficiently bijumbled graphs, which, to the best of our knowledge, is the first resilience result in pseudorandom graphs for a rich class of subgraphs. Finally, we provide improved results for H with small degeneracy, which in particular imply a resilience result in G(n, p) with respect to the containment of spanning bounded degree trees for $p \gg \left(\frac{\log n}{n}\right)^{1/3}$.


Introduction
A central topic in extremal graph theory is to determine minimum degree conditions which force a graph G to contain a copy of some large or even spanning subgraph H. The prototypical example of such a theorem is Dirac's theorem [20], which states that if $\delta(G) \ge \frac{1}{2}v(G)$ then G is Hamiltonian. Analogous results were established for a wide range of spanning subgraphs H with bounded maximum degree such as powers of Hamilton cycles, trees, or F-factors for any fixed graph F (see e.g. [30] for a survey). One feature that all these subgraphs H have in common is that their bandwidth is small. The bandwidth of a graph H is the minimum b such that there is a labelling of the vertex set of H by integers 1, ..., n with |i − j| ≤ b for every edge ij of H. Indeed, it was shown in [12] that a more general result holds, which provides a minimum degree condition forcing every spanning bounded degree subgraph of small bandwidth. This result is by now often called the bandwidth theorem.
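To make the definition of bandwidth concrete, the following Python sketch evaluates a given labelling and, for very small graphs only, minimises over all labellings by brute force. The function names `labelling_bandwidth` and `bandwidth` are our own illustrative choices and do not come from the cited literature.

```python
from itertools import permutations


def labelling_bandwidth(edges, label):
    """Largest |label[u] - label[v]| over all edges (0 if there are no edges)."""
    return max((abs(label[u] - label[v]) for u, v in edges), default=0)


def bandwidth(vertices, edges):
    """Minimum of labelling_bandwidth over all labellings by 1, ..., n.

    Brute force over all n! labellings, so only usable for tiny graphs.
    """
    vertices = list(vertices)
    best = None
    for perm in permutations(range(1, len(vertices) + 1)):
        label = dict(zip(vertices, perm))
        b = labelling_bandwidth(edges, label)
        if best is None or b < best:
            best = b
    return best


# Small examples: a path, a cycle and a star on five vertices.
path = [(i, i + 1) for i in range(4)]
cycle = path + [(4, 0)]
star = [(0, i) for i in range(1, 5)]
```

On these examples the path has bandwidth 1, while the cycle and the star both have bandwidth 2; the graphs H in the theorems of this paper are only required to have bandwidth o(n).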
Theorem 1 (Bandwidth Theorem [12]). For every γ > 0, ∆ ≥ 2, and k ≥ 1, there exist β > 0 and $n_0 \ge 1$ such that for every $n \ge n_0$ the following holds. If G is a graph on n vertices with minimum degree $\delta(G) \ge \left(\frac{k-1}{k} + \gamma\right)n$ and if H is a k-colourable graph on n vertices with maximum degree ∆(H) ≤ ∆ and bandwidth at most βn, then G contains a copy of H.
We remark that in contrast to the above mentioned earlier results for specific bounded degree spanning subgraphs the minimum degree condition in this theorem has an error term γn, but it is known that this cannot completely be omitted in this general statement. In that sense the minimum degree condition in Theorem 1 is best-possible. It is also known that the bandwidth condition cannot be dropped completely (see [12] for further explanations). Moreover, this condition does not limit the class of graphs under consideration unreasonably, because many interesting classes of graphs have sublinear bandwidth. Indeed, it was shown in [11] that for bounded degree n-vertex graphs, restricting the bandwidth to o(n) is equivalent to restricting the treewidth to o(n) or forbidding linear sized expanding subgraphs, which implies that bounded degree planar graphs, or more generally classes of bounded degree graphs defined by forbidding some fixed minor have bandwidth o(n). Generalisations of Theorem 1 were obtained in [9,13,25,31].
In this paper we are interested in the transference of Theorem 1 to sparse graphs. Such transference results have recently received much attention, including for example the breakthrough result on the transference of Turán's theorem to random graphs by Conlon and Gowers [16] and Schacht [36]. The random graph model we shall consider here is the binomial random graph G(n, p), which has n vertices, and in which each pair of vertices forms an edge independently with probability p. The appearance of large or spanning subgraphs of G(n, p) has been studied since the early days of probabilistic combinatorics, and by now many important results have been obtained. Gems include the Johansson-Kahn-Vu theorem [24], which determines the threshold for G(n, p) to contain an F-factor whenever F is strictly balanced (as is the case, for example, when F is a clique), and the theorem of Riordan [35], which gives a very good, and in many cases tight, upper bound on the threshold for G(n, p) to contain a general spanning graph H.
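As a small illustration of the binomial model G(n, p) described above, the following Python sketch samples a graph by including each of the pairs of vertices independently with probability p. The helper name `sample_gnp` and the optional `seed` parameter are assumptions made for this example only.

```python
import random
from itertools import combinations


def sample_gnp(n, p, seed=None):
    """Return the edge list of one sample of G(n, p) on vertex set {0, ..., n-1}."""
    rng = random.Random(seed)
    # Each pair becomes an edge independently with probability p.
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


edges = sample_gnp(50, 0.2, seed=1)
```

The expected number of edges is p·n(n − 1)/2, and the expected degree of each vertex is p(n − 1), which is why minimum degree conditions in G(n, p) are naturally expressed as fractions of pn.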
For spanning graphs H with maximum degree ∆(H) ≤ ∆, Riordan's theorem implies that G(n, p) asymptotically almost surely (a.a.s.), that is, with probability tending to 1 as n tends to infinity, contains H as a subgraph if $p \cdot n^{\frac{2}{\Delta+1} - \frac{2}{\Delta(\Delta+1)}} \to \infty$. This is not believed to be best possible. Indeed, it follows from the Johansson-Kahn-Vu theorem that the threshold for G(n, p) to contain a $K_{\Delta+1}$-factor is $(\log n)^{1/\binom{\Delta+1}{2}} / n^{2/(\Delta+1)}$, and it is conjectured in [21] that above this probability we also get any other sequence of spanning graphs $H = (H_n)$ with ∆(H) ≤ ∆. This was proved, using the Johansson-Kahn-Vu theorem, to be true for almost spanning graphs by Ferber, Luh, and Nguyen [21].
Better bounds are available if we further know that the degeneracy of H is bounded by a constant much smaller than ∆(H). The degeneracy of H is the smallest integer D such that any subgraph of H has a vertex of degree at most D. Surprisingly, for this class of graphs H already Riordan's theorem implies an essentially optimal bound.
Corollary 3 (of Riordan's theorem [35]). For every ∆ ≥ 1 and D ≥ 3, and every sequence $H = (H_n)$ of graphs with v(H) ≤ n and ∆(H) ≤ ∆ and degeneracy at most D, the random graph G(n, p) a.a.s. contains H if $p \cdot n^{1/D} \to \infty$.
This is best possible because a simple first moment calculation shows that if $p \cdot n^{1/D} \to 0$ then G(n, p) a.a.s. does not contain the D-th power of a Hamilton path, which is a D-degenerate graph with maximum degree 2D.
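The degeneracy defined above can be computed by the standard greedy procedure of repeatedly deleting a vertex of minimum degree and recording the largest degree seen at the moment of deletion. The Python sketch below follows this procedure; the function names `degeneracy` and `path_power` are illustrative and not taken from the paper.

```python
def degeneracy(vertices, edges):
    """Degeneracy via repeated removal of a minimum-degree vertex."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    d = 0
    while adj:
        v = min(adj, key=lambda x: len(adj[x]))  # a vertex of minimum degree
        d = max(d, len(adj[v]))
        for w in adj.pop(v):
            adj[w].discard(v)
    return d


def path_power(n, D):
    """Edges of the D-th power of a path on vertices 0, ..., n-1."""
    return [(i, j) for i in range(n) for j in range(i + 1, min(n, i + D + 1))]


square_of_path = path_power(6, 2)
```

For instance, the square of a path on six vertices has degeneracy 2 and maximum degree 4, in line with the remark that the D-th power of a Hamilton path is D-degenerate with maximum degree 2D.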
A feature that both Riordan's theorem and the Johansson-Kahn-Vu theorem (and consequently all results which rely on them, such as Theorem 2 and Corollary 3) have in common is that their proofs are non-constructive. Furthermore, they do not allow for so-called universality results. A graph G is said to be universal for a family H of graphs if G contains copies of all graphs in H simultaneously. The random graph G(n, p) is known to be universal for various families of graphs, but in almost all cases we only know an upper bound on the threshold for universality, which we do not believe is the correct answer.
The reason why probabilistic existence results such as Corollary 3 do not imply universality is that in G(n, p) the failure probability for containing any given spanning graph H without isolated vertices is at least $(1-p)^{n-1}$, the probability that a fixed vertex of G(n, p) is isolated. This probability is too large to apply a union bound. Thus, to prove universality results one needs to show that any graph G with some collection of properties that G(n, p) a.a.s. possesses must contain any given H ∈ H. Using this approach, and improving on a series of earlier results, Dellamonica, Kohayakawa, Rödl and Ruciński [19] obtained the following universality result for the family H(n, ∆) of n-vertex graphs with maximum degree ∆.
However, it is conjectured that universality and the appearance of a $K_{\Delta+1}$-factor occur together, at the threshold given in Theorem 2. A probability bound which is better, but still far from the conjectured truth, was so far only established for almost spanning graphs by Conlon, Ferber, Nenadov and Škorić [14], who showed that for ∆ ≥ 3, if $p \gg n^{-1/(\Delta-1)}\log^5 n$ then G(n, p) is a.a.s. universal for H((1 − o(1))n, ∆). For graphs with small degeneracy, again, the following better bound exists, but this also is far away from the threshold in Corollary 3, which is a plausible candidate for the correct answer.
Theorem 5 (Allen, Böttcher, Hàn, Kohayakawa, Person [2]). For all ∆, D ≥ 1 there is C such that if p ≥ C log n [...]
[...] vertices with ∆(H) ≤ ∆, bandwidth at most $\beta^* n$, and with at least $C^* p^{-2}$ vertices which are not contained in any triangles of H. Then G contains a copy of H.
Observe that the bound on p achieved in this result matches the bound in the universality result in Theorem 4. Hence, though we do not believe it to be optimal, improving it will most likely be hard. Moreover, as explained in conjunction with Theorem 1, the minimum degree of G cannot be decreased, nor can the bandwidth restriction be removed. As indicated above, it is also necessary that $\Theta(p^{-2})$ vertices of H are not in triangles.
If in addition the subgraph H is also D-degenerate, we can prove a variant of Theorem 6 for $p \gg (\log n/n)^{1/(2D+1)}$. Again, this probability bound matches the one in the currently best universality result for D-degenerate graphs given in Theorem 5. As before we require a certain number of vertices which are not in triangles of H. But, due to technicalities of our proof method, in addition these vertices are now also required not to be in four-cycles.
Theorem 7. For each γ > 0, ∆ ≥ 2, and D, k ≥ 1, there exist constants $\beta^* > 0$ and $C^* > 0$ such that the following holds asymptotically almost surely for Γ = G(n, p) if $p \ge C^*\left(\frac{\log n}{n}\right)^{1/(2D+1)}$. Let G be a spanning subgraph of Γ with $\delta(G) \ge \left(\frac{k-1}{k} + \gamma\right)pn$ and let H be a D-degenerate, k-colourable graph on n vertices with ∆(H) ≤ ∆, bandwidth at most $\beta^* n$ and with at least $C^* p^{-2}$ vertices which are not contained in any triangles or four-cycles of H. Then G contains a copy of H.
Since trees are 1-degenerate this implies a resilience result for trees when $p \gg \left(\frac{\log n}{n}\right)^{1/3}$. This probability bound is much worse than that obtained by Balogh, Csaba, and Samotij [6] for almost spanning trees, and unlikely to be optimal, but it is the first resilience result for bounded degree spanning trees in G(n, p).
Finally, we also establish a sparse analogue of Theorem 1 in bijumbled graphs, one of the most widely studied classes of pseudorandom graphs. A graph Γ is (p, ν)-bijumbled if for all disjoint sets X, Y ⊆ V(Γ) we have $\big|e(X, Y) - p|X||Y|\big| \le \nu\sqrt{|X||Y|}$.
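The bijumbledness condition can be checked directly on very small graphs. The following Python sketch computes, by brute force over all pairs of disjoint nonempty sets, the smallest ν for which a graph is (p, ν)-bijumbled, assuming the standard form of the condition with an absolute value on the left and the square root of |X||Y| on the right. The names `edges_between` and `jumbledness` are our own, and the running time is exponential in the number of vertices.

```python
from itertools import product
from math import sqrt


def edges_between(adj, X, Y):
    """Number of edges with one endpoint in X and the other in Y (X, Y disjoint)."""
    return sum(1 for x in X for y in Y if y in adj[x])


def jumbledness(adj, p):
    """Smallest nu with |e(X,Y) - p|X||Y|| <= nu * sqrt(|X||Y|) for all disjoint X, Y.

    adj maps each vertex to the set of its neighbours. Pairs with an empty
    side satisfy the inequality trivially and are skipped.
    """
    vertices = list(adj)
    worst = 0.0
    # Each vertex is placed in neither set (0), in X (1), or in Y (2).
    for assignment in product((0, 1, 2), repeat=len(vertices)):
        X = [v for v, a in zip(vertices, assignment) if a == 1]
        Y = [v for v, a in zip(vertices, assignment) if a == 2]
        if X and Y:
            deviation = abs(edges_between(adj, X, Y) - p * len(X) * len(Y))
            worst = max(worst, deviation / sqrt(len(X) * len(Y)))
    return worst


complete4 = {v: {w for w in range(4) if w != v} for v in range(4)}
```

For the complete graph on four vertices this gives ν = 0 when p = 1 and ν = 1 when p = 1/2, the latter attained by two disjoint pairs of vertices.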
This goes back to an equivalent notion introduced by Thomason [39], who initiated the study of pseudorandom graphs. It is also related to the well investigated class of (n, d, λ)-graphs in that an (n, d, λ)-graph is $\left(\frac{d}{n}, \lambda\right)$-bijumbled. Only very recently a universality result similar to Theorem 4 was established for bijumbled graphs in [2], where it was shown that (p, ν)-bijumbled graphs G with $\delta(G) \ge \frac{1}{2}pn$ and $\nu \ll p^{\max(4,(3\Delta+1)/2)}n$ are universal for H(n, ∆). Our resilience result works for the same bijumbledness condition, though we do not believe it to be optimal. Local resilience results in bijumbled graphs were so far only obtained for special subgraphs H: Dellamonica, Kohayakawa, Marciniszyn, and Steger [18] considered cycles H of length (1 − o(1))n, the results of Conlon, Fox and Zhao [15] imply resilience for F-factors covering (1 − o(1))n vertices, and Krivelevich, Lee and Sudakov [29] established a resilience result for pancyclicity. Hence, prior to this work only little was known about the resilience of bijumbled (or indeed any other common notion of pseudorandom) graphs.
Theorem 8. For each γ > 0, ∆ ≥ 2, and k ≥ 1, there exists a constant c > 0 such that the following holds for any p > 0. Given $\nu \le cp^{\max(4,(3\Delta+1)/2)}n$, suppose Γ is a (p, ν)-bijumbled graph, G is a spanning subgraph of Γ with $\delta(G) \ge \left(\frac{k-1}{k} + \gamma\right)pn$, and H is a k-colourable graph on n vertices with ∆(H) ≤ ∆ and bandwidth at most cn. Suppose further that there are at least $c^{-1}p^{-6}\nu^2 n^{-1}$ vertices in V(H) that are not contained in any triangles of H. Then G contains a copy of H.
The proofs of our results rely on sparse versions of the so-called blow-up lemma. The blow-up lemma, proved by Komlós, Sárközy and Szemerédi [28], is an important tool in extremal graph theory and was, for example, instrumental in the proof of the bandwidth theorem and of its analogue in G(n, p) for constant p by Huang, Lee and Sudakov [22]. However, it applies only to dense graphs. Several of the earlier resilience results in sparse random graphs developed sparse blow-up type results handling special classes of graphs: Balogh, Lee and Samotij [7] proved a sparse blow-up lemma for embedding triangle factors, and in [10] a blow-up lemma for embedding almost spanning bipartite graphs in sparse graphs was used. Full versions of the blow-up lemma in sparse random graphs and pseudorandom graphs were established only very recently in [2]. We will use these here.
Further, we note that we actually prove somewhat stronger statements than Theorem 6, Theorem 7, and Theorem 8, in the same sense in which a stronger statement than Theorem 1 was proven in [12]: we in fact allow H to be (k + 1)-colourable, where the additional colour may only be assigned to very few well distributed vertices (for details see, e.g., Theorem 23 below). Thus, for instance, even though Theorem 6 only implies that the local resilience of G(n, p) with respect to Hamiltonicity is a.a.s. at least $\frac{1}{2} - o(1)$ when n is even, Theorem 23 implies it also for n odd, since, although the chromatic number of a Hamilton cycle on an odd number of vertices is 3, there are 3-colourings which use the third colour only on one vertex.
Organisation. The remainder of this paper is organised as follows. In Section 2 we introduce necessary definitions and collect some known results which we need in our proofs. Next, in Section 3, we outline the proof of the bandwidth theorem in sparse random graphs, Theorem 6, and state the four technical lemmas we require. Their proofs are given in Sections 4-7, and the proof of Theorem 6 is presented in Section 8. We provide the modifications required to obtain Theorem 7 in Section 9, and those required for Theorem 8 in Section 10. Finally, Section 11 contains some concluding remarks, and Appendix A contains proofs of a few results which are more or less standard but which we could not find in the form we need in the literature.

Preliminaries
Throughout the paper log denotes the natural logarithm. We assume that the order n of all graphs tends to infinity and therefore is sufficiently large whenever necessary. For reals a, b > 0 and integer k ∈ N, we use the notation (a ± b) = [a − b, a + b] and [k] = {1, ..., k}. Our graph-theoretic notation is standard and follows [8]. In particular, given a graph G its vertex set is denoted by V(G) and its edge set by E(G). Let A, B ⊆ V(G) be disjoint vertex sets. We denote the number of edges between A and B by e(A, B).
Finally, we use the notation $\deg_G(v) := |N_G(v)|$ and $\deg_G(v, A) := |N_G(v, A)|$, as well as $\deg_G(v_1, \dots, v_k; A) := |N_G(v_1, \dots, v_k; A)|$, for the degree of v in G, the degree of v restricted to A in G, and the size of the joint neighbourhood of $v_1, \dots, v_k$ restricted to A in G, respectively. For the sake of readability, we do not intend to optimise the constants in our theorems and proofs.
Now we introduce some definitions and results of the regularity method as well as related tools that are essential in our proofs. In particular, we state a minimum degree version of the sparse regularity lemma (Lemma 12) and the sparse blow-up lemma (Lemma 15). Both lemmas use the concept of regular pairs. Let G = (V, E) be a graph, ε, d > 0, and p ∈ (0, 1]. Moreover, let X, Y ⊆ V be two disjoint nonempty sets. The p-density of the pair (X, Y) is defined as $d_{G,p}(X, Y) := \frac{e_G(X, Y)}{p|X||Y|}$.
For most of this paper, when we work with random graphs, we will be interested in the regularity concept called lower-regularity. When we work with bijumbled graphs, on the other hand, we will need the stronger concept of regularity. The difference is that in the former we impose only lower bounds on p-densities, whereas in the latter we impose in addition upper bounds. The main reason for this difference is that our 'regularity inheritance lemmas' below have different requirements in random and in bijumbled graphs; we do not otherwise make use of the extra strength of 'regular' as opposed to 'lower-regular'.
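For orientation, the p-density of a pair can be computed as in the following Python sketch. It assumes the usual normalisation in which the number of edges between X and Y is divided by p|X||Y|, so that a typical pair in G(n, p) has p-density close to 1. The function name `p_density` and the adjacency-dictionary representation are assumptions of this example.

```python
def p_density(adj, X, Y, p):
    """p-density e(X, Y) / (p |X| |Y|) of a pair of disjoint nonempty sets.

    adj maps each vertex to the set of its neighbours.
    """
    e_xy = sum(1 for x in X for y in Y if y in adj[x])
    return e_xy / (p * len(X) * len(Y))


# A complete bipartite graph between {0, 1} and {2, 3, 4}.
adj = {0: {2, 3, 4}, 1: {2, 3, 4}, 2: {0, 1}, 3: {0, 1}, 4: {0, 1}}
```

Note that, unlike the usual density, the p-density may exceed 1: in the example above the pair ({0, 1}, {2, 3, 4}) has p-density 2 when p = 1/2. Lower-regularity only bounds such values from below on large subpairs, whereas regularity also controls them from above.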
We also need to define super-regularity, for which we require G to be a subgraph of a graph Γ, which will be the random or bijumbled graph whose resilience properties we are establishing.
If (X, Y) is either $(\varepsilon, d, p)_G$-lower-regular or $(\varepsilon, d, p)_G$-regular, and in addition we have [...] for every x ∈ X and y ∈ Y, then the pair (X, Y) is called $(\varepsilon, d, p)_G$-super-regular. When we use super-regularity it will be clear from the context whether (X, Y) is lower-regular or regular.
Note that a regular pair is by definition lower-regular, though the converse does not hold. Furthermore, although the definition of super-regularity of G contains a reference to Γ, at each place in this paper where we use super-regularity, we will see that the first term in the maximum is larger than the second. When it is clear from the context, we may omit the subscript G in $(\varepsilon, d, p)_G$-(super-)regular, which indicates the graph with respect to which a pair is (super-)regular. A direct consequence of the definition of (ε, d, p)-lower-regular pairs is the following proposition about the sizes of neighbourhoods in lower-regular pairs.
Proposition 10. Let (X, Y) be (ε, d, p)-lower-regular. Then there are fewer than ε|X| vertices [...]
The next proposition asserts that small alterations of the vertex sets of an (ε, d, p)-(lower-)regular pair do not destroy (lower-)regularity.
We defer the proof of this to Appendix A. In order to state the sparse regularity lemma, we need some more definitions. A partition $\mathcal{V} = \{V_i\}_{i \in \{0, \dots, r\}}$ of the vertex set of G is called an $(\varepsilon, p)_G$-regular partition of V(G) if $|V_0| \le \varepsilon|V(G)|$ and $(V_i, V_{i'})$ forms an $(\varepsilon, 0, p)_G$-regular pair for all but at most ε r [...]
The graph R is referred to as the $(\varepsilon, d, p)_G$-reduced graph of $\mathcal{V}$, the partition classes $V_i$ with i ∈ [r] as clusters, and $V_0$ as the exceptional set. We also say that $\mathcal{V}$ is $(\varepsilon, d, p)_G$-super-regular on a graph $R'$ with vertex set [r] if $(V_i, V_{i'})$ is $(\varepsilon, d, p)_G$-super-regular for every $\{i, i'\} \in E(R')$. Again, when we talk about reduced graphs or super-regularity, whether we are using lower-regularity or regularity will be clear from the context. We will however always specify whether a partition is regular or only lower-regular on R.
Analogously to Szemerédi's regularity lemma for dense graphs, the sparse regularity lemma, proved by Kohayakawa and Rödl [26,27], asserts the existence of an (ε, p)-regular partition of constant size of any sparse graph. We state a minimum degree version of this lemma, whose proof (following [10]) we defer to Appendix A.
Lemma 12 (Minimum degree version of the sparse regularity lemma). For each ε > 0, each α ∈ [0, 1], and $r_0 \ge 1$ there exists $r_1 \ge 1$ with the following property. For any d ∈ [0, 1], any p > 0, and any n-vertex graph G with minimum degree αpn such that for any disjoint X, [...]
A key ingredient in the proof of our main theorem is the so-called sparse blow-up lemma developed by Hàn, Kohayakawa, Person, and two of the current authors in [2]. Given a subgraph G ⊆ Γ = G(n, p) with $p \gg (\log n/n)^{1/\Delta}$ and an n-vertex graph H with maximum degree at most ∆, with vertex partitions $\mathcal{V}$ and $\mathcal{W}$, respectively, the sparse blow-up lemma guarantees under certain conditions a spanning embedding of H in G which respects the given partitions. In order to state this lemma we need to introduce some definitions.
Definition 13 ((ϑ, R′)-buffer). Let $R'$ be a graph on r vertices and let H be a graph with vertex partition $\mathcal{W} = \{W_i\}_{i \in [r]}$. We say that the family [...], and (ii) for each i ∈ [r] and each $x \in W_i$, the first and second neighbourhood of x go along $R'$, i.e., for each {x, y}, {y, z} ∈ E(H) with $y \in W_j$ and $z \in W_k$ we have $\{i, j\} \in E(R')$ and $\{j, k\} \in E(R')$.
Let G and H be graphs on n vertices with partitions [...]
We will actually need a little more than just an embedding of H into G respecting given partitions: we will need to restrict the images of some vertices of H to subsets of the clusters of G. The following definition encapsulates the properties we have to guarantee for the sparse blow-up lemma to obtain such an embedding.
Definition 14 (Restriction pair). Let ε, d > 0, p ∈ [0, 1], and let R be a graph on r vertices. Furthermore, let G be a (not necessarily spanning) subgraph of Γ = G(n, p) and let H be a graph given with vertex partitions [...] We say that $\mathcal{I}$ and $\mathcal{J}$ are a $(\rho, \zeta, \Delta, \Delta_J)$-restriction pair if the following properties hold for each i ∈ [r] and $x \in W_i$. [...]
Suppose $\mathcal{V}$ is an $(\varepsilon, d, p)_G$-lower-regular partition of V(G) with reduced graph R, and let $R'$ be a subgraph of R. We say $(G, \mathcal{V})$ has one-sided inheritance on $R'$ if for every $\{i, j\}, \{j, k\} \in E(R')$ and [...]
Now we can finally state the sparse blow-up lemma.
Lemma 15 ([2, Lemma 1.21]). For each $\Delta, \Delta_{R'}, \Delta_J, \vartheta, \zeta, d > 0$, κ > 1 there exist $\varepsilon_{BL}, \rho > 0$ such that for all $r_1$ there is a $C_{BL}$ such that for $p \ge C_{BL}(\log n/n)^{1/\Delta}$ the random graph Γ = G(n, p) asymptotically almost surely satisfies the following. Let R be a graph on $r \le r_1$ vertices and let $R' \subseteq R$ be a spanning subgraph with $\Delta(R') \le \Delta_{R'}$. Let H and G ⊆ Γ be graphs given with κ-balanced, size-compatible vertex partitions $\mathcal{W} = \{W_i\}_{i \in [r]}$ and $\mathcal{V} = \{V_i\}_{i \in [r]}$ with parts of size at least $m \ge n/(\kappa r_1)$. Let $\mathcal{I} = \{I_x\}_{x \in V(H)}$ be a family of image restrictions, and $\mathcal{J} = \{J_x\}_{x \in V(H)}$ be a family of restricting vertices. Suppose that
(BUL 1) [...] is a $(\vartheta, R')$-buffer for H,
(BUL 2) $\mathcal{V}$ is $(\varepsilon_{BL}, d, p)_G$-lower-regular on R, $(\varepsilon_{BL}, d, p)_G$-super-regular on $R'$, has one-sided inheritance on $R'$, and two-sided inheritance on $R'$ for $\mathcal{W}$,
(BUL 3) $\mathcal{I}$ and $\mathcal{J}$ form a $(\rho, \zeta, \Delta, \Delta_J)$-restriction pair.
Then there is an embedding φ : [...]
Observe that in the blow-up lemma for dense graphs, proved by Komlós, Sárközy, and Szemerédi [28], one does not need to explicitly ask for one- and two-sided inheritance properties since they are always fulfilled by dense regular partitions. This is, however, not true in general in the sparse setting. The following two lemmas will be very useful whenever we need to choose vertices whose neighbourhoods inherit lower-regularity.
Lemma 16 (One-sided lower-regularity inheritance, [2]). For each $\varepsilon_{OSRIL}, \alpha_{OSRIL} > 0$ there exist $\varepsilon_0 > 0$ and C > 0 such that for any $0 < \varepsilon < \varepsilon_0$ and 0 < p < 1 asymptotically almost surely Γ = G(n, p) has the following property. For any disjoint sets X and Y in V(Γ) with $|X| \ge C\max\{p^{-2}, p^{-1}\log n\}$ and $|Y| \ge Cp^{-1}\log n$, and any subgraph [...]
Lemma 17 (Two-sided lower-regularity inheritance, [2]). For each $\varepsilon_{TSRIL}, \alpha_{TSRIL} > 0$ there exist $\varepsilon_0 > 0$ and C > 0 such that for any $0 < \varepsilon < \varepsilon_0$ and 0 < p < 1, asymptotically almost surely Γ = G(n, p) has the following property. For any disjoint sets X and Y in V(Γ) with $|X|, |Y| \ge C\max\{p^{-2}, p^{-1}\log n\}$, and any subgraph [...]
We close this section with two Chernoff bounds for random variables that follow a binomial (Theorem 19) and a hypergeometric distribution (Theorem 20), respectively, and the following useful observation. Roughly speaking, it states that a.a.s. nearly all vertices in G(n, p) have approximately the expected number of neighbours within large enough subsets.
Note that in most of this paper we will use the upper bound log(en/|X|) ≤ log n when applying this proposition, and Lemmas 16 and 17, valid since (in all applications) we have |X| ≥ e. We will only need the full strength of these three results when proving the Lemma for G (Lemma 24).
In the proof of Proposition 18 we use the following version of Chernoff's Inequalities (see e.g. [23, Chapter 2] for a proof).
Theorem 19 (Chernoff's Inequality, [23]). Let X be a random variable which is the sum of independent Bernoulli random variables. Then we have for ε ≤ 3/2 [...]
Proof of Proposition 18. Since the statement of the proposition is stronger when ε is smaller, we may assume that 0 < ε ≤ 1. We set $C' = 100\varepsilon^{-2}$ and $C = 1000C'\varepsilon^{-1}$.
We first show that Γ = G(n, p) a.a.s. has the following two properties. For any disjoint A, B ⊆ V(Γ), with $|A| \ge C'p^{-1}\log n$ and $|B| \ge C'p^{-1}\log(en/|A|)$, we have $e(A, B) = \left(1 \pm \frac{\varepsilon}{2}\right)p|A||B|$. For any A ⊆ V(Γ), we have $e(A) \le 4p|A|^2 + 2|A|\log n$, and if $|A| \ge C'p^{-1}\log n$ then $e(A) \le 2p|A|^2$. Note that these properties imply the first two conclusions of the proposition.
We estimate the failure probability of the first property using Theorem 19 and the union bound. Assuming without loss of generality that |A| ≥ |B|, this probability is at most [...]
For the second property, observe that $4p|A|^2 > 7p\binom{|A|}{2}$, so that for any given A by Theorem 19 we have $\mathbb{P}\big[e(A) \ge 4p|A|^2 + 2|A|\log n\big] \le e^{-2|A|\log n} = n^{-2|A|}$. Taking a union bound over the at most $n^{|A|}$ choices of A given |A|, we see that the failure probability of the second property is at most $\sum_{a=1}^{n} n^{-a}$. Finally, the failure probability of the last property is at most [...] and since all three failure probabilities tend to zero as n → ∞, we conclude that a.a.s. G(n, p) enjoys both properties.
Now suppose Γ has these properties, and let X ⊆ V(Γ) have size at least $Cp^{-1}\log n$. We first show that there are at most $C'p^{-1}\log(en/|X|)$ vertices in Γ which have fewer than (1 − ε)p|X| neighbours in X. If this were false, then we could choose a set Y of $C'p^{-1}\log(en/|X|)$ vertices in Γ which have fewer than (1 − ε)p|X| neighbours in X. By choice of C and since |X| > e, we [...]
Next we show that there are at most $2C'p^{-1}\log(en/|X|)$ vertices of Γ which have more than (1 + ε)p|X| neighbours in X. Again, if this is not the case we can let Y be a set of $2C'p^{-1}\log(en/|X|)$ vertices of Γ with more than (1 + ε)p|X| neighbours in X. Now $e(Y) \le 4p|Y|^2 + 2|Y|\log n = 8C'|Y|\log(en/|X|) + 2|Y|\log n \le 10C'|Y|\log n$, so there are at most |Y|/2 vertices in Y which have $40C'\log n$ or more neighbours in Y. Let $Y' \subseteq Y$ consist of those vertices with at most $40C'\log n$ neighbours in Y. For each $v \in Y'$ we have [...] and so, by choice of C, each vertex of $Y'$ has at least $\left(1 + \frac{\varepsilon}{2}\right)p|X \setminus Y|$ neighbours in $X \setminus Y$. Since $|Y'| \ge C'p^{-1}\log(2en/|X|)$ and $|X \setminus Y| \ge |X|/2 \ge C'p^{-1}\log n$, this is a contradiction.
Finally, since by choice of C we have $3C'p^{-1}\log n < Cp^{-1}\log n$, we conclude that all but at most $Cp^{-1}\log(en/|X|)$ vertices of Γ have (1 ± ε)p|X| neighbours in X, as desired.
Finally, let N, m, and s be positive integers and let S and $S' \subseteq S$ be two sets with |S| = N and $|S'| = m$. The hypergeometric distribution is the distribution of the random variable X that is defined by drawing s elements of S without replacement and counting how many of them belong to $S'$. It can be shown that Theorem 19 still holds in the case of hypergeometric distributions (see e.g. [23], Chapter 2 for a proof) with $\mathbb{E}[X] := ms/N$.
Theorem 20 (Hypergeometric inequality, [23]). Let X be a random variable that follows the hypergeometric distribution with parameters N, m, and s. Then for any ε > 0 and t ≥ εms/N we have $\mathbb{P}\big[|X - ms/N| > t\big] < 2e^{-\varepsilon^2 t/3}$.
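As a numerical sanity check of the hypergeometric inequality, the following Python sketch computes the exact two-sided tail of a hypergeometric random variable and the corresponding bound for one choice of parameters. The function names and the specific parameters N = 100, m = 30, s = 20, ε = 1, t = 6 are our own illustrative assumptions.

```python
from math import comb, exp


def hypergeom_pmf(N, m, s, x):
    """P[X = x] when s elements are drawn without replacement from N, m of them marked."""
    return comb(m, x) * comb(N - m, s - x) / comb(N, s)


def two_sided_tail(N, m, s, t):
    """Exact value of P[|X - ms/N| > t]."""
    mean = m * s / N
    return sum(
        hypergeom_pmf(N, m, s, x)
        for x in range(0, min(m, s) + 1)
        if abs(x - mean) > t
    )


def tail_bound(eps, t):
    """The bound 2 * exp(-eps^2 * t / 3) from the hypergeometric inequality."""
    return 2 * exp(-eps ** 2 * t / 3)


N, m, s = 100, 30, 20        # mean ms/N = 6
eps, t = 1.0, 6.0            # satisfies t >= eps * ms/N
exact = two_sided_tail(N, m, s, t)
bound = tail_bound(eps, t)
```

For these parameters the exact tail lies well below the bound, which is typical: the inequality is far from tight for fixed small parameters, but its exponential decay is what makes the union bounds in the proofs below work.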
We require the following technical lemma, which is a consequence of the hypergeometric inequality stated in Theorem 20.
Lemma 21. For each η > 0 and ∆ there exists C such that the following holds. Let W ⊆ [n], let $t \le 100n^{\Delta}$, and let $T_1, \dots, T_t$ be subsets of W. For each m ≤ |W| there is a set S ⊆ W of size m such that [...]
Proof. Set $C = 30\eta^{-2}\Delta$. Observe that if S is chosen uniformly at random among the subsets of W of size m, then for each i the size of $T_i \cap S$ is hypergeometrically distributed. By Theorem 20, for each i we have [...] so taking the union bound over all i ∈ [t] we conclude that the probability of failure is at most $2t/n^{1+\Delta} \le 200/n \to 0$ as n → ∞, as desired.

Proof overview and main lemmas
Theorem 6 is a corollary of the following more general Theorem 23, which we prove in Section 8. We require one preliminary definition.
Definition 22 (Zero-free colouring). Let H be a (k + 1)-colourable graph on n vertices and let L be a labelling of its vertex set of bandwidth at most βn. A proper (k + 1)-colouring σ : V (H) → {0, . . . , k} of its vertex set is said to be (z, β)-zero-free with respect to L if any z consecutive blocks contain at most one block with colour zero, where a block is defined as a set of the form {(t − 1)4kβn + 1, . . . , t4kβn} with some t ∈ [1/(4kβ)].
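The zero-freeness condition of Definition 22 is easy to test mechanically. The Python sketch below does so under two simplifying assumptions of ours: the colouring is given as a list in the order of the labelling L, and the block length 4kβn is passed as an integer `block_size` dividing the number of vertices. It does not check that the colouring is proper; the function name `is_zero_free` is illustrative.

```python
def is_zero_free(colours, block_size, z):
    """Check that any z consecutive blocks contain at most one block using colour 0.

    colours[i] is the colour (in {0, ..., k}) of the vertex with label i + 1;
    blocks are consecutive runs of block_size labels.
    """
    blocks = [colours[s:s + block_size] for s in range(0, len(colours), block_size)]
    has_zero = [0 in block for block in blocks]
    # Slide a window of z consecutive blocks along the labelling order.
    # If there are fewer than z blocks the range is empty and the check passes.
    return all(
        sum(has_zero[t:t + z]) <= 1
        for t in range(len(has_zero) - z + 1)
    )


# Four blocks of length 4; colour zero appears only in the first and last block.
colours = [1, 2, 0, 1,  1, 2, 1, 2,  1, 2, 1, 2,  0, 1, 2, 1]
```

In this example the colouring passes the check for z = 3, since any three consecutive blocks contain at most one block with colour zero, but fails it for z = 4, since the four blocks together contain two such blocks.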

Theorem 23. [...] Let G be a spanning subgraph of Γ with $\delta(G) \ge \left(\frac{k-1}{k} + \gamma\right)pn$ and let H be a graph on n vertices with ∆(H) ≤ ∆ that has a labelling L of its vertex set of bandwidth at most βn, a (k + 1)-colouring that is (z, β)-zero-free with respect to L and where the first $\sqrt{\beta}n$ vertices in L are not given colour zero and the first βn vertices in L include $Cp^{-2}$ vertices that are not contained in any triangles of H. Then G contains a copy of H.
3.1. Proof overview. We now give a brief sketch of the proof of Theorem 23. Ultimately, our goal is to apply the sparse blow-up lemma, Lemma 15, to find an embedding of H into G. Thus, the proof boils down to obtaining the required conditions. But there is a catch: this is not as such possible, as for any lower-regular partition of G there can be some exceptional vertices which are 'badly behaved' with respect to the partition. These vertices will never satisfy the conditions of the sparse blow-up lemma, and we will have to deal with them beforehand. We will do this by 'pre-embedding' some vertices of H to cover the exceptional vertices, and then apply the sparse blow-up lemma to complete the embedding of H into G, using image restrictions to ensure we really obtain an embedding of H. Let us now fill in a few more details.
We start by obtaining, in the lemma for G, Lemma 24, a lower-regular partition of G into parts $V_0$ and $V_{i,j}$ for i ∈ [r] (where r may be large but is bounded above by a constant) and j ∈ [k] with several extra properties. The most important properties are that $|V_0| = O(p^{-2})$, that the corresponding reduced graph, which we call $R^k_r$, has high minimum degree and contains a backbone graph, that is, contains the edge $\{(i, j), (i', j')\}$ whenever $|i - i'| \le 1$ and $j \ne j'$, that the lower-regular pairs $(V_{i,j}, V_{i,j'})$ are not just lower-regular but super-regular, and that all vertices outside $V_0$ have inheritance properties with respect to all lower-regular pairs. In short, if the exceptional vertices $V_0$ did not exist, this partition, together with a corresponding partition of V(H), would be what we need to apply the sparse blow-up lemma.
Passing over for now the inconvenient existence of $V_0$, our next task is to find the corresponding partition of V(H), for which we use the lemma for H, Lemma 25. One should think of the backbone graph as consisting of copies of $K_k$ (one for each i ∈ [r]) connected in a linear order; and the high minimum degree of $R^k_r$ ensures that each $K_k$ extends to $K_{k+1}$ in $R^k_r$. The basic idea is then to split H into intervals in the bandwidth order. We assign the first interval to the first $K_k$ of the backbone graph according to the given colouring of H, with the few vertices of colour zero assigned to a vertex extending this clique of the backbone graph to $K_{k+1}$, and so on. Using the bandwidth property and zero-freeness of the colouring one can do this in such a way as to obtain a graph homomorphism from H to $R^k_r$, which is what we need. In addition, we need the number of vertices assigned to each $(i, j) \in V(R^k_r)$ to be very close to $|V_{i,j}|$. We cannot guarantee exact equality, but we can get very close by making further use of bandwidth, zero-freeness, and the fact that copies of $K_k$ in $R^k_r$ extend to copies of $K_{k+1}$.
Now we have to deal with the exceptional set $V_0$. We do this as follows. We choose a vertex v in the exceptional set, and 'pre-embed' to it a vertex x picked from the first $\sqrt{\beta}n$ vertices of L which is not in any triangle of H. Using the common neighbourhood lemma, Lemma 26, we choose ∆ neighbours of v which are 'well-behaved' with respect to the clusters $V_{i,j}$ for some i ∈ [r], and pre-embed the neighbours of x to these vertices. The 'well-behaved' properties are what we need to generate image restrictions for the second neighbours of x (which we will embed using the sparse blow-up lemma) satisfying the restriction pair properties.
We also need to change the assignment from the Lemma for H locally (up to a large but constant distance from x) to accommodate this: the vertex x, and its first and second neighbours, might have been assigned somewhere quite different previously. We repeat this until we have pre-embedded to all exceptional vertices, and let H ′ and G ′ be respectively the unembedded vertices of H and the vertices of G to which we did not pre-embed.
At this point we have all the conditions we need to apply the sparse blow-up lemma to complete the embedding, except that the partitions of $H'$ and $G'$ we have do not quite have parts of matching sizes. We use the balancing lemma, Lemma 27, to deal with this. The idea is simple: we take some carefully selected vertices in clusters of $G$ which are too big (compared to the assigned part of $H$) and move them to other clusters, first in order to make sure that the total number of vertices in $\bigcup_i V_{i,j}$ is correct for each $j$ (using the high minimum degree of $R^k_r$) and then (using the structure of the backbone graph) to give each cluster the correct size.
At last, applying the sparse blow-up lemma, Lemma 15, we complete the embedding of H into G.
We note that this proof sketch glosses over some subtleties. In particular, at the two places where 'we choose' vertices onto which to pre-embed, we have to be quite careful to choose vertices correctly so that this strategy can be completed and we do not destroy good properties obtained earlier. We will return to this point immediately before the proof of Theorem 23 in Section 8 to explain how we do this.

Main lemmas.
In this subsection we formulate the four main lemmas that we use in the proof of Theorem 23 mentioned in the above overview. We defer the proofs of these lemmas to later sections. Before stating these lemmas, we need some more definitions.
Let $r, k \ge 1$ and let $B^k_r$ be the backbone graph on $kr$ vertices. That is, $B^k_r$ has vertex set $[r] \times [k]$, and $\{(i,j),(i',j')\}$ is an edge of $B^k_r$ whenever $|i - i'| \le 1$ and $j \ne j'$. Let $K^k_r \subseteq B^k_r$ be the spanning subgraph of $B^k_r$ that is the disjoint union of $r$ complete graphs on $k$ vertices given by the following components: the complete graph on $\{(i,1),\dots,(i,k)\}$ for each $i \in [r]$.
The lemma for G says that a.a.s. $\Gamma = G(n,p)$ satisfies the following property if $p \gg (\log n/n)^{1/2}$. For any spanning subgraph $G \subseteq \Gamma$ with minimum degree a sufficiently large fraction of $pn$, there exists an $(\varepsilon,d,p)_G$-lower-regular vertex partition $\mathcal{V}$ of $V(G)$ whose reduced graph $R^k_r$ contains a clique factor $K^k_r$ on which the corresponding vertex sets of $\mathcal{V}$ are pairwise $(\varepsilon,d,p)$-super-regular. Furthermore, $(G,\mathcal{V})$ has one-sided and two-sided inheritance with respect to $R^k_r$, and the $\Gamma$-neighbourhoods of all vertices but the ones in the exceptional set of $\mathcal{V}$ have almost exactly their expected size in each cluster. The proof of Lemma 24 is given in Section 4.
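The backbone graph and its clique factor are small enough to build explicitly. The following is a minimal sketch, assuming the edge rule used in the proof overview ($(i,j)$ and $(i',j')$ are adjacent whenever $|i-i'| \le 1$ and $j \ne j'$); the function names `backbone_graph` and `clique_factor` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def backbone_graph(r, k):
    """Vertices and edges of B^k_r on [r] x [k] (1-indexed), under the
    assumed rule: (i, j) ~ (i2, j2) iff |i - i2| <= 1 and j != j2."""
    vertices = [(i, j) for i in range(1, r + 1) for j in range(1, k + 1)]
    edges = set()
    for u, v in combinations(vertices, 2):
        (i, j), (i2, j2) = u, v
        if abs(i - i2) <= 1 and j != j2:
            edges.add(frozenset((u, v)))
    return vertices, edges

def clique_factor(r, k):
    """Edges of K^k_r: one complete graph on {(i, 1), ..., (i, k)} per i."""
    edges = set()
    for i in range(1, r + 1):
        for j, j2 in combinations(range(1, k + 1), 2):
            edges.add(frozenset(((i, j), (i, j2))))
    return edges

vertices, edges = backbone_graph(r=4, k=3)
factor = clique_factor(r=4, k=3)
```

With these conventions the clique factor is a spanning subgraph of the backbone graph, and vertices with the same second coordinate are never adjacent.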
Lemma 24 (Lemma for G). For each $\gamma > 0$ and integers $k \ge 2$ and $r_0 \ge 1$ there exists $d > 0$ such that for every $\varepsilon \in \big(0, \frac{1}{2k}\big)$ there exist $r_1 \ge 1$ and $C^* > 0$ such that the following holds a.a.s. for $\Gamma = G(n,p)$ if p ≥ C * (log n/n) kr, and such that the following is true.
After Lemma 24 has constructed a lower-regular partition V of V (G), the second main lemma deals with the graph H that we would like to find as a subgraph of G. More precisely, Lemma 25 provides a homomorphism f from the graph H to the reduced graph R k r given by Lemma 24 which has among others the following properties. The edges of H are mapped to the edges of R k r , and the vast majority of the edges of H are assigned to edges of the clique factor K k r ⊆ R k r . The number of vertices of H mapped to a vertex of R k r only differs slightly from the size of the corresponding cluster of V. The lemma further guarantees that each of the first √ βn vertices of the bandwidth ordering of V (H) is mapped to (1, j) with j being the colour that the vertex has received by the given colouring of H. In case H is D-degenerate the next lemma also ensures that for every (i, j) ∈ [r] × [k], a constant fraction of vertices mapped to (i, j) have each at most 2D neighbours.
Lemma 25 (Lemma for H). Given $D, k, r \ge 1$ and $\xi, \beta > 0$ the following holds if $\xi \le 1/(kr)$ and $\beta \le 10^{-10}\xi^2/(Dk^4r)$. Let $H$ be a $D$-degenerate graph on $n$ vertices, let $L$ be a labelling of its vertex set of bandwidth at most $\beta n$ and let $\sigma : V(H) \to \{0,\dots,k\}$ be a proper $(k+1)$-colouring that is $(10/\xi,\beta)$-zero-free with respect to $L$, where the colour zero does not appear in the first $\sqrt{\beta}n$ vertices of $L$. Furthermore, let $R^k_r$ be a graph on vertex set Lemma 8]. The proof of [13, Lemma 8] is deterministic; here we use a probabilistic argument to show the existence of a function $f$ that also satisfies the additional property (H 6), which is required for Theorem 7. However, we still borrow ideas from the proof of [13, Lemma 8]. The proof of Lemma 25 will be given in Section 5.
During the pre-embedding, we embed a vertex $x$ of $H$ onto a vertex $v$ of $V_0$, and we also embed its neighbours $N_H(x)$. This creates restrictions on the vertices of $G$ to which we can embed the second neighbours, and for application of Lemma 15 we need certain conditions to be satisfied. The next lemma states that we can find vertices in $N_G(v)$, to which we will embed $N_H(x)$, satisfying these conditions.
Lemma 26 (Common neighbourhood lemma). For each $d > 0$, $k \ge 2$, and $\Delta \ge 2$ there exists $\alpha > 0$ such that for every $\varepsilon^* \in (0,1)$ there exists $\varepsilon_0 > 0$ such that for every $r \ge 1$ and every $0 < \varepsilon \le \varepsilon_0$ there exists $C^* > 0$ such that the following is true. If $p \ge C^*(\log n/n)^{1/\Delta}$, then $\Gamma = G(n,p)$ a.a.s. satisfies the following. Let $G = (V,E)$ be a (not necessarily spanning) subgraph of $\Gamma$ and $\{V_i\}_{i\in[k]} \cup \{W\}$ a vertex partition of a subset of $V$ such that the following is true for
Let $H'$ and $G'$ denote the subgraphs of $H$ and $G$ that result from removing all vertices that were used in the pre-embedding process. As a last step before finally applying the sparse blow-up lemma, the clusters in V G ′ need to be adjusted to the sizes of W i,j H ′ . The next lemma states that this is possible, and that after this redistribution the regularity properties needed for Lemma 15 still hold.

The lemma for G
In this section we prove the Lemma for G (Lemma 24), which borrows from the proof of [12,Proposition 17] and from the proof of [10, Lemma 9]. Our strategy is as follows. We first apply Lemma 12 to obtain an equitable partition of V (G) within whose reduced graph we can find a backbone graph by Theorem 1. We let Z 1 be the vertices whose Γ-degrees are 'wrong' to this partition, or whose neighbourhoods fail to inherit lower-regularity (plus a few extra to maintain k-equitability), and we remove the vertices Z 1 . Now there may be some vertices in each cluster which destroy super-regularity on the clique factor of the backbone graph. We redistribute these, and the exceptional set of the regular partition, to other clusters. Now we would like to say we are finished, but the moving of vertices may have destroyed some of the regularity inheritance, Γ-neighbourhood, and super-regularity properties we tried to obtain. However, it is easy to check that a vertex only witnesses failure of these properties if exceptionally many of its Γ-neighbours were moved from or to a cluster. We let Z 2 be the set of all such vertices, and remove them. We will see that Z 2 is so small that its removal does not significantly affect the properties we want, so that we can set V 0 = Z 1 ∪ Z 2 and we are done.
Given $G \subseteq \Gamma$ with $\delta(G) \ge \big(\frac{k-1}{k}+\gamma\big)pn$, we apply Lemma 12, with input $\frac{1}{k}\varepsilon^*$, $\frac{k-1}{k}+\gamma$, $r'_0+k$, and $d$, to $G$. We may do this because $G$ is a subgraph of $\Gamma$, and by choice of $C^*$ we have $Cp^{-1}\log n \le \varepsilon^*\frac{n}{kr_1}$, so that the condition of Lemma 12 is satisfied because the good event of Proposition 18 holds for $\Gamma$. The result is a $\big(\frac{1}{k}\varepsilon^*, p\big)$-lower-regular partition of $V(G)$ into $t' \in [r'_0+k, r_1]$ equally sized clusters, with exceptional set of size at most $\frac{1}{k}\varepsilon^* n$, whose $(\varepsilon^*,d,p)$-reduced graph has minimum degree at least $\big(\frac{k-1}{k}+\gamma-d-\frac{1}{k}\varepsilon^*\big)t'$. We remove at most $k-1$ of these clusters to the exceptional set, obtaining an $(\varepsilon^*,p)$-lower-regular partition $\mathcal{U}$ of $V(G)$ into $kr$ equally sized clusters, where $r'_0 \le kr \le r_1$, with exceptional set $U_0$ of size at most $\varepsilon^* n$, whose $(\varepsilon^*,d,p)$-reduced graph $R^k_r$ has minimum degree at least $\big(\frac{k-1}{k}+\gamma-d-\frac{1}{k}\varepsilon^*\big)kr - k$. By choice of $d$ and $\varepsilon^*$, and by choice of $r'_0$, we have $\delta(R^k_r) \ge \big(\frac{k-1}{k}+\frac{\gamma}{2}\big)kr$. The graph $B^k_r$ has bandwidth at most $2k < \beta r'_0$, and maximum degree less than $3k$. Thus Theorem 1, with input $\frac{\gamma}{2}$, $3k$, and $k$, in particular states that $R^k_r$ contains a copy of $B^k_r$. We fix one such copy. We let its vertices $\{(i,j)\}_{i\in[r],j\in[k]}$ label the vertices of $R^k_r$, and similarly let the cluster of $\mathcal{U}$ corresponding to the vertex $(i,j)$ of $B^k_r$ be $U_{i,j}$ for each $i \in [r]$ and $j \in [k]$. The partition $\mathcal{U}$ is equitable, and thus in particular $k$-equitable.
We now create Z 1 as follows. We start with all vertices v of G for which there are (i, j) Finally we add a minimum number of vertices to obtain k-equitability of the sets U i,j \ Z 1 i∈[r],j∈ [k] . Note that we have |U i,j | ≥ n/(2kr 1 ) for each i, j, and we can estimate the number of vertices with more than 2ε * pn neighbours in U 0 by considering a superset of U 0 of size ε * n. It follows that for each i, j we have log(en/|U i,j |), log(en/|U 0 |) ≤ log(ekr 1 /ε * ). By Lemma 16 and Lemma 17, and Proposition 18, we have where the factor k accounts for vertices removed to maintain k-equitability.
We now try to obtain super-regularity on the copy of K k , and a minimum number of additional vertices from V (G)\ Z 1 to obtain k-equitability of the sets U i,j \ (Z 1 ∪W ) i∈[r],j∈ [k] . By construction, we have |W | ≤ ε * n + kr · kε * n kr ≤ 2kε * n. Given any w ∈ W , because w ∈ Z 1 we have . Now let us consider the edges of G leaving w. At most 2ε * pn of these go to U 0 , and by definition at most 2dpn go to sets Note that the sets V ′ i,j and V ′ i,j ′ differ in size by at most one for any i ∈ [r] and j, j ′ ∈ [k], by our construction of the assignment c. We apply Proposition 18 to estimate the number of vertices This gives Adding up (1) and (2), we conclude is by construction k-equitable, and the graph R k r has minimum degree k−1 k + γ 2 kr as desired. For each i ∈ [r] and j ∈ [k] we have |U i,j | = (1±ε * ) n kr , and so (3) and our choice of ε * give (G 1). Next, if {(i, j), (i ′ , j ′ )} is an edge of R k r , then G is (ε * , d, p)-lower-regular on (U i,j , U i ′ ,j ′ ) by construction. By (3), Proposition 11, and our choice of ε * , G is (ε, d, p)-lower-regular on (3) and (4) we have (3) and (4), Proposition 11 and our choice of ε * , we conclude (G 3).
Finally, (G 4) follows from (5) and our choice of ε * . Note that if we alter the definition of Z 1 , removing the condition on , then we do not need to use Lemma 17 and the bound in (1) improves to |Z 1 | ≤ 8k 2 r 3 1 Cp −1 /ε * . Thus, if we only require (G 3'), we obtain |V 0 | ≤ C * p −1 as claimed.

The lemma for H
In this section we present the proof of Lemma 25. First let us state McDiarmid's Inequality (see e.g. [23] for a proof) that we will use in the proof.
Then, for any $\varepsilon > 0$, we have
The proof idea is then as follows. First, given the zero-free labelling $L$ and $(k+1)$-colouring $\sigma$ of $H$, we split $L$ into the blocks of the definition of zero-freeness. We partition the blocks into $r$ 'sections' of consecutive blocks, such that the $i$-th section contains about $\sum_{j\in[k]} m_{i,j}$ vertices, and furthermore such that the 'boundary vertices', namely the first and last $\beta n$ vertices of each section, do not receive colour zero. Now it is easy to check that assigning the vertices of colour $j$ in the $i$-th section to $(i,j)$ for each $i \in [r]$ and $j \in [k]$, and the vertices of colour zero in the $i$-th section to $z_i$, is a graph homomorphism. However it can be very unbalanced, since different colours in $[k]$ may be used with very different frequencies in each section. To fix this, we replace $\sigma$ with a new colouring $\sigma'$, which we obtain as follows. We partition each section into 'intervals' of consecutive blocks, and for each interval except the last in each section, we pick a random permutation of $[k]$. We will show that there is a colouring $\sigma'$ such that all but the first few vertices of each interval are coloured according to the permutation applied to $\sigma$, with vertices of colour zero staying coloured zero. We use this colouring $\sigma'$ in place of $\sigma$ to define the mapping $f$. We let $X$ consist of all vertices whose distance is two or less to either boundary vertices, vertices near the start of an interval, or colour zero vertices.
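The first step of this proof idea, assigning a vertex of colour $j$ in the $i$-th section to $(i,j)$, can be checked on a toy example. The sketch below is illustrative only: it assumes a colouring without colour zero (so the vertices $z_i$ play no role), sections of equal length at least the bandwidth, and the backbone edge rule $|i-i'| \le 1$, $j \ne j'$. All function names are hypothetical.

```python
def bandwidth(edges, position):
    """Bandwidth of a labelling: largest label difference along an edge."""
    return max(abs(position[u] - position[v]) for u, v in edges)

def section_assignment(order, colour, section_len):
    """Send the vertex at position t (0-indexed) with colour j to (section, j),
    where sections are consecutive runs of section_len positions."""
    return {x: (t // section_len + 1, colour[x]) for t, x in enumerate(order)}

def is_backbone_homomorphism(edges, f):
    """Every edge must land on a backbone edge: adjacent or equal sections,
    and different colours."""
    return all(abs(f[u][0] - f[v][0]) <= 1 and f[u][1] != f[v][1]
               for u, v in edges)

# Toy H: the square of a path on 12 vertices, labelled in its natural order.
n, k = 12, 3
order = list(range(n))
edges = [(x, y) for x in range(n) for y in (x + 1, x + 2) if y < n]
colour = {x: x % k + 1 for x in order}          # proper 3-colouring
position = {x: t for t, x in enumerate(order)}
f = section_assignment(order, colour, section_len=4)
```

Because the bandwidth (2) is smaller than the section length (4), every edge stays within one section or joins two consecutive sections, which is why the assignment respects the backbone structure.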
To complete the proof, we show that so few vertices receive colour zero that they do not much affect the desired conclusions. Now the mapping f is in expectation balanced, and using Lemma 28 we can show that it is also with high probability close to balanced. It is also easy to check that, since H is D-degenerate, in the i-th section of L there are many vertices of degree at most 2D. In expectation these are distributed about evenly over the (i, j) j∈[k] by f , and again McDiarmid's inequality shows that with high probability the same holds. These two observations give us (H 1) and (H 6), while the other four desired conclusions hold by construction.
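The claim that random colour permutations balance the assignment in expectation can be verified exactly on small data. The following sketch is an illustration under simplifying assumptions, not the construction of the proof: it ignores colour zero and the switching blocks, and treats each interval as a vector of colour counts to which an independent uniformly random permutation of $[k]$ is applied. The names `class_sizes` and `expected_class_sizes` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
import random

def class_sizes(interval_counts, perms, k):
    """interval_counts[l][j-1] is the number of colour-j vertices in interval l;
    perms[l] sends colour j to perms[l][j-1]. Returns the new class sizes."""
    sizes = [0] * k
    for counts, perm in zip(interval_counts, perms):
        for j in range(k):
            sizes[perm[j] - 1] += counts[j]
    return sizes

def expected_class_sizes(interval_counts, k):
    """Exact expectation over independent uniform permutations, computed by
    averaging each interval over all k! permutations (linearity of expectation)."""
    all_perms = list(permutations(range(1, k + 1)))
    total = [Fraction(0)] * k
    for counts in interval_counts:
        for perm in all_perms:
            for j in range(k):
                total[perm[j] - 1] += Fraction(counts[j], len(all_perms))
    return total

k = 3
interval_counts = [[9, 2, 1], [8, 3, 1], [10, 1, 1]]   # very unbalanced colours
identity = [(1, 2, 3)] * len(interval_counts)
rng = random.Random(0)
random_perms = [tuple(rng.sample(range(1, k + 1), k)) for _ in interval_counts]
unpermuted = class_sizes(interval_counts, identity, k)
permuted = class_sizes(interval_counts, random_perms, k)
expected = expected_class_sizes(interval_counts, k)
```

In expectation each colour class receives a $1/k$ fraction of every interval, whatever the original colour frequencies were; a concentration inequality is then what turns this into a statement that holds with high probability.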
Proof of Lemma 25. For given D ≥ 1, set α = 1/(24D). Let k, r ≥ 1 and ξ, β > 0 be given, where ξ ≤ 1/(kr) and β ≤ 10 −10 ξ 2 /(Dk 4 r). Let H and K k r ⊆ B k r ⊆ R k r be graphs as in the statement of the lemma. Let L be the given labelling of V (H) of bandwidth at most βn. We denote the set of the first √ βn vertices of L by F . Let σ : V (H) → {0, . . . k} be the given proper (k + 1)-colouring of V (H) that is (10/ξ, β)-zero-free with respect to L and such that σ(F ) ⊆ [k]. Also, let z 1 , . . . , z n be vertices such that be the given k-equitable integer partition of n with n/(10kr) ≤ m i,j ≤ 10n/(kr) for every i ∈ [r] and j ∈ [k].
Let us now introduce the notation that we use in this proof. Recall that for every t ∈ 1/(4kβ) the i-th block is defined as Next we split the labelling L into r sections, where the first and the last block of each section are zero-free. Each section is partitioned into intervals, each of which but possibly the last one consists of b blocks.
Since σ is (10/ξ, β)-zero-free with respect to L, we can choose indices This means by the choice of the indices t 0 , . . . , t r that the first and last block of each section are zero-free. Since The last βn vertices of the blocks B ti and the first βn vertices of the blocks B ti+1 are called boundary vertices of H. Notice that colour zero is never assigned to boundary vertices by σ.
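For intuition, zero-freeness can be tested mechanically. The sketch below assumes the property in the form used later in the counting argument, namely that among any $z$ consecutive blocks at most one contains a vertex of colour zero; the block length is passed in as a parameter rather than fixed, and the function names are hypothetical.

```python
def blocks_with_zero(colours_in_order, block_len):
    """For each block of block_len consecutive positions of the labelling,
    report whether some vertex in it has colour 0."""
    return [0 in colours_in_order[s:s + block_len]
            for s in range(0, len(colours_in_order), block_len)]

def is_zero_free(colours_in_order, block_len, z):
    """Assumed form of zero-freeness: any z consecutive blocks contain at most
    one block that uses colour 0 (shorter windows at the end are also checked)."""
    has_zero = blocks_with_zero(colours_in_order, block_len)
    return all(sum(has_zero[t:t + z]) <= 1 for t in range(len(has_zero)))

# 24 vertices in labelling order, blocks of length 4; colour 0 appears in
# the second and the sixth block only.
colours = [1, 2, 3, 1,  2, 0, 1, 2,  3, 1, 2, 3,
           1, 2, 3, 1,  2, 3, 1, 2,  3, 0, 1, 2]
flags = blocks_with_zero(colours, block_len=4)
```

Here the colouring passes the test for $z = 3$ but fails for $z = 5$, since the second and sixth blocks then fall into a common window.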
Using Equation (7), b = k/ √ β, and n/(10kr) ≤ m i,j ≤ 10n/(kr) we get, for every i ∈ [r], the following bounds on s i 1 We denote the intervals of the i-th section by I i,1 , . . . , I i,si . Let B sw i,ℓ denote the union of the first two blocks of each interval I i,ℓ . All of these blocks but B sw i,1 and B sw i,si will be used to switch colours within parts of H. Notice that we have |B sw i,ℓ | = 8kβn and, since σ is (10/ξ, β)-zero-free with respect to L, at least one of the two blocks of B sw i,ℓ is zero-free. We will not use B sw i,1 and B sw i,si to switch colours because we will need that the boundary vertices do not receive colour zero. For every i ∈ [r] and every ℓ ∈ {2, . . . , s i −1}, we choose a permutation π i,ℓ : [k] → [k] uniformly at random.
The next claim ensures that we can use zero-free blocks to obtain a proper colouring of the vertex set such that vertices before the switching block are coloured according to the original colouring and the colours of the vertices after the switching block are permuted as wished. A proof can be found in [13].
We use Claim 1 to switch colours at the beginning of each interval except for the first and last interval of each section. More precisely, we switch colours within the sets $B^{sw}_{i,\ell}$ so that the colouring of the remaining vertices in the interval $I_{i,\ell}$ matches $\pi_{i,\ell}$. Note that we can indeed use $B^{sw}_{i,\ell}$ to do the switching since one of the two blocks in $B^{sw}_{i,\ell}$ is zero-free. In particular, we get a proper $(k+1)$-colouring σ ′ = σ ′ π 1,2 , . . . , π r,sr −1 : V (H) → {0, . . . k + 1} of $H$ that fulfils the following. For every $x \in I_{1,1}$ we have for each $i \in [r]$ and $\ell \in \{2,\dots,s_i-1\}$ and every $x \in I_{i,\ell} \setminus B^{sw}_{i,\ell}$ we have that and for each $i \in [r]$ and every $x \in I_{i,s_i} \cup I_{i+1,1}$ (where $I_{r+1,1} := \emptyset$) we have that While $\sigma'$ is well-defined on the sets $B^{sw}_{1,2},\dots,B^{sw}_{r,s_r-1}$ by Claim 1, the definition on these sets is rather complicated as it depends on which of the two blocks in $B^{sw}_{i,\ell}$ is zero-free and on the colourings before and after the switching. However, the precise definition on these sets is not important for the remainder of the proof. Hence, we omit it here. Observe that $\sigma'$ never assigns colour zero to boundary vertices.
Using σ ′ we now define f = f π 1,2 , . . . , π r,sr−1 : is the vertex defined in the statement of the lemma. Let X consist of all vertices at distance two or less from a boundary vertex of L, from a vertex in any B sw i,ℓ , or from a colour zero vertex. We now show that f and X satisfy Properties (H 2)-(H 5) with probability 1 and Properties (H 1) and (H 6) with high probability. In particular, this implies that the desired f and X exist.
We start with Property (H 1). For each i ∈ [r] let be the set of all vertices in S i except for the first and last interval and the first two blocks of each interval of S i . We will also make use of the following restricted function The basic idea of the proof of Property (H 1) is to determine bounds on |f * −1 (i, j)| that hold with positive probability and then deduce the desired bounds on |f −1 (i, j)|. Since the permutations π i,ℓ were chosen uniformly at random, we have by definition of f * that the expected number of vertices mapped to (i, j) {x ∈ S * ι : σ(x) = 0 and z ι = (i, j)} .
On the other hand, $\le m_{i,j} + \xi n$, which shows that Property (H 1) holds with positive probability.
By definition of $X$, since $L$ is a $\beta n$-bandwidth ordering, any vertex in $X$ is at distance at most $2\beta n$ in $L$ from a boundary vertex, a vertex of some $B^{sw}_{i,\ell}$, or from a vertex assigned colour zero. Because there are $r$ sections, the boundary vertices form $r-1$ intervals each of length $2\beta n$, and so at most $6r\beta n$ vertices of $H$ are at distance 2 or less from a boundary vertex. There are $\sum_{i\in[r]} s_i$ intervals and hence $\sum_{i\in[r]} s_i$ switching blocks each of size $8k\beta n$. As $s_i \le 10/(rk^2\sqrt{\beta})$ for every $i \in [r]$, there are at most $(4+8k)\beta n \cdot 10/(k^2\sqrt{\beta})$ vertices at distance 2 or less from a vertex of some switching block. Similarly, because $L$ is $(10/\xi,\beta)$-zero-free, in any consecutive $10/\xi$ blocks at most one contains vertices of colour zero, and hence at most $(8+4k)\beta n$ vertices in any such $10/\xi$ consecutive blocks are at distance 2 or less from a vertex of colour zero. Thus we have
$$|X| \le 6r\beta n + (4+8k)\beta n \cdot \frac{10}{k^2\sqrt{\beta}} + (8+4k)\beta n\left(\frac{n}{4k\beta n \cdot 10/\xi} + 1\right) \le 6r\beta n + \tfrac{1}{4}\xi n + \tfrac{1}{3}\xi n \le \xi n,$$
which gives (H 2).
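The three-term bound on $|X|$ can be sanity-checked numerically. The sketch below evaluates each term divided by $n$ for one sample choice of parameters satisfying $\xi \le 1/(kr)$ and $\beta \le 10^{-10}\xi^2/(Dk^4r)$; the concrete values $k=3$, $r=5$, $D=2$ are assumptions made for illustration, and the function name is hypothetical.

```python
from math import sqrt

def x_bound_terms(k, r, xi, beta):
    """The three terms of the bound on |X|, each divided by n:
    boundary vertices, switching blocks, and colour-zero vertices."""
    boundary = 6 * r * beta
    switching = (4 + 8 * k) * beta * 10 / (k ** 2 * sqrt(beta))
    # n / (4*k*beta*n * 10/xi) simplifies to xi / (40*k*beta)
    zero = (8 + 4 * k) * beta * (xi / (40 * k * beta) + 1)
    return boundary, switching, zero

k, r, D = 3, 5, 2
xi = 1 / (k * r)
beta = 1e-10 * xi ** 2 / (D * k ** 4 * r)
boundary, switching, zero = x_bound_terms(k, r, xi, beta)
total = boundary + switching + zero
```

For these values the colour-zero term dominates (it is roughly $\xi/6$), while the other two terms are negligible because $\beta$ is tiny compared with $\xi$.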
Since $\sigma'$ is a proper colouring, and boundary vertices are not adjacent to colour zero vertices, by definition, $f$ restricted to the boundary vertices is a graph homomorphism to $B^k_r$. On the other hand, on each section $S_i$, again since $\sigma'$ is a proper colouring and since $\{(i,j)\}_{j\in[k]} \cup \{z_i\}$ forms a clique in $R^k_r$, $f$ is a graph homomorphism to $R^k_r$. Since $L$ is a $\beta n$-bandwidth ordering, any edge of $H$ is either contained in a section or goes between two boundary vertices, and we conclude that $f$ is a graph homomorphism from $H$ to $R^k_r$, giving (H 3). Now, given $i \in [r]$ and $j \in [k]$, and $x \in f^{-1}(i,j) \setminus X$, if $\{x,y\}$ and $\{y,z\}$ are edges of $H$, then $y$ and $z$ are at distance two or less from $x$ in $H$. In particular, by definition of $X$ neither $y$ nor $z$ is a boundary vertex, in any $B^{sw}_{i,\ell}$, or assigned colour zero. Since boundary vertices appear in intervals of length $2\beta n$ in $L$, and $L$ is a $\beta n$-bandwidth ordering, it follows that $y$ and $z$ are both in $S_i$. Furthermore, suppose $x \in I_{i,\ell}$ for some $\ell$. By definition $x \notin B^{sw}_{i,\ell}$. Because $B^{sw}_{i,\ell}$ and $B^{sw}_{i,\ell+1}$ (if the latter exists) are intervals of length $8k\beta n$, both $y$ and $z$ are also in $I_{i,\ell} \setminus B^{sw}_{i,\ell}$, and in particular both $y$ and $z$ are in $\bigcup_{j'\in[k]} f^{-1}(i,j')$, giving (H 4).
Finally, we show that Property (H 6) holds with positive probability. Let i ∈ [r] and j ∈ [k]. We define the random variable E i,j := |{x ∈ f * −1 (i, j) : deg(x) ≤ 2D}|. Since H is D-degenerate and L is a labelling of bandwidth at most βn we have By applying Chernoff's Inequality (Theorem 19) and using Equations (7) and (8) as well as α = 1/(24D) we get with positive probability Taking the union bound over all i ∈ [r] and j ∈ [k] yields that Property (H 6) holds with positive probability.

The common neighbourhood lemma
In order to prove Lemma 26 we need the following version of the Sparse Regularity Lemma, allowing for a partition equitably refining an initial partition with parts of very different sizes.
Lemma 29. For each ε > 0 and s ∈ N there exists t 1 ≥ 1 such that the following holds. Given any graph G, suppose The proof is standard, following Scott's method [37]. We defer it to Appendix A. To prove Lemma 26, we work as follows. First, we choose a regularity parameter ε * * 0 and apply Lemma 29 with ε * * 0 and the initial partition V 1 \ W, . . . , V k \ W, W . From this partition, all we need is a part W ′ ⊆ W and parts , p)-lower-regular, which we find by averaging. We now choose our vertices w 1 , . . . , w ∆ sequentially (in Claim 2), such that the desired (W 1)-(W 4) hold for all subsets of the so far chosen vertices at each stage. This is in spirit very much like the usual dense case 'Key Lemma' sequential embedding of vertices using regularity, but in the sparse setting here we need to work somewhat harder and use the regularity inheritance lemmas to show that we can choose vertices which give us lower-regular pairs for future embedding (rather than this being automatic from the slicing lemma, as it is in the dense case).
Thus, the proof mainly amounts to showing that the number of vertices which break one of the desired properties and which we therefore cannot choose is always much smaller than $|W'|$. In order to show this for (W 1) we need to maintain some extra properties, specifically sizes of $G$- and $\Gamma$-neighbourhoods of chosen vertices within each $V'_i$, and that these $\Gamma$-neighbourhoods of chosen vertices in each $V'_i$ form lower-regular pairs with $W'$. Note that the way we choose our various regularity parameters amounts to ensuring that, even after $\Delta - 1$ successive applications of regularity inheritance lemmas, we still have sufficient regularity for our argument. Furthermore, it is important to note that the choice of $\varepsilon^{**}_0$ does not have anything to do with $\varepsilon^*$ or $\varepsilon_0$; rather it affects only the returned value of $\alpha$.
Next, given $\varepsilon^* > 0$, let $\varepsilon^*_{\Delta-1,\Delta-1} := \varepsilon^*$, and let $\varepsilon^*_{j,\Delta} = \varepsilon^*_{\Delta,j} = 1$ for each $1 \le j \le \Delta$. For each $(j,j') \in [\Delta]^2 \setminus \{(1,1)\}$ in lexicographic order sequentially, we choose $\varepsilon^*_{\Delta-j,\Delta-j'} \le \min\{\varepsilon^*_{\Delta-j+1,\Delta-j'}, \varepsilon^*_{\Delta-j,\Delta-j'+1}, \varepsilon^*_{\Delta-j+1,\Delta-j'+1}\}$ not larger than the $\varepsilon_0$ returned by Lemma 16 for both input $\varepsilon^*_{\Delta-j+1,\Delta-j'}$ and $d$, and for input $\varepsilon^*_{\Delta-j,\Delta-j'+1}$ and $d$, and not larger than the $\varepsilon_0$ returned by Lemma 17 for input $\varepsilon^*_{\Delta-j+1,\Delta-j'+1}$ and $d$.
Given $p \ge C^*\big(\frac{\log n}{n}\big)^{1/\Delta}$, a.a.s. the good events of each of the above calls to Lemmas 16 and 17, and to Proposition 18 and Lemma 29, occur. We condition from now on upon these events occurring for $\Gamma = G(n,p)$.
and W satisfy the conditions of the lemma. We first apply Lemma 29, with the promised input parameters ε * * 0 and s = k + 1, We can do this because Cp −1 log n < 10 −10 ε 4 pn k 4 r 4 , so that the good event of Proposition 18 guarantees that the conditions of Lemma 29 are satisfied. This returns a partition refining each set of {V i \ W } i∈[k] ∪ {W } into 1 ≤ t ≤ t 1 clusters together with a small exceptional set. Let W ′ ⊆ W be a cluster which is in at most 2kε * * 0 t pairs with clusters in V 1 ∪ · · · ∪ V k \ W which are not (ε * * 0 , p) G -lower-regular. Such a cluster exists by averaging. By Proposition 18 and (V 1), at most 4(k + 1)ε * * 0 p 4n r |W ′ | edges lie in the pairs between W ′ and the V i which are not lower-regular, and by Proposition 18 and (V 3) at most 2p|W ||W ′ | < ε * * 0 p n r |W ′ | edges leaving W ′ lie in W . By (V 4), for each i ∈ [k] each w ∈ W ′ has at least dp|V i | neighbours in V i , and hence there are at least dp 2 |V i ||W ′ | edges from W ′ to V i \ W which lie in (ε * * 0 , p) G -lower-regular pairs. By averaging, for each i ∈ [k] there exists a cluster V ′ i of the partition such that (W ′ , V ′ i ) is (ε * * 0 , d/2, p) G -lower-regular. For the remainder of the proof, we will only need these k + 1 clusters from the partition.
Notice that for every i ∈ [k] we have both by the choice of C * and p.

Moving on to (L 2), let Λ ⊆ [ℓ] and i ∈ [k] be given. We have
By choice of ε 0 and ε * * |Λ| , we thus have , p G -lower-regular, and thus the number of vertices w ∈ W ′ such that and by choice of ε 0 and p, in particular j∈Λ N Γ (w j , V ′ i ) ≥ Cp −1 log n. Since the good event of Proposition 18 occurs, the number of vertices w ∈ W ′ such that Summing over the choices of Λ ⊆ [ℓ] and of i ∈ [k], we conclude that at most 2 ∆+1 kCp −1 log n vertices of W ′ violate (L 4). Since n ≥ |V i | ≥ |V ′ i |, the same calculation shows that a further at most 2 ∆+1 kCp −1 log n vertices of W ′ violate (L 5), and at most 2 ∆+1 kCp −1 log n vertices of W ′ violate (L 3).
Finally, we come to (L 6). Suppose we are given Λ, Λ ′ ⊆ [ℓ] and distinct i, i ′ ∈ [k]. Suppose that |Λ| ≤ ∆−2 and |Λ ′ | ≤ ∆−1. We wish to show that for most vertices w ∈ W ′ , the pair N Γ (w, By (L 5), and by choice of ε 0 , C and p, we have By (L 6), the pair Since the good event of Lemma 16 with input ε * |Λ|+1,|Λ ′ | and d occurs, there are at most Cp −1 log n ver- Observe that if ∆ = 2 the property (L 6) does not require this pair to be lower-regular. Summing over the choices of Λ, Λ ′ ⊆ [ℓ] and i, i ′ ∈ [k], we conclude that if ∆ = 2 then at most 2 2∆ k 2 Cp −1 log n vertices w of W ′ cause (L 6) to fail, while if ∆ ≥ 3, at most 2 2∆ k 2 C(p −1 + p −2 ) log n vertices w of W ′ violate (L 6). Summing up, if ∆ = 2 then at most 2 ∆ k 2 Cp −1 log n + 2 ∆ kε * * ∆ |W ′ | + 3 · 2 ∆+1 kCp −1 log n + 2 2∆ k 2 Cp −1 log n vertices w of W ′ cannot be chosen as w ℓ+1 . By choice of C * and ε * * ∆ , and by choice of p, this is at most 1 2 |W ′ |, so that there exists a vertex of W ′ which can be chosen as w ℓ+1 , as desired. If on the other hand ∆ ≥ 3, then at most vertices of W ′ cannot be chosen as w ℓ+1 . Again by choice of C * , ε * * ∆ and p, this is at most 1 2 |W ′ |, and again we therefore can choose w ℓ+1 satisfying (L 1)-(L 6) as desired.
Finally, let us argue why the lemma is a consequence of Claim 2. Let $(w_1,\dots,w_\Delta) \in (W')^\Delta$ be a tuple satisfying (L 1)-(L 6). By (L 2), for any $\Lambda \subseteq [\ell]$ and $i \in [k]$ we have , with each $n_{i,j}$ close to $|V_{i,j}|$, and with $\sum_{i,j} n_{i,j} = \sum_{i,j} |V_{i,j}|$. Our aim is to find a partition of $V(G)$ with parts $(V'_{i,j})_{i\in[r],j\in[k]}$ such that $|V'_{i,j}| = n_{i,j}$ for each $i, j$. This partition is required to maintain similar regularity properties as the original partition, while not substantially changing common neighbourhoods of vertices.
There are two steps to our proof. In a first step, we correct global imbalance, that is, we find a partition $\tilde{\mathcal{V}}$ which maintains all the desired properties and which has the property that $\sum_{i\in[r]} |\tilde{V}_{i,j}| = \sum_{i\in[r]} n_{i,j}$ for each $j \in [k]$. To do this, we identify some $j^*$ such that $\sum_i |V_{i,j^*}| > \sum_i n_{i,j^*}$ and $j'$ such that $\sum_i |V_{i,j'}| < \sum_i n_{i,j'}$. We move $\sum_i \big(|V_{i,j^*}| - n_{i,j^*}\big)$ vertices from $V_{1,j^*}$ to some cluster $V_{i',j'}$, maintaining the desired properties, and repeat this procedure until no global imbalance remains.
In a second step, we correct local imbalance, that is, for each $i = 1,\dots,r-1$ sequentially, and for each $j \in [k]$, we move vertices between $\tilde{V}_{i,j}$ and $\tilde{V}_{i+1,j}$, maintaining the desired properties, to obtain the partition $\mathcal{V}'$ such that $|V'_{i,j}| = n_{i,j}$ for each $i, j$. Observe that because $\tilde{\mathcal{V}}$ is globally balanced, once we know $|V'_{i,j}| = n_{i,j}$ for each $i \in [r-1]$ and each $j \in [k]$ we are guaranteed that $|V'_{r,j}| = n_{r,j}$ for each $j \in [k]$. The proof of the lemma then comes down to showing that we can move vertices and maintain the desired properties. Because we start with a partition in which $|V_{i,j}|$ is very close to $n_{i,j}$ for each $i$ and $j$, the total number of vertices we move in any step is at most the sum of the differences, which is much smaller than any $n_{i,j}$. The following lemma shows that we can move any small (compared to all $n_{i,j}$) number of vertices from one part to another and maintain the desired properties.
Lemma 30. For all integers $k, r_1, \Delta \ge 1$, and reals $d > 0$ and $0 < \varepsilon < 1/(2k)$ as well as $0 < \xi < 1/(100kr_1^3)$, there exists $C^* > 0$ such that the following holds for all sufficiently large $n$. Let $\Gamma$ be a graph on vertex set $[n]$, and let $G$ be a not necessarily spanning subgraph. Let $X, Z_1, \dots, Z_{k-1} \subseteq V(G)$ be pairwise disjoint subsets, each of size at least $n/(16kr_1)$, such that $(X, Z_i)$ is $(\varepsilon,d,p)_G$-lower-regular for each $i$. Then for each $1 \le m \le 2r_1^2\xi n$, there exists a set $S$ of $m$ vertices of $X$ with the following properties.
We now prove the balancing lemma.
First stage (global imbalance): We use the following algorithm.

Algorithm 1: Global balancing
In each step where we select $S$, we make use of Lemma 30 to do so, with input $k$, $r_1$, $\Delta$, $d$, and $\varepsilon/4$, with $X = V_{1,j^*}$ and with the $Z_1, \dots, Z_{k-1}$ being the $V_{i',j''}$ with $j'' \ne j'$.
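The size bookkeeping of the global balancing stage can be sketched as follows. This is inferred from the surrounding description rather than taken from Algorithm 1 itself: it tracks only cluster sizes, replaces the selection of $S$ via Lemma 30 by a plain transfer of the right number of vertices, and uses a hypothetical stand-in for the choice of $i'$ (the first row other than the first that has not received vertices yet), ignoring the adjacency and regularity requirements.

```python
def global_balance(sizes, targets):
    """sizes[i][j] and targets[i][j] are cluster sizes and target sizes
    (0-indexed). Assumes equal grand totals and more rows than columns.
    Returns new sizes whose column totals match those of targets."""
    r, k = len(sizes), len(sizes[0])
    sizes = [row[:] for row in sizes]
    used_rows = set()
    rounds = 0

    def excess(j):
        return sum(sizes[i][j] - targets[i][j] for i in range(r))

    while any(excess(j) != 0 for j in range(k)):
        j_star = max(range(k), key=excess)                    # largest surplus
        j_prime = next(j for j in range(k) if excess(j) < 0)  # some deficit
        m = excess(j_star)
        # Stand-in for the choice of i': a row not touched so far.
        i_prime = next(i for i in range(1, r) if i not in used_rows)
        sizes[0][j_star] -= m          # remove m vertices from V_{1,j*}
        sizes[i_prime][j_prime] += m   # add them to V_{i',j'}
        used_rows.add(i_prime)
        rounds += 1
    return sizes, rounds

targets = [[10, 10, 10] for _ in range(4)]
sizes = [[11, 9, 10], [11, 10, 10], [11, 9, 9], [10, 10, 10]]
balanced, rounds = global_balance(sizes, targets)
column_totals = [sum(row[j] for row in balanced) for j in range(3)]
```

Each round removes the entire surplus of the chosen column, so that column is never selected again and the loop runs at most $k$ times, matching the counting in the text.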
We claim that the algorithm completes successfully, in other words that each of the choices is possible and that Lemma 30 is always applicable. In each While loop, since $\sum_{i,j} \big(|V_{i,j}| - n_{i,j}\big) = 0$ and since the While condition is satisfied, $j^*$ satisfies $\sum_{i\in[r]} \big(|V_{i,j^*}| - n_{i,j^*}\big) > 0$.
Observe that the While loop is run at most $k$ times, since at the end of the While loop in which we selected some $j = j^*$ we have $\sum_{i\in[r]} \big(|V_{i,j^*}| - n_{i,j^*}\big) = 0$ and therefore we do not select $j$ as either $j^*$ or $j'$ in future iterations. It follows that the number of $V_{i,j}$ flagged as changed never exceeds $2k$. Now the set $V_{1,j^*}$ has degree at least $\big(k - 1 + \frac{\gamma k}{2}\big)r$ in $R^k_r$, and so there are at least $\gamma kr/2$ indices $i \in [r]$ such that $V_{1,j^*}$ is adjacent to each $V_{i,j}$ in $R^k_r$. Since $\gamma kr/2 > 3k$, in particular we can choose $i'$ such that $V_{1,j^*}$ is adjacent to each $V_{i',j}$ in $R^k_r$ and no $V_{i',j}$ is flagged as changed. It follows that each pair $(V_{1,j^*}, V_{i',j})$ is $\big(\frac{\varepsilon}{4}, d, p\big)_G$-lower-regular and thus it is possible to choose $i'$. It is possible to choose $j'$ since the While condition holds. Finally, we need to show that Lemma 30 is always applicable with the given parameters. In each application, the sets denoted $X, Z_1, \dots, Z_{k-1}$ are parts of the partition $\mathcal{V}$ (so they were not changed by the algorithm yet). It follows that each set has size at least $n/(8kr) > n/(16kr_1)$. Since $\mathcal{V}$ is $\big(\frac{\varepsilon}{4}, d, p\big)$-lower-regular on $B^k_r$, the pairs $(X, Z_1), \dots, (X, Z_{k-1})$ are $\big(\frac{\varepsilon}{4}, d, p\big)$-lower-regular as required. Finally, by choice of $j^*$ we see that the sizes of the sets $S$ we select in each step are decreasing, so it is enough to show that in the first step we have $|S| \le r\xi n$, which follows from (B 1). Thus Lemma 30 is applicable in each step, and we conclude that the algorithm indeed completes. We denote the resulting vertex
Claim 3. We have the following properties.
Observe that vertices were removed from or added to each $V_{i,j}$ to form $\tilde{V}_{i,j}$ at most once in the running of Algorithm 1, and the number of vertices added or removed was at most $r\xi n$. Since $|V_{i,j}|$ satisfies (B 1), we conclude that (P 1) holds. Furthermore, the vertices added to or removed from $V_{i,j}$ satisfy (SM 2) and therefore (P 3) holds.
Since each set V i,j has size at least n/(8kr), we can apply Proposition 11 with µ = ν = 8kr 2 ξ to each edge of R k r , concluding that V is ε 2 , d, p G -lower-regular on R k r since ε 4 + 4 8kr 2 ξ < ε 2 . Now for any i ∈ [r] and j ∈ [k], consider v ∈ V i,j . If v ∈ V i,j , then we applied Lemma 30 to select v, and when we did so no V i,j ′ was flagged as changed by Algorithm 1. Thus by (SM 1) we have where the final inequality follows by choice of n sufficiently large and since We conclude that V is ε 2 , d, p -super-regular on K k r , giving (P 2).

Second stage (local imbalance):
We use the following algorithm to correct the local imbalances in V.

Algorithm 2: Local balancing
Again, in each step when we select $S$ we use Lemma 30 to do so. If we select $S$ from $V_{i,j}$, then we use input $k$, $r_1$, $d$, $3\varepsilon/4$ and $\xi$, with $X = V_{i,j}$ and the sets $Z_1, \dots, Z_{k-1}$ being $V_{i+1,j'}$ for $j' \neq j$. If on the other hand we select $S$ from $V_{i+1,j}$, then we use input $k$, $r_1$, $d$ and $3\varepsilon/4$, with $X = V_{i+1,j}$ and the sets $Z_1, \dots, Z_{k-1}$ being $V_{i,j'}$ for $j' \neq j$.
We claim that Lemma 30 is always applicable. To see this, observe first that the number of vertices which we move between any $V_{i,j}$ and $V_{i+1,j}$ in a given step is, by (P 1), bounded by $2k^2r^2\xi n$. We change any given $V_{i,j}$ at most twice in the running of the algorithm, so that in total at most $4k^2r^2\xi n$ vertices are changed. In particular, we maintain $|V_{i,j}| \ge n/(16kr_1)$ throughout, and, by Proposition 11, with input $\mu = \nu = \frac{4r^2\xi n}{n/(16kr_1)} < 100r_1^3k\xi$, and using (P 2), we maintain the property that any pair in $R^k_r$, and in particular any pair in $B^k_r$, is $(\frac{3\varepsilon}{4}, d, p)$-lower-regular throughout. This shows that Lemma 30 is always applicable, and therefore the algorithm completes and returns a partition $\mathcal{V}'$. We claim that this is the desired partition. We need to check that (B 1')-(B 5') hold.
Since j for all i and j, giving (B 1'). For the first part of (B 3'), we have justified that we maintain 3ε 4 , d, p G -lower-regularity on R k r throughout the algorithm. For the second part, we need to show that for each i ∈ [r] and We change V i,j ′ at most twice to obtain V ′ i,j ′ , both times by adding or removing vertices satisfying (SM 2). As in the proof of Claim (P 3) above, using (B 4) and (P 3) we obtain deg If v ∈ V i,j , then it was added to the set V i,j by Algorithm 2, and V i,j ′ was changed at most twice thereafter. Again, using (SM 1), (SM 2), (B 4) and (P 3) we obtain deg i,j | as desired. Now (B 2') holds since the total number of vertices moved in Algorithm 1 is at most k 2 rξn, in Algorithm 2 at most 4k 2 r 2 ξn vertices are changed in each cluster, and by choice of ξ. To see that (B 4') holds, observe that by (B 4), (P 3) and (SM 2) we have where the final inequality follows by choice of p and of n sufficiently large. Using (B 3), we can apply Proposition 11, with µ = ν = ε 2 50 , to deduce (B 4'). For (B 5'), observe that for any given i ∈ [r] and j ∈ [k] we change V i,j at most twice in the running of Algorithm 2, both times either adding or removing a set satisfying (SM 2). By (P 3) and choice of ξ, we conclude that (B 5') holds.
Finally, suppose that for any two disjoint vertex sets A, A ′ ⊆ V (Γ) with |A|, |A ′ | ≥

The Bandwidth Theorem in random graphs
Before embarking on the proof, we first recall from the proof overview (Section 3.1) the main ideas. Given G, we first use the lemma for G (Lemma 24) to find a lower-regular partition of V (G), with an extremely small exceptional set V 0 , and whose reduced graph R k r contains a spanning backbone graph B k r , on whose subgraph K k r the graph G is super-regular and has one- and two-sided inheritance. Given this, and H together with a (z, β)-zero-free (k + 1)-colouring, we use the lemma for H (Lemma 25) to find a homomorphism f from V (H) to R k r almost all of whose edges are mapped to K k r and in which approximately the 'right' number of vertices of H are mapped to each vertex of R k r . At this point, if V 0 were empty, and if the 'approximately' were exact, we would apply the sparse blow-up lemma (Lemma 15) to obtain an embedding of H into G.
Our first aim is to deal with V 0 . We do this one vertex at a time. Given v ∈ V 0 , we choose from the first βn vertices of the supplied bandwidth order L a vertex x ∈ V (H) which is not in any triangle. We embed x to v. We then embed the neighbours of x to carefully chosen neighbours of v, which we obtain using Lemma 26. Here we use the fact that N H (x) is independent. This then fixes a clique of K k r to which N 2 H (x) must be assigned, and gives image restrictions in the corresponding parts of the lower-regular partition for these vertices. Since N 2 H (x) may have been assigned by f to some quite different clique in K k r , we have to adjust f to match. This we can do using the fact, which follows from our assumptions on L, that x is far from vertices of colour zero. Now the idea is simply to repeat the above procedure, choosing vertices of V (H) to pre-embed which are widely separated in H, until we have pre-embedded vertices to all of V 0 . We end up with a homomorphism f * from what remains of V (H) to R k r . It is easy to check that this homomorphism still maps about the right number of vertices of H to each vertex of R k r , simply because V 0 is small. We now apply the Balancing Lemma (Lemma 27) to correct the sizes of the clusters to match f * , and complete the embedding of H using the Sparse Blow-up Lemma (Lemma 15).
There are two difficulties with this idea, the 'subtleties' mentioned in the proof overview (Section 3.1). First, if ∆ = 2 we might have |V 0 | ≫ pn, so that we should be worried that at some stage of the pre-embedding we choose v ∈ V 0 and discover that most or all of its neighbours have already been used as images in the pre-embedding. It turns out to be easy to resolve this: we choose each v ∈ V 0 not arbitrarily, but by taking those which have the fewest available neighbours first. We will show that this is enough to avoid the problem.
More seriously, because we perform the pre-embedding sequentially, we might use up a significant fraction of N G (w) for some w ∈ V (G) in the pre-embedding, destroying super-regularity of G on K k r , or we might use up a significant fraction of some common neighbourhood which defines an image restriction for the sparse blow-up lemma. In order to avoid this, before we begin the pre-embedding we fix a set S ⊆ V (G) whose size is a very small constant times n, chosen using Lemma 21 to not have a large intersection with any N G (w) or with any Γ-common neighbourhood of at most ∆ vertices of Γ (which could define an image restriction). We perform the pre-embedding as outlined above, except that we choose our neighbours of each v within S. This procedure is guaranteed not to use up neighbourhood sets guaranteed by super-regularity or image restriction sets, since these sets are all contained in V \ V 0 and even using up all of S would not be enough to do damage.
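The selection rule described in the last two paragraphs (process the exceptional vertex with the fewest still-available neighbours first, and take its neighbours only from the reserved set S) can be illustrated with a small sketch. This is not Algorithm 3 itself: the function names are hypothetical, vertices are assumed to be sortable labels, and the line `chosen = free[:need]` is only a placeholder for the careful choice made via Lemma 26 in the actual proof.

```python
from typing import Dict, Hashable, List, Set, Tuple

Vertex = Hashable  # labels are assumed sortable (e.g. ints) for tie-breaking


def available(adj: Dict[Vertex, Set[Vertex]], v: Vertex,
              reserve: Set[Vertex], used: Set[Vertex]) -> Set[Vertex]:
    """Neighbours of v inside the reserved set S that are not yet images."""
    return (adj[v] & reserve) - used


def pre_embedding_order(adj: Dict[Vertex, Set[Vertex]],
                        exceptional: Set[Vertex],
                        reserve: Set[Vertex],
                        need: int) -> List[Tuple[Vertex, List[Vertex]]]:
    """Greedy sketch: repeatedly handle the exceptional vertex with the
    fewest available neighbours in S and reserve `need` of them."""
    left = set(exceptional)
    used: Set[Vertex] = set()
    plan: List[Tuple[Vertex, List[Vertex]]] = []
    while left:
        # Fewest available neighbours first; sorted() only fixes ties.
        v = min(sorted(left),
                key=lambda u: len(available(adj, u, reserve, used)))
        free = sorted(available(adj, v, reserve, used))
        if len(free) < need:
            raise ValueError(f"vertex {v!r} has too few free neighbours")
        chosen = free[:need]  # placeholder for the choice via Lemma 26
        used.update(chosen)
        left.remove(v)
        plan.append((v, chosen))
    return plan
```

In the toy instance used in the test below, handling vertex 1 first (and giving it the neighbours 10 and 11) would leave vertex 0 with no free neighbours, whereas the fewest-first rule succeeds.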
We set $C = 10^{10}k^2r_1^2\varepsilon^{-2}\xi^{-1}\Delta^{2r_1+20}\mu^{-\Delta}C^*$, and $z = 10/\xi$. Given $p \ge C(\log n/n)^{1/\Delta}$, a.a.s. $G(n,p)$ satisfies the good events of Lemma 15, Lemma 24 and Lemma 26, and Proposition 18, with the stated inputs. Suppose that $\Gamma = G(n,p)$ satisfies these good events. Suppose $G \subseteq \Gamma$ is any spanning subgraph with $\delta(G) \ge (\frac{k-1}{k}+\gamma)pn$. Let $H$ be a graph on $n$ vertices with $\Delta(H) \le \Delta$, and let $L$ be a labelling of the vertex set $V(H)$, of bandwidth at most $\beta n$, such that the first $\beta n$ vertices of $L$ include $Cp^{-2}$ vertices that are not contained in any triangles of $H$, and such that there exists a $(k+1)$-colouring that is $(z,\beta)$-zero-free with respect to $L$, and the colour zero is not assigned to the first $\sqrt{\beta}n$ vertices. Applying Lemma 24 to $G$, with input $\gamma$, $k$, $r_0$ and $\varepsilon$, we obtain an integer $r$ with $10\gamma^{-1} \le kr \le$
Given $i \in [r]$, because $\delta(R^k_r) > (k-1)r$, there exists $v \in V(R^k_r)$ adjacent to each $(i,j)$ with $j \in [k]$. This, together with our assumptions on $H$, allows us to apply Lemma 25 to $H$, with input $D$, $k$, $r$, $\frac{1}{10}\xi$ and $\beta$, and with $m_{i,j} := |V_{i,j}| + \frac{1}{kr}|V_0|$ for each $i \in [r]$ and $j \in [k]$, choosing the rounding such that the $m_{i,j}$ form a $k$-equitable integer partition of $n$. Since $\Delta(H) \le \Delta$, in particular $H$ is $\Delta$-degenerate. Let $f : V(H) \to [r] \times [k]$ be the mapping returned by Lemma 25, let $W_{i,j} := f^{-1}(i,j)$, and let $X \subseteq V(H)$ be the set of special vertices returned by Lemma 25. For every $i \in [r]$ and $j \in [k]$ we have for every $x \in f^{-1}(i,j) \setminus X$ and $xy, yz \in E(H)$, and (H 5a) $f(x) = (1, \sigma(x))$ for every $x$ in the first $\sqrt{\beta}n$ vertices of $L$. Lemma 25 actually gives a little more, which we do not require for this proof. We let $F$ be the first $\beta n$ vertices of $L$. By definition of $L$, in $F$ there are at least $Cp^{-2}$ vertices whose neighbourhood in $H$ is independent.
Next, we apply Lemma 21, with input $\varepsilon\mu^2$ and $\Delta$, to choose a set $S \subseteq V(G)$ of size $\mu n$. We let the $T_i$ of Lemma 21 be all sets which are common neighbourhoods in $\Gamma$ of at most $\Delta$ vertices of $\Gamma$, together with the sets $V_{i,j}$ for $i \in [r]$ and $j \in [k]$. The result of Lemma 21 is that for any $1 \le \ell \le \Delta$ and vertices $u_1, \dots, u_\ell$ of $V(G)$, we have where we use the fact $p \ge C(\log n/n)^{1/\Delta}$ and choice of $C$ to deduce $C^*\log n < \varepsilon\mu p^{\Delta}n$.
Our next task is to create the pre-embedding that covers the vertices of $V_0$. We use the following algorithm, starting with $\phi_0$ the empty partial embedding. Suppose this algorithm does not fail, terminating with $t = t^*$. The final $\phi_{t^*}$ is an embedding of some vertices of $H$ into $V(G)$ which covers $V_0$ and is contained in $V_0 \cup S$. Before we specify how exactly we choose vertices at line 2, we justify that the algorithm does not fail. In other words, we need to justify that at every time $t$ there are vertices of $F$ whose neighbourhood is independent and which are not close to any vertices in $\mathrm{dom}(\phi_t)$, and that at every time $t$, the set $N_G(v) \cap S \setminus \mathrm{im}(\phi_t)$ is big. For the first, observe that since $|V_0| \le C^*p^{-2}$, we have $|\mathrm{dom}(\phi_t)| \le C^*\Delta p^{-2}$ at every step. Thus the number of vertices at distance less than $2r+20$ from $\mathrm{dom}(\phi_t)$ is at most $(1 + \Delta + \dots + \Delta^{2r+19})\,C^*\Delta p^{-2} < 2C^*\Delta^{2r+20}p^{-2}$, which by choice of $C$ is smaller than the number of vertices in $F$ with $N_H(x)$ independent. For the second part, suppose that at some time $t$ we pick a vertex $v$ such that $|N_G(v) \cap S \setminus \mathrm{im}(\phi_t)| < \frac14\mu pn$. $\frac{3}{10}\mu pn$, yet at each of these times $v$ is not picked, so that the vertex picked at each time has at most as many uncovered neighbours in $S$ as $v$. Let $Z$ be the set of vertices chosen at line 1 in each of these time steps. Then for each $z \in Z$ we have $|N_G(z) \cap S \setminus \mathrm{im}(\phi_t)| \le \frac{3}{10}\mu pn$. But since $\delta(G) > \frac12 pn$, by (16) and choice of $\varepsilon$ we have $|N_G(z) \cap S| \ge \frac25\mu pn$, so $|N_G(z) \cap \mathrm{im}(\phi_t)| \ge \frac{1}{10}\mu pn$ for each $z \in Z$. By choice of $C$, we have $|Z| = \frac{1}{100(\Delta+1)}\mu pn \ge C^*p^{-1}\log n$. Since $|\mathrm{im}(\phi)| \le (\Delta+1)|V_0| \le \frac{1}{100}\mu n$, by choice of $C$, this contradicts the good event of Proposition 18.
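The geometric-sum estimate used in the first part of this argument can be written out as follows. This is only a worked version of the inequality in the text, assuming $\Delta \ge 2$; the bound $|\mathrm{dom}(\phi_t)| \le C^*\Delta p^{-2}$ is taken from the paragraph above.

```latex
% Assumption: \Delta \ge 2, so that \Delta/(\Delta-1) \le 2.
\begin{align*}
\sum_{i=0}^{2r+19} \Delta^{i}
  &= \frac{\Delta^{2r+20}-1}{\Delta-1}
   < \frac{\Delta}{\Delta-1}\,\Delta^{2r+19}
   \le 2\Delta^{2r+19}, \\
\Bigl(\sum_{i=0}^{2r+19} \Delta^{i}\Bigr)\, C^{*}\Delta p^{-2}
  &< 2\Delta^{2r+19}\cdot C^{*}\Delta p^{-2}
   = 2C^{*}\Delta^{2r+20}p^{-2}.
\end{align*}
```

Since $r \le r_1$ and the constant $C$ contains the factor $\Delta^{2r_1+20}C^*$, the right-hand side is below $Cp^{-2}$ provided the remaining factors of $C$ are at least $2$, which is how the comparison with the number of suitable vertices in $F$ appears to be intended.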
We have justified that Algorithm 3 completes, and indeed that at each time we reach line 2 there are at least $\frac14\mu pn$ vertices of $N_G(v) \cap S \setminus \mathrm{im}(\phi)$ to choose from. In order to specify how to choose these vertices, we need the following claim.
Proof. First let $Y'$ be obtained from $Y$ by removing all vertices $y \in Y$ such that either $|N_\Gamma(y, V_0)| \ge \varepsilon pn$, or for some $i \in [r]$ and $j \in [k]$ we have $|N_\Gamma(y, V_{i,j})| \neq (1 \pm \varepsilon)p|V_{i,j}|$. Because the good event of Proposition 18 occurs, the total number of vertices removed is at most $2krC^*p^{-1}\log n < \frac12|Y|$, where the inequality is by choice of $C$. Now given any $y \in Y'$, if for each $i \in [r]$ there is $j \in [k]$ such that $|N_G(y, V_{i,j})| < dp|V_{i,j}|$, then, since the $\{V_{i,j}\}$ are $k$-equitable, we have $|N_G(y)| \le \varepsilon pn + dpn + (1+\varepsilon)\frac{k-1}{k}pn + r < (\frac{k-1}{k}+\gamma)pn$, a contradiction. We conclude that for each $y \in Y'$ there exists $i \in [r]$ such that $|N_G(y, V_{i,j})| \ge dp|V_{i,j}|$ for each $j \in [k]$. We let $W$ be the vertices of $Y'$ giving a majority choice of $i$.
Now at each time $t$, in line 2 of Algorithm 3, we choose the vertices $w_1, \dots, w_\ell$ as follows. Let $i_t \in [r]$ be an index, and $W \subseteq Y$ be a set of size $\frac{1}{8r}\mu pn$, such that $|N_G(w, V_{i_t,j})| \ge dp|V_{i_t,j}|$ for each $w \in W$ and $j \in [k]$, whose existence is guaranteed by Claim 4. By construction, and by our choice of $\mu$, we can apply Lemma 26 with input $d$, $k$, $\Delta$, $\varepsilon^*$, $r$ and $\varepsilon$, with the clusters $\{V_{i_t,j}\}_{j\in[k]}$ as the $\{V_i\}_{i\in[k]}$, and inputting a subset of $W$ of size $10^{-10}\frac{\varepsilon^4 pn}{k^4r^4}$ as required for (V 3). This last is possible by choice of $\mu$. To verify the conditions of Lemma 26, observe that (V 1) follows from (G 1a), (V 2) from (G 2a), and (V 4) from Claim 4. We obtain a $\Delta$-tuple of vertices in $W$ satisfying (W 1)-(W 4). We let $w_1, \dots, w_\ell$ be the first $\ell$ vertices of this tuple.
Let $H' = H - \mathrm{dom}(\phi_{t^*})$. We next define image restricting vertex sets and create an updated homomorphism $f^*$ : . Now, since the vertices $\{x_t\}_{t\in[t^*]}$ are by construction at pairwise distance at least $2r+20$, in particular for each $y \in V(H')$ with $J_y \neq \emptyset$ the vertex $y$ is at distance two from one $x_t$, and at distance greater than $r+10$ from all others. Let $j \in [k]$ be such that $f(y) = (1,j)$. Then we set $f^*(y) := (i_t, j)$. Next, for each $t \in [t^*]$ and each $z \in V(H)$ at distance $3, \dots, i_t+1$ from $x_t$, we set $f^*(z)$ as follows. Recall that $f(z) = (1,j)$ for some $j \in [k]$. We set $f^*(z) = (i_t + 2 - \mathrm{dist}(x_t, z), j)$. Because the $\{x_t\}$ are at pairwise distance at least $2r+20$, no vertex is at distance $r+5$ or less from any two $x_t$ and $x_{t'}$, so that $f^*$ is well-defined. Because $R^k_r$ contains $B^k_r$, the $f^*$ we constructed so far is a graph homomorphism. Furthermore, for each $x_t$ the vertices $z$ at distance $i_t+1$ from $x_t$ are in the first $\sqrt{\beta}n$ vertices of $L$, and so by (H 5a) satisfy $f^*(z) = f(z)$. We complete the construction of $f^*$ by setting $f^*(z) = f(z)$ for each remaining $z \in V(H) \setminus \mathrm{dom}(\phi_{t^*})$. Because $f$ is a graph homomorphism, $f^*$ is also a graph homomorphism whose domain is $V(H')$. For each $i \in [r]$ and $j \in [k]$, let $W'_{i,j}$ be the set of vertices $w \in V(H')$ with $f^*(w) \in V_{i,j}$, and let $X'$ consist of $X$ together with all vertices of $H'$ at distance $r+10$ or less from some $x_t$ with $t \in [t^*]$. The total number of vertices $z \in V(H)$ at distance at most $r+10$ from some $x_t$ is at most $2\Delta^{r+10}|V_0| < \frac{1}{100}\xi n$. Since $W_{i,j} \triangle W'_{i,j}$ contains only such vertices, we have
With Theorem 23 in hand, we can now present the proof of Theorem 6.
Proof of Theorem 6. Given $\gamma$, $\Delta$, and $k$, let $\beta > 0$, $z > 0$, and $C > 0$ be returned by Theorem 23 with input $\gamma$, $\Delta$, and $k$. Set $\beta^* := \beta/2$ and $C^* := C/\beta$. Let $H$ be a $k$-colourable graph on $n$ vertices with $\Delta(H) \le \Delta$ such that there exists a set $W$ of at least $C^*p^{-2}$ vertices in $V(H)$ that are not contained in any triangles of $H$ and such that there exists a labelling $L$ of its vertex set of bandwidth at most $\beta^*n$. By the choice of $C^*$ we find an interval $I \subseteq L$ of length $\beta n$ containing a subset $F \subseteq W$ with $|F| = Cp^{-2}$. Now we can rearrange the labelling $L$ to a labelling $L'$ of bandwidth at most $2\beta^*n = \beta n$ such that $F$ is contained in the first $\beta n$ vertices in $L'$. Then, by Theorem 23 we know that $\Gamma = G(n,p)$ satisfies the following a.a.s. if $p \ge C(\log n/n)^{1/\Delta}$ and in particular if $p \ge C^*(\log n/n)^{1/\Delta}$. If $G$ is a spanning subgraph of $\Gamma$ with $\delta(G) \ge (\frac{k-1}{k}+\gamma)pn$, then $G$ contains a copy of $H$, which finishes the proof.
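The proof above does not spell out how the labelling is rearranged. One standard way to do it, given here only as an illustrative sketch with hypothetical function names, is to "fold" the ordering at the midpoint of the interval I: the new ordering alternates between the vertices to the right and to the left of the midpoint. Two vertices on the same side move at most twice as far apart, and a short case check shows the same for vertices on opposite sides, so the bandwidth at most doubles, while the interval I occupies exactly the first |I| positions.

```python
from typing import Iterable, List, Sequence, Tuple


def bandwidth(edges: Iterable[Tuple[int, int]], order: Sequence[int]) -> int:
    """Largest distance in `order` between the endpoints of an edge."""
    pos = {v: i for i, v in enumerate(order)}
    return max((abs(pos[u] - pos[v]) for u, v in edges), default=0)


def fold_labelling(order: Sequence[int], mid: int) -> List[int]:
    """Interleave order[mid], order[mid-1], order[mid+1], order[mid-2], ...

    Once one side is exhausted, the rest of the other side is appended.
    """
    right = list(order[mid:])
    left = list(order[:mid])[::-1]
    folded: List[int] = []
    for i in range(max(len(left), len(right))):
        if i < len(right):
            folded.append(right[i])
        if i < len(left):
            folded.append(left[i])
    return folded


def fold_at_interval(order: Sequence[int], start: int, length: int) -> List[int]:
    """Fold at the midpoint of the interval order[start:start+length]."""
    return fold_labelling(order, start + length // 2)
```

For a path on ten vertices in its natural order (bandwidth 1), folding at the interval of positions 4 to 7 brings those four vertices to the front and yields bandwidth 2.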

Lowering the probability for degenerate graphs
As with Theorem 6, we deduce Theorem 7 from the following more general statement.
Theorem 31. For each $\gamma > 0$, $\Delta \ge 2$, $D \ge 1$ and $k \ge 1$, there exist constants $\beta > 0$, $z > 0$, and $C > 0$ such that the following holds asymptotically almost surely for $\Gamma = G(n,p)$ if $p \ge C(\log n/n)^{1/(2D+1)}$. Let $G$ be a spanning subgraph of $\Gamma$ with $\delta(G) \ge (\frac{k-1}{k}+\gamma)pn$ and let $H$ be a graph on $n$ vertices with $\Delta(H) \le \Delta$ and degeneracy at most $D$, that has a labelling $L$ of its vertex set of bandwidth at most $\beta n$, a $(k+1)$-colouring that is $(z,\beta)$-zero-free with respect to $L$ and where the first $\sqrt{\beta}n$ vertices in $L$ are not given colour zero and the first $\beta n$ vertices in $L$ include $Cp^{-2}$ vertices that are not in any triangles or copies of $C_4$ in $H$. Then $G$ contains a copy of $H$.
The proof of Theorem 31 is quite similar to that of Theorem 23. We provide only a sketch, highlighting the differences. The most important of these are that we do not use Lemma 26 in the pre-embedding, and that we use a version of Lemma 15 whose performance is better for degenerate graphs. In order to state this, we need the following definitions. Given an order $\tau$ on $V(H)$ and a family $\mathcal{J}$ of image restricting vertices, we define $\pi^\tau(x) := |J_x| + |\{y \in N_H(x) : \tau(y) < \tau(x)\}|$. Now the condition on $\tau$ we need for our enhanced blow-up lemma is the following.
To obtain the best probability bound, one should choose $\tau$ to minimise $\tilde{D}$. In the proof of Theorem 31 we will take $\tau$ to be an order witnessing $D$-degeneracy, $W^e$ will contain all image restricted vertices, and we will choose buffer sets containing vertices of degree at most $2D+1$. One can easily check that this allows us to choose $\tilde{D} = 2D+1$.
We choose $\varepsilon = \min\{\varepsilon_0, d, \frac{1}{4D}\varepsilon^*, \frac{1}{2k}\}$. Putting $\varepsilon$ into Lemma 24 returns $r_1$. Next, Lemma 27, for input $k$, $r_1$, $\Delta$, $\gamma$, $d$ and $8\varepsilon$, returns $\xi > 0$. We assume without loss of generality that $\xi \le 1/(10kr_1)$, and set $\beta = 10^{-12}\xi^2/(\Delta k^4r_1^2)$. Let $\mu = \frac{\varepsilon^2}{100000kr}$. Finally, suppose $C^*$ is large enough for each of these lemmas, for Lemma 15, for Proposition 18 with input $\varepsilon$, and for Lemma 21 with input $\varepsilon\mu^2$ and $\Delta$.
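To make the quantity π τ (x) concrete, the following sketch computes a degeneracy order by repeatedly deleting a vertex of minimum degree and then evaluates π τ for every vertex. The function names and the dictionary `restricted` (mapping a vertex x to its set J x) are hypothetical and chosen only for this illustration; the ordering routine is the standard minimum-degree peeling, not a procedure taken from the paper.

```python
from typing import Dict, Hashable, List, Set, Tuple

Vertex = Hashable  # labels are assumed sortable (e.g. ints) for tie-breaking


def degeneracy_order(adj: Dict[Vertex, Set[Vertex]]) -> Tuple[List[Vertex], int]:
    """Return (tau, D): every vertex has at most D neighbours earlier in tau."""
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    removal: List[Vertex] = []
    degeneracy = 0
    while remaining:
        # Peel off a vertex of minimum degree in what is left of the graph.
        v = min(sorted(remaining), key=lambda u: len(remaining[u]))
        degeneracy = max(degeneracy, len(remaining[v]))
        for u in remaining[v]:
            remaining[u].discard(v)
        del remaining[v]
        removal.append(v)
    # Reversing the removal order puts the surviving neighbours first.
    return removal[::-1], degeneracy


def pi_values(adj: Dict[Vertex, Set[Vertex]], tau: List[Vertex],
              restricted: Dict[Vertex, Set[Vertex]]) -> Dict[Vertex, int]:
    """pi(x) = |J_x| + number of neighbours of x that precede x in tau."""
    pos = {v: i for i, v in enumerate(tau)}
    return {
        x: len(restricted.get(x, set()))
           + sum(1 for y in adj[x] if pos[y] < pos[x])
        for x in adj
    }
```

For a star with three leaves the degeneracy is 1, so every vertex has π at most 1 when nothing is image restricted, and a single image restriction on a vertex raises its value by one, matching the bound π ≤ D + 1 used later for image restricted vertices.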
We set $C = 10^{10}k^2r_1^2\varepsilon^{-2}\xi^{-1}\Delta^{2r_1+20}\mu^{-1}C^*$, and $z = 10/\xi$. Given $p \ge C(\log n/n)^{1/(2D+1)}$, a.a.s. $G(n,p)$ satisfies the good events of Lemma 33, Lemma 24, Lemma 16 and Lemma 17, and Proposition 18, with the stated inputs. Suppose that $\Gamma = G(n,p)$ satisfies these good events. Let $G$ be a spanning subgraph of $\Gamma$ with $\delta(G) \ge (\frac{k-1}{k}+\gamma)pn$. Let $H$ be any graph on $n$ vertices with $\Delta(H) \le \Delta$, and let $L$ be a labelling of $V(H)$ of bandwidth at most $\beta n$ whose first $\beta n$ vertices include $Cp^{-2}$ vertices that are not contained in any triangles or four-cycles of $H$, and such that there exists a $(k+1)$-colouring that is $(z,\beta)$-zero-free with respect to $L$, and the colour zero is not assigned to the first $\sqrt{\beta}n$ vertices. Furthermore, let $\tau$ be a $D$-degeneracy order of $V(H)$. Next, as in the proof of Theorem 23, we apply Lemma 24 to $G$, obtaining a partition of $V(G)$ with the properties (G 1a)-(G 4a). Note that if $D = 1$, in place of (G 3a) we will ask only for the weaker condition
(G 3') $(N_\Gamma(v, V_{i,j}), V_{i',j'})$ is an $(\varepsilon, d, p)_G$-lower-regular pair for every $(i,j), (i',j') \in E(R^k_r)$ and $v \in V \setminus V_0$,
and thus for $D = 1$ we have $|V_0| \le C^*p^{-1}$, while for $D \ge 2$ we have $|V_0| \le C^*p^{-2}$.
Next, we apply Lemma 25 to obtain a partition of V (H). We use the same inputs as in the proof of Theorem 23, with the exception that D is now given in the statement of Theorem 31 rather than being set equal to ∆. The result is a function f : V (H) → V (R k r ) and a special set X with the same properties (H 1a)-(H 5a), and in addition We now continue following the proof of Theorem 23, using Lemma 21 with input εµ 2 and D + 1 (rather than εµ 2 and ∆), to choose a set S satisfying (16) for each 1 ≤ ℓ ≤ D + 1 and vertices u 1 , . . . , u ℓ of V (G). We use the same pre-embedding Algorithm 3, with the exception that we choose vertices at line 2 differently. As before, given v t+1 ∈ V 0 \ im(φ t ), we use Claim 4 to find a set W ⊆ N G (v t+1 ) of size at least 1 8r µpn and an index i ∈ [r] such that for each w ∈ W we have N G (w, V i,j ) ≥ dp|V i,j | for each j ∈ [k]. However, rather than applying Lemma 26, we let w 1 , . . . , w ℓ be distinct vertices of W which satisfy (G 5a)-(G 8a). We now justify that this is possible. We choose the w 1 , . . . , w ℓ successively. Since x t+1 is not contained in any triangle or four-cycle of H, we have |J x | ≤ 1 for each x ∈ V (H), so that (G 5a) is automatically satisfied. By Proposition 18, (G 6a) and (G 8a) are satisfied for all but at most 2C * kr 1 p −1 log n vertices of W . It remains to show that we can obtain (G 7a), which we do as follows. For s ∈ [ℓ], when we come to choose w s , we insist that for any (i, j), (i ′ , j ′ ) ∈ E(R k r ), the following hold. First, The conditions of respectively Lemma 16, Lemma 17, and Lemma 16 are in each case satisfied (in the last case by choice of w t ) and thus in total at most 3C * k 2 r 2 1 max{p −2 , p −1 log n} vertices of W are prohibited. Since 5C * k 2 r 2 1 max{p −2 , p −1 log n} < |W | 2 < ℓ by choice of C, at each step there is a valid choice of w s . Since for each x ∈ V (H ′ ) we have |J x | ≤ 1, this construction guarantees (G 7a).
We now return to following the proof of Theorem 23. We obtain V ′ by removing the images of pre-embedded vertices, and V ′′ by applying Lemma 27. Note that here (B 5') may be trivial, that is, the error term C * log n may dominate the main term when s is large, but we only require it for s = 1 to obtain (G 1c)-(G 7c).
Finally, we are ready to apply Lemma 33 to complete the embedding. We define $(\mathcal{I}, \mathcal{J})$ as in the proof of Theorem 23. We however let $W_{i,j}$ consist of the vertices of $W'_{i,j} \setminus X$ whose degree is at most $2D$. By (H 6a) there are at least $\frac{1}{100D}|W'_{i,j}|$ of these, so that $W$ is a $(\vartheta, K^k_r)$-buffer, giving (DBUL 1). Now (DBUL 2) follows from (G 2c) and (G 3c). Finally, $(\mathcal{I}, \mathcal{J})$ is a $(\rho, \frac14\alpha, \Delta, \Delta)$-restriction pair, giving (DBUL 3), exactly as in the proof of Theorem 23. However, we now need to give an order $\tau'$ on $V(H')$ and a set $W^e \subseteq V(H')$. The former is simply the restriction of $\tau$ to $V(H')$, and the set $W^e$ consists of all vertices $x \in V(H)$ with $|J_x| > 0$.
The last condition we must verify is (DBUL 4), that τ ′ is a (D, p, εn/r 1 )-bounded order. For any vertex x of H ′ , we have π τ ′ (x) ≤ π τ (x) + 1 ≤ D + 1, and furthermore for all vertices not in W e we have π τ ′ (x) = π τ (x) ≤ D. To verify (ORD 1), first note that by construction the vertices of W have degree at most 2D ≤ $\tilde{D}$. Further, observe that if D = 1 then H ′ contains no triangles, and $\tilde{D}$ = 3 = D + 2. Since vertices in N W are by construction not image restricted, so are not in W e , this is as required for (ORD 1). If on the other hand D ≥ 2 then $\tilde{D}$ ≥ D + 3, and again the conditions of (ORD 1) are met. Next, if x ∈ W e then π τ ′ (x) ≤ D, so that (ORD 2) holds. Finally, observe that max z ∈W e π τ ′ (z) ≤ D, and vertices x ∈ N W by construction have π τ ′ (x) = π τ (x) ≤ D, so that (ORD 3) holds.
We can thus apply Lemma 33 to embed H ′ into G ′ , completing the embedding of H into G as desired.
The proof of Theorem 7 from Theorem 31 follows the deduction of Theorem 6 from Theorem 23, and we omit it.

The Bandwidth Theorem in bijumbled graphs
Again, Theorem 8 is a consequence of the following.
Theorem 34. For each $\gamma > 0$, $\Delta \ge 2$, and $k \ge 1$, there exists a constant $c > 0$ such that the following holds for any $p > 0$. Given $\nu \le cp^{\max(4,(3\Delta+1)/2)}n$, suppose $\Gamma$ is a $(p,\nu)$-bijumbled graph, $G$ is a spanning subgraph of $\Gamma$ with $\delta(G) \ge (\frac{k-1}{k}+\gamma)pn$, and $H$ is a $k$-colourable graph on $n$ vertices with $\Delta(H) \le \Delta$ and bandwidth at most $cn$. Suppose further that $H$ has a labelling $L$ of its vertex set of bandwidth at most $\beta n$, a $(k+1)$-colouring that is $(z,\beta)$-zero-free with respect to $L$, and where the first $\sqrt{\beta}n$ vertices in $L$ are not given colour zero, and the first $\beta n$ vertices in $L$ include $c^{-1}p^{-6}\nu^2n^{-1}$ vertices in $V(H)$ that are not contained in any triangles of $H$. Then $G$ contains a copy of $H$.
The proof of Theorem 34 is a straightforward modification of that of Theorem 23. Rather than repeating the entire proof, we sketch the modifications which have to be made.
Since we are working with bijumbled graphs, we need to work with regular pairs, rather than lower-regular pairs, at all times. In order to use this concept, and to work with bijumbled graphs, we need versions of Lemmas 15, 16, and 17, and Proposition 18, which work with regular pairs and with $\Gamma$ a bijumbled graph rather than a random graph. We also need the following easy proposition, which lower bounds the possible $\nu$ for a $(p,\nu)$-bijumbled graph with $p > 0$.
Proof. Suppose that $\Gamma$ is a $(p,\nu)$-bijumbled graph on $n$ vertices with $p \le \frac12$. If $\Gamma$ contains $\frac12 n$ vertices of degree at least $4pn$, then we have $e(\Gamma) \ge pn^2$, and letting $A, B$ be a maximum cut of $\Gamma$, by bijumbledness we have $\frac12 pn^2 \le e(A,B) \le p|A||B| + \nu\sqrt{|A||B|} \le \frac14 pn^2 + \frac12\nu n$, and thus $\nu \ge pn/2 \ge pn/32$.
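To get a feeling for the bijumbledness condition used in this proof, the following brute-force sketch computes, for a very small graph, the least ν for which the graph is (p, ν)-bijumbled. It assumes the usual definition, namely that |e(X, Y) − p|X||Y|| ≤ ν√(|X||Y|) for all disjoint vertex sets X and Y; the definition is not restated in this section, so this reading and the function name are assumptions. The enumeration is exponential in the number of vertices and is meant for toy examples only.

```python
from itertools import product
from math import sqrt
from typing import Iterable, Tuple


def min_bijumbled_nu(n: int, edges: Iterable[Tuple[int, int]], p: float) -> float:
    """Smallest nu such that the graph on vertices 0..n-1 is (p, nu)-bijumbled,
    checking every pair of disjoint non-empty sets X, Y by brute force."""
    adj = [[False] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = True
    worst = 0.0
    # Label 0: vertex unused, 1: vertex in X, 2: vertex in Y.
    for labels in product((0, 1, 2), repeat=n):
        xs = [v for v in range(n) if labels[v] == 1]
        ys = [v for v in range(n) if labels[v] == 2]
        if not xs or not ys:
            continue
        e_xy = sum(1 for x in xs for y in ys if adj[x][y])
        size = len(xs) * len(ys)
        worst = max(worst, abs(e_xy - p * size) / sqrt(size))
    return worst
```

For the complete graph on four vertices with p = 1 every pair of disjoint sets is hit exactly, so the value is 0; for the empty graph on four vertices with p = 1/2 the worst pair has |X| = |Y| = 2, giving the value 1.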
Let R be a graph on r ≤ r 1 vertices and let R ′ ⊆ R be a spanning subgraph with ∆(R ′ ) ≤ ∆ R ′ . Let H and G ⊆ Γ be graphs given with κ-balanced, size-compatible vertex partitions X = {X i } i∈ [r] and V = {V i } i∈[r] , respectively, which have parts of size at least m ≥ n/(κr 1 ).  ) is an R-partition, and X is an (α, R ′ )-buffer for H, (JBUL 2) (G, V) is an (ε, d, p)-regular R-partition, which is (ε, d, p)-super-regular on R ′ , and has one-sided inheritance on R ′ , and two-sided inheritance on R ′ for X , (JBUL 3) I and J form a (ρp ∆ , ζ, ∆, ∆ J )-restriction pair. Then there is an embedding ψ : V (H) → V (G) such that ψ(x) ∈ I x for each x ∈ H.
There are three differences between this result and Lemma 15. First, we assume a bijumbledness condition on Γ, rather than that Γ is a typical random graph. Second, we require regular pairs in place of lower-regular pairs. Third, the number of vertices we may image restrict is much smaller. We will see that these last two restrictions do not affect our proof substantially.
Next, in [3], the following regularity inheritance lemmas for bijumbled graphs are proved.

Lemma 37 ([3, Lemma 3]).
For each $\varepsilon', d > 0$ there are $\varepsilon, c > 0$ such that for all $0 < p < 1$ the following holds. Let $G \subseteq \Gamma$ be graphs and $X, Y, Z$ be disjoint vertex sets in $V(\Gamma)$. Assume that Then, for all but at most $\varepsilon'|Z|$ vertices $z$ of $Z$, the pair $(N_\Gamma(z) \cap X, Y)$ is $(\varepsilon', d, p)_G$-regular.
The following two lemmas, which more closely resemble Lemmas 16 and 17, are corollaries.
Note that the bijumbledness requirements of this lemma are such that if Y and Z are sets of size Θ(n), then X must have size Ω p −6 ν 2 n −1 . This is where the requirement of Theorem 34 for vertices of H not in triangles comes from.
Finally, we give a bijumbled graphs version of Proposition 18. We defer its proof, which is standard, and similar to that of Proposition 18, to Appendix A.
Proposition 41. For each $\varepsilon > 0$ there exists a constant $C > 0$ such that for every $p > 0$, any graph $\Gamma$ which is $(p,\nu)$-bijumbled has the following property. For any disjoint $X, Y \subseteq V(\Gamma)$ with $|X|, |Y| \ge \varepsilon^{-1}p^{-1}\nu$, we have $e(X,Y) = (1 \pm \varepsilon)p|X||Y|$, and $e(X) \le 2p|X|^2$. Furthermore, for every
Now, using these lemmas, we can prove bijumbled graph versions of Lemmas 24 and 26, and use these to complete the proof of Theorem 34. All these proofs are straightforward modifications of those in the previous sections. Briefly, the modifications we make are to replace 'lower-regular' with 'regular' in all proofs, to replace applications of lemmas for random graphs with the bijumbled graph versions above, and to recalculate some error bounds.
The only one of our main lemmas which changes in an important way is the following Lemma for G.
Lemma 42 (Lemma for G, bijumbled graph version). For each $\gamma > 0$ and integers $k \ge 2$ and $r_0 \ge 1$ there exists $d > 0$ such that for every $\varepsilon \in (0, \frac{1}{2k})$ there exist $r_1 \ge 1$ and $c, C^* > 0$ such that the following holds for any $n$-vertex $(p,\nu)$-bijumbled graph $\Gamma$ with $\nu \le cp^3n$ and $p > 0$. Let $G = (V,E)$ be a spanning subgraph of $\Gamma$ with $\delta(G) \ge (\frac{k-1}{k}+\gamma)pn$. Then there exists an integer kr, and such that the following is true.
The change here, apart from replacing 'lower-regular' with 'regular', and working in bijumbled graphs, is that V 0 may now be a much larger set. Nevertheless, the proof is basically the same.
Sketch proof of Lemma 42. We begin the proof as in that of Lemma 24, setting up the constants in the same way, with the exception that we replace Lemmas 16 and 17 with Lemmas 39 and 40, and Proposition 18 with Proposition 41. We require $C$ to be sufficiently large for Lemmas 39 and 40, and for Proposition 41. We define $C^* = 100k^2r_1^3C/\varepsilon^*$ as in the proof of Lemma 24, and set $c = 10^{-5}(\varepsilon^*)^3(kr_1)^{-3}(C^*)^{-1}$. We now assume $\Gamma$ is $(p,\nu)$-bijumbled rather than random, with $\nu \le cp^3n$. In particular, by choice of $c$ this implies that $10k^2r_1^2Cp^{-2}\nu^2n^{-1} \le \varepsilon^*pn$ and $10k^2r_1^3Cp^{-6}\nu^2n^{-1} \le \varepsilon^*n$. (17)
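The two inequalities at the end of the preceding paragraph follow from $\nu \le cp^3n$ by a direct calculation. The following worked version is a sketch under the additional assumptions $0 < p \le 1$, $0 < \varepsilon^* \le 1$, $C \ge 1$ and $k, r_1 \ge 1$, which are not stated explicitly at this point of the text.

```latex
% Assumptions beyond the text: 0 < p \le 1, 0 < \varepsilon^* \le 1, C \ge 1, k, r_1 \ge 1.
\begin{align*}
\nu \le c p^{3} n
 &\;\Longrightarrow\;
 p^{-6}\nu^{2}n^{-1} \le c^{2} n
 \quad\text{and}\quad
 p^{-2}\nu^{2}n^{-1} \le c^{2}p^{4}n \le c^{2}pn, \\
c &= 10^{-5}(\varepsilon^{*})^{3}(kr_1)^{-3}(C^{*})^{-1}
   = 10^{-7}(\varepsilon^{*})^{4}k^{-5}r_1^{-6}C^{-1}
   \quad\text{since } C^{*} = 100k^{2}r_1^{3}C/\varepsilon^{*}, \\
10k^{2}r_1^{3}C\,c^{2}
 &= 10^{-13}(\varepsilon^{*})^{8}k^{-8}r_1^{-9}C^{-1}
 \le \varepsilon^{*}.
\end{align*}
```

Combining the first line with the last gives $10k^2r_1^3Cp^{-6}\nu^2n^{-1} \le \varepsilon^*n$, and, using $r_1^2 \le r_1^3$, also $10k^2r_1^2Cp^{-2}\nu^2n^{-1} \le \varepsilon^*pn$.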
We obtain a regular partition, with a reduced graph containing $B^k_r$, exactly as in the proof of Lemma 24, using Proposition 41 in place of Proposition 18 to justify the use of Lemma 12. The next place where we need to change things occurs in defining $Z_1$, where we replace 'lower-regular' with 'regular', and in estimating the size of $Z_1$. Using Lemmas 39 and 40, and Proposition 41 in place of Proposition 18, we replace (1) with $|Z_1| \le kr_1^2Cp^{-6}\nu^2n^{-1} + kr_1^2Cp^{-3}\nu^2n^{-1} + 2kr_1Cp^{-2}\nu^2n^{-1} \le 4kr_1^2Cp^{-6}\nu^2n^{-1} \overset{(17)}{\le} \frac{\varepsilon^*}{kr_1}n$. Note that the final conclusion is as in (1).
We can now continue following the proof of Lemma 24 until we come to estimate the size of $Z_2$, where we use Proposition 41 and replace (2) with $|Z_2| \le r_1 + kr_1Cp^{-2}\nu^2n^{-1} \overset{(17)}{\le} \frac{\varepsilon^*}{kr_1}pn$. Again, the final conclusion is as in (2).
The next change we have to make is in estimating the size of $V_0$, where we replace (6) with $|V_0| \le |Z_1| + |Z_2| \le 4kr_1^2Cp^{-6}\nu^2n^{-1} + r_1 + kr_1Cp^{-2}\nu^2n^{-1} \le C^*p^{-6}\nu^2n^{-1}$. Finally, we need regular pairs in (G 2) and (G 3). We obtained regular pairs from Lemma 12 and in the definition of $Z_1$, so that we only need Proposition 11 to return regular pairs. We always apply Proposition 11 to pairs of sets of size at least $\frac{\varepsilon^*pn}{r_1}$, altering them by a factor $\varepsilon^*$. Now Proposition 41 shows that if $X$ and $Y$ are disjoint subsets of $\Gamma$ with $|X|, |Y| \ge (\varepsilon^*p)^{-1}\nu$, then $e_\Gamma(X,Y) \le (1+\varepsilon^*)p|X||Y|$, as required. By choice of $c$, we have $(\varepsilon^*p)^{-1}\nu \le (\varepsilon^*)^2pn/r_1$, so that the condition of Proposition 11 to return regular pairs is satisfied.
The other one of our main lemmas which requires change, Lemma 26, only requires changing 'lower-regular' to 'regular' and replacing the random graph with a bijumbled Γ. This does require some change in the proof, as we then use the bijumbled graph versions of various lemmas, whose error bounds are different.
Lemma 43 (Common neighbourhood lemma, bijumbled graph version). For each $d > 0$, $k \ge 1$, and $\Delta \ge 2$ there exists $\alpha > 0$ such that for every $\varepsilon^* \in (0,1)$ there exists $\varepsilon_0 > 0$ such that for every $r \ge 1$ and every $0 < \varepsilon \le \varepsilon_0$ there exists $c > 0$ such that the following is true. For any $n$-vertex $(p, cp^{\Delta+1}n)$-bijumbled graph $\Gamma$ the following holds. Let $G = (V,E)$ be a (not necessarily spanning) subgraph of $\Gamma$ and $\{V_i\}_{i\in[k]} \cup \{W\}$ a vertex partition of a subset of $V$ such that the following is true for every $i, i' \in [k]$.
Then there exists a tuple (w 1 , . . . , w ∆ ) ∈ W ∆ such that for every Λ, Λ * ⊆ [∆], and every The main modifications we make to the proof of Lemma 26 are to replace Lemmas 16 and 17 with Lemmas 39 and 40, and Proposition 18 with Proposition 41, and to replace all occurrences of 'lower-regular' with 'regular'. We sketch the remaining modifications below.
Sketch proof of Lemma 43. We begin the proof by setting constants as in the proof of Lemma 26, but appealing to Lemmas 39 and 40, and Proposition 41, rather than their random graph equivalents.
In order to apply Lemma 29 to $G$, we need to observe that its condition is satisfied by Proposition 41 and because $\varepsilon^{-1}p^{-1}\nu < 10^{-10}\frac{\varepsilon^4pn}{k^4r^4}$ by choice of $c$. The same inequality justifies further use of Proposition 41 to find the desired $W'$. Estimating the size of $W'$, we replace (12) with $|W'| \ge 10^{-11}\frac{\varepsilon^4pn}{t_1k^4r^4} \ge 10^5Cp^{-2}\nu$, (18) where the final inequality is by choice of $c$.
We only need to change the statement of Claim 2 by replacing 'lower-regular' with 'regular' in (L 1) and (L 6). However, we need to make rather more changes to its inductive proof. The base case remains trivial. In the induction step, we need to replace (13) where the final inequality is by choice of $c$. This, together with $|W'| \ge 10^5Cp^{-2}\nu$ from (18), justifies that we can apply Lemma 39. We obtain that at most $2^{\Delta}k^2Cp^{-3}\nu^2\frac{8krt_1}{n}$ vertices $w$ in $W$ violate (L 1).
The estimate on the number of vertices violating (L 2) does not change. For (L 4), we need to observe that $|\bigcap_{j\in\Lambda}N_\Gamma(w_j, V'_i)| = (1 \pm \varepsilon_0)^{|\Lambda|}p^{|\Lambda|}|V'_i|$, and in particular by choice of $\varepsilon_0$ and $c$ this quantity is at least $Cp^{-1}\nu$. Proposition 41 then gives that at most $2^{\Delta+1}kCp^{-2}\nu^2\frac{8krt_1}{n}$ vertices destroy (L 4), and the same calculation gives the same bound for the number of vertices violating (L 5) and (L 3).
Finally, for (L 6), we need to use the inequality (1 − ε 0 ) ∆−1 p ∆−1 n 4kr ≥ Cp −2 ν, which holds by choice of c, to justify that Lemmas 39 and 40 can be applied as the corresponding random graph versions are in Lemma 26. We obtain quite different bounds from these lemmas, however. If ∆ = 2, then we only use Lemma 39, with an input regular pair having both sets of size at least n 4kr , so that the number of vertices violating (L 6) in this case is at most 2 2∆ k 2 Cp −3 ν 2 4kr n . If ∆ ≥ 3, we use both Lemma 39 and 40. The set playing the rôle of X in Lemma 39 has size at least (1 − ε 0 ) ∆−2 p ∆−2 n 4kr , while we apply Lemma 40 with both sets of the regular pair having at least this size. As a consequence, the number of vertices violating (L 6) is at most 2 2∆+1 k 2 Cp −6 ν 2 (1 − ε 0 ) 2−∆ p 2−∆ 4kr n for the case ∆ ≥ 3.
Putting this together, for the case ∆ = 2 we replace (14) with the following upper bound for the number of vertices w ∈ W ′ which cannot be chosen as w ℓ+1 .
The proof of Theorem 34 is similar to that of Theorem 23. Again, we sketch the modifications.
Sketch proof of Theorem 34. We begin as in the proof of Theorem 23, setting up constants as there, but replacing Lemma 24 with Lemma 42, Lemma 26 with Lemma 43, Lemma 15 with Lemma 36, and Proposition 18 with Proposition 41. In addition to the constants defined in the proof of Theorem 23 we require $0 < c \le 10^{-50}\varepsilon^8\mu\rho\xi^2(\Delta kr_1C)^{-10}$ to be small enough for Lemmas 42 and 43. Now, instead of assuming $\Gamma$ to be a typical random graph, suppose $\nu \le cp^{\max\{4,(3\Delta+1)/2\}}n$, and let $\Gamma$ be an n-vertex $(p, \nu)$-bijumbled graph. By Proposition 35 we have $p \ge C^*\bigl(\frac{\log n}{n}\bigr)^{1/2}$.
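For orientation, the exponent in the bijumbledness assumption can be unpacked as follows. This is only an arithmetic sketch added for illustration; it uses nothing beyond $p \le 1$, the assumption $\nu \le cp^{\max\{4,(3\Delta+1)/2\}}n$ made above, and the $(p, cp^{\Delta+1}n)$-bijumbledness required in Lemma 43.

```latex
\[
  \max\Bigl\{4,\tfrac{3\Delta+1}{2}\Bigr\} =
  \begin{cases}
    4 & \text{if } \Delta = 2 \quad (\text{since } \tfrac{7}{2} < 4),\\[2pt]
    \tfrac{3\Delta+1}{2} & \text{if } \Delta \ge 3 \quad (\text{since } \tfrac{3\Delta+1}{2} \ge 5).
  \end{cases}
\]
% In both cases the exponent is at least \Delta + 1, so for p \le 1:
\[
  \nu \le c\,p^{\max\{4,(3\Delta+1)/2\}}\,n \le c\,p^{\Delta+1}\,n .
\]
```

In particular, a graph satisfying the assumption made here is also $(p, cp^{\Delta+1}n)$-bijumbled, which is the form of the hypothesis used in Lemma 43.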
We continue following the proof of Theorem 23. We now assume the first $\beta n$ vertices of L include $Cp^{-6}\nu^2n^{-1}$ vertices that are not contained in any triangles of H. We appeal to Lemma 42 rather than Lemma 24 to obtain a partition of $V(G)$. This partition has $|V_0| \le C^*p^{-6}\nu^2n^{-1}$ (which is different to the upper bound in the proof of Theorem 23), but still satisfies (G 1a) and (G 4a), and (G 2a) and (G 3a) when 'lower-regular' is replaced by 'regular' in both statements.
The application of Lemma 25 is identical. The application of Lemma 21 is also identical, and the deduction of (16) is still valid by (19). The pre-embedding is also identical, except that we replace each occurrence of $C^*\max\{p^{-2}, p^{-1}\log n\}$ with $C^*p^{-6}\nu^2n^{-1}$, and that we replace the application of Proposition 18, which justifies that at each visit to Line 1 we have at least $\frac14\mu pn$ choices, with an application of Proposition 41. To verify the condition of the latter, and to see that this yields a contradiction, we use the inequality $|Z| \ge \frac{1}{100(\Delta+1)}\mu pn \ge 2C^*p^{-2}\nu^2\frac{8r}{\varepsilon n}$, which holds by choice of c. Moving on, we justify Claim 4 by observing that $\frac{\varepsilon n}{4kr_1} \ge Cp^{-1}\nu$, which allows us to apply Proposition 41 in place of Proposition 18, and that $2krC^*p^{-2}\nu^2\frac{4kr_1}{\varepsilon n} \le \frac{|Y|}{2}$, both inequalities following by choice of c. Now Lemma 43, in place of Lemma 26, finds $w_1, \dots, w_\ell$. Our construction of $f^*$, and its properties, is identical, while Lemma 43 gives (G 1a)-(G 8a), with 'lower-regular' replaced by 'regular' in (G 2a), (G 3a) and (G 7a). The deduction of (G 1b)-(G 8b) is identical, except that we use the 'regular' consequence of Proposition 11. To justify this, observe that each time we apply Proposition 11, we apply it to a regular pair with sets of size at least $(1 - \varepsilon^*)p^{\Delta-1}\frac{n}{4kr}$ by (G 1a) and (G 6a), and we change the set sizes by a factor $(1 \pm 2\mu)$, so that Proposition 41 gives the required condition. To check this in turn, we need to observe that $2\mu(1 - \varepsilon^*)p^{\Delta-1}\frac{n}{4kr} \ge 100\mu^{-1}p^{-1}\nu$, which follows by choice of c. We can thus replace 'lower-regular' with 'regular' in (G 2b), (G 3b) and (G 7b).
Finally, we verify the conditions for Lemma 36. The only point where we have to be careful is with the number of image restricted vertices. The total number of image restricted vertices in $H'$ is at most $\Delta^2|V_0| \le \Delta^2C^*p^{-6}\nu^2n^{-1}$, which by choice of c and by (G 1c) is smaller than $\rho p^\Delta|V_{i,j}|$ for any $i \in [r]$ and $j \in [k]$, justifying that $(I, J)$ is indeed a $(\rho p^\Delta, \frac14\alpha, \Delta, \Delta)$-restriction pair. The remaining conditions of Lemma 36 are verified as in the proof of Theorem 23, and applying it we obtain an embedding $\phi$ of $H'$ into $G \setminus \mathrm{im}(\phi_{t^*})$, so that $\phi \cup \phi_{t^*}$ is the desired embedding of H into G.
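The comparison with $\rho p^\Delta|V_{i,j}|$ can be sketched by substituting the bijumbledness assumption into the bound on $|V_0|$. The following computation is an illustration only: it assumes $p \le 1$ and $\nu \le cp^{\max\{4,(3\Delta+1)/2\}}n$, and it leaves the lower bound on $|V_{i,j}|$ coming from (G 1c) unspecified.

```latex
\[
  \Delta^2 C^* p^{-6}\nu^2 n^{-1}
  \le \Delta^2 C^* c^2\, p^{\,2\max\{4,(3\Delta+1)/2\}-6}\, n
  = \Delta^2 C^* c^2\, p^{\max\{2,\,3\Delta-5\}}\, n .
\]
% \max\{2, 3\Delta-5\} \ge \Delta for every integer \Delta \ge 2:
% \Delta = 2 gives \max\{2,1\} = 2, and 3\Delta - 5 \ge \Delta once \Delta \ge 3.
\[
  \Delta^2 |V_0| \le \Delta^2 C^* c^2\, p^{\Delta}\, n .
\]
```

The right-hand side is of order $c^2p^\Delta n$, so it falls below $\rho p^\Delta|V_{i,j}|$ once c is small compared with the remaining constants, provided the parts $V_{i,j}$ have size linear in n.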
Finally, the deduction of Theorem 34 is essentially the same as that of Theorem 6 from Theorem 23, and we omit it.
11. Concluding remarks
11.1. General spanning subgraphs. Our main theorems place restrictions on the graphs H with respect to whose containment random or pseudorandom graphs have local resilience. As was shown by Huang, Lee and Sudakov [22], such restrictions are necessary. Given $\varepsilon > 0$, if $\Gamma$ is either a typical random graph $G(n, p)$ or a pseudorandom graph with density p, and p is sufficiently small, then one can delete edges from $\Gamma$ in order to remove all triangles at a given vertex v, without deleting more than $\varepsilon pn$ edges at any vertex. Thus if H is any graph all of whose vertices are in triangles and $p = o(1)$, then the local resilience of $\Gamma$ with respect to containment of H is $o(1)$.
This leads to the question: if we instead restrict G, requiring in addition to the conditions of Theorem 6 that G contains a positive proportion of the copies of $K_{\Delta+1}$ in $\Gamma$ at each vertex, is it true that G will contain any k-colourable, bounded degree spanning subgraph H with sublinear bandwidth without further restriction? We study this question in a forthcoming companion note to this paper, together with Schnitzer [1].
11.2. Optimality of Theorem 6. Recall that Huang, Lee and Sudakov [22] proved that the restriction on H that $C^*p^{-2}$ vertices should not be in triangles is necessary for all p. For p constant, they proved a version of Theorem 6, but the number of vertices in H they require to have independent neighbourhood grows as a tower type function of $p^{-1}$, and they also require these vertices to be well-distributed in the bandwidth order, so that our result is strictly stronger than theirs.
On the other hand, we do not believe that the lower bound on p in Theorem 6 is optimal. For $\Delta = 2$, the statement is certainly false for $p \ll n^{-1/2}$, since then $G(n, p)$ has a.a.s. local resilience $o(1)$ with respect to containing even one triangle. It seems likely that the statement is true down to this point, a log factor improvement on our result. For $\Delta = 3$, the statement as written is false for $p \ll n^{-1/3}$. Briefly, the reason for this is that in expectation a vertex is in $O(p^6n^3)$ copies of $K_4$ in $G(n, p)$, and (with some work) this implies that there is a.a.s. a subgraph of $G(n, p)$ with minimum degree very close to pn and $p^{-5}n^{-1}$ vertices not in copies of $K_4$. For $p \ll n^{-1/3}$ we have $p^{-5}n^{-1} \gg p^{-2}$, so that we would also have to insist on many vertices of H not being in copies of $K_4$ to accommodate this. Generalising this, we obtain the following conjecture.
Conjecture 1. For each $\gamma > 0$, $\Delta \ge 2$, and $k \ge 1$, there exist constants $\beta^* > 0$ and $C^* > 0$ such that the following holds asymptotically almost surely for $\Gamma = G(n, p)$ if $p \ge C^*n^{-2/(\Delta+2)}$. Let G be a spanning subgraph of $\Gamma$ with $\delta(G) \ge \bigl(\frac{k-1}{k} + \gamma\bigr)pn$ and let H be a k-colourable graph on n vertices with $\Delta(H) \le \Delta$ and bandwidth at most $\beta^*n$, such that there are at least $C^*p^{-2}$ vertices in $V(H)$ that are not contained in any triangles of H, and at least $C^*p^{-(\Delta+2)(\Delta-1)/2}n^{2-\Delta}$ vertices in $V(H)$ which are not in any copy of $K_{\Delta+1}$. Then G contains a copy of H.
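The exponents in Conjecture 1 can be sanity-checked against the discussion preceding it. The computation below is a sketch added for illustration, not a step towards a proof: it specialises the conjectured quantities to $\Delta = 2, 3$, ignores the constant $C^*$, and records when the required number of vertices outside copies of $K_{\Delta+1}$ is at most n.

```latex
% Specialising the bound on the number of vertices not in K_{\Delta+1}:
\[
  p^{-(\Delta+2)(\Delta-1)/2}\, n^{2-\Delta} =
  \begin{cases}
    p^{-2} & \text{if } \Delta = 2,\\
    p^{-5} n^{-1} & \text{if } \Delta = 3,
  \end{cases}
  \qquad
  n^{-2/(\Delta+2)} =
  \begin{cases}
    n^{-1/2} & \text{if } \Delta = 2,\\
    n^{-2/5} & \text{if } \Delta = 3.
  \end{cases}
\]
% The requirement can only be met by an n-vertex graph H if the count is at most n:
\[
  p^{-(\Delta+2)(\Delta-1)/2}\, n^{2-\Delta} \le n
  \iff p^{(\Delta+2)(\Delta-1)/2} \ge n^{1-\Delta}
  \iff p \ge n^{-2/(\Delta+2)} .
\]
```

For $\Delta = 2$ the two conditions on H coincide, and for $\Delta = 3$ the count is the $p^{-5}n^{-1}$ from the $K_4$ discussion. Up to constants, the conjectured lower bound on p is exactly the point below which no n-vertex graph H could satisfy the condition on vertices outside copies of $K_{\Delta+1}$.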
This conjecture seems to be hopelessly out of reach with our current state of knowledge. We cannot even prove that $G(n, p)$ itself is universal for graphs on $\frac{n}{2}$ vertices with maximum degree $\Delta$. The best current result in this direction is due to Conlon, Ferber, Nenadov and Škorić [14], who show that for $\Delta \ge 3$, if $p \gg n^{-1/(\Delta-1)}\log^5 n$ then $G(n, p)$ is a.a.s. universal for graphs on $(1 - o(1))n$ vertices of maximum degree $\Delta$, finally breaking the $n^{-1/\Delta}$ barrier which is reached by several papers, but still far from the conjectured truth. It is possible that their methods could be used to prove a version of Theorem 6 for almost-spanning H in sparser random graphs, but this does not appear to be straightforward.
11.3. Optimality of Theorem 7. The 'extra' restriction we place in Theorem 7, of having many vertices of H which are neither in triangles nor four-cycles, is an artifact of our proof. It would be possible to remove the stipulation regarding four-cycles: one can prove a version of Lemma 26 capable of embedding vertices in a degeneracy order. However, this comes at the cost of a worse lower bound on p. It seems likely that one would be able to obtain a result for $p \gg \bigl(\frac{\log n}{n}\bigr)^{1/(2D+2)}$, but we did not check the details.
Next, we prove the Sparse Regularity Lemma variant Lemma 29, whose proof follows [37].
We define the energy of a pair of disjoint sets $P, P'$ contained in respectively $V_i$ and $V_{i'}$ to be $\mathcal{E}(P, P') := \frac{|P||P'|}{|V_i||V_{i'}|}\min\bigl\{d_p(P, P')^2,\ 2Ld_p(P, P') - L^2\bigr\}$.
Note that this quantity is convex in $d_p(P, P')$. Now given a partition $\mathcal{P}$ refining $\{V_i\}_{i\in[s]}$, we define the energy of $\mathcal{P}$ to be $\mathcal{E}(\mathcal{P}) :=$