Improved bounds for the Erdős-Rogers function
30 Apr 2018

The Erdős-Rogers function $f_{s,t}$ measures how large a $K_s$-free induced subgraph there must be in a $K_t$-free graph on $n$ vertices. While good estimates for $f_{s,t}$ are known for some pairs $(s,t)$, notably when $t = s+1$, in general there are significant gaps between the best known upper and lower bounds. We improve the upper bounds when $s+2 \le t \le 2s-1$. For each such pair we obtain for the first time a proof that $f_{s,t} \le n^{\alpha_{s,t}+o(1)}$ with an exponent $\alpha_{s,t} < 1/2$.


Introduction
Let $G$ be a graph with $n$ vertices that contains no $K_4$. How large a triangle-free induced subgraph must $G$ have? The standard proof of Ramsey's theorem implies that $G$ contains an independent set of size $n^{1/3}$, but can we do better?
A simple argument shows that the answer is yes. Indeed, each vertex in $G$ has a triangle-free neighbourhood, and either there is a vertex of degree at least $n^{1/2}$ or one can find an independent set of size roughly $n^{1/2}$ by repeatedly choosing vertices and discarding their neighbours.
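The dichotomy in this argument can be sketched in code. The following is only an illustration under stated assumptions: the graph is given as a dictionary `adj` from vertices to neighbour sets, it is assumed to be $K_4$-free (this is not checked), and the function names are hypothetical.

```python
import math
from itertools import combinations

def is_triangle_free(adj, vertices):
    """True if no three of the given vertices are pairwise adjacent."""
    return not any(
        b in adj[a] and c in adj[a] and c in adj[b]
        for a, b, c in combinations(list(vertices), 3)
    )

def large_triangle_free_set(adj):
    """adj maps each vertex of a (assumed) K_4-free graph to its neighbour set."""
    n = len(adj)
    threshold = math.isqrt(n - 1) + 1 if n else 0  # ceil(sqrt(n)) for n >= 1
    # Case 1: some vertex has degree at least sqrt(n). In a K_4-free graph
    # its neighbourhood contains no triangle, so we can return it.
    for v, nbrs in adj.items():
        if len(nbrs) >= threshold:
            return set(nbrs)
    # Case 2: all degrees are below sqrt(n). Greedily pick a vertex and
    # discard its neighbours; each step removes at most `threshold` vertices,
    # so the independent set has size at least n / threshold.
    remaining, independent = set(adj), set()
    for v in adj:
        if v in remaining:
            independent.add(v)
            remaining -= adj[v] | {v}
    return independent
```

In both cases the returned set has size about $n^{1/2}$; in the second case it is an independent set, which is exactly the wastefulness discussed next.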
This stronger argument still feels a little wasteful, because in the second case one finds an independent set rather than a triangle-free subgraph. Moreover, there is no obvious example that yields a matching upper bound, so it is not immediately clear whether 1/2 is the correct exponent.
The problem above is an example of a general problem that was first considered by Erdős and Rogers. Given positive integers $s < t$ and $n$, define $f_{s,t}(n)$ to be the minimum, over all $K_t$-free graphs $G$ with $n$ vertices, of the order of the largest induced $K_s$-free subgraph of $G$. We have just been discussing the function $f_{3,4}$. The function $f_{s,t}$ is known as the Erdős-Rogers function. It has been studied by several authors: for a detailed survey covering many of the known results on the subject, see [5]. For a more recent exposition, see also Section 3.5.2 of [2].
The first bounds were obtained by Erdős and Rogers [7], who showed that for every $s$ there exists a positive constant $\epsilon(s)$ such that $f_{s,s+1}(n) \le n^{1-\epsilon(s)}$. About 30 years later, Bollobás and Hind [1] improved the estimate for $\epsilon(s)$ and established the lower bound $f_{s,t}(n) \ge n^{1/(t-s+1)}$. In particular, $f_{s,s+1}(n) \ge n^{1/2}$ (by the obvious generalization of the argument for $f_{3,4}$).
Subsequently, Krivelevich [8,9] improved these lower bounds by a small power of $\log n$ and also gave a new general upper bound, which is
$$f_{s,t}(n) = O\big(n^{s/(t+1)} (\log n)^{1/(s-1)}\big). \tag{1}$$

Our results
In this paper, we prove that the answer to the first question above is yes. We also establish (2) for the families of pairs $t = s+3 \ge 7$ and $t = s+2 \ge 5$. We obtain these results by proving a significant improvement for the upper bound on $f_{s,t}$ when $s+2 \le t \le 2s-1$. The previous best upper bound for these parameters appeared in [3] and was $f_{s,t}(n) \le cn^{1/2}$ (except for the pair $s = 3$, $t = 5$, where this bound was not established). We do not just obtain bounds of the form $o(n^{1/2})$, but we improve the exponents throughout the range. Our construction is probabilistic, and has some similarities to the constructions that established the previous best upper bounds. However, an important difference is that we do not make use of algebraic objects such as projective planes.
Theorem 1. For any $s \ge 3$, $s+2 \le t \le 2s-1$, there exists some constant $c = c(s,t)$ such that $f_{s,t}(n) \le n^{\alpha} (\log n)^{c}$.
It is not hard to check that $\alpha < 1/2$ for all pairs $(s,t)$ in the given range. Thus, as mentioned above, we obtain a strong answer to the question of Dudek, Retter and Rödl.
The simplest case where our result is new is the case $s = 3$, $t = 5$. There we obtain an upper bound of $n^{6/13} (\log n)^{c}$. For comparison, Sudakov's lower bound is $cn^{5/12}$. Since the exponent when $t = s+1$ is $1/2$, our result also implies a positive answer to the question of Erdős in the following family of cases. That is, (2) holds for $t = s+3 \ge 7$.
In the following table, we compare the exponent of $n$ in the best known lower bound with that in our new upper bound (both rounded to three decimal places). In the case $t = s+2$, our bound is $f_{s,s+2}(n) \le n^{\alpha+o(1)}$ for $\alpha = \frac{1}{2} - \frac{s-2}{8s^2-18s+8} \approx \frac{1}{2} - \frac{1}{8s}$, while Sudakov's lower bound is $f_{s,s+2}(n) \ge n^{\beta+o(1)}$ for $\beta = \frac{1}{2} - \frac{1}{6s-6} \approx \frac{1}{2} - \frac{1}{6s}$. It would be very interesting to know whether either of these two estimates reflects the true asymptotics of $f_{s,s+2}$. It would be particularly interesting to know whether either of the exponents $5/12$ or $6/13$ is the correct one for $(s,t) = (3,5)$. We have made some effort to optimize our construction, whereas there appear to be places where Sudakov's argument is potentially throwing information away, so our guess is that $6/13$ is correct, but this guess is very tentative and could easily turn out to be wrong.
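For the case $t = s+2$, the two exponents quoted above are easy to tabulate exactly. The sketch below uses only the two closed formulas stated in this paragraph; the function names are hypothetical, and it does not attempt to reproduce exponents for other pairs $(s,t)$.

```python
from fractions import Fraction

def alpha_upper(s):
    """Exponent of the new upper bound for t = s + 2, as quoted above."""
    return Fraction(1, 2) - Fraction(s - 2, 8 * s * s - 18 * s + 8)

def beta_lower(s):
    """Exponent of Sudakov's lower bound for t = s + 2, as quoted above."""
    return Fraction(1, 2) - Fraction(1, 6 * s - 6)

# Print lower and upper exponents rounded to three decimal places.
for s in range(3, 9):
    lo, hi = beta_lower(s), alpha_upper(s)
    print(f"s={s}, t={s + 2}: lower {float(lo):.3f}, upper {float(hi):.3f}")
```

Exact rational arithmetic makes it straightforward to confirm that the lower exponent stays strictly below the upper one, and that both stay below $1/2$, for every $s$ one cares to check.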

An overview of the argument
We will now sketch the key steps in our argument. For simplicity, we will focus on the $s = 3$, $t = 5$ case. Then, as mentioned above, Theorem 1 says that $f_{3,5}(n) \le n^{6/13} (\log n)^{c}$. That is, we construct a $K_5$-free graph $G$ in which every subset of size roughly $n^{6/13}$ induces a triangle.
The basic idea is very simple. We are looking for a graph that contains "triangles everywhere" but does not contain any $K_5$s. The obvious way to create a large number of triangles without creating $K_5$s is to take a complete tripartite graph. Of course, this on its own does nothing, since a complete tripartite graph has a huge independent set, but we can use it as a building block by taking a union of many complete tripartite graphs. In previous constructions, such as Wolfovitz's graph that gives an upper bound for $f_{3,4}(n)$, the vertex sets of these tripartite graphs are chosen algebraically: in Wolfovitz's case they are the lines of a projective plane. The main difference in our approach is that we simply choose them at random, where the number we choose and the size of each one are parameters that we optimize at the end of the argument. This creates difficulties that are not present in the earlier approaches, but in the end allows us to prove stronger bounds.
Thus, we begin by taking a graph $G_0$, which is a union of roughly $n^{9/13}$ complete tripartite graphs with parts having size roughly $n^{6/13}$ each, these parts being randomly chosen subsets of $V(G_0)$. It is not hard to prove that $G_0$ contains a triangle in every set of vertices of size roughly $n^{6/13}$. However, $G_0$ also contains many $K_5$s, so we have to delete some edges. It is here that the proof becomes less simple: while random constructions followed by edge deletions are very standard, in this case we need rather delicate arguments in order to prove that it can be done without removing all the triangles from a set of size around $n^{6/13}$.
First, let us check that every set of size roughly $n^{6/13}$ does indeed induce a triangle in $G_0$. Let $A$ be a subset of $V(G_0)$ of size $n^{6/13}$. Each vertex lies in a given tripartite copy with probability roughly $n^{-7/13}$, so the expected size of the intersection of the copy with $A$ is roughly $n^{-1/13}$, and a given tripartite copy will intersect $A$ in at least 3 vertices with probability roughly $n^{-3/13}$. Thus, as we place $n^{9/13}$ tripartites, the expected number of those tripartites that give a triangle in $A$ is roughly $n^{6/13}$. Hence, by the Chernoff bound, the probability that $A$ does not contain a triangle is roughly $e^{-n^{6/13}}$. But the number of subsets of $V(G_0)$ of size $n^{6/13}$ is very roughly $n^{n^{6/13}}$. Modifying the parameters by $\log n$ factors suitably, a union bound shows that almost surely every subset $A$ will contain a triangle.
Now let us specify which edges get deleted. We shall delete them in two stages. The first stage consists of what we call Type 1 deletions. Given any two of our random tripartite graphs, with vertex sets $A = A_0 \cup A_1 \cup A_2$ and $B = B_0 \cup B_1 \cup B_2$, we remove all edges $xy$ such that $x, y \in A \cap B$. We do not insist that $xy$ is an edge of both tripartite graphs: if, for example, $x, y \in A_0$, $x \in B_0$ and $y \in B_1$, then the edge $xy$ will be removed. Let $G_1$ be the resulting graph when all such edges have been deleted.
The reason for these deletions is that each of our tripartite graphs contains many copies of $K_{3,1,1}$, which are somewhat "dangerous" for us, since all it takes to convert a $K_{3,1,1}$ into a $K_5$ is the addition of a further triangle. If we do not do Type 1 deletions, then we will obtain $K_5$s in this way too frequently, with the result that most edges in the graph are contained in a $K_5$. Indeed, the expected number of edges in $G_0$ is roughly $n^{9/13} (n^{6/13})^2 = n^{21/13}$ and the expected number of $K_5$s of the above form is roughly $n^5 (n^{9/13})^2 (n^{-7/13})^8 = n^{27/13}$. Type 1 deletion is feasible in the sense that it destroys only a small proportion of the edges of $G_0$. That is because it is significantly less likely for a pair of vertices to be contained in two tripartite copies than for it to be contained in one tripartite copy.
Thanks to Type 1 deletions, it has become "difficult" for $K_5$s to appear in $G_1$, since now none of our random tripartite graphs can intersect a $K_5$ in more than 3 vertices. Indeed, if one of them intersects a $K_5$ in say 4 vertices, then there exist two of those vertices between which this tripartite does not provide an edge, and if one of the other tripartites gives an edge in $G_0$ between those two vertices, that edge is deleted.
Thus, it is easy to check that if a $K_5$ appears in $G_1$, then it has to do so in one of the following ways.
(i) All 10 edges of the $K_5$ come from distinct tripartites.
(ii) There is one tripartite giving a triangle in the $K_5$ but all the other 7 edges come from distinct tripartites.
(iii) There are two tripartites that each give a triangle in the $K_5$, these two triangles sharing a single vertex, and all the other 4 edges come from distinct tripartites.
We now delete at least one edge from each of these remaining $K_5$s. This will be done probabilistically and the precise method will be explained later. The deletions in this second round we call Type 2 deletions. Once they have been performed, the resulting graph is our final graph $G$. The graph $G$ is $K_5$-free, by definition, but we now have to show that we have not inadvertently destroyed all the triangles in some set of $n^{6/13} (\log n)^{c}$ vertices.
We begin by checking the more basic requirement that the Type 2 deletions destroy only a small proportion of the edges. That is, we check that the expected number of $K_5$s in $G_1$ is less than the expected number of edges (which we have already computed to be $n^{21/13}$). To do this, we split into the three cases mentioned above. To calculate the expected number of $K_5$s of type (i), observe that there are at most $n^5$ choices for the vertex set, and $(n^{9/13})^{10}$ choices for the copies of tripartites giving an edge (since there are $n^{9/13}$ tripartites to choose from and we need 10 of them), and the probability that the vertices of the $K_5$ are in these tripartites as prescribed is $(n^{-7/13})^{20}$ (since the probability that a given vertex is in a given tripartite is $n^{-7/13}$), giving that the expected number of these $K_5$s is $n^{15/13}$. Similarly, the expected number of $K_5$s of type (ii) is $n^5 (n^{9/13})^8 (n^{-7/13})^{17} = n^{18/13}$. Finally, the expected number of $K_5$s of type (iii) is $n^5 (n^{9/13})^6 (n^{-7/13})^{14} = n^{21/13}$. This last number is roughly equal to the expected number of edges, so we will need to modify the parameters by $\log n$ factors. However, the main point is that after this second round of deletions, most edges of the original graph are still present.
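The exponent bookkeeping in these first-moment estimates can be checked mechanically. The sketch below simply re-does the arithmetic of this paragraph with exact fractions; the names `expected_count_exponent`, `COPIES`, `MEMBERSHIP` and `PART_SIZE` are hypothetical, and logarithmic factors are ignored, as in the text.

```python
from fractions import Fraction

COPIES = Fraction(9, 13)       # there are about n^(9/13) tripartite copies
MEMBERSHIP = Fraction(-7, 13)  # a vertex lies in a given copy w.p. about n^(-7/13)
PART_SIZE = Fraction(6, 13)    # each part has size about n^(6/13)

def expected_count_exponent(colours, labels, vertices=5):
    """Exponent of n in n^vertices * (n^COPIES)^colours * (n^MEMBERSHIP)^labels."""
    return vertices + colours * COPIES + labels * MEMBERSHIP

# Expected number of edges of G_0: (number of copies) * (part size)^2.
edge_exponent = COPIES + 2 * PART_SIZE

# (colours used, vertex-colour incidences) for the three types of K_5 in G_1.
k5_types = {"(i)": (10, 20), "(ii)": (8, 17), "(iii)": (6, 14)}
for name, (colours, labels) in k5_types.items():
    print(name, expected_count_exponent(colours, labels), "edges:", edge_exponent)

# For comparison: a K_{3,1,1} in one copy plus a triangle from another copy,
# the configuration that Type 1 deletions are designed to destroy.
print("K_{3,1,1} + triangle:", expected_count_exponent(2, 8))
```

Only type (iii) matches the edge exponent, which is why it is the binding case when the parameters are optimized.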
In order to finish off the proof, there are two main difficulties to overcome. The first one is that even though we have made sure that globally not too many edges are deleted, this is, as we have already mentioned, just a necessary condition for the argument to have a chance of working. What we actually need is the stronger statement that every set of size $n^{6/13} (\log n)^{c}$ still induces a triangle. We can hope that the small set of edges we have removed is "sufficiently random" for this to be the case, but actually proving that takes some work. Let us sketch how we do it. From now on, it will be convenient to think of each tripartite as having a colour: accordingly, we call the tripartites "colour classes". If a vertex belongs to, say, the red tripartite, then we say that that vertex is red.
Let us now fix a set $A$ of size $n^{6/13} (\log n)^{c}$. As shown above, we can take it for granted that $G_0$ contains a big set $\mathcal{T}$ of triangles in $A$, all coming from different colour classes. Moreover, these triangles will be uniformly distributed over $A$. Let $T_C \in \mathcal{T}$ be a triangle coming from the colour class $C$. (Note that not every colour gives a triangle, and not every triangle in $A$ comes from just one colour class.)
Let us first deal with Type 1 deletions. An edge of some $T_C$ gets deleted by the Type 1 deletions if the endpoints of this edge share a colour other than $C$. So intuitively we can imagine that $G_0$ has already been constructed, and then we place these triangles $T_C$ randomly inside $A$ and hope that most triangles will not have any edge contained in another colour class. It is not too hard to show, under suitable assumptions, that with very high probability the density of pairs of vertices in $A$ sharing a colour is fairly low (this essentially comes from the fact that the typical sizes of the tripartites are smaller, after adjusting the parameters by suitable log factors, than the size of $A$). Therefore for a fixed $T_C$ it is indeed true that with fairly high probability its edges will not be deleted by Type 1 deletions. However, these events are not independent for different colours $C$. To overcome this difficulty, we define a set $\Pi$ of roughly $\log n$ partitions with the property that for any pair of distinct colours $C, D$ there is a $\pi \in \Pi$ such that $D$ is in the first part of $\pi$ and $C$ is in the second part. We now define a $\pi$-dangerous pair to be a pair of vertices that share a colour from the first part of $\pi$. If an edge $xy$ of a $T_C$ gets deleted (by Type 1 deletions) then $x$ and $y$ share a colour $D \ne C$ and there is some $\pi \in \Pi$ such that $D$ is in the first part of $\pi$ and $C$ is in the second part of $\pi$, and therefore $(x, y)$ is a $\pi$-dangerous pair. But note that, as indicated above, the density of $\pi$-dangerous pairs will be fairly low, so the probability that an edge of $T_C$ is deleted because of a colour in the first part of $\pi$ is low, and these events are now independent for all $C$ in the second part of $\pi$. We can therefore conclude that only a small proportion of these $T_C$s will lose an edge because of colours in the first part of $\pi$. Thus, since $\Pi$ is small, we deduce that most triangles $T_C$ will not lose an edge. That is, we can find many triangles in $A$ even after the Type 1 deletions.
Now let us define Type 2 deletions. Given the graph $G_1$, we order its edges randomly and keep each edge provided that it does not form a $K_5$ when combined with the edges that we have already decided to keep. We remark that this construction is a variant of the so-called $K_5$-free process. The edges we keep will form our final graph $G$. As shown above, the number of $K_5$s in $G_1$ is less than the number of edges, that is, on average an edge is contained in less than one $K_5$. In fact, one can show that almost surely every edge will be contained in a relatively small number of $K_5$s. It is not hard to see that this means that any triangle in $G_1$ is also present in $G$ with probability not very close to 0. Since the number of triangles in $G_1[A]$ is large, standard concentration inequalities will imply that with very high probability $G[A]$ still contains a triangle. Using the union bound over all $A$, we conclude that almost surely every $G[A]$ (with $|A| = a$) contains a triangle, finishing the proof.
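The family $\Pi$ of partitions used in the discussion of Type 1 deletions only needs the separation property stated there. One standard way to obtain such a family, sketched below, is to split the colours according to a single binary digit of their index; this is an assumption made for illustration, and the construction of $\Pi$ used in the actual proof may differ. Colours are taken to be the integers $0, \dots, m-1$ and the function name is hypothetical.

```python
def separating_partitions(m):
    """Return 2 * ceil(log2 m) partitions (first_part, second_part) of range(m)
    such that for all distinct colours C, D some partition has D in its
    first part and C in its second part."""
    bits = max(1, (m - 1).bit_length())
    all_colours = set(range(m))
    family = []
    for j in range(bits):
        for b in (0, 1):
            # First part: colours whose j-th binary digit equals b.
            first = {c for c in all_colours if (c >> j) & 1 == b}
            family.append((first, all_colours - first))
    return family
```

Two distinct indices differ in some binary digit $j$; taking $b$ to be the $j$-th digit of $D$ puts $D$ in the first part and $C$ in the second. Since the number of colours is polynomial in $n$, the family has $O(\log n)$ members.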
Let us briefly discuss how we determined the parameters of our construction. Let $n^{\delta}$ be the number of tripartite copies placed, let $n^{\beta}$ be the size of each part of each of these copies, and let $n^{\alpha}$ be the set size that will guarantee an induced triangle. The parameters $\delta, \beta$ have been chosen to optimize the result: that is, to allow $\alpha$ to be as small as possible. There are three main conditions that we need to impose on these parameters.
The first one is that we need enough triangles in $G_0$ inside every $A$ of size $n^{\alpha}$. It is not hard to see that this condition is equivalent to
$$\delta + 3(\alpha + \beta - 1) \ge \alpha. \tag{3}$$
The second one comes from the fact that the parts of the tripartites will not contain a triangle in $G$ (since every edge inside a part of a tripartite gets deleted by Type 1 deletions), so we trivially need
$$\alpha \ge \beta. \tag{4}$$
Finally, we want the expected number of $K_5$s in $G_1$ to be less than the expected number of edges in $G_1$, which gives (only considering those $K_5$s which are of type (iii) in the sense described a few paragraphs above)
$$5 + 6\delta + 14(\beta - 1) \le \delta + 2\beta. \tag{5}$$
It is not hard to see that these conditions force $\alpha \ge 6/13$ and that equality is achieved by taking $\delta = 9/13$, $\beta = 6/13$.
This leads us to the other main difficulty, which arises only when we consider more general values of $s, t$. While (3) is essentially the same but with 3 replaced by $s$, and (4) is exactly the same, (5) becomes completely different. Indeed, it will be crucial to analyse all possible ways that a $K_t$ can occur in $G_1$ in some systematic way, rather than writing down the three possibilities (i), (ii), (iii) as we did above in the $s = 3$, $t = 5$ case, since in general there are many ways that a $K_t$ can be formed from the contributions of the various $s$-partite graphs. Analysing these decompositions of $K_t$, which we shall refer to as colour schemes (again by imagining that each $s$-partite graph has its own colour), is necessary to determine the best parameters $\delta, \beta$, and also to prove Theorem 1 for these parameters. The complicated formula for $\alpha$ is obtained by solving the system of inequalities (3), (4), (5) that we obtain in the general case.
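The claim that the three conditions force $\alpha \ge 6/13$ in the case $s = 3$, $t = 5$ can be verified with a short linear-combination argument. The sketch below is an illustration under an explicit assumption: the three conditions are read, from the counts in the overview, as $\delta + 3(\alpha+\beta-1) \ge \alpha$, then $\alpha \ge \beta$, and finally $5 + 6\delta + 14(\beta-1) \le \delta + 2\beta$. Each function returns the slack of one condition (nonnegative exactly when the condition holds); all names are hypothetical.

```python
from fractions import Fraction as F
import random

def slack_triangles(a, d, b):
    """Enough triangles in every set of size n^a (assumed form of the first condition)."""
    return d + 3 * (a + b - 1) - a

def slack_parts(a, d, b):
    """The set size must be at least the part size."""
    return a - b

def slack_k5(a, d, b):
    """Type (iii) K_5s must not outnumber edges (assumed form of the third condition)."""
    return (d + 2 * b) - (5 + 6 * d + 14 * (b - 1))

def certificate(a, d, b):
    """Nonnegative combination of the slacks; it equals 13*a - 6 identically."""
    return 5 * slack_triangles(a, d, b) + slack_k5(a, d, b) + 3 * slack_parts(a, d, b)

# Spot-check the identity certificate == 13*a - 6 at random rational points.
rng = random.Random(0)
points = [tuple(F(rng.randint(-50, 50), rng.randint(1, 20)) for _ in range(3))
          for _ in range(100)]
identity_holds = all(certificate(a, d, b) == 13 * a - 6 for a, d, b in points)
print("identity holds at sampled points:", identity_holds)
```

If all three slacks are nonnegative then so is $13\alpha - 6$, which gives $\alpha \ge 6/13$; at $(\alpha, \delta, \beta) = (6/13, 9/13, 6/13)$ all three slacks vanish, so the bound is attained.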
The organization of this paper is as follows. In Section 2 we present our construction. In Section 3 we give the main part of the proof conditional on three lemmas. These lemmas are proved in Section 4. The first one, which asserts that each edge in $G_1$ is contained in a small number of $K_5$s, is proved in Subsection 4.1, conditional on a lemma about colour schemes that is proved in Subsection 4.3. The result that says that $G_1[A]$ contains many triangles is proved in Subsection 4.2. Finally, there is an appendix that contains some tedious computations and the source code of a program relevant to some results in Subsection 4.3.

The precise construction and the main result
Remark. We will not be concerned with floor signs, divisibility, and so on. Also, we will tacitly assume that n is sufficiently large whenever this is needed. Moreover, throughout the rest of the paper, it is to be understood that s ≥ 3 and that s + 2 ≤ t ≤ 2s − 1. Recall that a pair (s, t) is regular if s ≥ 11 and s + 3 ≤ t ≤ 2s − 4 or if (s, t) ∈ {(10, 14), (10, 15)}, and otherwise it is exceptional.
where $c_1, c_2, c_3$ are positive constants, to be specified, that depend on $s$ and $t$. (We will need $c_2$ to be suitably large and $c_3$ to be sufficiently larger than $c_1, c_2$.)
We construct the graph $G_0$ as follows. Let $V = V(G_0) = \{1, 2, \dots, n\}$. Define independent random subsets $S_1, \dots, S_m$ of $V$ in such a way that each $S_i$ contains each $v \in V$ independently with probability $\gamma$. We call $S_i$ the $i$th colour class. If $v \in S_i$, we say that $v$ has colour $i$. Now randomly partition each $S_i$ into $s$ sets $S_{i1}, S_{i2}, \dots, S_{is}$ and use these sets to define a complete $s$-partite graph. Let $G_0$ be the union of these $s$-partite graphs.
We say that a pair of vertices has colour $i$ if both its members have colour $i$. We do not require the pair to form an edge in $G_0$. Remove all edges of $G_0$ that have at least two colours to obtain the subgraph $G_1$. Again, we do not require both colours to give an edge. Another way to state the condition is that if $xy$ is an edge of colour $i$ and $x$ and $y$ both have colour $j$ for some $j \ne i$, then we remove the edge $xy$ even if $x$ and $y$ belong to the same set $S_{jr}$. Finally, for every $K_t$ in $G_1$ we randomly remove a certain edge, which we shall specify in a moment. The resulting graph is called $G$. The graph $G$ is obviously $K_t$-free. We shall prove that for suitable choices of the constants $c_1, c_2, c_3$, we have the following result, which is our main theorem.
Theorem 5. For $n$ sufficiently large, there is a positive probability that every subset $A$ of $V(G)$ with $|A| = a$ contains a $K_s$.
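The construction of $G_0$ and $G_1$ described before Theorem 5 can be sketched as follows. This is a minimal illustration rather than the construction with the actual parameters: `n`, `m`, `gamma` and `s` are passed in directly, vertices are numbered from 0, and "randomly partition each $S_i$" is read as giving every vertex of $S_i$ an independent uniformly random part index, which is an assumption. The function name is hypothetical.

```python
import random
from itertools import combinations

def build_g0_g1(n, m, gamma, s, seed=0):
    """Return (parts, g0, g1): parts[i] maps each vertex of colour class i to
    its part index; g0 and g1 are sets of edges (x, y) with x < y."""
    rng = random.Random(seed)
    parts = []
    for _ in range(m):
        # Each vertex joins the colour class independently with probability
        # gamma and is then placed in one of the s parts uniformly at random.
        parts.append({v: rng.randrange(s) for v in range(n) if rng.random() < gamma})
    # shared[(x, y)] lists the colours that the pair {x, y} has.
    shared = {}
    for i, part_of in enumerate(parts):
        for x, y in combinations(sorted(part_of), 2):
            shared.setdefault((x, y), []).append(i)
    # G_0: some colour puts x and y in different parts of its s-partite graph.
    g0 = {pair for pair, cols in shared.items()
          if any(parts[i][pair[0]] != parts[i][pair[1]] for i in cols)}
    # G_1: drop every edge whose endpoints share at least two colours, whether
    # or not the second colour also provides an edge.
    g1 = {pair for pair in g0 if len(shared[pair]) == 1}
    return parts, g0, g1
```

The second round of deletions, which turns $G_1$ into $G$, depends on the notion of a core and is not covered by this sketch.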
Let us now specify which edges are removed from $G_1$. Suppose that $x_1, \dots, x_t$ form a $K_t$ in $G_1$. Then necessarily any two distinct vertices $x_i$ and $x_j$ share precisely one colour. Indeed, they must share at least one colour since $x_i x_j \in E(G_0)$, but they cannot share more than one since then $x_i x_j$ would have been removed from $G_0$ during the first round of deletions.
Definition 6. A colour scheme for $K_t$ with parameter $s$, or scheme for short, is a set $X$ of $t$ nodes and a set $\mathcal{D}$ of subsets of $X$, which we call colours, or blocks, such that
(i) For any distinct $x, y \in X$, there is a unique $D \in \mathcal{D}$ such that $x, y$ both belong to $D$.
(ii) Every colour appears on at least two nodes.
(iii) Every colour appears at most s times.
A pair of nodes is called an edge and the colour of an edge is the unique colour that contains both endpoints. (Note that a node may have several colours.) If a node x belongs to a colour D, we shall say that D labels x. We also define a label to be a pair (x, D) such that x is a node and D labels x. The number of labels in a scheme is thus the sum of the sizes of all the colours.
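Conditions (i) to (iii) of Definition 6 are easy to test on small examples. The sketch below represents a scheme as a collection of nodes together with a list of blocks; the function names are hypothetical and the example is the type (iii) decomposition of $K_5$ from the overview.

```python
from itertools import combinations

def is_scheme(nodes, colours, s):
    """Check conditions (i)-(iii) of Definition 6 for the given blocks."""
    nodes = set(nodes)
    blocks = [frozenset(c) for c in colours]
    if any(not block <= nodes for block in blocks):
        return False
    # (i) every pair of distinct nodes lies in exactly one block
    for x, y in combinations(nodes, 2):
        if sum(1 for block in blocks if x in block and y in block) != 1:
            return False
    # (ii) every colour appears on at least two nodes, (iii) and on at most s
    return all(2 <= len(block) <= s for block in blocks)

def num_labels(colours):
    """Number of labels (x, D): the sum of the sizes of the colours."""
    return sum(len(c) for c in colours)

# Type (iii) for K_5: two triangles sharing node 0, plus four single edges.
type_iii = [{0, 1, 2}, {0, 3, 4}, {1, 3}, {1, 4}, {2, 3}, {2, 4}]
print(is_scheme(range(5), type_iii, s=3), len(type_iii), num_labels(type_iii))
```

The six colours and fourteen labels of this example are the quantities that enter the first-moment count for type (iii).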
2 ) colours such that X is a colour scheme with respect to those colours, and no other colour labels more than one vertex in X. Indeed, we have already observed that property (i) holds. Choosing the colours suitably, (ii) can clearly be achieved. For property (iii), observe that if some colour D labels at least s + 1 vertices, then there must exist distinct vertices x i and x j that belong to the same part of the complete s-partite graph of colour D. Then D does not provide an edge between x i and x j , so some other colour must, but then x i and x j share at least two colours, which contradicts (i).
Thus, any $K_t$ in $G_1$ can be viewed as a scheme in a natural way. A simple upper bound for the expected number of $K_t$s associated with a scheme $Q$ is $n^t m^b \gamma^l$, where $l$ is the number of labels of $Q$ and $b$ is the number of colours of $Q$. Indeed, the number of ways of choosing the $t$ nodes is at most $n^t$, the number of ways of choosing the $b$ colours (from the $m$ colours used to construct $G_1$) is at most $m^b$, and the probability that any given choice of nodes and colours realizes the scheme is $\gamma^l$, since for each label the probability that the given node receives the given colour is $\gamma$, and all these events are independent.
Similarly, given an edge in G 1 , the expected number of K t s associated with Q that contain that edge is at most This motivates the following definition.
Thus, the expected number of $K_t$s associated with a scheme $Q$ that contain a given edge is at most $n^{v(Q)}$ up to log factors. The following lemma (proved in Subsection 4.3) shows that this number is small.
We shall also need a generalization of the notion of a scheme where a pair of nodes does not need to have a colour, if it does have a colour then that colour does not have to be unique, and a colour is allowed to label more than $s$ nodes.
Definition 9. A colour configuration consists of a set of nodes and a set of colours labelling the nodes such that every colour appears on at least two nodes.
Given a colour configuration W and a subset S of its nodes, we define the subconfiguration induced by S to be the configuration whose nodes are the elements of S and whose colours are the colours of W that appear at least twice on S (which then label the nodes in S that they labelled in W ).
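The induced subconfiguration is a purely combinatorial operation and can be written down directly. In the sketch below a configuration is represented, as an assumption for illustration, by a dictionary from colour names to the sets of nodes they label; the function name is hypothetical.

```python
def induced_subconfiguration(colours, subset):
    """Return (nodes, colours) of the subconfiguration induced by `subset`.

    A colour survives only if it appears at least twice on `subset`, and it
    then labels exactly the nodes of `subset` that it labelled before."""
    subset = set(subset)
    induced = {}
    for name, labelled in colours.items():
        kept = set(labelled) & subset
        if len(kept) >= 2:
            induced[name] = kept
    return subset, induced

# Example: two triangles sharing node 0 and one extra edge colour.
config = {"red": {0, 1, 2}, "blue": {0, 3, 4}, "green": {1, 3}}
print(induced_subconfiguration(config, {1, 2, 4}))
```

On the subset $\{1, 2, 4\}$ only the red colour survives, since blue and green each label a single node there.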
The value of a configuration W is defined to be where h is the number of nodes, b is the number of colours and l is the number of labels in W (where a label is again a pair (x, D) where x is a node labelled by the colour D).
The same argument as for schemes shows that the expected number of occurrences of a colour configuration $W$ that contain a given edge is at most $n^{v(W)}$ up to log factors.
Definition 10. The core of a scheme Q, denoted C(Q), is the induced subconfiguration S on at least two nodes for which v(S) is minimal. If several subconfigurations have the same value then the core is the one with the maximum number of nodes. If this is still not unique, then we simply pick an arbitrary one with the given properties.
Remark. We can in fact prove that $C(Q) = Q$ for every scheme $Q$. Although using that fact would simplify the argument in this paper slightly, this gain does not compensate for the extra work needed to establish it, so we shall avoid using it. Nevertheless, the reader is encouraged to think of a core just as a scheme: that is, as a $K_t$ in the graph $G_1$ with the colours given by the $s$-partite graphs with vertex sets that contain at least two of its vertices.
Lemma 11. Let Q be a scheme. Then C(Q) has at least 3 nodes, v(C(Q)) ≤ 0, and v(S) ≥ v(C(Q)) for every induced subconfiguration S of C(Q) with at least two nodes.
Proof. The first two assertions follow from Lemma 8, since an induced subconfiguration of Q with two nodes has value 0. The third assertion follows immediately from the definition of the core.
We can now define $G$ precisely. Following an idea in [13], we assign independently to each edge $e$ of $G_1$ a birthtime $\beta_e$, chosen uniformly at random from $[0, 1]$. Equivalently, we order the edges of $G_1$ uniformly at random from all the possible orderings. To define the edge set $E(G)$, which will be a subset of $E(G_1)$, we recursively decide for each $e \in E(G_1)$ whether $e \in E(G)$, as follows. Suppose that the decision has been made for every $e' \in E(G_1)$ with $\beta_{e'} < \beta_e$. Then let $e \in E(G)$ unless there is a $K_t$ in $G_1$, which we view as a scheme $Q$, for which the edges of $C(Q)$ all have birthtime at most $\beta_e$ and they all (apart from $e$) already belong to $E(G)$.
For any $K_t$ in $G_1$ there is an edge in the core of that $K_t$ that is not an edge of $G$, since if all the edges in the core apart from the last one are chosen to belong to $E(G)$, then the last one is not. Thus, $G$ is $K_t$-free. It remains to prove that with positive probability every set of $a$ vertices still contains a $K_s$, which was Theorem 5 above.
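The birthtime procedure can be illustrated in a simplified form. The sketch below assumes that the core of every $K_t$ is the whole $K_t$ (the remark above says this can be proved, but the argument deliberately avoids relying on it), so an edge is rejected exactly when it would complete a $K_t$ with edges already kept. This is therefore a simplified variant for illustration, not the procedure with general cores, and the function name is hypothetical.

```python
import random
from itertools import combinations

def kt_free_process(edges, t, seed=0):
    """Process the given edges in order of random birthtimes, keeping an edge
    unless it completes a K_t together with edges kept so far."""
    rng = random.Random(seed)
    edges = sorted(tuple(sorted(e)) for e in edges)
    birthtime = {e: rng.random() for e in edges}  # uniform on [0, 1)
    adj = {}   # adjacency sets of the kept graph
    kept = []
    for x, y in sorted(edges, key=birthtime.get):
        common = adj.get(x, set()) & adj.get(y, set())
        # xy completes a K_t iff the common kept neighbourhood of x and y
        # contains a clique on t - 2 vertices.
        completes_clique = any(
            all(b in adj[a] for a, b in combinations(c, 2))
            for c in combinations(sorted(common), t - 2)
        )
        if not completes_clique:
            kept.append((x, y))
            adj.setdefault(x, set()).add(y)
            adj.setdefault(y, set()).add(x)
    return kept
```

Applied to the edge set of $G_1$, the output is $K_t$-free by construction; the substance of the proof is that it still contains a $K_s$ inside every set of $a$ vertices.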

The proof of Theorem 5
In this section, we shall prove Theorem 5 conditional on two lemmas, which we shall prove in Section 4 and which are where most of the work will be. The first one says, roughly speaking, that for any $A$ of size $a$, the induced subgraph $G_1[A]$ of $G_1$ contains many copies of $K_s$.
Lemma 12. Almost surely, for every $A$ of size $a$ there is a set of $\Omega(m a^s \gamma^s)$ monochromatic copies of $K_s$ inside $G_1[A]$, each with a different colour.
The second tells us that any edge in $G_1$ is contained in few cores. Here, and in what follows, we use the word "core" to refer to the core of a $K_t$ in $G_1$.
Lemma 13. Almost surely, any edge in $G_1$ is contained in at most $(\log n)^{2t}$ cores.
We shall use McDiarmid's inequality [10] in the next proof, which for convenience we recall here. Let $Y_1, \dots, Y_N$ be independent random variables, taking values in a set $S$, and let $X = g(Y_1, \dots, Y_N)$ for some $g : S^N \to \mathbb{R}$ with the property that if $y, y' \in S^N$ only differ in their $i$th coordinate, then $|g(y) - g(y')| \le c_i$. Then the inequality states that, for every $\lambda > 0$,
$$\mathbb{P}\big[X \le \mathbb{E}[X] - \lambda\big] \le \exp\Big(-\frac{2\lambda^2}{\sum_{i=1}^{N} c_i^2}\Big).$$
The following lemma, together with Lemmas 12 and 13 and a union bound, implies Theorem 5.
Lemma 14. Suppose that $G_1$ is such that any edge in $G_1$ is contained in at most $(\log n)^{2t}$ cores. Let $A$ be a set of vertices of size $a$ such that the induced subgraph $G_1[A]$ contains $\Omega(m a^s \gamma^s)$ monochromatic copies of $K_s$, each with a different colour. Then the probability, conditional on the graph $G_1$, that $G[A]$ does not contain any $K_s$ is $o\big(1/\binom{n}{a}\big)$.

Proof.
Choose $\Omega(m a^s \gamma^s)$ monochromatic copies of $K_s$ in $G_1[A]$, all of distinct colours. Let the set of these copies be $\mathcal{T}$. Then by the definition of the first deletion process, the elements of $\mathcal{T}$ are edge disjoint. Let $T \in \mathcal{T}$. Let $E_T$ be the set of all edges of cores that have at least one edge that belongs to $T$, together with the edges of $T$ itself. Clearly, $|E_T| \le \binom{s}{2} + \binom{s}{2} (\log n)^{2t} \binom{t}{2} \le (\log n)^{3t}$. Let $B_T$ be the event that the birthtimes of the edges of $T$ precede the birthtimes of all other edges in $E_T$. If $B_T$ occurs, then the only way an edge of $T$ could be deleted from $G_1$ and therefore fail to be present in $G$ is if $T$ itself contains a core. But there is no colour that labels every vertex in a core, since then the core, considered as a colour configuration, would have value $h - 2 + (h - 2)(\alpha - 1) = (h - 2)\alpha$ (where $h$ is the number of nodes in the core), which is positive because $h \ge 3$, and this contradicts Lemma 11. It follows that if $B_T$ occurs, then every edge of $T$ is present in $G$.
For a fixed $G_1$, let $X$ be the number of events $B_T$ that occur over all $T \in \mathcal{T}$. Then $X$ is a random variable with the property that if $X \ne 0$, then there is some $T \in \mathcal{T}$ that belongs to $G[A]$. It therefore suffices to prove that $\mathbb{P}[X = 0] = o\big(1/\binom{n}{a}\big)$.
To do this, we apply McDiarmid's inequality when $Y_i$ is the birthtime of the $i$th edge. Since the $T \in \mathcal{T}$ are edge disjoint, and any edge $e$ in $G_1$ is contained in at most $(\log n)^{2t}$ cores, it follows that $e$ is contained in at most $1 + (\log n)^{2t} \binom{t}{2} \le (\log n)^{3t}$ of the sets $E_T$. Hence, changing the birthtime $\beta_e$ of $e$ influences at most $(\log n)^{3t}$ of the events $B_T$. Also, if $e \notin \bigcup_{T \in \mathcal{T}} E_T$, then $\beta_e$ does not influence any event $B_T$. Thus, by McDiarmid's inequality (with some $N \le |\mathcal{T}| (\log n)^{3t}$), we get
Finally, note that $\binom{n}{a} \le \big(\frac{ne}{a}\big)^{a} = \exp(O(a \log n))$. To finish the proof we just need to verify that $\frac{|\mathcal{T}|}{(\log n)^{6st+9t}} = \omega(a \log n)$. Since
we are done provided that $(s - 1)c_3 - sc_2 - c_1 > 6st + 9t + 1$.

The proofs of the auxiliary lemmas
In this section we shall prove Lemmas 8, 12 and 13, which are the results we used in the proof of Theorem 5 but have not yet proved.

The proof of Lemma 13
Let $e$ be an edge in $G_1$. We would like to show that it belongs to at most $(\log n)^{2t}$ cores. Any core that contains $e$ can be viewed as a core in a scheme that contains $e$, and as such it has nonpositive value. But for any colour configuration $W$ (with more than two labels), the expected number of occurrences of that colour configuration in $G_0$ containing a fixed edge is at most $n^{v(W)} (\log n)^{-c_2}$ (as we remarked slightly less precisely after Definition 9), which is at most $(\log n)^{-c_2}$ if $v(W) \le 0$. In particular, the probability that an edge $e$ is contained in $r$ cores that are pairwise disjoint apart from their intersection on $e$ is at most $(\log n)^{-rc_2}$. If $r = (\log n)^2$ then this is much less than $1/n^2$, and therefore almost surely no edge is contained in $(\log n)^2$ cores of the above form.
In general, the cores containing $e$ need not be disjoint. This adds a complication, and we need to introduce a few definitions to handle it, but the main reason Lemma 13 holds is the one given in the previous paragraph. The next definition describes the kind of colour configuration which, if it occurs in $G_0$, can produce many cores in $G_1$ (that is, cores of $K_t$s in $G_1$ that we view as schemes) that contain a given edge $xy$. Soon we shall argue that almost surely no such large configuration occurs in $G_0$.
Definition 15. An abstract core container W is a colour configuration whose nodes are {x} ∪ {y} ∪ Z and in which every z ∈ Z is contained in at least one abstract core, where an abstract core is defined as follows.
An abstract core in a core container is an induced subconfiguration S containing x and y such that for any induced subconfiguration S ′ ⊂ S containing x, y, we have v(S ′ ) ≥ v(S) and such that for any two distinct u, v ∈ S there is a unique colour that labels both u and v.
The size of a core container is the number of nodes it contains. A core container is irreducible if it is not possible to remove a label or colour and still have a core container.
Note that as the vertices of $G_0$ are coloured, we can naturally talk about $G_0$ containing various colour configurations. We shall now establish that:
1. If an edge in $G_1$ is contained in many cores, then there is a large irreducible core container in $G_0$.
2. There are not too many irreducible abstract core containers of fixed size.
3. The expected number of occurrences in G 0 of any large abstract core container is small.
The last two points will imply that almost surely there is no large irreducible core container in $G_0$, which in turn implies that there is no edge in $G_1$ that is contained in many cores.
Lemma 16. If the edge $e = uv$ is contained in at least $r$ cores of copies of $K_t$ in $G_1$, then there is an irreducible core container $W$ in $G_0$ with $x = u$, $y = v$ (as in Definition 15) and with size between $\frac{1}{2} r^{1/t}$ and $tr$.
Proof. Define a colour configuration $W_0$ as follows. Arbitrarily pick $r$ cores that contain $e$. The set of nodes of $W_0$ is the set of vertices of $G_1$ that are in one of these $r$ cores. The set of colours is the set of those colours in $G_0$ that appear at least twice on this set of nodes. This does indeed define a core container, since any core of a $K_t$ in $G_1$ that contains $e$ satisfies the two properties required of an abstract core in $W_0$: the minimality of $v$ follows from the definition of a core, and the condition about the colours follows from the fact that the $K_t$ belongs to $G_1$. How many nodes does $W_0$ have? Any core consists of between $2$ and $t$ nodes, so if the number of nodes of $W_0$ is $h$, then $r \le \sum_{2 \le j \le t} \binom{h}{j} \le (2h)^t$. Thus, $h \ge \frac{1}{2} r^{1/t}$. On the other hand, $h \le rt$, since the vertex set of $W_0$ is a union of $r$ cores. Now remove labels or colours as long as we still get a core container; the object we end up with is an irreducible core container of the required size.
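The counting step in this proof, $r \le \sum_{2 \le j \le t} \binom{h}{j} \le (2h)^t$, can be checked directly for small parameters. The sketch below is only an illustration: the function names `max_cores` and `min_nodes_for` are ours, and the ranges tested are arbitrary.

```python
from math import comb


def max_cores(h: int, t: int) -> int:
    """Number of node subsets of size between 2 and t among h nodes.

    This is the crude upper bound on the number of distinct cores
    that h nodes can support.
    """
    return sum(comb(h, j) for j in range(2, t + 1))


def min_nodes_for(r: int, t: int) -> int:
    """Smallest h such that h nodes can support at least r distinct cores."""
    h = 2
    while max_cores(h, t) < r:
        h += 1
    return h


if __name__ == "__main__":
    r, t = 1000, 4
    h = min_nodes_for(r, t)
    # The lemma's lower bound on the size is (1/2) * r^(1/t).
    print(h, 0.5 * r ** (1 / t))
```

For $r = 1000$ and $t = 4$ the smallest admissible $h$ is $13$, well above the guaranteed lower bound $\frac{1}{2} r^{1/t} \approx 2.8$; the lemma only needs the weaker estimate.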
Lemma 17. The number of distinct irreducible abstract core containers of size $h$ is $O(ht^2 \cdot e^{ht^2} \cdot h^{ht^2})$.
Proof. First we shall prove that the number of labels in an irreducible core container of size $h$ is at most $2h\binom{t}{2} \le ht^2$. For any occurrence of a colour $D$ at some node $u$ (that is, for any label $(u, D)$), there must exist $v \in \{x\} \cup \{y\} \cup Z$ such that every core containing $v$ contains $u$ and the colour $D$, or else we could remove the occurrence of $D$ at $u$ and still have a core configuration. But for any $v$, there are at most $2\binom{t}{2}$ such pairs $(u, D)$, since $u$ must belong to the intersection of the vertex sets of the cores containing $v$, and in a given core there are at most $2\binom{t}{2}$ labels (because in the $K_t$ that contains the core, each colour contains at least two vertices and no two colours contain the same pair of vertices).
So there are at most $ht^2$ choices for the total number of labels. Since the partition function $p(n)$ is $O(e^n)$, it follows that for each possibility for the number of labels, there are $O(e^{ht^2})$ choices for the number of occurrences for each colour class. Suppose we have $b$ colours and the numbers of times that they occur are $l_1, \dots, l_b$. Then the number of choices for the vertices labelled by these colours is at most $h^{l_1} \cdots h^{l_b} = h^{l_1 + \dots + l_b} \le h^{ht^2}$, which gives the claimed bound.
Next, we shall investigate how many copies we expect to have in $G_0$ of a given abstract core container. Let $W$ be more generally any colour configuration with $h$ nodes, $b$ colours and $l$ labels. Then the expected number of occurrences of such a configuration is at most $n^h m^b \gamma^l$. Indeed, the number of ways of choosing the $h$ nodes is at most $n^h$. The number of ways of choosing the $b$ colours is at most $m^b$. And for each label, the probability that the given node receives the given colour is $\gamma$, and all these events are independent, so the probability that any given choice of nodes and colours realizes the scheme is $\gamma^l$.
Definition 18. We call $n^h m^b \gamma^l$ the frequency of the configuration $W$ and denote it by $\omega(W)$.
Lemma 19. Let W be an abstract core container of size h. Then To prove this result, we will kill some of the nodes and colours and remove some of the labels of the core container in steps. To keep track of which nodes and colours have been killed, we introduce the following definition.
Definition 20. A partial configuration $P$ consists of four pairwise disjoint sets $\{x\}$, $\{y\}$, $Z_0$ and $Z_1$ of nodes, and two disjoint sets $\mathcal{B}_0$, $\mathcal{B}_1$ of colours that label those nodes in such a way that any $B \in \mathcal{B}_1$ labels at least two nodes. We write $\mathcal{B}$ for $\mathcal{B}_0 \cup \mathcal{B}_1$ and $Z$ for $Z_0 \cup Z_1$.
We now generalize the notion of frequency to this setting, which can be thought of as the expected number of occurrences of the colour configuration for given choices of the nodes in $Z_0$ and colours in $\mathcal{B}_0$, which represent the nodes and colours that have already been killed. Thus, we let $r = |\{x\} \cup \{y\} \cup Z_1|$ be the number of nodes yet to choose, we let $g = |\mathcal{B}_1|$ be the number of colours yet to choose, and we let $u$ be the total number of labels, including the labels on nodes in $Z_0$ and of colours in $\mathcal{B}_0$. Then we can choose the remaining nodes in at most $n^r$ ways and the remaining colours in at most $m^g$ ways, and for each label there is a probability $\gamma$ that the given node receives the given colour. So we define the frequency $\omega(P)$ to be $n^r m^g \gamma^u$.
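The bookkeeping behind $\omega(P) = n^r m^g \gamma^u$ is easy to experiment with using exact rational arithmetic. The sketch below shows how the frequency changes when a node or a colour is killed. All numerical values (`n = 100`, `m = 10`, `gamma = 1/5`, and the counts of nodes, colours and labels) are hypothetical and chosen only so that $m\gamma^2 < 1$; the function names are ours.

```python
from fractions import Fraction


def frequency(n: int, m: int, gamma: Fraction,
              free_nodes: int, free_colours: int, labels: int) -> Fraction:
    """omega = n^r * m^g * gamma^u for r free nodes, g free colours, u labels."""
    return Fraction(n) ** free_nodes * Fraction(m) ** free_colours * gamma ** labels


def ratio_after_killing_colour(n: int, m: int, gamma: Fraction,
                               free_nodes: int, free_colours: int,
                               labels: int, d: int) -> Fraction:
    """Ratio omega(after) / omega(before) when one free colour is killed
    and its d labels are removed from the configuration."""
    before = frequency(n, m, gamma, free_nodes, free_colours, labels)
    after = frequency(n, m, gamma, free_nodes, free_colours - 1, labels - d)
    return after / before


def ratio_after_killing_node(n: int, m: int, gamma: Fraction,
                             free_nodes: int, free_colours: int,
                             labels: int) -> Fraction:
    """Ratio omega(after) / omega(before) when one free node is killed
    and no labels are removed."""
    before = frequency(n, m, gamma, free_nodes, free_colours, labels)
    after = frequency(n, m, gamma, free_nodes - 1, free_colours, labels)
    return after / before
```

Killing a colour with $d$ labels multiplies the frequency by $m^{-1}\gamma^{-d}$, and killing a node multiplies it by $n^{-1}$. These are the two kinds of factors that the proof of Lemma 19 keeps track of.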
We shall define the $P_j$ recursively. In what follows we use the notation of Definition 15 and Definition 20. When there is ambiguity, we will write $Z_0(P)$ to mean $Z_0$ in the partial configuration $P$, and similarly for $Z_1$, $\mathcal{B}_0$, $\mathcal{B}_1$. The set of all nodes (respectively, colours) for every $P_j$ will be the same as the set of all nodes (respectively, colours) of $W$, namely $Z$ (respectively, $\mathcal{B}$). However, $\mathcal{B}_0$, $\mathcal{B}_1$, $Z_0$, $Z_1$ and the labels will be different for the various $P_j$.
Let us define $P_0$ to be the partial configuration whose nodes, colours and labels are the same as those of $W$ and which has $Z_0 = \mathcal{B}_0 = \emptyset$. Then $\omega(P_0) = \omega(W)$.
Given $P_{j-1}$ with $Z_1(P_{j-1}) \ne \emptyset$, we define $P_j$ as follows. Pick some $z \in Z_1(P_{j-1})$ arbitrarily. As $W$ is a core container, we can choose an abstract core $S$ in $W$ that contains $z$. Let $S_1 = S \cap Z_1(P_{j-1})$. Let $D$ be the set of those colours $B \in \mathcal{B}_1(P_{j-1})$ that occur at least twice on $S$ in $P_{j-1}$. Then let the sets of nodes of $P_j$ be $Z_0(P_j) = Z_0(P_{j-1}) \cup S_1$ and $Z_1(P_j) = Z_1(P_{j-1}) \setminus S_1$, and let the sets of colours be $\mathcal{B}_0(P_j) = \mathcal{B}_0(P_{j-1}) \cup D$ and $\mathcal{B}_1(P_j) = \mathcal{B}_1(P_{j-1}) \setminus D$. The labels of $P_j$ are those of $P_{j-1}$ except that all occurrences of colours in $\mathcal{B}_0(P_j)$ are removed from $S$. It is clear that $P_j$ is a partial configuration.
Claim. $\frac{\omega(P_j)}{\omega(P_{j-1})} \ge \frac{\omega(S \setminus S_1)}{\omega(S)}$, where $S$ and $S \setminus S_1$ are identified with their induced subconfigurations from $W$.
Proof of Claim. The contribution of the nodes is (a factor of) $n^{-|S_1|}$ to both $\frac{\omega(P_j)}{\omega(P_{j-1})}$ and $\frac{\omega(S \setminus S_1)}{\omega(S)}$. Hence it suffices to prove that the contribution of any colour (and its labels) to $\frac{\omega(P_j)}{\omega(P_{j-1})}$ is at least as much as its contribution to $\frac{\omega(S \setminus S_1)}{\omega(S)}$. There are two cases to consider.
Case 1. If $B$ is a colour that occurs at most once on $S$ in $W$, then its contribution to $\frac{\omega(S \setminus S_1)}{\omega(S)}$ is $1$, whereas its contribution to $\frac{\omega(P_j)}{\omega(P_{j-1})}$ is at least $1$. (Indeed, since $m\gamma^2 < 1$, the contribution of any colour to $\frac{\omega(P_j)}{\omega(P_{j-1})}$ is at least $1$.)
Case 2. Suppose, then, that $B$ is a colour that occurs at least twice on $S$ in $W$.
Case 2a. If $B \in \mathcal{B}_0(P_{j-1})$, then let $d$ be the number of occurrences of $B$ on $S_1$ in $W$. The contribution of $B$ to $\frac{\omega(S \setminus S_1)}{\omega(S)}$ is at most $\gamma^{-d}$. Note that any node in $S_1$ (and in fact more generally in $Z_1(P_{j-1})$) that is labelled by $B$ in $W$ is also labelled by $B$ in $P_{j-1}$. Therefore, the contribution of $B$ to $\frac{\omega(P_j)}{\omega(P_{j-1})}$ is at least $\gamma^{-d}$.
Case 2b. If $B \in \mathcal{B}_1(P_{j-1})$, then let $d$ be the number of occurrences of $B$ on $S$ in $W$. The contribution of $B$ to $\frac{\omega(S \setminus S_1)}{\omega(S)}$ is at most $m^{-1}\gamma^{-d}$. Note that any node that is labelled by $B$ in $W$ is also labelled by $B$ in $P_{j-1}$. Therefore, the contribution of $B$ to $\frac{\omega(P_j)}{\omega(P_{j-1})}$ is at least $m^{-1}\gamma^{-d}$. This completes the proof of the claim.
Since $S$ is an abstract core in $W$, we have $v(S) \le v(S \setminus S_1)$, by the minimality of $S$. Because $S_1 \ne \emptyset$, and every node in a core has a label on it, it follows that, considering $S$ and $S \setminus S_1$ as induced subconfigurations of $W$, we have $\omega(S \setminus S_1) \ge \omega(S)(\log n)^{c_2}$. Using the claim above, the inequality $\omega(P_j) \ge \omega(P_{j-1})(\log n)^{c_2}$ follows.
Eventually we obtain a partial configuration $P_j$ with $Z_1(P_j) = \emptyset$. When this happens, we set $k = j$. By definition, we have in that case that $\omega(P_k) = n^2 m^g \gamma^u$ where $g = |\mathcal{B}_1(P_k)|$ and $u$ is the number of labels in $P_k$. Since any $B \in \mathcal{B}_1(P_k)$ labels at least two nodes in $P_k$ and $m\gamma^2 \le 1$, we find that $\omega(P_k) \le n^2$. Also note that $|Z_1(P_j)| \ge |Z_1(P_{j-1})| - t$ for any $j$, and $|Z_1(P_0)| = h - 2$, so $k \ge (h-2)/t$. Combining this with the inequality $\omega(P_j) \ge \omega(P_{j-1})(\log n)^{c_2}$ gives $\omega(W) = \omega(P_0) \le n^2 (\log n)^{-c_2 (h-2)/t}$.
We are now in a position to complete the proof of Lemma 13.
Proof of Lemma 13. By Lemma 16, it suffices to prove that in $G_0$ the expected number of irreducible core containers of size between $\log n$ and $(\log n)^{3t}$ is $o(1)$.
Claim. If $\log n \le h \le (\log n)^{3t}$, then the expected number of irreducible core containers of size $h$ in $G_0$ is at most $cn^2 (\log n)^{-t^3 h}$ for some absolute constant $c$.
But $\sum_{h \ge \log n} cn^2 (\log n)^{-t^3 h} = o(1)$, and the proof is complete.

The proof of Lemma 12
Our proof is based on the following two observations.
1. For any set of vertices $A$ of size $a$, $G_0[A]$ contains many monochromatic $s$-cliques with pairwise distinct colours.
2. If a monochromatic $s$-clique is present in $G_0$, then it is present also in $G_1$ with high probability, and, crucially, the events that various $s$-cliques are preserved are "sufficiently independent".
First, we shall construct a small set of bipartitions of the set of colours with a suitable property. In a moment it will become clear why we need this. We will refer to the two parts of a bipartition as the first part/first half and the second part/second half.
Lemma 21. There exists a constant c and a set Π of c log n partitions of the set of m colours, each into two sets of size m/2, such that for any two distinct colours C and D there is a π ∈ Π such that D is contained in the first part of π and C is contained in the second part of π.
Proof. Take $l = c \log n$ random partitions. For any $C, D$, the probability that none of the partitions is suitable is less than $(1 - \frac{1}{5})^l = n^{-c \log(5/4)}$. For $c$ sufficiently large this is less than $n^{-2}$, which is in turn less than $m^{-2}$, and the result follows from the union bound over all choices of $C, D$.
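The probabilistic argument is easy to simulate. For a uniformly random balanced bipartition of $m$ colours, a fixed ordered pair $(C, D)$ has $D$ in the first part and $C$ in the second with probability $\frac{m}{4(m-1)} > \frac{1}{4}$, which is where a bound such as $1 - \frac{1}{5}$ comes from. The sketch below keeps drawing random balanced bipartitions until every ordered pair is separated. The sampling model (a uniform shuffle), the helper names, the seed and the cap `max_partitions` are our assumptions, not taken from the paper.

```python
import random
from itertools import permutations
from typing import FrozenSet, List


def random_first_part(m: int, rng: random.Random) -> FrozenSet[int]:
    """First part of a uniformly random bipartition of range(m) into halves."""
    assert m % 2 == 0, "m is assumed even so that both parts have size m/2"
    colours = list(range(m))
    rng.shuffle(colours)
    return frozenset(colours[: m // 2])


def separates_all_ordered_pairs(m: int, first_parts: List[FrozenSet[int]]) -> bool:
    """True if for every ordered pair (C, D) of distinct colours some
    bipartition has D in its first part and C in its second part."""
    return all(
        any(d in first and c not in first for first in first_parts)
        for c, d in permutations(range(m), 2)
    )


def find_separating_family(m: int, seed: int = 0,
                           max_partitions: int = 10_000) -> List[FrozenSet[int]]:
    """Draw random balanced bipartitions until the family is separating."""
    rng = random.Random(seed)
    family: List[FrozenSet[int]] = []
    while not separates_all_ordered_pairs(m, family):
        if len(family) >= max_partitions:
            raise RuntimeError("gave up; increase max_partitions")
        family.append(random_first_part(m, rng))
    return family


if __name__ == "__main__":
    family = find_separating_family(16)
    print(len(family), "bipartitions suffice for 16 colours in this run")
```

In experiments of this kind the number of bipartitions needed grows logarithmically in $m$, in line with the lemma.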
Let $xy$ be an edge in $G_0$. Recall that it is not an edge in $G_1$ if $x, y$ have at least two colours in common. Suppose that this is the case. Then there exists some $\pi \in \Pi$ such that $x$ and $y$ have a colour in common from the first half of $\pi$ and also a colour in common from the second half of $\pi$.
Remark. From now on, when we say "the first m/2 colours", we will mean "the m/2 colours in the first part of π" provided it is clear which π we are talking about.
Definition 22. A pair (x, y) of vertices is π-dangerous for some π ∈ Π if there is a colour class among the first m/2 colours that contains both x and y.
Fix a set $A$ of vertices with $|A| = a$. Let $\mathcal{D}$ be the collection of colours $D$ such that at least one $K_s$ inside $A$ is entirely coloured with colour $D$ in $G_0$. (We require that every edge is given by this colour: that is, the vertices of the $K_s$ are in different parts of the complete $s$-partite graph with colour $D$.) For each $\pi \in \Pi$, let $\mathcal{D}_\pi$ be the set of all $D \in \mathcal{D}$ such that $D$ is one of the last $m/2$ colours.
To make sense of the statement of the next lemma, the reader should recall that $a\gamma$ is significantly less than $1$. (See the beginning of Section 2 for their precise values.)
Lemma 23. With probability $1 - o\big(1/\binom{n}{a}\big)$, $|\mathcal{D}_\pi| = \Omega(ma^s\gamma^s)$ for every $\pi \in \Pi$.
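Proof. Let $C$ be any colour class. The probability that $C$ intersects $A$ in exactly $s$ elements is $\binom{a}{s}\gamma^s(1-\gamma)^{a-s} \ge \binom{a}{s}\gamma^s(1 - a\gamma) \ge \frac{1}{2}\binom{a}{s}\gamma^s$, where the last inequality follows from the fact that $a\gamma = n^{2\alpha-1}(\log n)^{c_3 - c_2} = o(1)$.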
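Hence $\mathbb{P}[C \in \mathcal{D}] = \Omega(a^s\gamma^s)$. Moreover, the events $\{C \in \mathcal{D}\}$ are independent. Thus, for any $\pi$, by the Chernoff bound we get $\mathbb{P}\big[|\mathcal{D}_\pi| = o(ma^s\gamma^s)\big] \le e^{-\Omega(ma^s\gamma^s)}$. Therefore, using the union bound over all $\pi \in \Pi$, it suffices to prove that $(\log n)e^{-\Omega(ma^s\gamma^s)} = o\big(1/\binom{n}{a}\big)$.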
But $\binom{n}{a} \le \left(\frac{en}{a}\right)^a = e^{a\log(en/a)} = e^{O(a \log n)}$. Hence, we need $(\log n)e^{-\Omega(ma^s\gamma^s)} = o(e^{-O(a \log n)})$. For this, it is enough to prove that $a \log n = o(ma^s\gamma^s)$, i.e. $\log n = o(ma^{s-1}\gamma^s)$, which holds for our choice of parameters.
Therefore, using the union bound over all sets $A$ of size $a$, we may assume that $|\mathcal{D}_\pi| = \Omega(ma^s\gamma^s)$ for every $\pi \in \Pi$ and every such set $A$.
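The estimate $\binom{n}{a} \le (en/a)^a$ is standard, and it is the only property of the binomial coefficient that the union bound over all sets $A$ uses. As a quick numerical sanity check, the sketch below compares both sides on a logarithmic scale; the ranges of `n` and `a` are arbitrary illustrative choices.

```python
import math


def log_binom(n: int, a: int) -> float:
    """Natural logarithm of the binomial coefficient C(n, a)."""
    return math.log(math.comb(n, a))


def log_upper_bound(n: int, a: int) -> float:
    """Natural logarithm of (e * n / a)^a."""
    return a * math.log(math.e * n / a)


def bound_holds(n: int) -> bool:
    """Check C(n, a) <= (e n / a)^a for every 1 <= a <= n."""
    return all(log_binom(n, a) <= log_upper_bound(n, a) for a in range(1, n + 1))


if __name__ == "__main__":
    for n in (10, 100, 500):
        print(n, bound_holds(n))
```

Since $\log(en/a) \le 1 + \log n$, the right-hand side is indeed $e^{O(a \log n)}$.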
Lemma 24. With probability $1 - o(1)$ the following holds. For every $A$ of size $a$ and for every $\pi \in \Pi$, the density of $\pi$-dangerous pairs in $A$ is $o\big(\frac{1}{\log n}\big)$.
This result, which we shall prove later, allows us to assume for our fixed set $A$ that the following statement holds.
(⋆) For any $\pi \in \Pi$, the density of $\pi$-dangerous pairs in $A$ is $o\big(\frac{1}{\log n}\big)$.
For each $C \in \mathcal{D}$, pick a $K_s$ uniformly at random in $G_0[A]$ of colour $C$, and call it $T_C$. We can now prove that with sufficiently high probability, most $T_C$ will be present in $G_1$.
Lemma 25. Let $\pi \in \Pi$. Then with probability 1 − o( ), the number of colours $C \in \mathcal{D}_\pi$ for which $T_C$ has a $\pi$-dangerous pair of vertices is $o\big(\frac{|\mathcal{D}_\pi|}{\log n}\big)$.
Proof. We condition everything on the already chosen first $m/2$ colour classes. Now let $C \in \mathcal{D}_\pi$. (Recall that this means that there is a $K_s$ in $A$ in the graph $G_0$ with all its edges of colour $C$, and moreover that $C$ is one of the last $m/2$ colours with respect to $\pi$.) Label the vertices of $T_C$ by $1, 2, \dots, s$. Note that any pair of vertices in $A$ is chosen with equal probability and, by condition (⋆), at most $o\big(\frac{|A|^2}{\log n}\big)$ of them are $\pi$-dangerous. So the probability that the first two vertices of $T_C$ form a $\pi$-dangerous pair is $o\big(\frac{1}{\log n}\big)$. Hence, for any $C \in \mathcal{D}_\pi$, the probability that $T_C$ has a pair of vertices which form a $\pi$-dangerous pair is bounded above by some $p = o\big(\frac{1}{\log n}\big)$. Moreover, this holds for all such $C$ independently of the others. Thus, the probability that $T_C$ contains a $\pi$-dangerous pair for more than $\Omega\big(\frac{|\mathcal{D}_\pi|}{\log n}\big)$ choices of $C \in \mathcal{D}_\pi$ is at most $\mathbb{P}\big[\mathrm{Bin}(|\mathcal{D}_\pi|, p) = \Omega\big(\frac{|\mathcal{D}_\pi|}{\log n}\big)\big]$.
But this is $e^{-\Omega(|\mathcal{D}_\pi| / \log n)}$. So it remains to show that $(\log n)\binom{n}{a} = o\big(e^{\Omega(|\mathcal{D}_\pi| / \log n)}\big)$. Since $\binom{n}{a} \le \left(\frac{en}{a}\right)^a = e^{O(a \log n)}$, it suffices to prove that $a \log n = o\big(\frac{|\mathcal{D}_\pi|}{\log n}\big)$. But $|\mathcal{D}_\pi| = \Omega(ma^s\gamma^s)$ so it is enough to prove that $(\log n)^2 = o(ma^{s-1}\gamma^s)$. This holds provided that $(s-1)c_3 - sc_2 - c_1 > 2$.
With probability $1 - o\big(1/\binom{n}{a}\big)$, for all but $o(|\mathcal{D}|)$ colours $C \in \mathcal{D}$, all the edges of $T_C$ are present in $G_1$.
Proof. Suppose that $C \in \mathcal{D}$ and $T_C$ has an edge $e$ which is not present in $G_1$. Then there exists some $\pi \in \Pi$ such that $C$ is in the second half of $\pi$ (so $C \in \mathcal{D}_\pi$) and $e$ is $\pi$-dangerous. But by the previous lemma, with probability $1 - o\big(1/\binom{n}{a}\big)$ the number of such colours $C$ is $o\big(|\Pi| \cdot \frac{|\mathcal{D}|}{\log n}\big) = o(|\mathcal{D}|)$.
Using Lemma 23 and the union bound over all $A$, Lemma 12 follows.
We now return to proving Lemma 24. Recall that we want to show that almost surely for every $A$ and every $\pi$, the density of $\pi$-dangerous pairs in $A$ is $o\big(\frac{1}{\log n}\big)$. This is essentially best possible, since if we choose $A$ to contain one of our colour classes entirely (for a colour chosen from the first part of $\pi$), then the pairs of vertices in that colour class will all be $\pi$-dangerous. Moreover, as the typical size of a colour class is $n\gamma = n^\alpha(\log n)^{-c_2} = a(\log n)^{-c_2 - c_3}$, the set of these pairs will have density roughly $(\log n)^{-2c_2 - 2c_3}$.
Accordingly, the next lemma is to make sure that no colour class is exceptionally large. So we may assume that all colour classes have size at most 2nγ. After applying the union bound over all π ∈ Π and A, the next result completes the proof of Lemma 12.
Lemma 28. Fix $\pi \in \Pi$ and a set $A$ of size $a$. With probability 1 − o( ), the number of pairs in $A$ which are $\pi$-dangerous is at most $\frac{4a^2}{(\log n)^2}$.
Proof. The number of $\pi$-dangerous pairs in $A$ is at most
Let $h = \frac{a}{m^{1/2} \log n}$. Note that $\log h = (\alpha - \frac{1}{2}\delta)\log n + O(\log\log n)$ and recall that $\alpha > \frac{1}{2}\delta$. Now let p = P (Bin(a, γ)
Therefore we may assume that at most $m^{1/2+\rho}$ of the random variables $\mathrm{Bin}(a, \gamma)$ take value more than $h$.
The total contribution to (6) of the terms with $\mathrm{Bin}(a, \gamma) \le h$ is at most $mh^2 \le \frac{a^2}{(\log n)^2}$. The total contribution of the terms with $\mathrm{Bin}(a, \gamma) \ge h$ is bounded above by
and we just need to show that this sum is less than $\frac{3a^2}{(\log n)^2}$ with probability 1 − o( ).

The proof of Lemma 8
It is convenient to introduce the parameter
if $(s, t)$ is exceptional
Remark. $-\eta$ is the contribution of a block of size two to the value of a scheme. It is easy to check that $\eta > 0$.
The next lemma follows easily from Definition 7 and is a convenient way to look at the value of a scheme.
Lemma 29. Let $Q$ be a scheme. Then
where $\mathcal{D}$ is the set of colours in $Q$ and, for $D \in \mathcal{D}$, $|D|$ is the number of nodes in $Q$ that are coloured with $D$.
We shall now identify a scheme for which equality in Lemma 8 will hold: the value of α was chosen so that the value of this scheme would be 0. This is the (in)equality that generalizes equation (5) from the introduction. This "extremal scheme" turns out to be different in the regular and the exceptional case, which is why the formula for α also differs in the two cases.
Definition 30. Let $Q_1$ be the scheme where one colour gives a block of size $s$ and the rest of the edges are given by pairwise distinct colours.
Let $Q_2$ be the scheme where one colour gives a block of size $s$, another gives a block of size $t - s + 1$ sharing a single vertex with the previous block and the rest of the edges are given by pairwise distinct colours.
and (a) follows by direct substitution. We also have $v(Q_2) = t + (\delta + s(\alpha - 1)) + (\delta + (t - s + 1)(\alpha - 1))$ and (b) follows by direct substitution. The difference between $Q_1$ and $Q_2$ is that the former contains $\binom{t-s+1}{2}$ edges of distinct colours where the latter contains a block of size $t - s + 1$. Using Lemmas A.1 and A.2 (a) from the appendix, we obtain statements (c) and (d).
Definition 32. We call a block in a scheme large if it has size at least 3 and small otherwise. We call it an s-block if it has size s.
We shall begin by proving Lemma 8 in the special case when there is an s-block in the scheme.
Lemma 33. If Q is a scheme and it has an s-block then v(Q) ≤ 0.
Proof. Assume that $Q$ is such that $v(Q)$ is maximal. It is enough to show that $Q = Q_1$ or $Q = Q_2$. Since $Q$ has an $s$-block, any other block must have size at most $t - s + 1$. By Lemmas A.1 and A.2 (c) from the appendix, any large block of size smaller than $t - s$ gives a smaller contribution to the value than one obtains if the corresponding edges have pairwise distinct colours. Therefore, we may assume that $Q$ has no such block. So every block in $Q$, other than the one of size $s$, has size $2$, $t - s$ or $t - s + 1$. If there is a block of size $t - s + 1$, then $Q = Q_2$. If there are no large blocks, then $Q = Q_1$. Otherwise, there is a block of size $t - s \ge 3$.
We may therefore assume that there are at least two large blocks other than the one of size s, and that both have size t − s. This forces t − s to equal 3. Moreover, by Lemmas A.1 and A.2 (b), we have that t = 2s − 1. It follows that s = 4 and t = 7. So Q consists of a 4-block and several 3-blocks (there can be at most 3) and the rest of the edges are given by distinct colours. It is easy to check that in this case v(Q) ≤ 0.
Using the previous result, to prove Lemma 8, it is sufficient to prove the following statement.
Lemma 34. Suppose that Q is a scheme with v(Q) as large as possible. Assume also that Q does not contain a block of size s. Then v(Q) ≤ 0.
To prove Lemma 34, we shall introduce the following definition.
Definition 35. Let $P$ be a node in a scheme. The local value at $P$, which we denote by $v(P)$, is defined by the formula $v(P) = 1 + \sum_{D : P \in D} \big(\delta/|D| + (\alpha - 1)\big)$, where the summation is over all blocks containing $P$.
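Summing the local values over all $t$ nodes simply regroups the terms by block: each block $D$ contributes $\delta/|D| + (\alpha - 1)$ once for each of its $|D|$ nodes, so $\sum_P v(P) = t + \sum_D \big(\delta + |D|(\alpha - 1)\big)$. The sketch below checks this regrouping with exact rational arithmetic on a small hypothetical scheme with four nodes. The values `alpha = 2/5` and `delta = 1/2` are placeholders and are not the paper's parameters.

```python
from fractions import Fraction
from typing import FrozenSet, List


def local_value(node: int, blocks: List[FrozenSet[int]],
                alpha: Fraction, delta: Fraction) -> Fraction:
    """v(P) = 1 + sum over blocks D containing P of (delta/|D| + (alpha - 1))."""
    return 1 + sum(delta / len(block) + (alpha - 1)
                   for block in blocks if node in block)


def regrouped_sum(num_nodes: int, blocks: List[FrozenSet[int]],
                  alpha: Fraction, delta: Fraction) -> Fraction:
    """t + sum over blocks D of (delta + |D| * (alpha - 1))."""
    return num_nodes + sum(delta + len(block) * (alpha - 1) for block in blocks)


# Hypothetical example: one block of size 3 and three blocks of size 2
# on the nodes 0, 1, 2, 3, so that every pair lies in exactly one block.
EXAMPLE_BLOCKS = [frozenset({0, 1, 2}), frozenset({0, 3}),
                  frozenset({1, 3}), frozenset({2, 3})]

if __name__ == "__main__":
    alpha, delta = Fraction(2, 5), Fraction(1, 2)  # placeholder values
    total = sum(local_value(p, EXAMPLE_BLOCKS, alpha, delta) for p in range(4))
    print(total, regrouped_sum(4, EXAMPLE_BLOCKS, alpha, delta))
```

This regrouping is what allows a bound on every local value to be turned into a bound on a sum over blocks.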
The next result is the key part in the proof of Lemma 8.
Lemma 37. Suppose that $Q$ is a scheme such that $v(Q)$ is maximal. Let $P$ be a node and assume that every block containing $P$ has size less than $t/2$. Then $v(P) < 2\delta/t$.
Proof. Let the blocks of $Q$ that contain $P$ have sizes $r_1, \dots, r_u$. Then $\sum_i r_i = t + u - 1$. Let $k$ be the minimal integer greater than $2$ that is equal to some $r_i$ (or, if no such integer exists, then let $k$ be large enough that $\delta/k - \delta/(k+1) < \eta/2$). Let $R = \lfloor \frac{t-1}{2} \rfloor$. By assumption, $r_i \le R$ for all $i$. Moreover, by the maximality of $v(Q)$ and Lemma A.1, we have the inequality $k\eta \ge \delta$ and therefore $\delta/k - \delta/(k+1) = \frac{\delta}{k(k+1)} \le \frac{\eta}{k+1} < \eta/2$.
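The identity $\sum_i r_i = t + u - 1$ expresses the fact that the blocks through $P$ cover each of the other $t - 1$ nodes exactly once. The sketch below checks it on a hypothetical example, under the assumption (our reading, since the definition of a scheme is not restated here) that every pair of nodes of a scheme lies in exactly one block. The example with seven nodes and seven blocks of size three is ours and is not taken from the paper.

```python
from itertools import combinations
from typing import FrozenSet, List


def every_pair_in_one_block(t: int, blocks: List[FrozenSet[int]]) -> bool:
    """True if each pair of nodes in range(t) lies in exactly one block."""
    return all(
        sum(1 for block in blocks if p in block and q in block) == 1
        for p, q in combinations(range(t), 2)
    )


def block_sizes_at(node: int, blocks: List[FrozenSet[int]]) -> List[int]:
    """Sizes r_1, ..., r_u of the blocks that contain the given node."""
    return [len(block) for block in blocks if node in block]


# Hypothetical example with t = 7: seven blocks of size 3.
SEVEN_NODE_BLOCKS = [frozenset(b) for b in (
    {0, 1, 2}, {0, 3, 4}, {0, 5, 6}, {1, 3, 5},
    {1, 4, 6}, {2, 3, 6}, {2, 4, 5},
)]

if __name__ == "__main__":
    t = 7
    for node in range(t):
        sizes = block_sizes_at(node, SEVEN_NODE_BLOCKS)
        print(node, sum(sizes), t + len(sizes) - 1)
```

Here every node lies in $u = 3$ blocks of size $3$, and indeed $3 + 3 + 3 = 7 + 3 - 1$.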

Claim 1. There exist positive integers $w$ and $q_1, \dots, q_w$ such that
(i) $2 \le q_j \le R$ for all $j$;
(ii) $\sum_j q_j = t + w - 1$;
(iii) there is at most one $j$ for which $2 < q_j < k$, and if there is any $i$ with $q_i = 2$, then there is no $j$ with $2 < q_j < k$.
This completes the proof of Claim 1.
We are ready to complete the proof of Lemma 8.
Proof of Lemma 8. We may assume that $v(Q)$ is maximal possible among all schemes $Q$. If $Q$ has a block of size $s$, then we are done by Lemma 33. Otherwise, by Lemma 38, there is no block of size greater than or equal to $t/2$. But then Lemma 36 and Lemma 37 together imply that $v(Q) \le t \cdot \frac{2\delta}{t} - (\delta + 2\alpha) = \delta - 2\alpha < 0$.