A tight Erd\H{o}s-P\'osa function for planar minors

Let $H$ be a planar graph. By a classical result of Robertson and Seymour, there is a function $f:\mathbb{N} \to \mathbb{R}$ such that for all $k \in \mathbb{N}$ and all graphs $G$, either $G$ contains $k$ vertex-disjoint subgraphs each containing $H$ as a minor, or there is a subset $X$ of at most $f(k)$ vertices such that $G-X$ has no $H$-minor. We prove that this remains true with $f(k) = c k \log k$ for some constant $c=c(H)$. This bound is best possible, up to the value of $c$, and improves upon a recent result of Chekuri and Chuzhoy [STOC 2013], who established this with $f(k) = c k \log^d k$ for some universal constant $d$. The proof is constructive and yields a polynomial-time $O(\log \mathsf{OPT})$-approximation algorithm for packing subgraphs containing an $H$-minor.


Introduction
In 1965, Erdős and Pósa [15] proved that there is a function f (k) = O(k log k) such that for every graph G and every k ∈ N, either G contains k vertex-disjoint cycles, or there is a set X of at most f (k) vertices such that G − X is a forest. Many variants and generalizations of this theorem have been developed over the years, such as for cycles satisfying various constraints [4, 6, 7, 17, 29, 31-35, 40, 41, 49-51], directed cycles [28,44], matroid circuits [24], and immersions [25,36]; see [42,43] for surveys.
In this paper, the objects of interest are graph minor models. A graph H is a minor of a graph G if H can be obtained from a subgraph of G by contracting edges. If H is not a minor of G, then G is said to be H-minor free. A preliminary version of this paper will appear as an extended abstract in the Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '19) [8].
build on their approach. Our main technical contribution is a series of lemmas which allowed us to develop the 'right' generalization of the objects used in [1]. An overview of the proof will be given shortly but first let us mention some combinatorial and algorithmic consequences of our result.

Consequences of our results
We describe in this section several consequences of our results. Their proofs are given in Sections 9 and 10.
Approximation algorithms for packing and covering models. Our proof of Theorem 1.1 is constructive, in the sense that it can be turned into a polynomial-time algorithm computing both a collection C of k disjoint H-models in the input graph G, and a subset X of at most $c' k \log(k + 1)$ vertices such that G − X has no H-model, for some constant $c'$ depending on the constant c in Theorem 1.1 and for some k ∈ N. Note that C, X together witness the fact that (1) |X| is within an $O(\log \tau_H(G))$ factor of $\tau_H(G)$ (since $k \le \tau_H(G)$), and (2) |C| is within an $\Omega(1/\log \nu_H(G))$ factor of $\nu_H(G)$ (since $k \le \nu_H(G) \le c' k \log(k + 1)$). Thus, we get O(log(OPT))-approximation algorithms for both the packing and covering problems associated to planar H-models. The result for covering is already known. In fact, for every planar graph H, there is even a constant-factor approximation algorithm for computing $\tau_H(G)$. Indeed, a randomized constant-factor approximation was first developed by Fomin, Lokshtanov, Misra, and Saurabh [21], and very recently a deterministic one was obtained by Gupta, Lee, Li, Manurangsi, and Włodarczyk [27].
On the other hand, the result for packing is new. It is also close to best possible in the following sense: When $H = K_3$ the packing problem corresponds to the well-studied problem of packing cycles, which is known to be quasi-NP-hard to approximate to within a ratio of $O(\log^{1/2 - \varepsilon} \mathrm{OPT})$ [23]. We note also that when H is a forest, $\nu_H(G)$ can be approximated to within a constant factor [19].
Large treewidth graph decompositions. A second consequence of our main theorem is the following partitioning corollary.

Corollary 2.2.
There is a function s 2.2 : N → N such that for all integers r, k ≥ 1, every graph G of treewidth at least s 2.2 (r) · k log(k + 1) has k vertex-disjoint subgraphs G 1 , . . . , G k , each of treewidth at least r.
In particular, the treewidth of every graph not containing k disjoint copies of a fixed planar graph H as a minor is O(k log k), where the hidden constant depends on H. This is best possible when H contains a cycle (see the paragraph following Theorem 1.1). A similar result with an $s(r) \cdot k \log^d(k + 1)$ bound for some universal constant d was obtained by Chekuri and Chuzhoy [10, Theorem 1.1]. Again, we remark that, while the poly-logarithmic dependency on k in their bound is not optimal, their theorem has the extra advantage that s can be taken as a polynomial, which is not the case in our proof of Corollary 2.2.
Computing minor-closed bidimensional parameters. Let π be a graph parameter, that is, a function mapping graphs to integers and that is constant within each isomorphism class. We say that π is minor-closed if π(H) ≤ π(G) for every minor H of every graph G. In [10], Chekuri and Chuzhoy gave algorithms to compute graph parameters satisfying certain conditions.
Theorem 2.3 ([10, Theorem 5.3]). Let π be a minor-closed parameter that is positive on all graphs with treewidth at least p, is at least the sum over the components of a disconnected graph, and can be computed in time $h(w) \cdot n^{O(1)}$ given a tree-decomposition of width w of the graph.
Then there is a constant d and an algorithm that, given an n-vertex graph G and an integer k, decides whether π(G) ≤ k in time $2^{O(p^2 k \cdot \log^d pk)} + h(O(p^2 k \cdot \log^d pk)) \, n^{O(1)}$.
Note that the requirements of Theorem 2.3 are satisfied by several well-studied parameters such as feedback vertex set, vertex cover, and more generally any packing or covering problem of models of a fixed planar graph (as described at the beginning of the section). By plugging the improved bounds of our partitioning result, Corollary 2.2, into the proof of Theorem 2.3 from [10], we obtain the following result.
Corollary 2.4. Let π be a minor-closed parameter that is positive on all graphs with treewidth at least p, is at least the sum over the components of a disconnected graph, and can be computed in time $h(w) \cdot n^{O(1)}$ given a tree-decomposition of width w of the graph.
Then there is an algorithm that, given an n-vertex graph G and an integer k, decides whether π(G) ≤ k in time $2^{O(s_{2.2}(p) k \log k)} + h(O(s_{2.2}(p) k \log k)) \, n^{O(1)}$.
Observe that Corollary 2.4 improves the dependence on k of the algorithm from Theorem 2.3, at the cost of a worse dependence on p. However, in the natural setting where π is fixed and we want to check π(G) ≤ k for various pairs (G, k), p is a constant so its contribution is less relevant. As noted in [10], the requirements on π can also be stated as follows.
Corollary 2.5. Let π be a minor-closed parameter that is positive on some t-vertex planar graph H, is at least the sum over the components of a disconnected graph, and can be computed in time $h(w) \cdot n^{O(1)}$ given a tree-decomposition of width w of the graph.
Then there is a function $s_{2.5}$ and an algorithm that, given an n-vertex graph G and an integer k, decides whether π(G) ≤ k in time
Erdős-Pósa property in minor-closed classes. For a graph H and a class $\mathcal{G}$ of graphs, we say that the Erdős-Pósa property holds for H-models in $\mathcal{G}$ if there exists a bounding function f : N → R such that $\tau_H(G) \le f(\nu_H(G))$ holds for every graph $G \in \mathcal{G}$. Restricting the class $\mathcal{G}$ sometimes yields improved bounding functions. For instance, while the bounding function in the classic Erdős-Pósa theorem is Θ(k log k), it can be improved to O(k) when restricted to planar graphs [3]. In fact, this is true more generally for H-models for any fixed planar graph H when restricted to any proper minor-closed class $\mathcal{G}$, as shown by Fomin, Saurabh, and Thilikos [22].
Theorem 2.6 (Fomin, Saurabh, and Thilikos [22]). Let $\mathcal{G}$ be a proper minor-closed graph class and let H be a planar graph. Then there exists a constant $c := c(\mathcal{G}, H)$ such that the Erdős-Pósa property holds for H-models in $\mathcal{G}$ with bounding function f(k) = ck.
As it turns out this theorem also follows directly from our main technical theorem (stated in the next section).
Packing cycles with modularity constraints. In 1988, Thomassen obtained the following modularity-constrained variant of the Erdős-Pósa theorem:
Theorem 2.7 (Thomassen [49]). For every m ∈ N there is a function f such that, for every k ∈ N and every graph G, either G contains k vertex-disjoint cycles of length 0 modulo m, or there is a subset X of at most f(k) vertices such that G − X has no such cycle.
Wollan [51] obtained a similar statement for cycles with non-zero length modulo m, when m is odd. As proved by Dejter and Neumann-Lara [14], the same statement does not hold in general for cycles of length ℓ modulo m, when ℓ ∈ [m − 1]. Thomassen's upper bound $f(k) = 2^{2^{O(k)}}$ (for fixed m) was later improved to $f(k) = O(k \log^d k)$ for some d by Chekuri and Chuzhoy [10], who used a partitioning theorem similar to our Corollary 2.2. As a consequence of our main theorem, we obtain an O(k log k) bounding function for cycles of length 0 modulo m, which is the same as in the original Erdős-Pósa Theorem.
Corollary 2.8. For every positive integer m there is a constant c := c(m) such that, for every k ∈ N and every graph G, either G contains k vertex-disjoint cycles of length 0 modulo m, or there is a subset X of at most c · k log(k + 1) vertices such that G − X has no such cycle.
Extremal graphs showing that this bound is tight (up to the value of c) can be obtained from extremal graphs for the original Erdős-Pósa Theorem by subdividing every edge m − 1 times. We actually prove a stronger statement about modularity-constrained subdivisions of planar subcubic graphs, whose proof we postpone to Section 10.
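The subdivision step can be illustrated with a minimal Python sketch. The function name `subdivide` and the naming scheme for the new internal vertices are illustrative assumptions, not taken from the paper. Subdividing every edge m − 1 times replaces each edge by a path with m edges, so every cycle of the resulting graph has length 0 modulo m.

```python
def subdivide(edges, times):
    """Replace every edge (u, v) by a path with `times` new internal vertices."""
    new_edges = []
    for u, v in edges:
        # Hypothetical labels for the internal vertices created on edge uv.
        chain = [u] + [("sub", u, v, i) for i in range(times)] + [v]
        new_edges.extend(zip(chain, chain[1:]))
    return new_edges


m = 4
triangle = [(0, 1), (1, 2), (2, 0)]
subdivided = subdivide(triangle, m - 1)
# Each original edge is now a path with m edges, so the triangle
# becomes a single cycle whose length is a multiple of m.
cycle_length = len(subdivided)
```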

Overview of the proof
In this paper, all logarithms are binary. Unless otherwise specified, the graphs we consider are finite, simple, and undirected. In particular, when contracting edges of a graph, we subsequently delete resulting loops and parallel edges. Let G be a graph. We use |G| and ‖G‖ as shorthand for |V(G)| and |E(G)|, respectively.
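As a small illustration of the contraction convention, the following Python sketch contracts one edge of a simple graph stored as a dictionary of neighbour sets. The representation and the name `contract_edge` are assumptions made for this example only. Using sets means that loops and parallel edges disappear automatically.

```python
def contract_edge(adj, u, v):
    """Contract the edge uv of a simple graph {vertex: set(neighbours)}; v is merged into u."""
    assert v in adj[u], "uv must be an edge"
    new_adj = {x: set(nb) for x, nb in adj.items() if x != v}
    for w in adj[v]:
        if w != u:
            new_adj[w].discard(v)
            new_adj[w].add(u)   # a set keeps at most one copy of each edge
            new_adj[u].add(w)
    new_adj[u].discard(v)       # drop the loop created by the contraction
    return new_adj


triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
contracted = contract_edge(triangle, 0, 1)
num_vertices = len(contracted)                                  # |G| after contraction
num_edges = sum(len(nb) for nb in contracted.values()) // 2     # number of edges after contraction
```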
A separation of a graph G is a pair (A, B) of subsets of V (G) such that A ∪ B = V (G) and G has no edge from A \ B to B \ A. Observe that our definition allows A or B to be empty. The order of the separation is |A ∩ B|.
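The definition of a separation translates directly into a check. The sketch below assumes a graph given as a dictionary of neighbour sets; the function names are illustrative.

```python
def is_separation(adj, A, B):
    """Check that A ∪ B = V(G) and that no edge joins A \\ B to B \\ A."""
    A, B = set(A), set(B)
    if A | B != set(adj):
        return False
    return not any(w in B - A for u in A - B for w in adj[u])


def separation_order(A, B):
    """The order of a separation (A, B) is |A ∩ B|."""
    return len(set(A) & set(B))


# Path 0-1-2-3-4: the middle vertex separates the two halves.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
good = is_separation(path, {0, 1, 2}, {2, 3, 4})
bad = is_separation(path, {0, 1}, {2, 3, 4})        # the edge 1-2 crosses
trivial = is_separation(path, set(path), set())     # B may be empty
```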
The heart of our proof is the following technical theorem.
Theorem 3.1 (Main technical theorem). For every p ∈ N, every planar graph H, and every non-decreasing function g with g(0) = 1, there is a constant σ ∈ N such that for every graph G, at least one of the following holds.
(i) G contains an H-model of size at most σ;
(ii) G contains a $K_p$-model of size at most σ log |G|;
(iii) G has a separation (A, B) of order at most σ such that G[A] does not contain H as a minor and |A| ≥ g(|A ∩ B|).
Theorem 1.1 follows quickly from Theorem 3.1 using previous results. We give the derivation in Section 5. Thus, it only remains to prove Theorem 3.1.
To give a high-level idea of our proof strategy for Theorem 3.1, we sketch it for the case H = K 3 . Note that every cycle in our graph G is a K 3 -model. First, we consider a maximum-size collection P of paths of length ω, for some large enough constant ω. Assume for simplicity that these paths cover all vertices of G. If one of the paths in P is not induced, we find a cycle of length at most ω. Similarly, if two of these paths are connected by at least two edges, we get a cycle of length at most 2ω. In both cases, we find a K 3 -model of size at most 2ω ≤ σ for a suitable choice of the constant σ, and (i) is satisfied. Thus we may assume this does not happen.
Then, we consider the auxiliary graph G′ on vertex set P where two vertices are adjacent if the corresponding paths are connected by an edge in G. If G′ has large enough minimum degree (as a function of p), then a known result (see Theorem 4.4 in the next section) yields a $K_p$-model of size O(log |G′|) in G′, which translates into a $K_p$-model of size O(ω log |G|) in G, which is outcome (ii).
Hence, we may assume that G′ has a vertex of degree bounded by some function of p.
Then the corresponding path P ∈ P has neighbors in only a few other paths of P. By letting A := V (P ) and letting B be the rest of the graph plus the vertices of A with a neighbor in G − A, we obtain outcome (iii) (assuming ω has been chosen large enough).
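The last two steps of this sketch can be made concrete as follows. This is a toy illustration under the simplifying assumption made above that the paths cover all vertices; the helper names `auxiliary_graph` and `separation_around` are hypothetical.

```python
def auxiliary_graph(adj, paths):
    """Auxiliary graph on path indices: i ~ j if some edge of G joins path i and path j."""
    owner = {v: i for i, path in enumerate(paths) for v in path}
    aux = {i: set() for i in range(len(paths))}
    for u, neighbours in adj.items():
        for w in neighbours:
            if owner[u] != owner[w]:
                aux[owner[u]].add(owner[w])
    return aux


def separation_around(adj, path):
    """A := V(P); B := the rest of G plus the vertices of A with a neighbour outside A."""
    A = set(path)
    boundary = {u for u in A if adj[u] - A}
    B = (set(adj) - A) | boundary
    return A, B


# Toy instance: two paths 0-1-2 and 3-4-5 joined by the single edge 2-3.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
paths = [[0, 1, 2], [3, 4, 5]]
aux = auxiliary_graph(adj, paths)
low = min(aux, key=lambda i: len(aux[i]))   # a path of minimum degree in the auxiliary graph
A, B = separation_around(adj, paths[low])
```

In this toy instance the separation has order 1, since only vertex 2 of the chosen path has a neighbour outside it.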
While the arguments leading to outcomes (ii) and (iii) above work for all planar graphs H, this approach fails in general as the existence of many edges between two paths of P does not always yield a small H-model (outcome (i)).
The aforementioned result of [1] for the case where H is a wheel avoids this difficulty by packing paths and cycles instead of just paths. However, this technique breaks down when trying to pack subgraphs having a vertex of degree at least 3.
In our proof, we address this difficulty by introducing a family of objects called orchards and considering orchard packings as a counterpart to the family P of paths/cycles. Roughly speaking, orchards have the property that two disjoint orchards connected by many edges either can be combined into more desirable structures (in the same sense that two paths connected by two edges induce a cycle in the proof sketch above), or the orchards can be separated in a 'clean way' from each other using a small set of vertices. This allows us to conclude similarly as above. However, the proof is more involved.
The rest of the paper is organized as follows. The next section contains the general definitions and results we use. In Section 5 we prove Theorem 1.1 assuming Theorem 3.1.
Orchards and orchard packings are introduced in Section 6 and Section 7, along with some key separation lemmas. Using these results we finally prove Theorem 3.1 in Section 8. The proofs of the algorithmic and combinatorial consequences of our results stated in Section 2 are given in Sections 9 and 10, respectively.

Preliminaries
A tree-decomposition of a graph G is a tree T together with subsets $B_t$ of V(G) for each t ∈ V(T) satisfying
• for each uv ∈ E(G), there exists t ∈ V(T) such that $u, v \in B_t$, and
• for each v ∈ V(G), the set of all w ∈ V(T) such that $v \in B_w$ induces a subtree of T.
The width of the tree-decomposition is max t∈V (T ) {|B t |−1}. The treewidth of G, denoted tw(G), is the minimum width taken over all tree-decompositions of G.
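The two conditions and the width can be checked mechanically. The following sketch assumes that `tree_adj` really describes a tree (this is not verified) and that bags are stored in a dictionary indexed by the nodes of T; all names are illustrative.

```python
def induces_connected(nodes, tree_adj):
    """True if `nodes` is non-empty and induces a connected subgraph of T."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for s in tree_adj[stack.pop()]:
            if s in nodes and s not in seen:
                seen.add(s)
                stack.append(s)
    return seen == nodes


def is_tree_decomposition(adj, tree_adj, bags):
    # Every edge of G must lie inside some bag.
    for u in adj:
        for v in adj[u]:
            if not any(u in bag and v in bag for bag in bags.values()):
                return False
    # The nodes whose bags contain v must induce a subtree of T.
    return all(
        induces_connected([t for t in bags if v in bags[t]], tree_adj) for v in adj
    )


def width(bags):
    return max(len(bag) for bag in bags.values()) - 1


cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
ok = is_tree_decomposition(cycle4, {"s": {"t"}, "t": {"s"}}, {"s": {0, 1, 2}, "t": {0, 2, 3}})
path_tree = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
not_ok = is_tree_decomposition(
    cycle4, path_tree, {"a": {0, 1}, "b": {1, 2}, "c": {2, 3}, "d": {3, 0}}
)
```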
Theorem 4.1 (Robertson and Seymour [46]). There exists a function f 4.1 : N → N such that for every t ∈ N, every graph of treewidth at least f 4.1 (t) contains every t-vertex planar graph as a minor.
By the results of Chekuri and Chuzhoy [11,12], f 4.1 can be bounded from above by a polynomial function.
We do not directly use tree-decompositions in this paper. Instead, we use the following dual notion. A bramble $\mathcal{B}$ in a graph G is a collection of vertex sets of connected subgraphs of G, called bramble sets of $\mathcal{B}$, such that for all $B, B' \in \mathcal{B}$, $|B \cap B'| \ge 1$ or there is an edge between B and B′. The order of $\mathcal{B}$ is the minimum size of a set W ⊆ V(G) such that W intersects all bramble sets.
Theorem 4.2 (Seymour and Thomas [47]). Let k ≥ 0 be an integer. A graph has treewidth at least k if and only if it contains a bramble of order at least k + 1.
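For small graphs, the bramble conditions and the order can be checked by brute force. The sketch below is only meant to illustrate the definitions (the hitting-set search is exponential); function names are illustrative. On the 4-cycle it finds a bramble of order 3, which by Theorem 4.2 certifies treewidth at least 2.

```python
from itertools import combinations


def is_connected(adj, X):
    X = set(X)
    if not X:
        return False
    start = next(iter(X))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()] & X:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == X


def touch(adj, X, Y):
    """Two sets touch if they intersect or some edge joins them."""
    return bool(X & Y) or any(adj[x] & Y for x in X)


def is_bramble(adj, sets):
    return all(is_connected(adj, X) for X in sets) and all(
        touch(adj, X, Y) for X, Y in combinations(sets, 2)
    )


def bramble_order(adj, sets):
    """Minimum size of a vertex set meeting every bramble set (brute force)."""
    vertices = list(adj)
    for size in range(len(vertices) + 1):
        for W in combinations(vertices, size):
            if all(set(W) & X for X in sets):
                return size


cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
bramble = [{0}, {1, 2}, {3}]
valid = is_bramble(cycle4, bramble)
order = bramble_order(cycle4, bramble)
```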
We also require the following two theorems.
Theorem 4.4 (Fiorini, Joret, Theis, and Wood [18], see also [37,48]). There is a function f 4.4 : N → N such that, for every n, p ∈ N, if an n-vertex graph has average degree at least f 4.4 (p), then it contains a K p -model on O(log n) vertices.

From the main technical theorem to the main theorem
In this section, we show how Theorem 1.1 can be deduced from Theorem 3.1. We follow the same line of proof as in [1] by considering a minimal counterexample and showing that the outcomes of Theorem 3.1 contradict its minimality. By minor-minimal we mean minimal with respect to the minor ordering. We rely on the following results.
Theorem 5.2 as originally stated in [19] does not guarantee that $f_{5.2}$ is non-decreasing. We can however easily obtain this property by defining $f_{5.2}(k) = \max_{i \in \{0, \dots, k\}} f(i)$, with f the function given in [19], and clearly $f_{5.2}$ then has the properties claimed in Theorem 5.2.
We are now ready to prove Theorem 1.1, assuming Theorem 3.1.
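The non-decreasing envelope used for $f_{5.2}$ is simply a running maximum. A minimal sketch, with an arbitrary made-up function standing in for the function f of [19]:

```python
from itertools import accumulate


def monotone_envelope(f, n):
    """Return [max(f(0..k)) for k in 0..n], a non-decreasing majorant of f."""
    return list(accumulate((f(i) for i in range(n + 1)), max))


# Placeholder values, chosen only to show a function that is not monotone.
values = [1, 4, 2, 7, 3]
envelope = monotone_envelope(values.__getitem__, len(values) - 1)
```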
Proof of Theorem 1.1. Let us first assume that H is connected. We explain at the end of the proof how the result extends to disconnected graphs.
Let α and β be positive integers such that for every integer k ≥ 1, we have $p_{5.1}(k) \le \alpha k^{\beta}$, where $p_{5.1}$ is the function of Theorem 5.1 for H. Such numbers exist as this function is a polynomial.
Let f 5.2 be the function of Theorem 5.2 for the graph H. Clearly we can assume f 5.2 (0) = 1. Let σ be the constant of Theorem 3.1 for p = |H| and for the function f 5.2 . We prove Theorem 1.1 for f (k) = ck log(k + 1), where c is a positive integer such that c ≥ σ(log α + β log c + 2β).
Towards a contradiction, suppose $\tau_H(G) > f(\nu_H(G))$ for some graph G. Among all such graphs, we choose G such that the tuple $(\nu_H(G), |G|, \|G\|)$ is lexicographically minimum. Let $k = \nu_H(G) \ge 1$.
We apply Theorem 3.1 on G with p = |H| and $g = f_{5.2}$. According to Theorem 5.2, the outcome (iii) of Theorem 3.1 implies the existence of a graph G′ such that This would however contradict the minimality of G. Therefore we may now assume that one of the first two outcomes of Theorem 3.1 holds. Which of the two outcomes holds is not important for the rest of the proof, as we will only use the fact that G contains a model M of H of size at most σ · log |G|, which is true in both cases. Using properties of G we will show that |V(M)| ≤ σ · log |G| ≤ c log(k + 1). Once this is established, using that the graph G − V(M) is not a counterexample to Theorem 1.1, we will conclude that G cannot be a counterexample either.
The definition of G implies that it is minor-minimal with the property $\tau_H(G) > f(\nu_H(G))$. Thus, if G′ is a proper minor of G, In particular, G is minor-minimal with the property $\tau_H(G) > f(k)$. Now, observe that $\tau_H(G) \le \tau_H(G - v) + 1$ for any vertex v ∈ V(G) (simply add v to an optimal hitting set for G − v).
By minimality of G, we have $\tau_H(G') \le f(\nu_H(G'))$. Then Therefore, G is not a counterexample, a contradiction.
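The chain of inequalities behind this conclusion is not reproduced above. Under the assumption that it proceeds by deleting the model M, so that the packing number drops to at most k − 1 while at most c log(k + 1) vertices are added to a hitting set, the arithmetic it needs is f(k − 1) + c log(k + 1) ≤ f(k) for f(k) = ck log(k + 1). The following sketch checks this inequality numerically for a range of values; it is a sanity check of the arithmetic, not a substitute for the proof.

```python
import math


def f(k, c):
    """The bounding function f(k) = c * k * log2(k + 1)."""
    return c * k * math.log2(k + 1)


def step_holds(k, c, eps=1e-9):
    # f(k - 1) + c * log2(k + 1) <= f(k), with a small tolerance for rounding.
    return f(k - 1, c) + c * math.log2(k + 1) <= f(k, c) + eps


all_hold = all(step_holds(k, c) for k in range(1, 2001) for c in (1, 7, 100))
```

The inequality follows from log k ≤ log(k + 1), with equality exactly when k = 1.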
We now consider the case where H is not connected. Let H′ be a planar connected graph with V(H′) = V(H) and E(H′) ⊇ E(H). Such a graph can be obtained from planar drawings of the components of H by adding edges between their external faces in a planar way. As shown in the first part of the proof, there is a bounding function $f'(k) = c' k \log(k + 1)$ for H′-models, for some constant c′ depending on H′ only. By applying Lemma 5.3 to H′, f′, and H, we obtain a bounding function f for H-models which is of the same order of magnitude as f′, as desired.

Orchards
We prove in this section a series of lemmas about bramble-like objects that we call orchards. Given positive integers a, b, an a × b-orchard R in G is a collection $P_1, \dots, P_a$ of a pairwise vertex-disjoint paths, called horizontal paths, and a collection $T_1, \dots, T_b$ of b pairwise vertex-disjoint trees, called vertical trees, such that
• $P_i \cap T_j$ is non-empty and connected (and thus a path) for each i ∈ [a] and j ∈ [b], and
• each leaf of $T_j$ is on some horizontal path, for each j ∈ [b].
With a slight abuse of notation we also write R for the subgraph formed by the union of the horizontal paths and vertical trees of R. It should be clear from the context whether R means the orchard itself or the corresponding subgraph of G. Orchards are similar to brambles in the sense that they can serve as certificates for large treewidth. In fact, every large enough orchard contains a bramble of large order (see the proof of Lemma 6.1). However, they are more structured, which makes them easier to handle. We note that grids are particular examples of orchards. Thus, in this sense orchards lie somewhere in between grids and brambles. We note that a concept similar to orchards is that of grid-like minors, introduced by Reed and Wood [45]. Grid-like minors are collections of paths whose intersection graphs are bipartite and contain a large clique minor. While orchards and grid-like minors have common features (note that the intersection graph of the horizontal paths and vertical trees of an orchard is a complete bipartite graph), in general they are incomparable objects.
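To make the definition concrete, the sketch below checks the orchard conditions for horizontal paths given as vertex lists and vertical trees given as adjacency dictionaries. The representation and the name `is_orchard` are assumptions for this example, and the code does not verify that each `tree` is actually a tree. The rows and columns of a grid pass the check, as noted above.

```python
def is_orchard(paths, trees):
    path_sets = [set(p) for p in paths]
    tree_sets = [set(t) for t in trees]

    def pairwise_disjoint(family):
        return sum(len(s) for s in family) == len(set().union(*family))

    if not (pairwise_disjoint(path_sets) and pairwise_disjoint(tree_sets)):
        return False
    on_paths = set().union(*path_sets)
    for tree in trees:
        # Every leaf of a vertical tree must lie on some horizontal path.
        if any(len(nb) <= 1 and v not in on_paths for v, nb in tree.items()):
            return False
        for path in paths:
            idx = [i for i, v in enumerate(path) if v in tree]
            # The common vertices must be non-empty and consecutive on the path ...
            if not idx or idx[-1] - idx[0] + 1 != len(idx):
                return False
            # ... and joined by edges that the tree shares with the path.
            if any(path[i + 1] not in tree[path[i]] for i in idx[:-1]):
                return False
    return True


rows, cols = 2, 3
grid_paths = [[(r, c) for c in range(cols)] for r in range(rows)]
grid_trees = [{(0, c): {(1, c)}, (1, c): {(0, c)}} for c in range(cols)]
grid_is_orchard = is_orchard(grid_paths, grid_trees)

# Adding a pendant vertex below column 0 creates a leaf off the horizontal paths.
bad_tree = {(0, 0): {(1, 0)}, (1, 0): {(0, 0), (2, 0)}, (2, 0): {(1, 0)}}
bad_is_orchard = is_orchard(grid_paths, [bad_tree] + grid_trees[1:])
```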
The main result of this section is a separation lemma for orchards, Lemma 6.7, which will be used in the proof of Theorem 3.1.
Lemma 6.1. Let t ∈ N. If a graph G contains an $(f_{4.1}(t) + 1) \times (f_{4.1}(t) + 1)$-orchard, then G contains every t-vertex planar graph as a minor.
Proof. Let R be an $(f_{4.1}(t) + 1) \times (f_{4.1}(t) + 1)$-orchard with a collection of horizontal paths $\mathcal{P}$ and a collection of vertical trees $\mathcal{T}$. Consider the bramble $\mathcal{B} := \{T \cup P \mid (T, P) \in \mathcal{T} \times \mathcal{P}\}$ in G. Since the vertical trees are vertex-disjoint, the horizontal paths are vertex-disjoint, and $|\mathcal{T}| = |\mathcal{P}| = f_{4.1}(t) + 1$, it follows that the order of $\mathcal{B}$ is at least $f_{4.1}(t) + 1$. By Theorem 4.2, G has treewidth at least $f_{4.1}(t)$, and therefore by Theorem 4.1, G contains every t-vertex planar graph as a minor.
Let R be an a × b-orchard with horizontal paths $P_1, \dots, P_a$ and vertical trees Note that the set of all horizontal sections is a collection of vertex-disjoint paths whose union covers all vertices of P. Let W be the set of vertices w such that for some We say that s is a section if s is a horizontal or a vertical section. Note that the set of all sections is a collection of vertex-disjoint paths whose union covers all vertices of R. In the proofs of Lemmas 6.4 and 6.6 below, we will use several times that if R has a ≥ 2 horizontal paths, then each of its vertical trees defines at most $1 + 3 \cdot (a - 2) < a^2$ vertical sections.¹
We define a myriapod C to be a tree of maximum degree at most 3 such that all its degree-3 vertices are on a single path P, called the spine of C. The components of C − V(P) will be called the legs of C.
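The myriapod definition can be checked as follows. In this sketch a candidate myriapod is given as an adjacency dictionary together with a proposed spine (a list of vertices); the function returns the legs if the conditions hold and `None` otherwise. The names are illustrative.

```python
def components(adj, vertices):
    """Connected components of the subgraph induced by `vertices`."""
    remaining, comps = set(vertices), []
    while remaining:
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:
            for w in adj[stack.pop()]:
                if w in remaining:
                    remaining.remove(w)
                    comp.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps


def myriapod_legs(adj, spine):
    """Return the legs if (adj, spine) is a myriapod, else None."""
    n_edges = sum(len(nb) for nb in adj.values()) // 2
    is_tree = len(components(adj, adj)) == 1 and n_edges == len(adj) - 1
    spine_is_path = len(set(spine)) == len(spine) and all(
        b in adj[a] for a, b in zip(spine, spine[1:])
    )
    # Maximum degree 3, and every degree-3 vertex lies on the spine.
    degrees_ok = all(
        len(nb) < 3 or (len(nb) == 3 and v in spine) for v, nb in adj.items()
    )
    if not (is_tree and spine_is_path and degrees_ok):
        return None
    return components(adj, set(adj) - set(spine))


# Spine 0-1-2 with a leg 3-4 attached at vertex 1 and a leg 5 attached at vertex 2.
C = {0: {1}, 1: {0, 2, 3}, 2: {1, 5}, 3: {1, 4}, 4: {3}, 5: {2}}
legs = myriapod_legs(C, [0, 1, 2])
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
```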
We show that the sections of an orchard can be covered by few myriapods.
¹ Proof. We proceed by induction on a. If a = 2, then every vertical tree has one vertical section. Let a ≥ 3. Let R be an orchard. In the following, whenever we speak of a neighbor, it is with respect to R viewed as a graph. Given a vertical tree T, let P be a horizontal path such that exactly one vertex of V(P) ∩ V(T) has a neighbor $v_1$ in V(T) \ V(P), and this neighbor is unique. Let T′ be the tree obtained from T − V(P) by iteratively deleting the unique leaf that is not on any horizontal path other than P. Let $v_1, v_2, \dots, v_{k-1}$ be the sequence of such leaves, and let $v_k$ denote the neighbor of $v_{k-1}$ in V(T′). Let R′ be the orchard obtained from R by deleting P and replacing T with T′. Let s be the unique section of R′ containing $v_k$. Every vertical section of T in R which is not a vertical section of T′ in R′ must be one of the following: (i) the path $v_1 v_2 \dots v_{k-1}$, or (ii) $v_k$, or (iii) one of the at most two components of $s - v_k$. (We remark that situations (ii) and (iii) only apply if s is a vertical section of R′ and $s \ne v_k$.) By induction, there are at most 1 + 3(a − 3) vertical sections of R′ on T′. By the discussion above, T has at most three more vertical sections.
Lemma 6.2. Let R be an a × b-orchard. There is a collection C of at most $a^2$ subgraphs of R such that:
• every element of C is a myriapod whose spine is a horizontal path of R and each of whose legs is contained in some vertical tree;
• every section of R is contained in some element of C.
Proof. For each ordered pair of distinct horizontal paths $(P_i, P_j)$ in R, we take $P_i$ and extend it to a myriapod by adding to it the following legs. For each vertical tree T in R, we add the (unique) subpath P(T, i, j) of T that has one endpoint in $P_i$ and the other in $P_j$ but has no vertex of $V(P_i) \cup V(P_j)$ in its interior. By the uniqueness of the paths P(T, i, j) and because vertical trees are vertex-disjoint, the resulting graph is a myriapod. There are fewer than $a^2$ ordered pairs of horizontal paths. Since each horizontal section of R is contained in some horizontal path and each vertical section is contained in some connecting subpath P(T, i, j), it follows that the constructed myriapods together cover all sections of R.
Recall that each vertical tree intersects each horizontal path in a subpath and that these subpaths are disjoint. Thus, each horizontal path P defines two symmetric total orders on the vertical trees, which are given by the order in which we meet these trees when following P from one endpoint to the other.
We say that an a × b-orchard R is tame if its vertical trees appear in the same order along every horizontal path. Formally, R is tame if there is a permutation π of [b] such that for every i ∈ [a], we meet the vertical trees of R in the order $T_{\pi(1)}, \dots, T_{\pi(b)}$, or the reverse order, when following $P_i$ from one endpoint to the other.
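Once the order in which each horizontal path meets the vertical trees is known, tameness is a one-line check. The sketch below assumes these orders are given as lists of tree indices, one list per horizontal path; the name `is_tame` is illustrative.

```python
def is_tame(orders):
    """orders[i] lists the tree indices in the order met along horizontal path i."""
    first = orders[0]
    return all(o == first or o == first[::-1] for o in orders)


tame = is_tame([[0, 1, 2], [2, 1, 0], [0, 1, 2]])   # same order up to reversal
wild = is_tame([[0, 1, 2], [1, 0, 2]])              # trees 0 and 1 are swapped
```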
Given a horizontal section t of a horizontal path P and a vertical tree T in an orchard R, we say that t is bordered by T if t does not intersect T and, with respect to P viewed as a graph, one of the endpoints of t has a neighbor which is a vertex of T . If additionally (given an ordering of P 'from left to right') there is such a neighbor to the left (right) of t, then we say that t is bordered by T on its left (right).
An orchard R′ is a suborchard of an orchard R if R′ is obtained from R by selecting a subset of its horizontal paths and a subset of its vertical trees.
Proof. We claim that we may take $f_{6.3}(a, b) := b^{2^{a-1}}$. The proof is by induction on a. Note that every $1 \times f_{6.3}(1, b)$-orchard is tame, and $f_{6.3}(1, b) = b$, so the claim holds for a = 1.
For the inductive step, let P 1 , . . . , P a be the horizontal paths of R and let us consider the orchard obtained from R by ignoring P a and contracting some edges of the vertical trees so that the leaves of each vertical tree lie on V (P 1 ) ∪ · · · ∪ V (P a−1 ). More precisely, from each vertical tree we iteratively delete the leaves that are not in V (P 1 ) ∪ . . . ∪ V (P a−1 ).
be the vertical trees of $R^-$, named according to the order in which they intersect $P_1$. Since $R^-$ is tame, this is also the order in which they intersect $P_i$ for all i ∈ [a − 1]. Let $T_1, \dots, T_{b^2}$ be the corresponding trees in R. Choose one of the two possible orientations for $P_a$ arbitrarily and let $g(1), \dots, g(b^2)$ be the order in which $T_1, \dots, T_{b^2}$ intersect $P_a$. By Theorem 4.3, $g(1), \dots, g(b^2)$ contains an increasing or decreasing subsequence $g'(1), \dots, g'(b)$. Let $T'_1, \dots, T'_b$ be the vertical trees of R corresponding to $g'(1), \dots, g'(b)$. By reversing the orientation of $P_a$ if necessary, we obtain a tame a × b suborchard R′ of R, as required.
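The subsequence extraction in the inductive step can be sketched as follows. The code uses a simple quadratic dynamic program and relies on the classical fact that a sequence of b² distinct numbers contains a monotone subsequence of length b; the function names are illustrative.

```python
def longest_monotone(seq, increasing=True):
    """Longest strictly increasing (or decreasing) subsequence, by dynamic programming."""
    best = []  # best[i] is a longest monotone subsequence ending at position i
    for i, x in enumerate(seq):
        candidates = [
            best[j] for j in range(i)
            if seq[j] != x and (seq[j] < x) == increasing
        ]
        best.append(max(candidates, key=len, default=[]) + [x])
    return max(best, key=len, default=[])


def monotone_subsequence(seq, b):
    """A monotone subsequence of length b, assuming seq has b*b distinct entries."""
    inc = longest_monotone(seq, increasing=True)
    dec = longest_monotone(seq, increasing=False)
    longest = inc if len(inc) >= len(dec) else dec
    return longest[:b]


# Order in which 9 trees might be met along the last horizontal path.
g = [3, 2, 1, 6, 5, 4, 9, 8, 7]
sub = monotone_subsequence(g, 3)
```

Keeping only the trees indexed by `sub`, and reversing the orientation of the last path if the subsequence is decreasing, yields the tame suborchard.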
Using Lemmas 6.2 and 6.3, we now derive separation lemmas that will be key tools in the main proof. These lemmas and those in Section 7 are all parameterized by some positive integer m. In Section 8 we will apply these lemmas with the value m = f 4.1 (|H|) + 1.
Given two disjoint subsets A, B of vertices of a graph G, we say that A sees B if there is an edge in G linking a vertex of A to one of B.
Then, for each c ≥ 1 at least one of the following holds.
Thus in both cases, $G_{\mathrm{bip}}$ contains a matching M of size $f_{6.4}(c, m)$. From this fact we will derive that G[V(R) ∪ V(R′)] contains $2^m$ pairwise vertex-disjoint (a + 1) × c-orchards.
By Lemma 6.2 applied to R and by the pigeonhole principle, there is a myriapod $C_R$ in R such that:
• the spine of $C_R$ is a horizontal path of R and each leg of $C_R$ is a subgraph of a vertical tree of R; and
• at least $\frac{1}{a^2} f_{6.4}(c, m)$ sections matched by M are contained in $C_R$.
If a ≥ 2, then each leg of $C_R$ is a subgraph of a vertical tree of R and hence contains at most $1 + 3(a - 2) < a^2$ sections. (If a = 1, then $C_R$ has no legs.) It follows that there is a submatching $M_2 \subseteq M$ of size at least $\frac{1}{a^4} \cdot f_{6.4}(c, m)$ such that the sections of R matched by $M_2$ are (a) all on the spine of $C_R$, or (b) on distinct legs of $C_R$.
By (b) we mean that each section matched by $M_2$ is on some leg of $C_R$ and no two such sections are on the same leg of $C_R$.
We can apply a similar reduction to the vertices of R′ matched by $M_2$. By Lemma 6.2, the vertices of R′ can be covered with at most $a'^2$ (recall a′ is the number of horizontal paths of R′) myriapods whose spines are horizontal paths of R′ and each of whose legs is a subgraph of a vertical tree of R′. Thus there is such a myriapod $C_{R'}$ in R′ such that at least $\frac{1}{a^4 a'^2} \cdot f_{6.4}(c, m)$ vertices matched by $M_2$ are contained in $C_{R'}$. Next, we claim that we can find a submatching $M_3 \subseteq M_2$ of size at least $\sqrt{\frac{1}{m^6} \cdot f_{6.4}(c, m)}$ such that the vertices of R′ matched by $M_3$ are (1) all on the spine of $C_{R'}$, or (2) on distinct legs of $C_{R'}$, or (3) all on a single leg of $C_{R'}$.
This can be seen as follows. Let $y := \frac{1}{m^6} \cdot f_{6.4}(c, m)$. As a, a′ ≤ m, the myriapod $C_{R'}$ has at least y vertices matched by $M_2$. A part is the spine or a leg of $C_{R'}$. If some part of $C_{R'}$ contains at least $\sqrt{y}$ matched vertices, then (1) or (3) holds, and we are done.
Otherwise, strictly more than $y / \sqrt{y} = \sqrt{y}$ parts have at least one matched vertex. Since $\sqrt{y}$ is an integer, there are at least $\sqrt{y} + 1$ such parts. By possibly discarding the spine, we obtain $\sqrt{y}$ distinct legs each having a matched vertex, and thus (2) holds.
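The counting argument is a plain pigeonhole dichotomy, which the following sketch makes explicit. It assumes, as in the text, that the total number y of matched vertices is a perfect square, and takes as input the number of matched vertices in each part; the function name and the example values are illustrative.

```python
import math


def large_part_or_many_parts(part_sizes):
    """Either some part holds at least sqrt(y) items, or more than sqrt(y) parts are non-empty."""
    y = sum(part_sizes)
    root = math.isqrt(y)
    assert root * root == y, "this sketch assumes y is a perfect square"
    for i, size in enumerate(part_sizes):
        if size >= root:
            return "large part", i
    nonempty = [i for i, size in enumerate(part_sizes) if size > 0]
    return "many parts", nonempty


concentrated = large_part_or_many_parts([9, 0, 0])       # y = 9, one part holds everything
spread = large_part_or_many_parts([2, 2, 2, 2, 1])       # y = 9, every part is small
```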
We now extend the a × b-orchard R to an $(a + 1) \times (2^m \cdot c)^{2^m}$-orchard, as follows. As the (a + 1)-th horizontal path of the new orchard we take (in case (1) and (2)) the spine of $C_{R'}$ or (in case (3)) the leg of $C_{R'}$ that is matched by $M_3$. For each edge e = vs in $M_3$ we choose an edge in the original graph G, which has endpoints v ∈ V(R′) and some vertex on section s. In case (1) and (3) we call this edge r(e). In case (2), by using the leg of $C_{R'}$ it intersects, we extend this edge to a path with an endpoint on the spine of $C_{R'}$ and all its internal vertices on that leg. We also call this new path r(e). After this, r(e) has one endpoint on the chosen (a + 1)-th horizontal path.
Given two subgraphs F and F′ of G, we write F ∪ F′ for the graph with vertices V(F) ∪ V(F′) and edges E(F) ∪ E(F′).
In case (b), the other endpoint of r(e) is on a vertical tree T(e) of R. We extend T(e) to a larger vertical tree T(e) ∪ r(e). For distinct edges $e_1, e_2 \in M_3$, the vertical trees $T(e_1)$ and $T(e_2)$ are distinct and thus the extended vertical trees $T(e_1) \cup r(e_1)$ and $T(e_2) \cup r(e_2)$ are still vertex-disjoint. In this way we obtain $|M_3| \ge (2^m \cdot c)^{2^m}$ extended vertex-disjoint vertical trees that each intersect our chosen extra horizontal path. Thus we have constructed an $(a + 1) \times |M_3|$-orchard, which contains an $(a + 1) \times (2^m \cdot c)^{2^m}$-suborchard.
In case (a), we do almost the same. The difference is that the $C_R$-endpoint v(e) of r(e) is possibly not on a vertical tree. In that case, in order to appropriately extend the vertical trees, we need to add some subpaths of the spine $P^*$ of $C_R$. In doing that, we need to take care that the extended vertical trees are still vertex-disjoint. One can do this by ordering the vertices of $P^*$ 'from left to right'. If v(e) intersects a vertical tree, then we extend the tree as before. If v(e) does not intersect a vertical tree, then we consider a tree T(e) that has the closest intersection point with $P^*$ to the left of v(e). There may exist (at most) one $e \in M_3$ such that v(e) has no vertical tree strictly to its left. In that case we drop e from $M_3$. Next, we extend T(e) to T(e) ∪ r(e) ∪ p(e), where p(e) is the smallest subpath of $P^*$ containing both v(e) and a vertex of $T(e) \cap V(C_R)$. Since each r(e) meets a unique section of the horizontal path $P^*$ and since for every vertical tree T there exist at most two horizontal sections of $P^*$ that intersect T or are bordered by T on their left, this ordering guarantees that at least half of the extended vertical trees remain pairwise vertex-disjoint. We thus obtain an $(a + 1) \times \frac{|M_3| - 1}{2}$-orchard, which contains a suborchard of the desired size since $|M_3| \ge 2 \cdot (2^m \cdot c)^{2^m} + 1$.
We say that a subset A of vertices of a graph G reaches a section s of an orchard R in G if G has a path from A to s having no internal vertex in the orchard R.
Then, for each c ≥ 1, at least one of the following holds.
Proof. Let $S$ denote the set of sections of $R$. Recall that they are by definition vertex-disjoint. Let $G'$ be the minor of $G$ obtained by contracting each path $s \in S$ into a single vertex, which we denote by $\bar{s}$. Correspondingly, we write $\bar{S} := \{\bar{s} \mid s \in S\}$ for the set of contracted vertices in $V(G')$.
Case 1: There are $f_{6.4}(c, m) + g_{6.4}(c, m) + 1$ vertex-disjoint paths between $\bar{S}$ and $V(R')$ in $G'$. Then $G$ has a collection $M$ of $f_{6.4}(c, m) + g_{6.4}(c, m) + 1$ vertex-disjoint paths, each having one endpoint in $V(R)$ and the other endpoint in $V(R')$ and having no internal vertices in these two sets, such that the endpoints in $V(R)$ all belong to distinct sections of $R$.
Let $G^*$ be obtained from the subgraph $R \cup R' \cup \bigcup_{P \in M} P$ of $G$ by contracting each path $P \in M$ into an edge joining its two endpoints. Let $M^*$ denote the set of edges resulting from the contractions of the paths. Thus $M^*$ is a matching. Now, apply Lemma 6.4 to $G^*$ with orchards $R$ and $R'$. Since $|M^*| \ge f_{6.4}(c, m) + g_{6.4}(c, m) + 1$, the matching $M^*$ shows that the second outcome of that lemma is not possible. Hence, we deduce that $G^*$ contains $2^m$ pairwise vertex-disjoint $(a + 1) \times c$-orchards. Replacing each edge of $M^*$ used in these orchards by the corresponding path in $M$, we see that $G$ also has $2^m$ pairwise vertex-disjoint $(a + 1) \times c$-orchards.
Lemma 6.6. Let $m \in \mathbb{N}$. Suppose that $R$ is an $a \times b$-orchard, with $a \in [m]$, in a graph $G$ and that $s$ is a section of $R$. Then, for each $c \ge 1$, at least one of the following holds.
Proof. First, note that if $b = 1$, then $R$ has at most $3a$ horizontal sections and at most $a^2$ vertical sections, and the second outcome holds trivially with $X = \emptyset$ since $3a + a^2 \le 3m + m^2 \le 5f_{6.4}(c, m) + 5g_{6.4}(c, m)$. Thus we may assume $b \ge 2$ in what follows.
Suppose first that $s$ is a section of some vertical tree $T$ of $R$. Then we discard $T$ from $R$ to obtain an $a \times (b - 1)$-suborchard $R_1$. Since $s$ is disjoint from every horizontal path, $R_1$ is vertex-disjoint from $s$, while having the same horizontal paths as $R$. Noting that $s$ can be seen as a $1 \times |s|$-orchard $R'$, we can apply Lemma 6.5 to $R_1$ and $R'$.
Either we obtain $2^m$ pairwise vertex-disjoint $(a + 1) \times c$-orchards in $G$ (in which case we are done), or there is a subset $X \subseteq V(G) \setminus V(R_1)$ such that $V(R') \setminus X$ reaches at most $f_{6.4}(c, m) + g_{6.4}(c, m)$ sections of the orchard $R_1$ in $G - X$, and $|X| \le f_{6.4}(c, m) + g_{6.4}(c, m)$. Note that on each horizontal path, there are at most three horizontal sections of $R$ that are not a section of $R_1$, namely: the unique section that has a non-empty intersection with $T$ and at most two sections that are bordered by $T$. Since $T$ contains at most $a^2$ vertical sections, it follows that $V(R') \setminus X$ (which is $V(s) \setminus X$) reaches at most $f_{6.4}(c, m) + g_{6.4}(c, m) + a^2 + 3a \le 5f_{6.4}(c, m) + 5g_{6.4}(c, m)$ sections of $R$ in $G - X$. Therefore $X$ has the desired property.
Next, suppose that s is a section of some horizontal path P of R. Decompose P = P 0 sP 1 , where P 0 (respectively P 1 ) is the graph induced by the vertices of P strictly to the left (respectively right) of s. For each k ∈ {0, 1}, let R k be the orchard obtained from R by discarding all vertical trees that intersect P 1−k or s, and truncating the horizontal path P to P − (V (P 1−k ) ∪ V (s)). Note that possibly R k contains no vertical tree, in which case it has at most a ≤ m sections.
As before, we note that s forms a 1 × |s|-orchard R that is vertex-disjoint from R k . We apply Lemma 6.5 to R k and R , for each k ∈ {0, 1}. If G does not contain 2 m pairwise vertex-disjoint (a + 1) × c-orchards, then for each k ∈ {0, 1}, we obtain a subset Observe that if R k has no vertical tree, then we do not need to apply Lemma 6.5 since we can just take X k = ∅. We now choose X := X 0 ∪ X 1 .
Possibly $s$ is the intersection of a vertical tree $T$ and a horizontal path $P$. In that case we denote by $R_T$ the orchard formed by $T$ and the horizontal sections of $R$ that intersect $T$ or are bordered by $T$. Each vertical section of $R$ is a vertical section of $R_T$, $R_0$ or $R_1$. Note that $R_T$ contains at most $a^2 \le m^2$ vertical sections and at most $3a \le 3m$ horizontal sections.
Suppose a horizontal section t of R is not a horizontal section of R 0 , R 1 or R T (if defined). Then t must be bordered in R by a vertical tree of R 0 and a vertical tree of R 1 , and therefore we call t of mixed type. Suppose V (s) \ X reaches t in G − X. Then it also reaches some horizontal section t * of R 0 or R 1 in G − X such that t is contained in t * . Note that every horizontal section of R 0 or R 1 contains at most two horizontal sections of R that are of mixed type.
Using Lemmas 6.5 and 6.6, we derive the following lemma.
Lemma 6.7. Let $m \in \mathbb{N}$. Suppose that $R$ is an $a \times b$-orchard in a graph $G$ and $R'$ is an $a' \times b'$-orchard in $G$ vertex-disjoint from $R$, with $a, a' \in [m]$. Then for each $c \ge 1$ at least one of the following holds.
Proof. Let S denote the set of sections of R. Assume that (2) does not hold (otherwise, we are done). Then, for every section s ∈ S, Lemma 6.6 yields a subset Y s ⊆ V (s) ∪ (V (G) \ V (R)) of size |Y s | ≤ 5f 6.4 (c, m) + 5g 6.4 (c, m) such that s reaches at most 5f 6.4 (c, m) + 5g 6.4 (c, m) sections of R in G − Y s . Also, Lemma 6.5 gives a set Y r ⊆ V (G) \ V (R) of size at most f 6.4 (c, m) + g 6.4 (c, m) such that V (R ) reaches at most We construct an auxiliary directed graph G * with vertex set S ∪{r}, where r is a dummy element representing R , and adjacencies are defined as follows. For each s ∈ S r , there is a directed edge from the vertex r to s. For two distinct sections s, s ∈ S, there is a directed edge from s to s if and only if s reaches s in G − Y s . It follows that the maximum outdegree of a vertex of G * is at most max{f 6.4 (c, m) + g 6.4 (c, m), 5f 6.4 (c, m) + 5g 6.4 (c, m)} = 5f 6.4 (c, m) + 5g 6.4 (c, m).
In what follows, vertices of $G^*$ will be classified by their depth, defined as the minimum length (number of directed arcs) of a directed path from $r$ to the vertex (or $+\infty$ if no such directed path exists). Let $T^*$ be an out-arborescence obtained by performing a breadth-first search in $G^*$ from vertex $r$ using outgoing directed edges: for each section $s \in S$ at finite depth $d$, choose an in-neighbor of $s$ with depth $d - 1$ and add the corresponding directed edge to $T^*$. Note that $T^*$ only contains vertices of $G^*$ reachable from $r$ by a directed path, which might not be all vertices of $G^*$. Define the height of $T^*$ as the maximum depth of a vertex of $T^*$. Let $S_{\le m}$ denote the set of sections $s \in S$ with depth at most $m$.
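The depth classification and the out-arborescence described above can be illustrated with a short breadth-first search. This is only a sketch under stated assumptions: the auxiliary digraph is given as a plain dictionary `out_neighbors` (a hypothetical representation, not taken from the paper), and the function names are chosen here for illustration.

```python
from collections import deque


def bfs_arborescence(out_neighbors, root):
    """Breadth-first search from `root` along outgoing arcs.

    Returns (depth, parent): depth[v] is the minimum number of arcs on a
    directed path from root to v, and parent[v] is an in-neighbor of v at
    depth depth[v] - 1. Unreachable vertices (depth +infinity in the text)
    are simply absent from both dictionaries.
    """
    depth = {root: 0}
    parent = {}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in out_neighbors.get(u, ()):
            if v not in depth:
                depth[v] = depth[u] + 1
                parent[v] = u
                queue.append(v)
    return depth, parent


def height(depth):
    """Maximum depth of a vertex of the arborescence."""
    return max(depth.values())


def sections_up_to(depth, root, m):
    """Sections (all vertices except the dummy root) of depth at most m."""
    return {v for v, d in depth.items() if v != root and d <= m}
```

The arcs `(parent[v], v)` form one possible choice of the arborescence; the text allows any in-neighbor at depth $d - 1$, and this sketch picks the first one discovered.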
As a warm-up, suppose that the height of T * is less than m. Let X := Y r ∪ s∈S ≤m Y s . Now, consider a path P in G − X having one endpoint in V (R ) but no other vertex in V (R ), and the other endpoint in a section s ∈ S. We claim that s ∈ S ≤m . To see this, let us map the vertices of P to vertices of G * in the expected way: Replace the endpoint of P in V (R ) by r, replace each maximal sequence of consecutive vertices of P belonging to a section s ∈ S by the vertex s of G * , and remove all vertices of P not in V (R). This results in a sequence r, s 1 , s 2 , . . . , s k of vertices of G * with s i ∈ S for each i ∈ [k], some of which possibly appear multiple times. Now, observe that V (R ) reaches , so G * contains the directed edge (s i , s i+1 ). Hence, r, s 1 , s 2 , . . . , s k is a directed walk in G * , and therefore s k = s ∈ S ≤m , since all vertices of G * with finite depth have depth less than m by our assumption.
It follows from the previous discussion that each component of $G - X$ intersecting $V(R')$ intersects at most $|S_{\le m}| \le g_{6.7}(c, m)$ sections of $R$, so that (3) holds. Indeed, this number of sections is at most the number of vertices of $T^*$, which is bounded from above by $\Delta_{\mathrm{out}}(T^*)^{\operatorname{height}(T^*)+1} \le (5f_{6.4}(c, m) + 5g_{6.4}(c, m))^m = g_{6.7}(c, m)$, where $\Delta_{\mathrm{out}}(T^*)$ denotes the maximum outdegree of $T^*$.
We may thus assume that the height of T * is at least m.
Let $Q \subset V(T^*)$ denote the set of sections with depth $m$. Let $V_Q$ denote the set of vertices in $V(R)$ that are in a section in $Q$. We now consider a maximum-size collection $\mathcal{Q}$ of vertex-disjoint paths that join $V(R')$ with $V_Q$ in $G - (Y_r \cup \bigcup_{s \in S_{\le m}} Y_s)$ and we proceed with a case distinction on $|\mathcal{Q}|$, the number of these disjoint paths.
Then by Menger's Theorem, there is a set C of vertices of size at most z(c, m) separating Observe that, by the definition of the directed graph G * , every path in G − Y r ∪ s∈S ≤m Y s connecting a vertex of V (R ) to a vertex belonging to a section of depth larger than m must meet a section of depth exactly m. Thus, in the graph G − Y r ∪ s∈S ≤m Y s , the set C also separates V (R ) from every vertex belonging to a section of depth larger than m. We then let X := C ∪ Y r ∪ s∈S ≤m Y s , which has size |X| ≤ z(c, m) + (5f 6.4 (c, m) + 5g 6.4 (c, m)) · (g 6.7 (c, m) + 1) ≤ f 6.7 (c, m).
As before, we find that each component of $G - X$ intersecting $V(R')$ intersects at most $|S_{\le m}| \le g_{6.7}(c, m)$ sections of $R$.
Next, assume that $|\mathcal{Q}| > z(c, m)$. That is, there are many disjoint paths between $V(R')$ and $V_Q$. From this we will derive that $G$ contains a bramble of order at least $m$. For each path $P \in \mathcal{Q}$, let the signature $\operatorname{sign}(P) \subseteq S$ of $P$ denote the set of the first $m$ different sections of $S$ that $P$ intersects, starting from its endpoint in $V(R')$.
Note that $\operatorname{sign}(P) \subseteq S_{\le m}$ and that it contains exactly $m$ elements, by construction. Thus at most $|S_{\le m}| \le g_{6.7}(c, m)$ different sections can appear in signatures, and the number of distinct signatures is at most $\binom{g_{6.7}(c, m)}{m}$. By the pigeonhole principle it then follows that there is a set $\mathcal{P} \subseteq \mathcal{Q}$ of $z(c, m)$
• all orchards in $\bigcup_{i=1}^{m} \mathcal{R}_i$ are pairwise vertex-disjoint;
• every orchard $R \in \mathcal{R}_1$ is a path with exactly $\omega(1)$ vertices.
We say that D is optimal if it has maximum grade among all orchard (m, ω)-packings in G.
Lemma 7.1. Let $m$ and $\omega$ be as above, let $D = (\mathcal{R}_1, \dots, \mathcal{R}_m)$ be an optimal orchard $(m, \omega)$-packing in a graph $G$, and let $R \in \mathcal{R}_i$ and $R' \in \mathcal{R}_j$ for some $i, j \in [m]$. Then
Proof. Suppose G does contain $2^m$ pairwise vertex-disjoint $(i + 1) \times \omega(i + 1)$-orchards $R_1, \dots, R_{2^m}$. Then we can obtain a new orchard $(m, \omega)$-packing $D'$ from $D$ by removing $R$ and $R'$ from $\mathcal{R}_i$ and $\mathcal{R}_j$, respectively, and adding $R_1, \dots, R_{2^m}$ to $\mathcal{R}_{i+1}$. Any other orchard of $D$ is vertex-disjoint from G and is therefore unaffected by this replacement. It follows that the grade has been increased by $2^{i+1} \cdot 2^m \ge 2^{m+2}$ and has been decreased by $2^i + 2^j \le 2^{m+1}$. Thus $D'$ has a higher grade than $D$, a contradiction.
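The exchange argument in this proof is pure arithmetic on grades and can be checked numerically. The sketch below assumes that an orchard in the $i$-th collection contributes $2^i$ to the grade; this weighting is inferred from the quantities $2^{i+1} \cdot 2^m$ and $2^i + 2^j$ used in the proof, since the definition of the grade is not restated here. The function names are hypothetical.

```python
def grade(packing):
    """Grade of a packing given as a list of m collections.

    packing[i - 1] plays the role of the i-th collection; each of its
    orchards is ASSUMED to contribute 2**i (weighting inferred from the
    proof of Lemma 7.1, not quoted from a definition).
    """
    return sum((2 ** i) * len(collection)
               for i, collection in enumerate(packing, start=1))


def exchange_gain(m, i, j):
    """Net grade change of the replacement used in the proof.

    One orchard is removed from collection i and one from collection j,
    and 2**m orchards are added to collection i + 1.
    """
    gained = (2 ** (i + 1)) * (2 ** m)
    lost = 2 ** i + 2 ** j
    return gained - lost
```

For every $m$ and all $i, j \in [m]$ the gain is at least $2^{m+2} - 2^{m+1} = 2^{m+1} > 0$, which is the contradiction with optimality.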
Proof. First we control the length of paths in $G[Z]$. If $q = 0$, then $G[Z]$ does not contain a $1 \times \omega(1)$-orchard. Otherwise, this orchard would not intersect any other orchard of $D$, so we could add it to $\mathcal{R}_1$, which would contradict the maximality of the grade of $D$. So each path in $G[Z]$ has order smaller than $\omega(1)$ when $q = 0$. Suppose, on the other hand, that $q \ge 1$ and $G[Z]$ contains a path $P$ of order $q 2^{m+1} \cdot \omega(1)$. Then $P$ can be split into $q 2^{m+1}$ vertex-disjoint paths of order $\omega(1)$, each of which can be viewed as a $1 \times \omega(1)$-orchard. We add these $q 2^{m+1}$ orchards to $\mathcal{R}_1$ after first deleting from $\bigcup_{i=1}^{m} \mathcal{R}_i$ the $q$ orchards intersected by $Z$, thus obtaining a new orchard $(m, \omega)$-packing. This replacement increases the grade by $q 2^{m+1}$ and decreases it by at most $q 2^m$. Hence the new packing has higher grade than $D$, which contradicts the maximality of the grade of $D$. Thus each path in $G[Z]$ has order smaller than $q 2^{m+1} \cdot \omega(1)$. , it follows that it has size $|V(M)| < \|H\| \cdot \max(q, 1) \cdot 2^{m+1} \cdot \omega(1)$.

Proof of Theorem 3.1
Now that optimal orchard packings are defined, we will use a strategy adapted from the proof for wheel minors in [1] to show our main technical theorem, Theorem 3.1. (For readers familiar with [1], our orchards will play the roles of the bounded-size paths and cycles in that proof.) We start with a brief overview of the proof. First, we will define several constants and functions, among which are the constant $m$ and the function $\omega$, that only depend on the given parameters $p$, $H$ and $g$. Next, we choose an arbitrary graph $G$ and we consider an optimal orchard $(m, \omega)$-packing $D = (\mathcal{R}_1, \dots, \mathcal{R}_m)$ of $G$. We also need to take into account the components of $G - V(D)$, but for this proof sketch we will assume that there are no such components. We construct two auxiliary graphs $G_b$ and $G_s$ to derive either a small $K_p$-model (in which case we are done, having obtained outcome (ii)) or an orchard $K$ in $\bigcup_i \mathcal{R}_i$ that only sees a small number of other orchards of $\bigcup_i \mathcal{R}_i$. Next, for each orchard $K'$ that is seen by $K$, we consider the graph $G_{K'}$ induced by $V(K) \cup V(K') \cup (V(G) \setminus V(D))$. Using Lemma 6.7 and the optimality properties of $D$, we find a small set $X_{K'}$ of vertices such that each component of $G_{K'} - X_{K'}$ intersecting $V(K')$ only intersects a small number of sections of $K$ (otherwise we obtain a small model of $H$, satisfying outcome (i)). We define the cutset $X := \bigcup_{K'} X_{K'}$ and we finish the argument by deriving a suitable separation $(A, B)$ with $X = A \cap B$, satisfying outcome (iii). This concludes the proof sketch.
Proof of Theorem 3.1. We use the following functions or constants from previously stated lemmas and theorems: • ϕ, ϕ ∈ R are constants depending only on p such that every n-vertex graph of average degree at least ϕ has a K p -model on at most ϕ log n vertices (see The- and ω(i) := (q(i) + 1) · max { g(q(i)) + 1, g 6.7 (ω(i + 1), m) + 1 } .
We remark that $\varphi$ and $\varphi'$ depend only on $p$, while $m = \omega(m)$ depends only on $H$. Furthermore (for $i \ne m$) each of $q(i)$, $\omega(i)$, $\alpha$ and $\sigma$ depends on $p$, $H$ and $g$.
Let G be a graph. Let us assume that G does not contain an H-model of size at most σ. We show that one of the two other outcomes of the theorem holds. Let D = (R 1 , . . . , R m ) be an optimal orchard (m, ω)-packing in G (this is well-defined because ω is a decreasing function). We call a graph a piece if it is an orchard from one of the collections R 1 , . . . , R m , or if it is a component of the graph G − V (D).
Suppose some piece $K$ contains a model of $H$. By an application of Lemma 7.2 with $Z = V(K)$ and $q \in \{0, 1\}$, it follows that $K$ (and hence $G$) contains a model of $H$ of size at most $\|H\| \cdot \alpha \le \sigma$, contradicting our initial assumption. Therefore every piece is $H$-minor free. We recall that the pieces of $D$ are orchards and thus subgraphs of $G$ that are not necessarily induced, whereas the other pieces are induced subgraphs.
Suppose $D$ contains at most one piece. Then from Lemma 7.2 applied with $Z = V(G)$ and $q \in \{0, 1\}$, we know that if $G$ has a model of $H$, then it has one of size at most $\|H\| \cdot \alpha \le \sigma$. This cannot happen, because of our initial assumption. If $G$ has no such model, then we can take $(A, B)$ with $A = V(G)$ and $B = \emptyset$ as a trivial separation, which satisfies the outcome (iii) of the theorem because $|A| = |V(G)| \ge 1 = g(0) = g(|A \cap B|)$. Thus we may assume from now on that $D$ contains at least two pieces.
A piece is said to be central if it belongs to D, or if it sees at least 2ϕ other pieces. (Note that pieces not in D do not see each other, by definition.) In the next paragraph, we define two auxiliary graphs G s (for small degrees) and G b (for big degrees) that model how the central pieces are connected through the noncentral pieces. To keep track of the correspondence between the edges of G s and the noncentral pieces, we put labels on some of these edges.
Initialize both G s and G b to the graph whose set of vertices is the set of central pieces and whose set of edges is empty. For each pair of central pieces that see each other in G, add an (unlabeled) edge between the corresponding vertices in both G s and G b .
Next, while there is some noncentral piece N that sees two central pieces that are not yet adjacent in G b , do the following two operations: (1) Add all (unlabeled) edges to G b between pairs of central pieces seeing N (not already present in G b ). This creates a clique on the set of central pieces seeing N in G b , some of whose edges might have already been there before.
(2) Then, among the central pieces seeing N , choose one such piece K such that the number of newly added edges of G b incident to K is maximum. Add to G s every edge that links K to another central piece seeing N (not already present in G s ), and label it with N . This creates a star centered at K in G s with all its edges labeled with the noncentral piece N .
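The construction of $G_s$ and $G_b$ just described can be sketched in code. This is a minimal illustration, assuming pieces are hashable labels and the "sees" relation is supplied as a set of unordered pairs; the function name and data layout are hypothetical. Since a noncentral piece that has been processed (or that adds no edge at its turn) never qualifies again, one pass over the noncentral pieces is a valid execution of the while loop.

```python
from itertools import combinations


def build_auxiliary_graphs(central, noncentral, sees):
    """Build the edge set of G_b and the labelled edge set of G_s.

    central, noncentral: lists of piece labels.
    sees: set of frozenset({P, Q}) for pieces P, Q that see each other.
    Returns (edges_b, labels_s): edges_b is a set of frozenset edges of
    G_b, and labels_s maps each edge of G_s to its label (None when the
    edge is unlabeled).
    """
    def see(p, q):
        return frozenset((p, q)) in sees

    edges_b = set()
    labels_s = {}
    # Initial step: central pieces that see each other are adjacent in both.
    for p, q in combinations(central, 2):
        if see(p, q):
            edge = frozenset((p, q))
            edges_b.add(edge)
            labels_s[edge] = None

    for piece in noncentral:
        seen = [k for k in central if see(k, piece)]
        new_edges = [frozenset(pair) for pair in combinations(seen, 2)
                     if frozenset(pair) not in edges_b]
        if not new_edges:
            continue  # all central pieces seeing `piece` are already adjacent
        # Operation (1): complete a clique in G_b.
        edges_b.update(new_edges)
        # Operation (2): centre of the star = piece with most new G_b edges.
        centre = max(seen, key=lambda k: sum(k in e for e in new_edges))
        for other in seen:
            edge = frozenset((centre, other))
            if other != centre and edge not in labels_s:
                labels_s[edge] = piece
    return edges_b, labels_s
```

In this representation every key of `labels_s` is also an element of `edges_b`, mirroring the fact that $G_s$ is a subgraph of $G_b$ once labels are forgotten.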
By construction, G s is a subgraph of G b (if we forget about labels). These graphs have the following two crucial properties.
Claim 8.1. If $G_s$ has a $K_p$-model of size $\ell$, then $G$ has a $K_p$-model of size at most $\alpha \ell p^2$.
Proof. Suppose that $G_s$ has a $K_p$-model of size $\ell$. Then there exists a subgraph $M_s \subseteq G_s$ with $\ell$ vertices that can be contracted to $K_p$. Let $Z$ be the union of $V(K)$ over all central pieces $K \in V(M_s)$ and all pieces $K$ not in $D$. It follows from the construction of $G_s$ that $G[Z]$ contains a graph isomorphic to $M_s$ as a minor, and thus has a $K_p$ minor. As $Z$ intersects at most $\ell$ central pieces, Lemma 7.2 implies that $G[Z]$ has a model of $K_p$ of order at most $2^{m+1} \omega(1) \cdot \ell \cdot \|K_p\| \le \alpha \ell p^2$.
For a graph F , we denote by d(F ) the average degree of F .
Moreover, the degree in G s of each central piece not in D is at least 2ϕ.
Proof. First, note that edges that appear in $G_b$ but not in $G_s$ must not be labeled. Let $N$ be a noncentral piece, and let $r$ be the number of pieces in $D$ it sees. By definition of noncentral pieces, $r < 2\varphi$. When $N$ is treated in the algorithm used to construct $G_b$ and $G_s$, if $\ell$ new edges are added to $G_b$, then one of the pieces seen by $N$ is incident to at least $2\ell/r > \ell/\varphi$ of these new edges and thus at least $\ell/\varphi$ new edges are added to $G_s$. This proves the first part of the claim.
By definition, a piece K not in D is central if it sees at least 2ϕ other pieces. As pieces not in D do not see each other, K sees at least 2ϕ pieces from D, that is, at least 2ϕ other central pieces. Then in the first step of the construction of G s , all edges have been added from K to these pieces.
If $d(G_s) \ge \varphi$, then by definition of $\varphi$ and $\varphi'$ at the beginning of the proof, $G_s$ has a $K_p$-model of size at most $\varphi' \log |G_s|$. By Claim 8.1, this gives a $K_p$-model of size at most $\varphi' \alpha p^2 \log |G_s| \le \sigma \log |G|$ in $G$ and we are done (outcome (ii)).
Thus, we assume in the rest of the proof that $d(G_s) < \varphi$. Then strictly more than half of the central pieces have degree less than $2\varphi$ in $G_s$ (otherwise at least half of the vertices of $G_s$ have degree at least $2\varphi$, a contradiction to the fact that $d(G_s) < \varphi$). By Claim 8.2, we obtain $d(G_b) < \varphi^2$ and we similarly get that more than half of the central pieces have degree less than $2\varphi^2$ in $G_b$. Since $D$ is nonempty, it follows that there is a central piece whose degree in $G_s$ is less than $2\varphi$, and whose degree in $G_b$ is less than $2\varphi^2$. Choose such a piece $K$. By Claim 8.2 (second part of the statement), $K$ is in $\mathcal{R}_i$ for some $i \in [m]$. That is, $K$ is an $i \times \omega(i)$-orchard. As observed in the beginning of the proof, every piece is $H$-minor free. By Lemma 6.1 this implies $i < m$; in particular $\omega(i + 1)$ is defined.
The rest of the proof relies on the fact that $K$ has degree less than $2\varphi^2$ in $G_b$. We will not use the graph $G_s$ anymore.
Recall that $D$ contains at least two pieces. For each piece $K'$ in $D$ adjacent to $K$ in $G_b$, let $G_{K'}$ denote the subgraph of $G$ induced by $V(K) \cup V(K') \cup (V(G) \setminus V(D))$. Apply Lemma 6.7 with orchards $K$ and $K'$ on the graph $G_{K'}$ with $c = \omega(i + 1)$. According to Lemma 7.1, the outcome (2) of Lemma 6.7 is not possible. If outcome (1) holds, that is if $G_{K'}$ contains a bramble of order at least $m$, then by Theorem 4.2 and Theorem 4.1 $G_{K'}$ contains a model of $H$. By Lemma 7.2 (applied with $Z = V(G_{K'})$ and $q = 2$), there is such a model of size at most $2\|H\| \cdot \alpha \le \sigma$, a contradiction. Therefore we may assume that we get outcome (3) when applying Lemma 6.7. So, there is a set $X_{K'}$ of vertices of size at most $f_{6.7}(\omega(i + 1), m)$ such that each component of $G_{K'} - X_{K'}$ that intersects $V(K')$ intersects at most $g_{6.7}(\omega(i + 1), m)$ sections of the orchard $K$.
Let $X := \bigcup_{K'} X_{K'}$, where the union is taken over all pieces $K'$ in $D$ adjacent to $K$ in $G_b$. Then $|X| \le 2\varphi^2 f_{6.7}(\omega(i + 1), m) =: q$. Note that $q$ coincides with $q(i)$ defined at the beginning of the proof.
Now, consider some horizontal path of K. Recall that there are at least ω(i) horizontal sections on that path, since every vertical tree defines one such section. By the pigeonhole principle, we can find (ω(i) − q)/(q + 1) consecutive horizontal sections that are avoided by X. Let P denote the subpath of the horizontal path induced by the vertices of these sections.
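The pigeonhole step can be made concrete: if at most $q$ of $n$ consecutive sections are hit by $X$, the remaining sections split into at most $q + 1$ runs, so some run has length at least $(n - q)/(q + 1)$. The following sketch, with a hypothetical function name and sections indexed $0, \dots, n - 1$, computes the longest run of sections avoided by a given set of hit indices.

```python
def longest_clean_run(num_sections, hit):
    """Length of the longest run of consecutive section indices in
    range(num_sections) that do not belong to the set `hit`."""
    best = 0
    current = 0
    for index in range(num_sections):
        if index in hit:
            current = 0  # a hit section ends the current run
        else:
            current += 1
            best = max(best, current)
    return best
```

With `num_sections` playing the role of $\omega(i)$ and `len(hit)` at most $q$, the returned value is at least $(\omega(i) - q)/(q + 1)$, which is the length of the subpath $P$ chosen in the text.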
Let $C$ be the component of $G - X$ that contains $P$. We claim that no orchard $K'$ in $D$ distinct from $K$ has a vertex in $C$. Suppose for a contradiction that an orchard $K'$ does, and let $Q$ be a path in $C$ having one endpoint in $P$ and the other endpoint in $K'$. By choosing $K'$ appropriately, we can moreover ensure that $Q$ does not intersect any other orchard distinct from $K$ and $K'$. It follows that $P \cup Q$ is a subgraph of $G_{K'} - X_{K'}$. Since $P \cup Q$ is connected, it is contained in some component of $G_{K'} - X_{K'}$. Hence, that component intersects $K'$ and at least $(\omega(i) - q)/(q + 1) > g_{6.7}(\omega(i + 1), m)$ sections of $K$, contradicting Lemma 6.7.
Let $A := X \cup V(C)$ and $B := V(G) \setminus V(C)$. Since $C$ intersects no orchard from $D$ other than $K$, it follows that $A$ intersects at most $|X| + 1 \le q + 1$ orchards from $D$. $(A, B)$ is a separation of $G$ with the desired properties, since $|A| \ge |P| \ge \frac{\omega(i) - q}{q + 1} \ge g(q)$ and $|A \cap B| = |X| \le q$.

Approximation algorithm
The statements and proofs of Theorem 1.1 and Theorem 3.1 were described without mentioning algorithmic aspects. In this section, we briefly explain how the different steps of the proofs can be made algorithmic, and thus obtain Corollary 2.1.
First, let us address one subtlety, namely that the constant c in our ck log(k + 1) bound in Theorem 1.1 is not known to be computable. This is because c depends on the polynomial p 5.1 corresponding to H in Theorem 5.1, which is not known to be computable (see the remarks at the end of the kernelization section in [20]). Nevertheless, this does not prevent us from deriving the approximation algorithm, as we will explain. (We also note that variants of Theorem 5.1 have recently been developed in [30] with computability of the constants as an explicit goal; however, these results need extra assumptions on the graph and are not applicable in our context.) First we explain how to obtain the algorithm in Corollary 2.1, assuming we have an algorithm for Theorem 3.1, and then we explain how the proof of Theorem 3.1 can be made algorithmic. That is, we assume that for every p, H, and g as in the statement of Theorem 3.1, there is a constant σ ∈ N and a polynomial-time algorithm that, given a graph, returns one of the three objects promised by Theorem 3.1.
Our algorithm will use the following algorithmic version of Theorem 5.1 given in [20] (see the paragraph before Theorem 15 in this paper for details). We remark that the polynomial p 9.1 appearing in the statement is in fact the same as p 5.1 but we distinguished them to avoid confusion.  ) as input, where G is a graph and t ≥ 0 is an integer, • either produces a minor G of G such that τ H (G ) = τ H (G) and |G | ≤ p 9.1 (t), together with the sequence of operations (edge/vertex deletions, edge contractions) used to obtain G from G, • or (correctly) answers that τ H (G) > t.
Moreover, in the first case, given any set $X' \subseteq V(G')$ such that $G' - X'$ is $H$-minor-free, the algorithm can compute in polynomial time a corresponding set $X \subseteq V(G)$ of the same size such that $G - X$ is $H$-minor-free.
We call the operation of replacing $X'$ by $X$ lifting $X'$ to $G$. Note that, given any packing $\mathcal{P}'$ of $H$-models in $G'$, one can also easily compute a corresponding packing $\mathcal{P}$ of $H$-models in $G$ of the same size in polynomial time, because the algorithm above provides the sequence of operations used to obtain $G'$ from $G$. We call this lifting the packing $\mathcal{P}'$ to $G$.
We will also need an algorithmic version of Theorem 5.2, which is provided in [19]: There is a polynomial-time algorithm $\mathcal{B}$ which, given the graph $G$ and the separation $(A, B)$ of bounded order, computes the graph $G'$ with $\nu_H(G') = \nu_H(G)$ and $\tau_H(G') = \tau_H(G)$ guaranteed by that theorem. Furthermore, a given packing $\mathcal{P}'$ of $H$-models in $G'$ can be lifted to $G$ in polynomial time, and the same is true for a given subset $X' \subseteq V(G')$ such that $G' - X'$ is $H$-minor-free. We refer to the first paragraph of Section 5 of [19] for more details about these algorithms.
Fix a planar graph $H$. Let us assume for now that $H$ is connected; we will comment on the disconnected case later. Let $g$ denote the function $f_{5.2}$ of Theorem 5.2 for the graph $H$. Our algorithm for Corollary 2.1 is a recursive algorithm $\mathcal{R}$ which, given the input graph $G$, outputs a packing $\mathcal{P}$ of $k$ $H$-models in $G$ and a subset $X$ of vertices of $G$ such that $G - X$ has no $H$-model, of size at most $c' k \log(k + 1)$, where $c' := c'(H)$ is a constant depending on the constant $c$ in Theorem 1.1 and $k$ is some positive integer. The algorithm is as follows.
• If $G$ is empty, stop and output $(\emptyset, \emptyset)$ for the pair $(\mathcal{P}, X)$.
  - Apply algorithm $\mathcal{B}$ on $G'$ with separation $(A, B)$, producing a graph $G''$ with $\nu_H(G'') = \nu_H(G')$ and $\tau_H(G'') = \tau_H(G')$.
  - Run algorithm $\mathcal{R}$ on $G''$; let $\mathcal{P}''$ and $X''$ denote the packing and subset of vertices it outputs.
  - Lift $\mathcal{P}''$ to a packing $\mathcal{P}'$ in $G'$.
  - Lift $X''$ to a set $X'$ of vertices of $G'$.
• Lift $\mathcal{P}'$ to a packing $\mathcal{P}$ in $G$.
• Lift $X'$ to a set $X$ of vertices of $G$.
• Output $\mathcal{P}$ and $X$.
Note that above we did not distinguish between the first two outcomes of Theorem 3.1 because in each case we obtain an $H$-model of order $O(\log |G'|)$. To avoid confusion, let us write $G_i$, $G'_i$ and $G''_i$ for respectively the graphs $G$, $G'$ and $G''$ after the $i$-th recursive call, thus $G_0$ is our original graph $G$. Let also $\tau := \tau_H(G)$. Observe that for all $i$, we have $\tau_H(G_i) \le \cdots \le \tau_H(G_0) = \tau$, and thus $|G'_i| \le p_{9.1}(\tau_H(G_i)) \le p_{9.1}(\tau)$.
Hence, when an $H$-model $M$ of $G'_i$ is considered in the algorithm, it satisfies $|V(M)| \le \sigma \log |G'_i| \le \sigma' \log(\tau + 1)$, for some constant $\sigma'$ depending on $\sigma$ and $p_{9.1}$. Letting $k := |\mathcal{P}|$, it follows that $|X| \le \sigma' k \log(\tau + 1)$, where $\mathcal{P}$ and $X$ are the packing and subset of vertices output after the initial call of the algorithm. Therefore, it only remains to show that $|X| \le c' k \log(k + 1)$ for some constant $c'$. This clearly holds when $X$ is empty, so we assume from now on that $|X| \ge 1$. Note also that $|X| \ge \tau$.
We first observe that for every real $x > 0$ we have $2x \ge 3 \log x$. By substituting $x$ by $\log(x + 1)$, multiplying by $\sigma' k$, and rearranging the terms we deduce that for every $x > 0$ the following holds:
(1) $0 \le 2\sigma' k \log(x + 1) - 3\sigma' k \log\log(x + 1)$.
Coming back to the size of $X$, we have: $\le 3\sigma' k \log(\sigma'(k + 1))$.
We obtained (3) by adding (1) (for $x = |X|$) to (2). The step from (3) to (4) follows by replacing the first occurrence of $|X|$ in (3) with the upper bound given in (2). The last line is then obtained by breaking the first logarithm in (5) and simplifying. Hence we have $|X| \le c' k \log(k + 1)$ for some constant $c'$ depending on $\sigma'$ (and thus which can be bounded from above by a function of $c$), as desired.
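For the reader's convenience, here is one chain of inequalities that is consistent with the description of steps (2) to (5) just given. It is a reconstruction and not a quotation: it writes $\sigma'$ for the constant depending on $\sigma$ and $p_{9.1}$, uses $|X| \ge \tau$ and $|X| \ge 1$, and additionally assumes $\sigma' \log(|X| + 1) \ge 1$ (which holds, for instance, for base-2 logarithms and $\sigma' \ge 1$) in the step from (4) to (5).

```latex
\begin{align}
|X| &\le \sigma' k \log(\tau + 1) \le \sigma' k \log(|X| + 1) \tag{2}\\
    &\le 3\sigma' k \log(|X| + 1) - 3\sigma' k \log\log(|X| + 1) \tag{3}\\
    &\le 3\sigma' k \log\bigl(\sigma' k \log(|X| + 1) + 1\bigr)
         - 3\sigma' k \log\log(|X| + 1) \tag{4}\\
    &\le 3\sigma' k \log\bigl(\sigma' (k + 1) \log(|X| + 1)\bigr)
         - 3\sigma' k \log\log(|X| + 1) \tag{5}\\
    &= 3\sigma' k \log\bigl(\sigma' (k + 1)\bigr).
\end{align}
```

In the last step the logarithm of the product splits as $\log(\sigma'(k + 1)) + \log\log(|X| + 1)$, and the two $\log\log$ terms cancel.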
If $H$ is not connected, then we reduce to the connected case similarly as in the end of the proof of Theorem 1.1, as follows. Let $H'$ be a connected planar graph on the same vertex set as $H$ containing $H$ as a subgraph. First, we run the above algorithm for $H'$-models in $G$, which outputs a packing $\mathcal{P}'$ of $k'$ $H'$-models in $G$ and a subset $X'$ of vertices of $G$ such that $G - X'$ has no $H'$-model, of size at most $c' k' \log(k' + 1)$, where $c' := c'(H')$. Observe that $\mathcal{P}'$ readily gives a packing of $k'$ $H$-models in $G$, since $H \subseteq H'$. Next, we use a theorem of Robertson and Seymour [46, Theorem 8.8] stating that for every graph $J$ of treewidth at most $w$, one can find a packing $\mathcal{Q}$ of $H$-models in $J$ and a subset $Y$ of vertices of $J$ such that $J - Y$ has no $H$-model, of size at most $(b|\mathcal{Q}| - 1)(w + 1)$, where $b$ is the number of components of $H$. We apply this result to the graph $J = G - X'$, which has treewidth at most $w = f_{4.1}(|H'|)$ by Theorem 4.1. Note that $f_{4.1}(|H'|) = f_{4.1}(|H|)$ is a constant depending only on $H$. In particular, an optimal tree decomposition of $G - X'$ can be found in linear time using an algorithm of Bodlaender [5]. Given this tree decomposition of $G - X'$, one can check that the proof given in [46] can be turned into a polynomial-time algorithm that finds the packing $\mathcal{Q}$ and the set $Y$. Alternatively, the problem of finding a largest packing of vertex-disjoint $H$-models is expressible in Monadic Second Order Logic (see [26]), and thus can be solved in polynomial time on graphs of bounded treewidth using Courcelle's theorem [13]. As the same is true for the problem of finding a minimum size subset of vertices meeting all $H$-models, it follows that the aforementioned packing $\mathcal{Q}$ and subset $Y$ of vertices can be computed in polynomial time. Given $\mathcal{Q}$ and $Y$, and seeing $\mathcal{P}'$ as a packing of $H$-models in $G$, we let $\mathcal{P}$ denote the larger of the two packings $\mathcal{P}'$ and $\mathcal{Q}$, and let $X := X' \cup Y$.
Letting $k := |\mathcal{P}|$, we thus find a packing of $k$ $H$-models in $G$, and a subset $X$ of at most $c' k' \log(k' + 1) + (b|\mathcal{Q}| - 1)(w + 1) \le c' k \log(k + 1) + (bk - 1)(f_{4.1}(|H|) + 1) = O(k \log k)$ vertices of $G$ such that $G - X$ has no $H$-model, as desired.
Now, we turn to the proof of Theorem 3.1. In order to turn this proof into an algorithm, we will start the proof with $D = (\emptyset, \dots, \emptyset)$, instead of an optimal orchard $(m, \omega)$-packing (which could be difficult to compute). Then, each time we apply one of the lemmas about orchards, we have the extra possibility that the current $(m, \omega)$-packing $D$ could be improved (which could not happen when $D$ was optimal), either by adding a new orchard to $D$ which is vertex-disjoint from the existing ones, or by replacing existing orchards with better ones. Specifically, this could happen when following the proofs of Lemma 7.1 or Lemma 7.2 (seen as algorithms) with our non-optimal $(m, \omega)$-packing $D$. Instead of producing the outcome guaranteed by these lemmas when $D$ is optimal, these proofs could stop and output instead an $(m, \omega)$-packing $D'$ of higher grade than $D$. If this happens, we replace $D$ with $D'$ and restart from the beginning. Note that the grade of an $(m, \omega)$-packing is at most $2^m n$, where $n$ is the number of vertices of $G$. Therefore, there will be at most linearly many such improvements of $D$ because $m$ is a constant. Eventually, no improvement of $D$ will be found anymore, and then we can follow the proof of Theorem 3.1 as if $D$ were optimal.
Since we restart the algorithm at most linearly many times, it only remains to check that the different steps in the proof of Theorem 3.1 can be done in polynomial time. Our use of the lemmas from Sections 6 and 7 about orchards can be implemented in polynomial time because we always apply them to one (or two) orchard(s) from D, and all orchards in D have size bounded from above by a constant. Thus, most steps of these proofs can be realized efficiently simply by using brute force. The only exceptions are the computations of matchings (in Lemma 6.4 and Lemma 6.5) and of vertex-disjoint paths using Menger's theorem (in Lemma 6.5 and Lemma 6.7), which can be done in polynomial time using standard algorithms. The remaining steps of the proof of Theorem 3.1 are easily implemented efficiently. The only step requiring a comment is the use of Theorem 4.4 to obtain an H-model of size at most σ log |G|. However, this can be done in polynomial time as well, as explained in [18].

Proofs of the remaining corollaries
We prove in this section the results stated in Section 2 after the approximation algorithm. We start with the proof of Theorem 2.6. Since it is similar to that of Theorem 1.1, we shortened the common parts.
Proof of Theorem 2.6. First suppose that $H$ is connected. From the facts that $\mathcal{G}$ is proper and minor-closed, we respectively deduce that there is a graph $F$ in the complement of $\mathcal{G}$, and that the graphs in $\mathcal{G}$ are $F$-minor free. Let $\sigma$ be the constant of Theorem 3.1 for the parameters $H$, $p := |F|$, and let $g$ denote the function $f_{5.2}$ of Theorem 5.2 for the graph $H$. We prove the result for $c := \sigma$. As in the proof of Theorem 1.1, we consider a graph $G \in \mathcal{G}$ such that $\tau_H(G) > c\,\nu_H(G)$, with $(\nu_H(G), |G|, \|G\|)$ lexicographically minimum. Let us apply Theorem 3.1 to $G$ with the aforementioned parameters. According to Theorem 5.2, the outcome (iii) of Theorem 3.1 does not hold. As $G$ is $F$-minor free (this is the difference with the proof of Theorem 1.1), the outcome (ii) is not possible either. Therefore $G$ contains an $H$-model of size at most $\sigma = c$. By considering the graph obtained by deleting this model, we can conclude as in the proof of Theorem 1.1. Now assume that $H$ is not connected. Then we can reduce to the connected case using the result of Robertson and Seymour [46, Theorem 8.8], exactly as in the description of the approximation algorithm in Section 9. (The only difference here is that when we apply the proof for a connected planar graph $H'$ containing $H$ as a subgraph, we obtain a linear bounding function for $H'$-models.)
We now move to the proof of Corollary 2.8. We actually prove a stronger statement, which we describe now. Given $m \in \mathbb{N}$, we say that $G$ is a (0 mod $m$)-subdivision of $H$ if $G$ can be obtained from $H$ by subdividing edges, in such a way that every edge of $H$ is replaced by a path having (0 mod $m$) edges. A graph is subcubic if it has no vertex of degree more than 3. Thomassen [49] proved the following result.
Theorem 10.1 (Thomassen [49,Theorem 3.3]). For every planar subcubic graph H and every m ∈ N there is a function f such that, for every k ∈ N and every graph G, either G contains k vertex-disjoint (0 mod m)-subdivisions of H as subgraphs, or there is a subset X of at most f (k) vertices such that G − X contains no such subgraph.
Actually, our discussion of Theorem 2.7 in Section 2 also applies to Theorem 10.1. Namely, the function f that can be extracted from Thomassen's proof is doubly exponential in k (for fixed m and H) and was later improved to f (k) = O(k log^d k) for some d by Chekuri and Chuzhoy [10]. We prove here that Theorem 10.1 holds for f (k) = O(k log k).
Theorem 10.2. For every planar subcubic graph H and every positive integer m there is a constant c := c(m, H) such that, for every k ∈ N and every graph G, either G contains k vertex-disjoint (0 mod m)-subdivisions of H as subgraphs, or there is a subset X of at most ck log(k + 1) vertices such that G − X contains no such subgraph.
As noted in Section 2, this bound is optimal, up to the value of c. Corollary 2.8 is the special case of Theorem 10.2 where H consists of a unique vertex with a loop.²
² The reader might rightly object that only simple graphs were considered so far in the paper. While it is true that the proof of Theorem 10.2 works even if H is a planar subcubic multigraph, let us mention the following alternative way of deducing Corollary 2.8 without resorting to multigraphs: take H to be a triangle. Then (0 mod m)-subdivisions of H correspond to cycles whose length is 0 mod m and at least 3m. Thus, applying Theorem 10.2 with this graph H, we obtain either k vertex-disjoint such cycles in G, in which case we are done, or a subset X of at most ck log(k + 1) vertices such that G − X has no such cycle. In the latter case, G − X could still have cycles of length m or 2m. However, it suffices to take an inclusion-wise maximal packing of these in G − X: if we find at least k of them, we are done. If not, we let Y be the set of at most 2m(k − 1) vertices of the cycles in the packing. Then G − (X ∪ Y) has no cycle of length 0 mod m, and |X ∪ Y| ≤ ck log(k + 1) + 2m(k − 1) = O(k log k), as desired.
We will also need the following more precise version of Lemma 5.3, which appears in [1].
Lemma 10.4 (Aboulker, Fiorini, Huynh, Joret, Raymond, and Sau [1], reworded). Let J be a planar graph and let g be a bounding function for J-models. If ℋ is a class of graphs such that for every J-model, the graph of the model contains a graph in ℋ as a subgraph, then there is a function f = O(g) such that, for every graph G and every k ∈ N, at least one of the following holds.
• G has k vertex-disjoint subgraphs, each isomorphic to an element of ℋ;
• there is a set X of at most f (k) vertices such that G − X has no subgraph isomorphic to an element of ℋ.
The proof of Theorem 10.2 is now immediate. Note that the same proof as above using a (linear) bounding function provided by Theorem 2.6 instead of Theorem 1.1 yields the following corollary.
Corollary 10.5. Let 𝒢 be a proper minor-closed class, let H be a planar subcubic graph, and let m be a positive integer. Then there is a constant c := c(𝒢, H, m) such that, for every k ∈ N and every graph G ∈ 𝒢, either G contains k vertex-disjoint (0 mod m)-subdivisions of H as subgraphs, or there is a subset X of at most c · k vertices such that G − X has no such subdivision.
We now address the proof of our partitioning corollary.
Proof of Corollary 2.2. For every r ∈ N, let Γ_r denote the r × r-grid, which is known to have treewidth r, and let c_r be the constant c in Theorem 1.1 for H := Γ_r. We prove the result for f(r) := c_r + f_{4.1}(|Γ_r|). Let us consider a graph G of treewidth at least f(r) · k log(k + 1). If ν_{Γ_r}(G) ≥ k, then we are done. In the opposite case, by Theorem 1.1, G has a set X of at most c_r · k log(k + 1) vertices such that G − X has no Γ_r-model. According to Theorem 4.1, G − X has treewidth less than f_{4.1}(|Γ_r|). By the properties of treewidth, we have tw(G) ≤ tw(G − X) + |X| < f_{4.1}(|Γ_r|) + c_r · k log(k + 1). This contradicts the assumption tw(G) ≥ f(r) · k log(k + 1).
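The final contradiction can be spelled out as a single chain of inequalities. The sketch below assumes k ≥ 1 and k log(k + 1) ≥ 1 (which holds, for instance, when logarithms are taken in base 2); the choice of base is an assumption made here for illustration and is not fixed in this section.

```latex
% Assumption (not stated in this section): k >= 1 and k \log(k+1) >= 1,
% e.g. logarithms in base 2. This is what justifies the third line.
\begin{align*}
\operatorname{tw}(G)
  &\le \operatorname{tw}(G - X) + |X| \\
  &<   f_{4.1}(|\Gamma_r|) + c_r \cdot k \log(k+1) \\
  &\le \bigl(f_{4.1}(|\Gamma_r|) + c_r\bigr) \cdot k \log(k+1) \\
  &=   f(r) \cdot k \log(k+1).
\end{align*}
```

The third line is the only step not written explicitly in the proof: it absorbs the additive term f_{4.1}(|Γ_r|) into the product.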
Let us conclude this section with the algorithms to compute minor-closed bidimensional parameters. The proofs are essentially the same as in [10] but since they are short, we include them for completeness.
Proof of Corollary 2.4. Let k′ = s_{2.2}(p)(k + 1) log(k + 2). We use an approximation algorithm for treewidth, for instance that of [2], which, given G and k′, either produces a tree-decomposition of G of width 4k′ or correctly concludes that tw(G) ≥ k′, in 2^{O(k′)} n^{O(1)} time.
In the first case, we run the algorithm required by the statement of the corollary on the tree-decomposition output by the approximation algorithm in order to decide whether π(G) ≤ k in time h(4k′) · n^{O(1)}. In the second case we immediately conclude that (G, k) is a negative instance. Indeed, by Corollary 2.2, G then contains as a minor (actually, as a subgraph) the disjoint union of k + 1 graphs of treewidth at least p. From the properties of π we deduce π(G) ≥ k + 1. The total worst-case running time is 2^{O(s_{2.2}(p)(k+1) log(k+2))} n^{O(1)} + h(4 s_{2.2}(p)(k + 1) log(k + 2)) · n^{O(1)}, as claimed.
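The two-case structure of this proof can be sketched as a short driver routine. This is only an illustrative outline, not an implementation of the algorithm of [2]: the names `decide_pi_at_most_k`, `s_22`, `approx_tree_decomposition` and `decide_on_decomposition` are hypothetical placeholders supplied by the caller, and the base-2 logarithm is an assumption.

```python
import math
from typing import Any, Callable, Optional


def decide_pi_at_most_k(
    graph: Any,
    k: int,
    p: int,
    s_22: Callable[[int], float],
    approx_tree_decomposition: Callable[[Any, float], Optional[Any]],
    decide_on_decomposition: Callable[[Any, Any, int], bool],
) -> bool:
    """Decide whether pi(graph) <= k via the two-case scheme of the proof.

    All callables are hypothetical stand-ins:
      - s_22 plays the role of the function from Corollary 2.2;
      - approx_tree_decomposition(graph, t) returns a tree-decomposition,
        or None when it certifies that the treewidth is at least t;
      - decide_on_decomposition is the algorithm assumed by the corollary.
    """
    # Threshold k' = s_{2.2}(p) (k + 1) log(k + 2); the base 2 is an assumption.
    k_prime = s_22(p) * (k + 1) * math.log2(k + 2)

    decomposition = approx_tree_decomposition(graph, k_prime)
    if decomposition is None:
        # Second case: tw(G) >= k', so G contains k + 1 disjoint subgraphs of
        # treewidth at least p, hence pi(G) >= k + 1: a negative instance.
        return False

    # First case: solve the problem on the small-width decomposition.
    return decide_on_decomposition(graph, decomposition, k)
```

The driver never inspects the graph itself; all graph-specific work is delegated to the two subroutines, which mirrors how the running time in the proof splits into two terms.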
Proof of Corollary 2.5. Observe that π is positive on all graphs with treewidth at least f_{4.1}(t), as any such graph contains H as a minor (Theorem 4.1) and π is minor-closed. The result then follows from Corollary 2.4 with p = f_{4.1}(t).