An analytic approach to sparse hypergraphs: hypergraph removal, Discrete Analysis 2018:3, 47 pp. The famous triangle removal lemma of Ruzsa and Szemerédi states that for every $\epsilon>0$ there exists $\delta>0$ such that every graph $G$ on $n$ vertices with at most $\delta n^3$ triangles has a triangle-free subgraph $G'$ such $|E(G)\setminus E(G')|\leq\epsilon n^2$. To put it less formally, a graph that is almost triangle free can be approximated by a graph that is exactly triangle free. A surprising application of this innocent looking result is a proof of Roth's theorem, that for every $\delta>0$ there exists $n$ such that every subset $A\subset\{1,2,\dots,n\}$ of size at least $\delta n$ contains an arithmetic progression of length 3. To prove this deduction, one defines a tripartite graph with appropriate vertex sets $X,Y,Z$ (one can, for instance, take each set to be $\{1,2,\dots,3n\}$), with $xy$ an edge if $y-x\in A$, $yz$ an edge if $z-y\in A$, and $xz$ an edge if $(z-x)/2\in A$. Then a triangle $xyz$ in the graph corresponds to an arithmetic progression $y-x,z-x,z-y$ in $A$ (which may be degenerate, but there cannot be too many degenerate ones). This lemma has been generalized in several different directions. One is to a "simplex removal lemma" for hypergraphs. For 3-uniform hypergraphs, for example, it states that for every $\epsilon>0$ there exists $\delta>0$ such that if a 3-uniform hypergraph $H$ on $n$ vertices contains at most $\delta n^4$ simplices (that is, configurations of the form $xyz,xyw,xzw,yzw$), then one can throw away at most $\epsilon n^3$ triples from $H$ in order to obtain a subhypergraph $H'$ that contains no simplices at all. Another direction of generalization is to a "sparse random version". Here the idea is to start with a random graph $U$ with $n$ vertices and edge probability $n^{-\alpha}$ for some suitable $\alpha>0$, and to prove a result about subgraphs $G$ of $U$, where the various densities are measured relative to $U$. Thus, $G$ is deemed to have few triangles if the number of triangles in $G$ is at most $\delta$ times the number of triangles in $U$, and $G'$ is deemed to be a good approximation of $G$ if it differs from $G$ by at most $\epsilon$ times the number of edges in $U$. Of course, one can try to generalize the result in both directions simultaneously in order to obtain a sparse random version of the simplex removal lemma. The purpose of this paper is to give a new way of proving such a generalization. The methods used are infinitary, continuing a line of research initiated by Lovász and various coauthors with their introduction of graph and hypergraph limits. The novelty of the paper is that it manages to apply an infinitary approach of this kind in the sparse random context, which was not obviously possible and which required the author to work out how to deal with some serious-looking obstacles.
Read full abstract