A Possible Ramsey Insight into P Versus NP?

September 26, 2021

In mathematics the art of proposing a question must be held of higher value than solving it—Georg Cantor

David Zuckerman has a beautiful result on the approximate hardness of the clique problem. His paper, “Linear Degree Extractors and the Inapproximability of Max Clique and Chromatic Number,” has has received almost 1,000 citations since it was first published in 2007. Wow.

Today we discuss a possible relevance of this result for P versus NP.

Recall a clique is a subset of nodes of an undirected graph, such that every two nodes in the subset are connected by an edge. Wikipedia’s illustration below does not have large cliques—only two cliques of ${4}$ nodes shaded blue to go with a bunch of triangles shaded lighter.

Is there a relatively small change we could make to the graph to give it a much larger clique? At top right, we could add two edges to extend the blur ${4}$ -clique to a ${5}$ -clique—a pentagram inside a pentagon. At bottom left we could add ${4}$ edges to make the four triangles blossom into a ${6}$ -clique.

Computing the size of the largest clique in a graph—or just telling whether it has a clique of a given size ${k}$ —is a famous NP-complete problem. Counting the number of cliques of the largest size is evidently a harder problem still. Yet you might think it easier to tell a graph with only small cliques apart from a graph that has large ones. This is where David’s result comes in.

In passing, we note David’s place among the winners of the 1986 William Lowell Putnam Mathematical Competition. Also on the list was Bjorn Poonen, whom we have also highlighted multiple times including recently. Here is the 1986 exam; Ken and I confess we did much worse when we took the Putnam in earlier years.

David’s Result

Polynomial growth refers to a function that is bounded by ${n^c}$ for some constant ${c}$ . Quasi-polynomial growth means one bounded by ${n^{\log^c n}}$ for some ${c}$ . Let ${\mathsf{\widetilde{P}}}$ those sets accepted by a deterministic algorithm that runs in quasi-polynomial time and let ${\mathsf{N\widetilde{P}}}$ be those sets accepted by a nondeterministic algorithm that runs in quasi-polynomial time

Let ${G}$ be an ${n}$ -vertex graph. We have two properties of ${G}$ for a fixed ${\epsilon>0}$ :

Clumpy: ${G}$ has a clique of size at least ${\epsilon n}$ . Limpid: ${G}$ has no clique of size ${n^{\epsilon}}$ .

The following is a main result in David’s paper:

Theorem 1 The problem of distinguishing clumpy graphs from limpid graphs is not in ${\widetilde{P}}$ provided ${\mathsf{\widetilde{P} \neq N\widetilde{P}}}$ .

It is fundamentally a de-randomization result. Our idea is to scale it down in a way that may bring ${\mathsf{\widetilde{P} \neq N\widetilde{P}}}$ into contact with mathematical theorems that govern effects of the scaling. The theorems may be adjacent to conjectures that can suggest new approaches. The idea may require bringing back randomization, however.

The Question

Our quest to build on David’s theorem leads us to interpose a classic idea. In work with Paul Erdős, Tibor Radó wrote ${G\rightarrow(H)}$ to mean that every ${2}$ -coloring of the edges of ${G}$ has a monochromatic subgraph ${H}$ . The classic two-color Ramsey theorem states that for all ${r}$ there is a ${G}$ such that ${G\rightarrow(K_r)}$ . Erdős and George Szekeres further showed ${G}$ can have fewer than ${2^{2r}}$ nodes.

Definition 2 Say an ${n}$ -node graph ${G}$ is ${c}$ -Radó if every edge 2-coloring of ${G}$ leaves a clique of ${\frac{1}{2}\log n - c}$ vertices.

In Radó’s notation, this is if ${G \rightarrow (K_{\frac{1}{2}\log_2 n - c})}$ . The first point is that all clumpy graphs are ${c}$ -Radó for large enough ${c}$ . Taking ${c \geq 2\log_2(\frac{1}{\epsilon})}$ puts the ${\epsilon n}$ -sized clique above the Erdős-Szekeres bound for the Ramsey property to hold.

Thus the key question becomes:

Suppose that ${G}$ is a limpid graph. Can ${G}$ also be ${c}$ -Radó?

Suppose the answer is no. Then every limpid graph ${G}$ has an edge 2-coloring that does not leave a clique of size ${k = \frac{1}{2}\log n - c}$ . We can verify this in quasi-polynomial time by trying all ${k}$ -subsets of the vertices.

This sets up a situation where both “limpid” and “clumpy” have an ${\mathsf{N\widetilde{P}}}$ witness. More to the point, we obtain the modified problem of distinguishing “clumpy” from “not ${c}$ -Radó” (for appropriately chosen ${c}$ ). This would then likewise be ${\mathsf{N\widetilde{P}}}$ -hard by David’s result. Then, we would have an ${\mathsf{N\widetilde{P}}}$ -hard promise problem with promise set ${\mathsf{clumpy \cup}}$ not-c-Radó belonging to ${\mathsf{N\widetilde{P}}}$ . This would be a plausible unlikelihood along the lines of what the ESY conjecture, which we covered here, rules out. We can obtain a clearer unlikelihood if we relax the new definition.

Definition 3 An ${n}$ -node graph ${G}$ is almost ${c}$ -Radó if all but a negligible fraction of edge 2-colorings leave a monochrome clique on ${\frac{1}{2}\log_2 n - c}$ vertices.

Here “negligible” means a function asymptotically less than ${1/q(n)}$ for any quasi-polynomial function ${q(n)}$ . With some informality, we can prove:

Proposition 4 Suppose no limpid graphs are almost ${c}$ -Radó. Then ${\mathsf{N\widetilde{P}}}$ is contained in randomized ${\widetilde{P}}$ .

Proof: Formally, the negation is worded so that there is a quasi-polynomial function ${q(n)}$ such that for every limpid graph ${G}$ —with the concrete value of ${\epsilon}$ and corresponding ${c}$ —there are a ${\frac{1}{q(n)}}$ fraction of 2-edge-colorings of ${G}$ that do not have a monochromatic clique of size ${\frac{1}{2}\log_2 n - c}$ . Now define an algorithm ${A}$ that given any graph ${G}$ guesses order- ${q(n)}$ such colorings at random.

If ${A}$ finds a coloring that does not have a monochromatic clique of size ${\frac{1}{2}\log_2 n - c}$ , then ${A}$ rejects.
Else, ${A}$ accepts.

Whenever ${G}$ is clumpy, ${A}$ will always accept. If ${G}$ is limpid, then with high probability, ${A}$ will find a coloring witnessing that ${G}$ is not clumpy, and ${A}$ will reject ${G}$ . Thus, ${A}$ distinguishes limpid from clumpy in randomized quasi-polynomial time. $\Box$

The Quest For Limpid Radó Graphs

If, like most, you believe ${\mathsf{\widetilde{P} \neq N\widetilde{P}}}$ , or further in the “ESY”-type strengthening of ${\mathsf{\widetilde{NP} \neq \text{co-}N\widetilde{P}}}$ , then the answer must be that limpid (almost-) ${c}$ -Radó graphs exist. The question of finding them, however, has a long trail that leads back into complexity theory.

To begin, let’s decouple the clique size from the number ${n}$ of nodes by considering graphs with a clique of size ${6}$ to be “lumpy,” and use the original form of the Radó property: every 2-edge coloring has a monochrome triangle. It is surprisingly hard to find a ${K_6}$ -free graph with the property. Brute-force attempts for ${7}$ -node and ${8}$ -node graphs that are maximal for having no ${K_6}$ fail:

The smallest ${K_6}$ -free graph with the Radó triangle property that we know how to build has ${n=14}$ nodes and comes from a proof that the Radó triangle property is ${\mathsf{NP}}$ -complete. The proof is in a 1985 paper by Moon-Jung Chung, W. Michael Evangelist, and Ivan Hal Sudborough. The graph at left, which looks like a bat, is such that any 2-edge coloring without a monochrome triangle must give the two edges marked ${x}$ the same color—whichever color is used for two edges of the middle triangle.

Joining a bat and an upside-down bat creates the graph at right, in which the edges marked ${x}$ and ${\bar{x}}$ must have opposite colors. This creates a truth-assignment gadget for each pair of literals ${x_i}$ and ${\bar{x}_i}$ in a 3CNF formula ${\phi}$ given as instance of the ${\mathsf{NP}}$ -complete Not-All-Equal 3SAT problem. Using one triangle per clause of ${\phi}$ , connecting more bats between ${x_i}$ or ${\bar{x}_j}$ appearing in the clause to the corresponding edges in the truth gadgets creates a ${K_6}$ -free graph ${G_\phi}$ that has the Radó property if and only if ${\phi}$ has an assignment making 1 or 2 literals true in each clause.

Then every negative instance ${\phi}$ induces a ${K_6}$ -free Radó graph ${G_\phi}$ . We can get one more simply, however, by adding a third “bat” to the graph at right above that connects ${x}$ and ${\bar{x}}$ . The resulting ${14}$ -node graph has no 2-edge coloring without making a monochrome triangle.

To be truly “limpid,” however, the graph should exclude smaller cliques. The following problem of Erdős and Radó was open for some time:

Does there exist a ${K_4}$ -free graph with the Radó triangle property?

This was solved in 1970 by Jon Folkman and then improved in 1987 by a probabilistic argument of Joel Spencer, in a paper titled, “Three Hundred Million Points Suffice.” That’s right: for the minimum Radó criterion he showed there is a limpid graph with ${N < \mathbf{300,000,000}}$ nodes. This 2007 paper by the Ramsey-theory experts Stanisław Radziszowski and Xu Xiaodong proves that the minimum possible ${N}$ is ${19}$ and gives evidence that ${N \leq 127}$ is plausible. let ${N_{3,3}}$ stand for the least possible such ${N}$ .

However one such graph exists, it must give rise to a “bat”-type construction and forthwith a reduction showing the ${\mathsf{NP}}$ -completeness of the ${K_4}$ -free Radó property. Thus the Ramsey-type existence question is entangled with complexity theory. How this relationship scales up for ${c}$ -Radó and clumpy versus limpid graphs is the point we are driving at.

Ramsey Scaling And Complexity

There are two main levers by which we are scaling:

Up from the constant ${3}$ of the Radó triangle property to ${\rho = \frac{1}{2}\log_2(n) - c}$ for ${c}$ -Radó.
Up to ${k=n^\epsilon}$ as the allowed clique size in a limpid graph.

The latter corresponds to the ${\epsilon n}$ size for clumpy that we are distinguishing, but there is some freedom in ${\epsilon}$ . If we imagine ${k}$ and ${\rho}$ as fixed—temporarily ignoring the dependence on ${n}$ —we can ask, what is the minimum size ${N = N_{\rho,k}}$ of a ${k}$ -limpid graph that is ${c}$ -Radó?

To avoid a complexity collapse, we must bet on at worst a single-exponential dependence on ${\rho}$ , polynomial in ${k}$ . Can we possibly show this directly? Note that ${\rho = 3}$ , ${k = 3}$ already raised the specter of ${N_{3,3}}$ in the hundred millions. Not only do we not see a way to scale up Spencer’s proof, he remarked in the intro that his proof is “extremely case specific” and does not scale even to 3-edge colorings. The “almost” feature of Definition 3 adds a further complication.

It may be that the guiderails for ascending to higher ${k}$ and ${\rho}$ (scaling with the number ${n}$ of nodes) are already present among the details of the graph constructions in David’s proof and the proofs of the theorems his paper builds on. We have not had time to delve. There must be some connection; the question is whether complexity considerations take the lead or follow only in train of the sides that can be mathematically disproved. Other connections of Ramsey theory are in this nice 2004 survey by Vera Rosta.

Here is one more indication of why we think the interaction between complexity and Ramsey theory can be nontrivial. Putting ${R = 2^r}$ , the upper bound noted above for the diagonal- ${r}$ Ramsey number has rough order ${R^2}$ . The best known lower bound has rough order ${R^{1/2}}$ . Can we show an impact of the growth of ${R}$ on complexity theory, in advance of resolving this gap which has been open for over fifty years?

Open Problems

The key issue is, what about the above question? Assuming the usual belief that quasi-polynomial time and nondeterministic quasi-polynomial time are distinct we must be able to show a fact about ${2}$ -colorings of clumpy and limpid graphs. If this is hard to prove, then we have an interesting impasse. Even worse, if this is false, then our beliefs collapse.

The feature that surprises us about the 1986 Putnam exam is that it does not have a question that involves the number 1,986. Ken and I both remember such features. Here is a problem they could have used; we will give the answer later:

Alan and Barbara play a game in which they take turns filling entries of an initially empty 1986 x 1986 array. Alan plays first. At each turn, a player chooses a real number and places it in a vacant entry. The game ends when all the entries are filled. Alan wins if the determinant of the resulting matrix is nonzero; Barbara wins if it is zero. Which player has a winning strategy?

The answer is at the end of the next post.

[removed reference to Putnam ranking, “batch”->”bats” in the proof from 1985, added answer link for puzzle at end]

A Possible Ramsey Insight into P Versus NP?

A Possible Ramsey Insight into P Versus NP?

David’s Result

The Question

The Quest For Limpid Radó Graphs

Ramsey Scaling And Complexity

Open Problems

Like this:

Recommend

Microsoft Surface Pro 8 Reviews, Pros and Cons | TechSpot

2Africa Pearls connects Africa, Europe, and Asia - Facebook Engineering

On Sprint Planning and Peeling Potatoes – Web Development Insights

Rakuten's Kobo brand introduces Kobo Sage and Kobo Libra 2 eReaders | TechSpot

Xbox teams with Rockstar energy drink on Halo-inspired collector's edition cans...

3 Ways to Remove the Background from an Image

Bug in Compound DeFi platform caused a giveaway of $162 million worth of COMP to...

Facebook whistleblower renews scrutiny of the social media giant : NPR

My Logging Best Practices

Careers, Internships and Jobs at Debrief | Debrief Careers

About Joyk