Boolean satisfiability is prototypical NPNP-complete problem.

There is no algorithm deciding SATSAT in worst case in polynomial time unless P=NPP=NP.

We always assume that PNPP\neq NP.

For SATSAT problem, SOTA algorithm uses time 2(11O(log(m/n)))nmO(1)2^{(1-\frac{1}{O(\log(m/n))})n}m^{O(1)}.

For special case of kSATkSAT, there are faster algorithms, but all take time 2(1ck)n2^{(1-\frac{c}{k})n} for some constant 0<c<k0\lt c\lt k.

Schöning’s kSAT Algorithm

该算法在上学期随机算法课最后一节课 马尔可夫链和随机游走 中讲过,故此处略过。

分析过程很恶心。

The Sparsification Lemma

🎯Lemma 22. For all ϵ>0\epsilon\gt0 and integers kk, there exists a constant C=C(ϵ,k)C=C(\epsilon,k) and an algorithm that given as input a kCNFkCNF Φ\Phi in nn variables runs in time tnO(1)tn^{O(1)}, where t2ϵnt\le2^{\epsilon n}, and give as output a list of kCNFkCNF formulas Ψ1,,Ψt\Psi_1,\cdots,\Psi_t each with at most CnCn clauses such that

Φ(x)=i=1tΨi(x),\Phi(x)=\bigvee_{i=1}^t\Psi_i(x),

for all x{0,1}nx\in\{0,1\}^n.

Reduce to Hitting Set Problem

Recall HittingSetHittingSet problem. Consider a universe UU of elements and a collection S\mathcal S of subsets of U\mathcal U. We say that TUT\subseteq\mathcal U is a hitting set for S\mathcal S if STS\cap T\neq\empty for all SSS\in\mathcal S. We denote by σ(S)\sigma(\mathcal S) the collection of all hitting sets of S\mathcal S.

kHittingSetkHittingSet problem:

  • Instance: Collection S\mathcal S of subsets of a universe U=2U\mathcal U=2^U.
  • Output: Hitting set TT for S\mathcal S minimizing T|T|.

We denote by kHittingSetkHittingSet the restriction to collections of subsets of size at most kk.

We can transform kSATkSAT into kHittingSetkHittingSet problem:

  • C\mathcal C is the subsets of U\mathcal U corresponding to the clauses of Φ\Phi.
  • An assignment to the variables of Φ\Phi can be viewed as a subset of U\mathcal U.
  • A\mathcal A is the collection of the nn sets {xi,xi}\{x_i,\overline{x_i}\} for i=1,,ni=1,\cdots,n. Note that assignments to the variables of Φ\Phi are in one-to-one correspondence to hitting sets of A\mathcal A.
  • Defining SΦ=CA\mathcal S_\Phi=\mathcal C\cup\mathcal A

Observation: There is a one-to-one correspondence between the set of satisfying assignments of Φ\Phi and hitting sets of SΦ\mathcal S_\Phi of size nn.

Any collection of subsets T\mathcal T over U\mathcal U can be viewed as a CNF formula by letting each subset define a clause.

📖Definition 57. Let T\mathcal T and S\mathcal S be collections of subsets. We say that T\mathcal T is a restriction of S\mathcal S if for every SSS\in\mathcal S there exists TTT\in\mathcal T s.t. TST\subseteq S.

也就是集合的集合 S\mathcal S 中的每一个元素(是一个集合)是集合的集合 T\mathcal T 中对应有一个元素(是一个集合)的子集。

很绕,不是吗?

Clearly, if T\mathcal T is a restriction of S\mathcal S, then σ(T)σ(S)\sigma(\mathcal T)\subseteq\sigma(\mathcal S). Any satisfying assignment of CNF formula given by T\mathcal T is also a satisfying assignment of the CNF formula given by S\mathcal S.

🤔Lemma 23 (Hitting set correspondence). For all ϵ>0\epsilon\gt0 and integers kk, there exists a constant C=C(ϵ,k)C=C(\epsilon,k) and an algorithm that given as input collection S\mathcal S of subsets of size at most kk of a universe U\mathcal U of size nn runs in time tnO(1)tn^{O(1)}, where t2ϵnt\le 2^{\epsilon n}, and give as output a list of restrictions T1,,Tt\mathcal T_1,\cdots,\mathcal T_t each with at most CnCn subsets of U\mathcal U which are all of size at most kk s.t.

σ(S)=i=1tσ(Ti).\sigma(\mathcal S)=\bigcup_{i=1}^t\sigma(\mathcal T_i).

We call a subset of U\mathcal U of size jj: jj-set.

The algorithm is a branching algorithm where each leaf of the recursion tree will give a restriction of S\mathcal S.

To branch, the algorithm looks for a large collection of subsets of the same size that has a non-empty intersection, and branches on whether or not the common intersection will be hit by the hitting set. We define such collections of sets as flowers.

📖Definition 58. Let S1,,ScS_1,\cdots,S_c be jj-sets.

  • We say that these sets are a flower if H=i=1cSiH=\bigcap_{i=1}^cS_i\neq\empty.
  • We call HH the heart of the flower and sets (S1H),,(ScH)(S_1\diagdown H),\cdots,(S_c\diagdown H) the petals.
  • Note that all petals have the same size jHj-|H|, which we denote as petal-size.

定义一朵花:一组大小为 jj 的集合,它们的交集为空。

花心:这组集合的交集。对应定义花心的大小。

花瓣:每个集合去除花心。对应定义花瓣的大小。

The algorithm will be controlled by integer parameters 2=θ0<θ1<<θk12=\theta_0\lt\theta_1\lt\cdots\lt\theta_{k-1}. We make θi\theta_i increase rapidly as a parameter of ii.

📖Definition 59. Let S1,,ScS_1,\cdots,S_c be a flower of jj-sets and petal-size ii.

  • The flower is large if cθic\ge\theta_i.
  • The flower is good if
    • it is a large flower chosen s.t. jj is chosen minimally
    • and if more than one large flower of jj-sets exists then chosen s.t. ii is chosen minimally.

如果一朵花含有集合的个数足够大,则称这朵花是“大”的。

如果一朵花是“大”的,并且其中包含的集合大小、花瓣大小都是最小的,则称这朵花是“好”的。

很形象的定义,但还是很绕

The algorithm will eliminate sets of a collection T\mathcal T that contain other sets of the collection. If S,TTS,T\in\mathcal T are such that STS\subsetneq T, we say that TT is eliminated by SS.

Let π(T)\pi(\mathcal T) denote the collection of minimal sets of T\mathcal T, i.e., the sets of T\mathcal T that are not eliminated by sets of T\mathcal T.

Pseudo code of Reduce Algorithm

Reduce(S)(\mathcal S)

Input: Collection S\mathcal S of subsets of size at most kk of a universe U\mathcal U.

Output: Restrictions T1,,Tt\mathcal T_1,\cdots,\mathcal T_t of s.t. σ(S)=i=1tσ(Ti)\sigma(\mathcal S)=\bigcup_{i=1}^t\sigma(\mathcal T_i).

  1. Sπ(S)\mathcal S\gets\pi(\mathcal S).
  2. If S\mathcal S has no large flower, return {S}\{\mathcal S\}
  3. Pick good flower S1,,ScS_1,\cdots,S_c with heart H=i=1cSiH=\bigcap_{i=1}^cS_i.
  4. return Reduce(S{H})(S\cup\{H\})\bigcupReduce(S{S1H,,ScH})(\mathcal S\cup\{S_1\diagdown H,\cdots,S_c\diagdown H\}).

Correctness of Reduce

🤔Lemma 24. If Reduce(S)={T1,,Tt}(\mathcal S)=\{\mathcal T_1,\cdots,\mathcal T_t\} then σ(S)=i=1tσ(Ti)\sigma(\mathcal S)=\bigcup_{i=1}^t\sigma(\mathcal T_i).

Each Ti\mathcal T_i is a restriction of S\mathcal S. Thus σ(Ti)σ(S)i\sigma(\mathcal T_i)\subseteq\sigma(\mathcal S)\forall i.

Note that σ(π(T))=σ(T)\sigma(\pi(\mathcal T))=\sigma(\mathcal T) for any collection T\mathcal T. It follows that eliminating sets from S\mathcal S does not eliminate any hitting sets of S\mathcal S.

Suppose that TT is a hitting set of S\mathcal S and let S1,,ScS_1,\cdots, S_c be a flower of S\mathcal S with heart HH. If THT\cap H\neq\empty, then TT is a hitting set of S{H}S\cup\{H\}. If TH=T\cap H=\empty then TT must be a hitting set of S{S1H,,ScH}\mathcal S\cup\{S_1\diagdown H,\cdots,S_c\diagdown H\}. Thus any hitting set of S\mathcal S is preserved in at least one recursive branch.

It’s easy to see that Reduce always terminates.

It remains to establish qualitative properties of Reduce: Every restriction output by Reduce is sparse, i.e., contains at most CnCn sets, for a constant C=C(k,ϵ)C=C(k,\epsilon).

🤔Lemma 25. Let C=i=1k(θi11)C=\sum_{i=1}^k(\theta_{i-1}-1). Every restriction T\mathcal T output by Reduce contains at most (θi11)n(\theta_{i-1}-1)n many ii-sets, and therefore at most CnCn sets in total.

Prove by contradiction. Use pigeonhole principle.

The number of restriction output by Reduce should be bounded by the number of leaves tt in the recursion tree.

🤔Lemma 26. Let 0<m<n0\lt m\lt n be integers. It holds that

i=1m(ni)2H(mn)n.H(p)=plog2(1/p)+(1p)log2(1/(1p))\sum_{i=1}^m{n\choose i}\le2^{H(\frac{m}{n})\cdot n}.\\ H(p)=p\log_2(1/p)+(1-p)\log_2(1/(1-p))

🤔Corollary 14. A binary tree of depth hh in which every root-to-leaf path follows the right child at most γh\gamma h times has at most 2H(γ)h2^{H(\gamma)h} leafs.

Proof by Lemma 26.

When γ1/2\gamma\le1/2 we have H(γ)2γlog2(1/γ)H(\gamma)\le2\gamma\log_2(1/\gamma), thus we use the simpler upper bound 2(2γlog2(1/γ))h2^{(2\gamma\log_2(1/\gamma))h} on the number of leaves.

🤔Lemma 27. If Reduce(T)(\mathcal T) recursively calls Reduce(T)(\mathcal T') by adding an ii-set to T\mathcal T, then Ii(T)\mathcal I_i(\mathcal T') holds. Furthermore in all recursive calls Reduce(T)(\mathcal T'') that follows, the invariant Ii(T)\mathcal I_i(\mathcal T'') holds.

没咋看懂这个引理

🤔Lemma 28. Any set SS can eliminate at most 2(θi11)2(\theta_{i-1}-1) of the added ii-sets.

🤔Lemma 29. Let 1<i<k1\lt i\lt k and suppose βi1\beta_{i-1} is s.t. at most βi1n\beta_{i-1}n sets of size at most i1i-1 that are added along a given path of the recursion tree of Reduce. Then the number of ii-sets added along that path is at most (2βi1+1)(θi11)n(2\beta_{i-1}+1)(\theta_{i-1}-1)n.

🤔Corollary 15. At most βin\beta_in sets of size at most ii are added along any path of the recursion tree of Reduce, for all 1i<k1\le i\lt k.

🤔Lemma 30. Along any path of the recursion tree of Reduce at most βin/θi\beta_in/\theta_i families are created by adding petals of size ii for every 1i<k1\le i\lt k.

We set θi\theta_i in such a way that the ratio θi/βi\theta_i/\beta_i is equal to an integer constant α\alpha, to be fixed later, for 1ik1\le i\le k. Thus

βi=4βi1θi1=4α(βi1)2βi=(4α)2i11,θi=α(4α)2i111ik\beta_i=4\beta_{i-1}\theta_{i-1}=4\alpha(\beta_{i-1})^2\\ \Rarr \beta_i=(4\alpha)^{2^{i-1}-1},\theta_i=\alpha(4\alpha)^{2^{i-1}-1}\forall 1\le i\le k

🤔Corollary 16. Along any path of the recursion tree of Reduce at most kn/αkn/\alpha families are created by adding petals.

让我缓缓

The Exponential Time Hypothesis

A stronger hypothesis than PNPP\neq NP: Exponential time is required to decide SATSAT.

Exponential time hypothesis (aka ETH): kSATkSAT requires worst-case linear exponential time, for all k3k\ge3.

A strengthening of this hypothesis, strong exponential time hypothesis: it is not possible to save an exponential factor 2ϵn2^{\epsilon n} for a fixed 1>ϵ>01\gt\epsilon\gt0 independent of kk

📖Definition 60. Define sk=inf{δkSAT is solvable in time O(2δn}s_k=\inf\{\delta|kSAT\text{ is solvable in time }O(2^{\delta n}\}. The ETH is the statement that k3:sk>0\forall k\ge3:s_k\gt0, and SETH is the statement that s=limksk=1s_{\infty}=\lim_{k\to\infty}s_k=1.

🤔Proposition 20. For any ϵ>0\epsilon\gt0 and integer kk there is C>0C\gt0 s.t. if there is an algorithm AA for solving 3SAT3SAT in time 2δnnO(1)2^{\delta n}n^{O(1)} there is an algorithm BB for kSATkSAT running in time O(2(ϵ+δkC)nnO(1))O(2^{(\epsilon+\delta kC)n}n^{O(1)}).

Input: kCNF:ΦkCNF:\Phi on nn variables. the algorithm BB first invokes the algorithm of the Lemma 22 to compute kCNFkCNF formulas Ψ1,,Ψt\Psi_1,\cdots,\Psi_t in time tnO(1)tn^{O(1)}, where t2ϵnt\le2^{\epsilon n}, and each Ψi\Psi_i has hat most CnCn clauses.

Φ\Phi is satisfiable iff at least one of the formula Ψi\Psi_i are satisfiable.

BB may apply the standard reduction from kSATkSAT to 3SAT3SAT, obtaining 3CNF3CNF formula Ψ1,,Ψt\Psi_1',\cdots,\Psi_t', each having at most n+Cn(k3)kCnn+Cn(k-3)\le kCn variables.

Using algorithm AA each Ψi\Psi_i' can be checked for satisfiability in time 2δ(kCn)nO(1)2^{\delta(kCn)n^{O(1)}}. This together means that BB solves kSATkSAT in time as claimed.

🤔Corollary 17. The following holds:

  1. ETH is equivalent to s3>0s_3\gt0.
  2. SETH implies ETH.

🤔Proposition 21. ETH is equivalent to the existence of δ>0\delta\gt0 s.t. no algorithm can solve 3SAT3SAT in time 2δ(n+m)nO(1)2^{\delta(n+m)}n^{O(1)}, where nn is the number of variables and mm the number of clauses.

Consider the independent set problem:

  • Instance: Graph G=(V,E)G=(V,E), integer KK.
  • Question: Does GG have an independent set of size at least KK?

🎯Theorem 60. If ETH holds, then there exists δ>0\delta\gt0 s.t. no algorithm can solve the independent set problem in time 2δnnO(1)2^{\delta n}n^{O(1)}, where nn is the number of vertices of the given graph GG.

Standard reduction from 3SAT3SAT to independent set problem.