随机算法 11 Randomized Rounding for MAX SAT
MAX SAT and MAX CUT
randomized ½ approximation for each problem
MAX SAT
aka maximum satisfiability problem
- Input consists of Boolean variables , clauses .
- Clauses consist of some number of variables and \or and a non-negative weight .
- The objective of the problem is to find an assignment of that maximizes the weight of the satisfied clauses.
- A clause is said to be satisfied if one of the unnegated variables is set to
true
, or one of the negated variables is set tofalse
.
Some terminology:
- literal. A variable or a negated variable is a literal, so that each clause consists of some number of literals.
- is called positive literal
- is called negative literal
- length. The number of literals in a clause. Length of is .
- Clause of length 1 is unit clause.
WLOG, assume that no literal is repeated in a clause, at most one of and appears in a clause.
Assume the clauses are distinct.
A very straightforward use of randomization: set each to true
independently with probability .
🤔Theorem 1. Setting each to true
with probability independently gives a randomized -approximation algorithm for the MAX SAT problem.
Define indicator variable : if clause is satisfied and otherwise.
Let be a random variable that is equal to the total weight of the satisfied clauses, so that .
Let denote the optimum value of the MAX SAT instance. Then, by linearity of expectation, and the definition of the expectation of 0-1 random variable, we know
For each , the probability that it is not satisfied is the probability that each positive literal in is set to
false
and each negative literal is set totrue
, each of which happens with probability . Hence,where the last inequality is a consequence of the fact that . Hence,
where the last inequality follows from the fact that the total weight is an easy upper bound on the optimal value, since each weight is assumed to be non-negative.
If for each clause , the algorithm is a -approximation algorithm for such instances. Thus the longer the clauses, the better the approximation.
Consider for all clauses, the problem turns into MAX E3SAT. By analysis above, this randomized algorithm gives guarantee of approximation. Nothing better is possible for these instances unless .
🤔Theorem 2. If there is an -approximation algorithm for MAX E3SAT for any constant , then .
MAX CUT
aka maximum cut problem
In this problem. the input is an undirected graph , along with a non-negative weight for each edge . The goal is to partition the vertex set into two parts and , so as to maximize the weight of edges whose two endpoints are in different parts, one in and another in .
An edge with endpoints in both and is in the cut. In the case for each edge , we have an unweighted MAX CUT problem.
We place each vertex into independently with probability . As with the MAX SAT algorithm, this can be viewed as sampling a solution uniformly form the space of all possible solutions.
🤔Theorem 3. If we place each vertex into independently with probability , then we obtain a randomized -approximation algorithm for the maximum cut problem.
Consider a random variable that is if edge is in the cut, and otherwise.
Let be the random variable equal to the total weight of edges in the cut, so that .
Let denote the optimal value of the maximum cut instance. Then, as before, by linearity of expectation and the definition of expectation of a 0-1 random variable, we get that
The probability that a specific edge is in the cut: since the 2 endpoints are placed in the sets independently, they are in different sets with probability equal to . Hence,
where the inequality follows directly from the fact that the sum of the non-negative weights of all edges is obviously an upper bound on the weight of the edges in an optimal cut.
Derandomization
Randomized algorithm for MAX SAT can be derandomized by replacing the randomized decision of whether to set to true
with a deterministic one that will preserve the expected value of the solution. The decisions will be made sequentially.
Assume that is fixed, and other variables will be set true
with probability . Intuitively we should choose the value of which maximize the expected value of . In this way, we should maintain the algorithm that the expected value is at least half the optimum, while having fewer random variables left.
If , then we set to true
, otherwise false
. Then
By induction, we can set the sequence of variables better than .
Flipping Biased Coins
Consider MAX SAT instance with no unit clauses . (This condition can be moved)
Suppose we set each to be true
independently with probability , then the probability that any given clause is satisfied is at least for MAX SAT instances with no negated unit clauses.
假设这些布尔表达式不含有单个非变量。
对于只含一个变量的布尔表达式,满足的概率为 .
对于含有至少两个变量的布尔表达式,假设 有 个, 有 个,则该表达式不满足的概率为:
现在就需要将 最大化。
Best performance is .
🤔Theorem 4. Setting each to true
with probability independently gives a randomized -approximation algorithm for MAX SAT instances with no negated unit clauses.
Let be the weight of the unit clause if it exists in the instance, and let be zero otherwise.
has a tighter bound: .
Let be the set of indices of clauses of the instance excluding unit clauses of the form . WLOG the weight of each clause is no greater than the weight of clause . Thus . Each is set to true
with probability . So the algorithm produces
This algorithm can be derandomized using the method of conditional expectations.
Randomized Rounding
Biasing probability with which we set true
yields an improved approximation algorithm. However, we gave each variable the same bias. We can do better by giving each variable its own bias!
Idea of randomized rounding:
- First set up an integer programming formulation of the problem at hand in which there are 0-1 integer variables. 0-1 variable for each Boolean variable s.t. corresponds to set
true
. - The integer program is relaxed to a linear program by replacing the constraints with .
- The fractional value is interpreted as the probability that should be set to . In this case we set each to
true
with probability independently.
For MAX SAT problem, in addition to the variables , we introduce a variable for each that will be if the clause is satisfied and otherwise. For each clause let be the indices of the variables that are positive, be negated in the clause.
We denote the clause by
\bigvee_{i\in P_j}x_i\or\bigvee_{i\in N_j}\bar x_i.
Then the inequality
must hold for since if each variable that occurs positively in the clause is set to false
and each variable that occurs negatively is set to true
, then the clause is not satisfied, and must be .
This yields integer programming formulation of MAX SAT problem:
- maximize
- subject to , \forall C_j=\bigvee_{i\in P_j}x_i\or\bigvee_{i\in N_j}\bar x_i,
is the optimum value of this linear program , then clearly the optimum value of integer program (IP) .
Let be the optimum value of solution to the linear programming, relaxation. Consider the result of using randomized rounding, and setting to true
with probability independently. Before we can begin the analysis, we will need 2 facts. The first is commonly called the arithmetic-geometric mean inequality, because it compares the arithmetic and geometric means of a set of numbers.
If , then .
🤔Theorem 5. Randomized rounding gives a randomized -approximation algorithm for MAX SAT.
Applying arithmetic-geometric mean inequality:
By rearranging terms,
By the inequality Optimum of LP Optimum of IP, we see
Function is concave for , then
Therefore the expected value of the randomized rounding algorithm is
is non-increasing. It approaches from above as tends to infinity. Thus we have
This randomized rounding algorithm can be derandomized in the standard way using the method of conditional expectations.
这个证明不太优雅,其实可以用 进行放缩,将乘积化为 的指数和的形式。
然后再用线性规划的结果至少优于整数线性规划的结果,得到 。
Choosing the Better of 2 Solutions
Consider a given clause of length . The randomized rounding algorithm satisfies the clause with probability at least , while the unbiased randomized algorithm satisfies the clause with probability .
When the clause is short, it is very likely to be satisfied by the randomized rounding algorithm, tough not by the unbiased randomized algorithm. When the clause is long, the opposite is true.
🤔Theorem 6. Choosing the better of the two solutions given by the randomized rounding algorithm and the unbiased randomized algorithm yields a randomized -approximation algorithm for MAX SAT.
Let be a random variable denoting the value of the solution by randomized rounding algorithm.
Let be the results of unbiased randomized algorithm. To show:
Observe that
Claim that
for all positive integers .
通过求导等方法可以求出这个式子的最小值,不太优雅啊……
Notice that taking the best solution of the two derandomized algorithms gives at least half of sum of two results.
Thus taking the best solution of the two derandomized algorithms is a deterministic -approximation algorithm.
Non-linear Randomized Rounding
-approximation algorithm for MAX SAT can be obtained by using randomized rounding with a non-linear function .
is a function from to
Select s.t. for every . Such an does exist.
Use the similar analysis of linear randomized rounding.