Randomized Algorithms 14: Markov Chains and Random Walks
Lecture notes by Elias Koutsoupias from Oxford
Markov Chains
Definitions
A discrete-time finite Markov chain is a sequence of random variables $X_0, X_1, X_2, \ldots$ taking values in a finite set of states such that
$$\Pr[X_{t+1} = j \mid X_t = i, X_{t-1} = i_{t-1}, \ldots, X_0 = i_0] = \Pr[X_{t+1} = j \mid X_t = i] = P_{ij},$$
where $P = (P_{ij})$ is a stochastic matrix (each of its rows sums to $1$), called the transition matrix.
$X_t$ represents the state of the Markov chain at time $t$ and the matrix $P$ encodes the state-to-state transition probabilities. The characteristic property of a Markov chain is that the distribution of $X_{t+1}$ depends only on $X_t$. Moreover, the conditional probability above is independent of the time $t$.
The $m$-step transition probabilities are given by the power matrix $P^m$, i.e.,
$$\Pr[X_{t+m} = j \mid X_t = i] = (P^m)_{ij}.$$
Thus, if the row vector $\mu_t$ denotes the distribution of $X_t$, then $\mu_{t+m} = \mu_t P^m$.
A Markov chain is completely determined by the distribution of $X_0$ and the transition matrix $P$.
A Markov chain can be represented as a weighted directed graph whose vertices are the states of the chain and in which there is an edge $(i, j)$ of weight $P_{ij}$ whenever $P_{ij} > 0$. We define the weight of a path to be the product of the weights of the edges along the path, i.e. the probability of taking that path. Then the sum of the weights of all length-$m$ paths from $i$ to $j$ is $(P^m)_{ij}$.
The term random walk usually refers to the random process in which a particle moves on a graph, selecting the next node uniformly at random from the adjacent nodes. It is the special case of a Markov chain in which, for every state $i$, all probabilities $P_{ij}$ with $j$ adjacent to $i$ are equal to $1/d(i)$ (where $d(i)$ is the degree of $i$), and $P_{ij} = 0$ otherwise.
A lazy random walk is similar, but at each step the particle stays put with probability $1/2$: $P_{ii} = 1/2$ and $P_{ij} = 1/(2d(i))$ for $j$ adjacent to $i$.
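As a concrete illustration of these definitions, here is a minimal Python sketch (the helper names and the example graph are my own, not from the notes) that simulates a finite Markov chain from its transition matrix and builds the transition matrix of the (lazy) random walk on an undirected graph given as an adjacency list.

```python
import numpy as np

def simulate_chain(P, x0, steps, rng=None):
    """Sample a trajectory X_0, ..., X_steps of the chain with transition matrix P."""
    rng = rng or np.random.default_rng()
    n = P.shape[0]
    states = [x0]
    for _ in range(steps):
        # the next state is drawn from the row of P indexed by the current state
        states.append(rng.choice(n, p=P[states[-1]]))
    return states

def random_walk_matrix(adj, lazy=False):
    """Transition matrix of the (lazy) random walk on an undirected graph."""
    n = len(adj)
    P = np.zeros((n, n))
    for u, neighbours in enumerate(adj):
        for v in neighbours:
            P[u, v] = 1.0 / len(neighbours)
    if lazy:
        P = 0.5 * np.eye(n) + 0.5 * P   # stay put with probability 1/2
    return P

if __name__ == "__main__":
    # a 4-cycle: 0 - 1 - 2 - 3 - 0
    adj = [[1, 3], [0, 2], [1, 3], [0, 2]]
    P = random_walk_matrix(adj)
    print(simulate_chain(P, x0=0, steps=10))
```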
Randomized 2-SAT
Given a 2-CNF formula $\varphi$, either find a satisfying assignment or determine that no such assignment exists.
CNF, a.k.a. conjunctive normal form: a conjunction of clauses, of the form $(\cdots) \wedge (\cdots) \wedge (\cdots)$, where each clause is a disjunction of literals.
A 2-CNF formula has the form $(\cdots) \wedge (\cdots) \wedge \cdots \wedge (\cdots)$, where each clause is a disjunction of exactly two literals (Boolean variables or their negations).
DNF, a.k.a. disjunctive normal form: a disjunction of terms, of the form $(\cdots) \vee (\cdots) \vee (\cdots)$, where each term is a conjunction of literals.
2-SAT is solvable in polynomial time (in fact linear time).
There is a polynomial time randomized algorithm for 2-CNF formulas whose analysis is based on a random walk.
Pseudo-code
Randomized 2-SAT (input: a 2-CNF formula $\varphi$ over $n$ variables)
- Pick a uniformly random assignment $\alpha$
- Repeat $2n^2$ times, or until a satisfying assignment is found:
  - Pick an unsatisfied clause $C$
  - Pick one of the two literals of $C$ uniformly at random, and flip its value in $\alpha$
- Return the current assignment if it satisfies $\varphi$, else return unsatisfiable
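For concreteness, here is a minimal Python sketch of this procedure (the clause representation and function name are my own choices, not from the notes): a formula is a list of clauses, each clause a pair of literals, where literal $+i$ means variable $i$ and $-i$ means its negation.

```python
import random

def randomized_2sat(clauses, n, tries=None):
    """Random-walk 2-SAT: returns a satisfying assignment (dict) or None.

    clauses: list of 2-tuples of non-zero ints; literal +i is variable i,
             literal -i is its negation (variables are numbered 1..n).
    """
    tries = tries if tries is not None else 2 * n * n
    # start from a uniformly random assignment
    alpha = {v: random.random() < 0.5 for v in range(1, n + 1)}

    def satisfied(lit):
        return alpha[abs(lit)] if lit > 0 else not alpha[abs(lit)]

    for _ in range(tries):
        unsat = [c for c in clauses if not (satisfied(c[0]) or satisfied(c[1]))]
        if not unsat:
            return alpha                       # satisfying assignment found
        clause = random.choice(unsat)          # pick an unsatisfied clause
        lit = random.choice(clause)            # pick one of its two literals
        alpha[abs(lit)] = not alpha[abs(lit)]  # flip that variable
    # final check after the last flip
    if all(satisfied(c[0]) or satisfied(c[1]) for c in clauses):
        return alpha
    return None                                # report "unsatisfiable"

if __name__ == "__main__":
    # (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
    print(randomized_2sat([(1, 2), (-1, 3), (-2, -3)], n=3))
```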
Analysis
If $\varphi$ is not satisfiable then the algorithm returns unsatisfiable. However, if $\varphi$ is satisfiable then the algorithm might not find a satisfying assignment. We analyze the probability that the algorithm gives the correct answer.
Suppose $\varphi$ is satisfiable, and let $\alpha^*$ be a particular assignment that satisfies $\varphi$.
We bound the probability that the algorithm finds $\alpha^*$. If $\varphi$ is not satisfied by the current assignment $\alpha$, there must be an unsatisfied clause $C$. Observe that one or both literals of $C$ must have different truth values under $\alpha$ and $\alpha^*$ (otherwise $\alpha^*$ would not satisfy $C$). Thus if we flip a random literal of $C$ then the probability that we increase the agreement between $\alpha$ and $\alpha^*$ is either $1/2$ or $1$.
Writing $\alpha_t$ for the assignment after $t$ iterations of the main loop and $Y_t$ for the number of variables in which $\alpha_t$ agrees with $\alpha^*$, we have
$$\Pr[Y_{t+1} = j + 1 \mid Y_t = j] \ge \tfrac{1}{2}, \qquad \Pr[Y_{t+1} = j - 1 \mid Y_t = j] \le \tfrac{1}{2} \qquad (1 \le j \le n - 1),$$
and $\Pr[Y_{t+1} = 1 \mid Y_t = 0] = 1$.
If we replace the inequalities with equalities (a pessimistic assumption), then $Y_t$ becomes a Markov chain on the state space $\{0, 1, \ldots, n\}$. The Markov chain of this process is:
flowchart LR
  0((0))
  1((1))
  2((2))
  3((...))
  4((n))
  0 -->|1| 1
  1 -->|1/2| 0
  1 -->|1/2| 2
  2 -->|1/2| 1
  2 -->|1/2| 3
  3 -->|1/2| 2
  3 -->|1/2| 4
  4 -->|1| 4
The chain describes the movement of a particle on a line: it is an example of a one-dimensional random walk with one reflecting and one absorbing barrier.
The expected time until $Y_t = n$ is an upper bound on the expected number of steps for the 2-SAT algorithm to find $\alpha^*$ (the algorithm may even stop earlier at some other satisfying assignment).
Let $h_j$ be the expected time for $Y_t$ to first reach $n$ starting at $j$. Then we have
$$h_n = 0, \qquad h_0 = 1 + h_1, \qquad h_j = 1 + \tfrac{1}{2} h_{j-1} + \tfrac{1}{2} h_{j+1} \quad (1 \le j \le n - 1).$$
This is a linear system of $n + 1$ equations in $n + 1$ unknowns.
By induction on $j$ one can show that $h_j = h_{j+1} + 2j + 1$. Combining this with $h_n = 0$, we get that $h_j = n^2 - j^2$. Thus $h_j \le h_0 = n^2$ for every $j$.
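As a quick sanity check of this closed form (my own addition, not part of the notes), one can solve the $(n+1) \times (n+1)$ linear system numerically and compare it with $n^2 - j^2$:

```python
import numpy as np

def hitting_times_2sat_chain(n):
    """Solve h_n = 0, h_0 = 1 + h_1, h_j = 1 + (h_{j-1} + h_{j+1}) / 2 numerically."""
    A = np.zeros((n + 1, n + 1))
    b = np.ones(n + 1)
    A[0, 0], A[0, 1] = 1.0, -1.0             # h_0 - h_1 = 1
    for j in range(1, n):
        A[j, j] = 1.0                        # h_j - h_{j-1}/2 - h_{j+1}/2 = 1
        A[j, j - 1] = A[j, j + 1] = -0.5
    A[n, n], b[n] = 1.0, 0.0                 # h_n = 0
    return np.linalg.solve(A, b)

n = 20
h = hitting_times_2sat_chain(n)
assert np.allclose(h, [n**2 - j**2 for j in range(n + 1)])
print(h[0])   # 400.0 = n^2
```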
Let $T$ denote the number of steps for the 2-SAT algorithm to find the assignment $\alpha^*$ starting from some arbitrary assignment. Then $\mathbb{E}[T] \le n^2$, and by Markov's inequality we have
$$\Pr[T > 2n^2] \le \frac{\mathbb{E}[T]}{2n^2} \le \frac{1}{2}.$$
Theorem: If $\varphi$ is satisfiable then the 2-SAT algorithm (run for $2n^2$ iterations) returns a satisfying assignment with probability at least $1/2$. If $\varphi$ is unsatisfiable then the 2-SAT algorithm is always correct.
The failure probability can be decreased to $2^{-m}$ by repeating the loop $2mn^2$ times instead of $2n^2$ times.
Randomized 3-SAT
We now extend the randomized algorithm for 2-SAT to 3-SAT.
3-SAT is NP-complete; the trivial approach is exhaustive search over all $2^n$ truth assignments.
Suppose a given 3-CNF formula $\varphi$ has a satisfying assignment $\alpha^*$. Let $\alpha_t$ be the assignment after $t$ steps and let $Y_t$ denote the number of propositional variables on which $\alpha_t$ agrees with $\alpha^*$. Similarly to the 2-SAT case,
$$\Pr[Y_{t+1} = j + 1 \mid Y_t = j] \ge \tfrac{1}{3}, \qquad \Pr[Y_{t+1} = j - 1 \mid Y_t = j] \le \tfrac{2}{3} \qquad (1 \le j \le n - 1).$$
Replacing the inequalities with equalities, we turn the process into the following (pessimistic) Markov chain:
flowchart LR
  0((0))
  1((1))
  2((2))
  3((...))
  4((n))
  0 -->|1| 1
  1 -->|2/3| 0
  1 -->|1/3| 2
  2 -->|2/3| 1
  2 -->|1/3| 3
  3 -->|2/3| 2
  3 -->|1/3| 4
  4 -->|1| 4
The expected time for the resulting chain to reach state $n$ is an upper bound for the expected time of the algorithm to find a satisfying valuation.
In this case the walk is twice as likely to move away from $n$ as to move towards $n$.
Let $h_j$ denote the expected number of steps to reach $n$ when starting from state $j$. Then we have the following system of equations:
$$h_n = 0, \qquad h_0 = 1 + h_1, \qquad h_j = 1 + \tfrac{2}{3} h_{j-1} + \tfrac{1}{3} h_{j+1} \quad (1 \le j \le n - 1).$$
By solving this linear system, we get $h_j = 2^{n+2} - 2^{j+2} - 3(n - j)$ for all $j$.
With $j = 0$ this leads to $h_0 = 2^{n+2} - 4 - 3n = \Theta(2^n)$.
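Again as a sanity check (my own addition), solving this system numerically confirms the closed form:

```python
import numpy as np

def hitting_times_3sat_chain(n):
    """Solve h_n = 0, h_0 = 1 + h_1, h_j = 1 + (2/3) h_{j-1} + (1/3) h_{j+1}."""
    A = np.zeros((n + 1, n + 1))
    b = np.ones(n + 1)
    A[0, 0], A[0, 1] = 1.0, -1.0
    for j in range(1, n):
        A[j, j] = 1.0
        A[j, j - 1] = -2.0 / 3.0
        A[j, j + 1] = -1.0 / 3.0
    A[n, n], b[n] = 1.0, 0.0
    return np.linalg.solve(A, b)

n = 15
h = hitting_times_3sat_chain(n)
assert np.allclose(h, [2**(n + 2) - 2**(j + 2) - 3 * (n - j) for j in range(n + 1)])
print(h[0])   # 2^(n+2) - 4 - 3n
```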
The above bound shows that the expected number of flips to find $\alpha^*$ is at most $O(2^n)$. This bound is useless, since it is no better than exhaustive search. However, we can improve the performance of this procedure by incorporating random restarts into the search. A uniformly random truth assignment gives a binomial distribution $\mathrm{Bin}(n, \tfrac{1}{2})$ over the states of the Markov chain, centered around the state $n/2$.
Since the Markov chain has a tendency to move towards state $0$, after running it for a little while we are better off restarting with a fresh random assignment than continuing to flip literals.
After a random restart, by Chernoff bounds, the current assignment differs from $\alpha^*$ in approximately $n/2$ positions with high probability.
The algorithm therefore tries to find a satisfying assignment in phases, where each phase starts with a random restart and consists of up to $3n$ literal flips.
Pseudo-code
Randomized 3-SAT (input: a 3-CNF formula $\varphi$ over $n$ variables)
- Repeat $O\!\big(\sqrt{n}\,(4/3)^{n}\big)$ times (phases; the constant comes from the analysis below), or until a satisfying assignment is found:
  - Pick a uniformly random assignment $\alpha$
  - Repeat $3n$ times, or until a satisfying assignment is found:
    - Pick an unsatisfied clause $C$
    - Pick one of the three literals of $C$ uniformly at random, and flip its value in $\alpha$
- Return the current assignment if it satisfies $\varphi$, else return unsatisfiable
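A minimal Python sketch of this phase-based procedure, essentially Schöning's algorithm (the clause representation and function name are my own; the number of phases is passed in by the caller rather than hard-coded to the bound from the analysis):

```python
import random

def randomized_3sat(clauses, n, phases):
    """Random walk for 3-SAT with random restarts.

    clauses: list of 3-tuples of non-zero ints; literal +i / -i as before.
    Returns a satisfying assignment (dict) or None after `phases` restarts.
    """
    def satisfied(alpha, lit):
        return alpha[abs(lit)] if lit > 0 else not alpha[abs(lit)]

    for _ in range(phases):
        # random restart: fresh uniformly random assignment
        alpha = {v: random.random() < 0.5 for v in range(1, n + 1)}
        for _ in range(3 * n):
            unsat = [c for c in clauses
                     if not any(satisfied(alpha, lit) for lit in c)]
            if not unsat:
                return alpha
            lit = random.choice(random.choice(unsat))  # random literal of a random unsatisfied clause
            alpha[abs(lit)] = not alpha[abs(lit)]      # flip it
        # final check after the last flip of the phase
        if all(any(satisfied(alpha, lit) for lit in c) for c in clauses):
            return alpha
    return None

if __name__ == "__main__":
    # (x1 or x2 or x3) and (not x1 or not x2 or x3) and (x1 or not x2 or not x3)
    print(randomized_3sat([(1, 2, 3), (-1, -2, 3), (1, -2, -3)], n=3, phases=100))
```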
Theorem: If $\varphi$ is satisfiable then the 3-SAT algorithm returns a satisfying assignment with probability at least $1/2$. If $\varphi$ is unsatisfiable then the 3-SAT algorithm is always correct.
Let $q_j$ be the probability to reach position $n$ within $3j$ steps, starting from position $n - j$. We have
$$q_j \ge \binom{3j}{j}\left(\frac{2}{3}\right)^{j}\left(\frac{1}{3}\right)^{2j},$$
since this is the probability to make exactly $2j$ right moves and $j$ left moves in the first $3j$ steps.
By Stirling's formula
$$m! = \sqrt{2\pi m}\left(\frac{m}{e}\right)^{m}\bigl(1 + O(1/m)\bigr),$$
we have, for some constant $c > 0$,
$$\binom{3j}{j} = \frac{(3j)!}{j!\,(2j)!} \ge \frac{c}{\sqrt{j}}\left(\frac{27}{4}\right)^{j}.$$
Thus when $j > 0$,
$$q_j \ge \frac{c}{\sqrt{j}}\left(\frac{27}{4}\right)^{j}\left(\frac{2}{3}\right)^{j}\left(\frac{1}{3}\right)^{2j} = \frac{c}{\sqrt{j}}\cdot\frac{1}{2^{j}},$$
and $q_0 = 1$.
The probability of success in a phase drops exponentially with the distance between the initial random assignment of the phase and the satisfying truth assignment $\alpha^*$. But by Chernoff bounds, the probability that this distance is far away from $n/2$ also drops exponentially.
Having established a lower bound for $q_j$, we now obtain a lower bound for $q$, the probability to reach state $n$ within $3n$ steps in case the initial state of the random walk is determined by a random assignment. We have
$$q \;\ge\; \sum_{j=0}^{n}\binom{n}{j}\frac{1}{2^{n}}\,q_j \;\ge\; \frac{c}{\sqrt{n}}\sum_{j=0}^{n}\binom{n}{j}\frac{1}{2^{n}}\cdot\frac{1}{2^{j}} \;=\; \frac{c}{\sqrt{n}}\cdot\frac{1}{2^{n}}\left(1 + \frac{1}{2}\right)^{n} \;=\; \frac{c}{\sqrt{n}}\left(\frac{3}{4}\right)^{n}.$$
The number of phases to find a satisfying assignment is a geometric random variable with parameter $q$, so the expected number of phases to find a satisfying assignment is $1/q \le \frac{\sqrt{n}}{c}\left(\frac{4}{3}\right)^{n}$. Thus, by Markov's inequality, the algorithm finds a satisfying assignment with probability at least $1/2$ after $\frac{2\sqrt{n}}{c}\left(\frac{4}{3}\right)^{n}$ phases.
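A small simulation of the pessimistic chain (my own illustration, not from the notes) can be used to compare the empirical per-phase success probability with the $\frac{c}{\sqrt{n}}(3/4)^n$ lower bound for small $n$:

```python
import math
import random

def phase_succeeds(n, rng):
    """One phase of the pessimistic walk: start at Binomial(n, 1/2), do 3n steps."""
    y = sum(rng.random() < 0.5 for _ in range(n))   # agreements of a random assignment
    for _ in range(3 * n):
        if y == n:
            return True
        y = y + 1 if (y == 0 or rng.random() < 1 / 3) else y - 1
    return y == n

def estimate_q(n, trials=200_000, seed=0):
    rng = random.Random(seed)
    return sum(phase_succeeds(n, rng) for _ in range(trials)) / trials

n = 10
print(estimate_q(n))                # empirical per-phase success probability
print((3 / 4) ** n / math.sqrt(n))  # the (3/4)^n / sqrt(n) scale of the lower bound
```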
Stationary Distributions
Definition
Let $(X_t)_{t \ge 0}$ be a Markov chain with transition matrix $P$. A probability distribution $\pi$ on the set of states is called a stationary distribution of the chain if
$$\pi = \pi P, \qquad \text{i.e., } \pi_j = \sum_i \pi_i P_{ij} \text{ for every state } j.$$
Given states $i, j$ of a Markov chain, the hitting time $H_{ij}$ is a random variable giving the number of steps to visit state $j$ starting from state $i$. Formally we have
$$H_{ij} = \min\{t \ge 1 : X_t = j\} \quad \text{given that } X_0 = i.$$
We define $h_{ij} = \mathbb{E}[H_{ij}]$ to be the mean hitting time.
In general $h_{ij}$ may be infinite. Certainly $h_{ij} = \infty$ if $j$ is not reachable from $i$ in the underlying graph, or if $i$ can reach a state from which $j$ is no longer reachable. We therefore restrict attention to Markov chains whose underlying graph consists of a single strongly connected component. We call such chains irreducible.
We can still have $h_{ij} = \infty$ in an irreducible chain if it has infinitely many states.
Consider the Markov chain with set of states $\{1, 2, 3, \ldots\}$ and transition matrix $P$, where $P_{j, j+1} = \frac{j}{j+1}$ and $P_{j, 1} = \frac{1}{j+1}$. Starting at state $1$, the probability not to have returned to state $1$ within $t$ steps is
$$\prod_{j=1}^{t}\frac{j}{j+1} = \frac{1}{t+1}.$$
Thus we return to state $1$ almost surely, but the expected return time is
$$h_{11} = \sum_{t \ge 0}\Pr[H_{11} > t] = \sum_{t \ge 0}\frac{1}{t+1} = \infty.$$
In a finite irreducible Markov chain, however, the mean hitting time between any pair of states is always finite.
Theorem: For any pair of states $i, j$ in a finite irreducible Markov chain, $h_{ij} < \infty$.
Proof idea: Let $d$ be the diameter of the underlying graph and let $p$ be the smallest positive entry of the transition matrix $P$. Then for any pair of states $i, j$, if the chain is started in $i$ then it visits $j$ within $d$ steps with probability at least $p^{d}$. Hence the hitting time is stochastically dominated by $d$ times a geometric random variable with parameter $p^{d}$, so $h_{ij} \le d / p^{d} < \infty$.
A finite irreducible Markov chain has a unique stationary distribution.
Theorem: A finite irreducible Markov chain has a unique stationary distribution $\pi$, where $\pi_i = \frac{1}{h_{ii}}$ for each state $i$.
The stationary distribution represents a time average: on average the chain visits state $i$ every $h_{ii}$ steps, and thus spends a proportion $\pi_i = 1/h_{ii}$ of its time in state $i$. We might expect that the stationary distribution is also a space average, i.e., that the distribution of $X_t$ converges to $\pi$ as $t$ tends to infinity. However, in general this need not be true: for example, a chain that deterministically alternates between two states has stationary distribution $(\tfrac{1}{2}, \tfrac{1}{2})$, but the distribution of $X_t$ never converges.
Definition (period): The period of a state $i$ in a Markov chain is defined to be $\gcd\{t \ge 1 : (P^{t})_{ii} > 0\}$. A state is aperiodic if it has period $1$. A Markov chain is aperiodic if all its states are aperiodic.
A finite Markov chain is irreducible and aperiodic iff there exists some power $P^{t}$ of the transition matrix with all entries strictly positive.
Theorem: Consider a finite, irreducible, aperiodic Markov chain with transition matrix $P$ and stationary distribution $\pi$. Then $\lim_{t \to \infty} P^{t}$ is the matrix
$$\Pi = \begin{pmatrix} \pi \\ \pi \\ \vdots \\ \pi \end{pmatrix},$$
each of whose rows equals $\pi$.
It follows that for any initial distribution $\mu$ we have $\mu P^{t} \to \pi$ as $t \to \infty$, i.e., the chain converges to the stationary distribution from any starting point.
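The following sketch (my own, with an arbitrary example matrix) computes the stationary distribution of a finite chain by solving $\pi P = \pi$ and checks numerically that the rows of $P^t$ approach $\pi$:

```python
import numpy as np

def stationary_distribution(P):
    """Solve pi P = pi, sum(pi) = 1 for an irreducible transition matrix P."""
    n = P.shape[0]
    # (P^T - I) pi^T = 0 together with the normalization constraint sum(pi) = 1
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# an irreducible, aperiodic 3-state chain (arbitrary example)
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.4, 0.4]])
pi = stationary_distribution(P)
print(pi)
print(np.linalg.matrix_power(P, 50))   # every row is (approximately) pi
```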
Random walks on undirected graphs
Let $G = (V, E)$ be a finite, undirected and connected graph. The degree $d(v)$ of a vertex $v$ is the number of edges incident to $v$. A random walk on $G$ is a Markov chain modeling a particle that moves along the edges of $G$ and which is equally likely to take any incident edge at a given vertex.
Formally, a random walk on $G$ has set of states $V$, and for each edge $\{u, v\} \in E$ the probability to transition from $u$ to $v$ is $P_{uv} = \frac{1}{d(u)}$.
Claim: A random walk on $G$ is aperiodic iff $G$ is not bipartite.
Theorem: A random walk on a connected graph $G = (V, E)$ has a unique stationary distribution $\pi$, where
$$\pi_v = \frac{d(v)}{2|E|}.$$
(The sum of the degrees of all vertices equals $2|E|$, so $\pi$ is indeed a probability distribution.)
Corollary: Let $h_{uv}$ denote the expected number of steps to go from vertex $u$ to vertex $v$. Then for any vertex $v$ we have
$$h_{vv} = \frac{1}{\pi_v} = \frac{2|E|}{d(v)}.$$
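A short numeric check of $\pi_v = d(v)/2|E|$ on a small graph (my own example, not from the notes):

```python
import numpy as np

# a small connected graph: a triangle 0-1-2 plus a pendant vertex 3 attached to 0
edges = [(0, 1), (1, 2), (0, 2), (0, 3)]
n = 4
deg = np.zeros(n)
P = np.zeros((n, n))
for u, v in edges:
    deg[u] += 1
    deg[v] += 1
for u, v in edges:
    P[u, v] = 1.0
    P[v, u] = 1.0
P = P / deg[:, None]                    # random-walk transition matrix

pi = deg / (2 * len(edges))             # claimed stationary distribution d(v) / 2|E|
assert np.allclose(pi @ P, pi)          # pi is indeed stationary
print(1 / pi)                           # h_vv = 2|E| / d(v) for each vertex
```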
Theorem: If $u$ and $v$ are adjacent then $h_{uv} < 2|E|$.
Proof omitted.
Theorem: The cover time of a connected graph with $n$ nodes and $m$ edges (the expected number of steps until a walk started at a worst-case vertex has visited every vertex) is at most $4nm$.
Proof omitted.
s-t connectivity
We use the bound on the cover time of random walks to design an impressively simple space-efficient algorithm for the fundamental problem of s-t connectivity. In this problem we are given an undirected graph $G$ with $n$ nodes, not necessarily connected. We are also given two nodes $s$ and $t$, and we want to find out whether $s$ and $t$ are connected.
Pseudo-code
s-t connectivity
- Start a random walk from $s$
- If $t$ is reached within $4n^3$ steps then return reachable, else return unreachable
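A minimal Python sketch of this walk-based test (adjacency-list representation and helper name are my own; the step budget follows the pseudo-code above):

```python
import random

def st_connected(adj, s, t, seed=None):
    """Randomized s-t connectivity: walk from s for 4 n^3 steps, looking for t.

    adj: adjacency list of an undirected graph (list of lists of neighbours).
    One-sided error: 'reachable' is always correct; 'unreachable' may be wrong
    with probability at most 1/2 when s and t are in fact connected.
    """
    rng = random.Random(seed)
    n = len(adj)
    v = s
    for _ in range(4 * n ** 3):
        if v == t:
            return "reachable"
        if not adj[v]:                  # isolated vertex: the walk is stuck
            break
        v = rng.choice(adj[v])          # move to a uniformly random neighbour
    return "reachable" if v == t else "unreachable"

if __name__ == "__main__":
    # two components: {0,1,2} forming a path and {3,4} forming an edge
    adj = [[1], [0, 2], [1], [4], [3]]
    print(st_connected(adj, 0, 2))      # reachable
    print(st_connected(adj, 0, 3))      # unreachable
```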
The algorithm produces no false positives: it returns reachable only if the walk actually reaches $t$.
If $s$ and $t$ are connected, the expected number of steps for the walk to reach $t$ is at most the cover time of their common component, which is at most $4nm \le 2n^3$. By Markov's inequality, the algorithm reaches $t$ within $4n^3$ steps, and hence returns reachable, with probability at least $1/2$.
The algorithm only needs to store a constant number of vertex addresses (and the step counter) at any one time, each requiring $O(\log n)$ bits.
It follows that we can decide reachability in undirected graphs in $O(\log n)$ space using randomization.