Dynamic Programming
An optimal solution to a problem is built up from optimal solutions to its subproblems.
Weakly \(NP\)-hard problems sometimes have dynamic programming algorithms which run in pseudo-polynomial time.
Strongly \(NP\)-hard problem: A problem is strongly \(NP\)-hard if it is \(NP\)-hard even when its numeric data is encoded in unary.
Pseudo-polynomial algorithms: An algorithm is pseudo-polynomial if its running time is polynomial in the size of the input when the numeric part is encoded in unary.
- A problem is weakly \(NP\)-hard if it is \(NP\)-hard but admits a pseudo-polynomial algorithm.
- For instance, a dynamic programming algorithm for knapsack runs in \(O(n\cdot B)\) time; this is polynomial if \(B\) is encoded in unary, but \(B\) can be represented with only \(\log_2 B\) bits, so the running time is exponential in the binary encoding size.
Two techniques that use dynamic programming:
- We can devise approximation algorithms by rounding the values of the instance so that there is only a polynomial number of distinct values.
- We can split the input into "large" and "small" parts.
The Knapsack Problem
Dynamic Programming
For each \(j\), we maintain a list \(A(j)\) of pairs \((t,w)\). If \((t,w)\in A(j)\), then there exists \(S\subseteq\{1,\cdots,j\}\) s.t.
- \(\sum_{i\in S}s_i=t\)
- \(\sum_{i\in S}v_i=w\)
A pair \((t,w)\) dominates \((t',w')\) if \(t\leq t'\land w\geq w'\).
We ensure that in \(A(j)\) no pair dominates another one: \(A(j)=\{(t_i,w_i)|i=1,\cdots,k\}\)
- \(t_1<t_2<\cdots<t_k\)
- \(w_1<w_2<\cdots<w_k\)
The sizes and the values are integers. If \(V=\sum_{i=1}^nv_i\), then each list has at most \(\min\{B+1,V+1\}\) pairs.
Pseudocode
\(A(1)=\{(0,0),(s_1,v_1)\}\)
for \(j=2\to n\) do
- \(A(j)=A(j-1)\)
- for each \((t,w)\in A(j-1)\) do
- if \(t+s_j\leq B\) then Add \((t+s_j,w+v_j)\) to \(A(j)\)
- Remove dominated pairs from \(A(j)\)
return \(\max_{(t,w)\in A(n)}w\)
To remove dominated pairs:
- sort \(A(j)\) w.r.t. first component
- Retain the best value for each space total
- Remove any larger space total that does not have a larger value
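The pseudocode above, including the dominated-pair removal, can be sketched in Python as follows (function name and interface are illustrative, not from the source):

```python
def knapsack_dp(sizes, values, B):
    """Exact knapsack DP over lists of non-dominated (size, value) pairs."""
    A = [(0, 0)]  # A(0): only the empty subset
    for s, v in zip(sizes, values):
        # Extend every existing pair by the current item if it still fits.
        candidates = A + [(t + s, w + v) for (t, w) in A if t + s <= B]
        # Remove dominated pairs: sort by size (ties: larger value first)
        # and keep only pairs whose value strictly increases.
        candidates.sort(key=lambda p: (p[0], -p[1]))
        A = []
        for t, w in candidates:
            if not A or w > A[-1][1]:
                A.append((t, w))
    return max(w for _, w in A)
```

After the sort, a pair is dominated exactly when some earlier pair already achieves at least its value, so one linear pass suffices.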
Theorem: The DP correctly computes the optimal value of the knapsack problem.
Running Time
\(O(n\cdot\min\{B,V\})\), \(V=\sum_{i=1}^nv_i\). It is pseudo-polynomial.
We round the values of \(v_i\) in order to have \(V=O(poly(n))\).
Complexity Classes
Polynomial Time Approximation Scheme (PTAS): A PTAS is a family of algorithms \(\mathcal A\), containing an algorithm \(A\) for each \(\epsilon>0\), s.t. \(A\) runs in time polynomial in the input size and is a \((1+\epsilon)\)-approximation algorithm (for a min problem) or a \((1-\epsilon)\)-approximation algorithm (for a max problem).
Fully Polynomial Time Approximation Scheme (FPTAS): an FPTAS is a PTAS s.t. the running time of each algorithm in \(\mathcal A\) is bounded by a polynomial in both the input size and \(1/\epsilon\).
FPTAS for Knapsack
Consider a value \(\mu\) and convert each \(v_i\) by rounding down to the nearest integer multiple of \(\mu\): $$ v_i'=\lfloor\frac{v_i}{\mu}\rfloor. $$ We can use the dynamic programming algorithm on the items with sizes \(s_i\) and values \(v_i'\).
We choose \(\mu\) s.t.
- The optimal solution for the rounded data is a near-optimal solution for the true data.
- The sum of \(v_i'\) is polynomial in \(n\).
If we use \(\tilde v_i=v_i'\mu\), then the error on each \(v_i\) is at most \(\mu\).
Therefore, the value of any solution changes by at most \(n\mu\).
We want this error to be at most an \(\epsilon\) fraction of the optimum.
\(M=\max_{i\in I}v_i\) is a lower bound on the optimum (assuming each item fits, i.e., \(s_i\leq B\) for all \(i\)).
We choose \(\mu\) s.t. \(n\mu=\epsilon M\) i.e. \(\mu=\epsilon M/n\).
Moreover, with the modified values, $$ V'=\sum_{i=1}^n v_i'=\sum_{i=1}^n\left\lfloor\frac{v_in}{\epsilon M}\right\rfloor\leq n\cdot\frac{n}{\epsilon}=O(n^2/\epsilon) $$
Pseudocode
\(M=\max_{i\in I}v_i\)
\(\mu=\epsilon M/n\)
\(v_i'=\lfloor v_i/\mu\rfloor\) for all \(i\in I\)
Run the DP on instance with \(v_i'\).
Running time \(O(n^3/\epsilon)\).
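A Python sketch of this FPTAS, assuming positive values (a value-indexed DP — minimum size to reach each rounded value — is used here in place of the pair-list formulation; names are illustrative):

```python
def knapsack_fptas(sizes, values, B, eps):
    """FPTAS sketch: round values down to multiples of mu = eps*M/n,
    then run an O(n * V') DP.  Returns (true value of the chosen set,
    chosen indices into the filtered item list)."""
    items = [(s, v) for s, v in zip(sizes, values) if s <= B]
    if not items:
        return 0, []
    n = len(items)
    M = max(v for _, v in items)
    mu = eps * M / n
    rounded = [int(v // mu) for _, v in items]
    Vtot = sum(rounded)
    INF = float("inf")
    # dp[i][w]: minimum total size using the first i items to reach
    # rounded value exactly w.
    dp = [[INF] * (Vtot + 1) for _ in range(n + 1)]
    dp[0][0] = 0
    for i in range(1, n + 1):
        s, vr = items[i - 1][0], rounded[i - 1]
        for w in range(Vtot + 1):
            dp[i][w] = dp[i - 1][w]
            if w >= vr and dp[i - 1][w - vr] + s < dp[i][w]:
                dp[i][w] = dp[i - 1][w - vr] + s
    best_w = max(w for w in range(Vtot + 1) if dp[n][w] <= B)
    # Backtrack to recover the chosen items.
    chosen, w = [], best_w
    for i in range(n, 0, -1):
        if dp[i][w] != dp[i - 1][w]:
            chosen.append(i - 1)
            w -= rounded[i - 1]
    return sum(items[i][1] for i in chosen), chosen
```

The returned true value is at least \((1-\epsilon)\) times the optimum, by the rounding argument above.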
Theorem: this algorithm is an \(FPTAS\) for the knapsack problem.
Parallel Job Scheduling
Minimum makespan scheduling
PTAS
We use arguments similar to those for the longest processing time rule to obtain a \(PTAS\). We partition the jobs into two sets: short jobs and long jobs.
We optimally assign the long jobs by enumeration and assign the short jobs by using list scheduling.
- If the last job to finish is long, then the schedule is optimal.
- If the last job to finish is short, then the list scheduling analysis applies and gives a schedule of length at most the optimum plus the processing time of a short job.
We show that there exists a trade-off between the number of long jobs and the solution found.
Given a fixed integer \(k\) (with \(\epsilon=1/k\)), a job \(i\) is
- short if \(p_i\leq\frac{1}{km}\sum_{j=1}^np_j\)
- long if \(p_i>\frac{1}{km}\sum_{j=1}^np_j\)
There are at most \(km\) long jobs.
PTAS for minimum makespan scheduling on identical parallel machines.
- Enumerate all the possible schedule for the long jobs and choose the one with minimum makespan
- Extend this schedule by using list scheduling for the short jobs
There are \(m\) machines and at most \(km\) long jobs, so we enumerate at most \(m^{km}\) possible schedules. This is polynomial if \(m\) is constant.
Approximation
For the last job \(\ell\) $$ C_\ell=C_\max\leq p_\ell+\frac{1}{m}\sum_{j\neq\ell}p_j. $$ Two cases:
- \(\ell\) is long: the schedule is optimal. (the short jobs fill the "empty spaces")
- \(\ell\) is short: \(p_\ell\leq\frac{1}{km}\sum_{j=1}^np_j\leq\frac{OPT}{k}\), hence \(C_\max\leq\frac{OPT}{k}+OPT=(1+1/k)\cdot OPT\), using \(OPT\geq\frac{1}{m}\sum_{j=1}^np_j\).
Theorem: The family \(\{A_k\}\) of the above algorithms is a PTAS for the minimum makespan scheduling on a constant number of identical parallel machines.
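The algorithm \(A_k\) can be sketched in Python as follows (illustrative names; the enumeration is exponential in \(km\), so this is practical only for constant \(m\)):

```python
import heapq
from itertools import product

def makespan_ptas(p, m, k):
    """A_k sketch: jobs above sum(p)/(k*m) are long; enumerate all
    assignments of long jobs to machines, pick the one with minimum
    makespan, then extend it with list scheduling for the short jobs."""
    total = sum(p)
    threshold = total / (k * m)
    long_jobs = [x for x in p if x > threshold]
    short_jobs = [x for x in p if x <= threshold]
    # Optimal long-job schedule by brute force: at most m^(k*m) cases.
    best_loads, best_mk = [0.0] * m, float("inf")
    for assign in product(range(m), repeat=len(long_jobs)):
        loads = [0.0] * m
        for job, mach in zip(long_jobs, assign):
            loads[mach] += job
        if max(loads) < best_mk:
            best_mk, best_loads = max(loads), loads
    # List scheduling: give each short job to the least-loaded machine.
    heapq.heapify(best_loads)
    for job in short_jobs:
        heapq.heapreplace(best_loads, best_loads[0] + job)
    return max(best_loads)
```

When there are no long jobs the enumeration degenerates to a single empty assignment and the routine reduces to plain list scheduling.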
Extend to More Machines
We don't need the schedule for the long jobs to be optimal.
- We only used the optimality when the last job to finish was a long job.
- It is sufficient to have a schedule for the long jobs that has makespan at most \(1+1/k\) times the optimal value
We can round the processing time of long jobs and then use dynamic programming as in knapsack problem.
Fix a target makespan \(T\) and find a family of algorithms \(B_k\) that
- either finds a schedule of length at most \((1+1/k)T\),
- or proves that a schedule of length \(T\) does not exist.
We assume that \(T\geq\frac{1}{m}\sum_{j=1}^np_j\), since otherwise no feasible schedule exists.
Algorithm
Partition: A job is short if \(p_i<T/k\), long if \(p_i\geq T/k\).
Round: For each long job, let \(p_i'=\lfloor p_i/(T/k^2)\rfloor\cdot(T/k^2)\). (The number of distinct rounded values is limited.)
DP: Determine whether the rounded long jobs can be scheduled on \(m\) machines within time \(T\).
- If no such schedule exists, then no schedule of length \(T\) exists for the original input.
- If yes, convert it to a schedule for original long jobs (each job's true time is at most \(T/k^2\) more than its rounded time).
Extend: Assign short jobs greedily (list scheduling).
Analysis
- Long jobs: On a single machine there are at most \(k\) long jobs (since each rounded long job has processing time at least \(T/k\)). The total true processing time on that machine is at most \(T+k\cdot T/k^2=(1+1/k)T\).
- Short jobs: If the last job \(\ell\) is short, then \(p_\ell<T/k\), and by list scheduling: \(C_\max\leq p_\ell+\frac{1}{m}\sum_{j\neq\ell}p_j\leq T/k+T=(1+1/k)T\).
Thus, if a schedule of length \(T\) exists, \(B_k\) finds one of length \((1+1/k)T\).
DP for rounded long jobs: Since each long job has rounded processing time a multiple of \(T/k^2\), the number of distinct types is \(O(k^2)\). The DP can be done in time \(n^{O(k^2)}\), which is polynomial in \(n\) but exponential in \(k^2\).
From \(B_k\) to a PTAS (Binary Search)
We want to find a schedule with makespan at most \((1+\epsilon)OPT\). Let \(k=\lceil1/\epsilon\rceil\).
We perform a binary search on \(T\) over an interval \([L,U]\) that is guaranteed to contain \(OPT\), initialized as:
- \(L=L_0=\max\{\frac{1}{m}\sum_jp_j,\max_jp_j\}\) (a lower bound on \(OPT\)).
- \(U=U_0=\frac{1}{m}\sum_jp_j+\max_jp_j\) (an upper bound, e.g., from list scheduling).
Search:
While \(L<U\):
- Set \(T=\lfloor(L+U)/2\rfloor\).
- Run \(B_k\) with this \(T\).
- If \(B_k\) produces a schedule, set \(U=T\) (we have a schedule of length \(\leq (1+\epsilon)T\)).
- Else, set \(L=T+1\) (no schedule of length \(T\) exists, so \(OPT>T\)).
Termination: When \(L=U\), we have a schedule of length at most \((1+\epsilon)L\leq(1+\epsilon)OPT\).
Running time: Each \(B_k\) call takes \(n^{O(k^2)}=n^{O(1/\epsilon^2)}\) time, and we make \(O(\log(U_0-L_0))\) calls (polynomial in the input size). The overall complexity is exponential in \(1/\epsilon^2\), so it is a \(PTAS\) but not an \(FPTAS\).
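The binary-search wrapper can be sketched as follows, with the \(B_k\) routine passed in as a parameter (`run_Bk` is a hypothetical stand-in: it is assumed to return a schedule of length at most \((1+1/k)T\) when one of length \(T\) exists, and `None` otherwise; integer bounds assume integral processing times):

```python
import math

def makespan_binary_search(p, m, run_Bk):
    """Binary search for the smallest T that B_k accepts."""
    avg = sum(p) / m
    L = math.ceil(max(avg, max(p)))   # lower bound L_0
    U = math.ceil(avg + max(p))       # upper bound U_0 (list scheduling)
    best = run_Bk(U)                  # guaranteed to succeed at U_0
    while L < U:
        T = (L + U) // 2
        sched = run_Bk(T)
        if sched is not None:
            best, U = sched, T        # schedule of length <= (1+1/k)T
        else:
            L = T + 1                 # no schedule of length T: OPT > T
    return best
```

The invariant is that \(OPT\) always lies in \([L,U]\) and `best` holds a schedule found at some accepted \(T\leq U\).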
Strong NP-Hardness and Impossibility of FPTAS
Theorem: If a strongly \(NP\)-hard optimization problem has an integer-valued objective and the optimum is bounded by a polynomial in the unary input size, then the problem does not admit an \(FPTAS\) unless \(P=NP\).
Proof sketch: Suppose an \(FPTAS\) exists with running time polynomial in \(1/\epsilon\). Set \(\epsilon=1/p(|I_u|)\), where \(p\) is the polynomial bound on \(OPT\) and \(|I_u|\) is the unary input size. Then the \(FPTAS\) returns a solution with value \(<OPT+1\) (since \(\epsilon\cdot OPT<1\)), hence an optimal solution. The running time becomes polynomial in the unary input size, contradicting strong \(NP\)-hardness (unless \(P=NP\)).
Corollary: Minimum makespan scheduling is strongly \(NP\)-hard, so it cannot have an \(FPTAS\) (unless \(P=NP\)).
Approximation Classes
FPTAS (Fully Polynomial Time Approximation Scheme): approximation ratio \(1\pm\epsilon\) in time polynomial in \(n\) and \(1/\epsilon\). Example: Knapsack.
PTAS (Polynomial Time Approximation Scheme): ratio \(1\pm\epsilon\) in time polynomial in \(n\) (may be exponential in \(1/\epsilon\)). Example: Minimum makespan scheduling.
APX: constant factor approximation. Example: \(k\)-center.
log-APX: approximation ratio \(O(\log n)\). Example: Set Cover.
Poly-APX: approximation ratio \(O(n^\alpha)\) for some constant \(\alpha>0\). Example: Coloring.
exp-APX: approximation ratio exponential in \(n\). Example: TSP.
Approximation Classes Hierarchy
The hierarchy collapses if \(P=NP\).