We dedicated to solve the open problem left by Greedy Capture: Is there an algorithm that has better PROP guarantee 2ρ1+22\le\rho\le1+\sqrt2?

My Initial Immature Idea

Do some fine-tuning on results of Greedy Capture.

Pseudo code

  1. XX\gets Greedy Capture(N,M,k)(\mathcal N,\mathcal M,k)
  2. NN\gets\empty // NN denotes “free” data points
  3. while \exist 2-approximate Block Coalition
    • NbcN_{bc}\gets all data points in that 2-approx Block Coalition
    • XbcX_{bc}\gets all cluster centers associated with any data points in NbcN_{bc}
    • NaffectN_{\text{affect}}\gets all data points associated with any cluster centers in XbcX_{bc}
    • NN(NaffectNbc)N\gets N\cup(N_{\text{affect}}\diagdown N_{bc})
    • xx^*\gets cluster center of 2-approximate Block Coalition
    • XXXbcX\gets X\diagdown X_{bc}
    • XX{x}X\gets X\cup\{x^*\}
  4. return XX

Idea is straightforward, directly eliminate 2-approx Block Coalition.

But how to continue, we need to rerun the Greedy Capture at some point.

Improve the Idea

Instead of using Greedy Capture as black box, we do some modification in it.

“Fine-tuning” Greedy Capture

In the while loop of Greedy Capture, we audit whether there exist 2-approx Block Coalition.

If there is one, we open the corresponding cluster center xx^*, and fix all n/k\lceil n/k\rceil point in the Block Coalition.

Set the fixed radius rbcr_{bc}, denoting the distance from cluster center of BC to the n/k\lceil n/k\rceil-th nearest data point.

Open that cluster center.

Then run the Greedy Capture from the scratch neglecting all fixed point. We don’t synchronously increase rbcr_{bc} until δ\delta reaches rbcr_{bc}.

Also auditing the 2-approximate in that Greedy Capture, if there are block coalition related to any point in cluster xx^*, we do the same recursively.

Pseudo Code

Fine-tuning Greedy Capture(N,M,k,rbc,x)(\mathcal N,\mathcal M,k,r_{bc},x^*)

Input: N,M,k,rbc,x\mathcal N,\mathcal M,k,r_{bc},x^* // rbcr_{bc} denotes the fixed radius for 2-block coalition, xx^* denotes the center of 2-block coalition

  1. X,Δ=(δ1,δ2,,δm)(0,0,,0),NNX\gets\empty,\Delta=(\delta_1,\delta_2,\cdots,\delta_m)\gets(0,0,\cdots,0),N\gets\mathcal N

  2. XX{x},δxrbcX\gets X\cup\{x^*\},\delta_{x^*}\gets r_{bc}

  3. while NN\neq\empty do

    • Smoothly and synchronously increase Δδx\Delta\diagdown\delta_{x^*}

    • If δ\delta for other surpass δx\delta_{x^*}, increase δx\delta_{x^*} also.

    • while xX\exist x\in X s.t. B(x,δx)N1|B(x,\delta_x)\cap N|\ge1

      • NNB(x,δx)N\gets N\diagdown B(x,\delta_x)
    • while xMX\exist x\in\mathcal M\diagdown X s.t. B(x,δx)Nn/k|B(x,\delta_x)\cap N|\ge\lceil n/k\rceil do

      • XX{x}X\gets X\cup\{x\}
      • NNB(x,δx)N\gets N\diagdown B(x,\delta_x)
    • while xMX\exist x'\in\mathcal M\diagdown X s.t. xx' is the center of a 2-block coalition

      • xxx^*\gets x'

      • rbcn/kr_{bc}\gets\lceil n/k\rceil-th nearest distance from xx^*

      • return Fine-tuning Greedy Capture(N,M,k,rbc,x)(\mathcal N,\mathcal M,k,r_{bc},x^*)

  4. return XX

Important: DOES IT WORK?

It is obvious that if the algorithm terminates, then we can find 2-approximate PROP clustering.

Running on the Example

On the example where no matter how the clustering is, the PROP approximation can not be better than 2. The algorithm exactly returns 2-PROP clustering.

On the example in which 1+21+\sqrt2 is the lower bound for Greedy Capture, The algorithm returns exactly PROP clustering.

Two symmetry parts, n=6,m=4,k=3n=6,m=4,k=3.

Consider one part that we only assign one cluster.

  • First the algorithm open x1x_1 by Greedy Capture.
  • Fine-tuning part finds that x2x_2 and a2,a3a_2,a_3 are a 2-approximate block coalition, then fixes them, rerun the Greedy Capture from the scratch.
  • Finally, x1x_1 are no longer opened. It is exactly PROP clustering.

Questions

There are a few question need to be answered.

❓How to prove that the algorithm always terminate?

❓Can this algorithm always find the best PROP clustering among all (mk)\left( \begin{array}{c}m\\k\end{array} \right) possible clustering?