Fair AI/ML 3 PROP Fair Clustering Revisited
Read the paper Proportionally Fair Clustering Revisited by Micha and Shah.
In centroid clustering, $k$ cluster centers must be placed given $n$ points in a metric space.
The cost to each point is its distance to the nearest cluster center.
This paper: Focus on the case where cluster centers can be placed anywhere in the metric space.
E.g., for the $L_2$ distance over $\mathbb{R}^d$, the approximation ratio of greedy capture improves to $2$.
For the $L_1$ and $L_\infty$ metrics, the approximation ratio remains $1+\sqrt{2}$.
The paper gives universal lower bounds which apply to all algorithms.
Introduction
Points $\mathcal{N}$ (with $n = |\mathcal{N}|$), possible cluster centers $\mathcal{M}$.
Given $k$, the task is to select a set $X \subseteq \mathcal{M}$ consisting of $k$ cluster centers.
Group fairness.
Proportional fairness.
Contribution
Focus on the case where the metric consists of usual distance functions such as $L_1$, $L_2$, and $L_\infty$ over $\mathbb{R}^d$.
⚠️Cluster centers can be placed anywhere in the infinite metric space. ❗
Note: the set of candidate cluster centers here is different from the previous paper, so the resulting approximation bounds are different.
Greedy Capture (GC) by Chen et al., designed for the setting where $\mathcal{N}$ and $\mathcal{M}$ are finite.
For $L_2$, GC provides $2$-PROP. The approximation is obtained via the Apollonius radius.
For $L_1$ and $L_\infty$, the approximation ratio of GC cannot be better than $1+\sqrt{2}$.
Universal lower bounds: for $L_2$ and $d \ge 2$, no algorithm achieves better than $\rho$-PROP for some explicit constant $\rho > 1$, whereas for $L_1/L_\infty$ and $d \ge 2$ a different explicit constant applies; see Theorems 7 and 8 below.
Also, this paper considers the case where $\mathcal{X}$ is the set of nodes of an unweighted graph, and $d$ measures the shortest-path distance between two nodes of the graph.
- If the graph is a tree, then PROP must exist, and there is an efficient algorithm.
- If the graph is arbitrary and $k \ge n/2$, PROP must exist and can be computed efficiently.
- Open problem: whether an exact PROP exists for all graphs.
For finite $\mathcal{N}$ and finite $\mathcal{M}$, and any $\rho \ge 1$, checking whether a $\rho$-PROP clustering exists is NP-hard.
When the dimension $d$ is large (part of the input), even computing the greedy capture clustering is NP-hard. When $d$ is constant, it becomes efficiently solvable.
When $d$ is large, using an approximate ball-growing subroutine inside greedy capture, we can efficiently compute a $(2+\epsilon)$-PROP fair clustering for any fixed $\epsilon > 0$.
Problem of generalization: Would a clustering that is PROP w.r.t. samples drawn from $\mathcal{N}$ remain approximately PROP w.r.t. the entire set $\mathcal{N}$? Chen et al. proved that the answer is yes when $\mathcal{M}$ is finite. Using the VC dimension, this paper shows the answer is also yes for $\mathcal{M} = \mathbb{R}^d$.
Preliminaries
$\mathcal{N}$: the set of $n$ data points, which lie in a metric space $\mathcal{X}$, where $d$ is a distance function satisfying the triangle inequality.
Consider $d = L_p$ for $p \in \{1, 2, \infty\}$.
Distance function: $L_p(x, y) = \big(\sum_{j} |x_j - y_j|^p\big)^{1/p}$.
$\mathcal{M}$: the set of candidate cluster centers. In this paper, $\mathcal{M} = \mathcal{X}$.
Given $k$, a $k$-clustering is a set $X \subseteq \mathcal{M}$ with $|X| = k$.
Each $x \in X$ is called an open cluster center.
The cost to agent $i$ induced by a cluster center $x$ is the distance $d(i, x)$; the overall cost to agent $i$ is the minimum, $d(i, X) = \min_{x \in X} d(i, x)$.
Each agent is interested in minimizing its own cost.
If a set of points containing at least $\lceil n/k \rceil$ agents can find a new cluster center that is better for each of them, that is a violation of fairness (a blocking coalition).
📖Definition 1 ($\rho$-PROP, blocking coalition). A $k$-clustering $X$ is $\rho$-PROP if for every $S \subseteq \mathcal{N}$ with $|S| \ge \lceil n/k \rceil$ and every $y \in \mathcal{M}$, there exists some $i \in S$ with $\rho \cdot d(i, y) \ge d(i, X)$; a pair $(S, y)$ witnessing the opposite is a blocking coalition.
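When $\mathcal{M}$ is finite, Definition 1 is easy to audit: for a fixed deviation target $y$, the largest possible coalition is simply the set of all points that improve by more than a factor $\rho$. A minimal Python sketch (function and variable names are mine):

```python
import math

def is_rho_prop(points, X, candidates, rho, k, dist):
    """Audit Definition 1 over a finite candidate set: X is rho-PROP iff no
    single target y attracts >= ceil(n/k) points, each of which would lower
    its cost by more than a factor rho by switching to y."""
    need = math.ceil(len(points) / k)
    cost = [min(dist(p, x) for x in X) for p in points]
    return all(
        sum(rho * dist(p, y) < c for p, c in zip(points, cost)) < need
        for y in candidates
    )
```

When $\mathcal{M}$ is infinite, as in this paper, the quantification over $y$ is exactly what makes things hard; see the computational section below.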
PROP on graphs: given an undirected graph $G = (V, E)$, set $\mathcal{N} = \mathcal{M} = V$; the distance between two nodes is the length of the shortest path between them.
Greedy Capture
Details are omitted here. In brief: grow balls around all candidate centers at a uniform rate; whenever a ball captures $\lceil n/k \rceil$ unassigned points, open that center and remove those points.
Chen et al. showed that greedy capture is $(1+\sqrt{2})$-PROP.
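As a reference, here is a minimal sketch of a simplified greedy capture over a finite candidate set (function names are mine; the detail that already-open centers keep absorbing points as the common radius grows is omitted):

```python
import math

def greedy_capture(points, candidates, k, dist):
    """Simplified greedy capture: repeatedly open the candidate that captures
    ceil(n/k) still-unassigned points at the smallest radius, remove them."""
    n = len(points)
    need = math.ceil(n / k)
    unassigned = set(range(n))
    centers = []
    while len(centers) < k and len(unassigned) >= need:
        best = None  # (radius, candidate, captured indices)
        for y in candidates:
            if y in centers:
                continue
            by_dist = sorted(unassigned, key=lambda i: dist(points[i], y))
            radius = dist(points[by_dist[need - 1]], y)
            if best is None or radius < best[0]:
                best = (radius, y, set(by_dist[:need]))
        radius, y, captured = best
        centers.append(y)
        unassigned -= captured
    for y in candidates:  # fewer than ceil(n/k) points left: pad arbitrarily
        if len(centers) == k:
            break
        if y not in centers:
            centers.append(y)
    return centers
```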
This paper gives a refined analysis of Greedy Capture.
📖Definition 2 (Apollonius radius). Given $q \in (0, 1)$, the $q$-Apollonius radius of a metric $d$ is defined as $\mathcal{A}_q(d) = \sup_{x, y \,:\, d(x, y) = 1} r\big(\{z : d(z, x) \le q \cdot d(z, y)\}\big)$, where $r(S)$ is the radius of the smallest ball centered at some point in $\mathcal{X}$ that contains the entire set $S$.
When $d$ is the $L_2$ distance, the set $\{z : d(z, x) \le q \cdot d(z, y)\}$ is a ball (the classical Apollonius ball).
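A quick empirical sanity check of this fact for $q = 1/2$ in the plane (the exact radius, stated below in Theorem 5, is $q/(1-q^2) = 2/3$):

```python
import numpy as np

# x, y at distance 1; the Apollonius set {z : |z - x| <= q |z - y|} for q = 1/2.
rng = np.random.default_rng(0)
x, y, q = np.array([0.0, 0.0]), np.array([1.0, 0.0]), 0.5

z = rng.uniform(-3, 3, size=(200_000, 2))
inside = z[np.linalg.norm(z - x, axis=1) <= q * np.linalg.norm(z - y, axis=1)]

# The ball's center lies on the line through x and y (the x-axis here),
# so its radius is half of the set's extent along that axis.
estimate = (inside[:, 0].max() - inside[:, 0].min()) / 2
print(estimate, q / (1 - q**2))  # both approximately 2/3
```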
🎯Theorem 3. For any metric $d$ and $\mathcal{M} = \mathcal{X}$, greedy capture finds a $\rho$-PROP clustering, where $\rho$ is the smallest positive number satisfying $\mathcal{A}_{1/\rho}(d) \le \rho/(\rho+1)$.
Note: candidate cluster centers can be any point of the space.
Proof by the triangle inequality; one shows that if the output is not $\rho$-PROP, then $\mathcal{A}_{1/\rho}(d) > \rho/(\rho+1)$.
🎯Theorem 4. For any metric $d$, the $q$-Apollonius radius is at most $q/(1-q)$. Hence, greedy capture finds a $(1+\sqrt{2})$-PROP fair clustering for every metric.
In my view, this paper defines a rather involved concept, the Apollonius radius, to formalize the bound on the algorithm's guarantee.
But the definition of this Apollonius radius feels overly complicated to me.
Next, the paper shows that for $L_2$, the $q$-Apollonius radius is smaller, yielding a $2$-PROP guarantee.
🎯Theorem 5. For the metric space $(\mathbb{R}^d, L_2)$, where $d \ge 1$, the $q$-Apollonius radius is $\mathcal{A}_q = q/(1-q^2)$, and hence greedy capture finds a $2$-PROP clustering.
Proof omitted.
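A small sanity check that the condition of Theorem 3 (as reconstructed above), combined with the radius bounds of Theorems 4 and 5, reproduces the two approximation ratios:

```python
import sympy as sp

rho = sp.symbols('rho', positive=True)
q = 1 / rho

# Theorem 3: rho is the smallest value with A_q(d) <= rho / (rho + 1).
# General metric: A_q <= q/(1-q)  (Theorem 4);  L2: A_q = q/(1-q^2)  (Theorem 5).
print(sp.solve(sp.Eq(q / (1 - q), rho / (rho + 1)), rho))     # [1 + sqrt(2)]
print(sp.solve(sp.Eq(q / (1 - q**2), rho / (rho + 1)), rho))  # [2]
```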
Does the Apollonius radius improve the approximation bound for other distance metrics? For $L_1$ and $L_\infty$, the answer is unfortunately no: the approximation remains $1+\sqrt{2}$.
🎯Theorem 6. For the metric space $(\mathbb{R}^d, L_p)$ where $p \in \{1, \infty\}$ and $d \ge 2$, and for $k \ge 2$, there exists an example in which the clustering produced by greedy capture is not $\rho$-PROP for any $\rho < 1+\sqrt{2}$.
Universal Lower Bounds
This paper gives lower bounds on the achievable approximation to PROP that apply to ALL algorithms.
Chen et al. showed that an exact PROP clustering is not guaranteed to exist, together with a weaker constant lower bound for their setting.
This paper: lower bounds for the case where $\mathcal{M} = \mathbb{R}^d$ and the distance is an $L_p$ norm.
Reminder again: candidate cluster centers can be any point of the space.
- When $d = 1$ (points on the real line), it is easy to see that PROP clusterings always exist.
- When $d \ge 2$, there is a lower bound for $L_2$ and a lower bound for $L_1/L_\infty$.
🎯Theorem 7. For the metric space $(\mathbb{R}^d, L_2)$ with $d \ge 2$ and $k \ge 2$, there is an example in which no clustering is $\rho$-PROP for any $\rho$ smaller than an explicit constant $> 1$ (precise constant in the paper).
Closing the gap between this lower bound and the upper bound of $2$ for $L_2$ is an open problem.
🎯Theorem 8. For the metric space $(\mathbb{R}^d, L_p)$, where $p \in \{1, \infty\}$ and $d \ge 2$, and for $k \ge 2$, there is an example in which no clustering is $\rho$-PROP for any $\rho$ smaller than an explicit constant $> 1$ (again, precise constant in the paper).
Clustering in Graphs
In this section, we consider the special case where the metric space is induced by an undirected graph $G = (V, E)$.
Consider the case $\mathcal{N} = \mathcal{M} = V$: all nodes are data points and candidate cluster centers, and the distance $d(u, v)$ is the length of the shortest path between $u$ and $v$.
When $G$ is a tree, an exact PROP clustering always exists and can be computed by an efficient algorithm.
PROP Clustering for Trees
First, root the tree at an arbitrary node $r$ to obtain a rooted tree $T^r$.
Denote by $H$ the height of the rooted tree $T^r$, and by $h(v)$ the depth of node $v$ relative to the root $r$.
Let $T_v$ denote the sub-tree rooted at node $v$.
The algorithm starts from the leaves: whenever it finds a node whose remaining sub-tree contains at least $\lceil n/k \rceil$ nodes, it opens a center there and deletes this sub-tree from the tree. At the end, the cost to each node is still defined using the closest node at which a center was opened by the algorithm.
Pseudo Code - PROP Clustering for Trees
- Root the tree at an arbitrary node $r$.
- $X \leftarrow \emptyset$.
- for $h = H$ down to $0$ do
- for every remaining node $v$ with $h(v) = h$ do
- if $|T_v| \ge \lceil n/k \rceil$ then $X \leftarrow X \cup \{v\}$ and delete $T_v$ from $T^r$
- return $X$
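A runnable sketch of this pseudocode (function and variable names are mine); a post-order DFS visits children before parents, which implements the leaves-upward sweep:

```python
import math
from collections import defaultdict

def prop_clustering_tree(n, edges, k, root=0):
    """Open a center at any node whose still-unremoved subtree reaches
    ceil(n/k) nodes, then delete that subtree (report size 0 upward)."""
    need = math.ceil(n / k)
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    centers = []

    def remaining_size(v, parent):
        size = 1 + sum(remaining_size(w, v) for w in adj[v] if w != parent)
        if size >= need:
            centers.append(v)   # open a center at v ...
            return 0            # ... and remove its subtree
        return size

    remaining_size(root, -1)
    # At most k centers are opened, since each opening removes >= ceil(n/k)
    # nodes; padding X up to size k with arbitrary nodes keeps the guarantee.
    return centers

# tiny example: path 0-1-2-3-4-5, k = 2, threshold ceil(6/2) = 3
print(prop_clustering_tree(6, [(0,1), (1,2), (2,3), (3,4), (4,5)], 2))  # [3, 0]
```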
🎯Theorem 9. Let $G$ be an undirected tree, $d$ be the metric induced by $G$, $\mathcal{N} = \mathcal{M} = V$, and $k \in \mathbb{N}$. Then, the algorithm yields a PROP clustering.
Hard proof by contradiction.
What about Non-tree Graphs?
Consider the universal lower bound for the $L_1/L_\infty$ metrics from Theorem 8.
If the graph is dense enough, we can derive the same lower bound.
❓Whether better lower bounds exist is an open problem.
❓For the special case where $\mathcal{N} = \mathcal{M} = V$, we do not know whether PROP always exists.
If $G$ is connected and we want to place a large number of clusters, $k \ge n/2$, then PROP always exists. This is because a dominating set of any size $\ge n/2$ is guaranteed to exist in a connected graph with $n$ nodes and can be computed efficiently. With such a set open, every node has cost at most $1$; a blocking coalition would need every member to get strictly closer to a single deviation target, which only the target node itself can achieve, so no coalition of size $\ge 2$ can block.
Definition of dominating set: A set of nodes is called a dominating set if every node in the graph is either in this set or adjacent to a node in this set.
Thus the problem of finding a PROP clustering in general graphs with $\mathcal{N} = \mathcal{M} = V$ is trivial when $k \ge n/2$, but remains open when $k < n/2$.
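A sketch of the classical construction behind the dominating-set claim (names are mine): the two BFS-layer parity classes are both dominating sets (each non-root node is adjacent to its BFS parent, which has opposite parity, and the root is adjacent to a child), and the smaller class has size at most $n/2$; supersets of dominating sets are dominating, so any size $k \ge n/2$ is achievable:

```python
from collections import deque

def small_dominating_set(n, edges):
    """Dominating set of size <= n/2 in a connected graph with n >= 2 nodes,
    via BFS 2-coloring: return the smaller distance-parity class."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    parity = [-1] * n
    parity[0] = 0
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if parity[w] == -1:
                parity[w] = parity[u] ^ 1
                queue.append(w)
    even = [v for v in range(n) if parity[v] == 0]
    odd = [v for v in range(n) if parity[v] == 1]
    return min(even, odd, key=len)

# 4-cycle: either {0, 2} or {1, 3} works
print(small_dominating_set(4, [(0,1), (1,2), (2,3), (3,0)]))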
Computational Aspects
Two problems:
- Checking whether a PROP clustering exists.
- Implementing Greedy Capture when $\mathcal{M} = \mathbb{R}^d$ is infinite.
🎯Theorem 10. Given a finite $\mathcal{N}$, a finite $\mathcal{M}$, $k \in \mathbb{N}$, and $\rho \ge 1$, checking whether a $\rho$-PROP clustering exists is NP-hard.
Polynomial-time reduction from the planar monotone rectilinear 3-SAT problem.
Planar monotone rectilinear 3-SAT problem: given a 3-SAT formula in which each clause consists of only positive or only negative literals, the graph connecting clauses to the variables they contain is planar, and this graph has a planar embedding in which each variable is represented by a rectangle on the $x$-axis and each positive/negative clause is represented by a rectangle above/below the $x$-axis with three vertical lines ("legs") to its three variables.
What on earth is this messy NP-hard problem? Clearly they are betting that I don't know how to prove reductions.
🎯Theorem 11. Let a finite $\mathcal{N} \subseteq \mathbb{R}^d$ and $k$ be given as input. Suppose $\mathcal{M} = \mathbb{R}^d$ and the distance is $L_2$. Then, the following hold:
- The clustering returned by the Greedy Capture algorithm cannot be computed in polynomial time unless $P = NP$.
- If the dimension $d$ is constant, then it can be computed in polynomial time.
- Even if $d$ is not constant, for any constant $\epsilon > 0$, there exists a polynomial-time algorithm which finds a $(2+\epsilon)$-PROP clustering.
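The bottleneck subroutine of greedy capture over $\mathcal{M} = \mathbb{R}^d$ is finding the smallest ball that captures $\lceil n/k \rceil$ of the remaining points. A brute-force sketch for the constant-dimension case $d = 2$ (names are mine), using the standard fact that a smallest enclosing circle is determined by at most three of the points on its boundary:

```python
import itertools, math

def smallest_ball_capturing(points, h):
    """Smallest circle (d = 2, L2) containing at least h of the points,
    assuming h >= 2.  O(n^3) candidate circles suffice, so this is
    polynomial only because the dimension is constant (cf. Theorem 11)."""
    eps = 1e-9
    best = (math.inf, None)

    def consider(center, radius):
        nonlocal best
        if radius < best[0] and \
           sum(math.dist(p, center) <= radius + eps for p in points) >= h:
            best = (radius, center)

    for p, q in itertools.combinations(points, 2):  # two points as diameter
        consider(((p[0]+q[0])/2, (p[1]+q[1])/2), math.dist(p, q)/2)
    for p, q, s in itertools.combinations(points, 3):  # circumcircles
        d = 2*(p[0]*(q[1]-s[1]) + q[0]*(s[1]-p[1]) + s[0]*(p[1]-q[1]))
        if abs(d) < eps:
            continue  # (nearly) collinear triple
        ux = ((p[0]**2+p[1]**2)*(q[1]-s[1]) + (q[0]**2+q[1]**2)*(s[1]-p[1])
              + (s[0]**2+s[1]**2)*(p[1]-q[1])) / d
        uy = ((p[0]**2+p[1]**2)*(s[0]-q[0]) + (q[0]**2+q[1]**2)*(p[0]-s[0])
              + (s[0]**2+s[1]**2)*(q[0]-p[0])) / d
        consider((ux, uy), math.dist(p, (ux, uy)))
    return best  # (radius, center)
```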
Learning Fair Clustering by Sampling
VERY DIFFICULT!
Would a clustering that is approximately PROP w.r.t. random samples taken from an underlying population remain approximately PROP w.r.t. the whole population?
YES!
📖Definition 12. We say that a $k$-clustering $X$ is $\rho$-PROP to $\epsilon$-deviations w.r.t. $\mathcal{N}$ if for all $S \subseteq \mathcal{N}$ with $|S| \ge (1/k + \epsilon) \cdot n$ and all $y \in \mathcal{M}$, there exists at least one $i \in S$ s.t. $\rho \cdot d(i, y) \ge d(i, X)$.
📖Definition 13. Given a set of points $\mathcal{N}$, a $k$-clustering $X$, and $y \in \mathcal{M}$, define the binary classifier $f_{X,\rho,y}$ s.t. $f_{X,\rho,y}(i) = 1$ iff $\rho \cdot d(i, y) < d(i, X)$. Define the error of this classifier on a set of points $N'$ as $err_{N'}(f) = \frac{|\{i \in N' : f(i) = 1\}|}{|N'|}$.
The classifier equals $1$ on $i$ iff $i$ can be part of a coalition that complains about the unfairness of $X$ by demonstrating $y$ as a location that provides them a factor-$\rho$ improvement.
$X$ is $\rho$-PROP to $\epsilon$-deviations w.r.t. $\mathcal{N}$ iff $err_{\mathcal{N}}(f_{X,\rho,y}) < 1/k + \epsilon$ for all $y \in \mathcal{M}$.
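A minimal sketch of Definitions 12-13 in code, assuming the additive threshold $1/k + \epsilon$ used above and a finite set of candidate deviation targets (names are mine):

```python
def f(i, X, y, rho, dist):
    """Definition 13: f_{X,rho,y}(i) = 1 iff deviating to y gives point i
    more than a factor-rho improvement over its current cost d(i, X)."""
    return 1 if rho * dist(i, y) < min(dist(i, x) for x in X) else 0

def err(points, X, y, rho, dist):
    """Fraction of points that would join a coalition deviating to y."""
    return sum(f(p, X, y, rho, dist) for p in points) / len(points)

def prop_to_eps_deviations(points, X, targets, rho, eps, k, dist):
    """X is rho-PROP to eps-deviations iff err < 1/k + eps for every y."""
    return all(err(points, X, y, rho, dist) < 1 / k + eps for y in targets)
```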
Goal: Show that, given a large enough random sample $N'$, if we have a clustering that is $\rho$-PROP w.r.t. $N'$, then it is $\rho$-PROP to $\epsilon$-deviations w.r.t. $\mathcal{N}$ with high probability. But there is no control over which clustering $X$ or deviation target $y$ will come up, so a pointwise bound does not suffice; we need a uniform convergence guarantee.
The key is that the VC dimension of the family of classifiers $f_{X,\rho,y}$, over all choices of $X$ and $y$, is bounded.
📖Definition 14 (VC Dimension). Let $\mathcal{Z}$ be a set of points and $\mathcal{F}$ a family of binary classifiers over $\mathcal{Z}$. We say that $\mathcal{F}$ shatters $S \subseteq \mathcal{Z}$ if for every labeling $\sigma: S \to \{0, 1\}$, there exists a classifier $f \in \mathcal{F}$ s.t. $f(z) = \sigma(z)$ for all $z \in S$. The VC dimension of $\mathcal{F}$, denoted $VCdim(\mathcal{F})$, is the size of the largest $S$ that can be shattered by $\mathcal{F}$.
🤔Proposition 15. Let $\mathcal{F}$ be a family of binary classifiers over a set of points $\mathcal{N}$. If $N' \subseteq \mathcal{N}$ is a uniformly random sample with $|N'| = O\big((VCdim(\mathcal{F}) + \log(1/\delta))/\epsilon^2\big)$, then with probability at least $1 - \delta$, $|err_{N'}(f) - err_{\mathcal{N}}(f)| \le \epsilon$ for all $f \in \mathcal{F}$.
🎯Theorem 16. Fix $k$, $\rho \ge 1$, $\epsilon, \delta > 0$, and the metric $L_p$ over $\mathbb{R}^d$ with $p \in \{1, 2, \infty\}$. Let $\mathcal{N}$ be a set of points and $\mathcal{M} = \mathbb{R}^d$ be the set of candidate centers. Let $N' \subseteq \mathcal{N}$ be sampled uniformly at random, with $|N'|$ as in Proposition 15 (the relevant classifier family has VC dimension bounded in terms of $k$ and $d$). Then, with probability at least $1 - \delta$, every $k$-clustering that is $\rho$-PROP w.r.t. $N'$ is $\rho$-PROP to $\epsilon$-deviations w.r.t. $\mathcal{N}$.
Open Questions
This paper mainly considers the case where $\mathcal{M}$ is the entire metric space.
There are some open questions.
Bridging Gaps
Can we bridge the lower and upper bounds on the approximation ratio to PROP?
Conjecture: for $L_2$, the lower bound may be achievable.
Maybe via Jung's theorem: for the $L_2$ distance in $\mathbb{R}^d$, any set of points with diameter $\Delta$ is contained in a ball of radius at most $\Delta \sqrt{d / (2(d+1))} \le \Delta/\sqrt{2}$. This could be useful in closing the gap for $L_2$.
On Graphs
Does a PROP clustering always exist for all graphs when $\mathcal{M}$ is the set of all nodes of the graph?
Computational Hardness
What is the computational hardness for other distance metrics?
Other ML Models
Can we adapt the PROP notion to other machine learning settings, such as regression or classification?
Some Personal Unremarkable Rants
This paper revisits Greedy Capture. But remember: cluster centers now range over the whole metric space, which yields different bounds.
Some dazzling new concepts like "Apollonius radius", "shatter", etc.
Impressive, spectacular showmanship in the hardness proof.