Fair AI/ML 0 Overview
This semester, I will do a project again with Iannis.
I took some time to review the tutorial Fairness in AI/ML via Social Choice by Evi Micha & Nisarg Shah.
Social Choice Theory: Aggregating individual preferences into “good” collective decisions
Fairness in Social Choice
Already very familiar; this part can be skimmed quickly.
⚠️Here we consider DIVISIBLE ITEMS.
EF, PROP
Fair division: agents $N = \{1, \dots, n\}$, divisible resources $M$, utility functions $u_i$.
Non-negative utilities (good), non-positive utilities (chores).
Proportionality: every agent $i$ gets $u_i(A_i) \ge u_i(M)/n$. Envy-freeness: $u_i(A_i) \ge u_i(A_j)$ for all agents $i, j$.
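Both notions are easy to check directly for additive utilities over divisible goods; a minimal sketch (the function names and the example instance are my own):

```python
# Checking proportionality and envy-freeness for divisible goods with
# additive utilities. alloc[i][g] = fraction of good g given to agent i;
# util[i][g] = agent i's value for the whole of good g.

def value(util_i, bundle):
    """Additive utility of one agent for a fractional bundle."""
    return sum(u * x for u, x in zip(util_i, bundle))

def is_proportional(util, alloc):
    n = len(util)
    # Each agent should get at least a 1/n share of her value for everything.
    return all(value(util[i], alloc[i]) >= value(util[i], [1] * len(util[i])) / n - 1e-9
               for i in range(n))

def is_envy_free(util, alloc):
    # No agent should prefer another agent's bundle to her own.
    return all(value(util[i], alloc[i]) >= value(util[i], alloc[j]) - 1e-9
               for i in range(len(util)) for j in range(len(util)))

util = [[2, 1], [1, 2]]              # two agents, two divisible goods
alloc = [[1.0, 0.0], [0.0, 1.0]]     # each agent gets her favourite good
print(is_proportional(util, alloc), is_envy_free(util, alloc))  # True True
```

For additive utilities, envy-freeness implies proportionality, which the example also exhibits.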
The Core
An allocation is in the core if no subset of agents can redistribute their own entitlements among themselves in a way that is a Pareto improvement for that subset.
2 types of the core
Resource-scaling version
No group of agents should be able to find any allocation of their proportionally entitled share of the resources that is a Pareto improvement.
Utility-scaling version
No group of agents should be able to find any allocation of the resources that is a Pareto improvement even after proportional utility-scaling.
- The two versions are equivalent in some settings.
- They differ when utilities are not linear additive.
- If the resources cannot be scaled, there is no way to define the resource-scaling version, even though it is the more appealing of the two.
Nash Social Welfare
NSW is the geometric mean of the agents’ utilities: $\mathrm{NSW}(A) = \big(\prod_{i \in N} u_i(A_i)\big)^{1/n}$.
There are some other welfare functions:
Utilitarian social welfare: $\frac{1}{n} \sum_{i \in N} u_i(A_i)$.
Why such a strange name? Just calling it average social welfare would be fine!
Egalitarian social welfare: $\min_{i \in N} u_i(A_i)$.
It could clearly have been called least social welfare...
🤔Theorem [Varian '74]: Any allocation maximizing NSW is envy-free and in the core.
🤔Theorem [Orlin '10]: An allocation maximizing the NSW can be computed in strongly polynomial time.
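A tiny illustration of Varian’s theorem (this is brute-force grid search, not Orlin’s algorithm; the instance is mine): find the NSW-maximizing division of two divisible goods between two agents, then check that it is envy-free.

```python
# Grid search for the NSW-maximizing division of two divisible goods
# between two agents with additive utilities, plus an envy-freeness check.
from itertools import product

util = [[3.0, 1.0], [1.0, 3.0]]  # util[i][g]: agent i's value for good g

def u(i, bundle):
    return sum(util[i][g] * bundle[g] for g in range(2))

def nsw(x0, x1):
    """Geometric mean of utilities when agent 0 gets fractions (x0, x1)."""
    a = u(0, (x0, x1))
    b = u(1, (1 - x0, 1 - x1))
    return (a * b) ** 0.5

grid = [i / 100 for i in range(101)]
x0, x1 = max(product(grid, grid), key=lambda p: nsw(*p))
a_bundle, b_bundle = (x0, x1), (1 - x0, 1 - x1)
envy_free = (u(0, a_bundle) >= u(0, b_bundle) - 1e-9 and
             u(1, b_bundle) >= u(1, a_bundle) - 1e-9)
print((x0, x1), envy_free)  # each agent gets her favourite good; no envy
```

Here the maximizer gives each agent the good she values more, and, as the theorem predicts, nobody envies.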
Application of Core: Committee Selection
There is a set of voters $N$, $|N| = n$, and a set of candidates $C$.
Each voter $i$ approves a subset of candidates $A_i \subseteq C$.
The utility of voter $i$ from a committee $W$ is $u_i(W) = |W \cap A_i|$, the number of committee members it approves.
The goal is to find $W \subseteq C$ with $|W| = k$ for a given $k$.
$W$ is in the core if there are no $S \subseteq N$ and $T \subseteq C$ with $|T| \le \frac{|S|}{n} \cdot k$ (here the candidates are the resources, so this is the resource-scaling version) s.t. $u_i(T) > u_i(W)$ for all $i \in S$.
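The core condition above can be verified by brute force on small instances; a toy sketch (my own code, exponential in the instance size):

```python
# Brute-force check of the resource-scaling core in approval-based committee
# selection: committee W of size k is blocked if some voter group S can
# afford a set T (|T| <= |S|/n * k) that every member of S strictly prefers.
from itertools import combinations

def utility(approvals, W):
    return len(approvals & W)

def in_core(approval_sets, candidates, W, k):
    n = len(approval_sets)
    voters = range(n)
    for s in range(1, n + 1):
        budget = (s * k) // n        # size of T a group of s voters can afford
        if budget == 0:
            continue
        for S in combinations(voters, s):
            for t in range(1, budget + 1):
                for T in combinations(candidates, t):
                    T = set(T)
                    if all(utility(approval_sets[i], T) > utility(approval_sets[i], W)
                           for i in S):
                        return False  # group S blocks W via T
    return True

# 4 voters, candidates a-d, k = 2: two "parties" of equal size.
approvals = [{'a', 'b'}, {'a', 'b'}, {'c', 'd'}, {'c', 'd'}]
cands = ['a', 'b', 'c', 'd']
print(in_core(approvals, cands, {'a', 'c'}, 2))  # balanced committee
print(in_core(approvals, cands, {'a', 'b'}, 2))  # ignores voters 3 and 4
```

The balanced committee is in the core, while the committee that ignores half the voters is blocked by that half with their one affordable candidate.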
Advantages
It’s broadly defined
- It often depends only on the definition of AGENTS and PREFERENCES.
- Applicable to any setting as long as you define these two pieces of information.
Notions such as the core achieve group fairness to all possible groups.
- No need to pre-specify the groups.
- They scale automatically with group size and cohesiveness, without additional parameters.
Envy-Freeness in ML
Classification
The model
Population of individuals given by a distribution $\mathcal{D}$ over a domain $X$.
Each individual is represented by a data point $x \in X$.
A classifier $f$ maps every individual $x$ to a classification outcome $f(x) \in Y$.
Types of classification:
- Hard binary: $f(x) \in \{0, 1\}$
- Hard multi-class: $f(x) \in \{1, \dots, m\}$
- Soft binary: $f(x)$ is a probability of the positive class
- Soft multi-class: $f(x)$ is a distribution over the classes
Objective: minimize loss function
Utility function should be somehow related to loss function.
How to measure fairness in ML?
Individual Fairness
[Dwork, Hardt, Pitassi, Reingold, Zemel, 2012]
Similar individuals should be treated similarly
Classifier $f$ is individually fair if $D(f(x), f(y)) \le d(x, y)$ for all $x, y$, where $d$ measures similarity of individuals and $D$ measures similarity of outcomes.
Envy-Freeness
[Balcan, Dick, Noothigattu, Procaccia, 2019]
Equal individuals shouldn’t envy each other.
Classifier $f$ is envy-free if $u_x(f(x)) \ge u_x(f(y))$ for all $x, y$.
Envy-freeness is too strong for deterministic classifiers
- loss of optimal deterministic EF classifier .
- Loss of optimal randomized EF classifier .
I never did figure out what this part is about.
Preference-Informed IF
[Kim, Korolova, Rothblum, Yona, 2019]
Similar individuals shouldn’t envy each other too much.
Classifier is PIIF if
$\forall x, y \in N, \exists z \in Y: D(z, f(y)) \le d(x, y) \wedge u_x(f(x)) \ge u_x(z)$.
Approximate EF
Equal individuals shouldn’t envy each other too much.
Classifier $f$ is approximately EF if $u_x(f(x)) \ge u_x(f(y)) - \alpha$ for all $x, y$.
OR
Classifier $f$ is $(\alpha, \beta)$-EF if $\Pr_{x, y \sim \mathcal{D}}\big[u_x(f(y)) > u_x(f(x)) + \alpha\big] \le \beta$.
Average Group Envy-Freeness
Equal groups shouldn’t envy each other too much on average.
Classifier $f$ is approximately average group EF w.r.t. groups $G_1, G_2$ if $\frac{1}{|G_1|} \sum_{x \in G_1} u_x(f(x)) \ge \frac{1}{|G_1| |G_2|} \sum_{x \in G_1} \sum_{y \in G_2} u_x(f(y)) - \alpha$.
Applicable for decision-making with limited resources
- e.g. deciding on loan or bail applications
- envy cannot be prevented
Can be imposed for several pairs of groups simultaneously
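Both the pairwise and the group-averaged notions reduce to simple comparisons once each individual’s utility for every outcome is known; a toy sketch (the setup and numbers are my own):

```python
# Measuring approximate envy-freeness and average group envy for a hard
# classifier, given each individual's utility for every possible outcome.

def max_pairwise_envy(utils, f):
    """Largest amount by which some x envies some y's outcome."""
    people = list(utils)
    return max(utils[x][f[y]] - utils[x][f[x]] for x in people for y in people)

def avg_group_envy(utils, f, G1, G2):
    """Average utility gain group G1 would get from G2's outcomes."""
    own = sum(utils[x][f[x]] for x in G1) / len(G1)
    other = sum(utils[x][f[y]] for x in G1 for y in G2) / (len(G1) * len(G2))
    return other - own

# outcomes: 0 = reject, 1 = accept; utils[x][o] = utility of x under outcome o
utils = {'x1': [0.0, 1.0], 'x2': [0.0, 1.0], 'x3': [0.0, 1.0]}
f = {'x1': 1, 'x2': 1, 'x3': 0}   # limited resource: only two acceptances
print(max_pairwise_envy(utils, f))             # x3 envies the accepted pair
print(avg_group_envy(utils, f, ['x3'], ['x1', 'x2']))
```

With a hard resource limit the envy is unavoidable, which is exactly why the approximate and averaged relaxations are used.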
Recommendations
[Do, Corbett-Davies, Atif, Usunier, 2023]
Model
- Individuals represented by data points in a set $N$
- A set of items $M$
- A set of contexts $C$
Recommendation policy
- $\pi(v \mid x, c)$ is the probability of recommending item $v$ to user $x$ given a context $c$.
Utility function: $u_x(\pi) = \mathbb{E}_{v \sim \pi(\cdot \mid x, c)}[u_x(v, c)]$.
Envy-freeness: no user $x$ prefers another user’s policy, i.e., $u_x(\pi^x) \ge u_x(\pi^y)$ for all $x, y$.
Two-Sided Fairness
Why two-sided? Because we can define a utility function for the users and also one for the products being recommended.
[Biswas, Patro, Ganguly, Gummadi, Chakraborty, 2023]
Many-to-many matching
- each user is recommended $k$ products
- each product may be recommended to a different number of users
Relevance of product $v$ to user $x$ given by a score $V_{x, v}$
Recommendation policy
- Each user $x$ is recommended a set $R_x$ with $|R_x| = k$
- Let $T_x$ be the top-$k$ products for user $x$ by relevance
Utilities
- Utility to user $x$ given by $u_x = \sum_{v \in R_x} V_{x, v}$
- Utility to product $v$ given by its exposure $E_v$, the number of users $v$ is exposed to
Two-sided fairness
Fairness for users: EF1 (envy-freeness up to one product)
Fairness for products: minimum exposure
Theorem: There exists an efficient algorithm that achieves EF1 among all users and the minimum exposure guarantee among at least products.
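The user-side EF1 guarantee can be checked directly from the relevance scores; a toy sketch (this is my own check, not the algorithm from the paper):

```python
# Checking EF1 among users for a many-to-many recommendation, where user
# utility is the summed relevance of the products recommended to them.

def ef1_among_users(V, rec):
    """rec[x] = list of products for user x; V[x][p] = relevance of p to x."""
    users = list(rec)
    for x in users:
        ux = sum(V[x][p] for p in rec[x])
        for y in users:
            if x == y:
                continue
            uy = sum(V[x][p] for p in rec[y])
            # EF1: envy must vanish after dropping one product from y's list
            if uy - max(V[x][p] for p in rec[y]) > ux + 1e-9:
                return False
    return True

V = {'u1': {'a': 1.0, 'b': 0.9, 'c': 0.1},
     'u2': {'a': 1.0, 'b': 0.2, 'c': 0.8}}
rec = {'u1': ['a', 'c'], 'u2': ['b', 'c']}   # product c reaches both users
print(ef1_among_users(V, rec))
```

Spreading the popular products across users keeps the allocation EF1 while every product still gets some exposure.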
Non-Centroid Clustering
[Ahmadi, Awasthi, Khuller, Kleindessner, Morgenstern, Sukprasert, Vakilian, 2022]
[Aamand, Chen, Liu, Silwal, Sukprasert, Vakilian, Zhang, 2023]
Goal: Partition the agents into $k$ clusters, i.e., $N = C_1 \cup \dots \cup C_k$
Model
- Set of agents $N$ partitioned into clusters $C_1, \dots, C_k$.
- Cluster containing agent $i$ denoted by $C(i)$.
- Distance metric $d$.
$\alpha$-EF: For each $i$ and $j$ with $C(i) \ne C(j)$: either $|C(i)| = 1$, or agent $i$’s average distance to its own cluster is at most $\alpha$ times its average distance to $C(j)$.
Theorem:
A 1-envy-free clustering does not always exist, but an approximately envy-free clustering always exists and can be computed efficiently.
Another Setting
[Li, M, Nikolov, S, 2023]
Model
- Set of agents $N$ partitioned into clusters $C_1, \dots, C_k$.
- Cluster containing agent $i$ denoted by $C(i)$.
- Binary costs $d(i, j) \in \{0, 1\}$
Theorem: There exists a balanced clustering (satisfying ) s.t. for all and , .
- If we divide by the size of the clusters, then the additive term becomes .
Nash Social Welfare in ML
Multi-Armed Bandits
Multi-armed bandits; regret-minimization algorithms.
$k$ arms; pulling arm $j$ yields a stochastic reward with unknown mean $\mu_j$.
Exploration vs Exploitation
Regret: the gap to always pulling the best arm, $R(T) = T \cdot \max_j \mu_j - \mathbb{E}\big[\sum_{t=1}^{T} \mu_{j_t}\big]$
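The standard UCB1 index rule illustrates the exploration/exploitation trade-off; a minimal sketch (my own toy implementation with Bernoulli arms):

```python
# Minimal UCB1: pull the arm with the highest empirical mean plus an
# exploration bonus; regret is measured against always pulling the best arm.
import math, random

def ucb1(means, horizon, seed=0):
    rng = random.Random(seed)
    k = len(means)
    counts, sums = [0] * k, [0.0] * k
    total_reward = 0.0
    for t in range(horizon):
        if t < k:
            arm = t                       # pull each arm once to initialise
        else:
            arm = max(range(k), key=lambda j: sums[j] / counts[j]
                      + math.sqrt(2 * math.log(t + 1) / counts[j]))
        reward = 1.0 if rng.random() < means[arm] else 0.0  # Bernoulli arm
        counts[arm] += 1
        sums[arm] += reward
        total_reward += reward
    regret = horizon * max(means) - total_reward
    return regret, counts

regret, counts = ucb1([0.2, 0.5, 0.8], horizon=5000)
print(regret, counts)  # the best arm should dominate the pull counts
```

The exploration bonus shrinks as an arm is pulled more, so suboptimal arms are sampled only logarithmically often.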
Multi-Agent Multi-Armed Bandits
[Hossain, M, S, 2021]
Agents $N = \{1, \dots, n\}$
Each arm $j$ has a different reward distribution for each agent $i$, with mean $\mu_{i, j}$.
For each agent .
❓What is a fair policy?
A distribution $p$ over the arms gives expected reward $u_i(p) = \sum_j p_j \mu_{i, j}$ to agent $i$
Maximizing welfare functions
- Utilitarian welfare $\sum_{i} u_i(p)$
- Egalitarian welfare $\min_{i} u_i(p)$
- Nash welfare $\prod_{i} u_i(p)$
Regret: $R(T) = T \cdot \max_p \mathrm{NSW}(p) - \mathbb{E}\big[\sum_{t=1}^{T} \mathrm{NSW}(p^t)\big]$
🤔Theorem: A variation of UCB achieves the optimal regret.
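To see why the welfare objective matters, here is a sketch of the objective alone (learning omitted, means assumed known; the instance is mine): grid search over arm distributions and compare the three welfare-optimal choices.

```python
# With known means mu[i][j], compare the arm distributions chosen by the
# utilitarian, egalitarian, and Nash welfare objectives.

mu = [[0.9, 0.2],   # agent 0 strongly prefers arm 0
      [0.1, 0.9]]   # agent 1 strongly prefers arm 1

def agent_utils(p):
    return [sum(pj * mu_i[j] for j, pj in enumerate(p)) for mu_i in mu]

grid = [(i / 100, 1 - i / 100) for i in range(101)]
util_best = max(grid, key=lambda p: sum(agent_utils(p)))
egal_best = max(grid, key=lambda p: min(agent_utils(p)))
nash_best = max(grid, key=lambda p: agent_utils(p)[0] * agent_utils(p)[1])
print(util_best, egal_best, nash_best)
```

On this instance the utilitarian optimum plays arm 1 exclusively, abandoning agent 0, while the egalitarian and Nash optima mix the two arms to balance the agents.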
Fair Exploration
[Barman, Khan, Maiti, Sawarni, 2023]
An agent arrives at each time $t$
We choose a distribution $p^t$ over the arms, which gives the arriving agent her expected utility
Regret:
Theorem: A variation of UCB achieves near-optimal regret in terms of .
[Baek, Farias, 2021]
An agent arrives at time $t$ and belongs to group $g$, where group $g$ arrives with probability $p_g$
Policy: choose arm $a_t$ at time $t$
Regret for group $g$: $R_g(T)$
Utility to group $g$ defined relative to $\pi^*$, the policy minimizing the overall regret
Nash social welfare:
Theorem: A version of UCB exactly optimizes NSW.
Core in ML
Federated Learning
[Chaudhury, Li, Kang, Li, Mehta, 2022]
FL: Choose a parameter vector $\theta$ from a set $\Theta$
Utility of each agent $i$: $u_i(\theta)$ (e.g., the accuracy of model $\theta$ on agent $i$’s data)
Goal: Choose a $\theta$ that is fair for all agents
Core: A parameter vector $\theta$ is in the core if there are no $S \subseteq N$ and $\theta' \in \Theta$ s.t. $\frac{|S|}{n} \, u_i(\theta') \ge u_i(\theta)$ for all $i \in S$,
with at least one strict inequality.
Pareto Optimality (take $S = N$)
Proportionality (take $S = \{i\}$)
🤔Theorem: When the agents’ utilities are continuous and the set of maximizers of any conical combination of the agents’ utilities is convex, a parameter vector in the core always exists.
🤔Theorem: When the agents’ utilities are concave, the parameter vector that maximizes the NSW is in the core: $\theta^* \in \arg\max_{\theta \in \Theta} \prod_{i \in N} u_i(\theta)$.
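A toy illustration of this theorem (setup entirely mine): a one-dimensional parameter, two agents with concave utilities, grid search for the NSW maximizer, and a brute-force check of the core condition over the same grid.

```python
# NSW rule in a toy "federated" setting: two agents with concave utilities
# over a scalar theta; check the utility-scaling core condition
# (|S|/n) * u_i(theta') >= u_i(theta) by brute force.
from itertools import combinations

utils = [lambda t: 1 - (t - 0.2) ** 2,    # agent 0 prefers theta near 0.2
         lambda t: 1 - (t - 0.8) ** 2]    # agent 1 prefers theta near 0.8

grid = [i / 200 for i in range(201)]
theta = max(grid, key=lambda t: utils[0](t) * utils[1](t))

def blocks(theta, S, theta2, n=2, eps=1e-9):
    """Does coalition S block theta via theta2 (utility-scaling core)?"""
    scaled = [len(S) / n * utils[i](theta2) for i in S]
    base = [utils[i](theta) for i in S]
    return (all(s >= b - eps for s, b in zip(scaled, base))
            and any(s > b + eps for s, b in zip(scaled, base)))

in_core = not any(blocks(theta, S, t2)
                  for r in (1, 2)
                  for S in combinations(range(2), r)
                  for t2 in grid)
print(theta, in_core)
```

By symmetry the NSW maximizer lands midway between the agents’ ideal points, and no coalition can block it.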
Clustering
In ML
Goal: Analyze datasets to summarize their characteristics
Objects in the same group are similar
In Economics
Goal: Allocate a set of facilities that serve a set of agents (e.g., hospitals)
Centroid Clustering
Input
- Set of data points $N$, $|N| = n$
- Set of feasible cluster centers $M$
- Distances: we have a metric $d(i, j)$, which forms a metric space
Output
- A set of centers $X \subseteq M$ with $|X| = k$
- Each data point is assigned to its closest cluster center
Some Objective Functions
$k$-median: minimizes the sum of the distances, $\sum_{i \in N} d(i, X)$
$k$-means: minimizes the sum of the squares of the distances, $\sum_{i \in N} d(i, X)^2$
$k$-center: minimizes the maximum distance, $\max_{i \in N} d(i, X)$, where $d(i, X) = \min_{x \in X} d(i, x)$
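The three objectives differ only in how nearest-center distances are aggregated; a small helper (my own) makes that concrete:

```python
# Compute the three standard centroid-clustering objectives for a chosen
# center set, with each point assigned to its nearest center.

def dist(p, q):
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

def objectives(points, centers):
    nearest = [min(dist(p, c) for c in centers) for p in points]
    return {'k-median': sum(nearest),
            'k-means': sum(d * d for d in nearest),
            'k-center': max(nearest)}

points = [(0, 0), (0, 1), (10, 0), (10, 1)]
obj = objectives(points, centers=[(0, 0.5), (10, 0.5)])
print(obj)  # {'k-median': 2.0, 'k-means': 1.0, 'k-center': 0.5}
```

Because $k$-means squares the distances, it punishes outliers more than $k$-median, while $k$-center cares only about the single worst-served point.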
Fairness in Clustering
[Chen, Fain, Lyu, Munagala, 2019]
Fair clustering through proportional entitlement
- Every sufficiently large group of agents deserves its own cluster center
Definition in Committee Selection: $W$ is in the core if
For all $S \subseteq N$ and $T \subseteq C$:
If $|S| \ge \frac{n}{k} \cdot |T|$, then $u_i(W) \ge u_i(T)$ for some $i \in S$
“If a group can afford $T$, then $T$ should not be a strict Pareto improvement for the group”
Definition in Clustering: $X$ is in the core if
For all $S \subseteq N$ and $y \in M$:
If $|S| \ge \frac{n}{k}$, then $d(i, X) \le d(i, y)$ for some $i \in S$
“If a group can afford a center $y$, then $y$ should not be a strict Pareto improvement for the group”
Core in Centroid Clustering
[Chen, Fain, Lyu, Munagala, 2019]
A clustering solution in the core does not always exist
Approximation core
$\alpha$-core: A solution $X$ is in the $\alpha$-core with $\alpha \ge 1$ if there is no group of points $S$ with $|S| \ge \frac{n}{k}$ and center $y \in M$ s.t. $\alpha \cdot d(i, y) < d(i, X)$ for all $i \in S$
🤔Theorem. There exists an algorithm Greedy Capture that returns a clustering solution in the -core under any metric space. For arbitrary metric spaces and , a clustering solution in the -core is not guaranteed to exist.
🤔Theorem.
- For , Greedy Capture returns a solution in the -core
- For and , GC does not always return a -core
- For and , a clustering solution in the -core is not guaranteed to exist
- For and , and , a clustering solution in the -core is not guaranteed to exist
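A sketch of the Greedy Capture idea, coded from its verbal description (this discretizes the continuous ball-growing into the finite set of point-to-candidate distances; names and the instance are mine): grow balls around all candidate centers; when a ball first contains $\lceil n/k \rceil$ unassigned points, open that center and capture them; opened centers keep capturing any point their ball reaches.

```python
# Discrete-radius sketch of Greedy Capture for centroid clustering.
import math

def greedy_capture(points, candidates, k, dist):
    n = len(points)
    threshold = math.ceil(n / k)
    unassigned = set(range(n))
    opened, assignment = [], {}
    radii = sorted({dist(points[i], c) for i in range(n) for c in candidates})
    for r in radii:                       # grow all balls to radius r
        for c in candidates:
            ball = [i for i in unassigned if dist(points[i], c) <= r]
            if c in opened:
                for i in ball:            # opened center swallows reached points
                    assignment[i] = c
                    unassigned.discard(i)
            elif len(ball) >= threshold:  # enough points: open this center
                opened.append(c)
                for i in ball:
                    assignment[i] = c
                    unassigned.discard(i)
        if not unassigned:
            break
    return opened, assignment

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
cands = [(0, 0), (10, 10), (5, 5)]
opened, assignment = greedy_capture(points, cands, k=2, dist=math.dist)
print(opened, assignment)  # the two tight triangles each capture a center
```

Each of the two natural groups has $n/k = 3$ points, so each opens its nearby center as soon as the growing ball reaches radius 1, which is the intuition behind the proportionality guarantee.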
Core in Non-Centroid Clustering
Non-Centroid Clustering
- Partition the individuals into $k$ clusters
- Loss $\ell_i(C)$ for agent $i \in N$ and cluster $C$ containing $i$
$\alpha$-core: A solution is in the $\alpha$-core with $\alpha \ge 1$, if there is no group of points $S$ with $|S| \ge \frac{n}{k}$ s.t. $\alpha \cdot \ell_i(S) < \ell_i(C(i))$ for all $i \in S$
Average loss: $\ell_i(C) = \frac{1}{|C|} \sum_{j \in C} d(i, j)$ for each $i$
🤔Theorem
- Greedy Capture returns a clustering solution in the -core under any metric space
- For arbitrary metric space and , a clustering solution in the -core is not guaranteed to exist
Open question: Does a clustering solution in the -core always exist?