Brief Introduction
Invertible Bloom Lookup Table (abbr. IBLT)
a fancy hash-table stores key-value pair (x,y)
space budget: t
Guarantee
It “works” when the number of key-value pairs, n, in table has n≤t.
“temporarily” stops working if n>t.
starts working again if n drops below t.
Operations
(x,y)
is key-value pair.
insert(x,y)
, delete(x,y)
, get(x)
, listentries
.
Details
Start with table with k≥3 rows, m=2×t columns.
In each row, we have a hash function h1,⋯,hk.
Each of the entry, called cell, stores values:
Insert Operation
- for i=1 to k
- c=A[i][hi(x)]
- c.count++
- c.key_sum+=x
- c.value_sum+=y
Delete Operation
“undo” insert operation
We must know the pair is there before deletion.
- for i=1 to k
- c=A[i][hi(x)]
- c.count−−
- c.key_sum−=x
- c.value_sum−=y
Get Operation
slightly different from min sketch count
ret=∞
for i=1 to k
c=A[i][hi(x)]
if c.count==0 return “not there”
else if c.count==1 then
return “don’t know”
Probability of Success
If n≤t, then get(x)
succeeds with probability ≥1−2−k.
Assume k>1, n pairs: (xi,yi) is stored in IBLT, n≤t.
For get(x)
Pr[get(x) fails]=Pr[xi=x⋃Ei]≤xi=x∑Pr[Ei]≤xi=x∑m1=mn−1≤mt−1=2tt−1≤21
If x is not in IBLT.
明明 x 不在 IBLT 的这一行里面,但是 get 函数得出该键对应的哈希值指向的位置在 IBLT 这一行中产生碰撞的概率。
Define fail-event Eij={h(x)=h(xi)=h(xj)}
Pr[i=j⋃Eij]≤i<j∑Pr[Eij]=i<j∑m21=Cn2m21=2m2n(n−1)≤2m2t2≤8t2t2=81
If we choose k=log2t+log2δ1, the probability of failure is at most δ.
How to Avoid False Deletion?
another set of universal hash function g1,⋯,gk:U→[R]
while query, check whether c.hash_sum==g(c.key_sum)
ListEntries Operation
Q=∅
for c in IBLT
- if c.count==1
- Q.push(c.key_sum,c.value_sum)
// Max size of Q is mk here
Analysis of Success Probability
hyper-graph
node: memory location
h is truly random
n edges, e1,⋯,en are independently uniformly random.
List entries fails iff non-empty 2-core exists.
i.e. largest subgraph where every node has degree≥2
F={ListEntries fails}
Fj={ListEntries fails with j key-value left}
Pr[F]=Pr[j=2⋃nFj]≤j=2∑nPr[Fj] (1)
For every S∈Cxj, FjS={ListEntries fails, with S as j keys}
(1)=j=2∑nPr[S∈Cxj⋃FjS]≤j=2∑nS∈Cxj∑Pr[FjS] (2)
Denote: (T1,⋯,Tk)∈(C[m]j/2)k
Fj,S,T1,⋯,Tk={LE fails for S and ∀i:hi(S)⊆Ti}
(2)=j=2∑nS∈Cxj∑(T1,⋯,Tk)∈(C[m]j/2)k∑Pr[Fj,S,T1,⋯,Tk] (3)
Pr[h1(x1)⊆T1]=2mjPr[h1(S)⊆T1=(2mj)jPr[∀i:hi(S)⊆Ti]=(2mj)jkPr[Fj,S,T1,⋯,Tk]≤Pr[∀i:hi(S)⊆Ti]=(2mj)jk(3)≤j=2∑nCnj(Cmj/2)k(2mj)jk≤j=2∑n(jen)j(jm2e)jk/2(2mj)jk=j=2∑n(jen)j(m2je)jk/2=j=2∑n(2me2n)j(2mje)j(k/2−1)≤j=2∑n2−j(2mje)j(k/2−1)≤2⋅41⋅(me)k−2≤(me)k−2≤(1/t)k−2
☠️☠️☠️