a superposition implementation of simm
Previously I gave a list implementation for simm.
Recall:
simm(w,f,g) = (w*f + w*g - w*[f - g])/(2*max(w*f,w*g))
simm(w,f,g) = \Sum_k w[k] min(f[k],g[k]) / max(w*f,w*g)
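As a quick sanity check that the two forms agree, here is a minimal list-based sketch. It assumes w*f means \Sum_k w[k]*f[k], that w*[f - g] means \Sum_k w[k]*|f[k] - g[k]|, and that all coefficients are >= 0. The function names simm_abs and simm_min are just labels for this example, not part of the earlier list implementation.

```python
import math

def simm_abs(w, f, g):
    # first form: (w*f + w*g - w*[f - g]) / (2*max(w*f, w*g))
    wf = sum(wk * fk for wk, fk in zip(w, f))
    wg = sum(wk * gk for wk, gk in zip(w, g))
    wfg = sum(wk * abs(fk - gk) for wk, fk, gk in zip(w, f, g))
    top = max(wf, wg)
    if top == 0:  # prevent div by zero.
        return 0
    return (wf + wg - wfg) / (2 * top)

def simm_min(w, f, g):
    # second form: \Sum_k w[k] min(f[k], g[k]) / max(w*f, w*g)
    wf = sum(wk * fk for wk, fk in zip(w, f))
    wg = sum(wk * gk for wk, gk in zip(w, g))
    top = max(wf, wg)
    if top == 0:  # prevent div by zero.
        return 0
    return sum(wk * min(fk, gk) for wk, fk, gk in zip(w, f, g)) / top

w = [1, 2, 1]
f = [3, 0, 2]
g = [1, 4, 2]
print(simm_abs(w, f, g), simm_min(w, f, g))
```

Both come out at 3/11 for this input, which follows from min(a,b) = (a + b - |a - b|)/2 applied term by term.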
In this post I'm going to give a couple of superposition implementations of this equation. Note that in the BKO scheme most superpositions have coefficients >= 0, so we can use the min version of simm given above. First, we need to observe some correspondences:
\Sum_k x[k] <=> x.count_sum()
min(f[k],g[k]) <=> intersection(f,g)
rescale so w*f == w*g == 1 <=> f.normalize() and g.normalize()
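To make those correspondences concrete, here is a rough sketch that models a superposition as a plain Python dict from ket label to coefficient. That dict model and the three free functions are stand-ins invented for this example, not the real superposition class. It also assumes coefficients >= 0, so a label missing from one side contributes min(x,0) = 0 and can simply be dropped.

```python
def count_sum(sp):
    # \Sum_k x[k]
    return sum(sp.values())

def intersection(f, g):
    # min(f[k], g[k]) for each shared label; unshared labels contribute 0
    return {k: min(f[k], g[k]) for k in f.keys() & g.keys()}

def normalize(sp):
    # rescale so the coefficients sum to 1 (an all-zero input is left alone)
    total = count_sum(sp)
    if total == 0:
        return dict(sp)
    return {k: v / total for k, v in sp.items()}

f = {"a": 3, "b": 1}
g = {"a": 1, "c": 1}
print(count_sum(f))
print(intersection(f, g))
print(count_sum(intersection(normalize(f), normalize(g))))
```

The last line is the rescaled simm of f and g: after normalizing, f has 0.75 on "a" and g has 0.5 on "a", so the overlap is 0.5.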
And so, some Python:
# unscaled simm:
def unscaled_simm(A,B):
    wf = A.count_sum()
    wg = B.count_sum()
    if wf == 0 and wg == 0:
        return 0
    return intersection(A,B).count_sum()/max(wf,wg)

# weighted scaled simm:
def weighted_simm(w,A,B):
    A = multiply(w,A)
    B = multiply(w,B)
    return intersection(A.normalize(),B.normalize()).count_sum()

# standard use case simm:
def simm(A,B):
    if A.count() <= 1 and B.count() <= 1:
        a = A.ket()
        b = B.ket()
        if a.label != b.label:
            return 0
        a = max(a.value,0)                        # just making sure they are >= 0.
        b = max(b.value,0)
        if a == 0 and b == 0:                     # prevent div by zero.
            return 0
        return min(a,b)/max(a,b)
    return intersection(A.normalize(),B.normalize()).count_sum()
Most of the time I use the last one. Note that only the last one takes into account that you should not rescale when both superpositions have length 1. I didn't bother adding that check to the other two versions, since I rarely use them.
So why is this distinction important? Some examples:
If you rescale length 1 superpositions you get:
simm(|a>,2|a>) = 1
when you really want:
simm(|a>,2|a>) = 0.5
But this is only an issue if both superpositions are length 1. E.g., this case has no issues:
simm(|a>,|a> + |b>) = 0.5
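These three examples can be reproduced with a small self-contained sketch. As a simplification it models a superposition as a dict from label to coefficient, which is an assumption of this example and not the real superposition class, and rescaled_simm is a hypothetical name for the "always rescale" variant. Treating an empty superposition as similarity 0 is also my choice here.

```python
def normalize(sp):
    # rescale so the coefficients sum to 1 (an all-zero input is left alone)
    total = sum(sp.values())
    if total == 0:
        return dict(sp)
    return {k: v / total for k, v in sp.items()}

def rescaled_simm(f, g):
    # always rescales, even when both inputs are length 1
    f, g = normalize(f), normalize(g)
    return sum(min(f[k], g[k]) for k in f.keys() & g.keys())

def simm(f, g):
    # skip the rescale when both inputs are length 1 (or empty)
    if len(f) <= 1 and len(g) <= 1:
        if not f or not g:            # assumption: empty input means no similarity
            return 0
        label_a, a = next(iter(f.items()))
        label_b, b = next(iter(g.items()))
        if label_a != label_b:
            return 0
        a, b = max(a, 0), max(b, 0)   # just making sure they are >= 0.
        if a == 0 and b == 0:         # prevent div by zero.
            return 0
        return min(a, b) / max(a, b)
    return rescaled_simm(f, g)

print(rescaled_simm({"a": 1}, {"a": 2}))   # simm(|a>,2|a>) with rescaling
print(simm({"a": 1}, {"a": 2}))            # simm(|a>,2|a>) without rescaling
print(simm({"a": 1}, {"a": 1, "b": 1}))    # simm(|a>,|a> + |b>)
```

In the last case |a> + |b> normalizes to 0.5|a> + 0.5|b>, so the overlap with |a> is 0.5, which is why rescaling causes no trouble there.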
I guess that is it for this post! Heaps more to come!
updated: 19/12/2016
by Garry Morrison
email: garry -at- semantic-db.org