Recall:

simm(w,f,g) = (w*f + w*g - w*[f - g])/2.max(w*f,w*g)

simm(w,f,g) = \Sum_k w[k] min(f[k],g[k]) / max(w*f,w*g)

In this post I'm going to give a couple of superposition implementations of this equation. And we should note in the BKO scheme most superpositions have coeffs >= 0, so we can use the min version of simm given above. So, first we need to observe some correspondences:

\Sum_k x[k] <=> x.count_sum()

min[f[k],g[k]] <=> intersection(f,g)

rescale so w*f == w*g == 1 <=> f.normalize() and g.normalize()

And so, some python:

    # unscaled simm:
    def unscaled_simm(A,B):
        wf = A.count_sum()
        wg = B.count_sum()
        if wf == 0 and wg == 0:
            return 0
        return intersection(A,B).count_sum()/max(wf,wg)

    # weighted scaled simm:
    def weighted_simm(w,A,B):
        A = multiply(w,A)
        B = multiply(w,B)
        return intersection(A.normalize(),B.normalize()).count_sum()

    # standard use case simm:
    def simm(A,B):
        if A.count() <= 1 and B.count() <= 1:
            a = A.ket()
            b = B.ket()
            if a.label != b.label:
                return 0
            a = max(a.value,0)        # just making sure they are >= 0.
            b = max(b.value,0)
            if a == 0 and b == 0:     # prevent div by zero.
                return 0
            return min(a,b)/max(a,b)
        return intersection(A.normalize(),B.normalize()).count_sum()

Most of the time I use the last one. And I should note it is only the last one that takes into account that you should not rescale when both superpositions have length 1. I didn't bother adding that check to the other two versions, since I don't actually use them much at all.
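As a quick sanity check on unscaled_simm, here is a minimal sketch comparing the min form with the abs-difference form from the top of the post. It is not the project code: plain Python dicts (label -> coefficient) stand in for superpositions, simm_min and simm_abs are hypothetical names, w is taken to be 1 everywhere, and I am assuming w*f means \Sum_k |w[k] f[k]| and w*[f - g] means \Sum_k |w[k] f[k] - w[k] g[k]|.

```python
# Plain dicts (label -> coefficient) stand in for superpositions here.
# Assumptions: all coefficients are >= 0, and w[k] = 1 for every k.

def simm_min(f, g):
    """Min form: sum_k min(f[k], g[k]) / max(sum f, sum g)."""
    wf = sum(f.values())
    wg = sum(g.values())
    if wf == 0 and wg == 0:              # prevent div by zero
        return 0
    labels = set(f) | set(g)             # a missing label counts as coeff 0
    overlap = sum(min(f.get(k, 0), g.get(k, 0)) for k in labels)
    return overlap / max(wf, wg)

def simm_abs(f, g):
    """Abs form: (sum f + sum g - sum_k |f[k] - g[k]|) / (2 * max(sum f, sum g))."""
    wf = sum(f.values())
    wg = sum(g.values())
    if wf == 0 and wg == 0:              # prevent div by zero
        return 0
    labels = set(f) | set(g)
    diff = sum(abs(f.get(k, 0) - g.get(k, 0)) for k in labels)
    return (wf + wg - diff) / (2 * max(wf, wg))

f = {"a": 3, "b": 1, "c": 2}
g = {"b": 4, "c": 2, "d": 2}
print(simm_min(f, g), simm_abs(f, g))    # 0.375 0.375
```

The two agree because, for each k, a + b - |a - b| = 2 min(a,b) whenever a and b are both >= 0, which is why the min version is safe to use for most BKO superpositions.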

So why is this distinction important? Well, some examples:

If you rescale when both superpositions have length 1, you get:

simm(|a>,2|a>) = 1

when you really want:

simm(|a>,2|a>) = 0.5

But this is only an issue if both superpositions are length 1. For example, this case has no issues:

simm(|a>,|a> + |b>) = 0.5
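To try these three examples without the full project code, here is a minimal sketch, again with plain dicts (label -> coefficient) standing in for superpositions. The helpers count_sum, normalize and intersection are simplified stand-ins that I am assuming behave like the methods used above (normalize rescales so the coefficients sum to 1), and naive_simm is a hypothetical name for the version that always rescales.

```python
def count_sum(sp):
    """Sum of all coefficients."""
    return sum(sp.values())

def normalize(sp):
    """Rescale so the coefficients sum to 1 (left unchanged if the sum is 0)."""
    total = count_sum(sp)
    if total == 0:
        return dict(sp)
    return {k: v / total for k, v in sp.items()}

def intersection(f, g):
    """Element-wise min over the labels both superpositions share."""
    return {k: min(f[k], g[k]) for k in f if k in g}

def naive_simm(A, B):
    """Always rescales, even when both superpositions have length 1."""
    return count_sum(intersection(normalize(A), normalize(B)))

def simm(A, B):
    """Standard use case: skip the rescale when both have length <= 1."""
    if len(A) <= 1 and len(B) <= 1:
        if not A or not B:               # an empty superposition matches nothing
            return 0
        a_label, a = next(iter(A.items()))
        b_label, b = next(iter(B.items()))
        if a_label != b_label:
            return 0
        a, b = max(a, 0), max(b, 0)      # just making sure they are >= 0
        if a == 0 and b == 0:            # prevent div by zero
            return 0
        return min(a, b) / max(a, b)
    return naive_simm(A, B)

print(naive_simm({"a": 1}, {"a": 2}))    # 1.0, the unwanted result
print(simm({"a": 1}, {"a": 2}))          # 0.5
print(simm({"a": 1}, {"a": 1, "b": 1}))  # 0.5
```

Rescaling throws away the overall size of each superposition, so |a> and 2|a> become identical after normalize(). With only one ket on each side, the coefficient size is the only thing left to compare, hence the special case.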

I guess that is it for this post! Heaps more to come!

updated: 19/12/2016

by Garry Morrison

email: garry -at- semantic-db.org