Here is the Python:

    import math
    import cmath

    # define p'th roots of unity:
    def jpk(p,k):
        return cmath.exp(1j*2*math.pi*k/p)

    # define wf_k:
    def wf(vect):
        return sum(abs(x) for x in vect)

    # define wf^p:
    def wfp(vects):
        p = len(vects)
        i_max = len(vects[0])  # assume all vects are the same size as the first one.
        r1 = 0
        for i in range(i_max):
            r2 = 0
            for k in range(p):
                r2 += jpk(p,k)*vects[k][i]
            r1 += abs(r2)
        return r1

    def multi_simm(vects):
        p = len(vects)

        # sum over wf_k term:
        r1 = 0
        max_wf = 0
        for k in range(p):
            wf_k = wf(vects[k])
            max_wf = max(max_wf,wf_k)
            r1 += wf_k

        # wf^p term:
        r2 = wfp(vects)

        # p.max term:
        r3 = p*max_wf

        # prevent divide by 0:
        if r3 == 0:
            return 0

        # return result:
        return (r1 - r2)/r3

    def rescaled_multi_simm(vects):
        p = len(vects)
        i_max = len(vects[0])  # assume all vects are the same size as the first one.

        # find normalization terms:
        norms = []
        for k in range(p):
            wf_k = wf(vects[k])
            if wf_k == 0:  # prevent divide by zero
                return 0
            norms.append(wf_k)

        # find normalized wf^p:
        r1 = 0
        for i in range(i_max):
            r2 = 0
            for k in range(p):
                r2 += jpk(p,k)*vects[k][i]/norms[k]
            r1 += abs(r2)

        # return result:
        return 1 - r1/p

    # test the code:
    # (list_of_vects must be defined above this point; each test case below sets it.)
    print("wfp: %s" % wfp(list_of_vects))
    print("multi-simm: %s" % multi_simm(list_of_vects))
    print("rescaled-multi-simm: %s" % rescaled_multi_simm(list_of_vects))

Now, some test cases. First, all patterns equal, which should give 1, else we made a mistake!

    list_of_vects = [[2,3,4,5,6], [2,3,4,5,6], [2,3,4,5,6]]

    $ ./multi-simm.py
    wfp: 5.887076992907251e-15
    multi-simm: 0.9999999999999999
    rescaled-multi-simm: 0.9999999999999999

Next, all patterns "disjoint". This time we expect 0, else we made a mistake:

    list_of_vects = [[5,0,0,0], [0,-5,0,0], [0,0,-5,0], [0,0,0,5]]

    $ ./multi-simm.py
    wfp: 20.0
    multi-simm: 0.0
    rescaled-multi-simm: 0.0

Next, test that rescaling works and gives a different answer from the non-rescaled version:

    list_of_vects = [[2,3,4,5,6], [4,6,8,10,12], [6,9,12,15,18]]

    $ ./multi-simm.py
    wfp: 34.641016151377556
    multi-simm: 0.47421657693679137
    rescaled-multi-simm: 0.9999999999999999

And finally, a test case where we don't expect 0 or 1:

    list_of_vects = [[2,3,4,5,6], [6,5,4,3,2], [2,4,3,5,6], [2,4,5,3,6]]

    $ ./multi-simm.py
    wfp: 10.82842712474619
    multi-simm: 0.8646446609406727
    rescaled-multi-simm: 0.8646446609406726

Cool. It all seems to work as desired. Heh, now I need to find a use case for p > 2.
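If you would rather not edit list_of_vects by hand for each run, the four cases can be bundled into one self-checking script. What follows is only a sketch: the functions are a condensed restatement of the ones in this post (root_of_unity plays the role of jpk), while the CASES table and the 1e-9 tolerance are my own assumptions, not part of the original multi-simm.py. Exact equality is avoided because the equal-pattern runs come out as 0.9999999999999999 rather than 1.

```python
import cmath
import math

def root_of_unity(p, k):
    # k'th power of the primitive p'th root of unity (jpk in the post)
    return cmath.exp(2j * math.pi * k / p)

def wf(vect):
    # sum of absolute values of one pattern
    return sum(abs(x) for x in vect)

def wfp(vects):
    # sum over positions i of |sum_k root_of_unity(p, k) * vects[k][i]|
    p = len(vects)
    return sum(
        abs(sum(root_of_unity(p, k) * v[i] for k, v in enumerate(vects)))
        for i in range(len(vects[0]))  # assumes all vects have the same length
    )

def multi_simm(vects):
    p = len(vects)
    wfs = [wf(v) for v in vects]
    denom = p * max(wfs)
    if denom == 0:  # prevent divide by zero
        return 0
    return (sum(wfs) - wfp(vects)) / denom

def rescaled_multi_simm(vects):
    p = len(vects)
    norms = [wf(v) for v in vects]
    if 0 in norms:  # prevent divide by zero
        return 0
    # normalize each pattern by its own wf, then reuse wfp
    scaled = [[x / n for x in v] for v, n in zip(vects, norms)]
    return 1 - wfp(scaled) / p

# (patterns, expected multi_simm, expected rescaled_multi_simm);
# expected values are the ones reported in the runs above.
CASES = [
    ([[2, 3, 4, 5, 6], [2, 3, 4, 5, 6], [2, 3, 4, 5, 6]], 1.0, 1.0),
    ([[5, 0, 0, 0], [0, -5, 0, 0], [0, 0, -5, 0], [0, 0, 0, 5]], 0.0, 0.0),
    ([[2, 3, 4, 5, 6], [4, 6, 8, 10, 12], [6, 9, 12, 15, 18]],
     0.47421657693679137, 1.0),
    ([[2, 3, 4, 5, 6], [6, 5, 4, 3, 2], [2, 4, 3, 5, 6], [2, 4, 5, 3, 6]],
     0.8646446609406727, 0.8646446609406726),
]

for vects, want_simm, want_rescaled in CASES:
    # abs_tol is an assumed tolerance; it absorbs floating point round-off
    assert math.isclose(multi_simm(vects), want_simm, abs_tol=1e-9)
    assert math.isclose(rescaled_multi_simm(vects), want_rescaled, abs_tol=1e-9)
```

The script is silent when every case matches and raises an AssertionError otherwise, so new p > 2 examples can be appended to CASES as they turn up.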

updated: 19/12/2016

by Garry Morrison

email: garry -at- semantic-db.org