a list implementation of the simm

Briefly, here is the code for a list implementation of simm (as opposed to the superposition version). Note that it is unscaled: we do not rescale f and g so that w*f == w*g. I may give the code for that later (I don't think I have it typed up anywhere just yet).

Recall:
`simm(w,f,g) = (w*f + w*g - w*[f - g])/2.max(w*f,w*g)`
Python:
```
def list_simm(w,f,g):
    the_len = min(len(f),len(g))
    # pad w with 1's if it is shorter than f and g.
    # we build a new list here, so the caller's w is not modified:
    #  w = list(w) + [0] * (the_len - len(w))
    w = list(w) + [1] * (the_len - len(w))
    f = f[:the_len]
    g = g[:the_len]

    wf = sum(abs(w[k]*f[k]) for k in range(the_len))
    wg = sum(abs(w[k]*g[k]) for k in range(the_len))
    wfg = sum(abs(w[k]*f[k] - w[k]*g[k]) for k in range(the_len))

    if wf == 0 and wg == 0:
        return 0
    else:
        return (wf + wg - wfg)/(2*max(wf,wg))
```
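To get a feel for what "unscaled" means in practice, here is a small sketch with a few hand-checkable inputs. The name `list_simm_sketch` is hypothetical (it is just a zip-based restatement of `list_simm`, repeated so the snippet runs on its own), and it assumes Python 3 division.

```python
def list_simm_sketch(w, f, g):
    # hypothetical restatement of list_simm, repeated so this snippet is self-contained
    n = min(len(f), len(g))
    w = list(w) + [1] * (n - len(w))   # pad missing weights with 1
    triples = list(zip(w, f, g))       # zip truncates to the shortest list
    wf = sum(abs(wk * fk) for wk, fk, _ in triples)
    wg = sum(abs(wk * gk) for wk, _, gk in triples)
    wfg = sum(abs(wk * fk - wk * gk) for wk, fk, gk in triples)
    if wf == 0 and wg == 0:
        return 0
    return (wf + wg - wfg) / (2 * max(wf, wg))

print(list_simm_sketch([], [1, 2, 3], [1, 2, 3]))  # identical lists -> 1.0
print(list_simm_sketch([], [1, 0], [0, 1]))        # no overlap -> 0.0
print(list_simm_sketch([], [2, 2], [1, 1]))        # same shape, different size -> 0.5
print(list_simm_sketch([1, 0], [1, 5], [1, 9]))    # second position weighted out -> 1.0
```

The third case is the one to notice: f and g have the same shape, but because nothing is rescaled the result is 0.5 rather than 1.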
And that's it! Heaps more to come!

Update: OK. I wrote a rescaled version of list simm. I haven't tested it, but I think it is probably right :)
Python:
```
def rescaled_list_simm(w,f,g):
    the_len = min(len(f),len(g))
    # normalize lengths of our lists
    # (we build a new w here, so the caller's list is not modified):
    #  w = list(w) + [0] * (the_len - len(w))
    w = list(w) + [1] * (the_len - len(w))
    f = f[:the_len]
    g = g[:the_len]

    # rescale step, first find size:
    s1 = sum(abs(w[k]*f[k]) for k in range(the_len))
    s2 = sum(abs(w[k]*g[k]) for k in range(the_len))

    # if s1 == 0, or s2 == 0, we can't rescale:
    if s1 == 0 or s2 == 0:
        return 0

    # now rescale:
    # we just need w*f == w*g, the exact value doesn't matter, so we choose 1.
    # noting that our equation has symmetry under: "f => k.f, g => k.g"
    # also, note that finite precision floats mean sometimes it does matter, but hopefully we will be fine.
    f = [f[k]/s1 for k in range(the_len)]
    g = [g[k]/s2 for k in range(the_len)]

    # proceed with algo:
    # if we did the rescale step correctly we will have:
    # wf == wg == 1
    #  wf = sum(abs(w[k]*f[k]) for k in range(the_len))
    #  wg = sum(abs(w[k]*g[k]) for k in range(the_len))
    wfg = sum(abs(w[k]*f[k] - w[k]*g[k]) for k in range(the_len))

    # we should never have wf or wg == 0 in the rescaled case:
    #  if wf == 0 and wg == 0:
    #    return 0
    #  else:
    #    return (wf + wg - wfg)/(2*max(wf,wg))
    return (2 - wfg)/2
```
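A quick way to sanity check the rescaled version is to feed it inputs where the answer is known by hand. The sketch below is only that: `rescaled_simm_sketch` is a hypothetical compact restatement of `rescaled_list_simm` (it divides by s1 and s2 inside the sum instead of building new lists), and Python 3 division is assumed.

```python
import math

def rescaled_simm_sketch(w, f, g):
    # hypothetical compact restatement of rescaled_list_simm, for self-contained checking
    n = min(len(f), len(g))
    w = list(w) + [1] * (n - len(w))   # pad missing weights with 1
    s1 = sum(abs(w[k] * f[k]) for k in range(n))
    s2 = sum(abs(w[k] * g[k]) for k in range(n))
    if s1 == 0 or s2 == 0:
        return 0                       # can't rescale an all-zero list
    # after dividing by s1 and s2 we have w*f == w*g == 1
    wfg = sum(abs(w[k] * f[k] / s1 - w[k] * g[k] / s2) for k in range(n))
    return (2 - wfg) / 2

print(rescaled_simm_sketch([], [2, 2], [1, 1]))   # same shape -> 1.0
print(rescaled_simm_sketch([], [1, 0], [0, 1]))   # no overlap -> 0.0
print(rescaled_simm_sketch([], [1, 1], [1, 0]))   # partial overlap -> 0.5

# scaling f alone should not change the result (up to float rounding):
a = rescaled_simm_sketch([], [3, 1], [1, 2])
b = rescaled_simm_sketch([], [30, 10], [1, 2])
print(math.isclose(a, b))
```

Here [2,2] against [1,1] comes out as 1, since only the shape of the lists matters once they are rescaled.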
Update: OK. May as well do an implementation of Gaussian simm too:
```
gaussian-simm(s,f,g) = exp(-||f - g||^2/(2s))
```
Python:
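```
import math

# define Euclidean distance function:
# (note: on a length mismatch this prints a warning and returns 0,
#  which the Gaussian simm below then treats as "identical")
def ED(f,g):
    if len(f) != len(g):
        print("different length vectors!")
        return 0
    return math.sqrt(sum((f[k] - g[k])**2 for k in range(len(f))))

# define Gaussian simm:
# gaussian-simm(s,f,g) = exp(-||f - g||^2/(2s))
def guass_simm(s,f,g):
    # the parentheses matter: /2*s would divide by 2 and then multiply by s
    return math.exp(-ED(f,g)**2/(2*s))
```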
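As a worked example of the formula, here is a small sketch that uses the squared distance directly, so there is no square root to undo. The name `gaussian_simm_sketch` is hypothetical, and raising an error on mismatched lengths is my assumption rather than the behaviour of `ED` above. It reads the formula as dividing by 2s.

```python
import math

def gaussian_simm_sketch(s, f, g):
    # hypothetical helper: exp(-||f - g||^2 / (2s)), computed from the squared distance
    if len(f) != len(g):
        raise ValueError("f and g must have the same length")  # assumption: fail loudly
    sq_dist = sum((fk - gk) ** 2 for fk, gk in zip(f, g))
    return math.exp(-sq_dist / (2 * s))

print(gaussian_simm_sketch(25, [0, 0], [0, 0]))    # identical vectors -> 1.0
print(gaussian_simm_sketch(25, [0, 0], [3, 4]))    # ||f - g||^2 = 25, so exp(-0.5), about 0.607
print(gaussian_simm_sketch(2500, [0, 0], [3, 4]))  # larger s -> result moves closer to 1
```

So s acts as a width: the larger it is, the more slowly the similarity falls off with distance.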

updated: 19/12/2016
by Garry Morrison
email: garry -at- semantic-db.org