new function average categorize

This one I think will be super useful, and super powerful. I only just came up with it, so I haven't fully explored its properties, but I have already found some. And it's looking promising!

So the idea is, given a collection of points, how do we classify them into groups? I'm pretty sure this is a solved problem, but I wanted to see what I could come up with.

It goes something like this (though see the code for exact details). We have a list of currently known patterns (out_list), and a list of incoming patterns (one). For each incoming pattern (r), compare it with each of the known patterns (sp) using simm. Store the name and similarity value for the best match. At the end of the loop, if the best match is lower than a threshold (recall for simm, higher threshold is a better match, exactly 1 for exact match) then store it as its' own known pattern. Otherwise "average" it with the best matching known pattern, weighted based on its similarity. And in BKO land, simm scales/normalizes superpositions, so to average just means add the superpositions, and we don't need to divide by n.

Here is the code:

def average_categorize(context,parameters):
  try:
    op,t,phi,ave = parameters.split(',')
    t = float(t)
  except:
    return ket("",0)
    
  one = context.relevant_kets(op)
  out_list = []
  for x in one:
    r = x.apply_op(context,op)
    best_k = -1
    best_simm = 0
    for k,sp in enumerate(out_list):
      similarity = silent_simm(r,sp)
      if similarity > best_simm:
        best_k = k
        best_simm = similarity
    if best_k == -1 or best_simm < t:
      out_list.append(r)
    else:
      k = best_k
#      out_list[k] += r
      out_list[k] += r.multiply(best_simm)  # reweight based on result of simm.
  for k,sp in enumerate(out_list):
    context.learn(ave,phi + ": " + str(k+1),sp)
  return ket("average categorize")

Now, I may tweak this code yet, but it is already a good start. I will give a couple of examples in future posts.

Some properties:
1) it converges as you add more examples
2) the order you learn is important, earlier examples have a bigger effect than later ones
3) it reproduces the light-bulb effect, ie stronger examples have a bigger effect
4) it reproduces the reverse-light-bulb effect, ie, weaker examples have essentially no impact
5) I think it is roughly O(n^2), certainly cheaper than my previous categorize code.

I guess I need to explain what I mean by "stronger examples" and "weaker examples". Superpositions have a thing I call currency. Let's drop to the console:

sa: measure-currency |x>
|number: 1>

sa: measure-currency 2.71|y>
|number: 2.71>

sa: measure-currency (|a> + |b> + |c>)
|number: 3>

sa: measure-currency (0.2|a> + 4|b> + 11|c> + 0.3|d>)
|number: 15.5>

So when I talk of "stronger example" I mean r has a high currency, and a "weaker example" I mean r has a small currency.

If we look at the line:

      out_list[k] += r.multiply(best_simm)

then (ignoring the simm reweighting), the higher the currency of r, the more power it has to effect out_list[k]. I'm not sure that is clear! I'm working from mental images, and it is hard to translate them to words.

More to come!

BTW, I don't think I have mentioned this anywhere yet, but we can say an operator "op" preserves currency if:

measure-currency SP == measure-currency op SP

A related definition (and probably a slightly better one) for currency conservation is that if in matrix format the sum of each column is exactly 1, then we can say that operator conserves currency.

Home
previous: an interesting little observation
next: average categorize h i simple pattern recognition example

updated: 19/12/2016
by Garry Morrison
email: garry -at- semantic-db.org