Here is the Python:

    import math

    # e is a ket, X is a superposition
    # for best effect X should be a frequency list
    def normed_frequency_class(e,X):
        e = e.ket()                      # make sure e is a ket, not a superposition, else X.find_value(e) bugs out.
        X = X.drop()                     # drop elements with coeff <= 0
        smallest = X.find_min_coeff()    # return the min coeff in X as float
        largest = X.find_max_coeff()     # return the max coeff in X as float
        f = X.find_value(e)              # return the value of ket e in superposition X as float

        if largest <= 0 or f <= 0:       # otherwise the math.log() blows up!
            return 0

        # NB: the + 1 is important, else the smallest element in X gets reported as not in set.
        fc_max = math.floor(0.5 - math.log(smallest/largest,2)) + 1
        return 1 - math.floor(0.5 - math.log(f/largest,2))/fc_max

The motivation for this function is the frequency class equation given on Wikipedia.

    N = floor(1/2 - log_2(frequency-of-this-item/frequency-of-most-common-item))

All I have done is normalize it, so that the result is 1 for the best match and 0 for not in set.
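As a quick sanity check on the un-normalized equation, here is a minimal Python sketch. The function name frequency_class is only an illustrative label, not part of the project's code, and the two frequencies are borrowed from the |Z> example used in this post.

```python
import math

def frequency_class(freq, most_common_freq):
    # Hypothetical helper: the Wikipedia-style frequency class,
    # floor(1/2 - log_2(freq / most_common_freq)).
    return math.floor(0.5 - math.log(freq / most_common_freq, 2))

# The most common item always lands in class 0.
print(frequency_class(3789654, 3789654))  # prints 0

# A rarer item lands in a higher class.
print(frequency_class(57897, 3789654))    # prints 6
```

So a larger class number means a rarer item, which is why the normalized version subtracts the scaled class from 1.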

I guess I should give some examples. Let's load up some knowledge in the console:

    sa: load normed-frequency-class-examples.sw
    sa: dump
    ----------------------------------------
    |context> => |context: normed frequency class>

    |X> => |the> + |he> + |king> + |boy> + |outrageous> + |stringy> + |transduction> + |mouse>
    |Y> => 13|the> + 13|he> + 13|king> + 13|boy> + 13|outrageous> + 13|stringy> + 13|transduction> + 13|mouse>
    |Z> => 3789654|the> + 2098762|he> + 57897|king> + 56975|boy> + 76|outrageous> + 5|stringy> + |transduction> + |mouse>

    the |*> #=> ket-nfc(|the>,""|_self>)
    he |*> #=> ket-nfc(|he>,""|_self>)
    king |*> #=> ket-nfc(|king>,""|_self>)
    boy |*> #=> ket-nfc(|boy>,""|_self>)
    outrageous |*> #=> ket-nfc(|outrageous>,""|_self>)
    stringy |*> #=> ket-nfc(|stringy>,""|_self>)
    transduction |*> #=> ket-nfc(|transduction>,""|_self>)
    mouse |*> #=> ket-nfc(|mouse>,""|_self>)
    not-in-set |*> #=> ket-nfc(|not-in-set>,""|_self>)

    |nfc table> #=> table[SP,the,he,king,boy,outrageous,stringy,transduction,mouse,not-in-set] split |X Y Z>
    ----------------------------------------

    -- now take a look at the table:
    sa: "" |nfc table>
    +----+-----+----------+----------+----------+------------+----------+--------------+----------+------------+
    | SP | the | he       | king     | boy      | outrageous | stringy  | transduction | mouse    | not-in-set |
    +----+-----+----------+----------+----------+------------+----------+--------------+----------+------------+
    | X  | nfc | nfc      | nfc      | nfc      | nfc        | nfc      | nfc          | nfc      | 0 nfc      |
    | Y  | nfc | nfc      | nfc      | nfc      | nfc        | nfc      | nfc          | nfc      | 0 nfc      |
    | Z  | nfc | 0.96 nfc | 0.74 nfc | 0.74 nfc | 0.30 nfc   | 0.13 nfc | 0.04 nfc     | 0.04 nfc | 0 nfc      |
    +----+-----+----------+----------+----------+------------+----------+--------------+----------+------------+

And we can clearly see it has the properties promised above. |X> and |Y> give the same results even though X has all coeffs 1, and Y has all coeffs 13. "not-in-set" returned 0, since it is not in any of the three superpositions. And Z gives a nice demonstration of the fuzzy set membership idea.
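For readers without the console at hand, the numbers in the table can be checked with a rough stand-alone sketch. It assumes a plain Python dict in place of a superposition, and normed_frequency_class_dict is a hypothetical port written for illustration, not the project's own function; the frequencies are the ones in |X>, |Y> and |Z> above.

```python
import math

def normed_frequency_class_dict(item, freq):
    # Hypothetical dict-based stand-in for the ket/superposition version.
    positive = {k: v for k, v in freq.items() if v > 0}  # mirrors X.drop()
    if not positive:
        return 0
    smallest = min(positive.values())
    largest = max(positive.values())
    f = positive.get(item, 0)          # 0 if the item is not in the set
    if largest <= 0 or f <= 0:         # avoid math.log() of a non-positive value
        return 0
    fc_max = math.floor(0.5 - math.log(smallest / largest, 2)) + 1
    return 1 - math.floor(0.5 - math.log(f / largest, 2)) / fc_max

Z = {"the": 3789654, "he": 2098762, "king": 57897, "boy": 56975,
     "outrageous": 76, "stringy": 5, "transduction": 1, "mouse": 1}
X = {word: 1 for word in Z}    # all coeffs 1
Y = {word: 13 for word in Z}   # all coeffs 13

def nfc_row(freq):
    # One table row: every word in Z plus the "not-in-set" column.
    words = list(Z) + ["not-in-set"]
    return [round(normed_frequency_class_dict(w, freq), 2) for w in words]

for name, freq in (("X", X), ("Y", Y), ("Z", Z)):
    print(name, nfc_row(freq))
```

The X and Y rows come out identical because only the ratios f/largest and smallest/largest enter the formula, so a common scale factor cancels.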

That's it for this post. We will be putting it to use in the next couple of posts.

Update: for frequency lists, log(f/largest) is the best choice. But in other cases, maybe some other foo(f/largest) would work better. I haven't given it all that much thought.

updated: 19/12/2016

by Garry Morrison

email: garry -at- semantic-db.org