the normed frequency class equation
In this post I will give the normed frequency class equation. It can be considered a type of fuzzy set membership function. If all coefficients in a superposition X are equal, it returns 1 if a ket is in X and 0 if that ket is not in X. If the coefficients are not all equal, it returns graded values between 0 and 1.
Here is the Python:
import math

# e is a ket, X is a superposition
# for best effect X should be a frequency list
def normed_frequency_class(e, X):
    e = e.ket()                      # make sure e is a ket, not a superposition, else X.find_value(e) bugs out.
    X = X.drop()                     # drop elements with coeff <= 0
    smallest = X.find_min_coeff()    # return the min coeff in X as float
    largest = X.find_max_coeff()     # return the max coeff in X as float
    f = X.find_value(e)              # return the value of ket e in superposition X as float
    if largest <= 0 or f <= 0:       # otherwise the math.log() blows up!
        return 0
    # NB: the + 1 is important, else the smallest element in X gets reported as not in set.
    fc_max = math.floor(0.5 - math.log(smallest/largest, 2)) + 1
    return 1 - math.floor(0.5 - math.log(f/largest, 2))/fc_max
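To try the function without the project's superposition class, here is a minimal sketch that substitutes a plain dict of label to coefficient. The dict representation and the name `nfc_dict` are assumptions made for illustration, and it uses `math.log2` rather than `math.log(x, 2)`; it is not the project's own code.

```python
import math

def nfc_dict(label, freqs):
    """Normed frequency class of `label` in `freqs` (hypothetical dict stand-in for a superposition)."""
    positive = {k: v for k, v in freqs.items() if v > 0}  # plays the role of X.drop()
    f = positive.get(label, 0)                            # plays the role of X.find_value(e)
    if not positive or f <= 0:                            # avoid log of zero
        return 0
    largest = max(positive.values())
    smallest = min(positive.values())
    fc_max = math.floor(0.5 - math.log2(smallest / largest)) + 1
    return 1 - math.floor(0.5 - math.log2(f / largest)) / fc_max

# With equal coefficients the function acts like ordinary set membership.
crisp = {"the": 13, "he": 13, "king": 13}
print(nfc_dict("king", crisp))        # 1.0
print(nfc_dict("not-in-set", crisp))  # 0
```

With equal coefficients both logs are log2(1) = 0, so fc_max is 1 and every member scores exactly 1.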
The motivation for this function is the frequency class equation given on Wikipedia:
N = floor(1/2 - log_2(frequency-of-this-item/frequency-of-most-common-item))
All I have done is normalize it so that it returns 1 for the best match and 0 for not in set.
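Written out, the normalization performed by the Python above looks like this. The symbols are chosen here for illustration and are not from the original post: f_e is the coefficient of e in X, and f_min and f_max are the smallest and largest positive coefficients in X.

```latex
N(e) = \left\lfloor \tfrac{1}{2} - \log_2 \frac{f_e}{f_{\max}} \right\rfloor,
\qquad
N_{\max} = \left\lfloor \tfrac{1}{2} - \log_2 \frac{f_{\min}}{f_{\max}} \right\rfloor + 1,
\qquad
\mathrm{nfc}(e, X) =
\begin{cases}
1 - \dfrac{N(e)}{N_{\max}} & \text{if } f_e > 0 \\
0 & \text{otherwise}
\end{cases}
```

The most common item has N(e) = 0 and so scores 1. The least common item has N(e) = N_max - 1 and so scores 1/N_max, which is small but non-zero. That is the job of the + 1.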
I guess I should give some examples. Let's load up some knowledge in the console:
sa: load normed-frequency-class-examples.sw
sa: dump
----------------------------------------
|context> => |context: normed frequency class>
|X> => |the> + |he> + |king> + |boy> + |outrageous> + |stringy> + |transduction> + |mouse>
|Y> => 13|the> + 13|he> + 13|king> + 13|boy> + 13|outrageous> + 13|stringy> + 13|transduction> + 13|mouse>
|Z> => 3789654|the> + 2098762|he> + 57897|king> + 56975|boy> + 76|outrageous> + 5|stringy> + |transduction> + |mouse>
the |*> #=> ket-nfc(|the>,""|_self>)
he |*> #=> ket-nfc(|he>,""|_self>)
king |*> #=> ket-nfc(|king>,""|_self>)
boy |*> #=> ket-nfc(|boy>,""|_self>)
outrageous |*> #=> ket-nfc(|outrageous>,""|_self>)
stringy |*> #=> ket-nfc(|stringy>,""|_self>)
transduction |*> #=> ket-nfc(|transduction>,""|_self>)
mouse |*> #=> ket-nfc(|mouse>,""|_self>)
not-in-set |*> #=> ket-nfc(|not-in-set>,""|_self>)
|nfc table> #=> table[SP,the,he,king,boy,outrageous,stringy,transduction,mouse,not-in-set] split |X Y Z>
----------------------------------------
-- now take a look at the table:
sa: "" |nfc table>
+----+-----+----------+----------+----------+------------+----------+--------------+----------+------------+
| SP | the | he       | king     | boy      | outrageous | stringy  | transduction | mouse    | not-in-set |
+----+-----+----------+----------+----------+------------+----------+--------------+----------+------------+
| X  | nfc | nfc      | nfc      | nfc      | nfc        | nfc      | nfc          | nfc      | 0 nfc      |
| Y  | nfc | nfc      | nfc      | nfc      | nfc        | nfc      | nfc          | nfc      | 0 nfc      |
| Z  | nfc | 0.96 nfc | 0.74 nfc | 0.74 nfc | 0.30 nfc   | 0.13 nfc | 0.04 nfc     | 0.04 nfc | 0 nfc      |
+----+-----+----------+----------+----------+------------+----------+--------------+----------+------------+
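And we can clearly see it has the properties promised above. A bare "nfc" in the table means coefficient 1, since coefficients of 1 are not printed. |X> and |Y> give the same results even though X has all coeffs 1 and Y has all coeffs 13. "not-in-set" returned 0, since it is not in any of the three superpositions. And Z gives a nice demonstration of the fuzzy set membership idea: the values fall from 1 for "the" down to 0.04 for the rarest words.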
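The Z row can be checked by hand. The sketch below copies the coefficients from |Z> into a plain dict (an assumption made for illustration, not the project's superposition class) and prints each word's raw frequency class next to its normed value.

```python
import math

# Coefficients copied from |Z> in the dump above.
Z = {"the": 3789654, "he": 2098762, "king": 57897, "boy": 56975,
     "outrageous": 76, "stringy": 5, "transduction": 1, "mouse": 1}

largest, smallest = max(Z.values()), min(Z.values())

# Raw (un-normed) frequency class of each word, relative to the most common word.
classes = {w: math.floor(0.5 - math.log2(f / largest)) for w, f in Z.items()}

# The normalizing constant, including the + 1.
fc_max = math.floor(0.5 - math.log2(smallest / largest)) + 1

for word, n in classes.items():
    print(f"{word:13s} class {n:2d}  nfc {1 - n / fc_max:.2f}")
```

The raw classes come out as 0, 1, 6, 6, 16, 20, 22, 22 with fc_max = 23, so for example "he" scores 1 - 1/23, which rounds to 0.96, matching the table.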
That's it for this post. We will be putting it to use in the next couple of posts.
Update: for frequency lists, log(f/largest) is the best choice. But in other cases, maybe some other foo(f/largest) would work better. I haven't given it all that much thought.
updated: 19/12/2016
by Garry Morrison
email: garry -at- semantic-db.org