what do we know about bananas
Now that we have a subset of Wikipedia in sw format, let's look around and see what we have.
There are 4 things that might be interesting to ask (with this data-set):
links-to |wikipage>
inverse-links-to |wikipage>
similar[links-to] |wikipage>
similar[inverse-links-to] |wikipage>
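To make the first two queries concrete, here is a minimal Python sketch on a hypothetical toy link graph. It is not the sw implementation: the page data is invented, and counting each linking page once in the inverse map is an assumption (chosen because the inverse table in this post shows only coefficients of 1).

```python
from collections import Counter, defaultdict

# Hypothetical toy data: page -> Counter of the pages it links to,
# where each count is the number of times that link appears.
links_to = {
    "Banana": Counter({"Southeast_Asia": 2, "fruit": 1}),
    "Mango": Counter({"fruit": 1, "India": 1}),
    "Thai_cuisine": Counter({"Banana": 1, "Mango": 1}),
    "Jell-O": Counter({"Banana": 3}),
}

def invert(forward):
    """Build target -> Counter of source pages from a source -> targets map.

    Assumption: each linking page is recorded with coefficient 1,
    no matter how many times it links to the target.
    """
    inverse = defaultdict(Counter)
    for source, targets in forward.items():
        for target in targets:
            inverse[target][source] = 1
    return dict(inverse)

inverse_links_to = invert(links_to)

print(links_to["Banana"])          # pages Banana links to
print(inverse_links_to["Banana"])  # pages that link to Banana
```

The two similar[...] queries then compare these per-page link lists against each other, rather than reading them off directly.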
So, let's start with bananas:
sa: table[wikipage,coeff] select[1,50] coeff-sort links-to |WP: Banana>
+-------------------------------------------------+-------+
| wikipage | coeff |
+-------------------------------------------------+-------+
| Musa_balbisiana | 5 |
| Southeast_Asia | 5 |
| Musa_acuminata | 4 |
| Kerala | 4 |
| Musa_(genus) | 3 |
| starch | 3 |
| Philippines | 3 |
| India | 3 |
| Plantain_(true) | 3 |
| Chiquita_Brands_International | 3 |
| Malaysia | 3 |
| Indonesia | 3 |
| Thailand | 3 |
| Central_America | 3 |
| Uganda | 3 |
| Panama_disease | 3 |
| fruit | 2 |
| herbaceous | 2 |
| Papua_New_Guinea | 2 |
| fiber | 2 |
| #Cavendish | 2 |
| List_of_banana_cultivars | 2 |
| Musa_velutina | 2 |
| Fe'i_banana | 2 |
| Ensete_ventricosum | 2 |
| Musaceae | 2 |
| pseudostem | 2 |
| inflorescence | 2 |
| List_of_banana_cultivars#AAB_Group | 2 |
| Synonym_(taxonomy) | 2 |
| Palestine | 2 |
| Madagascar | 2 |
| China | 2 |
| Dole_Food_Company | 2 |
| Burundi | 2 |
| Rwanda | 2 |
| Hybrid_(biology) | 2 |
| genetic_engineering | 2 |
| Food_and_Agriculture_Organization | 2 |
| Guatemala | 2 |
| ICTSD | 2 |
| Fyffes | 2 |
| United_Fruit_Company | 2 |
| coffee | 2 |
| International_Institute_of_Tropical_Agriculture | 2 |
| Daily_Value | 2 |
| potatoes | 2 |
| dopamine | 2 |
| South_Asia | 2 |
| Pisang_goreng | 2 |
+-------------------------------------------------+-------+
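The query above reads right to left: links-to is applied to the page, the result is sorted by coefficient, and the first 50 entries are kept for the table. A rough Python sketch of the sort-and-slice part follows. The function names `coeff_sort` and `select` are hypothetical stand-ins, and the tie-breaking behaviour (ties keep their input order) is an assumption rather than something the post confirms.

```python
def coeff_sort(pairs):
    """Sort (label, coeff) pairs by coefficient, largest first.

    Python's sort is stable, so equal coefficients keep their input order.
    """
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

def select(pairs, start, end):
    """Keep elements start..end, counting from 1 and including both ends."""
    return pairs[start - 1:end]

# A few rows taken from the table above, deliberately shuffled.
links = [
    ("fruit", 2),
    ("Musa_balbisiana", 5),
    ("Kerala", 4),
    ("Southeast_Asia", 5),
]

top = select(coeff_sort(links), 1, 3)
for label, coeff in top:
    print(f"{label:20} {coeff}")
```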
sa: table[wikipage,coeff] select[1,50] coeff-sort inverse-links-to |WP: Banana>
+-----------------------+-------+
| wikipage | coeff |
+-----------------------+-------+
| Economy_of_Costa_Rica | 1 |
| Flavor | 1 |
| Economy_of_Honduras | 1 |
| Economy_of_Jamaica | 1 |
| Jericho | 1 |
| Jell-O | 1 |
| Economy_of_Malawi | 1 |
| Neapolitan_ice_cream | 1 |
| Thai_cuisine | 1 |
| Vitamin_C | 1 |
| Zingiberales | 1 |
+-----------------------+-------+
sa: table[wikipage,coeff] select[1,50] 100 self-similar[links-to] |WP: Banana>
+---------------------------------------------------------------------------------+--------+
| wikipage | coeff |
+---------------------------------------------------------------------------------+--------+
| Banana | 100.0 |
| Cooking_plantain | 15.096 |
| ITU_prefix | 9.52 |
| Wikipedia:Status_of_the_porting_of_U.S._Dept_of_State_info | 9.088 |
| United_Nations_Industrial_Development_Organization | 8.376 |
| Wikipedia:Status_of_the_porting_of_the_CIA_World_Factbook | 8.096 |
| Abac | 8.036 |
| International_Tropical_Timber_Agreement,_1994 | 7.813 |
| Deforestation | 7.483 |
| International_Tropical_Timber_Agreement,_1983 | 7.366 |
| Western_imperialism_in_Asia | 6.983 |
| International_Hydrographic_Organization | 6.92 |
| Foreign_relations_of_China | 6.806 |
| Lions_Clubs_International | 6.326 |
| Asia | 6.242 |
| Passport | 6.24 |
| Diaspora | 5.931 |
| International_Electrotechnical_Commission | 5.925 |
| Member_states_of_the_United_Nations | 5.645 |
| Indian_Ocean | 5.625 |
| History_of_Southeast_Asia | 5.62 |
| Solanaceae | 5.581 |
| Tiger | 5.56 |
| Kyoto_Protocol | 5.527 |
| Curry | 5.495 |
| Sugar | 5.41 |
| Blood_alcohol_content | 5.389 |
| List_of_mountains | 5.388 |
| Rice | 5.37 |
| Southeast_Asia | 5.361 |
| Ceiba_pentandra | 5.357 |
| Pigeon_pea | 5.357 |
| Rat | 5.219 |
| Abugida | 5.116 |
| Chinatown | 5.1 |
| Video_CD | 5.086 |
| History_of_the_Pacific_Islands | 5.052 |
| Time_zone | 5.037 |
| Asian_Development_Bank | 5.035 |
| Hindu | 5.03 |
| Foreign_relations_of_Indonesia | 5.004 |
| Foreign_relations_of_Singapore | 5.002 |
| Economy_of_Burma | 4.993 |
| Spice | 4.952 |
| Convention_on_Fishing_and_Conservation_of_the_Living_Resources_of_the_High_Seas | 4.911 |
| Foreign_relations_of_Taiwan | 4.842 |
| Tamil_language | 4.828 |
| Risk_(game) | 4.792 |
| Foreign_relations_of_the_Philippines | 4.776 |
| Numismatics | 4.748 |
+---------------------------------------------------------------------------------+--------+
Time taken: 10 minutes, 7 seconds, 599 milliseconds
sa: table[wikipage,coeff] select[1,100] 100 self-similar[inverse-links-to] |WP: Banana>
+-----------------------------------------+--------+
| wikipage | coeff |
+-----------------------------------------+--------+
| Banana | 100.0 |
| Grape | 27.273 |
| Pineapple | 27.273 |
| Mango | 27.273 |
| Ascorbic_acid | 18.182 |
| Cranberry | 18.182 |
| Kiwifruit | 18.182 |
| Pear | 18.182 |
| Tea | 18.182 |
| Vanilla | 18.182 |
| Chicken | 18.182 |
| Peach | 18.182 |
| apparel | 18.182 |
| lime_(fruit) | 18.182 |
| pudding | 18.182 |
| mangosteen | 18.182 |
| Coconut | 18.182 |
| gelatin_dessert | 18.182 |
| Lemon | 18.182 |
| galangal | 18.182 |
| Melon | 18.182 |
| Grapefruit | 18.182 |
| Raspberry | 18.182 |
| Apricot | 18.182 |
| Watermelon | 18.182 |
| Coffee | 16.667 |
| Strawberry | 15.789 |
| Cherry | 15.385 |
| custard | 15.385 |
| Pork | 15.385 |
| Sugar | 14.286 |
| Apple | 14.286 |
| condensed_milk | 14.286 |
| Garden_strawberry | 14.286 |
| Blackberry | 13.333 |
| raspberry | 13.333 |
| Fruit | 12.5 |
| pistachio | 12.5 |
| Orange_(fruit) | 12.5 |
| tapioca | 12 |
| Tomato | 11.765 |
| Center_for_Economic_and_Policy_Research | 11.111 |
| Lime_(fruit) | 10.526 |
| turmeric | 10.526 |
| pineapple | 10.256 |
| Domestic_sheep | 9.524 |
| broccoli | 9.524 |
| Adobe | 9.091 |
| Analytical_chemistry | 9.091 |
| Commelinales | 9.091 |
| Calf | 9.091 |
| Celebrity | 9.091 |
| Cream | 9.091 |
| Buddhist_cuisine | 9.091 |
| Celery | 9.091 |
| Chocolate | 9.091 |
| Cola | 9.091 |
| Ester | 9.091 |
| Epipaleolithic | 9.091 |
| Food_additive | 9.091 |
| Glycine | 9.091 |
| Gelatin | 9.091 |
| Gelatin_dessert | 9.091 |
| James_Lind | 9.091 |
| Kathleen_Kenyon | 9.091 |
| Ceiba_pentandra | 9.091 |
| Mya_(unit) | 9.091 |
| Monosaccharide | 9.091 |
| Parsley | 9.091 |
| Potato | 9.091 |
| Pia_colada | 9.091 |
| Spice | 9.091 |
| Soft_drink | 9.091 |
| Economy_of_South_Africa | 9.091 |
| Scurvy | 9.091 |
| Tarsiidae | 9.091 |
| Zingiberaceae | 9.091 |
| Beef | 9.091 |
| Cooking_plantain | 9.091 |
| Casimir_Funk | 9.091 |
| Pigeon_pea | 9.091 |
| Pistachio | 9.091 |
| Cod | 9.091 |
| Tropical | 9.091 |
| tadpole | 9.091 |
| water_buffalo | 9.091 |
| Phosphoric_acid | 9.091 |
| Tartaric_acid | 9.091 |
| Citric_acid | 9.091 |
| weak_acid | 9.091 |
| Lactic_acid | 9.091 |
| Dietary_Reference_Intake | 9.091 |
| Canaanite_language | 9.091 |
| cell_adhesion_molecule | 9.091 |
| salivary_gland | 9.091 |
| bananas | 9.091 |
| gourami | 9.091 |
| fishing_industry | 9.091 |
| Molding_(process) | 9.091 |
| Carrot | 9.091 |
+-----------------------------------------+--------+
Time taken: 14 minutes, 27 seconds, 470 milliseconds
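Many coefficients in the table above look like multiples of 1/11 (27.273, 18.182, 9.091), and Banana has exactly 11 inverse links, each with coefficient 1. That pattern is consistent with a similarity measure of the form "shared weight divided by the larger of the two total weights", scaled by 100. The sketch below assumes that form; it is an illustration, not the actual similar[...] code, and the `other` page with its two `hypothetical_page_*` entries is invented for the example.

```python
def simm(f, g):
    """Shared weight over the larger total weight; 0.0 if both are empty.

    Assumed form: sum_k min(f[k], g[k]) / max(sum(f), sum(g)).
    """
    total = max(sum(f.values()), sum(g.values()))
    if total == 0:
        return 0.0
    overlap = sum(min(weight, g.get(page, 0)) for page, weight in f.items())
    return overlap / total

# The 11 pages that link to Banana, from the inverse-links-to table.
banana = dict.fromkeys([
    "Economy_of_Costa_Rica", "Flavor", "Economy_of_Honduras",
    "Economy_of_Jamaica", "Jericho", "Jell-O", "Economy_of_Malawi",
    "Neapolitan_ice_cream", "Thai_cuisine", "Vitamin_C", "Zingiberales",
], 1)

# Hypothetical page sharing 3 inverse links with Banana.
other = dict.fromkeys([
    "Flavor", "Jell-O", "Thai_cuisine",
    "hypothetical_page_1", "hypothetical_page_2",
], 1)

score = round(100 * simm(banana, other), 3)
print(score)  # 3 shared out of max(11, 5) = 11
```

Under this assumption, a score of 18.182 would mean two shared linking pages out of 11, and 9.091 would mean one.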
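So, similar[inverse-links-to] works best, and quite noticeably better too! Its top results are almost all fruit and other foods, while similar[links-to] mostly returns pages about countries, regions and international organisations. Exactly why, I'm not sure. Anyway, now we know which one to apply to more examples. I'll do that in the next post.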
updated: 19/12/2016
by Garry Morrison
email: garry -at- semantic-db.org