what do we know about bananas

Now that we have a subset of Wikipedia in sw format, we need to look around and see what we have.

There are 4 things that might be interesting to ask of this data set:
links-to |wikipage>
inverse-links-to |wikipage>
similar[links-to] |wikipage>
similar[inverse-links-to] |wikipage>
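
As a rough illustration of what the first two operators mean, here is a minimal Python sketch. It is not the semantic-db implementation: the toy pages and the names `raw_links`, `links_to` and `invert` are made up for this example. It assumes that a links-to coefficient counts how often a link appears on a page, and that inverse-links-to counts each linking page once.

```python
from collections import Counter, defaultdict

# Toy data (hypothetical): page -> outgoing links, in the order they appear in the page text.
raw_links = {
    "Banana": ["Musa_acuminata", "Kerala", "Musa_acuminata", "fruit"],
    "Thai_cuisine": ["Banana", "Curry", "Rice"],
    "Jell-O": ["Banana", "Gelatin"],
}

# links-to: repeated links are counted, so the coefficient is the number of occurrences
# (an assumption about how the coefficients arise).
links_to = {page: Counter(targets) for page, targets in raw_links.items()}


def invert(links):
    """Build inverse-links-to: target page -> {linking page: 1}."""
    inverse = defaultdict(dict)
    for page, targets in links.items():
        for target in targets:
            inverse[target][page] = 1  # each linking page is counted once
    return dict(inverse)


inverse_links_to = invert(links_to)

print(links_to["Banana"].most_common())    # outgoing links, highest coefficient first
print(sorted(inverse_links_to["Banana"]))  # pages that link to Banana
```

The two similar[...] operators then compare one page's link pattern against every other page's, which is presumably why they take minutes rather than seconds.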

So, let's start with bananas:
sa: table[wikipage,coeff] select[1,50] coeff-sort links-to |WP: Banana>
+-------------------------------------------------+-------+
| wikipage                                        | coeff |
+-------------------------------------------------+-------+
| Musa_balbisiana                                 | 5     |
| Southeast_Asia                                  | 5     |
| Musa_acuminata                                  | 4     |
| Kerala                                          | 4     |
| Musa_(genus)                                    | 3     |
| starch                                          | 3     |
| Philippines                                     | 3     |
| India                                           | 3     |
| Plantain_(true)                                 | 3     |
| Chiquita_Brands_International                   | 3     |
| Malaysia                                        | 3     |
| Indonesia                                       | 3     |
| Thailand                                        | 3     |
| Central_America                                 | 3     |
| Uganda                                          | 3     |
| Panama_disease                                  | 3     |
| fruit                                           | 2     |
| herbaceous                                      | 2     |
| Papua_New_Guinea                                | 2     |
| fiber                                           | 2     |
| #Cavendish                                      | 2     |
| List_of_banana_cultivars                        | 2     |
| Musa_velutina                                   | 2     |
| Fe'i_banana                                     | 2     |
| Ensete_ventricosum                              | 2     |
| Musaceae                                        | 2     |
| pseudostem                                      | 2     |
| inflorescence                                   | 2     |
| List_of_banana_cultivars#AAB_Group              | 2     |
| Synonym_(taxonomy)                              | 2     |
| Palestine                                       | 2     |
| Madagascar                                      | 2     |
| China                                           | 2     |
| Dole_Food_Company                               | 2     |
| Burundi                                         | 2     |
| Rwanda                                          | 2     |
| Hybrid_(biology)                                | 2     |
| genetic_engineering                             | 2     |
| Food_and_Agriculture_Organization               | 2     |
| Guatemala                                       | 2     |
| ICTSD                                           | 2     |
| Fyffes                                          | 2     |
| United_Fruit_Company                            | 2     |
| coffee                                          | 2     |
| International_Institute_of_Tropical_Agriculture | 2     |
| Daily_Value                                     | 2     |
| potatoes                                        | 2     |
| dopamine                                        | 2     |
| South_Asia                                      | 2     |
| Pisang_goreng                                   | 2     |
+-------------------------------------------------+-------+

sa: table[wikipage,coeff] select[1,50] coeff-sort inverse-links-to |WP: Banana>
+-----------------------+-------+
| wikipage              | coeff |
+-----------------------+-------+
| Economy_of_Costa_Rica | 1     |
| Flavor                | 1     |
| Economy_of_Honduras   | 1     |
| Economy_of_Jamaica    | 1     |
| Jericho               | 1     |
| Jell-O                | 1     |
| Economy_of_Malawi     | 1     |
| Neapolitan_ice_cream  | 1     |
| Thai_cuisine          | 1     |
| Vitamin_C             | 1     |
| Zingiberales          | 1     |
+-----------------------+-------+

sa: table[wikipage,coeff] select[1,50] 100 self-similar[links-to] |WP: Banana>
+---------------------------------------------------------------------------------+--------+
| wikipage                                                                        | coeff  |
+---------------------------------------------------------------------------------+--------+
| Banana                                                                          | 100.0  |
| Cooking_plantain                                                                | 15.096 |
| ITU_prefix                                                                      | 9.52   |
| Wikipedia:Status_of_the_porting_of_U.S._Dept_of_State_info                      | 9.088  |
| United_Nations_Industrial_Development_Organization                              | 8.376  |
| Wikipedia:Status_of_the_porting_of_the_CIA_World_Factbook                       | 8.096  |
| Abac                                                                            | 8.036  |
| International_Tropical_Timber_Agreement,_1994                                   | 7.813  |
| Deforestation                                                                   | 7.483  |
| International_Tropical_Timber_Agreement,_1983                                   | 7.366  |
| Western_imperialism_in_Asia                                                     | 6.983  |
| International_Hydrographic_Organization                                         | 6.92   |
| Foreign_relations_of_China                                                      | 6.806  |
| Lions_Clubs_International                                                       | 6.326  |
| Asia                                                                            | 6.242  |
| Passport                                                                        | 6.24   |
| Diaspora                                                                        | 5.931  |
| International_Electrotechnical_Commission                                       | 5.925  |
| Member_states_of_the_United_Nations                                             | 5.645  |
| Indian_Ocean                                                                    | 5.625  |
| History_of_Southeast_Asia                                                       | 5.62   |
| Solanaceae                                                                      | 5.581  |
| Tiger                                                                           | 5.56   |
| Kyoto_Protocol                                                                  | 5.527  |
| Curry                                                                           | 5.495  |
| Sugar                                                                           | 5.41   |
| Blood_alcohol_content                                                           | 5.389  |
| List_of_mountains                                                               | 5.388  |
| Rice                                                                            | 5.37   |
| Southeast_Asia                                                                  | 5.361  |
| Ceiba_pentandra                                                                 | 5.357  |
| Pigeon_pea                                                                      | 5.357  |
| Rat                                                                             | 5.219  |
| Abugida                                                                         | 5.116  |
| Chinatown                                                                       | 5.1    |
| Video_CD                                                                        | 5.086  |
| History_of_the_Pacific_Islands                                                  | 5.052  |
| Time_zone                                                                       | 5.037  |
| Asian_Development_Bank                                                          | 5.035  |
| Hindu                                                                           | 5.03   |
| Foreign_relations_of_Indonesia                                                  | 5.004  |
| Foreign_relations_of_Singapore                                                  | 5.002  |
| Economy_of_Burma                                                                | 4.993  |
| Spice                                                                           | 4.952  |
| Convention_on_Fishing_and_Conservation_of_the_Living_Resources_of_the_High_Seas | 4.911  |
| Foreign_relations_of_Taiwan                                                     | 4.842  |
| Tamil_language                                                                  | 4.828  |
| Risk_(game)                                                                     | 4.792  |
| Foreign_relations_of_the_Philippines                                            | 4.776  |
| Numismatics                                                                     | 4.748  |
+---------------------------------------------------------------------------------+--------+
  Time taken: 10 minutes, 7 seconds, 599 milliseconds

sa: table[wikipage,coeff] select[1,100] 100 self-similar[inverse-links-to] |WP: Banana>
+-----------------------------------------+--------+
| wikipage                                | coeff  |
+-----------------------------------------+--------+
| Banana                                  | 100.0  |
| Grape                                   | 27.273 |
| Pineapple                               | 27.273 |
| Mango                                   | 27.273 |
| Ascorbic_acid                           | 18.182 |
| Cranberry                               | 18.182 |
| Kiwifruit                               | 18.182 |
| Pear                                    | 18.182 |
| Tea                                     | 18.182 |
| Vanilla                                 | 18.182 |
| Chicken                                 | 18.182 |
| Peach                                   | 18.182 |
| apparel                                 | 18.182 |
| lime_(fruit)                            | 18.182 |
| pudding                                 | 18.182 |
| mangosteen                              | 18.182 |
| Coconut                                 | 18.182 |
| gelatin_dessert                         | 18.182 |
| Lemon                                   | 18.182 |
| galangal                                | 18.182 |
| Melon                                   | 18.182 |
| Grapefruit                              | 18.182 |
| Raspberry                               | 18.182 |
| Apricot                                 | 18.182 |
| Watermelon                              | 18.182 |
| Coffee                                  | 16.667 |
| Strawberry                              | 15.789 |
| Cherry                                  | 15.385 |
| custard                                 | 15.385 |
| Pork                                    | 15.385 |
| Sugar                                   | 14.286 |
| Apple                                   | 14.286 |
| condensed_milk                          | 14.286 |
| Garden_strawberry                       | 14.286 |
| Blackberry                              | 13.333 |
| raspberry                               | 13.333 |
| Fruit                                   | 12.5   |
| pistachio                               | 12.5   |
| Orange_(fruit)                          | 12.5   |
| tapioca                                 | 12     |
| Tomato                                  | 11.765 |
| Center_for_Economic_and_Policy_Research | 11.111 |
| Lime_(fruit)                            | 10.526 |
| turmeric                                | 10.526 |
| pineapple                               | 10.256 |
| Domestic_sheep                          | 9.524  |
| broccoli                                | 9.524  |
| Adobe                                   | 9.091  |
| Analytical_chemistry                    | 9.091  |
| Commelinales                            | 9.091  |
| Calf                                    | 9.091  |
| Celebrity                               | 9.091  |
| Cream                                   | 9.091  |
| Buddhist_cuisine                        | 9.091  |
| Celery                                  | 9.091  |
| Chocolate                               | 9.091  |
| Cola                                    | 9.091  |
| Ester                                   | 9.091  |
| Epipaleolithic                          | 9.091  |
| Food_additive                           | 9.091  |
| Glycine                                 | 9.091  |
| Gelatin                                 | 9.091  |
| Gelatin_dessert                         | 9.091  |
| James_Lind                              | 9.091  |
| Kathleen_Kenyon                         | 9.091  |
| Ceiba_pentandra                         | 9.091  |
| Mya_(unit)                              | 9.091  |
| Monosaccharide                          | 9.091  |
| Parsley                                 | 9.091  |
| Potato                                  | 9.091  |
| Pia_colada                              | 9.091  |
| Spice                                   | 9.091  |
| Soft_drink                              | 9.091  |
| Economy_of_South_Africa                 | 9.091  |
| Scurvy                                  | 9.091  |
| Tarsiidae                               | 9.091  |
| Zingiberaceae                           | 9.091  |
| Beef                                    | 9.091  |
| Cooking_plantain                        | 9.091  |
| Casimir_Funk                            | 9.091  |
| Pigeon_pea                              | 9.091  |
| Pistachio                               | 9.091  |
| Cod                                     | 9.091  |
| Tropical                                | 9.091  |
| tadpole                                 | 9.091  |
| water_buffalo                           | 9.091  |
| Phosphoric_acid                         | 9.091  |
| Tartaric_acid                           | 9.091  |
| Citric_acid                             | 9.091  |
| weak_acid                               | 9.091  |
| Lactic_acid                             | 9.091  |
| Dietary_Reference_Intake                | 9.091  |
| Canaanite_language                      | 9.091  |
| cell_adhesion_molecule                  | 9.091  |
| salivary_gland                          | 9.091  |
| bananas                                 | 9.091  |
| gourami                                 | 9.091  |
| fishing_industry                        | 9.091  |
| Molding_(process)                       | 9.091  |
| Carrot                                  | 9.091  |
+-----------------------------------------+--------+
  Time taken: 14 minutes, 27 seconds, 470 milliseconds

So, similar[inverse-links-to] works best, and quite noticeably better too! Exactly why, I'm not sure. Anyway, now we know which operator to apply to more examples. I'll do that in the next post.
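
The coefficients in the last table look consistent with a similarity of the form "sum of the element-wise minimums, divided by the larger of the two totals". Banana has 11 inbound links, and 100 × 3/11 = 27.273, 100 × 2/11 = 18.182 and 100 × 1/11 = 9.091 all appear in the table. That form is inferred from the numbers, not stated in this post, so treat the Python below as a sketch under that assumption. The function name `simm` and the page names are hypothetical.

```python
def simm(f, g):
    """Overlap similarity of two coefficient dicts: sum of minimums over the larger total."""
    if not f or not g:
        return 0.0
    overlap = sum(min(coeff, g[key]) for key, coeff in f.items() if key in g)
    return overlap / max(sum(f.values()), sum(g.values()))


# Hypothetical inbound-link sets: 11 pages link to "Banana", and 3 of those also link to "Grape".
banana = {f"page_{i}": 1 for i in range(11)}
grape = {"page_0": 1, "page_1": 1, "page_2": 1}

print(round(100 * simm(banana, grape), 3))  # 27.273
```

Under this measure, a page whose inbound links are a small subset of Banana's still scores in proportion to the overlap, while a page with many unrelated inbound links is penalised by the larger denominator.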



updated: 19/12/2016
by Garry Morrison
email: garry -at- semantic-db.org