Wednesday, November 24, 2010

Query A Non-English Wikipedia

Pandorabot V3 and later can be configured to query a non-English Wikipedia. At the time of this writing, Wikipedia supports 276 languages. Although English is the largest Wikipedia, there are many very extensive non-English Wikipedias. If your Pandorabot will be interacting primarily with a non-English-speaking public, you may wish to configure it to query one of them. Note that when configured in this manner, the Pandorabot will still fall back on the English Wikipedia if the search phrase is not found in the non-English Wikipedia.

The easiest and quickest way to configure your Pandorabot V3 to access one of the non-English Wikipedias is to use the GUI dialog menu. Click your active Pandorabot and select "Language". In the Language menu, select one of the ten languages. The ten languages supported in the dialog menu are English, Deutsch, Français, Polski, Italiano, Nihongo, Español, Nederlands, Português, and Russkiy. These ten languages represent the ten Wikipedias with over half a million articles. Exit the configuration dialog menu and your Pandorabot is now set to query the non-English Wikipedia you selected.

If you wish to query a Wikipedia whose language is not listed in the dialog menu, you will need to edit the Configuration notecard. Edit your Pandorabot, open the Contents tab, open the Configuration notecard within the contents, and add the following two lines at the top:
    LANG_CODE = two_letter_language_code
    LANG_NAME = language_name
For instance, to query the Greek Wikipedia you would add the lines:
    LANG_CODE = el
    LANG_NAME = Greek
The language codes and names are listed on the List of Wikipedias.
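
To illustrate how these two settings relate to the fallback behavior described earlier, here is a minimal Python sketch. It is not the Pandorabot's own script: the function names `parse_config` and `wikipedia_hosts` are hypothetical, and treating lines that start with "#" as comments is an assumption about the notecard format. The only outside fact it relies on is that each Wikipedia lives at its language code followed by ".wikipedia.org".

```python
def parse_config(text):
    """Parse 'KEY = value' lines into a dict (hypothetical helper).

    Blank lines and lines starting with '#' are skipped; the '#'
    comment marker is an assumption, not taken from the notecard.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def wikipedia_hosts(settings):
    """Return the hosts to try in order: selected language, then English."""
    code = settings.get("LANG_CODE", "en")
    hosts = [f"{code}.wikipedia.org"]
    if code != "en":
        # Mirrors the documented fallback to the English Wikipedia.
        hosts.append("en.wikipedia.org")
    return hosts


notecard = "LANG_CODE = el\nLANG_NAME = Greek"
print(wikipedia_hosts(parse_config(notecard)))
```

With the Greek example above, the Greek Wikipedia would be tried first and the English one second.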

It is also possible to change which ten languages are available in the dialog menu by editing the Configuration notecard and setting the LANG_CODES and LANG_NAMES variables. Follow the examples for the default settings in the Configuration notecard's comments.

Finally, you may also wish to configure the Wikipedia trigger phrases. These are, by default, English phrases like "what is" and "who are". To change these phrases to another language edit the Configuration notecard and add a line to the top:
    WIKIPEDIA_TRIGGERS = comma_separated_list_of_triggers
For instance, if you set "LANG_CODE = fr" and "LANG_NAME = French", you may want French trigger phrases for your French-speaking visitors. To do so, you would set something like the following:
    WIKIPEDIA_TRIGGERS = ce qui est ,?,quelle est ,?,quel est le ,?,qui est ,?,qui sont ,?
and so on (I did not translate all the triggers; that is your job!). Trigger phrases should be lower case, end in a space, and be separated by a question mark or whatever the symbol for a question is in your selected language.
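
For readers who want to sanity-check a trigger list before pasting it into the notecard, here is a rough Python sketch of one way such a list could be read. It assumes the comma-separated values alternate between a trigger phrase and the question symbol that ends the query; that reading is inferred from the example above and is not confirmed by the Pandorabot script. The names `parse_triggers` and `extract_topic` are hypothetical.

```python
def parse_triggers(value):
    """Split a WIKIPEDIA_TRIGGERS value into (phrase, question_symbol) pairs.

    Items are not stripped, because the trailing space on each
    trigger phrase is significant.
    """
    parts = value.split(",")
    return list(zip(parts[0::2], parts[1::2]))


def extract_topic(message, triggers):
    """Return the text between a trigger phrase and its question symbol."""
    text = message.lower()  # triggers are expected to be lower case
    for phrase, symbol in triggers:
        start = text.find(phrase)
        if start == -1:
            continue
        topic = text[start + len(phrase):]
        end = topic.find(symbol) if symbol else -1
        if end != -1:
            topic = topic[:end]
        topic = topic.strip()
        if topic:
            return topic
    return None  # no trigger phrase matched


triggers = parse_triggers(
    "ce qui est ,?,quelle est ,?,quel est le ,?,qui est ,?,qui sont ,?"
)
print(extract_topic("Qui est Victor Hugo?", triggers))
```

A message with no trigger phrase, such as "bonjour", yields no topic in this sketch, so no Wikipedia lookup would be attempted.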
