We’ve been asked how to translate texts that appear on the site but are not part of posts or pages. Some examples are:

  • Text widget
  • Blog tagline
  • Other widget titles

What we’re thinking about is adding a generic string translation facility that will allow localizing any text on the site. The way it will work is:

When new texts are created (when a widget or the blog configuration is saved), they will be automatically added to the list of strings in the website. Each string will have entries in all the site’s languages, which will be user editable.

If there’s a translation for a certain language, that translation will be shown. Otherwise, the original value of the string is shown.
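
To make the idea concrete, here is a minimal sketch of such a string list with the fallback rule. The function names (demo_register_string, demo_set_translation, demo_lookup), the in-memory array and the sample tagline are all assumptions made for illustration. A real implementation would store the strings in the database, and the question of what to use as the key is discussed further down.

```php
<?php
// Hypothetical in-memory string list: original text => [language code => translation].
$string_table = [];

// Called when a widget or the blog configuration is saved (hypothetical helper).
function demo_register_string(array &$table, string $original): void {
    if (!isset($table[$original])) {
        $table[$original] = []; // no translations yet
    }
}

// Called when a user edits the entry for one language (hypothetical helper).
function demo_set_translation(array &$table, string $original, string $lang, string $translation): void {
    demo_register_string($table, $original);
    $table[$original][$lang] = $translation;
}

// Show the translation if one exists, otherwise the original value.
function demo_lookup(array $table, string $original, string $lang): string {
    return $table[$original][$lang] ?? $original;
}

demo_register_string($string_table, 'My travel blog');
demo_set_translation($string_table, 'My travel blog', 'es', 'Mi blog de viajes');

echo demo_lookup($string_table, 'My travel blog', 'es'), "\n"; // translated
echo demo_lookup($string_table, 'My travel blog', 'fr'), "\n"; // falls back to the original
```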

It’s very similar to the way WordPress uses .mo files to localize itself, just managed automatically (without the need to search for new strings and add them to the .po/.pot file).

Do we translate according to string contents or token names?

.po / .mo files handle strings according to their contents. It doesn’t matter where the string is used, only what it contains. For example, if you translate the word ‘Save’, the same translation is used whether the word appears in a form for saving the current values or in the context of ‘Save the Earth’.

The translation table (to Spanish) looks like:

String | Translation
Save | Guardar

When you use the multilingual string, you would normally enter something like:

<?php _e('Save') ?>

If there is a translation for the string, it will be output. Otherwise, the original text is displayed.

Resource files in applications normally use a different approach. Each string gets a token that describes what the string is used for. Translations are organized according to the tokens and not the string values. In this case, the translation table would be:

Token | String | Translation
SAVE_BUTTON | Save | Guardar

In the theme, this string will be displayed like:

<?php _st('SAVE_BUTTON') ?>

It looks like a minor change, but it’s very important. Now, the translation is arranged according to the context of the string and where it’s being used, not according to its contents. Several strings that have the same value can have different translations, each one matching the context of the string.
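
The difference is easier to see side by side. The sketch below is only an illustration: demo_token_lookup is a hypothetical helper, and the SAVE_SLOGAN token with its Spanish wording is a made-up second entry added to show two strings that share a value but not a translation.

```php
<?php
// Content-keyed table: one entry per distinct text, as in .po/.mo files.
$by_content = [
    'Save' => 'Guardar',
];

// Token-keyed table: one entry per use of a text.
// SAVE_SLOGAN and its Spanish wording are made-up examples.
$by_token = [
    'SAVE_BUTTON' => ['string' => 'Save', 'es' => 'Guardar'],
    'SAVE_SLOGAN' => ['string' => 'Save', 'es' => 'Salvar'],
];

// Hypothetical helper: return the translation for a token,
// or the original string when that language has no entry.
function demo_token_lookup(array $table, string $token, string $lang): string {
    $entry = $table[$token] ?? null;
    if ($entry === null) {
        return $token; // unknown token: fall back to the token itself
    }
    return $entry[$lang] ?? $entry['string'];
}

echo demo_token_lookup($by_token, 'SAVE_BUTTON', 'es'), "\n"; // translation for the button
echo demo_token_lookup($by_token, 'SAVE_SLOGAN', 'es'), "\n"; // different translation, same original
echo demo_token_lookup($by_token, 'SAVE_BUTTON', 'fr'), "\n"; // no French entry: original string
```

With the content-keyed table there is only one slot for ‘Save’, so both uses would have to share ‘Guardar’.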

Pros and cons for using tokens to organize translations

When writing PHP code that includes localized strings, it’s easier to organize translations by string contents without using any tokens. Writing is more straightforward. You just wrap texts in gettext calls, like __('foo') and _e('foo'), and you’re done. The code is easier to read and much easier to manage. Gettext does all the work for you.

However, handling the translation for these strings is a bit of a pain in the b**t. You don’t really know what you’re translating and which parts of the site have been translated. It’s just a bunch of strings that are used by ‘something’ (but no one can tell for sure by what).

String translation in WPML

In the next release of WPML, we plan to include string translation. We’re a bit biased toward arranging translations according to tokens. Since a program is doing the bookkeeping, creating tokens and using them is not going to be a problem. The advantages of this method would be in the translation interface. When you go to translate a string, you’ll see immediately what you’re translating, where it’s used and how it will appear when translated.
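
As a rough illustration of what that bookkeeping could look like, a token can be derived from the place where a string comes from. This is only a sketch under our own assumptions: demo_make_token and the widget id 'text-3' are hypothetical, and the naming scheme is not a description of what WPML will actually ship.

```php
<?php
// Hypothetical helper: build a token from the parts that describe where a string is used.
function demo_make_token(string ...$parts): string {
    $joined = implode('_', $parts);
    // Collapse every run of characters other than letters and digits into one underscore.
    $clean = preg_replace('/[^A-Za-z0-9]+/', '_', $joined);
    return strtoupper(trim($clean, '_'));
}

echo demo_make_token('widget', 'text-3', 'title'), "\n"; // WIDGET_TEXT_3_TITLE
echo demo_make_token('blog', 'tagline'), "\n";           // BLOG_TAGLINE
```

Because the token names the widget or setting, the translation interface can group strings by where they appear.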

What do you think?

Got other suggestions?