Glossary


Glossary of internationalization terms

gettext
Gettext is the underlying, and most widely used, technology for translating Open Source projects. It defines a standard format for the translation files that translators work with (PO files) and lets applications load those translations at run time from a compiled binary format (MO files). It also defines the API, which has implementations in many programming languages. The comprehensive gettext manual is a very useful reference.
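As a rough illustration of the run-time API, the sketch below uses Python's standard gettext module, which is one of the many language implementations. The domain "mydomain", the language "fr" and the directory "/nonexistent/locale" are hypothetical values, chosen so that no catalog is found; the helper name load_translator is likewise made up for this example.

```python
import gettext

# Hypothetical values, for illustration only.
DOMAIN = "mydomain"
LOCALE_DIR = "/nonexistent/locale"  # assumed not to exist on this system


def load_translator(domain, localedir, languages):
    """Return a translations object for the given domain.

    With fallback=True, a NullTranslations object is returned when no
    MO file is found, so lookups return the original message unchanged
    instead of raising an error.
    """
    return gettext.translation(
        domain, localedir=localedir, languages=languages, fallback=True
    )


translator = load_translator(DOMAIN, LOCALE_DIR, ["fr"])
_ = translator.gettext  # conventional short alias for the lookup function

message = _("phrase of interest")
print(message)
```

Because no catalog exists under the assumed directory, the message comes back untranslated. Pointing localedir at a directory that contains fr/LC_MESSAGES/mydomain.mo would return the French string instead.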

Imports queue

intltool
Intltool is a higher level tool that adds functionality to gettext by allowing the extraction of translatable strings from a variety of file formats.

It can be used to:

  • Extract translatable strings from various source files (.xml.in, glade, .desktop.in, .server.in, .oaf.in, .policy.in, etc.).
  • Merge the extracted strings with the messages from traditional source files (.c, .h) in po/$(PACKAGE).pot.
  • Merge back the translations from .po files into the .xml, .desktop, .policy files created at build time.

It has also become a standard tool when implementing internationalization for OSS projects. Nearly all (if not all) GNOME projects, for example, use intltool.

Message catalog (MO file)
Packages using gettext install translations as MO files, which provide language-specific translations for a particular translation domain. At run time, software can use gettext's API to obtain the translation of a given string, in a given language, from the installed MO file for a given translation domain. MO files are essentially stripped-down PO files: they are compact and contain only the data needed at run time. MO files are named after the translation domain for which they are responsible, and their location in the directory structure shows which language's translations they contain. (Gettext looks in a prioritized set of directories for the MO file for a domain and language. This mechanism allows MO files to be installed by language packs, but also by the package itself, in which case the package's MO file takes precedence.)

For example, imagine a translation domain of 'mydomain', and imagine that some software using gettext wants to retrieve the French translation of a particular phrase, say "phrase of interest", from that domain. Gettext will first look here: /usr/share/locale/fr/LC_MESSAGES/mydomain.mo. ("/usr/share/locale/..." is the highest-precedence directory gettext uses.) If mydomain.mo is found there, it is checked for the target string ("phrase of interest") and its French translation, if any, is returned by gettext to the software for display. If mydomain.mo is not found, gettext looks here: /usr/share/locale-langpack/fr/LC_MESSAGES/mydomain.mo, and if it is found, its translation is retrieved. Otherwise, gettext returns the message itself, untranslated, namely: "phrase of interest".
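The lookup order in this example can be sketched in a few lines of Python. This is only an illustration of the precedence described above, not the code gettext itself runs: the function name find_mo_file and the exists parameter are made up for the example, and Python's own gettext module searches a single directory rather than this pair.

```python
import os

# Search order as described in the text: the package's own catalog
# first, then the one installed by a language pack.
UBUNTU_LOCALE_DIRS = ("/usr/share/locale", "/usr/share/locale-langpack")


def find_mo_file(domain, language, locale_dirs=UBUNTU_LOCALE_DIRS,
                 exists=os.path.exists):
    """Return the first existing MO file path for domain/language, or None.

    `exists` is injectable so the search order can be tried out without
    touching the real file system (an assumption made for this sketch).
    """
    for base in locale_dirs:
        candidate = f"{base}/{language}/LC_MESSAGES/{domain}.mo"
        if exists(candidate):
            return candidate
    return None  # caller falls back to the untranslated message


# On a real system this reports whichever catalog is installed, if any.
print(find_mo_file("mydomain", "fr"))
```

When the function returns None, the caller would display the original message, which matches the untranslated fallback described above.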

Note: MO files are (almost) never created by hand but are created at package build time from source PO files. However, you can use 'msgfmt -c <PO-file>' to manually create an MO file, which tests the syntax of the PO file and fails if the PO file contains errors.
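To give a feel for how compact the binary format is, here is a simplified Python sketch that serializes a dictionary of messages into the MO layout and loads it back with Python's standard gettext module. It is not how msgfmt is implemented and is no substitute for it: it starts from a dictionary rather than a PO file, writes no hash table, and ignores plural forms and message contexts. The function name build_mo and the French string are illustrative assumptions.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Serialize {msgid: msgstr} into a little-endian GNU MO byte string.

    Simplified sketch: no hash table, no plurals, no message contexts.
    """
    # The header entry (empty msgid) declares the charset of the strings.
    entries = {"": "Content-Type: text/plain; charset=UTF-8\n"}
    entries.update(messages)

    # Encode, then sort by msgid bytes (the header sorts first).
    pairs = sorted(
        (k.encode("utf-8"), v.encode("utf-8")) for k, v in entries.items()
    )
    ids = [k for k, _ in pairs]
    strs = [v for _, v in pairs]

    count = len(pairs)
    ids_table = 7 * 4                      # right after the 7-word header
    strs_table = ids_table + 8 * count     # one (length, offset) pair each
    ids_start = strs_table + 8 * count
    strs_start = ids_start + sum(len(b) + 1 for b in ids)

    def index(blobs, start):
        """Build the (length, offset) table for NUL-terminated blobs."""
        table = b""
        offset = start
        for blob in blobs:
            table += struct.pack("<II", len(blob), offset)
            offset += len(blob) + 1        # +1 for the trailing NUL
        return table

    out = struct.pack("<7I", 0x950412DE, 0, count,
                      ids_table, strs_table, 0, 0)  # magic, revision, ...
    out += index(ids, ids_start) + index(strs, strs_start)
    out += b"".join(blob + b"\0" for blob in ids)
    out += b"".join(blob + b"\0" for blob in strs)
    return out


mo_bytes = build_mo({"phrase of interest": "phrase d'intérêt"})
catalog = gettext.GNUTranslations(io.BytesIO(mo_bytes))
print(catalog.gettext("phrase of interest"))
print(catalog.gettext("not in the catalog"))
```

For real packages, msgfmt remains the tool to use, since it also validates the PO file and handles the features this sketch leaves out.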

Translation domain (domain)
The translation domain is a unique string assigned by the programmer in the code (usually in the build system) and used by the gettext functions to locate the message catalog where translations will be loaded from. The general form to compute the catalog's location is:

dir_name/locale/LC_category/domain_name.mo

which in Ubuntu expands to either /usr/share/locale/$LOCALE/LC_MESSAGES/domain_name.mo for translations not included in language packs (packages not in main) or /usr/share/locale-langpack/$LOCALE/LC_MESSAGES/domain_name.mo for those shipped in language packs. $LOCALE is generally the ISO 639-1 two-letter code (e.g. 'ca' or 'de') or the ISO 639-2 three-letter code (e.g. 'ast') for the particular language. As an example, when using Nautilus in a Catalan locale on Ubuntu, the gettext functions will look for the message catalog at:

/usr/share/locale-langpack/ca/LC_MESSAGES/nautilus.mo
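The general form above can be written as a small Python helper. The function name catalog_path and its parameter names are chosen for this sketch only; they simply mirror the placeholders in the general form.

```python
import posixpath


def catalog_path(dir_name, locale, domain_name, category="LC_MESSAGES"):
    """Build dir_name/locale/LC_category/domain_name.mo as a POSIX path."""
    return posixpath.join(dir_name, locale, category, f"{domain_name}.mo")


print(catalog_path("/usr/share/locale-langpack", "ca", "nautilus"))
# prints /usr/share/locale-langpack/ca/LC_MESSAGES/nautilus.mo
```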

The corresponding translation template should have the same translation domain in its filename, e.g. nautilus.pot. The translation domain must be unique across all applications and packages, so a generic name such as messages.pot won't work.

Translation template (POT file, PO template file)
This term is used in Launchpad (LP) Translations to identify a set of messages that need to be translated (and sometimes also the set of translations for them). For gettext packages, these messages are defined in a POT file in a source package. A source package using gettext may have one or more POT files. Each POT file has a unique name that specifies its 'translation domain' (see above). The POT file name, without the ".pot" extension, becomes the template name in LP. Non-gettext packages do not use POT files (Mozilla, for example, uses XPI files), but they may still have LP templates that are imported by other means.

A package's POT files are generated or updated at package build time or manually, after which they contain the package's latest set of translatable messages. The POT files are then imported into LP Translations at package upload time and, optionally, later through other LP functions. On import, if a template already exists in LP, the LP data for it is updated to reflect the new set of translatable messages from the corresponding POT file; otherwise LP adds a new template. Each template usually also has translations associated with it in LP. Thus, the term "template" is often used to mean all of this taken together, namely the set of translatable messages for a specific translation domain and all the corresponding per-language translations.

Note that "template" in this LP sense corresponds to a POT file in a source package sense, but the two are not identical. Bear in mind that Launchpad Translations has a database into which messages and translations are imported and from which they are exported. POT files and PO files are not stored in the database, but are generated as needed.

Translation (PO file)
Source packages using gettext contain translations (for a given translation domain and for each language) in dedicated PO files. PO file names are based on the language code, for example fr.po for French, or pt_BR.po for Brazilian Portuguese. These PO files are imported into Launchpad Translations on package upload and at other times using other LP functions. Their translations are merged with any existing translations for the target template (that is, for the translation domain) in LP. The result of this merge depends on the context: in some cases the new translations appear as suggestions; in other cases they are considered the upstream translations and take precedence over existing LP translations (unless specific messages were intentionally modified in LP). Once imported, translations are displayed in the Launchpad Translations web UI.

As with POT files, PO files can be exported from LP Translations. (Note that PO files are not saved in the LP database; they are generated from the database on export as needed.) For example, PO files can be downloaded in a tarball, and they can be automatically exported once a day to a specified bzr branch. Such exporting allows project maintainers to easily integrate translations done in Launchpad back into a source package for later use at run time. (Note that this reintegration is not required for packages in Ubuntu Main because these packages derive their translations from language packs, which are language-specific exports of LP Translations.)
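The naming convention for PO files can be illustrated with a short Python sketch that recovers the language and territory from a file name. The pattern language[_TERRITORY][@modifier] is an assumption based on common locale naming; the text above only shows fr.po and pt_BR.po. The function name locale_from_po is hypothetical.

```python
import re
from pathlib import PurePosixPath

# Assumed pattern: two- or three-letter language code, optional
# two-letter territory, optional @modifier (e.g. "sr@latin").
_PO_STEM = re.compile(
    r"(?P<language>[a-z]{2,3})"
    r"(?:_(?P<territory>[A-Z]{2}))?"
    r"(?:@(?P<modifier>\w+))?"
)


def locale_from_po(path):
    """Return (language, territory) for a PO file path such as 'po/pt_BR.po'.

    territory is None when the file name carries only a language code.
    """
    po_file = PurePosixPath(path)
    match = _PO_STEM.fullmatch(po_file.stem)
    if po_file.suffix != ".po" or match is None:
        raise ValueError(f"not a language-named PO file: {path}")
    return match.group("language"), match.group("territory")


print(locale_from_po("po/fr.po"))
print(locale_from_po("po/pt_BR.po"))
```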

The gettext manual has more information on the format of PO files.



UbuntuDevelopment/Internationalisation/Glossary (last edited 2010-02-18 10:05:36 by 28)