Unlike other distributions, we do not ship upstream translations from source packages directly in the binary application packages in main and restricted. Doing so would give us no flexibility to edit them in a central place (Launchpad), to update them independently of the applications, or to update them after release.
That is why Ubuntu separates packages from their translations and maintains and packages them independently.
We currently do not apply this treatment to packages in universe and multiverse. We plan to change this in the future, but there are no concrete dates yet.
Extraction from source packages
While a source package is built, the autobuilders extract several translation related files from the built tree:
- The PO templates (*.pot), which contain all translatable strings for each translation domain used by the package. Because the template must be up to date (and must include all string changes introduced in distro-specific packages), it should be generated dynamically during the package build. For most packages this happens with intltool-update -p.
- All existing PO translation files (*.po), which contain the upstream translations for a particular language in gettext format. This applies to the vast majority of free software.
- Other translation-related files for specific projects: *.ts (the translation format used by Qt).
The extracted files are put into a tarball, which is added to the binary .changes file (section raw-translations).
If the package is in main, this process also removes all /usr/share/locale/.../*.mo files (binary gettext translation files) from the final .debs, since these are shipped separately in language packs (see below).
All this is done by the pkgbinarymangler package. The Ubuntu autobuilder chroots have this package installed and enabled in /etc/pkgbinarymangler/striptranslations.conf by setting enable: true. Installing and enabling it the same way on your own machine replicates this setup, so you can build translation tarballs and stripped .debs locally.
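To make the extraction step more concrete, here is a minimal sketch of collecting the translation files from a built tree into a tarball. It is not the pkgbinarymangler implementation; the function name collect_translations and the tarball name used in the usage note are assumptions made for illustration, and only the three file suffixes are taken from the list above.

```python
import tarfile
from pathlib import Path

# Suffixes named in the text: PO templates, PO translations, Qt .ts files.
# Assumption: matching purely on suffix is good enough for a sketch; it
# would also pick up unrelated files that happen to end in .ts.
TRANSLATION_SUFFIXES = {".pot", ".po", ".ts"}


def collect_translations(build_tree: Path, tarball: Path) -> list[str]:
    """Pack every translation file under build_tree into a gzip tarball.

    Returns the archived paths, relative to build_tree, in sorted order.
    (Hypothetical helper, not part of pkgbinarymangler.)
    """
    members = sorted(
        path for path in build_tree.rglob("*")
        if path.is_file() and path.suffix in TRANSLATION_SUFFIXES
    )
    names = [path.relative_to(build_tree).as_posix() for path in members]
    with tarfile.open(tarball, "w:gz") as tar:
        for path, name in zip(members, names):
            tar.add(path, arcname=name)
    return names
```

Calling collect_translations(Path("debian/tmp"), Path("translations.tar.gz")) on a built tree would return the list of archived files; both paths are placeholders.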
Import to Launchpad translations
On upload of the binary packages, the translation tarball is picked up and imported into Launchpad Translations. This database contains both the imported translations and the ones entered in the web UI. Note: as of December 2008, the translation precedence policy was changed so that upstream ("packaged") translations are given more priority in specific cases. Launchpad Translations nevertheless keeps the ability to override any specific upstream translation if desired.
Now translation teams can download PO templates, edit them locally, and upload them again, or enter translations directly in the web UI. Please see TranslatingUbuntu for details about and help for this process.
Export from Launchpad translations
Launchpad translations now sorts all available translations by Ubuntu distribution release, locale, and translation domain, and exports them in tarballs again. For example, all available tarballs for Ubuntu 8.04 ("Hardy Heron") are linked on https://translations.launchpad.net/ubuntu/hardy/+language-packs.
There are two kinds of tarballs:
- Full language packs contain the complete set of all translations for all languages of a particular Ubuntu release. Thus they are quite big (usually in the order of 500 MB compressed). They are generated only on request, since it is quite expensive to build them.
- Delta language packs only contain PO files which have changed or have been added since the last full export. They are much smaller, and generated automatically once or twice a week.
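The difference between the two kinds of export can be sketched as a simple filter: a delta export keeps only the PO files that changed or were added after the last full export. The text does not describe how Launchpad actually tracks changes, so the function name delta_export, the dictionary of modification times, and the file names in the usage note are all assumptions.

```python
from datetime import datetime


def delta_export(po_changed: dict[str, datetime],
                 last_full_export: datetime) -> list[str]:
    """Return the PO files changed or added after the last full export.

    po_changed maps a PO file name to the time of its latest change.
    (Hypothetical data model; Launchpad's own bookkeeping is not shown.)
    """
    return sorted(
        name for name, changed in po_changed.items()
        if changed > last_full_export
    )
```

For example, with a full export on 1 April, a file last changed on 20 March is left out of the delta, while files changed on 2 and 5 April are included.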
Translation packages for Ubuntu
The tarballs from Launchpad get processed and turned into packages by a set of scripts called langpack-o-matic, which runs in the Canonical data center (on macquarie) and is maintained by the ubuntu-langpack team.
langpack-o-matic builds one set of language packs for each language XX, which are split as follows:
- Translations are sorted into categories: GNOME for everything that just applies to GTK and GNOME (thus, suitable for Ubuntu), KDE for everything related to Qt and KDE (suitable for Kubuntu), and a "common" pack for everything that either applies to both (e. g. the Xine library) or no particular desktop environment (such as compilers, coreutils, etc.) and thus should be installed for every derivative.
- Corresponding to the full and delta exports from above, there is a base pack with the complete set of translations, and a much smaller update pack with just files which changed since the latest release of base. This split exists for each category.
Finally, the PO files are wrapped by some automatically generated standard packaging, so that they turn into real Ubuntu source packages, binary packages, etc. Thus, for each language code XX, the following set of packages (both source and binary) is produced:
The language packs do not ship their *.mo files in the standard /usr/share/locale/ directory, but in /usr/share/locale-langpack/, to avoid file conflicts with locally built or third party packages. Our libc package has a patch which falls back to searching in locale-langpack/ if a .mo file is not found in locale/.
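The lookup order described above can be illustrated with a short sketch. This is not the libc patch itself, only the search order it implements as described here: look in locale/ first, then fall back to locale-langpack/. The function name find_catalog is an assumption; the <lang>/LC_MESSAGES/<domain>.mo layout is the standard gettext one. Real gettext also tries locale variants (for example de_DE before de), which this sketch ignores.

```python
from pathlib import Path
from typing import Optional, Sequence

# Search order from the text: the standard directory first, then the
# language pack directory as a fallback.
DEFAULT_ROOTS = (
    Path("/usr/share/locale"),
    Path("/usr/share/locale-langpack"),
)


def find_catalog(domain: str, lang: str,
                 roots: Sequence[Path] = DEFAULT_ROOTS) -> Optional[Path]:
    """Return the first existing .mo file for domain and lang, or None.

    (Hypothetical helper that only mirrors the described search order.)
    """
    for root in roots:
        candidate = root / lang / "LC_MESSAGES" / f"{domain}.mo"
        if candidate.is_file():
            return candidate
    return None
```

Because locale/ is checked first, a locally built or third party .mo file takes precedence over the one shipped in a language pack.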
In addition to language packs, langpack-o-matic also creates a set of language-support-XX metapackages for each language which depend on language specific packages which are not gettext translation related, such as dictionaries, thesauri, help files, fonts, input methods, etc.
Automatically-built weekly packages
Translators and interested users are highly encouraged to use these packages and immediately report problems to email@example.com.
For the development release, langpack-o-matic uploads updates directly into the main archive twice a week.
Base and update packages in the PPA
It is normally not necessary to have the -base packages in the PPA, since those are generated upon a new distro release and are already in the normal archive.
As a general rule, the PPA archive contains only the delta packages (i.e. those without the -base suffix), with some exceptions:
- For LTS releases, such as Lucid, the -base packages are regenerated for every maintenance release (.1, .2, .3 and .4). Therefore those will also be in the PPA.
- In case the delta packages become too big, -base packages might also be regenerated for other releases (it will be done for e.g. Intrepid after message sharing is in place). Those will then also go into the PPA.
Official stable updates
On the first Monday of every month, the current PPA packages are copied to -proposed for all currently supported stable releases, and a call for testing is announced on the Ubuntu Translators mailing list. If the feedback is positive and no problems are reported, the packages are copied to -updates after at least a week.
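As a worked example of this schedule, the sketch below computes the first Monday of a given month and the earliest date on which the packages could move to -updates, assuming the minimum one-week testing period. The function names are made up for illustration; the actual copy is a manual process that also depends on the feedback received.

```python
from datetime import date, timedelta


def first_monday(year: int, month: int) -> date:
    """Return the date of the first Monday in the given month."""
    first = date(year, month, 1)
    # date.weekday() is 0 for Monday, so this is the number of days from
    # the 1st to the next Monday (0 if the 1st is itself a Monday).
    return first + timedelta(days=(7 - first.weekday()) % 7)


def earliest_updates_copy(year: int, month: int) -> date:
    """Earliest -updates date: one week after the copy to -proposed."""
    return first_monday(year, month) + timedelta(weeks=1)
```

For September 2024, which starts on a Sunday, first_monday(2024, 9) gives 2 September and earliest_updates_copy(2024, 9) gives 9 September.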
Other language packs
Translations are fetched from https://ftp.mozilla.org/pub/mozilla.org/firefox/releases/ and https://ftp.mozilla.org/pub/mozilla.org/xulrunner/releases/ and manually imported into Launchpad Translations upon each Firefox release included in the distro (this includes also security releases).
The preferred way for a user to configure language support in Ubuntu is the GNOME language selector.
If you enable a particular language there, it checks your installed packages to detect whether you need the GNOME or KDE packages (or both), and installs all necessary language packs and language-support-XX. It also takes care of setting/changing the default locale.
On startup, language-selector checks whether language support is incomplete (for example, because you installed your system without network access and language-support-XX for your language is not on the CD) and offers to complete it.
The maintainer scripts of language-pack-XX-base automatically install all UTF-8 locales for language XX, so normally users do not need to worry about this.
If you want to add a locale without any translations, the easiest way is to call locale-gen with the locales as arguments:
$ sudo locale-gen ru_RU he_IL.UTF-8
You can also generate all available UTF-8 locales for a given language by merely specifying the language code:
$ sudo locale-gen de
How much space does each language occupy on the media?
langpack-o-matic has a script, langpacksize, which calculates the sizes of each language pack set and each installation variant (just GNOME, just KDE, or both). The list is sorted by 'relevance', i.e. the world's top 11 languages come first, then the rest alphabetically. This is the order of preference we use to fill the Ubuntu installation CDs. For this purpose it also calculates the cumulative size, i.e. the size of the current and all previous language pack sets.
It uses apt-cache to get package information, so you have to run it on the Ubuntu release for which you want to get numbers. Example:
$ ./langpacksize MB
en G: 0.86 K: 1.91 G+K: 2.35 GSum: 0.86 KSum: 1.91 G+KSum: 2.35
es G: 8.31 K: 10.56 G+K: 14.21 GSum: 9.17 KSum: 12.46 G+KSum: 16.57
xh G: 1.13 K: 0.14 G+K: 1.16 GSum: 10.30 KSum: 12.61 G+KSum: 17.73
pt G: 10.54 K: 15.98 G+K: 22.22 GSum: 20.84 KSum: 28.59 G+KSum: 39.95
[...]
(The "MB" argument causes the numbers to be given in megabytes. If it is not specified, the output is in bytes.)
This means that on a GNOME (Ubuntu) CD, Spanish (es) takes 8.31 MB (the GNOME and common language packs). Spanish on a KDE (Kubuntu) CD takes 10.56 MB (the KDE and common packs), and all three together take 14.21 MB. All languages from the top of the list up to and including Portuguese (pt) will take 20.84 MB on a GNOME CD. Please note that the script takes the necessary input support packages into consideration as well (e.g. the necessary SCIM tables for Chinese).
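The cumulative columns are what make this output useful for filling a CD: reading down the list, every language whose cumulative size is still within the available space fits. The sketch below parses output in the format shown above and applies that rule. It is not part of langpack-o-matic; the function names and the 15 MB budget in the usage note are assumptions, and the sample data is copied from the example run.

```python
SAMPLE = """\
en G: 0.86 K: 1.91 G+K: 2.35 GSum: 0.86 KSum: 1.91 G+KSum: 2.35
es G: 8.31 K: 10.56 G+K: 14.21 GSum: 9.17 KSum: 12.46 G+KSum: 16.57
xh G: 1.13 K: 0.14 G+K: 1.16 GSum: 10.30 KSum: 12.61 G+KSum: 17.73
pt G: 10.54 K: 15.98 G+K: 22.22 GSum: 20.84 KSum: 28.59 G+KSum: 39.95
"""


def parse_line(line: str) -> tuple[str, dict[str, float]]:
    """Split one output line into (language code, {column: size})."""
    lang, *tokens = line.split()
    # Tokens alternate between "Name:" labels and numeric values.
    keys = [token.rstrip(":") for token in tokens[0::2]]
    values = [float(token) for token in tokens[1::2]]
    return lang, dict(zip(keys, values))


def languages_fitting(rows, budget: float, column: str = "GSum") -> list[str]:
    """Languages, in listed order, whose cumulative size is within budget."""
    fitting = []
    for lang, sizes in rows:
        if sizes[column] > budget:
            break  # the list is cumulative, so nothing further down fits
        fitting.append(lang)
    return fitting


rows = [parse_line(line) for line in SAMPLE.splitlines()]
```

With a hypothetical budget of 15 MB on a GNOME CD, languages_fitting(rows, 15.0) returns en, es and xh, because Portuguese would bring the cumulative size to 20.84 MB.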
More technical information on language packs
Language pack generation
The language pack generation page offers extensive information on the structure and procedures to release language packs.