Introduction

This page documents the technical details of the language pack generation process.

Ubuntu langpack admins

Language packs are administered by Ubuntu language pack builders.

language-pack- and language-support-

We provide the following packages for each language:

The following packages are optional and only exist for languages where we have additional software to provide input methods, writing aids or additional font packages for that language:

Warning /!\ Since Karmic, the following Meta packages are obsolete and have been removed from the archive.

Language codes

In general, we follow the ISO-639 list of languages and group all variants of a language into a single set of language-pack and language-support packages. Since Karmic, there is one exception to this rule: the Chinese (zh) language-pack and language-support packages are split into two sets, one for Simplified Chinese (hans) and one for Traditional Chinese (hant). The reason is that Chinese translations usually exist twice, once for zh_CN and once for zh_TW, so the majority of all translations are duplicates, written in different scripts and sometimes with differing vocabulary. Putting both Simplified and Traditional Chinese into the same package would waste download capacity and disk space for every Chinese user, since a user only chooses one variant. Because the two variants differ in script and vocabulary, they also cannot use the fallback mechanism of gettext.

This language-pack split has caused us to add certain exceptions to the code which handles translations and language-packs. This is true for langpack-o-matic, po2xpi and language-selector.

Mapping between locales and Simplified vs. Traditional Chinese

Usually, translations use a locale code (language + country) to indicate whether they are Simplified Chinese or Traditional Chinese translations. The most common of those are zh_CN (= Simplified Chinese) and zh_TW (= Traditional Chinese). Sometimes other regions use a different vocabulary than mainstream Simplified and Traditional Chinese. Therefore, we can define the following mapping (in the order of fallback):
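As a partial illustration only, the following sketch shows how code might pick the package variant from a locale. It covers just the two locales named above; the function name, the "zh-hans"/"zh-hant" spelling of the result and the error raised for other zh locales are assumptions, and the full fallback list is not reproduced here.

```python
# Only the two mappings stated in the text. Other zh_* locales are left
# out on purpose because their fallback order is not given here.
ZH_VARIANTS = {
    "zh_CN": "hans",  # Simplified Chinese
    "zh_TW": "hant",  # Traditional Chinese
}


def langpack_code(locale: str) -> str:
    """Return the language-pack code for a locale (hypothetical helper)."""
    base = locale.split(".")[0].split("@")[0]  # drop encoding and modifier
    language = base.split("_")[0]
    if language != "zh":
        return language  # all variants share one set of packages
    if base not in ZH_VARIANTS:
        raise ValueError(f"no Simplified/Traditional mapping known for {locale!r}")
    return "zh-" + ZH_VARIANTS[base]
```

With this sketch, langpack_code("pt_BR") yields "pt", while langpack_code("zh_TW") yields "zh-hant".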

Translations Export from Launchpad

Translations are exported into translation tarballs once per week for stable releases and twice per week for development releases of Ubuntu (Exports schedule). These tarballs are stored in launchpadlibrarian.

For each Ubuntu release, there is a language-pack administration page:

The first translation export in a newly started release cycle is always a Base (or "full") export, i.e. the tarball contains all translations for all templates in that release which are marked for export into language packs. It is usually done manually by the Launchpad Translations team (aka Rosetta team) on request. Whether or not a template's translations are exported into the translation tarballs is set on the template admin pages, which are accessible to Ubuntu Translations Coordinator team members.

The subsequent translation exports are by default Delta exports, i.e. the tarballs only contain those translation files which have changed since the configured active Base translation tarball.
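The relation between a Base and a Delta export can be sketched as a simple comparison. This is not Launchpad's implementation: how Launchpad detects changes is not described here, so the sketch assumes each export is represented as a mapping from file path to a content checksum, and the function name is hypothetical.

```python
def delta_files(base: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return paths that are new or whose checksum differs from the base export."""
    return sorted(
        path for path, digest in current.items() if base.get(path) != digest
    )


base = {"de/LC_MESSAGES/foo.po": "a1", "fr/LC_MESSAGES/foo.po": "b1"}
current = {
    "de/LC_MESSAGES/foo.po": "a2",  # changed since the base export
    "fr/LC_MESSAGES/foo.po": "b1",  # unchanged, so not part of the delta
    "fr/LC_MESSAGES/bar.po": "c1",  # new file
}
print(delta_files(base, current))
# ['de/LC_MESSAGES/foo.po', 'fr/LC_MESSAGES/bar.po']
```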

The language-pack administration website has the following lists and configuration items:

Layout of the translation tarballs

./rosetta-$release/
 ├ mapping.txt
 ├ timestamp.txt
 ├ $languagecode/
 │ └ LC_MESSAGES/
 │   └ $translation_domain.po
 └ xpi/
   ├ firefox[-3.5]/
   │ └ $languagecode.po
   └ xulrunner[-1.9.1]/
     └ $languagecode.po
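
A minimal sketch of walking an unpacked tarball with this layout, assuming the tree above is complete. The function names are hypothetical; langpack-o-matic's own traversal code may differ.

```python
from pathlib import Path


def list_translations(root: Path) -> list[tuple[str, str]]:
    """Return sorted (language code, translation domain) pairs below root."""
    pairs = []
    for po in root.glob("*/LC_MESSAGES/*.po"):
        language = po.parent.parent.name  # $languagecode directory
        pairs.append((language, po.stem))  # file name without ".po"
    return sorted(pairs)


def list_xpi_sources(root: Path) -> list[tuple[str, str]]:
    """Return sorted (template, language code) pairs below root/xpi."""
    return sorted((po.parent.name, po.stem) for po in root.glob("xpi/*/*.po"))
```

Here root is the unpacked ./rosetta-$release/ directory; the gettext files and the Mozilla files are listed separately because they are processed differently later on.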

langpack-o-matic

Langpack-o-matic is a service running on a Canonical internal server, which assembles language-packs out of the translation tarballs from Launchpad.

Directory structure

/srv/language-packs.ubuntu.com/
 ├ home/
 ├ $release[-proposed]/
 │ ├ sources-base/
 │ │ ├ language-pack-$languagecode-base/
 │ │ ├ language-pack-gnome-$languagecode-base/
 │ │ └ language-pack-kde-$languagecode-base/
 │ ├ sources-update/
 │ │ ├ language-pack-$languagecode/
 │ │ ├ language-pack-gnome-$languagecode/
 │ │ └ language-pack-kde-$languagecode/
 │ ├ sources-support/
 │ │ ├ language-support-$languagecode/
 │ │ ├ language-support-fonts-$languagecode/
 │ │ ├ language-support-input-$languagecode/
 │ │ ├ language-support-writing-$languagecode/
 │ │ ├ language-support-translations-$languagecode/ (only for Hardy to Jaunty)
 │ │ └ language-support-extra-$languagecode/ (only for Hardy to Jaunty)
 │ └ zh-transitional/ (only for Karmic and Lucid)
 ├ langpack-o-matic/
 └ logs/

Code directory structure

./langpack-o-matic/
 ├ bin/
 ├ check-supdeps-components
 ├ config/
 ├ copy-packages
 ├ cron.daily
 ├ doc/
 ├ extra-files/
 ├ import
 ├ langpacksize
 ├ lib/
 ├ maps/
 ├ merge-tarballs
 ├ mozilla-upstream-locales/
 ├ operator-guide.txt
 ├ packages
 ├ po2xpi/
 ├ skel-*/
 ├ support-depends/
 │ └ $release/
 │   └ $languagecode
 ├ updated-packages
 ├ update-maps
 ├ update-support
 ├ upgrade-notes/
 └ zh-transitional/

po2xpi (Mozilla translation handling)

Background knowledge

  1. XPI layout vs. upstream VCS: The layout of the XPIs (and the form in which Firefox accepts them) is radically different from what upstream uses in its VCS (Mercurial).
    1. XPI format:

      XPI files are simply ZIP archives. You can unpack them with unzip $langcode.xpi in a temporary directory. The archive structure looks like this:

        ./
         ├ chrome/
         │ └ $langcode.jar
         ├ chrome.manifest
         └ install.rdf

      The $langcode.jar file is another ZIP archive containing the real translations:

        ./locale/
         └ $component/
           ├ .dtd
           ├ .properties
           └ $subcomponent/
             ├ .dtd
             └ .properties
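
Since both layers are ZIP archives, the nested structure can be inspected without unpacking to disk. The following is a minimal sketch using Python's standard zipfile module; the function name is hypothetical, and the jar location chrome/$langcode.jar is taken from the layout above.

```python
import io
import zipfile


def list_xpi_translations(xpi_file, langcode: str) -> list[str]:
    """Return the .dtd and .properties paths inside chrome/<langcode>.jar."""
    with zipfile.ZipFile(xpi_file) as xpi:
        jar_bytes = xpi.read(f"chrome/{langcode}.jar")
    # The .jar is itself a ZIP archive, so open it from memory.
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        return sorted(
            name
            for name in jar.namelist()
            if name.endswith((".dtd", ".properties"))
        )
```

xpi_file may be a path or an open binary file object.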

      The chrome.manifest file states the paths to the translations for the different Firefox components within the .jar file. The install.rdf file is an XML file which contains some metadata:

      <?xml version="1.0"?>
      <!--
      
      -->
      
      <RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:em="http://www.mozilla.org/2004/em-rdf#">
        <Description about="urn:mozilla:install-manifest"
                     em:id="langpack-bs@xulrunner-1.9.ubuntu.com"
                     em:name="Xulrunner (bs)"
                     em:version="1.9"
                     em:type="8"
                     em:creator="http://translations.launchpad.net">
          <em:contributor></em:contributor>
      
          <em:targetApplication>
            <Description>
              <em:id>toolkit@mozilla.org</em:id>
              <em:minVersion>1.9</em:minVersion>
              <em:maxVersion>1.9.0.*</em:maxVersion>
            </Description>
          </em:targetApplication>
        </Description>
      </RDF>

      Warning /!\ Please note the version numbers within the install.rdf file! They decide whether or not the translations are compatible with the current Firefox or Xulrunner version. The above example is compatible with the Xulrunner version we ship in Hardy, Intrepid and Jaunty (i.e. Xulrunner 1.9), and won't work with the version we currently ship in Karmic and Lucid (i.e. Xulrunner 1.9.1).
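
The version check described in this warning can be sketched as follows. This is a simplified illustration, not Mozilla's version comparison: it assumes purely numeric, dot-separated versions with an optional "*" wildcard in maxVersion, and it will raise ValueError on versions such as "3.5b4". All function names are hypothetical.

```python
import xml.etree.ElementTree as ET

EM = "{http://www.mozilla.org/2004/em-rdf#}"  # em: namespace from install.rdf


def _numbers(version: str) -> list[int]:
    return [int(part) for part in version.split(".")]


def _at_least(version: str, minimum: str) -> bool:
    v, m = _numbers(version), _numbers(minimum)
    width = max(len(v), len(m))
    v += [0] * (width - len(v))  # pad so that "1.9" equals "1.9.0"
    m += [0] * (width - len(m))
    return v >= m


def _at_most(version: str, maximum: str) -> bool:
    v = _numbers(version)
    parts = maximum.split(".")
    for i, part in enumerate(parts):
        if part == "*":
            return True  # wildcard accepts anything from here on
        have = v[i] if i < len(v) else 0
        if have != int(part):
            return have < int(part)
    # Equal so far: any extra components must be zero.
    return all(n == 0 for n in v[len(parts):])


def is_compatible(install_rdf: str, app_version: str) -> bool:
    """Check app_version against minVersion/maxVersion in install.rdf text."""
    root = ET.fromstring(install_rdf)
    minimum = root.find(f".//{EM}minVersion").text
    maximum = root.find(f".//{EM}maxVersion").text
    return _at_least(app_version, minimum) and _at_most(app_version, maximum)
```

For the install.rdf shown above (minVersion 1.9, maxVersion 1.9.0.*), this sketch accepts 1.9.0.5 and rejects 1.9.1, which matches the behaviour described in the warning.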

    2. Upstream VCS (Mercurial) layout:

  1. Checking upstream Mozilla translations for errors before they get released
    • FIXME: Ask jtv to review this section.
    Yes, we can do that! That is, if you have the Launchpad source code unpacked on your machine. To set up the environment to do such checks, you'll need the following:
    1. The Launchpad source code (https://dev.launchpad.net/Getting)

    2. The following manual steps:
        $ cd ~/launchpad/lp-branches/
        $ bzr branch lp:~jtv/launchpad/validate-translations-file
        $ cd validate-translations-file
        $ ./utilities/link-external-sourcecode ../trunk

    Finished. Now you can use the same XML parser as in Launchpad to check if the translations will pass the import script (in this example, all the upstream translation branches are under ~/build/mozilla/):

     $ cd ~/launchpad/lp-branches/validate-translations-file
     $ find ~/build/mozilla/ -name \*.dtd | xargs ./scripts/rosetta/validate-translations-file.py &> ~/mozilla_dtd_errors.log

    Now, you have the report in ~/mozilla_dtd_errors.log and can fix the upstream translations in their VCS tree. Submit the diff of your changes to the Mozilla bugtracker, one for each language you modified.

    (i) The XML parser Launchpad uses is not the best one out there. If you want to help improve or replace the parser, the source code is in ~/launchpad/lp-branches/trunk/sourcecode/old_xmlplus/.

  2. Mozilla translations from upstream get imported into Launchpad:
    1. History:

      While upstream ships Firefox as a single application, Debian and Ubuntu have split the source into the Xulrunner and Firefox packages. Xulrunner provides the base, and Firefox is a XUL application which gets executed in a XUL environment. This allows other applications which, like the Firefox extensions, are written for XUL to be executed as standalone programs in a XUL environment without depending on the whole Firefox browser.

      (i) Starting from Firefox 3.6, upstream will not use XUL any more for their browser extensions. As a consequence we won't split the Firefox code any more, but only ship one package, like upstream does.

      (i) When Firefox 3.6 gets released, we will update Firefox in all actively supported Ubuntu releases (Hardy to Lucid) to Firefox 3.6.

    2. Templates:

      The en-US.xpi package is to Mozilla applications what the .pot file is to applications using gettext. Normally, when building Mozilla applications from source, the build process extracts the English strings from the source and puts them into an en-US XPI directory structure. This directory structure gets installed together with the compiled program.

      We have patched the Firefox and Xulrunner packages in order to extract that directory structure and convert it into an XPI package.

      This en-US.xpi package gets pushed to Rosetta the same way the stripped .pot and .po files get pushed from other applications. Rosetta treats the en-US.xpi as a template for the corresponding source package.

      (i) Although the en-US.xpi packages generated by the Firefox and Xulrunner packages carry the same name, they only contain the necessary strings for the corresponding component.

      (i) Starting from Firefox 3.6 the Firefox and Xulrunner templates in Launchpad will get merged into one. The corresponding translations will also get merged. This is a manual process and will take some time.

    3. Translations:

      Upstream Mozilla ships a number of translations as XPI packages. Since upstream does not separate Firefox from Xulrunner, their translation XPI packages contain the translations for both components.

      Every time upstream releases a new Firefox version, we need to update our language-packs to include their latest XPI translation packages. Occasionally the format changes, so older XPI translation packages are not compatible with the latest Firefox version, even for minor upgrades. For that reason we need to manually pull the upstream translations from Mozilla and import them into Launchpad. This is currently handled by Arne Goetje.

      Occasionally Launchpad refuses to import some upstream XPIs because of buggy files inside the XPI packages. In those cases the XPIs need to be patched and uploaded again until Launchpad accepts them.

      When uploading upstream XPIs to Launchpad, the import script takes care of splitting the translations into the firefox and xulrunner templates.

Building XPIs

When traversing the ./rosetta-$release/xpi/ directory in the unpacked translation tarball, the .po files will get fed to the po2xpi script.

1. Directory structure in po2xpi/data/:

./data/
 ├ $ubuntu_release_version/
 │ ├ blacklist.txt (obsolete)
 │ ├ $template (firefox[-3.5] and xulrunner[-1.9.1])
 │ │ └ xpi/ -> ../../common_xpi[-3.5]/ 
 │ ├ merge-hints.txt
 │ └ whitelist.txt
 ├ common_xpi/
 │ └ .xpi
 └ common_xpi-3.5/
   └ .xpi

For each language, po2xpi will check if the language has upstream XPI files in data/common_xpi[-3.5]/ or is whitelisted in data/$ubuntu_release_version/whitelist.txt.
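
This eligibility check can be sketched as below. It is not the po2xpi code: the sketch assumes that upstream XPIs are named $langcode.xpi and that whitelist.txt holds one language code per line with optional "#" comments, neither of which is confirmed here. The function names are hypothetical.

```python
from pathlib import Path


def read_whitelist(path: Path) -> set[str]:
    """Read language codes, assuming one per line and '#' starting a comment."""
    if not path.exists():
        return set()
    codes = set()
    for line in path.read_text(encoding="utf-8").splitlines():
        code = line.split("#", 1)[0].strip()
        if code:
            codes.add(code)
    return codes


def should_build(data_dir: Path, release: str, langcode: str, suffix: str = "") -> bool:
    """True if an upstream XPI exists or the language is whitelisted.

    suffix selects the XPI directory, e.g. "" or "-3.5".
    """
    upstream = data_dir / f"common_xpi{suffix}" / f"{langcode}.xpi"
    whitelist = read_whitelist(data_dir / release / "whitelist.txt")
    return upstream.is_file() or langcode in whitelist
```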

Because of the way Mozilla uses translations, a translation must be 100% complete in Launchpad before it can be used. In gettext, the English string serves as the identifier and is used as a fallback in case the translation string is empty. Mozilla instead uses a variable name as the identifier and has no fallback mechanism. That's why every message identifier must have a value.

The po2xpi script ensures that if a message identifier has no value in the .po files exported from Launchpad, the corresponding value from en-US.xpi gets filled in.
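
The fill-in step can be sketched as a dictionary merge. This is an illustration of the idea, not the po2xpi implementation; it assumes both the en-US strings and the translation have already been parsed into mappings from message identifier to value, and the function name is hypothetical.

```python
def fill_untranslated(english: dict[str, str], translated: dict[str, str]) -> dict[str, str]:
    """Return a complete mapping: the translation where present, English otherwise.

    Identifiers that exist only in the translation are dropped, because the
    en-US template defines which identifiers exist.
    """
    return {
        key: translated.get(key) or value  # missing or empty falls back to en-US
        for key, value in english.items()
    }


english = {"menu.file.label": "File", "menu.edit.label": "Edit"}
german = {"menu.file.label": "Datei", "menu.edit.label": ""}
print(fill_untranslated(english, german))
# {'menu.file.label': 'Datei', 'menu.edit.label': 'Edit'}
```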

Warning /!\ If we imported those results into Launchpad again, as we did for the Firefox-3.0 to Firefox-3.5 transition and will do again when Firefox-3.6 comes out, the templates would appear to be 100% translated for those languages, although the missing translations have actually just been filled in from en-US.xpi and therefore should be considered untranslated.

The import script

The import script in the langpack-o-matic source tree assembles the language-pack source packages. If not present, it creates package skeletons in ../$release[-proposed]/sources-base/ and ../$release[-proposed]/sources-update/ by copying the skel-base/ and skel-update/ directories respectively.
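
The "create if not present" step can be sketched as below. This only shows the copy; any substitution of language codes or versions inside the skeleton files that the real import script may perform is not covered, and the function name is hypothetical.

```python
import shutil
from pathlib import Path


def ensure_skeleton(skel_dir: Path, package_dir: Path) -> bool:
    """Copy skel_dir to package_dir if package_dir is missing.

    Returns True if the package directory was created, False if it existed.
    """
    if package_dir.exists():
        return False
    package_dir.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(skel_dir, package_dir)
    return True
```

For example, skel_dir could be skel-base/ and package_dir could be ../$release/sources-base/language-pack-$languagecode-base/.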

Then, it traverses the tarball, pipes every .po file through msgfmt to check for errors and copies them into their respective language-pack structures.
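
A minimal sketch of such a check for a single file. The exact msgfmt options used by the import script are not stated here; the sketch assumes msgfmt --check with the output discarded, and the function name and the injectable command parameter are hypothetical.

```python
import os
import subprocess

# Assumed invocation: validate the file and throw away the compiled output.
MSGFMT = ("msgfmt", "--check", "-o", os.devnull)


def check_po(po_path: str, command: tuple[str, ...] = MSGFMT) -> tuple[bool, str]:
    """Run the checker on one .po file and return (ok, error output)."""
    result = subprocess.run(
        [*command, po_path], capture_output=True, text=True, check=False
    )
    return result.returncode == 0, result.stderr
```

A caller would skip files for which ok is False and log the error output instead of copying the file into the language-pack.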

If there are files in the xpi/ subdirectory, the import script calls po2xpi, which builds the Mozilla translation structures and tars them up into a mozilla.tar tarball. This tarball gets copied into the language-packs.

If there are files in the extra-files/ directory, they will get tar'ed up into extra.tar tarballs and also copied into the language-packs.
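
Packing a directory into an uncompressed tarball can be sketched with Python's tarfile module. The internal layout of extra.tar and whether there is one tarball per language are not described here, so the sketch simply stores all files with paths relative to the given directory; the function name is hypothetical.

```python
import tarfile
from pathlib import Path


def build_extra_tar(extra_dir: Path, tar_path: Path) -> list[str]:
    """Pack every file below extra_dir into tar_path and return the member names.

    tar_path should be outside extra_dir so the tarball does not include itself.
    """
    members = []
    with tarfile.open(str(tar_path), "w") as tar:
        for path in sorted(extra_dir.rglob("*")):
            if path.is_file():
                name = path.relative_to(extra_dir).as_posix()
                tar.add(str(path), arcname=name)
                members.append(name)
    return members
```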

Since Karmic, the import script will check for static translations (e.g. gnome-docs translations) on launchpadlibrarian and copy them into the language-packs as well.

Upload

Langpack PPA

Recipes

These recipes are intended for common actions carried out by the UbuntuTranslationsCoordinators team.

Updating language packs for the stable release

(i) If you are a language pack admin you should consult the langpack-o-matic operator guide for more detailed technical information on how to perform some of these steps.



Translations/TranslationLifecycle/LanguagePackGeneration (last edited 2010-03-16 08:01:00 by www)