BetterDesktopCDLanguageSupport

Summary

We should find a way to provide considerably more language support on the desktop CDs. This spec is about improving the language selector (making it easy to enable language support before installation) and including more languages on the CD (through better compression).

Rationale

Currently the selection of languages on the CD is limited and varies from release to release (wildly, for some languages). For most languages the desktop CD UI is always in English. A non-native UI is a bit of a turn-off, but it would still be nice to stick to the official CD images.

Use Cases

  • Tiina starts the Ubuntu desktop CD after selecting Finnish ("suomi") in the boot menu, as instructed in the installation guide she's reading. The CD starts, but everything is in English. She is sad.
    • Jaunty: This one has not seen much change; LZMA is still not in use on the live CD. Other space savings have always been eaten up by new or bigger software.
  • Erkki lets the desktop CD start by itself, ignoring the cryptic boot menu and just pressing Enter. The Ubuntu desktop CD starts, but it's in English and there is no option in plain sight to change the language to his native one. He starts the installation program nevertheless, selects Finnish and goes through with it. In the installed system the UI is still in English, since he didn't enable his WLAN connection before starting the installation program. He is furious, especially since it worked out of the box (i.e. Finnish was included on the CD) in Ubuntu 6.06 and 6.10, but not in 6.06.1 or 7.04.
    • Hardy: Boot menu improved, language selection always offered, DVDs improved, otherwise same.

      Jaunty: Erkki's case is finally quite well fixed in 9.04, thanks to the fix for bug #311228 and the fixes preceding it. Post-install guidance now works to the end, provided the user reads the notification.

Scope

For those who have an Internet connection: essentially, a better and more visible language selector and downloader (available already at bootup, before installation) would fix this.

For those without an Internet connection: use rzip or Squashfs LZMA on the desktop CD so that more languages can fit, and optimize the language packages.

Design

Language Selector

better-livecd-language-selector (old spec): That spec's description is a bit vague, but I think it suggests that, since we can't include everything on the CD anyway, we should provide an easy, automatic way to download language support even before the installation. For example, ask for the language on bootup so that Ubuntu can connect to the Internet and download the language support. Many people miss the "F2" selection in the boot menu, so the language should be asked for automatically if the default (English) was selected (or rather, nothing was selected) in the boot menu.

  • Hardy: Great boot menu improvement, language dialog automatically opens so F2 not needed.
    Jaunty: Language selector got some important fixes (by mvo and me) and now suggests translations to be installed post-install properly.

SquashFS

Better compression would leave room for more languages on the CD and make the problem go away for some additional languages.

Squashfs LZMA (LZMA being 7-zip's default compression algorithm) could in practice save ca. 100MB-150MB of space, but those numbers come from dpkg-7zip usage. On the desktop CD, 7-zip only makes sense via squashfs LZMA (thanks Mithrandir), not via dpkg-7zip.

  • sladen/IRC: squashfs and lzma is mostly pointless without HUGE sectors (>64kB) (tests?)

rzip might be used instead of lzma: "rzip is a (very) simplistic pre-processor. There's a version of rzip somewhere built only with the preprocessor and without the second-level bzip2 code." But perhaps only for non-desktop CDs? ("take all the language packages, uncompress them and rzip the lot together")

Although not directly relevant to SquashFS, these are the numbers from compressing the contents of the Ubuntu 7.04 SquashFS image with tar/gzip, tar/bzip2, and 7z/lzma, just to demonstrate the large differences between compression algorithms:

  • gzip.tar.gz: 703991685 bytes
  • bzip2.tar.bz2: 637187858 bytes
  • lzma.7z: 463900523 bytes
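
As a quick sanity check on these figures, the relative savings can be worked out directly. The snippet below is only an illustrative calculation: it assumes the three numbers are archive sizes in bytes and takes the gzip archive as the baseline, which is a choice made here rather than something the measurement prescribes.

```python
# Archive sizes quoted above, assumed to be in bytes.
SIZES = {
    "gzip.tar.gz": 703991685,
    "bzip2.tar.bz2": 637187858,
    "lzma.7z": 463900523,
}


def saving_percent(size: int, baseline: int) -> float:
    """Return how much smaller `size` is than `baseline`, in percent."""
    return 100.0 * (baseline - size) / baseline


if __name__ == "__main__":
    baseline = SIZES["gzip.tar.gz"]
    for name, size in SIZES.items():
        saved_mb = (baseline - size) / 1e6
        print(f"{name:14s} {saving_percent(size, baseline):5.1f}% smaller "
              f"({saved_mb:.0f} MB saved vs gzip)")
```

With these inputs, bzip2 comes out roughly 9.5% smaller than gzip and LZMA roughly 34% smaller (about 240 MB). These are whole-archive numbers, so they say little about what a block-based SquashFS would gain.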

<pkl_> lzma does bring benefits. 15% better compression perhaps. [with SquashFS]

Language Packages

Language files (both the MO files and the editable PO files) include the untranslated English strings for every language, since the msgids are the English strings. Avoiding this duplication could yield rather big space savings.
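
To get a rough idea of how much of a catalog the English msgids take up, one could count string bytes per keyword in a PO file. The following is a minimal sketch, not a full PO parser: the function names and the sample catalog are made up for illustration, escape sequences are counted as written, and the PO header entry (which sits in a msgstr) is not treated specially.

```python
def string_bytes_by_kind(po_text: str) -> dict:
    """Count UTF-8 bytes of quoted strings, split into msgid and msgstr."""
    totals = {"msgid": 0, "msgstr": 0}
    current = None  # keyword that continuation lines belong to
    for raw in po_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if not line.startswith('"'):
            keyword, _, line = line.partition(" ")
            if keyword.startswith("msgid"):     # msgid, msgid_plural
                current = "msgid"
            elif keyword.startswith("msgstr"):  # msgstr, msgstr[0], ...
                current = "msgstr"
            else:                               # e.g. msgctxt: not counted
                current = None
            line = line.strip()
        if current and len(line) >= 2 and line[0] == '"' and line[-1] == '"':
            totals[current] += len(line[1:-1].encode("utf-8"))
    return totals


def msgid_share(po_text: str) -> float:
    """Fraction of all counted string bytes that are msgid bytes."""
    totals = string_bytes_by_kind(po_text)
    total = totals["msgid"] + totals["msgstr"]
    return totals["msgid"] / total if total else 0.0


# Hypothetical two-entry catalog, used only to exercise the functions.
SAMPLE_PO = '''
msgid "Open"
msgstr "Avaa"

msgid "Save the current document"
msgstr "Tallenna nykyinen asiakirja"
'''

if __name__ == "__main__":
    print(string_bytes_by_kind(SAMPLE_PO))
    print(f"msgid share: {msgid_share(SAMPLE_PO):.0%}")
```

Running the same count over the real language pack sources would show whether the English strings are worth optimizing away; the sample here is far too small to say anything about that.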

Sidenote: some projects have experimented with short msgids plus a separate MO file for English that contains the actual English strings. IRC blurbs:

These specifications would also help:

language-packs-for-documentation: Language packs for documentation could be created, and it would be reasonable to download them from the Internet for languages other than English, as is currently done for the language-support-NN packages. This would probably free some space on the CD for the actual UI language packs.

Jaunty is seeing some work on this in a superseding blueprint, jaunty-gnome-help-langpacks.

larger-livefs would address the problem on DVDs, but not on the desktop CD images.

DVDs now include all langpacks properly!

Implementation

Language Selector

On bootup, present a modified (System/Administration/)Language Support UI with a simple language selection (like the first screen of the installer) that:

  1. Asks for the language (the default being what was selected in the boot menu, i.e. English if nothing was selected).
  2. Then checks the status of the language packs for the selected languages as usual (your language support is not complete, would you like to download?). Tells the user that an Internet connection is required, and that WLAN connections can be enabled via the network manager icon.
  3. Downloads and installs the language packs, maybe restarting gnome-panel to have it translated.
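
The decision logic behind these three steps can be sketched as follows. This is only a rough illustration, not code from the actual Language Support tool: the function, the `LanguagePlan` type and the action names are hypothetical, and it assumes a single `language-support-NN` package per language as mentioned elsewhere in this spec.

```python
from dataclasses import dataclass, field


@dataclass
class LanguagePlan:
    language: str                                # language code to use
    missing: list = field(default_factory=list)  # packages still to fetch
    action: str = "nothing"  # "nothing", "download" or "ask_for_network"


def plan_language_setup(boot_choice, installed_packages, network_up,
                        user_choice=None):
    """Decide what the selector should do next (hypothetical helper)."""
    # Step 1: the user's answer wins; otherwise fall back to the boot menu
    # selection, and to English if nothing was selected there either.
    language = user_choice or boot_choice or "en"

    # Step 2: check whether language support for that language is complete.
    wanted = [f"language-support-{language}"]  # assumed package naming
    missing = [pkg for pkg in wanted if pkg not in installed_packages]
    if not missing:
        return LanguagePlan(language)

    # Step 3: download if there is a connection; otherwise tell the user
    # that one is needed (e.g. via the network manager icon).
    action = "download" if network_up else "ask_for_network"
    return LanguagePlan(language, missing, action)


if __name__ == "__main__":
    on_cd = {"language-support-en"}
    print(plan_language_setup("fi", on_cd, network_up=False))
    print(plan_language_setup("fi", on_cd, network_up=True))
    print(plan_language_setup(None, on_cd, network_up=False))
```

Keeping the decision separate from the UI and from the actual package download would make a flow like this easy to test without booting a CD image.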

SquashFS

http://en.wikipedia.org/wiki/SquashFS

Compression savings come from finding duplicated strings.
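
A small experiment illustrates the point. The sketch below uses Python's standard `lzma` module (xz container, default preset) as a stand-in for whatever compressor ends up on the CD; the sample string is made up, and real CD contents will of course fall somewhere between the two extremes shown.

```python
import lzma
import os


def compressed_size(data: bytes) -> int:
    """Size of `data` after LZMA compression (xz container, default preset)."""
    return len(lzma.compress(data))


# The same UI string repeated many times, as a stand-in for duplicated text.
REPEATED = b"Open the file manager\n" * 4096
# Random bytes of the same length contain no duplicated strings to exploit.
RANDOM = os.urandom(len(REPEATED))

if __name__ == "__main__":
    print("input size:     ", len(REPEATED))
    print("repeated string:", compressed_size(REPEATED))
    print("random bytes:   ", compressed_size(RANDOM))
```

The repeated input shrinks to a tiny fraction of its size, while the random input does not shrink at all.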

Rzip/LZMA usage. SquashFS-LZMA might bring a 15% benefit. Is there any way to use rzip on the desktop CD?

One option is to concentrate on the language packs only; another is to work on the whole CD. Or compress the whole CD in a better way and additionally do special tricks with the language packs.

As a long term idea, desktop and alternate CDs could be combined.

Kernel space limitations

There are memory usage restrictions when working in kernel space (128kB allocations), which put additional constraints on compression as well. Besides SLAB, other allocators can be used that allow for bigger allocations (at least from the usage point of view).

AP: SquashFS-LZMA (if LZMA support gets into the kernel), much larger block size (also with the current SquashFS).
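
To get a feel for why block size matters, one can compress the same data in independently compressed blocks of different sizes and compare the totals. The sketch below does this in user space with Python's `lzma` module; it is not SquashFS, the synthetic sample data and function names are made up, and the per-block xz container overhead somewhat exaggerates the cost of small blocks.

```python
import lzma
import random


def compress_in_blocks(data: bytes, block_size: int) -> int:
    """Total size when each block_size chunk is compressed on its own."""
    total = 0
    for start in range(0, len(data), block_size):
        total += len(lzma.compress(data[start:start + block_size]))
    return total


def make_sample(seed: int = 0) -> bytes:
    """Synthetic text: 8000 lines drawn from a pool of 500 random strings.

    Repeats are usually far apart, so a small block rarely contains the
    same string twice.
    """
    rng = random.Random(seed)
    letters = "abcdefghijklmnopqrstuvwxyz"
    pool = ["".join(rng.choice(letters) for _ in range(32))
            for _ in range(500)]
    return "\n".join(rng.choice(pool) for _ in range(8000)).encode("ascii")


if __name__ == "__main__":
    sample = make_sample()
    print("uncompressed:", len(sample), "bytes")
    for size in (4 * 1024, 64 * 1024, 1024 * 1024):
        total = compress_in_blocks(sample, size)
        print(f"{size // 1024:5d} KiB blocks: {total} bytes")
```

Because each block is compressed without knowledge of the others, duplicates that fall into different blocks cannot be shared, which is why the totals drop as the block size grows.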

Outstanding Issues

BoF agenda and discussion

Seems to have turned into a discussion of SquashFS inner workings.


CategorySpec

BetterDesktopCDLanguageSupport (last edited 2009-02-20 12:15:17 by dsl-hkibrasgw2-fef7de00-81)