BetterCJKSupportSpecification

Revision 21 as of 2005-12-08 06:25:58

Summary

This project aims at improving CJK support in Ubuntu.

Rationale

As of Breezy, the default configuration of various applications and of the desktop as a whole is poorly suited to Asian users and to users in certain countries. For example, the default desktop font size is simply too small for CJK users, especially Chinese users. To improve the experience for these users, some packages need to be patched, while others may need additional configuration.

Use cases

  • Chulsu installed Ubuntu on his laptop and opened Firefox to visit his favorite Korean web forum. He immediately wondered, first, "Why does this page look different from Firefox on my Windows desktop?" and second, "How do I input Korean to write my reply on the forum?" He started searching the Ubuntu Korean wiki and KLDP, and asked his questions there. After several days, he had learned how to install font packages, how to configure .fonts.conf in his home directory, how to install and use a Korean input method, and so on. Now he is thinking, "Why is Linux so much more difficult than Windows? If all of this were installed and configured when I installed Ubuntu, that would be the way to go."
  • Yeonhee loves to listen to her favorite CDs while she works on her writing in OpenOffice. For her one-month trip to Jeju island, she wanted to convert them to MP3, but couldn't find a conversion tool in the Ubuntu installed on her new laptop. So she converted her favorite songs, with MP3 tags, on her Windows machine, then opened Rhythmbox on her Ubuntu laptop to test them. Now she is looking at song names that are not displayed correctly in Korean and wondering, "How am I going to go on my trip?"

  • Miyoung wanted to try Linux for her class, but she had never used Linux before. Her classmate gave her an Ubuntu CD, so she was happy. But on the way home, she worried about installing from the CD onto her desktop, which already had Windows installed, and decided: "OK, I am going to search the Ubuntu site for installation help. It would be great if I could find Korean guides there. I should learn English... Does this CD support Korean?"

Scope

Design

  1. Install a default input method such as scim, and start it automatically when the user starts X. In addition, users should be able to keep their own individual settings.
    • Useful links here for Korean Input Methods
    • Scim shall be the default input method: many IM engines are already based on scim, and so far its language support is the best. Fedora and Mandriva also use it by default.
  2. Better tuning of environment variables for CJK users in language-selector, the installer, etc.
  3. Tune the fontconfig settings to achieve better CJK font display (e.g. more solid font outlines, bold type, bitmaps for medium font sizes, etc.). To achieve this, we need support from certain font packages, such as ttf-arphic-uming/ukai.
    • Use ttf-arphic-uming/ukai by default, since these are the only packages that contain Hong Kong characters for all sizes.
    • Install xfonts-wqy for Simplified Chinese installations; ttf-newsung is not needed since it has already been included in uming/ukai.
    • On this fontconfig topic, Korean Linux users are discussing which font should replace ttf-baekmuk as the distribution default; currently the most favored font is ttf-unfonts, followed by ttf-alee. ["KoreanTeam"] will provide an up-to-date ["BeautifyKoreanFonts"] once a font package has been decided on.

  4. CJK users should be able to see the ID3 tags of their MP3 files displayed correctly. Historically, tagging has been a mess: everybody uses their own legacy encoding for MP3 tags, because the ID3 tag specification did not support non-Western languages until very recently.
    • For applications that use GStreamer, setting GST_ID3_TAG_ENCODING can be an interim solution. There is more discussion on ["UTFEightCurrentProblems"].
  5. (?) Allow users to read/write CJK on the console.
    • Or, where this is impossible, change $LANGUAGE to C automatically so that users won't see lots of junk on the console.
  6. Better support of CJK fonts in OpenOffice.org.

  7. Configure Firefox to print CJK correctly.
  8. Enable font emboldening ("embolden") by default for CJK users.
    • Debian unstable has freetype 2.1.10, and Dapper has it now as well (2005/11/11); please take care of the next two items below.

    • Build xft2, fontconfig, pango and cairo2 with embolden enabled
    • This bug was found in Debian's freetype 2.1.10 package, and of course also affects Dapper's:
      1. Individual words in a sentence are drawn sloping upward to the right, like a series of ramps. This happens more often at larger sizes, such as web page headings, than at smaller ones, and more often in Konqueror and Opera than in Firefox. However, a Gentoo user who had compiled xorg-x11-7.0 rc1 got a much nicer display. Please see the screenshots at this link: http://bbs.kldp.org/viewtopic.php?t=65304&highlight= You can also compare the rendering quality of Akito's patch (top) and "embolden" (bottom) in this screenshot; the top one is much better. http://bbs.kldp.org/download.php?id=5319

      2. Here is another screenshot that shows the embolden rendering problem in Konqueror (3.5.0-0ubuntu1 + fontconfig 2.3.2-1.1ubuntu1). http://bbs.kldp.org/viewtopic.php?p=336668#336668
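
To make the ID3 tag problem in item 4 concrete, here is a minimal Python sketch of why legacy-encoded tags show up as junk and how trying a list of candidate encodings can recover them. The function name decode_tag and the candidate list are hypothetical and only for illustration; this is not how GStreamer or Rhythmbox is implemented, merely the same idea as pointing GST_ID3_TAG_ENCODING at a legacy encoding.

```python
# Illustration only: decode_tag and its candidate list are hypothetical.
def decode_tag(raw, candidates=("utf-8", "euc-kr")):
    """Decode raw ID3 text bytes by trying candidate encodings in order."""
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: never fail, but unknown bytes may come out as junk.
    return raw.decode("latin-1", errors="replace")


# A Korean song title written by a legacy tagger in EUC-KR.
raw_title = "아리랑".encode("euc-kr")

# Decoding as Latin-1, which older ID3 versions assume, gives mojibake.
garbled = raw_title.decode("latin-1")

# Trying UTF-8 first and then EUC-KR recovers the original title.
recovered = decode_tag(raw_title)
```

The order of candidates matters: UTF-8 is tried first because valid UTF-8 is rarely also meaningful text in a legacy encoding, whereas the reverse guess is much less reliable. Any real fix would need a per-locale candidate list.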

Implementation

Code

Data preservation and migration

Packages affected

input methods:

font packages:

freetype:

fontconfig:

firefox:

openoffice.org:

xinit:

  • A script has to be added to /etc/X11/Xsession.d/. It should automatically set $XMODIFIERS, $GTK_IM_MODULE, $QT_IM_MODULE and $XIM_PROGRAM according to the system locale, or read the user's personal settings and set these variables accordingly.
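
The selection logic such a script needs can be sketched as follows. This is written in Python purely to show the decision steps; a real Xsession.d hook would be a shell fragment. The table IM_SETTINGS, the set CJK_LANGUAGES, the function im_environment, and the specific values shown for scim are assumptions for illustration, not settings taken from any existing package.

```python
# Hypothetical per-input-method settings; the scim values are assumed.
IM_SETTINGS = {
    "scim": {
        "XMODIFIERS": "@im=SCIM",
        "GTK_IM_MODULE": "scim",
        "QT_IM_MODULE": "scim",
        "XIM_PROGRAM": "scim -d",
    },
}

# Languages that get an input method by default (assumption).
CJK_LANGUAGES = {"zh", "ja", "ko"}


def im_environment(locale_name, user_choice=None):
    """Return the IM environment variables for a session.

    A user's personal choice wins over the locale-based default.
    An empty dict means "leave the environment untouched".
    """
    if user_choice is not None:
        return dict(IM_SETTINGS.get(user_choice, {}))
    # "ko_KR.UTF-8" -> "ko_KR" -> "ko"
    language = locale_name.split(".")[0].split("_")[0].lower()
    if language in CJK_LANGUAGES:
        return dict(IM_SETTINGS["scim"])
    return {}
```

For example, im_environment("ko_KR.UTF-8") yields the scim settings, while im_environment("en_US.UTF-8") yields an empty mapping unless the user has explicitly chosen an input method.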

language-selector:

  • It should set appropriate environment variables such as $LANGUAGE and $LANG according to real-life usage, not just dummy settings. For example, Hong Kong users mostly rely on the Taiwan translation, but may have their own; thus the correct setting is LANGUAGE=zh_HK:zh_TW.

  • Add a variable, say $CONSOLE_NOT_LOCALIZED, and define it for each language. In particular, set it to "yes" for all CJK languages, so that bash startup can redefine $LANGUAGE to C on the console (and on the console ONLY!).
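
The two rules above can be sketched in Python as below. The sketch is only meant to show the logic, under these assumptions: the fallback table LANGUAGE_FALLBACKS contains just the zh_HK example from this page, the set CONSOLE_NOT_LOCALIZED stands in for the proposed per-language variable, and the console is detected by TERM being "linux", which is one possible test and not necessarily the one language-selector or bash startup would use.

```python
# Hypothetical fallback table; only the zh_HK entry comes from this page.
LANGUAGE_FALLBACKS = {
    "zh_HK": ["zh_HK", "zh_TW"],
}

# Stand-in for $CONSOLE_NOT_LOCALIZED=yes, keyed by language (assumption).
CONSOLE_NOT_LOCALIZED = {"zh", "ja", "ko"}


def language_value(locale_name):
    """Build the colon-separated $LANGUAGE value for a locale."""
    base = locale_name.split(".")[0]  # drop the codeset, e.g. ".UTF-8"
    return ":".join(LANGUAGE_FALLBACKS.get(base, [base]))


def effective_language(locale_name, term):
    """Return $LANGUAGE, forced to C on a console that cannot show CJK."""
    language = locale_name.split(".")[0].split("_")[0]
    # Assumption: TERM=linux identifies the text console.
    if term == "linux" and language in CONSOLE_NOT_LOCALIZED:
        return "C"
    return language_value(locale_name)
```

With this sketch, a zh_HK.UTF-8 session under X gets zh_HK:zh_TW, while the same user on the text console gets C; non-CJK locales are unaffected in both cases.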

rhythmbox:

totem-xine:

Outstanding issues

BoF agenda and discussion