Dev Week -- Fixing internationalisation bugs -- kelemengabor -- Thu, 3rd Feb, 2012

   1 [20:30] <kelemengabor> Hi everyone!
   2 [20:31] <kelemengabor> Welcome to this UDW talk about internationalization (i18n) bugs. I'm Gabor Kelemen, long time Hungarian translator and member of the Ubuntu Translation Coordinators team, with the task of managing i18n bugs.
   3 [20:31] <kelemengabor> During this talk, I'll show you the most common reasons why untranslated strings appear in the Ubuntu UI, and how to make them translatable. But first things first, let's start with the basics.
   4 [20:31] <kelemengabor> * i18n is a fairly complicated process, where many things have to be in place for the whole process to work.
   5 [20:31] <kelemengabor> * These things are documented in the gettext manual:
   6 [20:32] <kelemengabor> * Most of the infrastructure documented here works for the software Ubuntu packages, but there are always unpolished edges.
   7 [20:32] <kelemengabor> * Most of the problems I'll talk about are *not* Ubuntu-specific, they affect every user of that software, independently of the distribution.
   8 [20:32] <kelemengabor> In theory, you should not see a single English string while running Ubuntu using your native language.
   9 [20:32] <kelemengabor> However, this is not always the case: even if the translators of your language did their best, you can still run into untranslated text.
  10 [20:32] <kelemengabor> This is what we call an i18n bug. But what can you do about it?
  11 [20:32] <kelemengabor> Let's suppose you run Ubuntu Precise, and you see an English string. First thing to check: is it just (yet) untranslated, or not even translatable?
  12 [20:32] <kelemengabor> To do this, you need to click Help -> Translate this application, or if this does not help, look up the template of the application manually on
  13 [20:32] <kelemengabor> where LL is your language code, like de for German or hu for Hungarian
  14 [20:33] <kelemengabor> Once at the template, search for the given string. If you find it untranslated, then translate it!
  15 [20:33] <kelemengabor> If it was translated recently, like a week ago or so, then maybe it has not been exported into the language pack yet - there is always a delay of a few days.
  16 [20:33] <kelemengabor> If it has been there for longer, or if it is not there at all, then you just found an i18n bug, congratulations :).
  17 [20:33] <kelemengabor> Run ubuntu-bug packagename if you know the name of the application (Recommended!), or go directly to and report it.
  18 [20:33] <kelemengabor> In either case, please include a screenshot!
  19 [20:33] <kelemengabor> So, we have now a bug to solve. Or do we?
  20 [20:34] <kelemengabor> If you don't see anything outstanding, but you would like to help solve problems - great! Go to and pick a bug.
  21 [20:34] <kelemengabor> You can also go to and search for i18n bugs there - keywords like "translat" or the English name of your language give plenty of results.
  22 [20:34] <kelemengabor> like 3-5 times more than we have on the ubuntu-translations project
  23 [20:34] <kelemengabor> In an ideal world, all these should be marked as affecting the ubuntu-translations project, but... you might want to
  24 [20:35] <kelemengabor> mark it as affecting that too, so it can get a little more attention
  25 [20:35] <kelemengabor> Once you picked a bug, you can branch the code of the corresponding package, and start looking for the cause of the problem.
  26 [20:35] <kelemengabor> Let's suppose that you already know how to do the branching :).
  27 [20:35] <kelemengabor> Now that you have the code, what's the first thing to check?
  28 [20:36] <kelemengabor> It is the presence of the string and grep is your friend here. Packages build upon each other, so maybe what you see untranslated comes from another package.
  29 [20:36] <kelemengabor> If you cannot find the offending string, then you should search in the dependencies of the package. apt-cache can help with this.
  30 [20:36] <kelemengabor> I mean in the sources of the dependencies :)
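
In practice the search described here is a one-line recursive grep. The same idea as a small Python sketch, for illustration only: the helper name find_string and the throwaway source tree are made up for this example.

```python
import os
import tempfile


def find_string(root, needle):
    """Return sorted relative paths of files under root that contain needle."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as handle:
                    if needle in handle.read():
                        hits.append(os.path.relpath(path, root))
            except OSError:
                continue  # unreadable file, skip it
    return sorted(hits)


# Tiny throwaway source tree to demonstrate the search.
with tempfile.TemporaryDirectory() as tree:
    os.mkdir(os.path.join(tree, "src"))
    with open(os.path.join(tree, "src", "menu.c"), "w", encoding="utf-8") as f:
        f.write('item = new_item ("Listen to _Similar Artists Radio");\n')
    with open(os.path.join(tree, "README"), "w", encoding="utf-8") as f:
        f.write("Nothing to see here.\n")
    found = find_string(tree, "Similar Artists")

print(found)
```

If the list comes back empty for the package you branched, the string most likely lives in the sources of one of its dependencies.
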
  31 [20:36] <kelemengabor> Okay, so you have confirmed that the string is present in the source. It may or may not be present in the template (.pot file), let's see first what went wrong if it is not present in there.
  32 [20:37] <kelemengabor> Most common problem is that it is simply not marked for translation.
  33 [20:37] <kelemengabor> Example bug:
  34 [20:37] <kelemengabor> and its patch:
  35 [20:37] <kelemengabor> Overview:
  36 [20:37] <kelemengabor> For strings to be extracted into pot files, they need to be marked for translation with the gettext() function, or its shortcut macro, _().
  37 [20:37] <kelemengabor> In the attached patch, we see that this call was forgotten, the solution is pretty simple:
  38 [20:38] <kelemengabor> -                similar_artists_item = gtk_menu_item_new_with_mnemonic (("Listen to _Similar Artists Radio"));
  39 [20:38] <kelemengabor> +                similar_artists_item = gtk_menu_item_new_with_mnemonic (_("Listen to _Similar Artists Radio"));
  40 [20:38] <kelemengabor> This applies to C, C++, Vala, and Python sources; other languages / source file types use other calls or methods to mark strings for translation.
  41 [20:38] <kelemengabor> (This patch also contains a solution for another type of problem, so don't close it yet.)
  42 [20:39] <kelemengabor> Another common source of untranslated strings is the po/POTFILES.in file.
  43 [20:39] <kelemengabor> Example bug:
  44 [20:39] <kelemengabor>
  45 [20:39] <kelemengabor> Overview:
  46 [20:39] <kelemengabor> This file lists the source files that contain strings marked for translation. The list is maintained manually by the maintainers, who often forget to update it when they add new source files.
  47 [20:39] <kelemengabor> Luckily, we have a way to detect such files, and this is the intltool-update -m command.
  48 [20:39] <kelemengabor> This generates the list of missing files, which you most probably want to include in POTFILES.in.
  49 [20:39] <kelemengabor> Sometimes, there are files which really should not be exposed to translators, like sources of automated tests, or .c files generated from .vala sources.
  50 [20:39] <kelemengabor> Such files should go to the POTFILES.skip file. The attached branch illustrates this too.
  51 [20:40] <kelemengabor> intltool-update -m has its limitations too - for example, it currently cannot detect translatable strings in .vala files, so you are on your own with those.
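
To make the idea behind the missing-files check concrete, here is a rough Python approximation. It is not how intltool-update is implemented: the regular expression, the helper names and the sample tree are all assumptions made for this sketch, and it only looks at .c and .py files.

```python
import os
import re
import tempfile

MARKER = re.compile(r'\bN?_\s*\(\s*"')  # crude: a _(" or N_(" call


def listed(path):
    """Read a POTFILES-style list, ignoring comments and [type: ...] prefixes."""
    entries = set()
    if not os.path.exists(path):
        return entries
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            line = re.sub(r"^\[[^\]]*\]\s*", "", line)
            entries.add(os.path.normpath(line))
    return entries


def missing_files(root, extensions=(".c", ".py")):
    """Files with marked strings that neither POTFILES.in nor POTFILES.skip lists."""
    known = listed(os.path.join(root, "po", "POTFILES.in"))
    known |= listed(os.path.join(root, "po", "POTFILES.skip"))
    missing = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            rel = os.path.normpath(os.path.relpath(path, root))
            with open(path, encoding="utf-8", errors="replace") as handle:
                if MARKER.search(handle.read()) and rel not in known:
                    missing.append(rel)
    return sorted(missing)


def write(root, rel, text):
    path = os.path.join(root, rel)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)


with tempfile.TemporaryDirectory() as tree:
    write(tree, "po/POTFILES.in", "src/main.c\n")
    write(tree, "po/POTFILES.skip", "tests/check.c\n")
    write(tree, "src/main.c", 'puts (_("Hello"));\n')
    write(tree, "src/new.c", 'puts (_("Forgotten"));\n')
    write(tree, "tests/check.c", 'puts (_("Test only"));\n')
    missing = missing_files(tree)

print(missing)
```

Only the forgotten source file is reported; the test file is silenced by POTFILES.skip, which mirrors the split between the two lists described above.
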
  52 [20:40] <kelemengabor> While we are at POTFILES.in and intltool-update, I'd like to point out another limitation of the latter. This is file type detection, a prominent source of errors with Glade UI files.
  53 [20:40] <kelemengabor> But we need to take a step back to understand this.
  54 [20:40] <kelemengabor> Example bug:
  55 [20:40] <kelemengabor>
  56 [20:40] <kelemengabor> Overview:
  57 [20:40] <kelemengabor> If you have read the gettext manual (okay-okay... you are here because no one does that, including me :))
  58 [20:40] <kelemengabor> that I linked at the beginning, you might have noticed that it speaks about using xgettext
  59 [20:41] <kelemengabor> for extracting the translatable strings from source code into the .pot file.
  60 [20:41] <kelemengabor> This happens in Ubuntu too, so what is intltool anyway?
  61 [20:41] <kelemengabor> intltool is a set of scripts, written to make the localization of formats not supported by xgettext possible.
  62 [20:41] <kelemengabor> Such are .desktop files, .xml, Glade UI files, and GConf schemas, among others.
  63 [20:41] <kelemengabor> intltool can detect such files based on their extension, but sometimes files have extensions different from the default.
  64 [20:41] <kelemengabor> Glade files used to have the .glade extension, but since the latest format change they have .ui (sometimes .xml) extensions.
  65 [20:41] <kelemengabor> So we need to explicitly tell intltool the type of such files. Maintainers forget/don't know this frequently:
  66 [20:42] <kelemengabor> -./data/ui/oneconfinventorydialog.ui
  67 [20:42] <kelemengabor> +[type: gettext/glade]./data/ui/oneconfinventorydialog.ui
  68 [20:42] <kelemengabor> Simple enough, huh?
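
A small Python sketch of how one might scan a file list for this mistake. The helper names are hypothetical, and it only catches the .ui extension; Glade files that use .xml would need a closer look by hand.

```python
def untyped_ui_entries(potfiles_lines):
    """Return .ui entries that have no explicit [type: ...] prefix."""
    problems = []
    for line in potfiles_lines:
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        if entry.endswith(".ui") and not entry.startswith("["):
            problems.append(entry)
    return problems


def add_glade_type(entry):
    """Prefix an entry the same way the fix quoted above does."""
    return "[type: gettext/glade]" + entry


before = ["src/main.c", "./data/ui/oneconfinventorydialog.ui"]
flagged = untyped_ui_entries(before)
fixed = [add_glade_type(e) if e in flagged else e for e in before]
print(fixed)
```
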
  69 [20:42] <kelemengabor> Let's dig deeper into the gettext system then.
  70 [20:42] <kelemengabor> You might remember that I said earlier:
  71 [20:42] <kelemengabor> For strings to be extracted into pot files, they need to be marked for translation with the gettext() function, or its shortcut macro, _().
  72 [20:42] <kelemengabor> The world is not this simple, unfortunately.
  73 [20:42] <kelemengabor> There are situations in C/Python/others where you cannot call a function, and gettext() is a function.
  74 [20:42] <kelemengabor> Constant arrays are one such case, and their strings should be marked for translation with the N_() macro.
  75 [20:42] <kelemengabor> This is really a no-op, it serves only xgettext, so that it can extract the string into the .pot file.
  76 [20:43] <kelemengabor> But for the program to show the actual translation, you need to call the gettext() function with the array items.
  77 [20:43] <kelemengabor> This is what maintainers often forget and this can be seen in the last part of
  78 [20:43] <kelemengabor> All in all, the _() macro marks the string for translation and does the translation at runtime, while the N_() macro does only the marking.
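
The two-step pattern, sketched in Python. N_ is defined by hand here as a no-op, which is the usual idiom for deferred translation; the weekday list is just an illustration, and no catalog is assumed to be installed.

```python
import gettext

_ = gettext.gettext


def N_(message):
    """No-op marker: lets the extractor see the string, translates nothing."""
    return message


# Constants like this may be evaluated before the program has set up
# its translation domain, so the strings are only marked here.
WEEKDAYS = [N_("Monday"), N_("Tuesday"), N_("Wednesday")]


def weekday_label(index):
    # The actual lookup happens at display time, on the array item.
    return _(WEEKDAYS[index])


print(weekday_label(0))
```

Forgetting the _() call in weekday_label is exactly the mistake described above: the strings reach the template and get translated, but the program keeps showing the English originals.
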
  79 [20:43] <kelemengabor> There are other gettext functions and macros, but there is no time to cover those
  80 [20:44] <kelemengabor> If you made it to this point, you can be fairly sure that the string will make it into the pot file: check it by running intltool-update -p
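
Once the template has been regenerated, checking it for the string can be as simple as a text search. A hedged Python sketch: the sample template content is invented, and the parser only understands msgids that fit on one line.

```python
import re


def pot_msgids(pot_text):
    """Collect single-line msgids from .pot content (multi-line ones are skipped)."""
    return set(re.findall(r'^msgid "(.+)"$', pot_text, flags=re.MULTILINE))


# Made-up template fragment, shaped like the output of a .pot generator.
SAMPLE_POT = '''\
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\\n"

#: ../src/menu.c:42
msgid "Listen to _Similar Artists Radio"
msgstr ""
'''

msgids = pot_msgids(SAMPLE_POT)
print("Listen to _Similar Artists Radio" in msgids)
```
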
  81 [20:44] <kelemengabor> But this does not mean that the string will show up translated on the UI. We are just in the middle of the class :).
  82 [20:44] <kelemengabor> When you grepped the source for the untranslated string, you might have found it in all the po files, translated into 20 languages, yet not showing up in any of those languages.
  83 [20:44] <kelemengabor> What can be wrong at this point?
  84 [20:44] <kelemengabor> Example bug:
  85 [20:45] <kelemengabor>
  86 [20:45] <kelemengabor> Overview
  87 [20:45] <kelemengabor> Glade files need a little special attention to set up their i18n in the source code.
  88 [20:45] <kelemengabor> Usually, people do something like this - this applies not only to C, but to other programming languages too:
  89 [20:45] <kelemengabor> GtkBuilder * builder = gtk_builder_new ();
  90 [20:45] <kelemengabor> gtk_builder_add_from_file (builder, "something.ui", &error);
  91 [20:45] <kelemengabor> This is not enough, if you want to show your items localized.
  92 [20:45] <kelemengabor> As you can see in the branch attached to the bug, a gtk_builder_set_translation_domain() call is necessary *after* you create the GtkBuilder object, and *before* you add the items of the .ui file.
  93 [20:46] <kelemengabor> See also:
  94 [20:46] <kelemengabor> Side note: documents a similar need for GtkActionGroups.
  95 [20:46] <kelemengabor> Maintainers sometimes forget to do this. No problem, we are here to correct such mistakes :).
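
The required call order, sketched in Python. This assumes PyGObject with GTK 3 is installed; the function name and its arguments are hypothetical, and the import is done lazily so the sketch can at least be loaded on a machine without GTK.

```python
def load_translated_ui(ui_path, domain):
    """Load a .ui file so that its translatable strings are looked up in `domain`."""
    import gi  # imported lazily: PyGObject is an assumption of this sketch
    gi.require_version("Gtk", "3.0")
    from gi.repository import Gtk

    builder = Gtk.Builder()
    # Must come after creating the builder and before adding the file.
    builder.set_translation_domain(domain)
    builder.add_from_file(ui_path)
    return builder
```

A caller would use it as load_translated_ui("something.ui", "example-app"), where "example-app" stands in for the application's real gettext domain.
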
  96 [20:46] <kelemengabor> Other sources of errors are libraries.
  97 [20:46] <kelemengabor> Example bug:
  98 [20:46] <kelemengabor>
  99 [20:46] <kelemengabor> Overview:
 100 [20:46] <kelemengabor> Libraries can have translatable strings, and the translation of these should be looked up from the translation file of the library.
 101 [20:46] <kelemengabor> Pretty straightforward, isn't it?
 102 [20:47] <kelemengabor> When i18n support is initialized in the software, the translation file (also called "domain") to look up strings from is defined.
 103 [20:47] <kelemengabor> But this is never the same as the library's domain!
 104 [20:47] <kelemengabor> So how can we still see strings from both the program and the library?
 105 [20:47] <kelemengabor> Libraries (should) use dgettext(), which explicitly specifies the translation domain, instead of gettext(), which just uses the default.
 106 [20:47] <kelemengabor> glib, on which most Ubuntu GUI software builds, has two convenience headers, which define the _() and some other macros not mentioned here.
 107 [20:48] <kelemengabor> One is gi18n.h, which defines _() as gettext(), and the other is gi18n-lib.h, which defines _() as dgettext()
 108 [20:48] <kelemengabor> Sometimes, the authors of libraries confuse these two, as you can see in the example bug.
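
The difference between the two lookups, sketched with Python's gettext module. The domain names example-lib and example-app are hypothetical, and no catalogs are assumed to exist, so both lookups fall back to the English text.

```python
import gettext

LIBRARY_DOMAIN = "example-lib"  # hypothetical domain of the library


def _(message):
    """Library-side _(): always look the string up in the library's own domain."""
    return gettext.dgettext(LIBRARY_DOMAIN, message)


def describe_error():
    return _("Could not open file")


# The application selects its own default domain...
gettext.textdomain("example-app")
# ...and the library string is still looked up in "example-lib".
print(describe_error())
```

Had the library used plain gettext() instead, its strings would be searched for in the application's catalog, where they do not exist.
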
 109 [20:48] <kelemengabor> Another problem might be that initialization of the i18n support is sometimes incomplete.
 110 [20:48] <kelemengabor> Example bugs:
 111 [20:48] <kelemengabor> for C:
 112 [20:49] <kelemengabor> for Python:
 113 [20:49] <kelemengabor> for Vala:
 114 [20:49] <kelemengabor> Overview:
 115 [20:49] <kelemengabor> In the main source file of each standalone executable, you need a few lines of code to make the i18n work.
 116 [20:49] <kelemengabor> These are documented here:
 117 [20:49] <kelemengabor> If the maintainer forgets some of these, xgettext will still extract the translatable strings - yet the gettext() calls won't know where to look for translations,
 118 [20:49] <kelemengabor> or which language they should show the strings in.
 119 [20:49] <kelemengabor> When this happens, usually whole windows and command line outputs show up untranslated, so this kind of problem is easy to spot.
 120 [20:50] <kelemengabor> GTK+ only adds to the confusion, because it always calls setlocale() in the gtk_init*() function, so you can get used to not calling it, even if you write a program that does not use GTK+
 121 [20:50] <kelemengabor> This is what happened in the first bug.
 122 [20:50] <ClassBot> There are 10 minutes remaining in the current session.
 123 [20:50] <kelemengabor> The other two bugs are not complicated, they just show the complete lack of the crucial few lines in Python and Vala - I leave them here for future reference.
 124 [20:50] <kelemengabor> Okay, I think it is enough of possible upstream bugs for today. These were the common ones, but there are many others not mentioned.
 125 [20:50] <kelemengabor> These are upstream ones, because they happen in the code anyone can package for her favourite distribution.
 126 [20:50] <kelemengabor> Now I'd like to talk a little about Ubuntu-specific problems that can happen during the packaging process.
 127 [20:51] <kelemengabor> Fortunately, there are not many of these.
 128 [20:51] <kelemengabor> Example bug:
 129 [20:51] <kelemengabor>
 130 [20:51] <kelemengabor>
 131 [20:51] <kelemengabor> Overview:
 132 [20:51] <kelemengabor> If you want a translation template to appear in LP Translation, you need to generate it during the build process.
 133 [20:51] <kelemengabor> For this, usually dh_translations is used, which is a little helper script on top of intltool, to generate the translation template and prepare the package for use with language packs.
 134 [20:51] <kelemengabor> However, sometimes it is not in use, which is a bug.
 135 [20:51] <kelemengabor> So you need to make sure it is called, either as an argument of dh, like in the first bug
 136 [20:51] <kelemengabor> or as a standalone call in the rules file at the end of the build, like in the second.
 137 [20:51] <kelemengabor> Including the cdbs rule is also okay, because that runs dh_translations too
 138 [20:52] <kelemengabor> This part is no rocket science :)
 139 [20:52] <kelemengabor> Another possible Ubuntu-specific problem can be untranslated strings in patches.
 140 [20:52] <kelemengabor> Example bug:
 141 [20:52] <kelemengabor>
 142 [20:52] <kelemengabor> Overview:
 143 [20:52] <kelemengabor> Some Ubuntu patches add new strings, but the authors sometimes make the same mistakes as upstream authors.
 144 [20:52] <kelemengabor> Getting those bugs fixed is a little different, because you need to patch the patch.
 145 [20:53] <kelemengabor> But that's all, the possible mistakes are the same as above.
 146 [20:53] <kelemengabor> Recommended reading for this is
 147 [20:53] <kelemengabor> Most of the time, you need only to use edit-patch, but this is covered better in the guide, so now I just recommend reading it.
 148 [20:53] <kelemengabor> My experience is that it may sound scary at first, but it isn't really!
 149 [20:54] <kelemengabor> Okay, we are almost there!
 150 [20:54] <kelemengabor> Last thing to talk about is submitting the patch. Where should you go with it?
 151 [20:54] <kelemengabor> Now, you have a branch of the Ubuntu package tree, with a fixed bug.
 152 [20:54] <kelemengabor> I recommended branching the Ubuntu tree because you can instantly rebuild your patched package, and test it, which is a Good Thing.
 153 [20:54] <kelemengabor> Unless it is an Ubuntu-specific bug, where you can go straight ahead with the "Commit - bzr push - Link a related branch - Propose for merging" dance –
 154 [20:54] <kelemengabor> which was hopefully covered in other classes this week :) – you also need to do the following:
 155 [20:55] <kelemengabor> - Get the upstream source, whether it comes from Gnome git, an LP project or anywhere else.
 156 [20:55] <kelemengabor> - Create a patch against your Ubuntu-tree
 157 [20:55] <ClassBot> There are 5 minutes remaining in the current session.
 158 [20:55] <kelemengabor> - Apply it on the upstream tree (let's assume it applies cleanly :))
 159 [20:55] <kelemengabor> - Submit it into the respective bug tracker of the project.
 160 [20:55] <kelemengabor> If you are lucky, and you chose a project whose upstream is on LP, you can just mark that as also affected, link the branch, submit a merge proposal, and you are done!
 161 [20:55] <kelemengabor> Thanks for the attention, if there are further questions or you would like to help, you can find me on #ubuntu-translators !
 162 [20:55] <kelemengabor> unbelievable, I did it :)
 163 [20:55] <kelemengabor> Questions?

MeetingLogs/devweek1201/FixingI18NBugs (last edited 2012-02-03 09:53:30 by dholbach)