Rosetta

Ubuntu Open Week - Translations with Rosetta - Mon, Nov 27, 2006

see also Wednesday Session

08:04   jordi   Ok, so for those who don't know me, I'm Jordi Mallach, and I've been involved with the Rosetta team, trying to be the link between the development team and the Ubuntu translators and Rosetta users.
08:06   jordi   let's get moving
08:06   jordi   The Rosetta Translation Portal
08:06   jordi   Rosetta is one of the components that make up Launchpad,
08:06   jordi   Canonical's service platform.
08:06   jordi   Launchpad is made up of five major components: a bug tracker, a
08:06   jordi   request tracker, a specification tracker, a "source code"
08:06   jordi   supermirror and Rosetta, a web-based translation portal.
08:06   jordi   Christian Reis will talk tomorrow about Launchpad in general, so
08:06   jordi   let's focus on Rosetta.
08:06   jordi   Rosetta's aim is to make translation of Free Software as easy and
08:06   jordi   non-technical as it can get. The Rosetta team has been working on
08:06   jordi   creating an interface which hides the specifics of the Gettext PO
08:06   jordi   file format, which is the standard for translating Free Software,
08:06   jordi   thus lowering the barrier so anyone with a reasonable knowledge
08:06   jordi   of English can help out with the translations of their favourite
08:06   jordi   project into their mother tongue.
08:07   jordi   (please say if I'm too fast, I'm worried about lack of time)
08:07   jordi   Rosetta is the main translation system of Ubuntu Linux, and is
08:07   jordi   the source of all translations which appear in the Ubuntu
08:07   jordi   releases, and in the frequently updated langpacks. Rosetta is
08:07   jordi   also designed to help program authors get their applications
08:07   jordi   translated.
08:07   jordi   A close look at the Gettext PO file format
08:07   jordi   ==========================================
08:07   jordi   Most of the software on your desktops uses a standard translation
08:07   jordi   interface called GNU gettext, which is in charge of showing the
08:07   jordi   applications in the language the user has chosen. Application
08:07   jordi   programmers need to take care of marking all the user-visible
08:07   jordi   messages (or strings, as the initiated tend to call them) with a
08:07   jordi   special marker which can be extracted to plain text ".po" files.
08:07   jordi   We translators use these files to translate the applications.
08:08   jordi   Let's see what a PO file looks like. I've put some examples at
08:08   jordi   http://pusa.informat.uv.es/~jordi/ubuntu-school/
08:08   jordi   Have a look at the ubuntu-school.pot file. A POT file is a "PO
08:08   jordi   Template", that is, an empty PO file ready to be translated.
08:08   jordi   Looking at the contents of the file, you can see the format is
08:08   jordi   pretty straightforward: each original string in English (a
08:08   jordi   msgid) has its corresponding translation (msgstr). While simple,
08:08   jordi   the PO format is quite fragile. One missing quote, and your
08:08   jordi   entire application build will fail with a syntax error. There are
08:08   jordi   several very popular PO file editors which help the editing
08:08   jordi   process: KBabel, PoEdit, GTranslator, Emacs PO-mode...
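The msgid/msgstr pairing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the PO fragment is made up (it is not the contents of ubuntu-school.pot), the helper names parse_simple_po and untranslated are hypothetical, and only single-line entries are handled. Real PO files also contain headers, comments, plural forms and multi-line strings, which the PO editors listed above handle properly.

```python
import re

# A tiny, made-up PO fragment used purely for illustration.
SAMPLE_PO = '''\
msgid "Open file"
msgstr "Ouvrir le fichier"

msgid "Quit"
msgstr ""
'''

# Matches single-line entries such as: msgid "some text"
_ENTRY = re.compile(r'^(msgid|msgstr)\s+"(.*)"\s*$')


def parse_simple_po(text):
    """Return (msgid, msgstr) pairs from single-line entries only."""
    pairs = []
    current_id = None
    for line in text.splitlines():
        match = _ENTRY.match(line)
        if not match:
            continue  # blank lines, comments and unsupported syntax are skipped
        keyword, value = match.groups()
        if keyword == "msgid":
            current_id = value
        elif current_id is not None:
            pairs.append((current_id, value))
            current_id = None
    return pairs


def untranslated(pairs):
    """Return the original strings whose translation is still empty."""
    return [msgid for msgid, msgstr in pairs if msgid and not msgstr]


if __name__ == "__main__":
    for original in untranslated(parse_simple_po(SAMPLE_PO)):
        print("needs translation:", original)
```

An empty msgstr is what a freshly generated POT template contains for every entry, which is why the sketch treats it as "untranslated".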
08:09   jordi   Rosetta goes one step further in easing the translation of these
08:09   jordi   PO files, using a clean, web-based interface which hides the
08:09   jordi   format, presenting only sets of string/translation pairs that you
08:09   jordi   can fill up. Once the work is done, it's stored in its database
08:09   jordi   where the information can be exported or shared among other
08:09   jordi   projects.
08:09   jordi   Using Rosetta's Web Interface
08:10   jordi   Rosetta is, as hinted before, divided into two main branches: one
08:10   jordi   serves to translate the applications of the people who request
08:10   jordi   it. For example, the Gobby collaborative editor is being
08:10   jordi   translated by Rosetta contributors, after its authors requested
08:10   jordi   us to set it up for them in Rosetta. On the other hand, Rosetta
08:10   jordi   is the platform from where Ubuntu gets all its translations.
08:10   jordi   We'll focus on Ubuntu a bit more now.
08:10   jordi   Ubuntu translations revolve around the Ubuntu translation teams,
08:10   jordi   which coordinate and produce the translations which get shipped
08:10   jordi   with every new version.
08:10   jordi      https://launchpad.net/rosetta/groups/ubuntu-translators
08:11   jordi   Here you'll see a list of teams which belong to the Ubuntu
08:11   jordi   translation teams. While Rosetta is open enough to let everyone
08:11   jordi   with a Launchpad account contribute, there is need for some
08:11   jordi   access control, to protect quality, avoid vandalism, etc. Being
08:11   jordi   part of one of the translation teams grants you "write" access to
08:11   jordi   every translation for that language in Ubuntu. Still, if you're
08:11   jordi   not a member of your language's team, you can still go ahead and
08:11   jordi   translate. Your contributions will be also stored in Rosetta's
08:11   jordi   database as "suggestions", but won't appear in Ubuntu's language
08:11   jordi   packs until a member of the team reviews and validates them.
08:11   jordi   Rosetta offers a long list of applications that can be
08:11   jordi   translated. Taking the French team as an example,
08:11   jordi   
08:11   jordi      https://launchpad.net/distros/ubuntu/edgy/+lang/fr
08:12   jordi   we can have a look at how their translation status is for the
08:12   jordi   Ubuntu Edgy release. I like showing the French team because they
08:12   jordi   are really an amazing example of completeness.
08:14   jordi   Rosetta presents us a list of applications which are ready to be
08:14   jordi   translated to French, and their current translation status. As
08:14   jordi   you see, the French have done their homework and there are barely
08:14   jordi   any red bars, meaning "untranslated". See the bottom of the
08:14   jordi   page for the meaning of the bar colours.
08:15   jordi   The list is ordered from most important to least important.
08:15   jordi   Let's see how we'd translate an application. Close to the top of
08:16   jordi   the list is "launchpad-integration". We'll pick this one as it's
08:16   jordi   easy and short.
08:16   jordi      https://translations.launchpad.net/distros/ubuntu/edgy/+source/launchpad-integration/+pots/launchpad-integration/fr/+translate
08:17   jordi   If instead of French you want to have a look at your own
08:17   jordi   language's translation, simply replace "/fr" in the URL with the
08:17   jordi   corresponding ISO 639 code. You can find the code for your
08:17   jordi   language here:
08:17   jordi   http://www.loc.gov/standards/iso639-2/php/code_list.php
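The "/fr" substitution described above amounts to filling one slot in the URL. A minimal sketch, assuming the URL pattern from the launchpad-integration link in this session; the function name translate_url and the restriction to two- or three-letter codes are assumptions made for the example, not something Rosetta itself documents here.

```python
# URL pattern copied from the launchpad-integration link in this session,
# with the language code replaced by a placeholder.
TRANSLATE_URL = (
    "https://translations.launchpad.net/distros/ubuntu/edgy/"
    "+source/launchpad-integration/+pots/launchpad-integration/"
    "{lang}/+translate"
)


def translate_url(lang_code):
    """Build the translation page URL for a two- or three-letter language code."""
    code = lang_code.strip().lower()
    # Assumption for this sketch: plain ISO 639 codes only, e.g. "fr" or "oji".
    if not code.isalpha() or len(code) not in (2, 3):
        raise ValueError("expected a 2- or 3-letter language code: %r" % lang_code)
    return TRANSLATE_URL.format(lang=code)


if __name__ == "__main__":
    print(translate_url("ca"))
```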
08:18   jordi   In our case, the first string is "The Launchpad helper
08:18   jordi   application failed", which is already translated to French as
08:18   jordi   "L'assistant Launchpad a échoué". Below the accepted translation
08:18   jordi   there is a list of alternative translations suggested by other
08:18   jordi   people. You can quickly navigate through the translation fields
08:18   jordi   using the tab key. Once you have completed all the strings in a
08:18   jordi   page, you want to save your work: hit "Save & Continue" at the
08:18   jordi   bottom, and if there are more strings to do, Rosetta will then
08:18   jordi   show them to you.
08:19   jordi   There are other bits that can help the translators while they
08:19   jordi   work on a translation: you might want to see what the translators
08:19   jordi   into a language similar to yours used for a string that is hard to
08:19   jordi   translate, for inspiration. You can get such information using
08:19   jordi   the "Make suggestions from" widget at the top of the string list.
08:19   jordi   Also, you'll be more interested in seeing the strings that need
08:19   jordi   work instead of those which are translated already. You can
08:19   jordi   filter the kind of messages you want to see using the "Show"
08:19   jordi   widget, where you can select from "all", "untranslated",
08:19   jordi   "translated" and "needs review".

< neophile> Is it possible to search for a string in the translation? That's a major drawback when trying to correct translations. Is a search feature planned for Rosetta in the near future?

  • It's currently not possible to search for a string easily. It is one of the most requested (and undoubtedly most useful) features, and we do plan to add it. The implementation isn't trivial though, as the database is huge and there are some performance issues to solve. But yes, the team will focus on providing it as soon as possible.

08:22   jordi   Using Rosetta's import/export interface
08:22   jordi   While the web interface has allowed many Ubuntu users to help out
08:22   jordi   with the translations to their language, there are certainly
08:22   jordi   die-hard, old-time translators who will prefer using their own
08:22   jordi   tools (obscure emacs modules and weird command line tools!) to
08:22   jordi   work on their translations. Or there might be people who cannot
08:22   jordi   afford to be online during the whole translating session.
08:23   jordi   To help them, Rosetta has an import/export mechanism, which
08:23   jordi   allows you to easily upload translations you have worked on
08:23   jordi   offline, using your own tools, but still want to see
08:23   jordi   integrated in Rosetta, and download your finalised files so you
08:23   jordi   can do whatever you want with them: back them up, send them to
08:23   jordi   your team's mailing list, send them to the upstream author so
08:23   jordi   they get included in the next release...
08:24   jordi   Importing and exporting is easy: to download or upload your work, use the "Download" and "Upload file" links in the boxes on the left side.
08:24   jordi   When requesting a download, Launchpad will prepare the file for you and will email you the location of the desired export.
08:25   jordi   Importing is similar. Just fill in the field with the path to your file, and Rosetta will integrate it into the database.
08:26   jordi   I translated all night long. What now?
08:26   jordi   ======================================
08:26   jordi   Okay, so you've worked on the files you were interested in, and
08:26   jordi   Rosetta now has all the info. What happens now?
08:26   jordi   Ubuntu will, on a monthly basis, extract all the translations
08:26   jordi   from the database and put them in the "language packs" for each
08:26   jordi   supported language in the distro, which will automatically hit
08:26   jordi   your Ubuntu mirror the 1st Monday of the month. This way, Rosetta
08:26   jordi   allows people to keep improving the support for their language
08:26   jordi   even after an Ubuntu release has shipped. For example, more than 6
08:26   jordi   months after the release of Ubuntu 6.06 LTS, there's a group
08:26   jordi   working on adding Dzongkha support to Ubuntu, when there was
08:26   jordi   close to nothing included in Dapper initially.
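The "1st Monday of the month" rule for language packs is easy to compute. A minimal sketch using only the Python standard library; the function name first_monday is made up for this example, and the sketch says nothing about how Ubuntu actually schedules the builds.

```python
import datetime


def first_monday(year, month):
    """Return the date of the first Monday in the given month."""
    first = datetime.date(year, month, 1)
    # date.weekday() returns 0 for Monday, so this is the number of days
    # to add to the 1st of the month to reach the next Monday (0 if it is one).
    offset = (0 - first.weekday()) % 7
    return first + datetime.timedelta(days=offset)


if __name__ == "__main__":
    # The month following this session (held on Mon, Nov 27, 2006).
    print(first_monday(2006, 12))
```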
08:27   jordi   I see there are many interesting questions going on, so I propose we go on with Q+A, trying to focus on the classics: Rosetta and upstream relationship, etc.

< bugman> In https://translations.launchpad.net/people/bugman/+translations for example, is it possible to implement a view of all the strings translated for a package, and not only the last one?

  • it's possible, yes. These are wishlist features, though, and will get a lower priority than, say, "search a string", but the info is in the database, so it's perfectly possible to show it. It's important that bugs are filed against Rosetta requesting these things.

< dneary> What's the recommended workflow for updating .po files outside Launchpad for the moment?

  • can you be more specific? there are several scenarios: gnome-panel or launchpad-integration, for example

    [ivoks] he asks if he could see all his translations in one package, just his own, or someone else's.

    [dneary] It's related to bug #68014. It seems like uploading .po files doesn't automatically update translations at the moment (if I understand the problem).

    right, the import mechanism is restricted right now, while a nasty bug involving reverted translations is tracked down

    [dneary] Some translations were lost, so on the 1st of November, the upload form was disabled.

    for the time being, mailing rosetta@launchpad.net with import requests is the recommended way. although I hope we'll go back to normal operation rsn -- aiui the bug fixing is making progress

< dand> any packages in Rosetta that sync automatically with upstream? if yes, are they marked somehow in Launchpad and how often do they sync?

  • good one: this doesn't happen right now, but it is a desired feature. It would help minimise the "conflicts" between Rosetta and upstream projects' (KDE, GNOME...) translations, so if some translator has rights to translate both in Ubuntu and in GNOME CVS, a translation inserted in Rosetta could be exported to GNOME. We want this, but it won't be here for some time.

< Gwaihir> is it possible to have a --use-fuzzy implementation for exporting mo files?

  • as far as I know, we don't do this now because Rosetta generates fuzzies using its own knowledge. I guess it'd be easy to add an option to generate files with fuzzies, yes. Is there a bug filed?

< Gwaihir> is there any difference between "fuzzy string" in po and "Need review" in Rosetta?

  • I glossed over this as I wanted to keep the text dump simple, but yeah, "needs review" can map to fuzzy. Some teams do use it in the strict "needs review" sense though, ie, they translate a string, but if they are not sure, they use the mark so others can easily find the unsure strings. [danilos] another thing to note: currently, both fuzzy strings and strings needing review are implemented using the same mechanism in Rosetta; there is, however, a plan to separate these out and give them their right meanings (i.e. fuzzy == machine-selected "similar" string; needs review == human-selected "unsure" string)

< somerville32> How does a member of a translation team approve suggested translations?

  • The current implementation is built on very complex copy and paste technology :) We do have plans to have the checkbox :) I'm under a bit of time pressure I guess, otherwise I'd dig up the relevant spec URLs

< somerville32> If I suggest a translation and then join the translation team later, are all my suggested translations automatically approved?

  • no, as far as I know. That could be dangerous in some cases, actually, but it might be a good idea to do it if the string was untranslated.

    [somerville32] So I have to go and redo all my work? haha

    I can't stress enough that having bug reports for all the requests is really helpful

< dand> any plans so far on opening Rosetta for contributions?

  • "opening", I assume you mean opening the source code. There's sabdfl's statement that Launchpad will be freed when the project is ready to do so. We can't give dates or estimations on when that might happen. There are people helping out with Launchpad on an NDA basis, though.

[dneary] Does Rosetta keep a history of translation updates and who made them?

  • yes, Rosetta keeps a history of all translation contributions. [danilos] dneary: but no, we don't have the level of detail you're asking about (when a string has moved from suggestion to approved, etc.)

< bugman> Is it possible to have a wiki page (or something similar) to see the LangPack schedule?

  • the general rule is "1st Monday of the month". We chose this because it's easy to remember. I'm sure this is written up somewhere though; I can't find a pointer right now. There's a plan to write docs pointing these things out clearly.

< dneary> Does Rosetta keep a history of translation updates and who made them?

  • a feature showing how a string has changed over time has just been rolled onto production, and Rosetta is now gathering this info. This will be good to help team leaders track bad translators

< neophile> Where should someone translate a package, in the HEAD branch or in the edgy branch?

  • This very much depends on the specific package. We have a feature in our queue which will allow someone translating gnome-panel in edgy to "push" that same translation to dapper, feisty, or GNOME CVS HEAD if applicable, so you only have to translate once. Of course, if you translate the Panel in GNOME CVS, that translation will automagically percolate to the next Ubuntu release. This gets us to the "working with upstream" chapter: while Rosetta is a great tool to get Ubuntu translated very easily, it's vital that the translation teams cooperate with their upstream teams. ie, the French Ubuntu team should be in contact with the French GNOME team, so they use the same guidelines and don't duplicate efforts. We've had problems with some teams redoing all the work in Rosetta, which resulted in a completely different set of translations in Ubuntu and other distros. The "translation override" feature in Rosetta is powerful and useful if used wisely. If it goes out of control, it can cause friction between teams. We need to work on that; I believe it's a procedural problem which can be mostly solved by educating new translators who join the Ubuntu teams. There's been quite some talk on this on the Rosetta list, and teams like the Italian or Brazilian ones have already implemented some measures to work nicely with their upstreams.

< aleka> If I am eager to help the Ubuntu community, and the only way I can right now is through translation, what can I do when the admin of a team does not respond to my requests or approve my membership of the team?

  • we're seeing this every now and then. The best and quickest way is to mail us at rosetta at launchpad.net and we'll mediate, ie, we'll try to contact the current leader. If he doesn't respond, we'll transfer leadership to whoever we think deserves it (ie, whoever wants to do the work ;)

    [pepsiman] lots of people apply to teams without telling the team leader their email address

<dneary> We have had some issues with translations where people outside the project translated things badly, and we couldn't easily revert to the correct translation - how can I configure our translation to make sure only approved translation team members can do the translation?

  • [danilos] at the moment, members of the translation team should only be those you trust; in other words, anyone who is a member of your translation team can approve/modify translations. right, we ACK that there are oversized teams right now. [danilos] dneary: yes; anyone else can post suggestions -- we'll work on improving team layout (and docs), so this is clear to everybody

    [dneary] And only translation team members can do translations?

    • if the given project is set up like that, yes

< tonyyarusso> Does Rosetta have some method for handling dialects? I'm just getting the ball rolling for a language (oji) that varies widely (esp. for technical terms)

  • not yet, but both danilos and I are involved with teams which would benefit from dialect support. We really want to see this happen, but it's not high priority right now.

< somerville32> How do you know if a translation is suggested or approved?

  • an approved translation appears in the translation field. Suggestions appear as suggestions beneath it. [danilos] any actual translation is the approved one; suggestions only appear as "suggestions"

09:01 tonyyarusso jordi: What is the best way to approach things for the time being then until it is implemented?

< dneary> Can I export translations in other formats than .po?

  • no, Rosetta currently only groks PO files, although it will soon be able to export Firefox and OpenOffice data via langpacks, but not via the standard export mechanism as far as I know; danilos can correct me. btw, we're working on adding native support for other formats right now; first to come should be Firefox XPIs, OpenOffice GSIs and KDE PO files, but we also have plans for XLIFF and others

< dneary> What happens in Rosetta when an overridden string gets changed in a later revision of the upstream translation?

  • Rosetta has a "tracker" which decides whether a string should follow what comes from upstream or not. In short, if you change a string, forking it to something different, it'll stay forked unless you put the upstream string back again.

[dneary] How do you get to decide?

  • We want to implement a filter so it's easy to find these strings and do a "merge with upstream" for them, etc.

    [dneary] Is there a fuzziness thing that shows you changed over-ridden strings?

    • no, there are proposals to mark them as such
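The fork behaviour described in this exchange can be modelled in a few lines. This is a hypothetical sketch only: Rosetta's source is not public, so the class TrackedString, its fields and its methods are invented for illustration and simply encode the rule stated above (a changed string stays forked until the upstream text is put back).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrackedString:
    """Hypothetical model of one translated string and its upstream value."""

    upstream: str                    # latest translation imported from upstream
    override: Optional[str] = None   # local change, if the string was forked

    @property
    def forked(self):
        return self.override is not None

    @property
    def shown(self):
        """The translation that would be used: the fork if any, else upstream."""
        return self.override if self.forked else self.upstream

    def translate(self, text):
        """A local edit forks the string; restoring the upstream text unforks it."""
        self.override = None if text == self.upstream else text

    def import_upstream(self, text):
        """A new upstream revision never overwrites an existing fork."""
        self.upstream = text
        if self.override == text:
            self.override = None  # upstream caught up with the local change


if __name__ == "__main__":
    s = TrackedString(upstream="Fichier")
    s.translate("Dossier")              # fork the string
    s.import_upstream("Fichier (v2)")   # upstream changes later
    print(s.shown, s.forked)            # the fork is kept
```

A "merge with upstream" filter like the one mentioned above would then only need to list the strings for which forked is true.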

< aleka> Where can I get help in acquiring the fonts that I need for translating (Amharic - Ethiopic fonts) that work in Linux? This question was emailed to the team leader twice with no response.

  • you might want to ask on the ubuntu-devel list. Mark Shuttleworth has a big interest in getting Ubuntu working out of the box for special scripts like yours. If there are free fonts, it should be easy. [danilos] finding that information is sometimes hard even for the team leader, since it might be scattered all over the internet



MeetingLogs/openweekedgy/Rosetta (last edited 2009-07-20 13:36:13 by p54A13477)