LanguagePacksForUniverse
Launchpad Entry: https://launchpad.net/distros/ubuntu/+spec/language-packs-for-universe
Created: 2006-06-19 by CarlosPerelloMarin
Contributors: CarlosPerelloMarin, MartinPitt
Packages affected:
Summary
Universe lacks any kind of Rosetta integration, so it is difficult to get translation updates for it. We need to provide a solution like the one for main, or an improved one.
Rationale
Although we don't support Universe, people are interested in applications from Universe, so there are people interested in translating them and in seeing them fully translated.
This would allow a full distribution translation.
Use cases
- Peter wants to translate VLC for dapper. He knows it's not supported by Canonical, but he really likes that application, so he translates it, and with the next language packs he gets his translation deployed.
- Jordi finds a bug in the Catalan translation of Galeon, which is in universe. He fixes it and waits for the next language pack update to see it fixed for all Catalan users of that application.
Scope
This spec requires updates in:
- Rosetta.
- Depending on the solution chosen, we would need to update apt-get, update-manager, langpackomatic, libc, etc.
Design
We need a system that allows us to distribute translations for any application in Universe. It should meet the following requirements:
- You would install only the translations that you are interested in. No extra languages that are not related to your preferred ones, and no translations for applications that you don't have installed.
- You should be notified when a new translation is available so you can get updates, even after release time.
- The updates must not overwrite any file on the system that is handled by dpkg and is not related to language packs.
- The metadata needed to handle translation updates should be as small as possible.
- The system must not be .po/.mo specific. It should also update other files such as documentation, .desktop files, Firefox/OpenOffice language packs, and any other translatable resource we could have in our system.
- We should be able to expand its usage to support our installer. This will allow us to use this solution with our 'main' repository, so even if we choose a different approach from what we have in main, we could migrate main to the new solution.
- The solution should be able to use mirrors to distribute the files so we don't overload Launchpad or a single server.
- The translations should be installed and available just after the application is installed. Users should not need to be aware that the translations are not part of the binary application. They don't care about that, they just want their application fully translated right after its installation.
- When an application is removed, its translations should be removed too so we don't waste resources.
We will use exactly the same solution we are using in main. All translations for a group of languages are stored inside a single .deb package with a -base package for the initial distribution release and a -updates package for the monthly updates after release. Also, all translations from the binary packages are stripped so we only ship translations as part of the language packs.
Design Rationale
While thinking about this, we found several solutions. All of them have pros and cons, and we document them here so that we have them at hand if the requirements change:
1. Use exactly the same solution we are using in main. All translations for a group of languages are stored inside a single .deb package, with a -base package for the initial distribution release and an -updates package for the monthly updates after release. Also, all translations are stripped from the binary packages, so we only ship translations as part of the language packs.
   - Pros:
     - Simple to manage: we already have all the infrastructure to create, maintain and deploy it.
   - Cons:
     - The number of packages in universe is higher than in main, and thus the language packs would be bigger.
     - You would need to install a lot of translations even if you are interested in just one package from Universe.
     - We would need around 200 new packages (100 for base, 100 for the updates).
2. Same solution as before, but without using -base packages and without stripping translations from binary packages.
   - Pros over previous option:
     - The initial language pack will be much smaller because it will only contain changes done in Ubuntu/Rosetta. The other translations stay in the binary packages, so users will have installed only the translations that they need plus all updates for universe translations.
     - We will need “only” around 100 new packages for the -updates packages.
   - Cons over previous option:
     - If translators do a good job, the language pack would quickly become large, and thus we don't get a big win.
     - We would have two different language pack behaviours, because main must do the strip to save some disk space for the installation/Live CD.
3. We could create as many packages as there are languages supported by a source package, so we could have something like:
     foo_0.1-0ubuntu1_i386.deb
     language-pack-af-foo_6.06+20060603_all.deb
     language-pack-ca-foo_6.06+20060603_all.deb
     language-pack-es-foo_6.06+20060603_all.deb
     ...
     language-pack-pt-BR-foo_6.06+20060603_all.deb
     language-pack-zh-CN-foo_6.06+20060603_all.deb
     language-pack-zh-TW-foo_6.06+20060603_all.deb
     bar_1.3-1ubuntu13_i386.deb
     language-pack-af-bar_6.06+20060603_all.deb
     language-pack-ca-bar_6.06+20060603_all.deb
     language-pack-es-bar_6.06+20060603_all.deb
     ...
     language-pack-pt-BR-bar_6.06+20060603_all.deb
     language-pack-zh-CN-bar_6.06+20060603_all.deb
     language-pack-zh-TW-bar_6.06+20060603_all.deb
   - Pros:
     - You will install exactly the translations that you want and nothing more. Also, you would remove them when the application is no longer used. This means less wasted disk space.
     - We could update language packs more often than we do currently (once per month) because users will get exactly the translations that they are using.
   - Cons:
     - The number of packages would explode. Say we have around 35 languages on average. After some checks, we know that only about 10% of the nearly 12000 source packages in Universe (around 1100) have translatable resources, so we would need around 38500 new .deb packages (35 × 1100) to ship translations, and that number will grow when people start doing more translations.
     - Due to the number of packages we are talking about, the metadata would be really huge and most of it would be duplicated.
     - apt-get's Packages.gz file will grow a lot.
4. Same solution as before, but expanding apt-get to accept Packages-LANG.gz, so that we add the language packs to their own Packages file. This will be backward compatible because Packages.gz will still be there, but without translations, and the new apt-get would take advantage of the language packs if they are present.
   - Pros:
     - Same as before.
     - Mirrors could mirror just the languages they are interested in.
   - Cons:
     - Same as before, except that Packages.gz will stay at an acceptable size.
5. Same solution as before, but using different components instead of adding the Packages-LANG.gz files. The advantage over the previous solution is that apt-get does not need to change. The disadvantage is that we would need around 35 * len(main, universe, multiverse, restricted) new components.
6. Finally, we could stop using .deb packages to handle translations and create our own apt-lite system, integrated with update-manager and apt-get, that wouldn't have dependencies or a versioning system other than a catalogue with a list of source packages and links to their translatable resources. The system will just use MD5 sums to know whether you have the latest version of a translation and will warn you about new translations available. You will get only translations for installed packages, and the system would remove translations for uninstalled applications. Of course, those files will be installed inside a directory (/usr/share/locale-langpack) that will not be handled by dpkg, so we don't overwrite anything from dpkg.
   - Pros:
     - The metadata will be really small.
     - We could update all translations daily if we want. That will not produce an update of the other translatable resources because every file is independent of the others.
   - Cons:
     - Because we don't handle versions, we cannot restore a distro release to the exact translation state it had at release. Jeff Bailey pointed this out as a possible problem for the support team.
     - We need to do a light implementation of apt-get.
     - The mirroring scripts for Ubuntu cannot be reused to mirror translations.
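To make the catalogue-based option more concrete, here is a minimal sketch of the check it describes: compare MD5 sums against a catalogue, fetch only what is missing or outdated for installed packages and selected languages, and remove files that are no longer wanted. The `CatalogueEntry` layout, the `plan_update` function and the idea of passing the installed packages as a plain set are assumptions made for illustration only. The spec does not define a catalogue format or a client tool.

```python
import hashlib
from pathlib import Path
from typing import NamedTuple


class CatalogueEntry(NamedTuple):
    """One translatable resource in the (hypothetical) catalogue."""
    source_package: str
    language: str
    relative_path: str  # path below the langpack root directory
    md5: str            # hex digest of the latest published file


def md5_of(path: Path) -> str:
    """Return the MD5 hex digest of a file's contents."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def plan_update(catalogue, installed_packages, languages, root):
    """Return (to_fetch, to_remove) for the current system state.

    to_fetch:  catalogue entries whose file is missing or has a different MD5.
    to_remove: files under root that belong to no wanted entry, for example
               translations of packages that have been uninstalled.
    """
    root = Path(root)
    # Only installed packages and selected languages are of interest.
    wanted = [
        entry for entry in catalogue
        if entry.source_package in installed_packages
        and entry.language in languages
    ]
    to_fetch = []
    for entry in wanted:
        target = root / entry.relative_path
        if not target.is_file() or md5_of(target) != entry.md5:
            to_fetch.append(entry)
    wanted_paths = {root / entry.relative_path for entry in wanted}
    to_remove = sorted(
        path for path in root.rglob("*")
        if path.is_file() and path not in wanted_paths
    )
    return to_fetch, to_remove
```

On a real system `root` would be the directory named above, /usr/share/locale-langpack. Note that the sketch carries no version information at all, which is exactly the limitation listed in the cons: it can tell whether a file is current, but not roll back to the state of a given release.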
Final decision
After checking all those options, we decided to go with solution #1, for the following reasons:
- No new infrastructure is needed.
- After some checks with Universe, we discovered that although universe has twice as many packages with translatable resources as main (around 1100 vs 550), the amount of translations is lower in universe than in main, so universe language packs would be smaller than the ones in main. This may change in the future, but in the meantime we can go with this simple solution.
- The fact that we use -base and -updates packages allows us to use exactly what we have in main, so we don't need different handling of language packs for main and universe.
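The package counts quoted in the design rationale can be reproduced with a few lines of arithmetic, which makes it easier to redo the comparison if the inputs change. This is only a sketch: every constant below is a figure quoted in this spec (about 100 language packs, 35 languages on average, around 1100 universe source packages with translatable resources), and the variable names are made up for illustration.

```python
# All inputs are figures quoted in this spec, not measurements.
LANGUAGE_PACKS = 100          # "100 for base, 100 for the updates"
AVG_LANGUAGES = 35            # average number of languages per translated package
TRANSLATABLE_PACKAGES = 1100  # universe source packages with translatable resources
COMPONENTS = ("main", "universe", "multiverse", "restricted")

estimates = {
    # Same approach as main: one -base and one -updates package per language pack.
    "same as main (-base and -updates)": 2 * LANGUAGE_PACKS,
    # Only -updates packages, translations stay in the binary packages.
    "-updates only, no stripping": LANGUAGE_PACKS,
    # One language pack per language and per source package.
    "one .deb per language and source package": AVG_LANGUAGES * TRANSLATABLE_PACKAGES,
    # Per-language archive components: these are components, not packages.
    "per-language components": AVG_LANGUAGES * len(COMPONENTS),
}

for approach, count in estimates.items():
    print(f"{approach}: about {count}")
```

The per-package approach is two orders of magnitude larger than the approach already used for main, which is the main reason it was considered too heavy for the archive metadata.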
Data preservation and migration
Nothing is needed.
Outstanding issues
BoF agenda and discussion
ago 18 13:03:38 <sabdfl> hi guys ago 18 13:03:46 <danilos> hi sabdfl ago 18 13:04:14 <sabdfl> is anyone from the distro side going to join us? ago 18 13:04:26 <carlos> pitti ago 18 13:04:40 --> pitti (n=pitti@ubuntu/member/pitti) ha entrado en #cm ago 18 13:04:42 <pitti> hi ago 18 13:04:43 <carlos> hi ago 18 13:04:50 <pitti> sorry, I have to learn when 'cm' is an abbreviation and when not :) ago 18 13:04:52 <sabdfl> hey pitti ago 18 13:04:56 <pitti> hi sabdfl ago 18 13:05:01 <carlos> pitti: I have always the same problem ;-) ago 18 13:05:05 <sabdfl> ok, let's start ago 18 13:05:08 <danilos> sure ago 18 13:05:11 <sabdfl> thanks for looking at this in paris ago 18 13:05:19 <pitti> so, I gave this some thought ago 18 13:05:20 <sabdfl> i wasn't in the discussions but i have read the spec ago 18 13:05:30 <pitti> in the end, I think only solutions 1 and 6 make sense ago 18 13:05:52 <pitti> the others seem to be abusing the archive structure and apt in nasty ways ago 18 13:06:15 <carlos> sabdfl: our point was not to go with current language packs because we are scared of the other solution, but that we didn't have enough time for Edgy to do all modifications ago 18 13:06:28 <carlos> so we choose the only realistic solution for Edgy ago 18 13:06:29 <sabdfl> carlos: then drop universe for edgy ago 18 13:06:38 <pitti> sabdfl: I think it boils down to: (1) can be done with a snap of the finger and is known to work, but not really pretty ago 18 13:06:48 <carlos> and documented the others for edgy + 1 or any other release when we get some extra time ago 18 13:06:54 <pitti> (6) is a lot of work, but offers better features ago 18 13:07:21 <pitti> like direct integration into apt/synaptic/etc., so that when you install a package, you'd automatically get translations, too, etc. 
ago 18 13:07:24 <sabdfl> i would like 6, but understand that it cannot be done for edgy ago 18 13:07:34 <carlos> pitti: I think so, yes ago 18 13:07:48 <sabdfl> the "translation manager" would need to get notifications, so it would know how to go look for relevant translations, yes ago 18 13:08:02 <pitti> sabdfl: I also think (6) should be discussed with the admins to judge the ramifications of mirroring, etc. ago 18 13:08:02 <carlos> danilos: just in case you don't know the document we are talking about... https://wiki.ubuntu.com/LanguagePacksForUniverse ago 18 13:08:14 <danilos> carlos: yeah, I know what you're talking about :) ago 18 13:08:43 <sabdfl> how big is the pressure for SOMETHING in edgy for universe? ago 18 13:08:53 <carlos> not too big ago 18 13:08:56 <pitti> btw, carlos and I shortly discussed solution (2) yesterday ago 18 13:09:00 <sabdfl> will people be happier if we deliver better translation of main, for example, instead? ago 18 13:09:07 <carlos> and we are in time to remove it now (nothing has been imported yet) ago 18 13:09:13 <pitti> sabdfl: right now folks are crying for edgy translations ago 18 13:09:19 <sabdfl> of main ago 18 13:09:20 <pitti> for dapper the process works reasonably well ago 18 13:09:55 <danilos> sabdfl: translation is always measured in its completeness, and it depends on all software being translated ago 18 13:09:57 <sabdfl> are there any other nice features planned for edgy, like package description translations, or a better way to translate the menu items? 
ago 18 13:10:04 <pitti> sabdfl: I didn't hear particular requests for universe translations yet ago 18 13:10:08 <carlos> We already promissed universe lang packs not sure if officially, but people know about it, but I don't think it would be too bad to defer that feature ago 18 13:10:11 <danilos> sabdfl: especially with things such as "stock" gtk+ buttons, menus, which end up translated even in non-translated stuff ago 18 13:10:14 <pitti> but that might also be due to the fact that you don't miss what isn't there :) ago 18 13:10:21 <sabdfl> danilos: we have SUCH a lot of stuff in main that could be better translated, i think universe is not the highest priority ago 18 13:10:50 <danilos> sabdfl: it's not about A LOT, it's about WHAT: if people use regularly some stuff from universe, they'd want to have it translated, IMHO ago 18 13:11:00 <carlos> sabdfl: yes, I think the package descriptions will be part of Edgy ago 18 13:11:04 <pitti> sabdfl: e. g. I think getting a better process for firefox and OpenOffice should be higher prioritized than getting an awesome universe translation structure ago 18 13:11:09 <carlos> at least I saw movement in that spec ago 18 13:11:10 <sabdfl> danilos: we won't strip the translations from the packages, so they will get pre-release translations ago 18 13:11:18 <danilos> sabdfl: yeah, right ago 18 13:11:20 <sabdfl> could we stuff rosetta translations INTO packages at build time? ago 18 13:11:21 <carlos> pitti: we are doing that already for Edgy ago 18 13:11:40 <pitti> btw, can we shortly discuss the 'not strip packages' case? ago 18 13:11:41 <carlos> sabdfl: I don't think that's possible until bazaar integration is done ago 18 13:11:41 <sabdfl> so, for example, if there is a security update then you get newer translations too as a bonus? 
ago 18 13:11:44 <pitti> this would be solution (2) ago 18 13:12:06 <danilos> I don't know that much about our packaging infrastructure to be able to answer any of this :) ago 18 13:12:09 <pitti> sabdfl: this would work, it would just be a huge hit if we updated translations without other updates ago 18 13:12:13 <sabdfl> carlos: why, surely it can be done as an explicit msgset merge command during the package build? ago 18 13:12:16 <pitti> and universe gets *very* few updates post-release ago 18 13:12:47 <pitti> users will rightfully yell at us if they have to re-download all their universe packages just to get some new strings ago 18 13:12:51 <sabdfl> pitti: well, it would allow us to say "get in and translate universe before release and most of those translations will go into the release itself" ago 18 13:12:52 <danilos> pitti: wouldn't that solution involve patching all the software in universe? ago 18 13:13:01 <pitti> that seems much worse to me than downloading an 1 MB update langpack ago 18 13:13:04 <sabdfl> i agree, pakcage update just for translations would be bong ago 18 13:13:37 <carlos> sabdfl: we don't have a way to get .po files directly from rosetta so we would need to develop a new script for that with some kind of XMLRPC API because parsing emails doesn't sounds like a good solution... ago 18 13:13:39 <pitti> sabdfl: right, pre-release that's certainly fine, but still requires package rebuilds, version bumps, and lots of bandwith waste ago 18 13:14:08 <sabdfl> pitti: true - a huge rebuild at rc1 would cause potential issues ago 18 13:14:17 <pitti> sabdfl: TBH I'd prefer update langpacks (solution 2) over rebuilding packages, at least with our current way of doing source packages ago 18 13:14:19 <danilos> i.e. I don't understand how did anyone think solution 2 would work: more patching of glibc gettext() stuff? 
ago 18 13:14:22 <sabdfl> and we would have no easy way to correct bad rosetta translations discovered after release ago 18 13:14:42 <pitti> danilos: yes, that's what I wanted to discuss ago 18 13:14:49 <pitti> I think we have to do a glibc change anyway ago 18 13:15:17 <pitti> I think it's a bad idea to rebuild all 1100 universe packages just to strip all universe translations ago 18 13:15:24 <pitti> (1100 universe packages that have translations, that is) ago 18 13:15:38 <danilos> pitti: but this would be for using translations from two different domains, or simply including a complete PO/MO file if there was a change in Rosetta? ago 18 13:15:45 <pitti> so with both solution (1) and (2) we need that glibc hack for a transition period ago 18 13:16:04 <carlos> danilos: the latter ago 18 13:16:08 <pitti> danilos: I'd go with updating a complete mo ago 18 13:16:09 <danilos> I mean, after a while, it would be simply including all stuff in langpacks ago 18 13:16:15 <carlos> danilos: right ago 18 13:16:16 <pitti> danilos: otherwise the runtime impact is too high ago 18 13:16:22 <danilos> pitti: agreed ago 18 13:16:26 <pitti> danilos: with solution 1 that is ago 18 13:16:48 <pitti> but we should first agree to a high-level goal ago 18 13:16:55 <danilos> sure ago 18 13:17:00 <sabdfl> is (2) ever going to be better than (6) if we can actually do (6)? ago 18 13:17:15 <sabdfl> if not, then I would rather defer, and do (6) for edgy+1 ago 18 13:17:16 <pitti> do we want universe translations, like *now*, or do we want to do them in a really sophisticated and cool way, but in maybe one year? ago 18 13:17:23 <sabdfl> pitti: the latter ago 18 13:17:28 <danilos> btw, do you have any estimate of the size of per-language langpacks for universe? I figure they'd be huge ago 18 13:17:33 <carlos> then edgy should not have universe translations ago 18 13:17:36 <carlos> updates ago 18 13:17:36 <pitti> i. e. 
(1)/(2) now or (6) later ago 18 13:17:44 <pitti> danilos: no, they aren't ago 18 13:17:46 <carlos> danilos: less than main ago 18 13:17:50 <pitti> danilos: in fact ATM they would be smaller than main ago 18 13:17:55 <danilos> hum, nice :) ago 18 13:18:08 <pitti> the most translation-laden stuff is desktop stuff, which is mostly in main ago 18 13:18:11 <carlos> danilos: universe packages translations are really bad compared with main ones ago 18 13:18:11 <danilos> I'd vote for better implementation, even if it delays it ago 18 13:18:13 <sabdfl> danilos: the packages are less popular, therefor less translated (and often less likely to be translatable) ago 18 13:18:39 <sabdfl> ok, this looks clear to me - don't try to do it badly for edgy, focus on better quality translations of main ago 18 13:18:53 <pitti> that's why we stated that global universe langpacks would be a possible alternative ago 18 13:19:10 <carlos> ok ago 18 13:19:25 <pitti> if we do (6), then we probably lose some of the infrastructure features of apt ago 18 13:19:38 <sabdfl> if we do (6) we should keep it super simple ago 18 13:19:41 <pitti> like, straightforward mirroring, CD burning, debmirror, etc. ago 18 13:19:54 <carlos> pitti: we added a solution for that, right? ago 18 13:19:55 <danilos> jigdo? 
:) ago 18 13:20:01 <sabdfl> the "packages" file should just be names and dates ago 18 13:20:02 <pitti> but if we do a simple and efficient HTTP lookup/download and a nice integration into apt, this should cover the common case as well ago 18 13:20:10 <carlos> prepare a .deb package that installs the initial .mo files ago 18 13:20:18 <carlos> to include it in the installation CD ago 18 13:20:18 <sabdfl> i would not integrate into apt, have a separate tool which handles it ago 18 13:20:25 <sabdfl> carlos: +1 ago 18 13:20:45 <pitti> sabdfl: apt integration> I meant 'apt-get install foo' would also take care of downloading the corresponding translations for your selected languages ago 18 13:20:46 <danilos> what initial .mo files? are there any from universe on installation CD? ago 18 13:20:48 <pitti> I think that would be cool ago 18 13:21:01 <danilos> pitti: +1 ago 18 13:21:03 <pitti> sabdfl: likewise for g-a-i, synaptic, etc. of course ago 18 13:21:14 <pitti> danilos: universe packages aren't shipped on CDs ago 18 13:21:35 <danilos> pitti: so why the need for "initial .mo files" (I may be misunderstanding this altogether) ago 18 13:21:38 <danilos> ? ago 18 13:21:42 <carlos> danilos: we start with universe and if it works, we will move it to main ago 18 13:21:53 <carlos> so we need a solution for main too ago 18 13:21:56 <danilos> carlos: ah, ok ago 18 13:22:02 <carlos> even if we are not going to use it with first release ago 18 13:22:22 <pitti> carlos: that's what I mean with 'losing infrastructure features' ago 18 13:22:26 <carlos> sabdfl: the idea is to use an apt-get hook to do that installation that pitti said ago 18 13:22:34 <pitti> i. e. 
it's nontrivial to ship langpacks on CDs that way ago 18 13:22:38 <pitti> of course it's possible ago 18 13:22:44 <carlos> yeah ago 18 13:22:59 <carlos> pitti: but that's the same problem as no using apt-get and dpkg standard infrastructure ago 18 13:23:05 <danilos> carlos: isn't it simpler to provide a file:/// handler at the same time as http:///, so you can install them in the same way from CD? ago 18 13:23:18 <danilos> if we're to go with patching apt-get, that is ago 18 13:23:38 <pitti> apt-get should just call a hook, which then calls 'update-translations pkg-foo' or so ago 18 13:23:39 <carlos> danilos: not sure, we should expand the spec after edgy release (or once we get all our tasks done) ago 18 13:23:51 <pitti> apt-get itself should know as little about this langpack thingy as possible IMHO ago 18 13:24:01 <carlos> pitti++ ago 18 13:24:29 <carlos> we should try to get this solution working without patching a bunch of packages ago 18 13:24:32 <pitti> so, solution (6) needs some long and intense discussion, we should do this in person on the next meeting with admins involved, too ago 18 13:24:58 <carlos> pitti: agreed ago 18 13:25:11 <carlos> also, I think Jeff should be present too ago 18 13:25:14 <pitti> carlos: yes, the initial infrastructure would boil down to 'download de/es translations from this server and put them into this path', and the matchign glibc patch ago 18 13:25:26 <danilos> pitti: yeah, and I'd still want to push some more features into language packs when that discussion happens :) ago 18 13:25:28 <pitti> and then we need the proper integration into the package managers and language selector ago 18 13:25:45 <sabdfl> we could over-engineer this very easily :-) ago 18 13:25:53 <sabdfl> here's what I would care about ago 18 13:26:10 <sabdfl> (1) the index file has a date, so you always know if one is newer than another ago 18 13:26:25 <pitti> (or just a counter) ago 18 13:26:45 <sabdfl> (2) the index file is as small as possible, so we 
don't get a Packages.gz style multi-mb download just to find out if you need to fetch new translations ago 18 13:27:28 <sabdfl> (3) each updated PO/MO file is listed with just distrorelease, name, language and date, so again you always know if you have the newest ago 18 13:27:35 <pitti> it should become something like 2000 packages * 100 languages * 50 byte per entry = 10 MB ago 18 13:27:49 <sabdfl> hmm... index per language, then ago 18 13:27:53 <pitti> right ago 18 13:28:10 <danilos> I doubt there'd be 100 translations per package in universe ago 18 13:28:12 <pitti> this will spread the load on the mirrors and bw requirement ago 18 13:28:17 <danilos> only best translated packages have that many ago 18 13:28:20 <pitti> no, this was just a maximum estimation ago 18 13:28:22 <sabdfl> so system needs to track "which mo files do I need, and which do I have, and what dates are on them" ago 18 13:28:26 <carlos> danilos: but we would have them for main ago 18 13:28:45 <sabdfl> danilos: it should scale by the number of languages installed, not available ago 18 13:28:53 <pitti> max. 100 kB index per language sounds reasonable ago 18 13:29:00 <sabdfl> so if people install more languages, they need more index files, not the other way round ago 18 13:29:07 <pitti> most users will only care about one or two languages anyway ago 18 13:29:12 <sabdfl> exactly ago 18 13:29:17 <danilos> sabdfl: that's a better reason ago 18 13:29:31 <pitti> and 100 index files on the mirrors sounds reasonable, too ago 18 13:29:31 <sabdfl> another nice thing is that we could make it easy to turn on automatic language updates ago 18 13:29:40 <sabdfl> because it's not like packages where you definitely want user approval ago 18 13:29:44 <pitti> update-notifier! 
:) ago 18 13:29:45 <sabdfl> newer translations are better ago 18 13:29:51 <sabdfl> it should just DOIT ago 18 13:29:57 <sabdfl> unless the sysadmin has turned that off ago 18 13:30:03 <danilos> yeah ago 18 13:30:10 <carlos> sounds good ago 18 13:30:14 <pitti> sabdfl: well, a 'do this automatically from now on' checkbox ago 18 13:30:17 <sabdfl> i would like us to develop this, use it for universe, and then use it for main too if it is smooth ago 18 13:30:19 <pitti> for the poor modem users ago 18 13:30:28 <sabdfl> sure ago 18 13:30:45 <carlos> pitti: just like update-manager handles automatic .deb installation ago 18 13:30:51 <sabdfl> we can also then to langpack updates on a per-package, per-language basis ago 18 13:31:00 <pitti> for this solution we would gradually strip universe packages AFAICS ago 18 13:31:04 <danilos> sabdfl: well, that's what we're getting with this :) ago 18 13:31:19 <danilos> pitti: with the next update, I'd say ago 18 13:31:20 <pitti> per-package> yes, that's the whole point of (6) ago 18 13:31:28 <sabdfl> so translation teams can sign off on an updated package-language, then press a button and queue it for publishing ago 18 13:31:50 <pitti> so at first we wouldn't do any apt-get magic, but just do a little u-n integration and some backend which downloads and installs tarballs with mo files ago 18 13:31:55 <sabdfl> pitti: i meant the timing of the updates, not the downloading of them ago 18 13:32:01 <sabdfl> at the moment, we do them all once a month ago 18 13:32:15 <sabdfl> but it would be better if a translation team completes a package for them to say "publish now, please" ago 18 13:32:16 <pitti> sabdfl: I think with that granularity they could even be updated daily ago 18 13:32:21 <carlos> sabdfl: well, we can update just the ones updated that week... 
ago 18 13:32:27 <carlos> sabdfl: even daily ;-) ago 18 13:32:42 <carlos> sabdfl: we only update the ones that got new translations ago 18 13:32:44 <sabdfl> yes, there should be a sweeper process which pushes newer translations out automatically after a while ago 18 13:32:46 <pitti> well, the period doesn't matter so much, it can be adapted to the mirror load ago 18 13:32:56 <sabdfl> but i think the translation teams will like being able to work, test work, test, "release" ago 18 13:33:23 <sabdfl> pitti: will you talk to archive managers about putting these in the dists tree? ago 18 13:33:27 <danilos> aha, option "pull directly from rosetta"? if we didn't have async exports, it would be possible today :) ago 18 13:33:41 <sabdfl> do we want mirrors to be able to mirror just certain languages? ago 18 13:33:49 <carlos> sabdfl: I think so ago 18 13:33:52 <pitti> sabdfl: I don't think they will like it there; should probably become translations.launchpad.net or translations.ubuntu.com or so ago 18 13:34:02 <danilos> sabdfl: if we have indexes per language, they'd already be able to ago 18 13:34:06 <pitti> danilos: no, not from rosetta; the DDoS will kill it ago 18 13:34:06 <sabdfl> danilos: we will push daily or on-demand to a staging server, and translators can fetch their updates from there ago 18 13:34:24 <sabdfl> danilos: not just the indexes, but the actual mo files ago 18 13:34:33 <sabdfl> hmm.. 
that's probably so little data it does not matter hugely ago 18 13:34:49 <danilos> pitti: I am thinking only of the "work, test, work, test" part of sabdfl's idea: pull directly from rosetta for testing by translators ago 18 13:34:56 <sabdfl> pitti: it could be even smaller than 100kb ago 18 13:34:59 <pitti> danilos: right, that would work ago 18 13:35:15 <pitti> danilos: but we shouldn't have a single server for the default updating mechanism ago 18 13:35:32 <danilos> pitti: right, anyway, more discussions with infrastructure pending ago 18 13:35:51 <sabdfl> if the index is just domain (mo template filename) and date (or counter) ago 18 13:35:52 <pitti> that being said, yes, I can talk to the soyuz and admin guys about a proper place to stick them to ago 18 13:36:02 <sabdfl> ok, thanks ago 18 13:36:12 <carlos> sabdfl: the index should have also a link with the sourcepackagename ago 18 13:36:15 <pitti> the archive itself, a separate hierarchy on archive mirror, network, or a parallel network ago 18 13:36:36 <carlos> sabdfl: so we know the .mo files to install when we get a new package installed ago 18 13:36:41 <sabdfl> pitti: i would like the default to be that current mirrors just acquire this data ago 18 13:36:52 <sabdfl> because we are going to be taking it out of packages, which they would otherwise mirror anyway ago 18 13:37:13 <sabdfl> carlos: the package can register those on installation ago 18 13:37:28 <pitti> sabdfl: true; it should work with rsync, won't work with debmirror (I don't know which is used where) ago 18 13:37:31 <sabdfl> so install the package, it tells the translation manager "i would use translation domains X, Y and Z" ago 18 13:37:38 <carlos> we don't have that information on build time, we can guess it ago 18 13:37:44 <carlos> sabdfl: but Rosetta knows it for sure ago 18 13:37:47 <danilos> can't we have a simple "install-translations-for-current-languages MOFILE" which would be run in post-install? 
ago 18 13:38:01 <sabdfl> right - and we can also fix packages like that by hand, it just has to be done once
ago 18 13:38:06 <pitti> sabdfl: hm, the index file should just be per-package, not per-domain
ago 18 13:38:06 <carlos> sabdfl: and we can expand this to support documentation translations without needed to rebuild the packages
ago 18 13:38:30 <sabdfl> pitti: i don't think it should be per-package, because translation domains are required unique in any event
ago 18 13:38:41 <carlos> sabdfl: that's not always true...
ago 18 13:38:43 <sabdfl> and a single package might have several domains
ago 18 13:38:44 <pitti> danilos: please do not mention 'simple' and 'modify all package's postinst files' in one sentence :)
ago 18 13:38:56 <carlos> sabdfl: it should, but some upstreams doesn't follow that rule
ago 18 13:39:00 <danilos> pitti: heh, ok :)
ago 18 13:39:05 <sabdfl> it should not be in the postinst anyhow
ago 18 13:39:10 <sabdfl> it should be async on package install
ago 18 13:39:14 <pitti> no, it shouldn't touch source packages at all
ago 18 13:39:16 <sabdfl> install the package, then later update translations
ago 18 13:39:32 <pitti> sabdfl: I agree
ago 18 13:39:38 <sabdfl> pitti: or touch them in an automated way that the builder can do for all packages, like stripping
ago 18 13:39:38 <pitti> update-notifier updates daily anyway
ago 18 13:39:40 <pitti> that should be enough
ago 18 13:39:44 <danilos> sabdfl: doesn't that disconnect "list of domains" and "sourcepackagename"?
ago 18 13:39:53 <pitti> and synaptic/g-a-i can also trigger it manually
ago 18 13:40:09 <pitti> btw, domain/package name doesn't matter so much
ago 18 13:40:16 <pitti> rosetta can export a mapping between them easily
ago 18 13:40:19 <sabdfl> danilos: the sourcepackage name is irrelevant. the binary package is the "user" of the translation data
ago 18 13:40:31 <danilos> I should probably not comment on packaging issues which I am not very familiar with :)
ago 18 13:40:34 <sabdfl> the binary package should just know which .mo files it *would* use
ago 18 13:40:39 <danilos> sabdfl: yeah, that's what I meant
ago 18 13:40:54 <sabdfl> and the translation manager can then go "ok, i'll see if I can find those for the languages we need on this system"
ago 18 13:41:07 <sabdfl> but that shoudl happen separately from package install
ago 18 13:41:12 <pitti> sabdfl: the 'would' is tricky :)
ago 18 13:41:17 <pitti> but why not per source packae
ago 18 13:41:24 <pitti> that wouldn't be much worse IMHO, but much easier
ago 18 13:41:33 <carlos> sabdfl: I still think that's rosetta's job
ago 18 13:41:33 <danilos> sabdfl: yeah, I just don't know enough of how one would find that out: "get-mo-files-for-package PACKAGE"
ago 18 13:41:46 <danilos> know enough about packaging, that is
ago 18 13:41:47 <pitti> often a source package shares a translation domain with several binaries anyway
ago 18 13:41:47 <sabdfl> pitti: it's no problem for the builder to say "ok, you didn't say anything about mo files, and I see your source package has these domains, you are going to say you need all of them"
ago 18 13:42:04 <sabdfl> so by default, we jam all binaries to say they need all the mo files from their source package
ago 18 13:42:22 <pitti> sabdfl: right, we can create such a mapping in pkgstriptranslations
ago 18 13:42:30 <sabdfl> but the developers could tweak that if its appropriate, in the small number of cases where one binary uses one mo file, and another uses a different one, all fomr the same source package
ago 18 13:42:31 <pitti> it knows the domain<->binary mapping
ago 18 13:42:33 <danilos> for perfectionists, let them run strace on programs and let us know of any differences :)
ago 18 13:42:33 <carlos> but we already have it in Rosetta....
ago 18 13:42:48 <sabdfl> carlos: at the source package level, yes, not the binary package level
ago 18 13:43:05 <pitti> if we need a binary->domain mapping, let's create it in pkgstriptranslations
ago 18 13:43:06 <sabdfl> so, by default, the best we can do is assume that any binary produced by a source package, needs all the mo files produced from the same source package
ago 18 13:43:35 <sabdfl> then maintainers who care can tweak the packages so that binaries know *exactly* which subset of mo files they actually need
ago 18 13:43:41 <carlos> sabdfl: there are some situations
ago 18 13:43:43 <danilos> sabdfl: right
ago 18 13:43:45 <pitti> and then the client tool would download that mapping (tiny), then check out which domains it wants, and downloads the domain data
ago 18 13:43:50 <carlos> when a package doesn't have any .po file
ago 18 13:43:58 <carlos> but Rosetta have them
ago 18 13:44:03 <pitti> carlos: true
ago 18 13:44:03 <carlos> what would happen in that situation?
ago 18 13:44:11 <sabdfl> good point
ago 18 13:44:14 <carlos> the package will think there aren't translations
ago 18 13:44:28 <sabdfl> rosetta will need to tell the build system this information
ago 18 13:44:28 <carlos> but Rosetta knows that there are such translations
ago 18 13:44:32 <pitti> oh, and binaries might do bindtextdomain() which is shipped by other debs
ago 18 13:44:43 <danilos> pitti: yeah, iso_codes :)
ago 18 13:44:45 <pitti> well, let's do it by source package then
ago 18 13:44:47 <danilos> pitti: among the others
ago 18 13:45:01 <pitti> we can always switch the implementation transparently to per-binary on the clients
ago 18 13:45:08 <carlos> or php applications using .po files directly instead of .mo files...
ago 18 13:45:22 <sabdfl> it should definitely be possible for a binary to say "i would use mo file X"
ago 18 13:45:43 <sabdfl> and it should also require no work for maintainers, for it to "just work" by default, even if that results in slightly more mo files installed than needed
ago 18 13:45:45 <danilos> sabdfl: yeah, that was my idea of postinst, but there's probably a better way to do it :)
ago 18 13:45:53 <sabdfl> and then we can tweak the packaging to get the dependencies just right
ago 18 13:45:57 <pitti> sabdfl: at runtime, this is easy, but we need to figure it out statically
ago 18 13:46:16 <sabdfl> pitti: how do you mean, statically?
ago 18 13:46:29 <pitti> sabdfl: i. e. determine the domains an executable wants without executing it
ago 18 13:46:35 <pitti> on the buildds, for example
ago 18 13:46:41 <danilos> pitti: how hard would that be? if sourcepackage -> binarypackage1, binarypackage2, then domains for both binarypackages are same as source's ones
ago 18 13:47:02 <pitti> danilos: well, that's exaclty source package granularity, or am I missing something?
ago 18 13:47:15 <danilos> pitti: this is the first approximation, that's what sabdfl is proposing afaigi
ago 18 13:47:30 <danilos> and simply allow binary packages to be more specific
ago 18 13:47:33 <carlos> danilos: mark wants to install only a .mo file if the binary package uses it, so if binarypackage2 doesn't use any, we don't install any .mo file
ago 18 13:47:43 <danilos> i.e. you generate a static list like this, you can later modify it by hand
ago 18 13:47:49 <sabdfl> pitti: well, the binary package would either:
ago 18 13:47:49 <sabdfl> (a) have the mo files included, in which case we can strip them, and tell it to register its "dependency"
ago 18 13:47:49 <sabdfl> (b) depend on another binary package, which provides them, which would have the same thing happen to it
ago 18 13:48:00 <danilos> in most cases, you'd need no changes for most binary packages
ago 18 13:48:17 <pitti> sabdfl: unless we only have translations in Rosetta
ago 18 13:48:21 <pitti> sabdfl: i. e. for new languages
ago 18 13:48:46 <sabdfl> pitti: if there are ANY translations, we register the domain name
ago 18 13:48:47 <danilos> pitti: wouldn't you generate indexes and langpacks based on rosetta data anyway?
ago 18 13:48:52 <carlos> well
ago 18 13:48:54 <carlos> and KDE packages
ago 18 13:48:55 <pitti> sabdfl: yes, I agree it's a corner case
ago 18 13:49:02 <sabdfl> then if we add translations for new languages, the system translation manager would see them and fetch them
ago 18 13:49:07 <carlos> the binaries doesn't include any .po/.mo files
ago 18 13:49:13 <carlos> they have their own language pack
ago 18 13:49:15 <pitti> carlos: right, KDE is spethial, too
ago 18 13:49:19 <sabdfl> should we spec this in november?
ago 18 13:49:25 <sabdfl> we know we won't get to it for edgy
ago 18 13:49:32 <pitti> sabdfl: november> yes, would be nice
ago 18 13:49:37 <sabdfl> and we have 1.0 goals for rosetta
ago 18 13:49:44 <sabdfl> ok, then i'm very happy we had this conversation
ago 18 13:49:48 <carlos> sabdfl: yeah, I prefer to defer this until next meeting and talk a bit more face to face
ago 18 13:50:01 <pitti> if we all agree to do it right instead of do it now, then we should put more brain into this
ago 18 13:50:03 <sabdfl> carlos, will you summarise for the lists, and update the spec to say we will talk about it at the whole company meet in november?
ago 18 13:50:08 <carlos> I'm going to add this log to current spec
ago 18 13:50:12 <sabdfl> pitti: agreed
ago 18 13:50:15 <carlos> so we don't miss any of the points we did
ago 18 13:50:15 <danilos> carlos: great, thanks
ago 18 13:50:22 <sabdfl> thanks guys!
ago 18 13:50:27 <pitti> thank you, too
ago 18 13:50:33 <pitti> I'll let this mature a bit in my head
ago 18 13:50:36 <danilos> yeah :)
ago 18 13:50:38 <danilos> same here
ago 18 13:50:46 <carlos> pitti: could you talk with mdz about this?
ago 18 13:50:48 <pitti> it usually needs a few iterations :)
ago 18 13:50:53 <carlos> yeah
ago 18 13:50:58 <pitti> carlos: about the plan in general? sure
ago 18 13:51:07 <carlos> and the defer of the spec
ago 18 13:51:21 <pitti> we also have our admins at the allhands meeting, we'll need them, too
ago 18 13:51:28 <carlos> I will add again the restriction to import universe packages into Rosetta
ago 18 13:52:02 <carlos> and request Stuart to remove the entries we already have pending to be imported
ago 18 13:52:19 <pitti> carlos: what's wrong with importing universe translations at this point?
ago 18 13:52:35 <carlos> pitti: that people will translate something that never will be used
ago 18 13:52:48 <danilos> not exactly never, but not before edgy+1
ago 18 13:52:55 <carlos> that's why we didn't import it for Dapper
ago 18 13:52:58 <pitti> well, not displaying it is a different point
ago 18 13:52:59 <danilos> and some translations may "rot" until that happens
ago 18 13:53:03 <carlos> danilos: but we prefer people translating main
ago 18 13:53:08 <danilos> carlos: right
ago 18 13:53:19 <pitti> carlos: OTOH I assume you can import the existing universe later without new uploads
ago 18 13:53:23 <danilos> carlos: and that would likely move serbian off the first position :)
ago 18 13:53:36 <carlos> pitti: it would be a bunch of work in our side that I would prefer to save if we are not going to use them right now
ago 18 13:53:48 <carlos> pitti: no, we cannot
ago 18 13:53:55 <carlos> pitti: do you want it backported to dapper??
ago 18 13:54:11 <carlos> danilos: ;-)
ago 18 13:54:17 <carlos> pitti: or edgy?
ago 18 13:54:34 <danilos> dapper likely, if it's really to be lts
ago 18 13:54:53 <carlos> I think it changes a lot of things, we cannot add that to an stable release...
ago 18 13:55:13 <pitti> carlos: no, probably no backports
ago 18 13:55:13 <danilos> hum, possibly
ago 18 13:55:20 <carlos> pitti: then?
ago 18 13:55:20 <pitti> carlos: too intrusive changes on clients
ago 18 13:55:34 <carlos> why would you want to import universe later?
ago 18 13:55:38 <pitti> new glibc, new backend, new update-notifier, etc.
ago 18 13:55:58 <danilos> pitti: right, so lets forget about backporting
ago 18 13:56:02 <pitti> carlos: hm, true, just ignore me on this point
ago 18 13:56:05 <carlos> ;-)
ago 18 13:56:52 <carlos> ok, then do we agree on adding again the restriction for universe imports for Edgy and remove anything already in the queue for Universe?
ago 18 13:57:09 <danilos> carlos: yeah, I guess so
ago 18 13:57:22 <danilos> how do I find out what packages I have installed from universe? :)
ago 18 13:57:23 <carlos> I will remove it again when Edgy is released so we can import Edgy+1's universe
ago 18 13:58:06 <carlos> in fact... I think I'm just going to block dapper and edgy's universe
ago 18 13:58:19 <carlos> so nothing needs to change when we open edgy+1
ago 18 13:59:09 <carlos> ok, thanks for all
ago 18 13:59:15 * carlos updates the spec and prepares the email
ago 18 13:59:43 <pitti> thanks, guys
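Illustration (not part of the log): the default proposed in the discussion above could be sketched roughly as follows. Every binary package is assumed to need all translation domains of its source package unless a maintainer narrows the list, and the client then compares a small index of "domain and counter" against what it already has. This is only a sketch under those assumptions; the function names, the data layout and the package names are hypothetical, nothing here is an existing Rosetta, pkgstriptranslations or langpack interface, and per-language handling is left out for brevity.

```python
from typing import Dict, Iterable, List, Optional, Set


def default_domains(
    source_domains: Dict[str, Set[str]],
    binary_to_source: Dict[str, str],
    overrides: Optional[Dict[str, Set[str]]] = None,
) -> Dict[str, Set[str]]:
    """Map each binary package to the translation domains it would use.

    By default a binary gets every domain of its source package; an
    entry in `overrides` (hypothetical maintainer tweak) replaces that
    default for a single binary.
    """
    overrides = overrides or {}
    mapping: Dict[str, Set[str]] = {}
    for binary, source in binary_to_source.items():
        if binary in overrides:
            mapping[binary] = set(overrides[binary])
        else:
            mapping[binary] = set(source_domains.get(source, set()))
    return mapping


def domains_to_fetch(
    installed: Iterable[str],
    mapping: Dict[str, Set[str]],
    index: Dict[str, int],
    local: Dict[str, int],
) -> List[str]:
    """Return the domains whose index counter is newer than the local copy.

    `index` is the (assumed) published index: domain -> counter.
    `local` holds the counters of the translations already on the system.
    Domains missing from the index are skipped, since nothing is published.
    """
    wanted: Set[str] = set()
    for binary in installed:
        wanted |= mapping.get(binary, set())
    return sorted(
        domain
        for domain in wanted
        if domain in index and index[domain] > local.get(domain, -1)
    )
```

With this layout the per-binary refinement stays optional: a source package with no overrides needs no maintainer work, at the cost of possibly fetching a few more domains than strictly needed.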
LanguagePacksForUniverse (last edited 2008-08-06 16:20:06 by localhost)