Launchpad Entry: dpkg-lzma
Packages affected: apt, dpkg, some other large packages
Summary
The Ubuntu alternate CD is, as always, running low on space. LZMA compresses considerably tighter than gzip or bzip2, so we intend to allow packages to use LZMA compression as an alternative to both. Switching to lzma could save up to 175MB on the alternate CD compared with the current mix of gzip and bzip2, depending on which and how many packages we convert.
Release Note
Packages will now be allowed to use lzma compression in addition to the already supported gzip and bzip2 methods.
Rationale
This change is being proposed primarily to save space on the CDs. It would also save bandwidth for downloads from Ubuntu servers.
Converting only the ten packages with the largest savings would save 51,552,804 bytes (all sizes below are in bytes):
| package | original | none | gzip | bzip2 | lzma | lzma saves |
| openoffice.org-core | 37215228 | 112219498 | 38941278 | 37319546 | 27002374 | 10212854 |
| libgl1-mesa-dri | 12957664 | 36650404 | 12957398 | 13335790 | 2790486 | 10167178 |
| linux-restricted-modules-2.6.22-14-generic | 16532074 | 42184742 | 16533274 | 15971038 | 7212440 | 9319634 |
| linux-image-2.6.22-14-generic | 18541338 | 58920482 | 20914612 | 18646334 | 15008924 | 3532414 |
| openoffice.org-help-en-us | 10914796 | 26587030 | 12233476 | 11064148 | 7619690 | 3295106 |
| gnome-games-data | 9901616 | 16857246 | 9909868 | 8605290 | 6652348 | 3249268 |
| smbclient | 4885878 | 12474020 | 4885590 | 4550400 | 1754328 | 3131550 |
| libgcj8-1 | 10445312 | 35646900 | 10449232 | 10370336 | 7315722 | 3129590 |
| linux-headers-2.6.22-14 | 7775604 | 38287058 | 7912042 | 6404286 | 5016960 | 2758644 |
| openoffice.org-common | 12254184 | 30789672 | 15032414 | 12226340 | 9497618 | 2756566 |
Note: Some packages may not be compressible with lzma because of installation bootstrapping issues.
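The "lzma saves" column is simply the archive size minus the lzma size. As a quick cross-check of the total quoted above, the following sketch recomputes it from the table; the names `SIZES` and `lzma_savings` are illustrative and not part of any existing tool.

```python
# Sizes in bytes, copied from the table above:
# package -> (size currently in the archive, size when recompressed with lzma)
SIZES = {
    "openoffice.org-core": (37215228, 27002374),
    "libgl1-mesa-dri": (12957664, 2790486),
    "linux-restricted-modules-2.6.22-14-generic": (16532074, 7212440),
    "linux-image-2.6.22-14-generic": (18541338, 15008924),
    "openoffice.org-help-en-us": (10914796, 7619690),
    "gnome-games-data": (9901616, 6652348),
    "smbclient": (4885878, 1754328),
    "libgcj8-1": (10445312, 7315722),
    "linux-headers-2.6.22-14": (7775604, 5016960),
    "openoffice.org-common": (12254184, 9497618),
}


def lzma_savings(sizes):
    """Return {package: bytes saved by switching to lzma}."""
    return {name: original - lzma for name, (original, lzma) in sizes.items()}


savings = lzma_savings(SIZES)
total = sum(savings.values())

# Print the packages from largest to smallest saving, then the total.
for name, saved in sorted(savings.items(), key=lambda item: -item[1]):
    print(f"{name}: {saved:,}")
print(f"total: {total:,} bytes")
```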
The trade-off is that lzma is much slower to compress and uses much more memory during both compression and decompression. On the other hand, it decompresses more than twice as fast as bzip2, although it is still slower than gzip.
Example: openoffice.org-core
| type | size (bytes) | compression time | compression memory | decompression time | decompression memory |
| gzip | 38,941,278 | 26s | 1M | 1.0s | 1M |
| bzip2 | 37,319,546 | 24s | 8M | 7.5s | 4M |
| lzma | 27,002,374 | 107s | 370M | 3.3s | 34M |
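A comparison of this kind can be approximated with Python's standard library. This is only a sketch under stated assumptions: it is not how the figures above were produced, it does not measure memory use, and `lzma.FORMAT_ALONE` (the legacy .lzma container) is assumed to be the closest match to the lzma tool used here. The function name `compare_compressors` and the sample payload are made up for illustration.

```python
import bz2
import gzip
import lzma
import time


def compare_compressors(data: bytes) -> dict:
    """Compress and decompress `data` with each codec; report size and timings."""
    codecs = {
        "gzip": (gzip.compress, gzip.decompress),
        "bzip2": (bz2.compress, bz2.decompress),
        # Assumption: the legacy .lzma container rather than .xz.
        "lzma": (
            lambda d: lzma.compress(d, format=lzma.FORMAT_ALONE),
            lambda d: lzma.decompress(d, format=lzma.FORMAT_ALONE),
        ),
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        start = time.perf_counter()
        packed = compress(data)
        compress_seconds = time.perf_counter() - start

        start = time.perf_counter()
        unpacked = decompress(packed)
        decompress_seconds = time.perf_counter() - start

        if unpacked != data:
            raise RuntimeError(f"{name} round trip failed")
        results[name] = {
            "size": len(packed),
            "compress_seconds": compress_seconds,
            "decompress_seconds": decompress_seconds,
        }
    return results


# Placeholder payload; a real test would read an uncompressed data.tar instead.
sample = b"Ubuntu alternate CD package payload\n" * 20000
results = compare_compressors(sample)
for name, stats in results.items():
    print(name, stats)
```

Timings from a synthetic payload like this say little about real packages; the point is only the shape of the measurement.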
The following ODS file compares the sizes of all packages on the Ubuntu 7.10 Alternate CD when compressed with gzip, bzip2, and lzma against what is currently in the archive:
If we converted every package on the Ubuntu 7.10 alternate CD to a single format, the CD contents would take the following amount of space (in bytes):
| original | none | gzip | bzip2 | lzma |
| 675,113,008 | 1,816,544,474 | 708,735,160 | 643,077,920 | 501,460,266 |
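To relate these totals to the saving quoted in the Summary, the difference from the current archive can be computed directly. A small sketch, with the illustrative names `CD_SIZES` and `saving_vs_original`:

```python
# Whole-CD totals in bytes, copied from the table above.
CD_SIZES = {
    "original": 675113008,
    "none": 1816544474,
    "gzip": 708735160,
    "bzip2": 643077920,
    "lzma": 501460266,
}


def saving_vs_original(sizes, fmt):
    """Return (bytes saved, percent saved) for `fmt` relative to the current archive."""
    saved = sizes["original"] - sizes[fmt]
    return saved, 100.0 * saved / sizes["original"]


for fmt in ("gzip", "bzip2", "lzma"):
    saved, percent = saving_vs_original(CD_SIZES, fmt)
    # A negative value means the format would take more space than today.
    print(f"{fmt}: {saved:,} bytes ({percent:.1f}%)")
```

For lzma this gives 173,652,742 bytes, which is where the "up to 175MB" figure comes from.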
Use Cases
We would like to fit more packages onto the release CDs.
Assumptions
The primary assumption is that users will have the extra 30MB of RAM (over bzip2) needed to decompress packages.
Design
None (bulk of design work has already been done; this is chiefly deployment)
Implementation
UI Changes
None
Code Changes
The following packages need to have lzma support and currently have it:
- dpkg
  - installing - 1.13.25 (Tue, 2 Jan 2007 00:23:57 +0200)
  - creating - 1.14.0 (Tue, 08 May 2007 11:11:50 +0300)
- apt - 0.7.5 (Wed, 25 Jul 2007 20:16:46 -0300)
However, apt does not fully support lzma yet, although this does not appear to affect package upgrades. In particular, 'apt-ftparchive contents' did not work. I have included a patch that adds lzma support everywhere except one function, 'pkgAcqIndex::Failed' in apt-pkg/acquire-item.cc, which someone more familiar with apt should look over.
dpkg needs a Depends on lzma.
Launchpad will need to be modified to support lzma uploads and to check that the needed Pre-Depends is listed in the package.
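An upload check of this kind would first need to know which compression a package uses. A .deb is an ar archive whose data member is named data.tar plus a compression suffix, so the member names are enough to tell. The sketch below is not Launchpad code; `deb_member_names` and `data_compression` are hypothetical helpers, and only the common ar layout (60-byte member headers, no long-name extensions) is assumed.

```python
import io

AR_MAGIC = b"!<arch>\n"
AR_HEADER_SIZE = 60


def deb_member_names(deb_bytes: bytes) -> list:
    """List the member names of a .deb, read as a plain ar archive."""
    stream = io.BytesIO(deb_bytes)
    if stream.read(len(AR_MAGIC)) != AR_MAGIC:
        raise ValueError("not an ar archive")
    names = []
    while True:
        header = stream.read(AR_HEADER_SIZE)
        if len(header) < AR_HEADER_SIZE:
            break  # end of archive
        if header[58:60] != b"`\n":
            raise ValueError("corrupt ar member header")
        # Bytes 0-15 hold the name, bytes 48-57 the decimal size.
        name = header[0:16].decode("ascii").rstrip().rstrip("/")
        size = int(header[48:58].decode("ascii").strip())
        names.append(name)
        # Member data is padded to an even length.
        stream.seek(size + (size % 2), io.SEEK_CUR)
    return names


def data_compression(deb_bytes: bytes) -> str:
    """Return the compression used by the data.tar member of a .deb."""
    suffixes = {"": "none", ".gz": "gzip", ".bz2": "bzip2", ".lzma": "lzma"}
    for name in deb_member_names(deb_bytes):
        if name.startswith("data.tar"):
            return suffixes.get(name[len("data.tar"):], "unknown")
    raise ValueError("no data.tar member found")
```

Calling `data_compression(open(path, "rb").read())` on a package built with lzma would be expected to return "lzma", at which point the Pre-Depends check becomes relevant.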
To build packages with lzma compression when using debhelper, use: dh_builddeb -- -Zlzma
Migration
To support the migration to lzma compression, dpkg should depend on lzma, and packages that use lzma compression should pre-depend on that version of dpkg or newer.
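Whether a package declares such a pre-dependency can be read from its control data. The following is a minimal sketch, assuming the simple single-line form `dpkg (>= VERSION)`; it ignores folded fields and alternatives, does not compare versions, and `dpkg_predepends_version` is a hypothetical helper. The version string in the example is a placeholder, since this specification does not name the dpkg version that will depend on lzma.

```python
import re

# Matches one dependency entry of the form "dpkg (>= VERSION)".
DPKG_ENTRY = re.compile(r"\s*dpkg\s*\(\s*>=\s*([^\s)]+)\s*\)\s*")


def dpkg_predepends_version(control_text: str):
    """Return the minimum dpkg version named in Pre-Depends, or None if absent."""
    for line in control_text.splitlines():
        field, separator, value = line.partition(":")
        if separator and field.strip().lower() == "pre-depends":
            for entry in value.split(","):
                match = DPKG_ENTRY.fullmatch(entry)
                if match:
                    return match.group(1)
    return None


# Placeholder control data; the package name and version are made up.
example_control = (
    "Package: example-package\n"
    "Pre-Depends: libc6, dpkg (>= 1.14.0)\n"
    "Depends: example-data\n"
)
print(dpkg_predepends_version(example_control))
```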
Test/Demo Plan
I locally converted all the packages from the Ubuntu 7.10 Alternate CD to lzma compression and tried installing several of them, which 'just worked'.
Future Work
If we also start using lzma for Packages files, we could save about 70K for the main Packages file and about 430K for the universe Packages file, relative to bzip2. That might not sound like much, but it adds up because these files are downloaded frequently.
Packages files:
| component | none | gzip | bzip2 | lzma |
| main | 6677465 | 1417147 | 1090948 | 1022666 |
| restricted | 25680 | 6937 | 6731 | 6419 |
| universe | 20146818 | 5538514 | 4236954 | 3803292 |
| multiverse | 692816 | 199044 | 153630 | 156726 |
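The per-component gain from lzma follows directly from the table. The sketch below computes it against either gzip or bzip2; `PACKAGES_SIZES` and `lzma_gain_over` are illustrative names. Note that with the multiverse figures as given, lzma comes out slightly larger than bzip2.

```python
# Packages file sizes in bytes, copied from the table above.
PACKAGES_SIZES = {
    "main": {"none": 6677465, "gzip": 1417147, "bzip2": 1090948, "lzma": 1022666},
    "restricted": {"none": 25680, "gzip": 6937, "bzip2": 6731, "lzma": 6419},
    "universe": {"none": 20146818, "gzip": 5538514, "bzip2": 4236954, "lzma": 3803292},
    "multiverse": {"none": 692816, "gzip": 199044, "bzip2": 153630, "lzma": 156726},
}


def lzma_gain_over(baseline, sizes=PACKAGES_SIZES):
    """Return {component: bytes saved by lzma compared with `baseline`}."""
    return {component: row[baseline] - row["lzma"] for component, row in sizes.items()}


for baseline in ("gzip", "bzip2"):
    # A negative value means lzma is larger than the baseline for that component.
    print(baseline, lzma_gain_over(baseline))
```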
Comments
* "If we also start using lzma for Packages files": these are probably updated much more often than they are downloaded. Perhaps gzip+zsync would be more appropriate than lzma? -- JohnMccabeDansted
** If you use -f, then compression time is more comparable with gzip, at the expense of size (but still better than gzip). -- instigatorirc@gmail.com