dpkg-lzma

  • Launchpad Entry: dpkg-lzma

  • Packages affected: apt, dpkg, some other large packages

Summary

The Ubuntu alternate CD is perpetually short on space. LZMA compresses considerably tighter than gzip or bzip2, so we intend to allow packages to use LZMA compression as an alternative to both. Depending on which and how many packages we convert, switching to lzma could save up to 175MB on the alternate CD beyond what gzip and bzip2 already save.

Release Note

Packages will now be allowed to use lzma compression in addition to the already supported gzip and bzip2 methods.

Rationale

This change is being proposed primarily to save space on the CDs. It would also save bandwidth for downloads from Ubuntu servers.

Converting just the top ten packages would save 51,552,804 bytes:

package                                      original       none       gzip      bzip2       lzma  lzma saves
openoffice.org-core                          37215228  112219498   38941278   37319546   27002374    10212854
libgl1-mesa-dri                              12957664   36650404   12957398   13335790    2790486    10167178
linux-restricted-modules-2.6.22-14-generic   16532074   42184742   16533274   15971038    7212440     9319634
linux-image-2.6.22-14-generic                18541338   58920482   20914612   18646334   15008924     3532414
openoffice.org-help-en-us                    10914796   26587030   12233476   11064148    7619690     3295106
gnome-games-data                              9901616   16857246    9909868    8605290    6652348     3249268
smbclient                                     4885878   12474020    4885590    4550400    1754328     3131550
libgcj8-1                                    10445312   35646900   10449232   10370336    7315722     3129590
linux-headers-2.6.22-14                       7775604   38287058    7912042    6404286    5016960     2758644
openoffice.org-common                        12254184   30789672   15032414   12226340    9497618     2756566
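The "lzma saves" column appears to be the size currently in the archive ("original") minus the lzma size. A minimal sketch that recomputes the column and the total from the figures in the table (the names `SIZES` and `lzma_savings` are only illustrative):

```python
# Sizes in bytes, copied from the table: package -> (original, lzma).
SIZES = {
    "openoffice.org-core": (37215228, 27002374),
    "libgl1-mesa-dri": (12957664, 2790486),
    "linux-restricted-modules-2.6.22-14-generic": (16532074, 7212440),
    "linux-image-2.6.22-14-generic": (18541338, 15008924),
    "openoffice.org-help-en-us": (10914796, 7619690),
    "gnome-games-data": (9901616, 6652348),
    "smbclient": (4885878, 1754328),
    "libgcj8-1": (10445312, 7315722),
    "linux-headers-2.6.22-14": (7775604, 5016960),
    "openoffice.org-common": (12254184, 9497618),
}


def lzma_savings(sizes):
    """Return bytes saved per package: archive size minus lzma size."""
    return {name: original - lzma for name, (original, lzma) in sizes.items()}


savings = lzma_savings(SIZES)
total = sum(savings.values())
print(f"total saved: {total:,} bytes")
```

The total matches the 51,552,804 bytes quoted above the table.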

Note: Some packages may not be able to be compressed using lzma due to installation bootstrapping issues.

The trade-off is that lzma is much slower to compress and uses much more memory during both compression and decompression. Decompression is more than twice as fast as bzip2, though still slower than gzip.

Example: openoffice.org-core

type   size (bytes)  compress time  compress memory  decompress time  decompress memory
gzip     38,941,278            26s               1M             1.0s                 1M
bzip2    37,319,546            24s               8M             7.5s                 4M
lzma     27,002,374           107s             370M             3.3s                34M
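A comparison like this can be roughly reproduced with Python's standard library. The sketch below is not how the figures above were measured; it assumes in-memory data, uses default compression levels, and does not measure memory use. `FORMAT_ALONE` selects the legacy .lzma container rather than Python's default .xz, and `compare_codecs` is an illustrative name.

```python
import bz2
import gzip
import lzma
import time


def compare_codecs(data: bytes) -> dict:
    """Compress and decompress data with each codec, recording size and timings."""
    codecs = {
        "gzip": (gzip.compress, gzip.decompress),
        "bzip2": (bz2.compress, bz2.decompress),
        # FORMAT_ALONE is the legacy .lzma container (assumption: closest
        # to what the lzma tool of that era produced).
        "lzma": (
            lambda raw: lzma.compress(raw, format=lzma.FORMAT_ALONE),
            lambda blob: lzma.decompress(blob, format=lzma.FORMAT_ALONE),
        ),
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        start = time.perf_counter()
        blob = compress(data)
        c_time = time.perf_counter() - start

        start = time.perf_counter()
        restored = decompress(blob)
        d_time = time.perf_counter() - start

        if restored != data:
            raise ValueError(f"{name} round trip did not restore the input")
        results[name] = {"size": len(blob), "c_time": c_time, "d_time": d_time}
    return results


# Highly repetitive sample input; real package payloads will behave differently.
sample = b"Package: example\nDescription: placeholder text\n" * 20000
results = compare_codecs(sample)
for name, row in results.items():
    print(f"{name:6} {row['size']:>10,} bytes  c={row['c_time']:.3f}s  d={row['d_time']:.3f}s")
```

Timings depend heavily on the machine and on the input, so only the relative ordering is worth comparing against the table.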

The following ODS file compares the sizes of all packages on the Ubuntu 7.10 Alternate CD when compressed with gzip, bzip2, and lzma against what is currently in the archive:

ubuntu-alternate-cd-lzma.ods

Converting the entire Ubuntu 7.10 alternate CD to each of the following formats would take the following amount of space (in bytes):

original     none           gzip         bzip2        lzma
675,113,008  1,816,544,474  708,735,160  643,077,920  501,460,266
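As a quick check of the "up to 175MB" estimate in the Summary, the difference between each total and the lzma total can be computed directly from these figures (a small sketch; `CD_SIZES` and `saved_vs_lzma` are illustrative names):

```python
# Whole-CD totals in bytes, copied from the table.
CD_SIZES = {
    "original": 675113008,
    "none": 1816544474,
    "gzip": 708735160,
    "bzip2": 643077920,
    "lzma": 501460266,
}

# Bytes saved by an all-lzma CD relative to each other format.
saved_vs_lzma = {
    name: size - CD_SIZES["lzma"]
    for name, size in CD_SIZES.items()
    if name != "lzma"
}

for name, saved in saved_vs_lzma.items():
    print(f"lzma vs {name:8}: {saved:>13,} bytes saved")
```

Relative to the current archive contents this comes to roughly 174 million bytes, consistent with the Summary.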

Use Cases

We would like to fit more packages onto the release CDs.

Assumptions

The primary assumption is that users will have the extra 30MB of RAM (over bzip2) needed to decompress packages.

Design

None (bulk of design work has already been done; this is chiefly deployment)

Implementation

UI Changes

None

Code Changes

The following packages need lzma support and already have it:

  • dpkg
    • installing - 1.13.25 (Tue, 2 Jan 2007 00:23:57 +0200)
    • creating - 1.14.0 (Tue, 08 May 2007 11:11:50 +0300)
  • apt - 0.7.5 (Wed, 25 Jul 2007 20:16:46 -0300)

apt does not fully support lzma yet, although this does not appear to affect package upgrades. In particular, apt-ftparchive contents did not work. I have included a patch that adds lzma support everywhere except for one function, 'pkgAcqIndex::Failed' in apt-pkg/acquire-item.cc, which someone more familiar with apt should look over.

apt_0.7.9-lzma.diff

dpkg needs a Depends on lzma.

Launchpad will need to be modified to support lzma uploads and to check that the needed Pre-Depends are listed in the package.
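One way an upload check could detect which compression a package uses is to read the member names of the .deb, which is an ar archive whose data member is named data.tar plus a compression suffix. The sketch below is not Launchpad code; it assumes the common ar layout with 60-byte member headers, and `list_ar_members` and `data_compression` are hypothetical helper names. It does not inspect the Pre-Depends field.

```python
AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60  # fixed size of each ar member header


def list_ar_members(blob: bytes) -> list:
    """Return (name, size) for each member of an ar archive held in memory."""
    if not blob.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members = []
    offset = len(AR_MAGIC)
    while offset + HEADER_SIZE <= len(blob):
        header = blob[offset:offset + HEADER_SIZE]
        if header[58:60] != b"`\n":
            raise ValueError("corrupt ar member header")
        # Name is bytes 0-15 (space padded); some ar variants append "/".
        name = header[0:16].decode("ascii").rstrip().rstrip("/")
        # Size is a decimal string in bytes 48-57.
        size = int(header[48:58].decode("ascii").strip())
        members.append((name, size))
        # Member data is padded to an even length.
        offset += HEADER_SIZE + size + (size % 2)
    return members


def data_compression(blob: bytes) -> str:
    """Return the compression suffix of the data.tar member, or "none"."""
    for name, _size in list_ar_members(blob):
        if name.startswith("data.tar"):
            suffix = name[len("data.tar"):].lstrip(".")
            return suffix or "none"
    raise ValueError("no data.tar member found")
```

Given the bytes of a package file, `data_compression(blob)` would return a string such as `"gz"`, `"bz2"` or `"lzma"`, which a checker could then compare against the package's declared Pre-Depends.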

To build packages with lzma compression when using debhelper, use: dh_builddeb -- -Zlzma

Migration

To support migration to lzma compression, dpkg should depend on lzma, and packages that use lzma compression should pre-depend on that version of dpkg or newer.

Test/Demo Plan

I tested this locally by converting all the packages from the Ubuntu 7.10 Alternate CD to lzma compression and installing several of them, which 'just worked'.

Future Work

If we also start using lzma for Packages files, we could save about 70K for the main Packages file and about 430K for the universe Packages file relative to bzip2. That might not sound like much, but it adds up because these files are downloaded frequently.

Packages files:

component       none     gzip    bzip2     lzma
main         6677465  1417147  1090948  1022666
restricted     25680     6937     6731     6419
universe    20146818  5538514  4236954  3803292
multiverse    692816   199044   153630   156726
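The per-component difference between lzma and another format can be worked out from the Packages file sizes listed here. A small sketch (`PACKAGES_SIZES` and `savings_over` are illustrative names):

```python
# Packages file sizes in bytes, copied from the figures above.
PACKAGES_SIZES = {
    "main": {"none": 6677465, "gzip": 1417147, "bzip2": 1090948, "lzma": 1022666},
    "restricted": {"none": 25680, "gzip": 6937, "bzip2": 6731, "lzma": 6419},
    "universe": {"none": 20146818, "gzip": 5538514, "bzip2": 4236954, "lzma": 3803292},
    "multiverse": {"none": 692816, "gzip": 199044, "bzip2": 153630, "lzma": 156726},
}


def savings_over(baseline, sizes=PACKAGES_SIZES):
    """Bytes saved per component by lzma relative to the given baseline format."""
    return {component: row[baseline] - row["lzma"] for component, row in sizes.items()}


vs_bzip2 = savings_over("bzip2")
vs_gzip = savings_over("gzip")
for component in PACKAGES_SIZES:
    print(f"{component:10} vs bzip2: {vs_bzip2[component]:>8,}  vs gzip: {vs_gzip[component]:>10,}")
```

By these figures lzma is slightly larger than bzip2 for multiverse, so the saving there is negative.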

Comments

* "If we also start using lzma for Packages files", These are probably updated much more often than they are downloaded. Perhaps gzip+zsync would be more appropriate than lzma? -- JohnMccabeDansted

**if you use -f then compression time is more comparable with gzip at the expense of size (but still better than gzip) --instigatorirc@gmail.com


CategorySpec

dpkg-lzma (last edited 2009-03-21 12:57:20 by 75-165-16-192)