Dpkg7Zip

Differences between revisions 2 and 3
Revision 2 as of 2005-11-05 18:31:08
Size: 2398
Editor: 209
Comment: pending review
Revision 3 as of 2005-11-05 20:34:56
Size: 3027
Editor: 209
Comment: Review

Summary

Evaluate 7zip compression for use in debs, as an alternative to gzip or bzip2.

Rationale

7zip is a newer compression format, based on the LZMA algorithm, that promises smaller files than the existing gzip and bzip2 schemes. If it can be used to reduce the size of packages, it frees up space on the CD for more packages.

Use cases

  • Colin is an Ubuntu Developer who constructs the CD images. He notices that the amd64 CD is too large; instead of removing another language pack, he'd like to increase the compression of existing packages to fit them.

XXX:smurf:What about people behind a slow network connection who want to get their updates faster?

Design

First, the compression needs to be evaluated. A selection of different packages should be collated and recompressed with 7zip; if a significant size benefit is gained without incurring a significant time or cost penalty for creation or unpacking, then we could consider compressing packages of that type with 7zip instead.

{{{XXX:smurf: 100% unscientific test, compressing /var/cache/apt/a*/[a-g]* on my laptop:
 size  directory  decompress in
58176  repo.7z    19 sec
68032  repo.bz    25 sec
75872  repo.gz     4 sec
}}}

Particular types:

  • Small binary packages such as dpkg, coreutils, etc.
  • Large binary packages such as firefox and openoffice.
  • Documentation packages.
  • Language packs.
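
The evaluation described above could be scripted along these lines. This is a minimal sketch, not part of dpkg: it uses Python's standard gzip, bz2 and lzma modules, with lzma standing in for 7zip because LZMA is the algorithm 7zip uses by default (no .7z container is produced here). The function name `compare_codecs` and the sample payload are hypothetical.

```python
import bz2
import gzip
import lzma
import time

# Codec name -> (compress, decompress). lzma is a stand-in for 7zip:
# it uses the LZMA algorithm but does not write a .7z container.
CODECS = {
    "gzip": (gzip.compress, gzip.decompress),
    "bzip2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def compare_codecs(payload: bytes) -> dict:
    """Return {codec: (compressed_size, decompress_seconds)} for payload."""
    results = {}
    for name, (compress, decompress) in CODECS.items():
        packed = compress(payload)
        start = time.perf_counter()
        unpacked = decompress(packed)
        elapsed = time.perf_counter() - start
        # A codec that does not round-trip is useless for packaging.
        if unpacked != payload:
            raise ValueError(f"{name} failed to round-trip")
        results[name] = (len(packed), elapsed)
    return results


if __name__ == "__main__":
    # Hypothetical sample; a real evaluation would read the data members
    # of packages from each of the types listed above.
    sample = b"Package: example\nDescription: sample payload\n" * 20000
    for name, (size, seconds) in compare_codecs(sample).items():
        print(f"{name:6} {size:10d} bytes  {seconds:.4f} s")
```

Running it once per package type would give a size and unpack-time figure for each type, which is the comparison the design calls for. Compression time is not measured here and would need the same treatment.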

Implementation

Code

The inclusion of bzip2 support into dpkg introduced a generic compression layer, in lib/compression.c. 7zip support can be added in a similar way:

  • Add the location of the 7zip support as a ZIP7 macro in lib/dpkg.h.

  • Add ZIP7 to the compression_type enum in lib/dpkg.h.

  • Define data member name macros in dpkg-deb/dpkg-deb.h.

  • Handle the new data member in dpkg-deb/build.c and dpkg-deb/extract.c.

  • Add both "by exec" and "by library" support to lib/compression.c.

  • Add format selection options in dpkg-deb/main.c.
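
The steps above amount to one extra entry in a table that ties a compression type to a data member name and a decompressor. The following Python model illustrates that shape only; it is not dpkg code, which is written in C. `data.tar.gz` and `data.tar.bz2` are the existing member names, while `data.tar.7z`, the Python `CompressionType` class and the use of the lzma module as the decompressor are assumptions made for illustration.

```python
import bz2
import enum
import gzip
import lzma


class CompressionType(enum.Enum):
    """Models the idea of a compression_type enum; values are illustrative."""
    GZIP = "gzip"
    BZIP2 = "bzip2"
    ZIP7 = "7zip"  # hypothetical new entry


# Data member name in the .deb archive for each compression type.
# "data.tar.7z" is an assumed name; the spec does not fix one.
DATA_MEMBER = {
    CompressionType.GZIP: "data.tar.gz",
    CompressionType.BZIP2: "data.tar.bz2",
    CompressionType.ZIP7: "data.tar.7z",
}

# "By library" decompressors. lzma stands in for a real 7zip library.
DECOMPRESS = {
    CompressionType.GZIP: gzip.decompress,
    CompressionType.BZIP2: bz2.decompress,
    CompressionType.ZIP7: lzma.decompress,
}


def type_for_member(member_name: str) -> CompressionType:
    """Pick the compression type from a data member name."""
    for ctype, name in DATA_MEMBER.items():
        if name == member_name:
            return ctype
    raise ValueError(f"unknown data member: {member_name}")


def extract_member(member_name: str, blob: bytes) -> bytes:
    """Decompress a data member according to its name."""
    return DECOMPRESS[type_for_member(member_name)](blob)
```

In this model an unknown member name raises an error rather than being guessed at, which is the behaviour an older dpkg would need to show when it meets a package it cannot unpack.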

Data preservation and migration

Packages that would benefit from the conversion would select it in their debian/rules at build time, as we did for the bzip2 change; they would also Pre-Depend on the appropriate version of dpkg.

Outstanding issues

Is there actually any benefit to this? Testing needs to be done to find out whether it's worthwhile.

{{{XXX:smurf:... especially given that we intend to ship UbuntuExpress, which *has* no .debs. Maybe we can compress the livecd that way? (Yes I know that this is out of scope for this spec. So create another one. ;-) }}}

BoF agenda and discussion

XXX:smurf: empty sections should be either populated or deleted 

Dpkg7Zip (last edited 2008-08-06 17:00:44 by localhost)