ReducingDuplication

Differences between revisions 14 and 16 (spanning 2 versions)
Revision 14 as of 2005-11-09 00:11:18
Size: 7881
Editor: 209
Comment: proofreading
Revision 16 as of 2006-01-01 05:20:33
Size: 2964
Editor: S0106000000cc07fc
Comment: cat spec
 * '''Contributors''': AdamConrad, JamesTroup, ColinWatson

We need to reduce library (and application) duplication in main as much as possible to make dapper's "supported for 5 years on the server" goal easier on all of us.

 * Identify duplicate libs, make sure everything in main links to the "best" version, and kick the rest out. The "best" version is almost always the newest library or app version, except for special cases where we may be concerned about stability or reduced feature sets. Those should be documented and discussed below (an example here is libmysqlclient).

== Static linking ==

=== zlib ===

 * dpkg (justified)
 * aide (justified)
 * ia32-libs (warty, hoary)
 * amd64-libs (warty, hoary)
 * rpm (in lsb-rpm, might be justified)

=== libpcre3 ===

 * python2.{1,2,3} (hacked upstream, fixed in 2.4)
 * gnumeric (warty, hoary, breezy)
 * apache2 (warty)

=== modified copies of xpdf in source ===

 * cupsys (warty, hoary)
 * tetex-bin (probably not really justified)

=== libnspr4/libnss3 ===

 * We need to pull these from upstream, in versions new enough to make firefox 1.5 and thunderbird 1.5 happy, so mozilla can migrate to universe.

=== mozilla-dev ===

 * enigmail, librsvg2, openoffice.org2 all build-dep on mozilla-dev currently
 * librsvg2 works fine with firefox-dev; enigmail could require a package split (e-thunderbird and e-mozilla); for OO.o we should check whether it builds and works with firefox-dev, and if not, split out the three shared libraries in the mozilla-browser package into a mozilla-libs package, which would allow us to demote at least mozilla-browser to universe

== Seed Justification ==

The seeds should be audited, as they've built up a lot of cruft. We should add a rationale for everything in the seeds. This isn't directly 'code duplication' but it's fairly obviously related.

(Even more orthogonally, 'extra' should be cleared up as we're missing a bunch of -doc and -dev packages for libraries in main -- see the section on germinate improvements)

As a rule there should be no libraries listed in the seeds; they should all be pulled in as Depends or Build-Depends of seeded packages.
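
The rule above lends itself to a mechanical first pass over a seed. The following is a minimal sketch, assuming a seed is available as a plain list of package names and that a mapping of direct dependencies can be built from the archive; the function name, the "starts with lib" heuristic and the sample data are all illustrative assumptions, not part of germinate.

```python
def audit_seeded_libraries(seed, depends):
    """Split seeded library packages into two lists.

    pulled_in:    libraries that another seeded package depends on directly,
                  so the explicit seed entry is redundant.
    unreferenced: libraries that nothing else in the seed depends on, so they
                  need a written rationale or should be unseeded.

    Assumption: a package is treated as a library if its name starts with
    "lib". Only direct dependencies are checked.
    """
    pulled_in, unreferenced = [], []
    for package in seed:
        if not package.startswith("lib"):
            continue
        needed_by = [p for p in seed if p != package and package in depends.get(p, [])]
        (pulled_in if needed_by else unreferenced).append(package)
    return pulled_in, unreferenced


if __name__ == "__main__":
    # Made-up seed and dependency data, for illustration only.
    seed = ["gimp", "libpng12-0", "libnetpbm9"]
    depends = {"gimp": ["libpng12-0"]}
    redundant, needs_rationale = audit_seeded_libraries(seed, depends)
    print("redundant seed entries:", redundant)
    print("need a rationale:", needs_rationale)
```

A real audit would also follow Build-Depends and transitive dependencies, which this sketch deliberately leaves out.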

== Multiple versions ==

=== libmysql* ===

 * libmysqlclient10 and libmysqlclient12 are going away completely.
 * If MySQL 5.0 proves stable, MySQL 4.1 and libmysqlclient14 will also go away
   in favour of 5.0 and client15.

=== libdb* ===

 * We will standardize on libdb4.3 (groundwork for this has already been laid in Debian), and attempt to get rid of libdb1-compat, libdb2, libdb3 and libdb4.[012]

=== gnutls/gcrypt ===

 * gnutls10 should go away (if it isn't already)
 * gnutls11 is currently necessary for openldap, needs an interface rewrite (old openssl compatibility code)
 * everything else should eventually use gnutls12 (which isn't in dapper yet ...)
 * gcrypt{!11} should go away if it hasn't already

=== Python2.* ===

 * Python2.3 is still required for Zope2, which, it was decided last night, can be demoted to universe. (this has now been done)
 * doko says that the 2.3->2.4 transition in Debian will be happening any day now, along with automation of default python version selection, which will make it so we don't have to touch each package to drop 2.3 support. If this doesn't materialise by Christmas, we will have to fall back to touching each package by hand and dropping 2.3 support.

=== libgtk.* ===

 * 1.2 still used by xmms, kicker-applets, and evms-gui

=== libmpeg1 ===

 * gimp build-depends on it, but doesn't appear to USE it, so test dropping the build-dep

=== libmpeg3 ===

 * The only thing using this is directfb, and the only thing using that is libggi.
 * Nothing is using libggi; it's only pulled in because it's directly seeded, and it shouldn't be.
 * ColinWatson: cdebconf in dapper will want directfb for the prototype graphical installer work. We should decide whether we want this; I think we probably will want it eventually, but if not cdebconf will need to be modified to avoid building cdebconf-gtk-udeb and remove the build-dependency.

=== libgd* ===

 * python-gdchart should be transitioned to libgd2

=== gnome1 ===

 * glife (Ogra says, "DROP IT, DO IT, POP THE TRUNK")
 * gnome-pilot (Pitti says, "Doesn't evolution do that these days?")
 * unixodbc (Adam says he'll fix)

=== postgresql-7.4 ===

 * Obsolete versions and upgrades are now handled entirely by postgresql-common, so this version can be dropped. pitti just got the okay from mdz to drop the transition packages as well. (now done)

=== libnetpbm9 ===

 * Seeded without any main package that rdepends on it. Should just be unseeded.

=== libsnmp4.2 ===

 * cyrus21-imapd build-depends on it; Debian uses 5.0 now, so we should just merge

=== libsigc++{1.2,2.0} ===

 * aptitude needs to transition to 2.0; it already has in experimental.

=== libsqlite{0,3} ===

 * The world depends on 3, these depend on 0:
  * python2.4-sqlite (maybe this can be demoted, since there are other python sqlite bindings in main)
  * php5-sqlite (Adam will transition this one)
  * qt-x11-free (not sure why, need to investigate)

=== libreadline{4,5} ===

 * About 100 packages in main depend on libreadline4; this may go down significantly after we process all of MOM's pending merges.
 * Due to compatible APIs, in 99% of cases, it should be enough to change build-deps and recompile the packages Debian hasn't gotten to yet.
 * doko says this is easy, and we can bug him to do it.

=== Semi-automated package duplication finding for ongoing work ===

grep "^Package: " /var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_dapper_main_binary-i386_Packages | awk '{print $2}' | sort | sed -e "s/c2//" -e "s/c102//" -e "s/[-0-9.]//g" | sort | uniq -d
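
The same search can be sketched in Python, which makes the normalisation step easier to adjust and keeps the original package names next to each hit. This is a minimal sketch rather than a replacement for the command: the function names and the sample data are illustrative assumptions. Like the sed expressions, it drops the first "c2" and the first "c102" in a name and then every digit, dash and dot, so it will report false positives that need reviewing by hand.

```python
import re
from collections import defaultdict


def normalise(name):
    """Mimic the sed expressions: drop the first "c2" and the first "c102",
    then every digit, dash and dot."""
    name = name.replace("c2", "", 1)
    name = name.replace("c102", "", 1)
    return re.sub(r"[-0-9.]", "", name)


def package_names(packages_text):
    """Yield the value of every "Package: " field in a Packages file."""
    for line in packages_text.splitlines():
        if line.startswith("Package: "):
            yield line.split()[1]


def find_duplicates(names):
    """Group names by normalised form and keep the groups that contain
    more than one distinct package name."""
    groups = defaultdict(set)
    for name in names:
        groups[normalise(name)].add(name)
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


# Made-up excerpt in Packages format, for illustration only.
SAMPLE = """\
Package: libreadline4
Version: 4.3
Package: libreadline5
Version: 5.0
Package: libdb4.2
Package: libdb4.3
Package: zlib1g
"""

if __name__ == "__main__":
    for key, found in sorted(find_duplicates(package_names(SAMPLE)).items()):
        print(key, "->", ", ".join(found))
```

To run it against a real archive index, read the Packages file named in the command above and pass its contents to `package_names` instead of `SAMPLE`.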

== Germinate Improvements ==

Colin Watson has kindly volunteered to improve germinate thusly:

 * If a package is in extra and matches:
{{{
^*-dev$,
^*-doc$,
^*-dbg$
}}}
 They get promoted if the source package is going to main.

Adam Conrad would like germinate to support promotion of whole source packages, perhaps by prefixing the package name with '%', à la quinn-diff.

----
CategorySpec

Summary

We need to reduce library (and application) duplication in main as much as possible to make dapper's "supported for 5 years on the server" goal easier on all of us.

Rationale

Supporting multiple versions of similar codebases can be incredibly difficult and time-consuming. We should, rather, be concentrating on keeping ONE of everything (one libdb, one libmysqlclient, one libssl, one libpng, one set of mozilla libs, etc) in main, and punting the rest to universe (or oblivion).

Use cases

  • A security problem in libpcre requires an urgent update. Pitti is rather upset that he has to update several packages which have a static copy of libpcre in hoary and breezy, but is then pleasantly surprised that, due to this reduction effort, he only has to update one package (the actual library) in dapper.
  • A support company wants to commit to providing technical support for Dapper, but only wants to support packages that have security support. Jeff, the support person, no longer cries himself to sleep at night because Dapper's set of supported packages is much smaller than Breezy's was.

Implementation

  • Identify duplicate libs, make sure everything in main links to the "best" version, and kick the rest out. The "best" version is almost always the newest library or app version, except for special cases where we may be concerned about stability or reduced feature sets. Those should be documented and discussed in DapperDuplicatedPackages (an example here is libmysqlclient). [ONGOING]

  • Audit packages for local copies of libraries (static libz, libdb, libpng, imlib, libpcre, and libneon have all been common in the past), and get us linking dynamically to packaged libraries everywhere possible.
  • Audit the seeds as they've built up a lot of cruft. We will add a rationale for every library package in the seeds since as a rule there should be no libraries listed in the seeds (they should all be pulled in as Depends or Build-Depends of seeded packages).
  • Germinate will be improved so that if a package is in extra and matches one of:

    ^*-dev$,
    ^*-doc$,
    ^*-dbg$

    it will be automatically seeded.
  • Germinate will be improved to support the seeding of whole source packages (rather than having to explicitly list all the binary children). This could be done by prefixing the package name with '%', à la quinn-diff. [DONE]
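
The two germinate changes in the list above can be illustrated roughly as follows. This is a hedged sketch of the described behaviour, not germinate's actual code: the function names, the `binaries_by_source` mapping and the sample package lists are assumptions made for illustration. The patterns in the spec are treated as simple suffix matches (names ending in -dev, -doc or -dbg).

```python
# Suffixes corresponding to the three patterns listed in the spec.
AUTO_SEED_SUFFIXES = ("-dev", "-doc", "-dbg")


def auto_seeded(extra_packages):
    """Return the packages in 'extra' whose names end in -dev, -doc or -dbg."""
    return sorted(p for p in extra_packages if p.endswith(AUTO_SEED_SUFFIXES))


def expand_seed(entries, binaries_by_source):
    """Expand '%source' seed entries into the binaries built from that source.

    Plain entries are kept as they are. binaries_by_source is a hypothetical
    mapping from a source package name to its binary package names; an
    unknown source name raises KeyError.
    """
    expanded = []
    for entry in entries:
        if entry.startswith("%"):
            expanded.extend(binaries_by_source[entry[1:]])
        else:
            expanded.append(entry)
    return expanded


if __name__ == "__main__":
    # Made-up package lists, for illustration only.
    extra = ["libpng12-dev", "libfoo-utils", "zlib1g-dbg", "python2.4-doc"]
    print(auto_seeded(extra))

    sources = {"zlib": ["zlib1g", "zlib1g-dev", "zlib1g-dbg"]}
    print(expand_seed(["bash", "%zlib"], sources))
```

Listing `%zlib` once in a seed then stands in for every binary built from that source, which is the convenience the '%' prefix is meant to provide.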

Outstanding issues

Packages which ship their own version of a library, as opposed to statically linking with the Ubuntu version, might have local patches which need to be audited while we're doing this.


CategorySpec

ReducingDuplication (last edited 2008-08-06 16:41:21 by localhost)