Created: 2006-06-09 by MatthiasKlose
Packages affected: glibc, binutils, gcc.
This is a high-level overview of the work to be done for the toolchain in the release after Edgy.
- Have a tested toolchain ready for the release after Edgy.
- Plan ABI changes/change of default options.
- Identify new upstream versions.
Ensure that anything missed in https://launchpad.net/distros/ubuntu/+spec/gcc-ssp gets completed.
Implement changes deferred from EdgyToolchainRoadmap.
The toolchain needs to be ready as of the opening of the release. Because of this, toolchain changes are sketched out at the beginning of the release, finalised near feature freeze, and uploaded to the archive before merges, syncs and uploads start.
The next release of Ubuntu.
- Update to current upstream versions. As of this writing, we expect these to be:
- tune with -mtune=generic by default
- gcj-4.2.x, follow the development of the ecj branch for 1.5 language and runtime features
- Enable long-double-128 support in glibc and gcc on powerpc and sparc.
Perform a test rebuild of the edgy archive against these new toolchain releases. We need something like https://launchpad.net/distros/ubuntu/+spec/frequent-rebuild-testing for the rebuilds.
Features in the newer versions
* For Linux, the sorting of addresses returned by getaddrinfo now also handles rules 3 and 7 from RFC 3484. Implemented by Ulrich Drepper.
* Allow system admin to configure getaddrinfo with the /etc/gai.conf file. Implemented by Ulrich Drepper.
* New Linux interfaces: splice, tee, sync_file_range, vmsplice.
* New iconv module for MIK. Contributed by Alexander Shopov.
* For sites with broken group and/or passwd database, the auto-propagate option of nscd can prevent creating ID lookup entries from the results of a name lookup and vice versa. This usually is no problem but some site might have problems with default behavior. Implemented by Ulrich Drepper.
* Iterating over entire database in NIS can be slow. With the SETENT_BATCH_READ option in /etc/default/nss a system admin can decide to trade time for memory. The entire database will be read at once. Implemented by Ulrich Drepper.
* The interfaces introduced in RFC 3542 have been implemented by Ulrich Drepper.
* Support for the new ELF hash table format was added by Ulrich Drepper.
* Support for priority inheritance mutexes added by Jakub Jelinek and Ulrich Drepper.
* Support for priority protected mutexes added by Jakub Jelinek.
Look into --hash-style=both to get binutils to build binaries that load much faster.
See http://lwn.net/Articles/192624/; this is the upstream-accepted combination of -zdynsort and -hashvals.
If this is not in current binutils, a patch is at http://sourceware.org/ml/libc-alpha/2006-06/msg00095.html
- If we're lucky, released binutils will contain the patch. It went into CVS Mon Jul 10 21:40:24 2006 UTC by jakub.
- In support of reducing the number of packages in main and for reducing build time and complexity for security updates, we want to split additional languages into separate packages. At this point, that causes support to be dropped from the main driver, and causes conflicts. We need to investigate if there is a way around this. If not, this issue can be dropped.
- -mtune=generic is a new feature in gcc which produces better optimized code on Intel chips. We need to confirm that there are no regressions on 32-bit AMD chips before enabling this.
- We are often asked to use --as-needed by default. We need to provide documentation on why this isn't enabled by default so that the reasons are clear. It would also be useful to set up a "Janitors Team" that would go through and check for text relocations or an excessive number of visible symbols in common libraries and would work with upstreams to make tighter DSOs. This group could also explore using --as-needed and linker optimization.
Explore the use of --as-needed not only as "does --as-needed break code?" but also as "should this code work with --as-needed?" Often, optimizations and features like -Bdirect, GccSsp, or FORTIFY_SOURCE can expose REAL bugs by "breaking" code.
Look into utilizing -Bdirect linking.
Applications will load much faster (75% reduction in looking up a symbol binding) http://sourceware.org/ml/binutils/2005-10/msg00436.html
Has a wider base effect than prelink (affects dlopen() loaded libraries)
- Only has to be used at link time; no end-user maintenance
- Creates some excess maintenance
glibc breaks without special attention
binutils and glibc have to be patched with support
- May expose bugs in or cause breakage in other libraries
- ABI hasn't been solidified
- May create a useless ELF section that will just take up space in the future.
It's sad, but unless this optimization gets major support behind it, Drepper is going to kill it (he refuses to accept it even if it works right); yet at the same time, Meeks is still working on making it 'just work'. --JohnMoser
Look into utilizing FORTIFY_SOURCE
Building C files with -D_FORTIFY_SOURCE=1 replaces some standard calls with fortified ones, and does compile-time checking.
Building C files with -D_FORTIFY_SOURCE=2 also does some printf() et al checking, which may break legitimate code.
FORTIFY_SOURCE is only useful for a handful of standard functions being used directly on data where the object size can be known. This includes fresh malloc() memory and arrays in code that expects a fixed size. Anything that goes through, for example, glib or hand-written strcpy() wrappers will not benefit; alternate allocators probably won't work either.
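The "object size can be known" condition can be illustrated with the compiler builtin that fortified glibc functions rely on, __builtin_object_size. This is only a sketch under stated assumptions: it targets GCC or a compatible compiler, and both helper names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: the buffer is a local array, so the compiler
   can see how large the destination object is. */
size_t size_of_local_array(void)
{
    char buf[16];
    buf[0] = '\0';
    return __builtin_object_size(buf, 0);
}

/* Hypothetical helper standing in for a hand-written wrapper: it only
   receives a pointer, so the size of the object behind it is not visible
   here. noinline keeps the compiler from looking through the call.
   When the size is unknown the builtin reports (size_t)-1. */
__attribute__((noinline))
size_t size_seen_by_wrapper(char *dst)
{
    return __builtin_object_size(dst, 0);
}
```

When the builtin cannot determine a size, the fortified functions have nothing to check against, which is why code that routes copies through its own wrappers sees no benefit.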
- On the topic of a "Janitor Team"
Gentoo's security team at http://hardened.gentoo.org/ is interested in eliminating TEXTRELs; I've packaged their pax-utils tools, including scanelf which is very good for finding TEXTRELs. I'm sure they would love some assistance. --JohnMoser
It would be nice to have a similar team but specifically for security purposes; spec at HardenedUbuntu/Audit. This can be slightly more difficult and time-consuming. They would have a few tasks.
Check the application of GccSsp.
Eliminating TEXTRELs. This has both performance and security value; TEXTRELs can be used to get TEXTREL detectors to allow return-to-libc chains that mprotect() code writable, copy shellcode to it, then mprotect() it back to executable.
Executable stacks. Fixing these can be trivial, but occasionally requires rewriting code to remove trampolines/nested functions. I've written a couple of scripts of my own that find processes with executable stacks and tell you which DSOs cause them. --JohnMoser
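One way to spot a process with an executable stack on Linux is to look at the permissions of the [stack] mapping in /proc/<pid>/maps. The sketch below is not the scripts mentioned above: the function names are hypothetical, it only checks the main stack mapping, and it does not identify which DSO is responsible.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if one line of /proc/<pid>/maps describes an executable
   [stack] mapping, 0 otherwise. */
int is_exec_stack_line(const char *line)
{
    char perms[5] = "";

    /* Skip the address range, then read the permission field (e.g. "rw-p"). */
    if (sscanf(line, "%*s %4s", perms) != 1)
        return 0;
    return strstr(line, "[stack]") != NULL && strchr(perms, 'x') != NULL;
}

/* Returns 1 if the maps file shows an executable stack, 0 if it does not,
   and -1 if the file cannot be opened. */
int has_exec_stack(const char *maps_path)
{
    char line[4096];
    int found = 0;
    FILE *f = fopen(maps_path, "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (is_exec_stack_line(line)) {
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Calling has_exec_stack("/proc/self/maps") checks the current process; a janitor script would loop over the numeric directories in /proc instead.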