Performance

This page collects proceedings from sessions in the 'Performance' track at the Natty UDS in Orlando, Florida.

Please add proceedings by doing the following:

  • Add a new section with the name of the session.
  • Add the outcomes of the session as bullet points. The proceedings should focus on decided outcomes, so please keep them crisp.

Thanks!

Proceedings

Desktopcouch speed and performance

  • Slim down packages (ad-hoc couch-specific erlang build conflicting, libjs without xulrunner).
  • Lightweight system-level proxy that handles deferred starting of couch, and multiplexing to individual per-user couches.
  • Auto-compaction (opt-out).
  • Investigate whether Erlang can limit the CPU and memory use of Erlang processes.
  • Investigate TraceMonkey (tracing) and an updated JSON serializer.

NEON/Vectorizer Direction

  • multiple sizes first
  • special loads and stores
  • then investigate what to knock off next
  • look into benchmarks and what could be vectorised, e.g. widening multiply (short[n] * short[n] -> int[n]), currently done by a lo/hi multiply
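
The widening multiply mentioned above can be written as a plain scalar loop, which is the kind of input a vectorizer would need to recognise. This is only a sketch: the function name and signature are made up for illustration, and whether a given compiler turns it into a single widening instruction (on NEON, VMULL.S16 multiplies 16-bit lanes into 32-bit results) depends on the compiler and flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: short[n] * short[n] -> int[n].
 * Each 16-bit operand is widened before the multiply, so the full
 * 32-bit product is kept rather than being truncated to 16 bits. */
void widening_mul(int32_t *restrict dst,
                  const int16_t *restrict a,
                  const int16_t *restrict b,
                  size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = (int32_t)a[i] * (int32_t)b[i];
}
```

The restrict qualifiers tell the compiler the arrays do not overlap, which removes one common obstacle to vectorising such a loop.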

Performance inside GCC

  • continue with the current investigate/implement/repeat cycle
  • start measuring performance with CI and check for regressions

Improve boot-time performance on ARM, from power-on to user-space

  • unless you use LVM, cryptsetup, or RAID (which we don't), the kernel has all the information it needs to resolve an (fstype, uuid) pair itself; fix the kernel to do this so we don't need an initramfs at all
  • reduce the default ^C timeout on u-boot to something less than 10 seconds
  • u-boot shouldn't initialize subsystems before it knows it needs them

Performance improvement areas outside GCC

  • get GOLD on ARM up and going as it's the best platform for other areas
  • add IFUNC support as the best way of handling A8/A9/NEON variants
  • ignore the prelinker for now
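
IFUNC lets the dynamic linker pick one implementation of a symbol at load time. The selection step can be sketched portably with an ordinary function pointer, as below. All names and the capability flag are hypothetical, and the "NEON" variant is only a scalar stand-in. With real IFUNC support the selection would live in a resolver function attached to the symbol through GCC's ifunc attribute, and the capabilities would come from the system rather than from an argument.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability flags, for illustration only. */
enum cpu_caps { CAP_NONE = 0, CAP_NEON = 1 };

typedef int32_t (*sum_fn)(const int16_t *v, size_t n);

/* Baseline variant: works on any core. */
static int32_t sum_generic(const int16_t *v, size_t n)
{
    int32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Stand-in for a NEON-tuned variant: four partial sums, scalar code.
 * A real variant would use NEON instructions here. */
static int32_t sum_neon_standin(const int16_t *v, size_t n)
{
    int32_t acc[4] = {0, 0, 0, 0};
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        acc[0] += v[i];
        acc[1] += v[i + 1];
        acc[2] += v[i + 2];
        acc[3] += v[i + 3];
    }
    int32_t total = acc[0] + acc[1] + acc[2] + acc[3];
    for (; i < n; i++)          /* leftover elements */
        total += v[i];
    return total;
}

/* Plays the role of an IFUNC resolver: choose a variant once,
 * based on what the CPU supports. */
sum_fn select_sum(unsigned caps)
{
    return (caps & CAP_NEON) ? sum_neon_standin : sum_generic;
}
```

Callers would resolve once at start-up and then call through the returned pointer; IFUNC moves that step into the dynamic linker so callers just use the symbol.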

Kernel Storage Performance

  • SD and eMMC: Add support for background operations (such as internal-maintenance operations including erase) to drivers and to filesystems.
  • SD and eMMC: Add support for reliable write (NAND flash is not reliable in the rotating-mass-storage sense).
  • SD and eMMC: Investigate adding support for high-priority interrupt (requires Linux VFS enhancements). High-priority interrupt allows a large low-priority read/write to be interrupted to make way for a high-priority operation.
  • Add "trim" support (similar to the SATA standard) to SD: add it to the SD standard, then implement it in drivers and in appropriate filesystems (e.g., ext4, btrfs, and other rootable filesystems).
  • SD: Investigate eMMC (version 4.4 standard) "trim" support in Linux kernel and ARM SoC. Fill out implementation and fix bugs as needed.
  • Need defined benchmark(s) for SD and eMMC storage for root-filesystem usage, possibly based on IOZone. These must account for wear effects, which cause system performance to decrease as the cumulative number of write operations increases. Such benchmarks also must be automated.

Improve Memory Footprint for ARM

Discussed how to build on the investigation work from the last cycle. Brief summary of proposed actions (assignees TBD):

  • Sample kernel memory usage as well as userspace
  • Compare x86<->ARM (helpful to flag up arch-specific regressions)
  • Discuss the footprint of update-manager and apt-xapian-index with the Ubuntu packaging folks
  • Define and sample some particular app use cases (web browser, media playback etc.)
  • Follow up on the status of work on link-time dead code elimination, hot/cold data partitioning, and prelinking
  • Analyse X resource consumption

Reducing Installation/CD Footprint

  • Need space for unity (2.5 MB), banshee (~ 6 MB), GTK3 stack (order of 25 MB), gallium drivers (14 MB)
  • Remove changelogs.Debian.gz (after a final legal check) [11 MB]
  • Optimized PNGs and SVGs [12.5 MB]
  • Remove perl and perl-modules (keep perl-base) [8 MB]
  • Remove DRI MESA drivers for old cards, add Jockey handler [12 MB]
  • If necessary, drop Evolution contacts sync from default install, which will drop couchdb/erlang
  • Install footprint optimizations which don't affect CD size: compressed apt indexes, remove apt source package cache, remove rsyslog redundancy
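
As a rough check of the CD budget, the figures above can be added up. This sketch assumes the bracketed numbers are the space each removal frees, and it treats the approximate sizes (banshee, GTK3 stack) as exact; the function names are made up for illustration.

```c
#include <stddef.h>

/* Sum an array of sizes given in MB. */
static double sum_mb(const double *mb, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += mb[i];
    return total;
}

/* Space wanted: unity, banshee (approx.), GTK3 stack (approx.), gallium drivers. */
double space_needed_mb(void)
{
    const double needed[] = {2.5, 6.0, 25.0, 14.0};
    return sum_mb(needed, sizeof needed / sizeof needed[0]);
}

/* Space freed: changelogs, optimized PNGs/SVGs, perl, old DRI drivers. */
double space_saved_mb(void)
{
    const double saved[] = {11.0, 12.5, 8.0, 12.0};
    return sum_mb(saved, sizeof saved / sizeof saved[0]);
}

/* Positive result means the listed savings do not cover the need. */
double shortfall_mb(void)
{
    return space_needed_mb() - space_saved_mb();
}
```

With these figures the four removals free about 43.5 MB against roughly 47.5 MB needed, leaving a gap of about 4 MB, which may be why dropping Evolution contacts sync is listed as a fallback.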

Kernel Memory Regions

  • Calculations showed a 15% battery-lifetime extension from the DRAM power-off use case.
  • Additional use case identified: having drivers announce contiguous-memory needs during boot.
  • Additional use case identified: changing memory-region attributes at run time in order to compact non-movable memory.
  • Paul to communicate the use cases to the IBMer working on this to see if they can be accommodated.

UDSProceedings/N/Performance (last edited 2010-10-29 18:38:03 by host194)