Summary

Identify and implement memory footprint optimisations for Maverick on ARM. This includes identifying the target package profile and tools for measuring the memory footprint of packages/processes. Also identify areas of potential improvement for future releases.

Release Note

There should be minimal end-user impact other than improved performance.

Other end-user impact TBD.

Rationale

Run-time memory optimisations help to get the best performance and battery life from lightweight and mobile devices. Mobile devices are more sensitive to RAM usage than heavyweight laptops and desktops because, among other things, they have less RAM and smaller TLBs and caches. Optimising memory footprint has the potential to bring speed improvements (particularly in boot, login and application launch times) as well as improved power efficiency and battery life.

Implementation

Implementation requires identifying a target profile, defining what is meant by memory footprint, identifying tools for measuring memory footprint, using those tools to find areas of potential improvement, and finally implementing those improvements. See the work items in the blueprint whiteboard: https://blueprints.launchpad.net/ubuntu-arm/+spec/arm-m-memory-footprint.

Target Profile Selection

For now, the target profile will be as follows:

See arm-m-ui-and-test-heads for details.

Tools for Instrumenting Memory Footprint

This section proposes:

Definition of Memory Footprint

Goal: provide a definition of the memory footprint of each process in the system which can be used as a basis for profiling and optimisation.

Proposed definition:

Tools Survey

A quick review of some relevant tools and utilities:

$ vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0  32372 784880 154488 832572    0    0     7    10   21   46  0  0 99  0

$ free
             total       used       free     shared    buffers     cached
Mem:       1896300    1111296     785004          0     154492     832592
-/+ buffers/cache:     124212    1772088
Swap:      5582548      32372    5550176
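
The "-/+ buffers/cache" row in the output above can be reproduced from the Mem: row by treating buffers and page cache as reclaimable. The following is a minimal sketch of that arithmetic; the function name is hypothetical and is not part of any tool listed here.

```python
def buffers_cache_adjusted(used_kb, free_kb, buffers_kb, cached_kb):
    """Return (used, free) in kB with buffers and page cache counted as free."""
    reclaimable = buffers_kb + cached_kb
    return used_kb - reclaimable, free_kb + reclaimable


# Values copied from the Mem: row of the output above.
adjusted_used, adjusted_free = buffers_cache_adjusted(
    used_kb=1111296, free_kb=785004, buffers_kb=154492, cached_kb=832592
)
print(adjusted_used, adjusted_free)  # 124212 1772088
```

The adjusted "used" figure is usually the more useful starting point for footprint work, since the raw figure includes cache that the kernel can drop under memory pressure.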

$ sar -r

Linux 2.6.32-22-generic (e200948)       05/28/10        _i686_  (4 CPU)
13:38:33    kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit
13:38:43       761320   1134980     59.85    155548    854884    224820      3.01
13:38:53       761320   1134980     59.85    155552    854888    224820      3.01
13:39:03       760356   1135944     59.90    155572    855784    224820      3.01
13:39:13       759884   1136416     59.93    155608    856136    224820      3.01
13:39:23       759660   1136640     59.94    155688    856364    224820      3.01
13:39:33       759660   1136640     59.94    155692    856364    224820      3.01

$ ipcs -m
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status      
0x00000000 131072     ubuntu     600        393216     2          dest         
0x00000000 163841     ubuntu     600        393216     2          dest         
0x00000000 196610     ubuntu     600        393216     2          dest         

$ ls -al /dev/shm
total 1192
drwxrwxrwt  2 root  root       120 May 28 15:06 .
drwxr-xr-x 17 root  root      3960 May 28 13:02 ..
-r--------  1 ubuntu ubuntu 67108904 May 28 15:06 pulse-shm-1306850678
-r--------  1 ubuntu ubuntu 67108904 May 28 15:06 pulse-shm-2493460948
-r--------  1 ubuntu ubuntu 67108904 May 28 15:06 pulse-shm-3315159511
-r--------  1 ubuntu ubuntu 67108904 May 28 15:06 pulse-shm-549372063
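
In the listing above, each pulse-shm file has an apparent size of about 64 MB, while the "total 1192" line suggests that far less is actually allocated. The sketch below shows one way to compare the two figures for files in a directory; the function name is hypothetical, and it assumes a Linux system where st_blocks is reported in 512-byte units.

```python
import os


def apparent_and_allocated_bytes(path):
    """Return (apparent size, allocated size) in bytes for one file."""
    st = os.stat(path)
    # st_blocks counts 512-byte units on Linux, regardless of filesystem block size.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    shm_dir = "/dev/shm"
    if os.path.isdir(shm_dir):
        for name in sorted(os.listdir(shm_dir)):
            try:
                apparent, allocated = apparent_and_allocated_bytes(
                    os.path.join(shm_dir, name)
                )
            except OSError:
                continue  # file vanished or is not accessible
            print(name, apparent, allocated)
```

For footprint accounting, the allocated figure is the one that corresponds to pages actually held in memory by the shared segment.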

# find /proc -type d ! -regex '/proc\(/[1-9][0-9]*\)?' -prune -o -type f -name maps -exec grep '' {} +

/proc/3856/maps:00008000-00093000 r-xp 00000000 b3:0a 207460     /usr/bin/gnome-keyring-daemon
/proc/3856/maps:0009a000-0009f000 r--p 0008a000 b3:0a 207460     /usr/bin/gnome-keyring-daemon
/proc/3856/maps:0009f000-000a1000 rw-p 0008f000 b3:0a 207460     /usr/bin/gnome-keyring-daemon
/proc/3856/maps:000a1000-000e5000 rwxp 000a1000 00:00 0          [heap]
/proc/3856/maps:40000000-4001c000 r-xp 00000000 b3:0a 202828     /lib/ld-2.10.1.so
/proc/3856/maps:4001c000-40023000 rw-p 4001c000 00:00 0 
/proc/3856/maps:40023000-40024000 r--p 0001b000 b3:0a 202828     /lib/ld-2.10.1.so
/proc/3856/maps:40024000-40025000 rw-p 0001c000 b3:0a 202828     /lib/ld-2.10.1.so
/proc/3856/maps:40025000-40026000 r--p 00000000 b3:0a 398963     /usr/lib/locale/en_GB.utf8/LC_IDENTIFICATION
/proc/3856/maps:40026000-4002d000 r--s 00000000 b3:0a 301799     /usr/lib/gconv/gconv-modules.cache
[...]
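
A listing like the one above can be summarised per mapped file. The following is a minimal sketch, assuming the raw contents of a single /proc/<pid>/maps file (without the "/proc/<pid>/maps:" prefix that grep adds); the function name is hypothetical. It reports virtual mapping sizes only, not resident memory.

```python
from collections import defaultdict


def mapped_kb_by_path(maps_text):
    """Sum the virtual size in kB of each mapping, keyed by backing path."""
    totals = defaultdict(int)
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        start, end = (int(addr, 16) for addr in fields[0].split("-"))
        # Mappings with no sixth field are anonymous.
        path = fields[5].strip() if len(fields) > 5 else "[anon]"
        totals[path] += (end - start) // 1024
    return dict(totals)


sample = """\
00008000-00093000 r-xp 00000000 b3:0a 207460     /usr/bin/gnome-keyring-daemon
0009a000-0009f000 r--p 0008a000 b3:0a 207460     /usr/bin/gnome-keyring-daemon
000a1000-000e5000 rwxp 000a1000 00:00 0          [heap]
"""
print(mapped_kb_by_path(sample))
```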

b77b8000-b77b9000 r--p 00000000 08:01 7202724    /usr/lib/locale/en_GB.utf8/LC_PAPER
Size:                  4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         4 kB
Private_Dirty:         0 kB
Referenced:            4 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
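
The excerpt above has the per-mapping format of /proc/<pid>/smaps. Pss (proportional set size) charges each shared page to a process in proportion to the number of processes sharing it, so summing Pss over all mappings gives a per-process figure that adds up sensibly across the system. The following is a minimal sketch of that summation; the function name and the default choice of fields are assumptions for illustration.

```python
def smaps_totals(smaps_text, fields=("Rss", "Pss", "Private_Dirty")):
    """Sum selected kB counters over every mapping in smaps-formatted text."""
    totals = {name: 0 for name in fields}
    for line in smaps_text.splitlines():
        key, sep, rest = line.partition(":")
        # Mapping header lines and unselected counters do not match any key.
        if sep and key in totals:
            totals[key] += int(rest.split()[0])
    return totals


sample = """\
b77b8000-b77b9000 r--p 00000000 08:01 7202724    /usr/lib/locale/en_GB.utf8/LC_PAPER
Size:                  4 kB
Rss:                   4 kB
Pss:                   4 kB
Private_Clean:         4 kB
Private_Dirty:         0 kB
"""
print(smaps_totals(sample))
```

On a Linux system, the same function could be applied to the text of /proc/self/smaps or, with sufficient privileges, to another process's smaps file.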

deb http://repository.maemo.org fremantle/tools free
deb-src http://repository.maemo.org fremantle/tools free

# apt-get install sp-memusage sp-smaps-measure sp-smaps-visualize

TODO: still to investigate

Build Time Tools for Reducing Memory Footprint

Using definitions and tools determined above, investigate the effect of various build time tools and compiler options on memory footprint.
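
One simple way to compare builds is to look at the text, data and bss sizes reported by the binutils size utility in its default Berkeley format. The sketch below parses such output; the function name is hypothetical, and the sample figures and file names are invented purely for illustration and are not measurements.

```python
def parse_size_output(text):
    """Parse Berkeley-format `size` output into {filename: {text, data, bss}}."""
    result = {}
    for line in text.splitlines()[1:]:  # first line is the column header
        parts = line.split()
        if len(parts) < 6:
            continue
        text_b, data_b, bss_b = (int(value) for value in parts[:3])
        result[parts[5]] = {"text": text_b, "data": data_b, "bss": bss_b}
    return result


# Invented numbers, for illustration only.
sample = """\
   text\t   data\t    bss\t    dec\t    hex\tfilename
   1000\t    200\t     50\t   1250\t    4e2\tdemo-O2
    800\t    200\t     50\t   1050\t    41a\tdemo-Os
"""
sizes = parse_size_output(sample)
print(sizes["demo-O2"]["text"] - sizes["demo-Os"]["text"])
```

Static section sizes are only a proxy for run-time footprint, so any saving seen this way would still need to be confirmed with the measurement tools from the previous section.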

Kernel Tuning for Optimised Memory Utilization

Investigate how kernel tuning affects memory use.
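
As a starting point for such an investigation, the current values of VM tunables can be recorded alongside each measurement. The following is a minimal sketch that reads a few entries from /proc/sys/vm; the function name and the particular selection of tunables are assumptions, and which ones matter on ARM is exactly what remains to be investigated.

```python
from pathlib import Path


def read_vm_tunables(
    names=("swappiness", "vfs_cache_pressure", "min_free_kbytes"),
    base="/proc/sys/vm",
):
    """Return {tunable name: value string, or None if it cannot be read}."""
    values = {}
    for name in names:
        try:
            values[name] = (Path(base) / name).read_text().strip()
        except OSError:
            values[name] = None  # not present or not readable on this system
    return values


print(read_vm_tunables())
```

Recording these values with each before/after measurement makes it easier to attribute a change in footprint to a tuning change rather than to a difference in workload.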

Targets for Improvement

Actual targets will be based on data gathered using tools from previous sections.

UI Changes

Memory concerns could influence the choice of graphics stack.

Code Changes

TBD

Migration

Probably no end user impact, but there may be software migration issues.

Cross-References

Test/Demo Plan

Before/after demonstration on chosen target profile on OMAP (256MB)?

TODO: Add an entry to http://testcases.qa.ubuntu.com/Coverage/NewFeatures for tracking test coverage. This need not be added or completed until the specification is nearing beta.

Unresolved Issues / Future Work

It is likely that more areas of improvement will be identified than can be addressed in this release. This effort should continue in future releases.

BoF agenda and discussion

 * work on low-hanging fruit now and improve measurement techniques and tools for
   future integration
 * a minimal filesystem with just a shell would make it possible to run something on
   low-memory platforms. Constraints:
   * low amount of memory
   * smaller caches
 * library duplication
  * openssl / gnutls
 * -Os needs to be measured to understand the runtime performance impact
   (but we build for Thumb2, so we reduce space and sometimes gain speed)
 * tools for finding dead library code: callgrind/valgrind
  * massif (from valgrind) also potentially useful for heap profiling
 * Link Time Dead Code and Data Elimination Using GNU Toolchain
  * http://elinux.org/images/2/2d/ELC2010-gc-sections_Denys_Vlasenko.pdf
   (not available yet for ARMv7 but being worked on)
 * Reprofiling the boot process for ARM could be useful
  * upstart is currently optimised mainly for the x86 world; ARM might benefit from
    arch-specific benchmarking and tuning.
 * Default VM tuning parameters may benefit from profiling for ARM - use different defaults for ARM versus x86, or per platform?
 * Build kernel for thumb2?
  * Likely to run into alignment bugs, so sooner rather than later
 * Coverage measurement?
  * Haphazard in Ubuntu
 * how much of libc is used by the standard use case
   * mklibs on installed/embedded systems
   * the libc.a could be kept on disk and after installing packages, mklibs could be run to
     produce a lib with exactly what is currently used
   * doubtful whether there would be a considerable win if you do this for a real install, e.g.
     the installer alone already pulls in 70% at the moment -> could be investigated in an experiment
 * Ubuntu/Debian has a library reduction tool - mklibs
 * uclibc? (Nicolas says it can run firefox) - internationalisation limitations
 * exmap(-console) may be a useful memory measurement tool
 * Additional possible memory profiling tools
  * sar
  * frysk


CategorySpec

Specs/M/ARMMemoryFootprint (last edited 2012-10-10 15:54:34 by c-75-71-83-192)