stress-ng

Revision 1 as of 2016-01-29 17:50:26

Introduction

stress-ng stress tests a computer system in various selectable ways. It was designed to exercise the physical subsystems of a computer as well as the operating system kernel interfaces. stress-ng also has a wide range of CPU-specific stress tests that exercise floating point, integer, bit manipulation and control flow.

stress-ng was originally intended to make a machine work hard and trip hardware issues such as thermal overruns as well as operating system bugs that only occur when a system is being thrashed hard. Use stress-ng with caution as some of the tests can make a system run hot on poorly designed hardware and also can cause excessive system thrashing which may be difficult to stop.

The tool has a wide range of different stress mechanisms (known as "stressors") and a full description of these is included in the man page. This document is a quick-start reference guide and covers some of the more typical use cases for stress-ng.

A simple example

The matrix stressor is a good way to exercise the CPU floating point operations as well as memory and processor data cache. Of all the tests, this one generally heats x86 CPUs the best.

To run 1 instance of this for 60 seconds, use:

stress-ng --matrix 1 -t 1m

If you want to run an instance of this on ALL the CPUs on your machine, specify 0 instances and stress-ng will figure out how many to run:

stress-ng --matrix 0 -t 1m

You can get an idea of how much user time and system (kernel) time is being used via the --times option:

stress-ng --matrix 0 -t 1m --times
stress-ng: info:  [16783] dispatching hogs: 4 matrix
stress-ng: info:  [16783] successful run completed in 60.00s (1 min, 0.00 secs)
stress-ng: info:  [16783] for a 60.00s run time:
stress-ng: info:  [16783]     240.00s available CPU time
stress-ng: info:  [16783]     205.21s user time   ( 85.50%)
stress-ng: info:  [16783]       0.32s system time (  0.13%)
stress-ng: info:  [16783]     205.53s total time  ( 85.64%)
stress-ng: info:  [16783] load average: 3.20 1.25 1.40

In the above example, I ran this on a machine with 4 CPU threads, so 4 instances were executed. The machine wasn't particularly idle at the time. The available CPU time was 4 x 60 seconds (240 seconds), of which 85.50% was spent in user time and 0.13% in the kernel. stress-ng only got 85.64% of the available CPU time, since the machine was a bit busy doing other work at the same time.
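The percentages in the --times output follow from simple arithmetic. The Python sketch below reproduces them from the figures quoted above; the function name cpu_time_share is made up for this illustration and is not part of stress-ng.

```python
def cpu_time_share(threads, run_secs, user_secs, system_secs):
    """Return the available CPU time and the user/system/total shares in percent."""
    # Available CPU time is one run length per CPU thread
    available = threads * run_secs
    total = user_secs + system_secs
    return {
        "available": available,
        "user_pct": 100.0 * user_secs / available,
        "system_pct": 100.0 * system_secs / available,
        "total_pct": 100.0 * total / available,
    }


# Figures copied from the 4 thread matrix run shown above
share = cpu_time_share(threads=4, run_secs=60.0, user_secs=205.21, system_secs=0.32)
for name, value in share.items():
    print(f"{name}: {value:.2f}")
```

The results should agree with the stress-ng figures to within rounding, since stress-ng works from more precise timings than the two decimal places it prints.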

Now consider a more interesting stress test, such as passing messages between processes using a POSIX message queue. We can run the mq stressor with the --perf option to see some more detail on what the machine is doing during the run:

stress-ng --mq 0 -t 30s --times --perf
stress-ng: info:  [16973] dispatching hogs: 4 mq
stress-ng: info:  [16973] successful run completed in 30.00s
stress-ng: info:  [16973] mq:
stress-ng: info:  [16973]            290,423,383,332 CPU Cycles                     9.68 B/sec
stress-ng: info:  [16973]            223,288,693,644 Instructions                   7.44 B/sec (0.769 instr. per cycle)
stress-ng: info:  [16973]                138,916,980 Cache References               4.63 M/sec
stress-ng: info:  [16973]                  5,305,248 Cache Misses                   0.18 M/sec ( 3.82%)
stress-ng: info:  [16973]            183,625,100,272 Stalled Cycles Frontend        6.12 B/sec
stress-ng: info:  [16973]             42,638,257,404 Branch Instructions            1.42 B/sec
stress-ng: info:  [16973]                167,682,072 Branch Misses                  5.59 M/sec ( 0.39%)
stress-ng: info:  [16973]             10,231,977,988 Bus Cycles                     0.34 B/sec
stress-ng: info:  [16973]            256,043,743,440 Total Cycles                   8.53 B/sec
stress-ng: info:  [16973]                        176 Page Faults Minor              5.87 sec  
stress-ng: info:  [16973]                          0 Page Faults Major              0.00 sec  
stress-ng: info:  [16973]                 22,901,328 Context Switches               0.76 M/sec
stress-ng: info:  [16973]                        952 CPU Migrations                31.73 sec  
stress-ng: info:  [16973]                          0 Alignment Faults               0.00 sec  
stress-ng: info:  [16973] for a 30.00s run time:
stress-ng: info:  [16973]     120.02s available CPU time
stress-ng: info:  [16973]      11.26s user time   (  9.38%)
stress-ng: info:  [16973]      93.84s system time ( 78.19%)
stress-ng: info:  [16973]     105.10s total time  ( 87.57%)
stress-ng: info:  [16973] load average: 3.72 1.67 1.42

So we can see here that the mq stressor is forcing the processes to context switch at around 0.76 million times per second, and the cache miss rate is quite low (3.82% of cache references).
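The derived figures in the --perf output (rates per second, instructions per cycle and miss percentages) can be recomputed from the raw counters. The following Python sketch does this for the mq run above. The dictionary keys and helper names are chosen for this example only; the counter values are copied from the output.

```python
# Counter values copied from the mq --perf output above
RUN_SECS = 30.0
counters = {
    "cpu_cycles": 290_423_383_332,
    "instructions": 223_288_693_644,
    "cache_references": 138_916_980,
    "cache_misses": 5_305_248,
    "branch_instructions": 42_638_257_404,
    "branch_misses": 167_682_072,
    "context_switches": 22_901_328,
}


def per_second(count, secs=RUN_SECS):
    """Average event rate over the whole run."""
    return count / secs


def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


ipc = counters["instructions"] / counters["cpu_cycles"]
cache_miss_pct = percent(counters["cache_misses"], counters["cache_references"])
branch_miss_pct = percent(counters["branch_misses"], counters["branch_instructions"])
ctx_per_sec = per_second(counters["context_switches"])

print(f"instructions per cycle: {ipc:.3f}")
print(f"cache misses:           {cache_miss_pct:.2f}%")
print(f"branch misses:          {branch_miss_pct:.2f}%")
print(f"context switches:       {ctx_per_sec / 1e6:.2f} M/sec")
```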

Bogo Ops

stress-ng measures the "throughput" of a stress test in "bogus operations per second" (bogo ops). The size of a bogo op depends on the stressor being run, so bogo ops are not comparable between different stressors. They give a rough notion of performance but should not be used as an accurate benchmarking figure. They are useful for seeing whether performance changes between kernel versions or between the compiler versions used to build stress-ng. One can also use them to get a rough comparison of performance between different systems. But caveat emptor: they are NOT intended to be a scientifically accurate benchmarking metric.

Use the --metrics-brief option to show the bogo ops. Let's see how the matrix stressor fares on an i5-3210M laptop:

stress-ng --matrix 0 -t 60s --metrics-brief
stress-ng: info:  [17579] dispatching hogs: 4 matrix
stress-ng: info:  [17579] successful run completed in 60.01s (1 min, 0.01 secs)
stress-ng: info:  [17579] stressor      bogo ops real time  usr time  sys time   bogo ops/s   bogo ops/s
stress-ng: info:  [17579]                          (secs)    (secs)    (secs)   (real time) (usr+sys time)
stress-ng: info:  [17579] matrix          349322     60.00    203.23      0.19      5822.03      1717.25

...we are primarily interested in the bogo ops/s (real time) rate, that is, the total bogo ops measured divided by the total run time.
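Both rate columns can be recomputed from the other columns in the table. This is a minimal Python sketch using the laptop figures above; bogo_ops_rates is a hypothetical helper name, not something stress-ng provides.

```python
def bogo_ops_rates(bogo_ops, real_secs, usr_secs, sys_secs):
    """Return (bogo ops/s over real time, bogo ops/s over usr+sys time)."""
    real_rate = bogo_ops / real_secs
    cpu_rate = bogo_ops / (usr_secs + sys_secs)
    return real_rate, cpu_rate


# Values copied from the matrix row of the laptop run above
real_rate, cpu_rate = bogo_ops_rates(
    bogo_ops=349322, real_secs=60.00, usr_secs=203.23, sys_secs=0.19
)
print(f"bogo ops/s (real time):    {real_rate:.2f}")
print(f"bogo ops/s (usr+sys time): {cpu_rate:.2f}")
```

The real time rate reflects how much work the whole machine got through per second of wall clock time, while the usr+sys rate is roughly the work done per second of CPU time consumed.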

..and now run it on a 48 thread Xeon(R) CPU E5-2680 server:

stress-ng --matrix 0 -t 60s --metrics-brief
stress-ng: info:  [113534] dispatching hogs: 48 matrix
stress-ng: info:  [113534] successful run completed in 60.01s (1 min, 0.01 secs)
stress-ng: info:  [113534] stressor      bogo ops real time  usr time  sys time   bogo ops/s   bogo ops/s
stress-ng: info:  [113534]                          (secs)    (secs)    (secs)   (real time) (usr+sys time)
stress-ng: info:  [113534] matrix         6594214     60.00   2882.38      0.02    109903.67      2287.75

So 5822.03 vs 109903.67: the Xeon server has 12 times as many threads and about 18.9 times the throughput on this specific stress test.
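The comparison between the two machines can be checked with a few lines of Python. This is only a sketch over the two real time rates quoted above; the per-thread figure is an extra derived value that stress-ng does not report itself.

```python
# Thread counts and bogo ops/s (real time) copied from the two runs above
laptop = {"threads": 4, "bogo_ops_per_sec": 5822.03}
server = {"threads": 48, "bogo_ops_per_sec": 109903.67}

# How many times more threads and throughput the server has
thread_ratio = server["threads"] / laptop["threads"]
throughput_ratio = server["bogo_ops_per_sec"] / laptop["bogo_ops_per_sec"]

# Throughput gain that is not explained by the thread count alone
per_thread_ratio = throughput_ratio / thread_ratio

print(f"threads:    {thread_ratio:.0f}x")
print(f"throughput: {throughput_ratio:.1f}x")
print(f"per thread: {per_thread_ratio:.2f}x")
```

As the bogo ops caveat says, treat these ratios as a rough comparison rather than a benchmark result.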

Running many stressors

stress-ng can run more than one stress test. By default it will run the requested stressors in parallel, for example, running 2 instances of the CPU stressor, 1 instance of the matrix stressor and 3 instances of the message queue stressor in parallel for 5 minutes:

stress-ng --cpu 2 --matrix 1 --mq 3 -t 5m

One can invoke all the stress tests to run in parallel, with the --all option. The following example will run 2 instances of each of all the stress tests in parallel:

stress-ng --all 2

Alternatively, run the stressors sequentially, one after another. The following example will run each stress test in turn with 4 instances, for 20 seconds per stressor:

stress-ng --seq 4 -t 20

To exclude specific stressors from the --all and --seq options, use the -x option:

stress-ng --seq 1 -x numa,matrix,hdd

...this will run all the stressors except for the numa, matrix and hdd stressors.
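When stress-ng is driven from a script, it can help to assemble the command line in one place. The Python sketch below builds, but deliberately does not run, a sequential command using only the --seq, -t and -x options described above. The function stress_ng_seq_command is a hypothetical helper written for this guide.

```python
import shlex


def stress_ng_seq_command(instances, timeout=None, exclude=()):
    """Build (but do not run) a sequential stress-ng command line."""
    argv = ["stress-ng", "--seq", str(instances)]
    if timeout is not None:
        # Per-stressor run time, passed through unchanged
        argv += ["-t", str(timeout)]
    if exclude:
        # -x takes a comma separated list of stressor names
        argv += ["-x", ",".join(exclude)]
    return argv


argv = stress_ng_seq_command(1, exclude=["numa", "matrix", "hdd"])
print(shlex.join(argv))
```

The returned list can be handed to subprocess.run() as is, which avoids shell quoting problems.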

How hot?

If your machine supports thermal zones, then stress-ng can report on the temperature at the end of a run with the --tz option, for example, 60 seconds of the CPU stressor:

stress-ng --cpu 0 --tz -t 60
stress-ng: info:  [18065] dispatching hogs: 4 cpu
stress-ng: info:  [18065] successful run completed in 60.07s (1 min, 0.07 secs)
stress-ng: info:  [18065] cpu:
stress-ng: info:  [18065]         x86_pkg_temp   88.75 °C
stress-ng: info:  [18065]               acpitz   88.38 °C
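To track temperatures across several runs, the thermal zone lines can be pulled out of captured output. The following Python sketch parses the sample shown above. It assumes the line layout stays exactly as in this example, which may not hold for other stress-ng versions, and parse_thermal_zones is a name invented for this illustration.

```python
import re

# Output lines copied from the --tz run above
SAMPLE = """\
stress-ng: info:  [18065] cpu:
stress-ng: info:  [18065]         x86_pkg_temp   88.75 °C
stress-ng: info:  [18065]               acpitz   88.38 °C
"""

# After the "[pid]" prefix: a zone name, a number, then the unit
ZONE_LINE = re.compile(r"\]\s+(\S+)\s+(-?\d+(?:\.\d+)?)\s*°C\s*$")


def parse_thermal_zones(text):
    """Map thermal zone name to temperature in degrees Celsius."""
    zones = {}
    for line in text.splitlines():
        match = ZONE_LINE.search(line)
        if match:
            zones[match.group(1)] = float(match.group(2))
    return zones


zones = parse_thermal_zones(SAMPLE)
hottest = max(zones, key=zones.get)
print(f"hottest zone: {hottest} at {zones[hottest]:.2f} °C")
```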

More stressy

The --aggressive option cranks up the stress by enabling more file, cache and memory aggressive options in the stress tests if they are available. It will also force processes to jump around between CPUs, which will stress SMP and NUMA systems further.

Stressors are configured to run with default settings, such as memory sizes, cache sizes, file sizes etc. The --maximize option forces stressors to use the largest settings that are sanely possible, causing more stress, for example more I/O and considerably more paging.

Running stress-ng with root privilege is even more aggressive since stress-ng will change scheduling priorities and will maximize itself to the ulimit limits. Don't use this unless you are willing to totally lock up a machine.

Classes

The stress-ng stressors are grouped together in different classes:

  • cpu - CPU intensive
  • cpu-cache - stress CPU instruction and/or data caches
  • device - raw device driver stressors
  • io - generic input/output
  • interrupt - high interrupt load generators
  • filesystem - file system activity
  • memory - stack, heap, memory mapping, shared memory stressors
  • network - TCP/IP, UDP and UNIX domain socket stressors
  • os - core kernel stressors
  • pipe - pipe and UNIX socket stressors
  • scheduler - force high levels of context switching
  • security - AppArmor stressor
  • vm - Virtual Memory stressor (paging and memory)

Stressors may be in one or more classes. For example, the lsearch (linear search) stressor is in the cpu-cache, cpu and memory classes as it exercises all three.

For example, to run all the stress tests in the network class in parallel, with 1 instance of each being run, use:

stress-ng --class network --all 1

..or to run all the network class stressors one by one, with an instance of each being run on all the CPUs, use:

stress-ng --class network --seq 0