Linux I/O schedulers
I/O schedulers attempt to improve throughput by reordering requests into a linear order based on the logical addresses of the data and by grouping adjacent requests together. While this may increase overall throughput, it may leave some I/O requests waiting too long, causing latency issues. I/O schedulers therefore try to balance high throughput against sharing I/O fairly amongst processes.
Different I/O schedulers take different approaches and each has its own strengths and weaknesses. As a general rule, there is no perfect default I/O scheduler for the full range of I/O demands a system may experience.
Non-multiqueue I/O schedulers
deadline
This fixes starvation issues seen in other schedulers. It uses 3 queues for I/O requests:
- Sorted - requests ordered by sector
- Read FIFO - read requests stored chronologically
- Write FIFO - write requests stored chronologically
Requests are issued from the sorted queue unless the request at the head of the read or write FIFO expires. Read requests are preferred over write requests. Read requests have a 500ms expiration time, write requests have a 5s expiration time.
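The dispatch rule described above can be sketched as a small shell function. This is a simplified illustration only, not the kernel implementation: the function name deadline_pick, its arguments and the "-" marker for an empty FIFO are hypothetical, and batching and write starvation limits are ignored.

```shell
# Expiration times from the text above, in milliseconds.
READ_EXPIRE_MS=500
WRITE_EXPIRE_MS=5000

# deadline_pick NOW READ_HEAD WRITE_HEAD
#   NOW        current time in ms
#   READ_HEAD  arrival time (ms) of the oldest queued read, or "-" if none
#   WRITE_HEAD arrival time (ms) of the oldest queued write, or "-" if none
# Prints which queue the next request would be taken from.
deadline_pick() {
    local now=$1 read_head=$2 write_head=$3

    # Reads are checked first because they are preferred over writes.
    if [ "$read_head" != "-" ] && [ $((now - read_head)) -ge "$READ_EXPIRE_MS" ]; then
        echo "read-fifo"
    elif [ "$write_head" != "-" ] && [ $((now - write_head)) -ge "$WRITE_EXPIRE_MS" ]; then
        echo "write-fifo"
    else
        # Nothing has expired: keep serving requests in sector order.
        echo "sorted"
    fi
}
```

For example, `deadline_pick 1000 400 0` reports that the oldest read has waited 600ms and so would be served from the read FIFO.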
cfq (Completely Fair Queueing)
- Per-process sorted queues for synchronous I/O requests.
- Fewer queues for asynchronous I/O requests.
- Priorities from ionice are taken into account.
Each queue is allocated a time slice for fair queuing. There may be wasteful idle time if a time slice quantum has not expired.
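As an illustration of how ionice priorities might be used with cfq, the sketch below runs a command in the idle I/O class (ionice -c 3), so it only gets disk time when no other process needs it. The wrapper name run_idle is hypothetical; it falls back to running the command unchanged if ionice is missing or cannot set the priority.

```shell
# run_idle COMMAND [ARGS...]
# Run COMMAND in the idle I/O scheduling class when ionice is usable,
# otherwise run it with the default I/O priority.
run_idle() {
    if command -v ionice >/dev/null 2>&1 && ionice -c 3 true 2>/dev/null; then
        ionice -c 3 "$@"
    else
        "$@"
    fi
}

# Example (commented out): a backup that should not disturb other I/O.
# run_idle tar -czf /tmp/home-backup.tar.gz "$HOME"
```

The best-effort class (ionice -c 2 -n N, where N ranges from 0 for highest to 7 for lowest priority) can be used in the same way when a command should keep running but with reduced priority.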
noop (No-operation)
Performs merging of I/O requests but no sorting. Good for random access devices (flash, ramdisk, etc) and for devices that sort I/O requests themselves, such as advanced storage controllers.
Multiqueue I/O schedulers
The following I/O schedulers are designed for multiqueue devices. They map I/O requests to multiple queues, which are handled by kernel threads distributed across multiple CPUs.
bfq (Budget Fair Queuing) (Multiqueue)
Designed to provide good interactive response, especially for slower I/O devices. This is a complex I/O scheduler and has a relatively high per-operation overhead so it is not ideal for devices with slow CPUs or high throughput I/O devices. Fair sharing is based on the number of sectors requested and heuristics rather than a time slice.
kyber (Multiqueue)
Designed for fast multi-queue devices and is relatively simple. Has two request queues:
- Synchronous requests (e.g. blocked reads)
- Asynchronous requests (e.g. writes)
There are strict limits on the number of request operations sent to the queues. In theory this limits the time waiting for requests to be dispatched, and hence should provide quick completion time for requests that are high priority.
none (Multiqueue)
The multi-queue no-op I/O scheduler. Does no reordering of requests and has minimal overhead. Ideal for fast random I/O devices such as NVMe.
mq-deadline (Multiqueue)
This is an adaptation of the deadline I/O scheduler designed for multiqueue devices.
Selecting I/O Schedulers
Prior to Ubuntu 19.10 (Linux 4.20), multiqueue I/O scheduling was not enabled by default and only the deadline, cfq and noop I/O schedulers were available.
For Ubuntu 19.10 (Linux 4.20) onwards, multiqueue is enabled by default providing the bfq, kyber, mq-deadline and none I/O schedulers. One can disable these and fall back to the non-multiqueue I/O schedulers using a kernel parameter, for example for SCSI devices one can use:
Add this to the GRUB_CMDLINE_LINUX_DEFAULT string in /etc/default/grub and run sudo update-grub to enable this option.
Changing an I/O scheduler is performed on a per block device basis. For example, for the non-multiqueue device /dev/sda one can see the available I/O schedulers, with the active one shown in square brackets, using the following:
cat /sys/block/sda/queue/scheduler
noop deadline [cfq]
to change this to deadline use:
echo "deadline" | sudo tee /sys/block/sda/queue/scheduler
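To check every block device at once, the bracketed entry can be extracted from each scheduler file. The following is a minimal sketch; the function names active_scheduler and list_schedulers are illustrative, and the optional directory argument exists only so the function can be tried against a copy of the sysfs layout.

```shell
# active_scheduler FILE
# Print the scheduler shown in square brackets in a sysfs "scheduler" file.
active_scheduler() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$1"
}

# list_schedulers [SYSFS_BLOCK_DIR]
# Print "device: active-scheduler" for each block device found.
list_schedulers() {
    local root=${1:-/sys/block} f dev
    for f in "$root"/*/queue/scheduler; do
        [ -r "$f" ] || continue
        dev=${f#"$root"/}   # strip the leading directory
        dev=${dev%%/*}      # keep only the device name
        printf '%s: %s\n' "$dev" "$(active_scheduler "$f")"
    done
}

# Example (commented out):
# list_schedulers
```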
For multiqueue devices the default will show:
cat /sys/block/sda/queue/scheduler
[mq-deadline] none
To use kyber, load the module:
sudo modprobe kyber-iosched
cat /sys/block/sda/queue/scheduler
[mq-deadline] kyber none
and enable it:
echo "kyber" | sudo tee /sys/block/sda/queue/scheduler
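Writing a scheduler name that the kernel does not currently offer for the device fails, so it can help to check the list first. The sketch below is one way to do that; scheduler_available and set_scheduler are hypothetical helper names, and set_scheduler assumes sudo is available.

```shell
# scheduler_available FILE NAME
# Succeed if NAME is listed (active or not) in the sysfs scheduler FILE.
scheduler_available() {
    # Drop the square brackets, put one name per line, match the whole line.
    sed 's/[][]//g' "$1" | tr ' ' '\n' | grep -qx -- "$2"
}

# set_scheduler DEVICE NAME
# Switch DEVICE (e.g. sda) to scheduler NAME if it is available.
set_scheduler() {
    local dev=$1 name=$2
    local file=/sys/block/$dev/queue/scheduler
    if scheduler_available "$file" "$name"; then
        echo "$name" | sudo tee "$file" >/dev/null
    else
        echo "$name is not available for $dev (is its module loaded?)" >&2
        return 1
    fi
}

# Example (commented out):
# set_scheduler sda kyber
```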
Tuning I/O Schedulers
Each I/O scheduler has a default set of tunable options that may be adjusted to help improve performance or fair sharing for your particular use case. The following kernel documentation covers these per-I/O scheduler tunable options:
deadline (and mq-deadline) deadline-iosched.txt
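The tunables of the active scheduler are exposed as files under /sys/block/DEVICE/queue/iosched/. As a rough sketch, the following prints each tunable with its current value; the function name show_tunables is illustrative, and the default path assumes the device is sda.

```shell
# show_tunables [IOSCHED_DIR]
# Print "name=value" for every readable tunable of the active scheduler.
show_tunables() {
    local dir=${1:-/sys/block/sda/queue/iosched} f
    for f in "$dir"/*; do
        [ -f "$f" ] && [ -r "$f" ] || continue
        printf '%s=%s\n' "${f##*/}" "$(cat "$f")"
    done
}

# Examples (commented out):
# show_tunables
# echo 250 | sudo tee /sys/block/sda/queue/iosched/read_expire
```

The second example assumes deadline or mq-deadline is active, where read_expire is the read expiration time in milliseconds. Values set this way are not kept across reboots or scheduler changes.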
Best I/O scheduler to use
Different I/O requirements may benefit from changing from the Ubuntu distro default. A quick start guide to select a suitable I/O scheduler is below. The results are based on running 25 different synthetic I/O patterns generated using fio on ext2, ext3, ext4, xfs and btrfs with the various I/O schedulers using the 4.19 kernel.
Note: deadline is used below for deadline or mq-deadline, and noop is used for noop or none, depending on the multiqueue capability of your device.
SSD or NVMe drives
1st choice 2nd choice avoid
Random I/O: cfq/deadline bfq
Sequential I/O: deadline none bfq
Database I/O: deadline none bfq
General I/O: deadline none bfq
It is worth noting that there is little difference in throughput between the cfq/deadline/mq-deadline/kyber I/O schedulers when using fast multi-queue SSD configurations or fast NVMe devices. In these cases it may be preferable to use a noop/none I/O scheduler to reduce CPU overhead.
HDD
1st choice 2nd choice avoid
Random I/O: deadline bfq/cfq bfq/none
Sequential I/O: deadline bfq none
Database I/O: deadline bfq/none
General I/O: deadline bfq none
Avoid using the none/noop I/O schedulers for an HDD: sorting requests by block address reduces seek time latencies, and neither of these I/O schedulers supports this feature.
Of course, your use case may differ. The above are just suggestions to start with, based on some synthetic tests. You may find that other choices, with adjustments to the I/O scheduler tunables, produce better results.
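To tell which set of suggestions applies to a device, the kernel's rotational flag in /sys/block/DEVICE/queue/rotational can be read: 1 normally means a spinning disk and 0 a non-rotational device. The sketch below uses a hypothetical helper name, device_kind. The flag may be misleading behind some RAID controllers or in virtual machines, so treat the result as a hint.

```shell
# device_kind ROTATIONAL_FILE
# Print "hdd", "ssd" or "unknown" based on the kernel's rotational flag.
device_kind() {
    case "$(cat "$1" 2>/dev/null)" in
        1) echo hdd ;;
        0) echo ssd ;;
        *) echo unknown ;;
    esac
}

# Example (commented out):
# device_kind /sys/block/sda/queue/rotational
```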