Security issues and mitigations with LXC
LXC creates lightweight 'containers', mainly using kernel support for namespaces and control groups. Namespaces provide isolation (for instance, by giving the container no name by which to reference a particular file). Control groups impose various limits (for instance, refusing access to /dev/sda). LSMs can clamp down on permissions with a mandatory access control policy. POSIX capabilities, in particular the bounding set, can be used to refuse some privileges; however, this is less than ideal, because most privileges are desirable when targeted at resources owned by the container. Finally, seccomp2 can refuse the container access to some kernel functionality (system calls).
However, containers always (by design) share the host's kernel. Therefore, any vulnerability in the kernel interface can be exploited by the container to harm the host, unless the container is forbidden the use of that interface (e.g. using seccomp2).
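To see which namespaces a process actually lives in, the links under /proc/<pid>/ns can be read. The following is a minimal sketch, not part of LXC itself: the function names are hypothetical, and it assumes a kernel recent enough to expose these entries as symbolic links of the form 'mnt:[4026531840]'. Two processes that report the same number for a given entry share that namespace.

```python
import os
import re

# Link targets under /proc/<pid>/ns look like "mnt:[4026531840]".
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (kind, inode number)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError("unexpected namespace link: %r" % (target,))
    return match.group("kind"), int(match.group("inode"))


def namespaces_of(pid="self"):
    """Return {entry name: inode number} for the namespaces of pid.

    Returns an empty dict if /proc/<pid>/ns cannot be listed, for
    example because /proc is not mounted or the process does not exist.
    """
    ns_dir = "/proc/%s/ns" % pid
    result = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return result
    for name in names:
        try:
            link = os.readlink(os.path.join(ns_dir, name))
            _kind, inode = parse_ns_link(link)
        except (OSError, ValueError):
            continue  # unreadable entry or unexpected format: skip it
        result[name] = inode
    return result


if __name__ == "__main__":
    for name, inode in sorted(namespaces_of().items()):
        print(name, inode)
```

Running this once on the host and once inside a container, then comparing the numbers, gives a rough view of which namespaces the container has been given of its own. It says nothing about control groups, LSM policy or seccomp, which have to be inspected separately.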
This information is aimed at the 12.04 (precise) release.
Issues considered for 12.04
Below, top-level items are security concerns, and more deeply nested items are potential or actual mitigations.
- specific proc files like /proc/sysrq-trigger (should the harmful ones be listed in a wiki page?)
  - User namespaces will make those files owned by a root other than the container's root. They are unlikely to be complete for 12.04, however.
  - Access to these files can be denied by pathname
    - To prevent the container bypassing this with 'mount --move', apparmor will, for 12.04, have a new rule to enforce mount locations
- reboot/shutdown system call
  - A patch is being sent upstream to make reboot in a container be specially handled. This will prevent this problem.
  - The alternative is to remove CAP_SYS_BOOT from the container's capability bounding set.
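
Whether CAP_SYS_BOOT has really been removed from a container's bounding set (for example via the lxc.cap.drop configuration key) can be checked from inside the container by reading the CapBnd line of /proc/self/status. The sketch below is illustrative only: the helper names are hypothetical, and the capability number 22 is taken from <linux/capability.h>. A cleared bit means the capability can no longer be acquired by processes in the container; a set bit does not by itself prove that a reboot call would succeed.

```python
CAP_SYS_BOOT = 22  # capability number from <linux/capability.h>


def parse_capbnd(status_text):
    """Return the bounding set bitmask from /proc/<pid>/status text.

    Returns None if no CapBnd line is present.
    """
    for line in status_text.splitlines():
        if line.startswith("CapBnd:"):
            return int(line.split(":", 1)[1].strip(), 16)
    return None


def in_bounding_set(mask, cap):
    """True if capability number cap is set in the bitmask."""
    return bool((mask >> cap) & 1)


def sys_boot_in_bounding_set(status_path="/proc/self/status"):
    """Check the calling process; None if the status file is unusable."""
    try:
        with open(status_path) as handle:
            mask = parse_capbnd(handle.read())
    except OSError:
        return None
    if mask is None:
        return None
    return in_bounding_set(mask, CAP_SYS_BOOT)


if __name__ == "__main__":
    print("CAP_SYS_BOOT in bounding set:", sys_boot_in_bounding_set())
```

Inside a container configured to drop the capability, the script would be expected to report False; on an unconfined host shell it would normally report True.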