LxcSecurity

Security issues and mitigations with lxc

Introduction

Lxc creates lightweight 'containers', mainly using kernel support for namespaces and control groups. Namespaces provide isolation (for instance, by not providing any name by which to reference a particular file), control groups impose various limits (for instance, refusing access to /dev/sda), and LSMs clamp down on permissions with a mandatory access control policy. POSIX capabilities, in particular the bounding set, can be used to refuse some privileges; however, this is less than ideal because most privileges are desirable when targeted at resources owned by the container. Finally, seccomp2 can refuse the container access to some kernel functionality (system calls).

However, containers will always (by design) share the same kernel as the host. Therefore, any vulnerability in a kernel interface can be exploited by the container to harm the host, unless the container is forbidden from using that interface (e.g. with seccomp2).

This information is aimed at the 12.04 (precise) release.
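
As a rough illustration of the capability bounding set mentioned above, the sketch below decodes the CapBnd mask from the text of a /proc/<pid>/status file and tests a single bit. CAP_SYS_BOOT (capability number 22 in linux/capability.h) is used as the example because it gates the reboot system call. The function names are hypothetical helpers, not part of lxc.

```python
# Sketch: decode the capability bounding set from /proc/<pid>/status.
# CAP_SYS_BOOT is capability number 22 in linux/capability.h.
CAP_SYS_BOOT = 22


def parse_cap_bnd(status_text):
    """Return the CapBnd bitmask from the text of a /proc/<pid>/status file."""
    for line in status_text.splitlines():
        if line.startswith("CapBnd:"):
            # The value is a hexadecimal bitmask, one bit per capability.
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError("no CapBnd line found")


def has_cap(mask, cap):
    """True if capability number `cap` is set in the bitmask."""
    return bool(mask & (1 << cap))


def can_reboot(status_path="/proc/self/status"):
    """True if CAP_SYS_BOOT is still in this process's bounding set."""
    with open(status_path) as f:
        return has_cap(parse_cap_bnd(f.read()), CAP_SYS_BOOT)
```

Run inside a container, can_reboot() gives a quick check of whether the capability was actually dropped. It says nothing about whether an LSM or seccomp policy would block the call by other means.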

Issues considered for 12.04

Below, top-level items are security concerns, and more deeply nested items are potential or actual mitigations.

  • for 12.04, procfs and sysfs will be mountable, but only under /proc and /sys
    • this is to make pathname-based restrictions on proc and sysfs files useful
  • specific proc files like /proc/sysrq-trigger (should the harmful ones be listed in a wiki page?)
    • User namespaces will make those files owned by a root user other than the container's root. However, user namespaces are unlikely to be complete for 12.04.
    • Apparmor
      • Can deny access to these files by pathname
      • To prevent the container bypassing this with 'mount --move', apparmor will, for 12.04, have a new rule to enforce mount locations
      • After 12.04, rules will be added to enforce pathnames relative to a particular fstype's mount root (e.g. type=procfs sysrq-trigger)
  • reboot/shutdown system call
    • A patch is being sent upstream so that a reboot issued inside a container is handled specially and does not affect the host.
    • The alternative is to remove CAP_SYS_BOOT from the container's capability bounding set.
  • Filesystems:
    • a guest can remount the host's devpts. Simply running 'mount -t devpts devpts /dev/pts' will overlay the guest's private (newinstance) devpts instance with the host's ('global') instance. This needs to be handled with either an LSM or user namespaces.
      • apparmor will deny devpts mounts to the container
        • the new devpts is mounted by lxc-start before init is executed
        • executing init is what triggers the transition into the profile that refuses devpts mounts
    • securityfs shouldn't be mountable by a guest
    • debugfs shouldn't be mountable by a guest
    • binfmt_misc shouldn't be mountable by a guest
  • apparmor profile in container
    • For 12.04, containers run in a restricted profile
    • After 12.04, containers will be in a new namespace where they can define their own (sub-)policies, fully restricted by the main policy
  • a container can do a superblock-level remount of /, which is usually shared with the host and other containers.
    • will "deny remount /" be possible?
  • a container can run 'udevadm trigger --action=add' and cause a udev storm, and device resets, on the host.
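
The mount restrictions listed above can be sketched as a small audit over text in the format of /proc/mounts. This is only an illustration of the intended policy, not how apparmor enforces it. The rule tables and function names are assumptions made for the example. devpts is deliberately left out, because a container legitimately has its own devpts instance and the filesystem type alone cannot distinguish it from the host's.

```python
# Sketch: flag mounts that the policy described above would disallow.
# The rule set here is illustrative, not AppArmor's actual behaviour.

# Filesystem types a guest should not be able to mount at all.
FORBIDDEN_FSTYPES = {"securityfs", "debugfs", "binfmt_misc"}

# Filesystem types allowed only under a fixed location.
PINNED_FSTYPES = {"proc": "/proc", "sysfs": "/sys"}


def is_under(path, root):
    """True if `path` is `root` itself or lies beneath it."""
    return path == root or path.startswith(root + "/")


def audit_mounts(mounts_text):
    """Return (mountpoint, fstype, reason) for each suspicious mount line."""
    findings = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        _device, mountpoint, fstype = fields[:3]
        if fstype in FORBIDDEN_FSTYPES:
            findings.append((mountpoint, fstype, "forbidden filesystem type"))
        elif fstype in PINNED_FSTYPES:
            root = PINNED_FSTYPES[fstype]
            if not is_under(mountpoint, root):
                findings.append((mountpoint, fstype, "mounted outside " + root))
    return findings
```

Inside a container this could be called as audit_mounts(open("/proc/mounts").read()). An empty result only means that none of these particular rules matched.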


LxcSecurity (last edited 2012-11-26 19:34:51 by serge-hallyn)