Docker is a great technology that simplifies the development and deployment of distributed applications. While building dockerized applications, security needs to be considered at several points. Because containers run without a hypervisor or a hardware-enforced virtualization boundary, Docker security is implemented entirely in software.
A Docker container in a nutshell has the following:-
- Kernel shared with host.
- Linux namespaces provide isolation with respect to processes (pid), IPC (ipc), mount points (mnt), network (net) and users (user).
- Linux Cgroups provide resource limiting and accounting in terms of CPU, Memory and IO bandwidth etc.
- Unique file system (rootfs) which is isolated from host and managed by Docker as layers. As an example: /etc/resolv.conf in a container would be totally unrelated and orthogonal to /etc/resolv.conf on the Docker host.
- All the libraries and configuration files needed for container application binaries are self-contained in the container file system.
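The namespace isolation listed above can be observed on any Linux host, because each process exposes its namespaces as symbolic links under /proc/<pid>/ns. The following is a minimal sketch; the function names list_namespaces and same_namespace are hypothetical helpers introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting Linux namespaces via /proc.

# Print "type<TAB>identifier" for every namespace link in a directory.
# Defaults to the namespaces of the current process.
list_namespaces() {
    ns_dir=${1:-/proc/self/ns}
    for link in "$ns_dir"/*; do
        [ -L "$link" ] || continue
        printf '%s\t%s\n' "$(basename "$link")" "$(readlink "$link")"
    done
}

# Succeed when two ns directories point at the same namespace of a given type.
# Usage: same_namespace DIR_A DIR_B TYPE   (TYPE is e.g. pid, net, mnt)
same_namespace() {
    [ "$(readlink "$1/$3")" = "$(readlink "$2/$3")" ]
}
```

For a running container, the host-side PID can be obtained with docker inspect --format '{{.State.Pid}}' <container>. Running same_namespace /proc/self/ns /proc/<that pid>/ns pid as root on the host would then be expected to fail, since the container normally has its own pid namespace.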
Let us look at the prominent security aspects of the kernel and of container applications, followed by generic aspects.
- Kernel Hardening:-
- Docker containers share the kernel with the host on which they are running; a container has no kernel of its own. So, the host kernel must be hardened in order to secure the containers. The running kernel should have CONFIG_SECURITY_SELINUX enabled, and SELinux should be running in enforcing mode.
- Another option is to use secure computing mode (seccomp), a mechanism to block system calls at kernel level. Docker 1.10 and later apply a default seccomp profile, where the default.json file lists the system calls that a container is allowed to make. Based on the application that is going to run inside the container, we should find the list of system calls the application makes and allow only those. The list of system calls made by the application can be gathered using the strace command. Based on that list, we should create another profile file in JSON format and start the container using that profile file.
e.g. docker run --rm -it --security-opt seccomp=custom_profile.json custom_app
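The strace-to-profile workflow described above can be sketched roughly as follows. This is an illustration under assumptions, not a definitive tool: the function names and the trace file name app.trace are hypothetical, and the JSON uses the "names" array form accepted by current Docker releases (older releases used a different layout).

```shell
#!/bin/sh
# Hypothetical helpers: derive a seccomp allow-list from an strace log.

# Print the unique syscall names found in an strace log file.
# Handles plain lines, "[pid N] " prefixes (strace -f) and "N " prefixes
# (strace -f -o FILE). Signal and exit marker lines are ignored.
syscalls_from_strace() {
    sed -n -E 's/^(\[pid +[0-9]+\] +|[0-9]+ +)?([a-z_][a-z0-9_]*)\(.*/\2/p' "$1" | sort -u
}

# Emit a profile that rejects every syscall except those seen in the log.
make_seccomp_profile() {
    names=$(syscalls_from_strace "$1" | sed 's/.*/"&"/' | paste -sd, -)
    printf '{\n  "defaultAction": "SCMP_ACT_ERRNO",\n  "syscalls": [\n'
    printf '    { "names": [%s], "action": "SCMP_ACT_ALLOW" }\n' "$names"
    printf '  ]\n}\n'
}
```

A possible use is strace -f -o app.trace ./custom_app followed by make_seccomp_profile app.trace > custom_profile.json. A single trace only covers the code paths that were exercised, and the container runtime needs some system calls of its own while starting the process, so a generated profile will likely have to be broadened after testing.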
- Application capabilities:-
Docker provides default security using the concept of capabilities. Capabilities are fine-grained privileges which can be added or dropped. Containers run with a limited set of capabilities, so even if someone breaks into the container, the host system is not fully compromised. A simple Docker container running the /bin/bash application has the following capabilities. Either getpcaps (from libcap) or pscap (provided by libcap-ng-utils) shows the list of capabilities held by a process.
[root@localhost ~]# docker run -i -t centos_cap /bin/bash
# 1 is the PID of bash inside the container.
[root@1bd42416b6d2 /]# getpcaps 1
Capabilities for `1': =
[root@1bd42416b6d2 /]# pscap -a
ppid  pid  name  command  capabilities
0     1    root  bash     chown, dac_override, fowner, fsetid, kill, setgid, setuid, setpcap, net_bind_service, net_raw, sys_chroot, mknod, audit_write, setfcap
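When getpcaps or pscap is not installed in the image, the same information is available as a hexadecimal bitmask in the CapEff field of /proc/<pid>/status. The sketch below decodes such a mask into names; decode_caps is a hypothetical helper, and the name table follows the bit order of the kernel's capability numbering (bit 0 is chown).

```shell
#!/bin/sh
# Hypothetical helper: decode a capability bitmask (hex) into names.
decode_caps() {
    hex=${1#0x}            # accept the mask with or without a 0x prefix
    mask=$((0x$hex))
    bit=0
    out=""
    # Names listed in kernel bit order, starting at bit 0.
    for name in chown dac_override dac_read_search fowner fsetid kill \
        setgid setuid setpcap linux_immutable net_bind_service \
        net_broadcast net_admin net_raw ipc_lock ipc_owner sys_module \
        sys_rawio sys_chroot sys_ptrace sys_pacct sys_admin sys_boot \
        sys_nice sys_resource sys_time sys_tty_config mknod lease \
        audit_write audit_control setfcap mac_override mac_admin syslog \
        wake_alarm block_suspend audit_read perfmon bpf checkpoint_restore
    do
        if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
            out="${out:+$out, }$name"
        fi
        bit=$((bit + 1))
    done
    printf '%s\n' "$out"
}
```

Inside a container this could be called as decode_caps "$(awk '/^CapEff/ {print $2}' /proc/1/status)". The mask 00000000a80425fb decodes to exactly the fourteen capabilities in the pscap listing above.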
So, there are multiple capabilities associated with an application by default. The application developer should find the capabilities required by the application and drop all others. A simple approach is to drop all capabilities and add back only the required ones when starting the Docker container. See example:-
e.g. docker run --rm -it --cap-drop=ALL --cap-add=NET_BIND_SERVICE custom_app
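When an application needs more than one capability, the flag list can be generated instead of typed by hand. The following is a small sketch; cap_flags is a hypothetical helper that always starts from --cap-drop=ALL and appends one --cap-add per argument, normalizing lower-case names and an optional CAP_ prefix.

```shell
#!/bin/sh
# Hypothetical helper: build "drop everything, add back only these" flags.
cap_flags() {
    flags="--cap-drop=ALL"
    for cap in "$@"; do
        # Upper-case the name and strip an optional CAP_ prefix.
        name=$(printf '%s' "$cap" | tr '[:lower:]' '[:upper:]')
        name=${name#CAP_}
        flags="$flags --cap-add=$name"
    done
    printf '%s\n' "$flags"
}
```

It could be used as docker run --rm -it $(cap_flags net_bind_service chown) custom_app, leaving the command substitution unquoted so that the shell splits it into separate arguments.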
- More security aspects:-
- Docker development starts with Docker images, which are downloaded from Docker Hub. It is recommended to get an image from a trusted source and then harden it further. Production deployments should not rely on generic OS distribution images.
- An application launched inside a Docker container runs with root user privileges unless the UID is changed with the -u option. So, it is advised to use the "-u" option with the docker run command.
- Never run any Docker application as the root user. Use user namespaces so that root inside a container maps to an unprivileged user on the host. This also matters because data volumes written by the user running inside the container need to be accessible from outside the container. Think of a container running a database application.
- In case a service like an SSH server needs to be run as root, run it inside a bastion host or a VM as an isolated service.
- Linux exposes hardware devices via the /dev file system. Access to /dev and the "devices" control group should be fine-tuned for the container based on the application requirement. Docker already restricts the /proc and /sys file systems by default (sensitive paths are masked or mounted read-only), so these rarely need extra work.
- Do not run applications with SUID binaries inside containers; use capabilities instead.
- Check the implications before exposing network ports from Docker containers. In case a port needs to be blocked at runtime, iptables rules can be used at host level.
- There are various options for resource limiting while spawning a container. These options are parameters passed to docker run command.
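- docker run --memory=200M --memory-swap=400M custom_app # Runs the container with a 200M memory limit (an illustrative value) and a combined memory-plus-swap limit of 400M; --memory-swap takes effect only when --memory is also set.
- docker run --cpuset-cpus=0 custom_app # Runs the container only on logical CPU 0 (older Docker releases spelled this flag --cpuset).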
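Several of the guidelines above (a non-root user, memory limits, CPU pinning) can be combined in one command line. The sketch below only prints the command rather than running it. limited_run_cmd is a hypothetical helper, and the default limits of 200M, 400M and CPU 0 are illustrative assumptions rather than recommendations.

```shell
#!/bin/sh
# Hypothetical helper: print a docker run command line with a non-root
# user and resource limits.
# Usage: limited_run_cmd IMAGE UID GID [MEMORY] [MEMORY_PLUS_SWAP] [CPUS]
limited_run_cmd() {
    image=$1
    uid=$2
    gid=$3
    mem=${4:-200M}          # illustrative default
    mem_swap=${5:-400M}     # total of memory plus swap, illustrative default
    cpus=${6:-0}            # logical CPU list, illustrative default

    # Refuse to build a command that would run the application as root.
    if [ "$uid" -eq 0 ]; then
        echo "refusing to run as UID 0" >&2
        return 1
    fi

    printf 'docker run --rm --user=%s:%s --memory=%s --memory-swap=%s --cpuset-cpus=%s %s\n' \
        "$uid" "$gid" "$mem" "$mem_swap" "$cpus" "$image"
}
```

Calling limited_run_cmd custom_app "$(id -u)" "$(id -g)" as a regular user prints a command that runs the container with the caller's UID and GID, which also keeps files written to bind-mounted volumes owned by that user on the host.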
Following such guidelines lets Docker containers run in a relatively safe environment. This effectively enhances application security and makes the application less likely to be compromised.