microVM Containers: The best of both worlds

Minimal design of virtual machines integrated with Containers and why

Created by Alejandro Escanero Blanco / Twitter: @aescanero Documentation and Demo in https://github.com/aescanero/dockerevents/opensouthcode2019 Slides in Disasterproject

Containers

What is a container?

It is a process that runs isolated, with its own view of memory, CPU, I/O and network

In Linux, two kernel features are used for this: namespaces (what the process can see) and cgroups (how much it can use)

There are many implementations on the market: Docker, LXC ...
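The two kernel features can be observed from any Linux process. The following is a minimal sketch, assuming a Linux host with /proc mounted; the function name `describe_isolation` is made up for this example. It lists the namespaces and cgroups the current process belongs to.

```python
import os
from pathlib import Path


def describe_isolation(pid="self"):
    """Return the namespaces and cgroup membership of a process.

    Reads /proc/<pid>/ns and /proc/<pid>/cgroup, which only exist on Linux;
    on other systems both results are simply empty.
    """
    proc = Path("/proc") / str(pid)

    namespaces = {}
    ns_dir = proc / "ns"
    if ns_dir.is_dir():
        for entry in sorted(ns_dir.iterdir()):
            try:
                # Each entry is a symlink whose target identifies the namespace.
                namespaces[entry.name] = os.readlink(entry)
            except OSError:
                # Some entries may be unreadable without privileges.
                namespaces[entry.name] = None

    cgroups = []
    cgroup_file = proc / "cgroup"
    if cgroup_file.is_file():
        cgroups = cgroup_file.read_text().splitlines()

    return {"namespaces": namespaces, "cgroups": cgroups}


if __name__ == "__main__":
    info = describe_isolation()
    for name, target in info["namespaces"].items():
        print(f"namespace {name}: {target}")
    for line in info["cgroups"]:
        print(f"cgroup {line}")
```

Running it on the host and again inside a container should show different namespace identifiers, which is the isolation described above.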

Source: What even is a container: namespaces and cgroups

Source: Cgroups, namespaces, and beyond: what are containers made from?

Source: Twitter of Sergio López (@slpnix)

How does Docker run a container?

A client (e.g. the Docker CLI) calls the Dockerd service using the Docker API

Dockerd handles the API, orchestration and communications, delegating container life cycle functions to Containerd

Containerd manages volumes, networking, images and the container life cycle

Finally, runC executes the container
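The first hop of this chain (client to Dockerd) is plain HTTP over a Unix socket. The sketch below assumes a Linux host and the default socket path /var/run/docker.sock; `UnixHTTPConnection` and `docker_version` are names invented for this example. It calls the `/version` endpoint of the Docker API and returns None when the daemon is not reachable.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=2.0):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def docker_version(socket_path="/var/run/docker.sock"):
    """Ask dockerd for its version, or return None if it is unreachable."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", "/version")
        response = conn.getresponse()
        if response.status != 200:
            return None
        return json.loads(response.read())
    except (OSError, http.client.HTTPException, ValueError):
        # Missing socket, permission denied, or a malformed reply.
        return None
    finally:
        conn.close()


if __name__ == "__main__":
    version = docker_version()
    if version is None:
        print("dockerd is not reachable on the default socket")
    else:
        print("Engine:", version.get("Version"), "API:", version.get("ApiVersion"))
```

This is the same kind of request the Docker CLI sends; everything after it (Containerd, runC) happens on the daemon side.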

Source: Docker components explained

How does Kubernetes run a container?

A client calls the ApiServer service using the Kubernetes API

Kubernetes decides which node the Pod will run on and calls the agent (Kubelet) running on that node

The Kubelet uses the Container Runtime Interface to communicate with the container lifecycle manager
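The node decision can be pictured as "filter, then score". The following is a deliberately simplified sketch and not the real Kubernetes scheduler: the `Node` and `PodRequest` classes, the `pick_node` function and the scoring rule (most free CPU left) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu_millis: int
    free_memory_mib: int


@dataclass
class PodRequest:
    name: str
    cpu_millis: int
    memory_mib: int


def pick_node(pod, nodes):
    """Filter the nodes that can fit the Pod, then keep the best-scoring one."""
    feasible = [
        node for node in nodes
        if node.free_cpu_millis >= pod.cpu_millis
        and node.free_memory_mib >= pod.memory_mib
    ]
    if not feasible:
        return None  # no node fits; the Pod would stay unscheduled
    # Toy score: prefer the node with the most CPU left after placement.
    return max(feasible, key=lambda node: node.free_cpu_millis - pod.cpu_millis)


if __name__ == "__main__":
    nodes = [Node("node-a", 500, 1024), Node("node-b", 2000, 4096)]
    chosen = pick_node(PodRequest("web", 1000, 512), nodes)
    print(chosen.name if chosen else "unschedulable")
```

Once a node is chosen, the Kubelet on that node takes over and talks to the runtime through CRI.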

How does Kubernetes run a container? (2/2)

Containerd Shim v2 is a translation layer between CRI and Containerd; there is now also CRI-Containerd, which supports CRI directly

There are other CRI-compatible implementations, such as CRI-O, used in OpenShift

Finally, runC executes the container

Source: Containerd Plugin for Kubernetes Container Runtime Interface

Source: Open Container Initiative-based implementation of Kubernetes Container Runtime Interface

Container standards to be considered for MicroVMs

CRI (Container Runtime Interface): A unified Kubelet access interface to the "Container Runtime"

CRI uses two other technologies: gRPC, a project incubated in the CNCF (Cloud Native Computing Foundation), to connect the services, and Protobuf for data serialization
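CRI groups its gRPC methods into a RuntimeService and an ImageService. The sketch below makes no gRPC calls; it only lists, in order, the RPCs a Kubelet would typically issue to start a Pod. The function name `cri_calls_for_pod` and the tuple layout are assumptions for illustration.

```python
def cri_calls_for_pod(pod_name, images):
    """List the CRI RPCs needed to start a Pod, in the order they are issued.

    Each item is (service, rpc, argument). The sandbox is created once per
    Pod; every container image is then pulled, created and started.
    """
    calls = [("RuntimeService", "RunPodSandbox", pod_name)]
    for image in images:
        calls.append(("ImageService", "PullImage", image))
        calls.append(("RuntimeService", "CreateContainer", image))
        calls.append(("RuntimeService", "StartContainer", image))
    return calls


if __name__ == "__main__":
    for service, rpc, argument in cri_calls_for_pod("web", ["nginx", "sidecar"]):
        print(f"{service}.{rpc}({argument})")
```

Any runtime that answers these RPCs can sit behind the Kubelet, which is what lets a MicroVM runtime replace a plain container runtime.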

Source: Container Runtime Interface (CRI)

Source: High performance, open-source universal RPC framework

Source: Language-neutral, platform-neutral extensible mechanism for serializing structured data

Container standards to be considered for MicroVMs (2/2)

Container execution standards are managed by the Open Container Initiative, a Linux Foundation project that maintains two specifications

OCI runtime specification: Defines the configuration, execution environment and lifecycle of a container

OCI image format specification: Defines the configuration, layers and manifest of an image

The objective of both standards is that any tool that works with containers can use the images generated by any other
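In the runtime specification, the container is described by a `config.json` file inside a bundle directory. The following sketch builds a reduced version of that file; the helper name `minimal_oci_config`, the default values and the chosen set of namespaces are assumptions, and a configuration generated by a real tool contains more fields (mounts, capabilities and so on).

```python
import json


def minimal_oci_config(args, rootfs="rootfs", hostname="demo"):
    """Build a reduced OCI runtime configuration as a Python dict."""
    return {
        "ociVersion": "1.0.2",
        "process": {
            "terminal": False,
            "user": {"uid": 0, "gid": 0},
            "args": list(args),  # command executed inside the container
            "env": ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"],
            "cwd": "/",
        },
        # Path of the root filesystem, relative to the bundle directory.
        "root": {"path": rootfs, "readonly": True},
        "hostname": hostname,
        "linux": {
            # Namespaces the runtime must create for the container.
            "namespaces": [
                {"type": "pid"},
                {"type": "network"},
                {"type": "ipc"},
                {"type": "uts"},
                {"type": "mount"},
            ]
        },
    }


if __name__ == "__main__":
    print(json.dumps(minimal_oci_config(["sh", "-c", "echo hello"]), indent=2))
```

Because the format is standardized, the same bundle can be handed to runC or to a MicroVM-based runtime that implements the specification.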

Source: Open Container Initiative

Source: Configuration, execution environment, and lifecycle of a container

Source: Image Format Specification

Virtual machines

What is a virtual machine?

In computing, virtualization is the creation, through software, of a virtual version of some technological resource

A hypervisor is the manager and monitor of such virtualized resources

A virtual machine is a virtualized resource consisting of virtual hardware and an operating system

Source: Virtualization

Performance and Security

Why should we use containers?

The market tends to use containers for their shorter deployment time, because they don't need the isolation layer and OS of a VM

They do not require intervention from the infrastructure department, although configuration management tools already reduce that intervention

There is no need to log in to the OS to install or prepare anything

Why should we use virtual machines?

A malicious attacker can use a vulnerability in runC and/or the kernel to access information held in other containers

For large service providers this is a big problem because they must ensure that the information of any company is isolated from the rest

Therefore they need systems that improve security by isolation

STRIDE threat model

Each threat is listed with the security property it violates

Spoofing identity (Authenticity): A malicious attacker could pose as an authorized user of the system

Tampering with data (Integrity): A malicious attacker could add, modify or delete information

Repudiation (Non-repudiation): A malicious attacker could erase the traces of the attack or make it impossible to prove

Information disclosure (Confidentiality): A malicious attacker could access privileged information

Denial of service (Availability): A malicious attacker could make the service unavailable

Elevation of privilege (Authorization): A malicious attacker could escalate their privileges

Options to improve container security

There are basically two proposals to reduce the security problem of containers

The first is based on reducing the vulnerable surface of containers

Filtering system calls made by containers with solutions like gVisor

Or by applying Linux kernel security modules such as SELinux
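Whether this kind of protection is active can be checked from inside a process. The sketch below is only an inspection aid, assuming a Linux host: it reads the `Seccomp` field of /proc/self/status (seccomp is the kernel's own system call filter, a related mechanism that is not gVisor itself) and the SELinux enforcing flag. The function names are invented for this example.

```python
from pathlib import Path


def seccomp_mode(status_path="/proc/self/status"):
    """Return the seccomp mode of the process, or None if it cannot be read.

    0 means disabled, 1 strict mode and 2 filter mode.
    """
    try:
        for line in Path(status_path).read_text().splitlines():
            if line.startswith("Seccomp:"):
                return int(line.split(":", 1)[1].strip())
    except (OSError, ValueError):
        pass
    return None


def selinux_enforcing(enforce_path="/sys/fs/selinux/enforce"):
    """Return True or False for SELinux enforcing mode, None if SELinux is absent."""
    try:
        return Path(enforce_path).read_text().strip() == "1"
    except OSError:
        return None


if __name__ == "__main__":
    print("seccomp mode:", seccomp_mode())
    print("SELinux enforcing:", selinux_enforcing())
```

Running it on the host and inside a container gives a quick view of how much of the first proposal is already applied.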

However, we are going to talk about a second option: Strong isolation with MicroVMs

Source: OWASP Container Security Verification Standard

Source: How to stop worrying about Application Container Security

Open source MicroVM solutions

microVMs and UniKernels

Unikernels are specialized kernels that include only the minimum libraries needed to perform the function they were built for

Unikernels have existed for more than five years, for example MirageOS (based on OCaml) and UniK (which, despite being experimental, is very active)

If we create a unikernel whose function is to run the virtualization drivers, an agent for the hypervisor or manager, and the ability to start a container, we obtain a MicroVM

MicroVMs are specialized rather than general purpose, so their drivers and libraries are reduced to fit the hypervisor they will run on, which significantly reduces their size and deployment time

Source: Unikernels: The Rise of the Virtual Library Operating System

Source: MirageOS is a library operating system that constructs unikernels

Source: The Unikernel & MicroVM Compilation and Deployment Platform

Source: Unik Slides

Kata Containers

Kata Containers comes from the union of two products: Intel Clear Containers and Hyper.sh

Currently Kata Containers is a project within the OpenStack Foundation

Kata Containers is a "Container Runtime" compatible with the OCI runtime specification, so it can work with Docker, or with Kubernetes via CRI using CRI-O or CRI-Containerd

Kata Containers creates a virtual machine with QEMU / KVM for each container or pod

For best results, Kata Containers develops a reduced QEMU version called qemu-lite, which has a minimal hypervisor layer adjusted to the kernel
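On Kubernetes, one common way to select such a runtime per Pod is a RuntimeClass object. The sketch below generates both manifests as JSON; the handler name "kata" is an assumption that must match whatever name the CRI implementation on the node was configured with, and the helper names are made up for this example.

```python
import json


def kata_runtime_class(name="kata", handler="kata"):
    """RuntimeClass that maps a name to a runtime handler configured in the CRI."""
    return {
        "apiVersion": "node.k8s.io/v1",
        "kind": "RuntimeClass",
        "metadata": {"name": name},
        "handler": handler,  # assumed handler name; depends on node configuration
    }


def pod_using_runtime_class(pod_name, image, runtime_class="kata"):
    """Pod that asks to be run by the given RuntimeClass."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "runtimeClassName": runtime_class,
            "containers": [{"name": pod_name, "image": image}],
        },
    }


if __name__ == "__main__":
    print(json.dumps(kata_runtime_class(), indent=2))
    print(json.dumps(pod_using_runtime_class("demo", "nginx"), indent=2))
```

Pods without `runtimeClassName` keep using the default runtime, so isolated and regular workloads can share the same cluster.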

Source: Intel® Clear Containers: Now part of Kata Containers

Source: Kata Containers

Kata Containers (2/2)

Kata Containers passes the hypervisor a kernel to boot the VM with the minimum services needed to run a container

The hypervisor starts a VM with a minimal OS image using the indicated kernel

Systemd starts kata-agent, which creates a new context in which to execute the command

For this, kata-agent makes use of libcontainer (it works as if it were runC)

Kata Containers integrates with OpenStack through the OpenStack Zun framework, which presents containers as OpenStack VMs and takes advantage of platform features such as SDS (Software Defined Storage) and SDN (Software Defined Network), among many others

Source: Kata Containers Architecture

Source: A Go package for building hardware virtualized container runtimes

Source: Kata Containers guest OS building scripts

FireCracker

Firecracker is an Amazon AWS project recently released as free software for the creation of secure containers in multitenant environments

Firecracker is a minimal hypervisor that works over KVM (qemu-lite style) used in AWS solutions: Fargate (CaaS) and Lambda (FaaS)

Its device model is even smaller than that of qemu-lite. Its own minimal OS image uses vsock for firecracker-agent communication and uses runC to start the containers
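Firecracker is driven through a small REST API exposed on a Unix socket. The sketch below only builds the sequence of requests usually needed to boot a MicroVM and does not send them; the kernel and root filesystem paths are placeholders, and the helper name `firecracker_boot_requests` and default sizes are assumptions for illustration.

```python
import json


def firecracker_boot_requests(kernel_path, rootfs_path, vcpus=1, mem_mib=128):
    """Return (method, path, body) tuples that configure and start a MicroVM."""
    return [
        ("PUT", "/boot-source", {
            "kernel_image_path": kernel_path,
            "boot_args": "console=ttyS0 reboot=k panic=1 pci=off",
        }),
        ("PUT", "/drives/rootfs", {
            "drive_id": "rootfs",
            "path_on_host": rootfs_path,
            "is_root_device": True,
            "is_read_only": False,
        }),
        ("PUT", "/machine-config", {"vcpu_count": vcpus, "mem_size_mib": mem_mib}),
        # The VM only boots when the start action is sent.
        ("PUT", "/actions", {"action_type": "InstanceStart"}),
    ]


if __name__ == "__main__":
    # Placeholder paths; replace with a real kernel and root filesystem image.
    for method, path, body in firecracker_boot_requests("vmlinux.bin", "rootfs.ext4"):
        print(method, path, json.dumps(body))
```

The small number of endpoints reflects the reduced device model: a kernel, a block device and a machine size are enough to boot.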

Source: Firecracker

Source: virtio-vsock

VMware VIC

vSphere Integrated Containers is a bundle of several open source components under the Apache 2 license. These components are integrated into a single product (vic-product)

It is a Docker-compatible engine (client version 1.13 and API 1.25) integrated directly into vSphere hypervisors, creating containers as virtual machines

This lets you use the storage resources (datastores, vSAN) and network resources (dVswitch, NSX) of the VMware platform

Source: vSphere Integrated Containers Engine

VMware VIC Elements

The solution integrates a Container Registry (Harbor)

A multitenant Portal (Admiral)

A connector to manage VCHs from vCenter (vSphere plugin)

Endpoints that speak Docker (VCH, Virtual Container Host), or virtual machines (Photon OS) with Docker pre-installed

Source: Harbor

Source: Admiral

Source: Photon OS
