Experimental — Hotcell is under active development and should not be used in production.

Why VMs, Not Containers?

Containers share the host kernel. That's fine for trusted workloads. For untrusted code, you need something stronger. This page explains the isolation spectrum and where different tools sit on it.

The Problem

Containers are not sandboxes

Docker containers use Linux namespaces and cgroups to isolate processes. But every container on a host shares the same kernel. A kernel vulnerability, a misconfigured capability, or a container runtime bug can let code escape and access the host.

For your own microservices, this is an acceptable trade-off. For running other people's code — user submissions, AI agent tool calls, CI jobs, plugins — the shared kernel is the problem.
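The shared-kernel point is easy to verify empirically: the kernel release reported inside any container on a host is identical to the host's, because there is only one kernel. A minimal Rust sketch (nothing hotcell-specific — it just shells out to the standard `uname` utility):

```rust
use std::process::Command;

/// Returns the running kernel's release string, e.g. "6.8.0-45-generic".
/// Run this binary on the host and then inside any container on that
/// host: the output is the same, because containers share the kernel.
fn kernel_release() -> String {
    let out = Command::new("uname")
        .arg("-r")
        .output()
        .expect("failed to run uname");
    String::from_utf8_lossy(&out.stdout).trim().to_string()
}

fn main() {
    println!("kernel release: {}", kernel_release());
}
```

Inside a VM, by contrast, the guest boots its own kernel, so the same probe reports a different release than the host.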

The escape history

  • CVE-2019-5736 — runc allowed container escape via /proc/self/exe overwrite
  • CVE-2020-15257 — containerd shim API exposed to host-network containers via abstract Unix sockets, enabling privilege escalation
  • CVE-2022-0185 — Linux kernel heap overflow reachable from unprivileged containers
  • CVE-2024-21626 — runc working-directory container breakout via leaked file descriptors

These vulnerabilities affect any container runtime that shares the host kernel.

The Isolation Spectrum

There's no single "right" isolation level. Every approach trades off between security, performance, and compatibility. The question is which trade-off fits your threat model.

Weakest

Docker / OCI

Namespaces and cgroups. Shared kernel. Fast, compatible, well-understood. No protection against kernel bugs.


gVisor

User-space kernel intercepts syscalls. Reduces kernel attack surface but still runs on the host kernel. Some syscall compatibility gaps.


MicroVMs

Purpose-built VMMs (Firecracker, Cloud Hypervisor, libkrun). Minimal device model, fast boot. VM isolation without the VM overhead. This is where hotcell sits.

Strongest

Full VMs

QEMU/KVM with full device emulation. Maximum isolation. Seconds-to-minutes boot, heavy memory footprint. Overkill for ephemeral workloads.

Weaker isolation, faster  ←  Docker — gVisor — MicroVMs — Full VMs  →  Stronger isolation, heavier

Approaches in Detail

gVisor

Google

gVisor intercepts application syscalls in user space, reimplementing a Linux-compatible kernel (Sentry) that handles most operations without touching the host kernel. This dramatically reduces the kernel attack surface — the application never makes direct kernel syscalls.

Strengths

  • + Lightweight, no hardware virtualization needed
  • + Drop-in replacement for Docker runtime
  • + Reduces host kernel exposure significantly

Limitations

  • - Still runs on the host kernel (some syscalls pass through)
  • - Syscall compatibility gaps can break applications
  • - Performance overhead for syscall-heavy workloads

Kata Containers

OpenInfra Foundation

Kata wraps each container in its own lightweight VM, providing hardware-level isolation while presenting a standard OCI container interface. It integrates with Kubernetes via containerd and CRI-O. Kata uses the same VMMs as hotcell (Firecracker, Cloud Hypervisor, QEMU) but adds a CRI shim for Kubernetes compatibility.

Strengths

  • + True VM isolation with separate kernel
  • + Kubernetes-native, CRI-compatible
  • + Mature, production-proven at scale

Limitations

  • - Complex deployment (agent, shim, hypervisor, kernel)
  • - Kubernetes-centric — hard to use outside K8s
  • - Heavier resource footprint per container

E2B

Commercial / Cloud

E2B provides cloud-hosted sandboxes purpose-built for AI agents. You call their API, they boot a Firecracker microVM, your agent runs code in it. Designed for the "let the LLM run code" use case.

Strengths

  • + Zero infrastructure to manage
  • + Purpose-built for AI agent tool use
  • + SDKs in Python, TypeScript, etc.

Limitations

  • - Cloud-only — code leaves your network
  • - Metered by compute-second
  • - Closed-source isolation layer

Modal

Commercial / Cloud

Modal is a serverless compute platform for running Python functions in the cloud. Define a function, Modal runs it in a sandboxed container on their infrastructure. Focused on ML/AI workloads — training, inference, batch jobs, and increasingly agent tool execution.

Strengths

  • + GPU support, excellent for ML workloads
  • + Great Python developer experience
  • + Scales to zero, pay-per-use

Limitations

  • - Cloud-only — code leaves your network
  • - Metered by compute-second
  • - Container-based isolation (not VM-level)

SlicerVM

Commercial / Self-hosted

SlicerVM provides lightweight Linux VMs that boot in under a second, backed by Firecracker. It's a VM management platform — create, run, and manage persistent or ephemeral VMs via CLI, REST API, or Go SDK. From the team behind OpenFaaS and Actuated.

Strengths

  • + Production-proven (3M+ CI minutes for CNCF)
  • + Self-hosted, data stays on your network
  • + Full OS experience (systemd, SSH, GPU passthrough)

Limitations

  • - Proprietary, commercial license ($25–250/mo)
  • - Requires a guest agent inside VMs
  • - Platform, not an embeddable library

Comparison

| | Hotcell | Docker | gVisor | Kata | E2B | Modal | SlicerVM |
|---|---|---|---|---|---|---|---|
| Isolation | VM (4 backends) | Namespaces | User-space kernel | VM | VM (Firecracker) | Container | VM (Firecracker) |
| Own kernel | Yes | No | Partial | Yes | Yes | No | Yes |
| Host sandboxing | 20-layer jail | Basic | Moderate | Varies | Managed | Managed | Firecracker jailer |
| Open source | Yes (MIT) | Yes | Yes (Apache) | Yes (Apache) | No | No | No |
| Self-hosted | Yes | Yes | Yes | Yes | No | No | Yes |
| Embeddable | Rust library | No | No | No | SDK only | SDK only | No |
| macOS | Yes (native) | Via Docker Desktop | No | No | Cloud only | Cloud only | Yes |
| OCI images | Yes (any registry) | Yes | Yes | Yes | Custom | Dockerfile | Dockerfile |
| Ephemeral tasks | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Persistent VMs | Yes | N/A | N/A | N/A | Yes | No | Yes |
| VM migration | Warm (FC, CH) | N/A | N/A | N/A | No | No | No |
| Boot time | <350ms | ~100ms | ~100ms | ~1–2s | ~200ms | ~1s | <1s |
| GPU | Yes (VFIO + CUDA verified) | Yes | Limited | Limited | No | Yes | Yes |
| Cost | Free (MIT) | Free | Free | Free | Per-second | Per-second | $25–250/mo |

Where Hotcell Fits

Hotcell sits in the microVM tier of the isolation spectrum. It gives you VM-level isolation — separate kernel, separate memory, separate process tree — with <350ms end-to-end execution (boot, run, collect results) and standard OCI image support. Use it for ephemeral one-shot execution or persistent long-lived services with automatic port forwarding.

What makes hotcell different

It's a library, not a platform

Add hotcell as a Rust dependency and call backend.run(). No daemon, no sidecar, no Kubernetes. You embed it in your application.
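This page documents no API surface beyond `backend.run()`, so as an illustration only, the embed-as-a-library pattern looks roughly like the sketch below. `Backend`, `RunSpec`, `RunResult`, and `MockBackend` are hypothetical stand-in names, not hotcell's real types:

```rust
// Hypothetical sketch of embedding a sandbox backend as a plain library
// value. All type names here are illustrative, not hotcell's real API.

struct RunSpec {
    image: String,         // OCI image reference to boot
    command: Vec<String>,  // command to run inside the guest
}

struct RunResult {
    exit_code: i32,    // structured result, not a log stream
    stdout: String,
}

trait Backend {
    fn run(&self, spec: &RunSpec) -> RunResult;
}

// A trivial in-process mock standing in for a microVM backend.
struct MockBackend;

impl Backend for MockBackend {
    fn run(&self, spec: &RunSpec) -> RunResult {
        RunResult {
            exit_code: 0,
            stdout: format!("ran {:?} in {}", spec.command, spec.image),
        }
    }
}

fn main() {
    // No daemon, no sidecar: the backend is just a value in your program.
    let backend = MockBackend;
    let result = backend.run(&RunSpec {
        image: "alpine:latest".into(),
        command: vec!["echo".into(), "hello".into()],
    });
    assert_eq!(result.exit_code, 0);
    println!("{}", result.stdout);
}
```

The design point the sketch makes is that isolation lives behind an ordinary trait call in your process, rather than behind a long-running service you deploy and talk to over a socket.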

Defense-in-depth, not just a VM

The VM is the first boundary. On Linux, the VMM process itself is jailed with 20 hardening layers: syscall filtering, filesystem restrictions, capability dropping, resource limits. If someone escapes the VM, they land in a sandbox.

Open and auditable

MIT-licensed. Every syscall in the allowlist is documented. Every hardening layer is in the source code. 37 security tests (18 adversarial escape + 19 isolation verification) verify the sandbox holds.

Ephemeral and persistent

Run one-shot commands that return structured results, or create persistent VMs that run long-lived services with automatic port forwarding. Both modes share the same security model and API.
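One way to picture "both modes share the same security model and API" is a single configuration type whose mode field selects ephemeral or persistent behavior. The sketch below is purely illustrative; `Mode`, `VmConfig`, and `describe` are hypothetical names, not hotcell's real API:

```rust
// Illustrative only: one config type covering both execution modes the
// page describes. Names are hypothetical, not hotcell's real types.

enum Mode {
    /// Boot, run one command, collect structured results, tear down.
    Ephemeral,
    /// Keep the VM alive and forward guest ports to the host.
    Persistent { forwarded_ports: Vec<(u16, u16)> }, // (host, guest)
}

struct VmConfig {
    image: String,
    mode: Mode,
}

fn describe(cfg: &VmConfig) -> String {
    match &cfg.mode {
        Mode::Ephemeral => {
            format!("{}: one-shot, destroyed after the run", cfg.image)
        }
        Mode::Persistent { forwarded_ports } => format!(
            "{}: long-lived, forwarding {} port(s)",
            cfg.image,
            forwarded_ports.len()
        ),
    }
}

fn main() {
    // A long-lived service: guest port 80 exposed on host port 8080.
    let svc = VmConfig {
        image: "nginx:alpine".into(),
        mode: Mode::Persistent { forwarded_ports: vec![(8080, 80)] },
    };
    println!("{}", describe(&svc));
}
```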

Your hardware, your data

Runs on your machines. No cloud dependency, no metered billing, no data leaving your network. Works on macOS for development and Linux for production.

When hotcell is not the right choice

  • You need GPU on macOS — GPU passthrough requires Linux with IOMMU. On Linux, hotcell supports VFIO via the QEMU and Cloud Hypervisor backends.
  • You want zero infrastructure — E2B or Modal handle everything. Hotcell requires you to run and manage the host.
  • You need full OS management with SSH and systemd — hotcell persistent VMs run services with port forwarding, but SlicerVM offers a full Linux experience with SSH access, secret injection, and OS-level management.
  • You're already on Kubernetes — Kata Containers integrates natively with the container ecosystem. Hotcell is standalone.
  • You need production guarantees today — hotcell is experimental (v0.1.0). It has not been independently audited.

References

Boot time and performance claims on this page are sourced from published benchmarks, official documentation, and independent testing. Where possible, we cite the original measurement.

Docker / OCI

  • [1] Felter et al., “An Updated Performance Comparison of Virtual Machines and Linux Containers,” IBM Research, IEEE ISPASS 2015. Measured container startup <500ms vs ~30s for VMs.
  • [2] AWS Firecracker benchmarks. Established runc warm start at ~100–200ms for minimal workloads.

gVisor

  • [3] gVisor project, startup.csv. runsc (1144ms) vs runc (1193ms) for alpine+true — difference is within noise, Docker overhead dominates.
  • [4] Young et al., “The True Cost of Containing: A gVisor Case Study,” USENIX HotCloud 2019. gVisor overhead is in syscall interception (2.2x+), not startup.

Kata Containers

  • [5] Kata Containers issue #1102. Default QEMU: 2–4s. Tuned QEMU+virtio-fs: ~483ms.
  • [6] Kumar & Thangaraju, IEEE CONECCT 2020. Kata avg 2.06s vs runc avg 1.62s across repeated trials.
  • [7] Li et al., “RunD: A Lightweight Secure Container Runtime,” USENIX ATC 2022. Alibaba’s optimized Kata fork achieves 88ms boot, 200+ sandboxes/sec.

E2B

Modal

SlicerVM

Firecracker / VMM Baselines

  • [13] Agache et al., “Firecracker: Lightweight Virtualization for Serverless Applications,” USENIX NSDI 2020. VMM boot <125ms to /sbin/init; <5 MiB per-VM memory overhead.
  • [14] Cloud Hypervisor issue #1728. Measured VMM init ~43ms, kernel boot ~48ms, to userspace ~131ms.