# Agent Sandboxing

Safely execute LLM-generated code on OpenShift with defense-in-depth isolation.

When agents execute LLM-generated code (tool calls, data analysis, or code-interpreter tasks), that code runs with the same permissions as the agent process. Without isolation, a single prompt injection can read mounted secrets, exfiltrate data over the network, or escalate privileges on the host.

The code sandbox solves this by wrapping execution in multiple independent security layers. Each layer assumes the one above it has been bypassed. No single layer is sufficient; together they make exploitation impractical even when the attacker controls the input source code.

> The sandbox implementation is available at
> [github.com/eformat/code-sandbox](https://github.com/eformat/code-sandbox).
> The repo contains the Landlock, seccomp, and guardrails code shown on this
> page. The [fips-agents/examples](https://github.com/fips-agents/examples)
> repo walks through deploying it on OpenShift step by step.

## Defense in Depth

The sandbox uses five layers of defense, each operating at a different level of the stack:

| Layer | Technology | What it blocks |
| --- | --- | --- |
| 1. Static analysis | AST visitor | Dangerous calls, blocked imports, dunder traversal, SQL injection |
| 2. Isolated subprocess | `python3 -I` + preamble | Runtime import bypass, builtin abuse, memory exhaustion |
| 3. Filesystem restriction | Landlock LSM | Reading app source, secrets, config files outside `/tmp` |
| 4. Syscall filtering | seccomp BPF | Network sockets (TCP + UDP), io_uring, splice |
| 5. Cluster enforcement | NetworkPolicy + SeccompProfile + SCC | All egress traffic, container escape, privilege escalation |

Layers 1-4 are applied in-process by the sandbox application: AST analysis, Python subprocess isolation, Landlock filesystem restriction, and seccomp syscall filtering. Layer 5 uses OpenShift platform features (NetworkPolicy, SeccompProfile via the Security Profiles Operator, and a custom SCC) to enforce a final security boundary that the sandbox cannot bypass, even if fully compromised. The application-level and platform-level controls work together: if an attacker defeats the in-process guardrails, the cluster-enforced policies still block egress traffic, prevent privilege escalation, and deny container escape.

## Agent Frameworks

The sandbox supports three deployment modes:

| Mode | When to use | `SANDBOX_URL` |
| --- | --- | --- |
| Standalone service (recommended) | Production on OpenShift: own Deployment + Service with NetworkPolicy and independent scaling | `http://code-sandbox..svc:8000` |
| Sidecar | Simpler networking: sandbox shares the pod network with the agent, no cross-pod traffic or client labels needed | `http://localhost:8000` (default) |
| Local container | Development: run the sandbox image on your workstation | `http://localhost:8000` (default) |

In all modes your agent sends code to `POST /execute` and gets back stdout, stderr, and exit code. Each framework wraps this in a `run_code` tool that the LLM calls to execute code safely. The available imports depend on the sandbox profile: `minimal` for stdlib only, or `data-science` for numpy, pandas, and scipy. Pick your framework below to jump to a working example.

### Environment variables

Set `SANDBOX_URL` to point at your sandbox. For standalone service mode, use the in-cluster service URL (e.g. `http://code-sandbox..svc:8000`). For local development or sidecar mode the default `http://localhost:8000` works without changes.

{envVarsHighlighted}

### LangGraph

[LangGraph](https://langchain-ai.github.io/langgraph/) wraps the sandbox call as a plain Python function registered with `create_react_agent`. The LLM generates Python code, calls `run_code`, and summarizes the output.

run_code tool — Sends LLM-generated code to the sandbox via HTTP
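As a rough illustration of what such a `run_code` function might look like, the sketch below posts code to `POST /execute` using only the standard library. The JSON field names (`code`, `stdout`, `stderr`, `exit_code`), the helper names `build_request` and `format_result`, and the 35-second client timeout are assumptions made for this example; check the sandbox repo for the actual request and response schema.

```python
import json
import os
import urllib.error
import urllib.request

# Same default as the sidecar and local-container modes.
SANDBOX_URL = os.environ.get("SANDBOX_URL", "http://localhost:8000")


def build_request(code, base_url=SANDBOX_URL):
    """Build the POST /execute request (payload shape is an assumption)."""
    payload = json.dumps({"code": code}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/execute",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def format_result(result):
    """Turn the sandbox response into text the LLM can read."""
    exit_code = result.get("exit_code", 1)
    if exit_code == 0:
        return result.get("stdout", "")
    return f"exit code {exit_code}\n{result.get('stderr', '')}"


def run_code(code: str) -> str:
    """Execute Python code in the sandbox and return its output."""
    try:
        # Client timeout chosen to sit above the sandbox's 30 s maximum.
        with urllib.request.urlopen(build_request(code), timeout=35) as resp:
            return format_result(json.load(resp))
    except urllib.error.URLError as exc:
        return f"sandbox request failed: {exc.reason}"
```

Because `run_code` is a plain function with a docstring and type hints, the same definition can be handed to any of the frameworks on this page; only the registration call differs.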

### CrewAI

[CrewAI](https://www.crewai.com/) uses the `@tool` decorator to register the sandbox call. The agent's role and goal instruct the LLM to write and execute code for every task.

@tool("run_code") — CrewAI tool decorator wrapping the sandbox call

### AutoGen

[AutoGen](https://microsoft.github.io/autogen/) registers the sandbox call as a tool function on an `AssistantAgent`. The agent generates code and executes it in a single turn.

AssistantAgent tools=[run_code] — Tool function registered on the agent

### LlamaIndex

[LlamaIndex](https://www.llamaindex.ai/) wraps the function with `FunctionTool.from_defaults` and passes it to a `ReActAgent`. The agent reasons about what code to write, executes it, and interprets the results.

FunctionTool.from_defaults(fn=run_code) — ReAct agent with sandbox code execution

### Google ADK

[Google Agent Development Kit (ADK)](https://google.github.io/adk-docs/) passes the function directly as a tool. The agent instruction tells the LLM to write Python and use `run_code` for execution.

tools=[run_code] — ADK agent with sandbox code execution

## Static Analysis

Before any code executes, the sandbox parses it into an AST and walks every node looking for policy violations. This catches the obvious attacks (calling `eval()`, importing `subprocess`, traversing `__globals__`) before the code ever reaches a Python interpreter.

### AST guardrails

The AST visitor checks bare function calls, attribute access, subscript access, string literals, f-strings, and `%`-format operations in a single pass:

{astVisitorHighlighted}

The visitor also scans for credential patterns, path traversal sequences, and SQL injection via `.format()`, f-strings, and `%`-formatting.

{guardrailsHighlighted}

### Import allowlist

Only stdlib modules with no filesystem or network capabilities are permitted. Profiles can extend the allowlist: the data-science profile adds numpy, pandas, and scipy with additional blocklist rules for dangerous attributes on those libraries.

{allowedImportsHighlighted}

## Isolated Subprocess

Validated code runs in a separate `python3 -I` subprocess. The `-I` flag enables isolated mode: user site-packages are disabled and `PYTHON*` environment variables are ignored. Code is written to a temporary file under `/tmp` and cleaned up unconditionally after execution.

{executorHighlighted}

### Runtime preamble

A defense-in-depth preamble is prepended to every execution. Even if an attacker constructs an import name dynamically (via `chr()`, `bytes.decode()`, etc.) to bypass the AST check, the runtime import hook blocks it. The preamble also removes dangerous builtins and monkey-patches `operator.attrgetter` to reject dunder access:

{preambleHighlighted}

### Resource limits

The subprocess applies `RLIMIT_AS` before any imports to cap memory usage (default 512 MB, configurable per profile). A wall-clock timeout (default 10s, max 30s) kills the process if it hangs. Stdout and stderr are capped at 50 KB each to prevent output flooding.
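To make the first two layers concrete, here is a minimal sketch of the validate-then-execute flow: an `ast.NodeVisitor` rejects disallowed imports, calls, and dunder access, and accepted code runs in a `python -I` subprocess with a memory cap, a timeout, and truncated output. This is not the repo's implementation. The names `Guard`, `validate`, and `execute`, the tiny allowlist and blocklist, and the use of `preexec_fn` to set `RLIMIT_AS` (the real sandbox sets it inside the subprocess) are all assumptions for illustration, and the runtime preamble is omitted.

```python
import ast
import os
import resource
import subprocess
import sys
import tempfile

# Hypothetical policy values; the real lists live in the sandbox repo.
ALLOWED_IMPORTS = {"math", "json", "statistics"}
BLOCKED_CALLS = {"eval", "exec", "compile", "open", "__import__"}
MAX_OUTPUT = 50 * 1024        # cap per stream (characters in this sketch)
MEMORY_LIMIT = 512 * 1024**2  # 512 MB address-space cap


class Guard(ast.NodeVisitor):
    """Collects policy violations in a single pass over the tree."""

    def __init__(self):
        self.violations = []

    def visit_Import(self, node):
        for alias in node.names:
            if alias.name.split(".")[0] not in ALLOWED_IMPORTS:
                self.violations.append(f"import not allowed: {alias.name}")

    def visit_ImportFrom(self, node):
        if (node.module or "").split(".")[0] not in ALLOWED_IMPORTS:
            self.violations.append(f"import not allowed: {node.module}")

    def visit_Call(self, node):
        if isinstance(node.func, ast.Name) and node.func.id in BLOCKED_CALLS:
            self.violations.append(f"call not allowed: {node.func.id}()")
        self.generic_visit(node)

    def visit_Attribute(self, node):
        if node.attr.startswith("__") and node.attr.endswith("__"):
            self.violations.append(f"dunder access not allowed: {node.attr}")
        self.generic_visit(node)


def validate(code):
    """Return a list of violations; an empty list means the code passed."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]
    guard = Guard()
    guard.visit(tree)
    return guard.violations


def _limit_memory():
    # Runs in the child before exec; never exceed the inherited hard limit.
    _, hard = resource.getrlimit(resource.RLIMIT_AS)
    limit = MEMORY_LIMIT if hard == resource.RLIM_INFINITY else min(MEMORY_LIMIT, hard)
    resource.setrlimit(resource.RLIMIT_AS, (limit, limit))


def execute(code, timeout=10):
    """Validate, then run in an isolated interpreter with resource limits."""
    violations = validate(code)
    if violations:
        return {"exit_code": 1, "stdout": "", "stderr": "\n".join(violations)}
    fd, path = tempfile.mkstemp(suffix=".py")  # temp dir is usually /tmp
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(code)
        try:
            proc = subprocess.run(
                [sys.executable, "-I", path],
                capture_output=True, text=True,
                timeout=timeout, preexec_fn=_limit_memory,
            )
        except subprocess.TimeoutExpired:
            return {"exit_code": 124, "stdout": "", "stderr": "timed out"}
        return {
            "exit_code": proc.returncode,
            "stdout": proc.stdout[:MAX_OUTPUT],
            "stderr": proc.stderr[:MAX_OUTPUT],
        }
    finally:
        os.unlink(path)  # cleaned up unconditionally
```

A static check like this is easy to evade with dynamically built names, which is why the page pairs it with the runtime preamble and the kernel-level layers that follow.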
## Landlock Filesystem Restriction

[Landlock](https://landlock.io/) is a Linux Security Module (LSM) that restricts filesystem access without requiring root privileges. It is available on RHEL 9.2+ and enabled by default on OpenShift 4.18+.

Landlock rules are inherited by child processes, so applying them to the FastAPI app at startup automatically restricts the code execution subprocess. The subprocess then applies a *second, tighter* Landlock ruleset that drops paths the parent process needed but the subprocess should never access.

### Parent process

The parent process allows read-only access to system paths and read-write access to `/tmp` only:

{landlockPathsHighlighted}

{landlockApplyHighlighted}

Landlock requires the `no_new_privs` bit, which OpenShift's `restricted-v2` SCC sets automatically via `allowPrivilegeEscalation: false`. No extra privileges are needed.

### Subprocess tightening

The subprocess applies a second Landlock ruleset that drops `/opt/app-root` (application source) and `/etc` (mounted secrets and config). Even if all Python-level defenses are bypassed, the kernel prevents reading application code or credentials. On ABI v4+ kernels, TCP bind and connect are also denied. On ABI v5+, abstract Unix sockets and cross-process signals are scoped.

## Seccomp Syscall Filtering

The subprocess installs a seccomp BPF filter that blocks all networking syscalls, io_uring (a container escape vector), and `splice()` (used in CVE-2026-31431). This closes the UDP gap that Landlock v4 does not cover: Landlock restricts TCP but not UDP, while seccomp blocks `socket()` entirely:

{seccompBlockedHighlighted}

The filter uses `SECCOMP_RET_ERRNO` with `EPERM` so the subprocess receives a clean error rather than being killed. Wrong-architecture processes are killed immediately with `SECCOMP_RET_KILL_PROCESS`. Both x86_64 and aarch64 are supported.
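The filter structure described above (check the architecture, then return `EPERM` for each blocked syscall, otherwise allow) can be sketched as a classic BPF program assembled with `struct`. This is an illustrative sketch, not the repo's filter: `build_filter` and `install` are hypothetical names, and the blocked list here covers only `socket`, `splice`, and the three io_uring syscalls rather than the full set the sandbox denies. The opcodes, return values, audit-arch constants, and syscall numbers are taken from the Linux UAPI headers.

```python
import ctypes
import errno
import platform
import struct

# Classic BPF opcodes and seccomp return values (Linux UAPI headers).
BPF_LD_W_ABS = 0x20   # BPF_LD | BPF_W | BPF_ABS
BPF_JEQ_K = 0x15      # BPF_JMP | BPF_JEQ | BPF_K
BPF_RET_K = 0x06      # BPF_RET | BPF_K
SECCOMP_RET_KILL_PROCESS = 0x80000000
SECCOMP_RET_ERRNO = 0x00050000
SECCOMP_RET_ALLOW = 0x7FFF0000

# machine -> (AUDIT_ARCH_* value, partial list of blocked syscall numbers)
ARCHES = {
    "x86_64": (0xC000003E, {"socket": 41, "splice": 275, "io_uring_setup": 425,
                            "io_uring_enter": 426, "io_uring_register": 427}),
    "aarch64": (0xC00000B7, {"socket": 198, "splice": 76, "io_uring_setup": 425,
                             "io_uring_enter": 426, "io_uring_register": 427}),
}


def _insn(code, jt, jf, k):
    """Pack one struct sock_filter {u16 code; u8 jt; u8 jf; u32 k}."""
    return struct.pack("=HBBI", code, jt, jf, k)


def build_filter(machine):
    """Assemble the BPF program as bytes without installing it."""
    arch, blocked = ARCHES[machine]
    prog = [
        _insn(BPF_LD_W_ABS, 0, 0, 4),   # load seccomp_data.arch
        _insn(BPF_JEQ_K, 1, 0, arch),   # expected arch: skip the kill
        _insn(BPF_RET_K, 0, 0, SECCOMP_RET_KILL_PROCESS),
        _insn(BPF_LD_W_ABS, 0, 0, 0),   # load seccomp_data.nr
    ]
    for nr in blocked.values():
        prog.append(_insn(BPF_JEQ_K, 0, 1, nr))  # no match: skip the deny
        prog.append(_insn(BPF_RET_K, 0, 0, SECCOMP_RET_ERRNO | errno.EPERM))
    prog.append(_insn(BPF_RET_K, 0, 0, SECCOMP_RET_ALLOW))
    return b"".join(prog)


class SockFprog(ctypes.Structure):
    _fields_ = [("len", ctypes.c_ushort), ("filter", ctypes.c_void_p)]


def install(machine=None):
    """Apply the filter to the calling process. This cannot be undone."""
    program = build_filter(machine or platform.machine())
    buf = ctypes.create_string_buffer(program, len(program))
    fprog = SockFprog(len(program) // 8, ctypes.addressof(buf))
    libc = ctypes.CDLL(None, use_errno=True)
    ul = ctypes.c_ulong
    # PR_SET_NO_NEW_PRIVS = 38; required before an unprivileged filter install.
    if libc.prctl(38, ul(1), ul(0), ul(0), ul(0)) != 0:
        raise OSError(ctypes.get_errno(), "PR_SET_NO_NEW_PRIVS failed")
    # PR_SET_SECCOMP = 22, SECCOMP_MODE_FILTER = 2.
    if libc.prctl(22, ul(2), ctypes.byref(fprog), ul(0), ul(0)) != 0:
        raise OSError(ctypes.get_errno(), "PR_SET_SECCOMP failed")
```

Calling `install()` in the child process before user code runs would make `socket()` fail with `EPERM`. A production filter on x86_64 should also reject syscall numbers with the x32 bit set, which this sketch does not handle.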
## OpenShift Deployment

The recommended deployment pattern runs the sandbox as a **standalone service**: its own Deployment and Service in the cluster. The agent reaches it via the internal service URL (`http://code-sandbox..svc:8000`). Agent pods need the `code-sandbox-client: "true"` label to pass the NetworkPolicy. The cluster enforces a final layer of security that the sandbox cannot bypass, even if fully compromised.

Alternatively, the sandbox can run as a **sidecar container** in the agent pod, sharing the pod network namespace so the agent reaches it at `localhost:8000`. Use sidecar mode when you want simpler networking (no cross-pod traffic, no client labels needed) and are willing to couple the sandbox lifecycle to the agent pod.

### Sidecar deployment (alternative)

If you prefer to run the sandbox as a sidecar instead of a standalone service, enable it in your Helm chart values. The template adds a second container to the agent pod with a read-only root filesystem, all capabilities dropped, and `/tmp` as the only writable path (10 Mi limit). Since the sidecar shares the pod network, the agent reaches it at `localhost:8000` with no NetworkPolicy client labels required.

{sidecarValuesHighlighted}

{sidecarDeploymentHighlighted}

The `SANDBOX_URL` environment variable is injected into the agent container automatically when `sandbox.enabled` is true.

### Profiles

Profiles control which imports are available and which scan stages run. The `SANDBOX_PROFILE` environment variable selects the active profile:

{profilesHighlighted}

| Profile | Imports | Memory | Use case |
| --- | --- | --- | --- |
| `minimal` | stdlib only (math, json, csv, etc.) | 256 Mi | Computation, formatting, string manipulation |
| `data-science` | + numpy, pandas, scipy | 512 Mi | Data analysis, numerical computation, statistics |

The data-science profile adds a **blocklist audit** stage that blocks dangerous attributes on the allowed libraries: `numpy.ctypeslib`, `pandas.read_pickle`, `scipy.io.loadmat`, and other filesystem/deserialization methods. Libraries are pre-imported before restrictions are applied so their initialization calls (which use `open()` and internal imports) succeed normally.

### Pod security context

The deployment runs as non-root with a read-only root filesystem, all capabilities dropped, and a localhost seccomp profile managed by the Security Profiles Operator:

{deploymentYamlHighlighted}

The `/tmp` mount is an `emptyDir` with a 10 Mi size limit, the only writable path in the container.

### NetworkPolicy

In standalone service mode (the recommended pattern), a zero-egress NetworkPolicy prevents the sandbox from making any outbound connections. Ingress is restricted to pods with the `code-sandbox-client: "true"` label on port 8000. This is the default configuration deployed by the Helm chart. When using sidecar mode instead, NetworkPolicy is not needed because the sandbox container only listens on localhost inside the pod.

{networkPolicyHighlighted}

For standalone mode, agent pods must include the client label:

```yaml
metadata:
  labels:
    code-sandbox-client: "true"
```

### SeccompProfile (SPO)

The Security Profiles Operator deploys a syscall allowlist as a `SeccompProfile` custom resource. The default action is `SCMP_ACT_ERRNO`, so any syscall not explicitly allowed is denied:

{seccompProfileHighlighted}

This blocks io_uring, splice, ptrace, module loading, mount operations, namespace manipulation (unshare/setns), and BPF, all common privilege escalation and container escape vectors. Networking syscalls are allowed at the container level because uvicorn needs them; the subprocess BPF filter blocks them for user code.

### Custom SCC

OpenShift's `restricted-v2` SCC only allows `runtime/default` seccomp profiles. A custom SCC is needed to permit the SPO localhost profile:

{sccHighlighted}

Bind it to the default service account in the sandbox namespace:

```
oc adm policy add-scc-to-user code-sandbox-seccomp -z default -n code-sandbox
```

## Deploying

The sandbox deploys to OpenShift via an in-cluster binary build and Helm chart, following the same pattern as the [agent build](/basic-agents/hello-world#build-the-image). Nodes must run RHEL 9.6+ or RHCOS based on RHEL 9.6+ for Landlock LSM support (the sandbox degrades gracefully on older kernels).

### Clone the repo

Clone the sandbox source. The build uses a `Containerfile` (not `Dockerfile`) with a UBI 9 Python 3.12 base image:

{cloneRepoHighlighted}

### Build the image

Create a binary BuildConfig, patch it to use the `Containerfile`, and start the build. The image is pushed to the internal registry:

{buildImageHighlighted}

### Deploy with Helm

The Helm chart deploys the sandbox as a Deployment + Service with the security context, NetworkPolicy, and optional SeccompProfile. The `values-standalone.yaml` file configures a single-replica standalone deployment:

{helmDeployHighlighted}

### Verify

Run a health check and execute code from an ephemeral pod. The pod needs the `code-sandbox-client: "true"` label to pass the NetworkPolicy:

{verifyHighlighted}

### Clean up

{cleanUpHighlighted}