Benchmarks

Honest, reproducible measurements. Every number here can be regenerated with a single command and a clean Python 3.11 venv.

What this page is: a defence of the one-dependency, loop-first design choice against frameworks that give up more to do more.

What this page is not: an end-to-end agent benchmark. None of these numbers say anything about how well any framework solves a task. They measure what you pay for loading it.

All runs: Python 3.11.13, Linux x86_64, uv-managed venvs, PyPI wheels from 2026-04-21. The measurement scripts live in scripts/ and are short enough to read.

Cold import time

The time from python -c "import <pkg>" to interpreter exit, measured as the median wall clock of 9 fresh subprocess runs, so every run pays the full interpreter start-up and import cost rather than reusing a warm process.

python scripts/bench_cold_import.py --runs 9 --markdown
| Framework | Version | Median cold import | vs looplet |
|---|---|---|---|
| looplet | 0.1.7 | 289 ms | 1.0× |
| strands-agents | 1.36.0 | 1 885 ms | 6.5× |
| langgraph | 1.1.9 | 2 294 ms | 7.9× |
| claude-agent-sdk | 0.1.65 | 2 409 ms | 8.3× |
| pydantic-ai | 1.85.1 | 3 975 ms | 13.8× |

Why it matters: agents are increasingly invoked as CLI tools, serverless functions, and hot-reload dev loops. A 3-second import tax is the difference between "snappy" and "go get coffee" for every invocation that doesn't reuse a warm process. looplet's single-dependency core (typing_extensions for Python <3.12) leaves no room for surprises.
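The actual harness is scripts/bench_cold_import.py; a minimal sketch in the same spirit (function name and defaults here are illustrative, not the script's real interface) looks like this:

```python
import statistics
import subprocess
import sys
import time

def cold_import_ms(module: str, runs: int = 9) -> float:
    """Median wall-clock time in ms to import `module` in a fresh interpreter.

    Each run spawns a brand-new Python subprocess, so no interpreter state
    or imported-module cache carries over between measurements.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)

# Stdlib module as a quick smoke test; swap in any framework name.
print(f"json: {cold_import_ms('json', runs=3):.0f} ms")
```

The median (not mean) keeps one slow outlier run, e.g. a cold filesystem cache, from skewing the number.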

Dependency footprint

Count of third-party packages installed into a fresh venv by pip install <pkg>, minus pip, setuptools, and wheel.

python scripts/bench_dep_footprint.py --markdown
| Install | Packages installed |
|---|---|
| pip install looplet | 2 (looplet + typing_extensions on Python <3.12) |
| pip install looplet[all] | 20 |
| pip install claude-agent-sdk | 30 |
| pip install langgraph | 31 |
| pip install strands-agents | 49 |
| pip install pydantic-ai | 144 |

looplet[all] pulls in the official openai and anthropic SDKs plus their transitive deps. If you bring your own HTTP client, stay on core looplet and write a 20-line LLMBackend adapter — see docs/recipes.md.
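The real adapter contract lives in docs/recipes.md; purely as an illustration of the shape such an adapter can take (every name below is an assumption, not looplet's actual API), the bring-your-own-client idea looks like this:

```python
from typing import Protocol

class LLMBackend(Protocol):
    # Hypothetical interface sketch: the real contract is documented in
    # docs/recipes.md. A Protocol means any object with a matching
    # complete() method satisfies it, no inheritance or SDK required.
    def complete(self, prompt: str) -> str: ...

class EchoBackend:
    """Toy backend with no HTTP client at all, useful for tests."""
    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"

backend: LLMBackend = EchoBackend()
print(backend.complete("hello"))  # → echo: hello
```

Because the adapter is structural, you can wrap httpx, urllib3, or a vendor SDK you already ship without adding anything to looplet's dependency tree.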

Why it matters: every package in your environment is a potential supply-chain surface, a potential version conflict, and a potential wheel-download delay for your container or Lambda. 144 transitive dependencies is not a free choice; it's an ambient cost.
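scripts/bench_dep_footprint.py builds a throwaway venv per package; the metric itself, counting installed distributions minus the packaging tools, can be sketched for the current environment like this (function name is illustrative):

```python
from importlib.metadata import distributions

def third_party_count(exclude=("pip", "setuptools", "wheel")) -> int:
    """Count distributions installed in the current environment,
    minus the packaging tools, mirroring the table's metric for
    one venv. The real script applies this to fresh venvs."""
    names = {
        (dist.metadata["Name"] or "").lower()
        for dist in distributions()
    }
    names.discard("")  # drop any distribution with broken metadata
    return len(names - set(exclude))

print(third_party_count())
```

Run it inside one of the throwaway venvs and you should reproduce the table's per-install counts.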

What we don't claim

looplet is not faster at running tools, better at prompting, or more accurate at any task than the frameworks above. Per-step latency is dominated by the LLM round-trip; per-task accuracy depends on prompts, tools, and the model you choose.

What looplet is: small, cold-starting in under a third of a second, and handing you a for step in loop(...): iterator so you can observe and interrupt any step without learning a new graph DSL.

If that's the trade-off you want, these numbers are your defence when someone asks why.
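To make the loop-first claim concrete, here is a toy model of the design, not looplet's actual API (all names below are invented for illustration): when the agent loop is a plain generator, the caller's for-loop is the scheduler, so observing or interrupting a step needs no framework hook.

```python
from dataclasses import dataclass
from typing import Callable, Iterator

@dataclass
class Step:
    index: int
    output: str

def loop(think: Callable[[str], str], prompt: str, max_steps: int = 3) -> Iterator[Step]:
    """Toy agent loop: apply `think` repeatedly, yielding after each step.

    Illustrative only; looplet's real loop() signature may differ.
    """
    state = prompt
    for i in range(max_steps):
        state = think(state)
        yield Step(index=i, output=state)  # caller inspects or breaks here

steps = []
for step in loop(lambda s: s + "!", "go"):
    steps.append(step)
    if step.index == 1:  # interrupt mid-run with an ordinary break
        break
print([s.output for s in steps])  # → ['go!', 'go!!']
```

Because each step is yielded back to the caller, logging, checkpointing, and cancellation are ordinary Python control flow rather than graph-node callbacks.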

Reproducing

# One-time: clean Python 3.11 venv with all four frameworks.
uv venv /tmp/bench_env --python 3.11 -q
uv pip install --python /tmp/bench_env/bin/python \
    looplet langgraph claude-agent-sdk pydantic-ai strands-agents

# Cold-import numbers.
/tmp/bench_env/bin/python scripts/bench_cold_import.py \
    --python /tmp/bench_env/bin/python --runs 9 --markdown

# Dependency footprint (creates its own throwaway venvs).
python scripts/bench_dep_footprint.py --markdown

Both scripts exit non-zero on install/import failure; that's the only signal they're broken.

History

| Date | looplet | claude-agent-sdk | pydantic-ai | langgraph | strands-agents |
|---|---|---|---|---|---|
| 2026-04-21 | 289 ms / 2 pkg | 2 409 ms / 30 pkg | 3 975 ms / 144 pkg | 2 294 ms / 31 pkg | 1 885 ms / 49 pkg |

Re-run these when any dependency's major version ships — deps tend to move up, not down.