# Changelog

All notable changes to looplet are documented here. The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
## [Unreleased]

### Added

- `docs/faq.md`: "Why not LangGraph?" honest comparison (thanks @mvanhorn, #17)
## [0.1.7] - 2026-04-21

First public release of looplet.

### Added (launch polish)

- `ROADMAP.md` with a frozen v1.0 API contract and explicit out-of-scope list.
- `docs/` site scaffold (tutorial, evals, recipes, hooks, good-first-issues, discussions-seed, demo-script) + mkdocs-material config + GitHub Pages workflow.
- `THIRD_PARTY_USERS.md` social-proof seed.
- `src/looplet/examples/ollama_hello.py` — zero-API-key onboarding.
- Codecov upload step in CI (non-blocking).
- Leaner README (<170 lines) with the pydantic-ai-harness disambiguation moved to the top.
### Added (evals — pytest-style agent evaluation)

- Eval framework (`looplet.evals`). Write `eval_*` functions that take an `EvalContext` and return any of `float`, `bool`, `str`, `dict`, or `EvalResult`. The framework normalizes all return types.
- `eval_discover(path)` — auto-discovers eval functions in `eval_*.py` files (like pytest discovers `test_*`).
- `eval_run(evals, ctx)` — runs evaluators, auto-detects an `llm` parameter for LLM-as-judge, catches errors gracefully.
- `eval_run_batch(evals, contexts)` — runs the same evals across multiple trajectories with per-eval avg/min/max aggregation.
- `eval_mark(*tags)` — decorator for categorizing evals. `eval_run` and `eval_run_batch` accept `include=`/`exclude=` to filter by marks.
- `eval_cli(args)` — CLI runner with threshold-based pass/fail exit codes for CI integration.
- `EvalHook` — a `LoopHook` that builds an `EvalContext` at `on_loop_end` and runs all evaluators automatically during development.
- `EvalContext.from_trajectory_dir()` — loads context from saved trajectories, supporting both looplet and benchmark formats.
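The `eval_*` convention above can be sketched as follows. This is a minimal illustration, not looplet's actual code: the `EvalContext` stand-in and its fields (`termination`, `tool_calls`) are assumptions made for the example.

```python
from dataclasses import dataclass, field

# Stand-in for looplet.evals.EvalContext — the field names here are
# illustrative assumptions, not the library's real schema.
@dataclass
class EvalContext:
    termination: str = "done"
    tool_calls: list = field(default_factory=list)

# eval_* functions may return bool (pass/fail) or float (score);
# the framework normalizes both into EvalResult objects.
def eval_finished_cleanly(ctx: EvalContext) -> bool:
    return ctx.termination == "done"

def eval_tool_economy(ctx: EvalContext) -> float:
    # fewer tool calls → higher score
    return 1.0 / max(1, len(ctx.tool_calls))

ctx = EvalContext(termination="done", tool_calls=["bash", "read"])
print(eval_finished_cleanly(ctx), eval_tool_economy(ctx))  # → True 0.5
```

Under this convention, `eval_discover(path)` would pick up both functions from an `eval_*.py` file automatically, pytest-style.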
### Added (MCP + skills)

- `MCPToolAdapter` — wraps MCP server tools as `ToolSpec` instances via JSON-RPC over stdio. No MCP SDK required.
- `Skill` — bundles tools + context + prompt fragment into one loadable unit. `skill.register(registry)` adds all tools.
### Added (approval)

- `ApprovalHook` — stops the loop when a tool returns `needs_approval=True`. Combined with `checkpoint_dir`, this gives crash-safe async human-in-the-loop approval.
- Renamed `elicit` → `approval` uniformly: `LoopConfig.approval_handler`, `ToolContext.request_approval`, `ToolContext.approve()`.
### Changed (naming cleanup)

- Renamed internal names for clarity: `coerce_text` → `to_text`, `DiminishingReturnsTracker` → `StallDetector`, `reactive_compact` → `emergency_truncate`, `compress_session_log` → `age_session_entries`, `enforce_result_budget` → `trim_results`, `should_compress_context` → `is_context_oversized`, `HEAVY_BLOCK_KINDS` → `LARGE_CONTENT_TYPES`, `DefaultSummarizer` → `default_summarizer`.
- Renamed compact services: `DefaultCompactService` → `TruncateCompact`, `LLMCompactService` → `SummarizeCompact`.
- Renamed `normalise_hook_return` → `normalize_hook_return`.
- Moved `concurrent_dispatch` and `reactive_recovery` from the `FLAGS` global singleton to `LoopConfig` fields.
- Trimmed `__all__` from 154 to 54 symbols, organized into labeled tiers.
### Changed (developer experience)

- Added `preview_prompt()` — shows what the LLM sees before the first call. Invaluable for debugging.
- Added `TrajectoryRecorder.summary()` — one-line run summary.
- Added `--trace DIR` to the coding_agent example for trajectory recording.
- Added a step-by-step tutorial to the README (5 progressive steps).
- Added a `LoopConfig` docstring with a "start here" guide listing the 4 essential fields.
- Added `FileCheckpointStore.load_latest()` + auto-resume wiring in `composable_loop` — crash-resume is now one line: `LoopConfig(checkpoint_dir="./ckpt")`.
### Removed

- Removed `async_loop.py` (feature-frozen, no consumers).
- Removed 3 mock examples (calculator, code_review, research). Replaced with `hello_world.py` (real LLM) + `coding_agent.py` (Claude Code-equivalent tools: bash, read, write, edit, glob, grep, think, done).
- Removed all back-compat aliases.
- Removed all internal project references (cadence, primal_security).
### Added (compaction strategies)

- `PruneToolResults` — new zero-LLM-call compaction service that clears old tool-result content while keeping conversation structure intact. Configurable `keep_recent` (how many recent tool results to preserve) and `compactable_tools` (restrict to specific tools). Cheapest possible compaction — use it as the first stage in a chain.
- `compact_chain(*services)` — combinator that tries compaction services in order; the first stage that has an effect wins. Replaces the need for a separate `ChainedCompactService` class. Usage: `compact_chain(PruneToolResults(), SummarizeCompact(), TruncateCompact())`.
- `CompactOutcome.cleanup` — optional post-compact callback. When set, `run_compact()` invokes it after firing `POST_COMPACT`. Use it for domain-specific state resets (clear caches, re-inject context, reset token baselines) without the loop knowing the details.
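The "first stage that has an effect wins" combinator can be sketched generically. This is an assumption-laden toy, not looplet's implementation: services are modeled as plain functions on a list of entries, and "has an effect" is detected by comparing input to output.

```python
# Hypothetical sketch of the compact_chain combinator described above.
def compact_chain(*services):
    def run(entries):
        for service in services:
            compacted = service(entries)
            if compacted != entries:   # first stage with an effect wins
                return compacted
        return entries
    return run

# Toy stages: prune old "tool:" entries, then fall back to truncation.
def prune_tool_results(entries, keep_recent=1):
    tools = [i for i, e in enumerate(entries) if e.startswith("tool:")]
    drop = set(tools[:-keep_recent])
    return [e for i, e in enumerate(entries) if i not in drop]

def truncate(entries):
    return entries[-2:]

chain = compact_chain(prune_tool_results, truncate)
log = ["msg:a", "tool:x", "msg:b", "tool:y"]
print(chain(log))  # → ['msg:a', 'msg:b', 'tool:y'] — prune fired, truncate never ran
```

The ordering matters: cheap structural pruning runs before anything that loses information (or costs an LLM call), which is exactly the `PruneToolResults → SummarizeCompact → TruncateCompact` chain suggested above.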
### Changed (renames — back-compat aliases kept)

- `DefaultCompactService` → `TruncateCompact` — clearer name for "drop old entries, keep N recent, zero LLM calls."
- `LLMCompactService` → `SummarizeCompact` — clearer name for "LLM summarizes the middle, keeps N recent."
- Old names (`DefaultCompactService`, `LLMCompactService`) remain as aliases and continue to work.
### Added (context management pt. 2)

- Prompt caching infrastructure (`looplet.cache`). New `CachePolicy` dataclass declares which stable prompt sections (system prompt, tool schemas, memory) should carry Anthropic-style `cache_control` markers, with per-section TTL (ephemeral/1h). `LoopConfig.cache_policy` threads per-turn `CacheBreakpoint` lists (label + SHA-256 hash + TTL) to backends that expose `generate_with_cache(..., cache_breakpoints=[...])`. Backends without the kwarg keep working unchanged — caching is strictly additive. `CacheBreakDetector` ships as a drop-in observer hook that records section-hash changes across turns for cache-miss telemetry.
- `LLMCompactService` — new compaction strategy that spends one LLM call to summarize the session. Produces a dense 4-section summary (task goal, findings, open questions, recent decisions) spliced into the session log as a synthetic entry after keep-recent pruning. Falls back to deterministic keep-recent on any summarizer error. Trade-off vs `DefaultCompactService`: one LLM call per compaction in exchange for preserved reasoning chains.
- Threshold-tier context budgeting (`looplet.budget`). New `ContextBudget` dataclass with `warning_at`/`error_at`/`compact_buffer` tiers. `ThresholdCompactHook` is a ready-to-register `should_compact` implementation that fires proactive compaction once estimated tokens cross the configured tier. `BudgetTelemetry` observer records per-step tier samples and exposes `peak_tier` for production dashboards.
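The threshold-tier idea can be sketched in a few lines. This is a minimal stand-in, not looplet's `ContextBudget`: only the `warning_at`/`error_at` fields from the changelog are modeled, and the tier names are assumptions.

```python
from dataclasses import dataclass

# Hypothetical sketch of tiered context budgeting.
@dataclass
class ContextBudget:
    warning_at: int   # start compacting proactively past this many tokens
    error_at: int     # hard ceiling — compaction is overdue

    def tier(self, estimated_tokens: int) -> str:
        if estimated_tokens >= self.error_at:
            return "error"
        if estimated_tokens >= self.warning_at:
            return "warning"
        return "ok"

budget = ContextBudget(warning_at=80_000, error_at=120_000)
print(budget.tier(50_000), budget.tier(90_000), budget.tier(130_000))
# → ok warning error
```

A `should_compact`-style hook would then be a one-liner: return `True` whenever the current tier is `"warning"` or worse, firing compaction before the prompt ever becomes too long.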
### Added (architecture improvements)

- Proactive compact hook slot — `LoopHook.should_compact(state, session_log, conversation, step_num) -> bool`. Fires at the top of each step, before prompt build. Any hook returning `True` triggers the configured `CompactService` preemptively. Complements the reactive `prompt_too_long` path — use it for message-count or token-estimate heuristics. `StreamingHook` gets a no-op stub.
- Tool-result streaming via `TOOL_PROGRESS` — new `LifecycleEvent.TOOL_PROGRESS`. When hooks are present, the loop builds a `ToolContext.on_progress` callback per tool call that emits `TOOL_PROGRESS` (with the originating `tool_call`) whenever the tool invokes `ctx.report_progress(stage, data)`. Observers can stream intermediate output from long-running tools without blocking dispatch.
- Budget-aware turn continuation — new `LoopConfig.max_turn_continuations: int = 0`. When `> 0` and the backend exposes `last_stop_reason`, `llm_call_with_retry` will re-prompt up to N times on `stop_reason == "max_tokens"` and concatenate outputs so long thoughts aren't truncated mid-message. `LLMResult` gains `stop_reason` and `continuations` fields.
- `build_briefing`/`build_prompt` as hook slots — both are now optional methods on `LoopHook`. The first hook returning a non-`None` string wins; the loop falls back to `LoopConfig.build_briefing`/`config.build_prompt`/the built-in default. Lets domain hooks own prompt construction without threading callables through `LoopConfig` separately.
- `DomainAdapter` — new dataclass bundling the five domain callables (`build_briefing`, `extract_entities`, `build_trace`, `build_prompt`, `extract_step_metadata`) into a single object. `LoopConfig.domain: DomainAdapter | None = None` seeds the matching flat fields when they are `None`. Flat fields still win over the adapter, which wins over built-in defaults — use the adapter to package a reusable agent in one handle instead of five kwargs.
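The budget-aware continuation loop can be sketched with a fake backend. Everything here is an assumption for illustration — the real `llm_call_with_retry` handles retries, backoff, and richer result objects; this shows only the re-prompt-and-concatenate core.

```python
# Hypothetical sketch of budget-aware turn continuation: re-prompt while
# the backend reports stop_reason == "max_tokens", up to a fixed cap.
class FakeBackend:
    """Stand-in for a real LLM backend that truncates long outputs."""
    def __init__(self, chunks):
        self._chunks = list(chunks)
        self.last_stop_reason = None

    def generate(self, prompt: str) -> str:
        text = self._chunks.pop(0)
        # Pretend we hit the output-token ceiling until chunks run out.
        self.last_stop_reason = "max_tokens" if self._chunks else "end_turn"
        return text

def call_with_continuations(backend, prompt, max_turn_continuations=2):
    parts = [backend.generate(prompt)]
    continuations = 0
    while (backend.last_stop_reason == "max_tokens"
           and continuations < max_turn_continuations):
        # Re-prompt with the partial output appended so the model continues.
        parts.append(backend.generate(prompt + "".join(parts)))
        continuations += 1
    return "".join(parts), continuations

backend = FakeBackend(["The answer ", "is 42."])
text, n = call_with_continuations(backend, "Explain.")
print(text, n)  # → The answer is 42. 1
```

With `max_turn_continuations=0` (the default above), the loop body never runs and behavior is identical to a single call — which matches the changelog's opt-in framing.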
### Removed (breaking)

- `InvestigationLog` backward-compat alias is gone — use `SessionLog` directly.
- `HARNESS_FLAGS` backward-compat alias is gone — use `FLAGS`.
- Legacy `CADENCE_*` environment variables for feature flags are no longer read; use the `LOOPLET_*` prefix.
- `_clone_tools_excluding` private alias is gone — use `clone_tools_excluding`.
- `LoopConfig.permissions` is gone. Register a `PermissionHook(PermissionEngine(...))` in `hooks=[...]` instead — it flows through the same unified `HookDecision` + event bus as every other hook.
### Added

- Unified hook vocabulary — `HookDecision` (`looplet.hook_decision`). All hook slots now accept a single `HookDecision` return type (legacy `None`/`bool`/`str` returns still work via `normalise_hook_return`). Helpers `Allow()`, `Deny(reason)`, `Block(reason)`, `Stop(reason)`, `Continue()`, `InjectContext(text)` make intent explicit at the call site.
- Lifecycle events — `on_event(payload)` (`looplet.events`). `LoopHook` gained an optional `on_event(EventPayload)` method. The loop now fires 11 named events: `SESSION_START`, `PRE_LLM_CALL`, `POST_LLM_RESPONSE`, `PRE_TOOL_USE`, `POST_TOOL_USE`, `POST_TOOL_FAILURE`, `PRE_COMPACT`, `POST_COMPACT`, `STOP`, `SUBAGENT_START`, `SUBAGENT_STOP`. Any hook can subscribe with a single method instead of implementing every slot.
- `PermissionHook` (`looplet.permissions`) — wraps `PermissionEngine` and plugs it into the event bus so policy decisions flow through the same `HookDecision` path as custom hooks.
- `CompactService` + `DefaultCompactService` + `run_compact(...)` (`looplet.compact`) — reactive compaction is now a swappable service with `PRE_COMPACT`/`POST_COMPACT` events.
- `LoopConfig.render_messages_override` — byte-exact escape hatch. Receives `(messages, default_prompt, step_num)` and returns the exact prompt string sent to the LLM. Lets advanced callers take full control of prompt rendering without forking the loop.
- First-class subagents — `run_sub_loop(..., subagent_id=...)` now fires `SUBAGENT_START`/`SUBAGENT_STOP` events on the parent's hooks and returns `subagent_id` in the result dict for correlation.
- `replay_loop(trace_dir, tools=...)` — rerun a captured trace through a fresh `composable_loop` without calling the LLM again. Useful for golden-trajectory regression tests, hook A/Bs, and cost-free loop diffs. Raises `RuntimeError` if the replay loop requests more calls than were recorded or diverges in method (`generate` vs `generate_with_tools`). Falls back to `call_NN_response.txt` files when `manifest.jsonl` is missing.
- `python -m looplet show <trace-dir>` — stdlib-only CLI that prints a one-page summary of a captured trace (run id, termination, per-step tool calls with durations, LLM totals). Exit code 1 when the directory is missing or malformed.
- `looplet.provenance` — new module for debugging agent runs:
  - `RecordingLLMBackend`/`AsyncRecordingLLMBackend` wrap any backend and capture every prompt, system prompt, tool schema, response, duration, and error as `LLMCall` records. `generate_with_tools` is surfaced only when the wrapped backend supports it, so `NativeToolBackend` detection stays honest.
  - `TrajectoryRecorder` hook captures a structured `Trajectory` per run (steps, context-before, termination reason, embedded `Tracer` spans) and writes `trajectory.json` + `steps/step_NN.json`.
  - `ProvenanceSink` is a 3-line facade: `wrap_llm(...)`, `trajectory_hook()`, `flush()`.
  - On-disk layout is diff-friendly: `call_NN_prompt.txt`/`call_NN_response.txt` per LLM call plus a `manifest.jsonl`.
  - Both recorders accept `redact=` for secret scrubbing and `max_chars_per_call=` for bounded memory.
  - See the Provenance guide for API reference, recipes, and performance notes.
- `Step.pretty()` — human-readable CLI formatter complementing `Step.summary()` (which is tuned for LLM context assembly).
## [0.1.6] - 2026-04-17

### Added

- `looplet.testing` — public test-utility module exposing `MockLLMBackend` and `AsyncMockLLMBackend` (scripted, zero-dependency) so downstream packages can unit-test hooks, tools, and backends without a real LLM provider.
- PyPI publish workflow (`.github/workflows/publish.yml`) that builds and publishes on version tags via PyPI trusted publishing.
- README positioning matrix comparing `looplet` to LangGraph, DSPy, and smolagents; observability/OTel wiring example; stability & versioning policy; real `AnthropicBackend` usage in the quick-start.
### Fixed

- `resume_loop_state()` now restores the checkpointed `Conversation` thread (it was silently dropping multi-turn message history on resume).
- `RoutingLLMBackend.generate_with_tools` is now gated dynamically via `__getattr__` so `hasattr(llm, "generate_with_tools")` returns a truthful answer for the currently selected backend (consistent with `_FallbackLLM` and `CostTracker`).
- The async `__llm_error__` step is now recorded through `_history` to match the sync loop (previously caused session-log/conversation drift on LLM failure).
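The `__getattr__` gating fix relies on a standard Python pattern: `__getattr__` is only consulted when normal attribute lookup fails, so a router can make `hasattr()` track the live backend. The sketch below is illustrative — the backend classes are stand-ins, not looplet's.

```python
# Stand-in backends: one supports native tool calls, one does not.
class ToolBackend:
    def generate(self, prompt): return "tooled"
    def generate_with_tools(self, prompt, tools): return ("tooled", [])

class PlainBackend:
    def generate(self, prompt): return "plain"

class RoutingLLMBackend:
    def __init__(self, backends):
        self._backends = backends
        self._current = backends[0]

    def select(self, index):
        self._current = self._backends[index]

    def generate(self, prompt):
        return self._current.generate(prompt)

    def __getattr__(self, name):
        # Resolved per-call, so capability checks track the live backend:
        # raises AttributeError (→ hasattr False) when it's absent.
        if name == "generate_with_tools":
            return getattr(self._current, name)
        raise AttributeError(name)

router = RoutingLLMBackend([ToolBackend(), PlainBackend()])
print(hasattr(router, "generate_with_tools"))  # → True
router.select(1)
print(hasattr(router, "generate_with_tools"))  # → False
```

Defining `generate_with_tools` as a regular method on the router would make `hasattr` return `True` unconditionally, silently promising a capability the selected backend may lack — exactly the bug the changelog entry fixes.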
### Previously added in this release

- `ToolError` taxonomy — structured `ErrorKind` enum (`PERMISSION_DENIED`, `TIMEOUT`, `VALIDATION`, `EXECUTION`, `PARSE`, `CONTEXT_OVERFLOW`, `RATE_LIMIT`, `NETWORK`, `CANCELLED`) plus a `ToolError` dataclass. `ToolResult` now carries both `error: str` (for JSON-safe display) and `error_detail: ToolError` (for introspection).
- `PermissionEngine` — declarative `ALLOW`/`DENY`/`ASK`/`DEFAULT` rules with fail-closed `arg_matcher`, plug-in `ask_handler` for human-in-the-loop, and an append-only denial audit log.
- `CancelToken` — cooperative cancellation is now threaded through `LoopConfig` → `llm_call_with_retry`/`async_llm_call_with_retry` → `ToolContext.cancel_token`, so both the next LLM call and any in-flight tool can stop cleanly.
- `ToolContext.elicit` — `LoopConfig.elicit_handler` surfaces a generic `elicit(prompt) → str` protocol to tools for interactive prompts.
- Multi-block messages — `Message.content` supports a `list` of `ContentBlock(kind, data)` alongside plain `str`. `HEAVY_BLOCK_KINDS` (image/audio/video/binary) are stripped before summarization.
- Async `build_trace` — `async_composable_loop` now stashes the built trace on `state.trace` at exit (async generators can't `return` a value).
- `SyncToAsyncAdapter.generate_with_tools` — router-selected sync backends keep native-tools support in the async loop.
- Preflight context check — the async loop matches sync by skipping a doomed LLM call when the prompt is already too long under `FLAGS.reactive_recovery`.
- Checkpoint state counters — `resume_loop_state` now round-trips `state.queries_used` and `state.budget_remaining` so budget enforcement continues across resume.
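The enum-plus-dataclass taxonomy can be sketched as follows. This is a reduced illustration: only three of the nine `ErrorKind` members are shown, and the classification heuristic (match by exception type, then by message content) is an assumption modeled on the `_classify_exception` behavior described below.

```python
from dataclasses import dataclass
from enum import Enum

# Reduced stand-in for looplet's ErrorKind (three of nine members shown).
class ErrorKind(Enum):
    TIMEOUT = "timeout"
    RATE_LIMIT = "rate_limit"
    EXECUTION = "execution"

@dataclass
class ToolError:
    kind: ErrorKind
    message: str
    retriable: bool = False

def classify_exception(exc: Exception) -> ToolError:
    """Illustrative classifier: exception type first, message content second."""
    text = str(exc).lower()
    if isinstance(exc, TimeoutError):
        return ToolError(ErrorKind.TIMEOUT, str(exc), retriable=True)
    if "rate limit" in text or "429" in text:
        return ToolError(ErrorKind.RATE_LIMIT, str(exc), retriable=True)
    return ToolError(ErrorKind.EXECUTION, str(exc))

err = classify_exception(RuntimeError("429 rate limit exceeded"))
print(err.kind, err.retriable)  # → ErrorKind.RATE_LIMIT True
```

Keeping the display string (`error: str`) separate from the structured record (`error_detail: ToolError`) lets results stay JSON-safe while retry logic branches on `kind` and `retriable`.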
### Changed

- `ToolResult.error` narrowed back to `str | None` (JSON-safe). Use `ToolResult.error_detail` for structured introspection.
- `PermissionRule.matches()` now fails closed per decision type: `DENY` rules match on matcher errors (block), `ALLOW`/`ASK` rules do not (don't accidentally grant).
- `PermissionEngine._resolve_default` collapses ambiguous engine defaults (`ASK`/`DEFAULT`) to `DENY` so a decision never leaks into a `PermissionOutcome` where both `.allowed` and `.denied` are False.
- `ToolSpec._accepts_ctx` is computed eagerly at `register()` time (and self-heals in `dispatch()` for specs inserted directly).
- `_backend_accepts_cancel_token` cache is keyed by `(type, method_name)` instead of `id()` (eliminates an id-recycling hazard).
- `_classify_exception` broadened to detect `asyncio.CancelledError`, rate-limit, context-overflow, and parse exceptions by class name / message content.
- `SyncToAsyncAdapter._adapter_cache` now prefers the backend object itself as the dict key, with `id()` as a fallback for unhashable backends.
- `SessionLog.to_list()` includes `recall_key` for a full round-trip through checkpoints.
- `ToolError.context` now round-trips through `Conversation.serialize`/`deserialize`.
- Permission-denied results from hooks now populate `error_detail` with `ErrorKind.PERMISSION_DENIED` (parity with the `PermissionEngine` path) in both sync and async loops.
### Fixed

- `_rebuild_prompt` now renders `memory` and falls back to the structured `build_prompt` from `looplet.prompts` instead of a bare f-string, restoring parity with the first-pass build.
- `_deserialize_message` now reconstructs `ToolError` from serialized `error_kind`/`error_retriable`/`error_context` fields.
- `_NullSessionLog` (async) gained the attributes the async loop expects: `entries`, `current_theory`, `to_list()`, `compact()`.
## [0.1.5] - initial public import

- Initial release as a standalone package. See the extraction commit history for the pre-extraction development timeline.