# Compaction
Compaction and branch summarization manage context window limits by summarizing older conversation history, keeping the agent productive during long sessions.
## Overview

## When Compaction Triggers
Auto-compaction triggers when `contextTokens > contextWindow - reserveTokens`, where:

- `contextTokens` is calculated from the last assistant message's `usage` field
- `contextWindow` is the model's context window size
- `reserveTokens` is the configured reserve (default: 16,384 tokens)
This ensures there is always room for the next prompt and response.
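The trigger condition can be sketched as a small check (the function name and parameter list here are illustrative, not the actual implementation):

```typescript
// Sketch of the auto-compaction trigger check.
const DEFAULT_RESERVE_TOKENS = 16_384;

function shouldAutoCompact(
  contextTokens: number,   // from the last assistant message's usage field
  contextWindow: number,   // the model's context window size
  reserveTokens: number = DEFAULT_RESERVE_TOKENS,
): boolean {
  // Compact once the used context would leave less than the reserve free.
  return contextTokens > contextWindow - reserveTokens;
}
```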
## Manual Trigger
The optional instructions focus the summary on specific aspects. For example:
## How Compaction Works
### Step 1: Find the Cut Point

Walk backwards from the newest entry, accumulating estimated token counts. Stop when `keepRecentTokens` (default: 16,384) worth of content has been accumulated. The cut point is where older messages will be summarized.
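The cut-point search can be sketched as follows; `Entry`, `estimateTokens`, and the chars-per-token heuristic are illustrative stand-ins for the real types and estimator:

```typescript
// Sketch of the cut-point search: walk newest-to-oldest, accumulating
// estimated tokens until keepRecentTokens of content is retained.
interface Entry { id: string; text: string; }

function estimateTokens(e: Entry): number {
  return Math.ceil(e.text.length / 4); // rough chars-per-token heuristic
}

function findCutPoint(entries: Entry[], keepRecentTokens = 16_384): number {
  let kept = 0;
  for (let i = entries.length - 1; i >= 0; i--) {
    kept += estimateTokens(entries[i]);
    if (kept >= keepRecentTokens) {
      // Entries before index i are summarized; entries[i..] are kept.
      return i;
    }
  }
  return 0; // session fits entirely within the keep budget
}
```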
### Step 2: Generate the Summary
The messages before the cut point are serialized and sent to the LLM with a summarization prompt. The LLM produces a structured summary.
If there is an existing compaction summary from a previous compaction, it is included as context for an iterative update, so information accumulates across compactions.
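The iterative update can be sketched as folding the previous summary into the request; the prompt wording and function name are illustrative, not the actual prompt used:

```typescript
// Sketch: include an existing compaction summary in the summarization
// input so information accumulates across compactions.
function buildSummarizationInput(
  serializedMessages: string,
  previousSummary?: string,
): string {
  const parts: string[] = [];
  if (previousSummary) {
    parts.push(
      "Previous summary (update it with the new conversation below):",
      previousSummary,
    );
  }
  parts.push("Conversation to summarize:", serializedMessages);
  return parts.join("\n\n");
}
```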
### Step 3: Append CompactionEntry

A `CompactionEntry` is appended to the session with the summary text, the ID of the first kept entry, the token count before compaction, and file tracking details.
### Step 4: Rebuild Context

After compaction, `buildSessionContext()` uses the compaction entry:
The compaction summary is injected as a user message with the prefix:
### Step 5: Continue the Session
The agent continues with the reduced context. Future messages are appended as normal. When context fills up again, another compaction occurs, updating the previous summary.
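Steps 4 and 5 together amount to dropping everything before the first kept entry and leading with the summary. A minimal sketch (the types and the placement of the summary as a plain user message are assumptions based on the description above):

```typescript
// Sketch of rebuilding context from a compaction entry.
interface Msg { role: "user" | "assistant"; content: string; }
interface Compaction { summary: string; firstKeptEntryId: string; }

function rebuildContext(
  entries: { id: string; message: Msg }[],
  compaction: Compaction,
): Msg[] {
  const keepFrom = entries.findIndex((e) => e.id === compaction.firstKeptEntryId);
  const kept = entries.slice(keepFrom).map((e) => e.message);
  // The summary goes first, as a user message carrying the condensed history.
  return [{ role: "user", content: compaction.summary }, ...kept];
}
```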
## Split Turns

The cut point can fall in the middle of a turn (between an assistant message with tool calls and its tool results). In this case, a "split turn" occurs:
When a turn is split, the orphaned tool results are summarized separately and combined into the compaction summary. This ensures tool results are never left without their corresponding tool calls.
## Cut Point Rules

- Cut points can be at user messages or assistant messages (never tool results)
- When cutting at an assistant message with tool calls, the tool results that follow are kept
- Tool result messages are never valid cut points because they would be orphaned from their tool calls
- The algorithm always keeps at least `keepRecentTokens` worth of content
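The role-based rules above reduce to a small predicate (a sketch; the `Role` type and function name are illustrative):

```typescript
// Sketch of the cut-point validity rule: user and assistant messages are
// valid cut points; tool results never are, since they must stay with the
// assistant message that issued the tool calls.
type Role = "user" | "assistant" | "tool_result";

function isValidCutPoint(role: Role): boolean {
  return role === "user" || role === "assistant";
}
```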
## CompactionEntry
## Branch Summarization

### When It Triggers

Branch summarization occurs when navigating to a different point in the session tree via `/tree` and the user chooses to summarize the abandoned branch.
### How It Works
#### Step 1: Collect Entries
Walk from the current leaf (the branch being abandoned) back to the common ancestor with the target position. All entries along this path are collected for summarization.
#### Step 2: Prepare with Token Budget
Walk entries from newest to oldest, adding messages until the token budget is reached. This ensures the most recent context is preserved when the branch is too long for the summarization model's context window.
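A sketch of the budgeted selection (names and the token estimator are illustrative):

```typescript
// Take messages newest-to-oldest until the summarization token budget is
// exhausted, so the most recent context survives when the branch is too
// long for the summarization model's context window.
function selectWithinBudget(
  messages: { text: string }[],
  budgetTokens: number,
  estimate: (text: string) => number = (t) => Math.ceil(t.length / 4),
): { text: string }[] {
  const picked: { text: string }[] = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimate(messages[i].text);
    if (used + cost > budgetTokens) break;
    picked.unshift(messages[i]); // preserve chronological order
    used += cost;
  }
  return picked;
}
```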
#### Step 3: Extract File Operations

Collect file operations from:

- Tool calls in assistant messages (`read`, `write`, `edit` tools)
- Existing `BranchSummaryEntry` details (for cumulative tracking across multiple navigations)
#### Step 4: Generate Summary
The collected messages are serialized and sent to the LLM with a summarization prompt. The LLM produces a structured summary.
#### Step 5: Append BranchSummaryEntry

A `BranchSummaryEntry` is appended to the session at the branch point (the target position).
### Cumulative File Tracking

File operations are tracked cumulatively across branch summaries. When summarizing a branch that itself contains `BranchSummaryEntry` entries, the file operations from those entries are merged with the new file operations. This ensures complete file tracking even across multiple tree navigations.
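The merge step can be sketched as a set union (the `FileOps` shape is an assumption, not the actual record type):

```typescript
// Sketch of cumulative file tracking: merge file operations recorded in
// prior branch summaries with operations extracted from the branch
// currently being summarized.
interface FileOps { read: Set<string>; written: Set<string>; }

function mergeFileOps(...ops: FileOps[]): FileOps {
  const merged: FileOps = { read: new Set(), written: new Set() };
  for (const o of ops) {
    o.read.forEach((f) => merged.read.add(f));
    o.written.forEach((f) => merged.written.add(f));
  }
  return merged;
}
```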
### BranchSummaryEntry
## Summary Format
Both compaction and branch summaries follow a structured format:
## Message Serialization

Before summarization, messages are serialized to a text format using `serializeConversation()`. This prevents the summarization model from treating the content as a conversation to continue:

The serialized format uses clear role markers (`[USER]`, `[ASSISTANT]`, `[TOOL_RESULT]`) and separates messages with dividers.
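The serialized form might look like the following sketch; the exact divider string and message shape used by `serializeConversation()` may differ:

```typescript
// Sketch of serialization with role markers and dividers.
interface ChatMsg { role: "user" | "assistant" | "tool_result"; content: string; }

function serializeConversationSketch(messages: ChatMsg[]): string {
  const marker: Record<ChatMsg["role"], string> = {
    user: "[USER]",
    assistant: "[ASSISTANT]",
    tool_result: "[TOOL_RESULT]",
  };
  return messages
    .map((m) => `${marker[m.role]}\n${m.content}`)
    .join("\n---\n");
}
```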
## Custom Summarization via Extensions

### `session_before_compact`

Extensions can intercept compaction and provide custom summaries:
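One possible handler shape, purely as an illustration: the event and return types below are assumptions, not the documented extension API. Returning a summary replaces the built-in one; returning nothing falls through to the default summarization.

```typescript
// Hypothetical session_before_compact handler sketch.
interface BeforeCompactEvent { serializedHistory: string; previousSummary?: string; }
interface BeforeCompactResult { summary: string; }

function onBeforeCompact(
  event: BeforeCompactEvent,
): BeforeCompactResult | undefined {
  // Example policy: keep only lines mentioning file paths, as a crude summary.
  const fileLines = event.serializedHistory
    .split("\n")
    .filter((l) => l.includes("/"));
  return fileLines.length > 0 ? { summary: fileLines.join("\n") } : undefined;
}
```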
### `session_before_tree`

Extensions can intercept branch summarization:
## Settings

Compaction behavior is controlled via `settings.jsonl`:
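As an illustration only, a `settings.jsonl` line might group the parameters described above; the `compaction` and `enabled` key names are assumptions, while `reserveTokens` and `keepRecentTokens` (both defaulting to 16,384) are the parameters named in this document:

```json
{"compaction": {"enabled": true, "reserveTokens": 16384, "keepRecentTokens": 16384}}
```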
You can also toggle auto-compaction at runtime:
Or manually trigger compaction: