Conversation Management
The Conversation API provides automatic multi-turn history management backed by an actor. Every Conversation is Clone + Send + 'static, uses zero mutexes, and can be safely shared across async tasks.
Architecture overview
Conversation is a thin handle to a ConversationActor that owns the conversation history. All mutations are serialized through the actor's mailbox, which means:
- No mutexes -- state is protected by the actor mailbox, not locks
- Atomic reads -- len(), is_empty(), and should_exit() use atomics for lock-free access
- Watch channels -- history() and system_prompt() use tokio::sync::watch for efficient snapshots
- Clone + Send + 'static -- the handle can be cloned and shared freely across tasks
                +-------------------+
conv.send() ->  | ConversationActor |   (owns Vec<Message>)
                |  - push user msg  |
                |  - call LLM       |
                |  - push assistant |
                +-------------------+
                          |
                   watch channel
                          |
                conv.history()   (snapshot)
Creating a conversation
Use the ConversationBuilder returned by ActonAI::conversation():
use acton_ai::prelude::*;
let runtime = ActonAI::builder()
.app_name("my-app")
.from_config()?
.with_builtins()
.launch()
.await?;
let conv = runtime.conversation()
.system("You are a helpful Rust tutor.")
.build()
.await;
Builder methods
| Method | Description |
|---|---|
| .system("prompt") | Set the system prompt for all messages |
| .restore(messages) | Restore history from a previous session |
| .with_exit_tool() | Enable the built-in exit detection tool |
| .without_exit_tool() | Explicitly disable the exit tool |
| .build().await | Spawn the actor and return a Conversation |
| .run_chat().await | Build and immediately start an interactive chat loop |
| .run_chat_with(config).await | Build and start a chat loop with custom config |
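Several of these options compose on a single builder call. For example (saved_history here stands for a hypothetical Vec<Message> restored from an earlier session):
let conv = runtime.conversation()
    .system("You are a support agent.")
    .restore(saved_history) // hypothetical Vec<Message> from a previous session
    .with_exit_tool()
    .build()
    .await;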
Sending messages and getting responses
The send() method is the primary way to interact with a conversation. It automatically:
- Adds the user message to the history
- Sends the full history to the LLM
- Adds the assistant's response to the history
- Returns the collected response
let response = conv.send("What is ownership in Rust?").await?;
println!("Assistant: {}", response.text);
// The conversation remembers context
let response = conv.send("How does borrowing relate to that?").await?;
println!("Assistant: {}", response.text);
// The LLM sees both the ownership question and its answer as context
The returned CollectedResponse includes:
- text -- the full response text
- token_count -- number of tokens used
- stop_reason -- why the LLM stopped generating
- tool_calls -- any tools the LLM invoked
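A quick sketch of reading those fields. The exact field types are not shown here, so this assumes token_count is numeric, stop_reason implements Debug, and tool_calls is a Vec:
let response = conv.send("Summarize ownership in one sentence.").await?;
println!("Text: {}", response.text);
println!("Tokens used: {}", response.token_count);
println!("Stop reason: {:?}", response.stop_reason);
if !response.tool_calls.is_empty() {
    println!("Tools invoked: {}", response.tool_calls.len());
}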
Streaming within conversations
For real-time token delivery, use send_streaming() with a token-handling actor:
use acton_ai::prelude::*;
use acton_ai::conversation::StreamToken;
use std::io::Write;
// Create a token-handling actor
let mut actor_runtime = runtime.runtime().clone();
let mut token_actor = actor_runtime.new_actor::<MyTokenPrinter>();
token_actor.mutate_on::<StreamToken>(|_actor, ctx| {
print!("{}", ctx.message().text);
std::io::stdout().flush().ok();
Reply::ready()
});
let token_handle = token_actor.start().await;
// Stream tokens to the actor
let response = conv.send_streaming("Tell me about Rust's type system", &token_handle).await?;
println!(); // Newline after streaming
StreamToken message type
The StreamToken message has a single field text: String containing the token. Register a handler for this message type on any actor to receive streaming tokens from conversations.
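The streaming example above leaves MyTokenPrinter undefined; it is your own actor state type. A minimal sketch, assuming the actor runtime only requires the state type to derive Default and Debug:
// Hypothetical token-printing actor state -- it carries no data of its own;
// the StreamToken handler registered on it does all the work.
#[derive(Default, Debug)]
struct MyTokenPrinter;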
History management
Getting the history
history() returns a snapshot of the current conversation:
let messages = conv.history();
for msg in &messages {
println!("{:?}: {}", msg.role, msg.content);
}
Checking history size
Use the lock-free atomic accessors:
println!("Messages: {}", conv.len());
println!("Empty: {}", conv.is_empty());
Clearing history
Reset the conversation to start fresh while keeping the system prompt:
conv.send("Topic A discussion...").await?;
conv.clear(); // Fire-and-forget, processed after in-flight sends
conv.send("Topic B discussion...").await?;
// The LLM only sees Topic B, not Topic A
Restoring history
Load a previously saved conversation when building:
use acton_ai::messages::Message;
let saved_history = vec![
Message::user("What is Rust?"),
Message::assistant("Rust is a systems programming language..."),
];
let conv = runtime.conversation()
.system("You are a Rust tutor.")
.restore(saved_history)
.build()
.await;
// The conversation continues from where it left off
let response = conv.send("Tell me more about its memory model.").await?;
Context window management
Every turn of a Conversation sends the entire accumulated history to the LLM. Without bounds, a long-lived session eventually exceeds the provider's context window and fails at the model boundary, and token cost per turn grows linearly with session length.
By default, acton-ai truncates the history on each turn to fit a configurable token budget using the KeepRecent strategy — the newest user message is always kept; older turns are dropped until everything fits. The system prompt is carried out-of-band by the prompt builder and is never subject to truncation.
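Conceptually, KeepRecent behaves like the sketch below. This is an illustration of the idea, not the library's internal code: estimate_tokens is a rough stand-in for the tiktoken-based estimator described under Default budget, and it assumes Message exposes its text as a plain content string.
use acton_ai::messages::Message;
// Illustration only: drop the oldest messages until the estimate fits the budget.
fn keep_recent(mut history: Vec<Message>, budget_tokens: usize) -> Vec<Message> {
    while history.len() > 1
        && history.iter().map(estimate_tokens).sum::<usize>() > budget_tokens
    {
        history.remove(0); // oldest first; the newest user message is never dropped
    }
    history
}
// Crude stand-in for the real estimator (~4 characters per token).
fn estimate_tokens(msg: &Message) -> usize {
    msg.content.len() / 4
}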
Default budget
The runtime resolves the budget at launch() time with this precedence:
- Per-provider context_window_tokens for the default provider (if set).
- Global [context] max_tokens in acton-ai.toml.
- Built-in default of 8192 tokens (with 1024 reserved for the response).
The default token estimator is tiktoken-rs: o200k_base for GPT-4o / o-series models, cl100k_base for GPT-4 / GPT-3.5, and cl100k_base as a fallback for Anthropic and Ollama models (accurate ±10% — sufficient for budgeting).
Configuring via TOML
[providers.ollama]
type = "ollama"
model = "qwen2.5:7b"
context_window_tokens = 32000 # native limit for this model
[providers.claude]
type = "anthropic"
model = "claude-sonnet-4-20250514"
context_window_tokens = 200000 # Claude's native limit
[context]
max_tokens = 8192 # fallback when provider doesn't set it
reserved_for_response = 1024
strategy = "keep-recent" # "keep-recent" | "keep-system-and-recent" | "keep-ends"
Overriding or opting out in code
use acton_ai::prelude::*;
use acton_ai::memory::{ContextWindow, ContextWindowConfig, TruncationStrategy};
// Custom window for the whole runtime
let cw = ContextWindow::new(
ContextWindowConfig {
max_tokens: 16_384,
truncation_strategy: TruncationStrategy::KeepEnds,
reserved_for_response: 2048,
tokens_per_char: 0.25,
}
);
let runtime = ActonAI::builder()
.app_name("my-app")
.ollama("qwen2.5:7b")
.context_window(cw)
.launch()
.await?;
// Or opt out entirely (unbounded history per turn — the pre-wiring behavior)
let runtime = ActonAI::builder()
.app_name("my-app")
.ollama("qwen2.5:7b")
.without_context_window()
.launch()
.await?;
Per-Conversation overrides work the same way:
let conv = runtime.conversation()
.system("You are brief.")
.without_context_window() // this conversation ships full history
.build()
.await;
Observing truncation
When history is actually clipped, a tracing::warn! fires once per turn with the drop counts. With -v on the CLI:
WARN acton_ai::conversation: truncated conversation history to fit context window
dropped_messages=12 dropped_tokens=4231 kept_messages=8 kept_tokens=7801 max_tokens=8192
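Outside the CLI, any tracing subscriber registered at WARN level (or more verbose) will surface the same event. A typical setup in your own binary, assuming you have not already installed a subscriber elsewhere:
// Standard tracing-subscriber setup; WARN and above are printed to the console.
tracing_subscriber::fmt()
    .with_max_level(tracing::Level::WARN)
    .init();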
System prompt management
Setting the system prompt at build time
let conv = runtime.conversation()
.system("You are a concise assistant. Answer in one sentence.")
.build()
.await;
Changing the system prompt mid-conversation
// Read the current prompt
if let Some(prompt) = conv.system_prompt() {
println!("Current: {}", prompt);
}
// Change it (takes effect on the next send)
conv.set_system_prompt("You are now a creative writing assistant.");
// Clear it entirely
conv.clear_system_prompt();
Fire-and-forget updates
set_system_prompt() and clear_system_prompt() are fire-and-forget operations. They send a message to the actor and return immediately. The change takes effect on the next send() call.
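In practice you can flip the prompt immediately before a send; because both messages go through the same mailbox in order, the prompt update is applied before that send is processed:
conv.set_system_prompt("Answer only in haiku."); // returns immediately
let response = conv.send("Describe the borrow checker.").await?;
println!("{}", response.text); // answered under the new system prompt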
Exit tool and interactive chat loops
The exit tool
The Conversation includes a built-in exit_conversation tool that the LLM can call when it detects the user wants to leave. When called, it sets an atomic flag you can check with should_exit().
let conv = runtime.conversation()
.system("Help the user. Use exit_conversation when they say goodbye.")
.with_exit_tool()
.build()
.await;
loop {
let input = read_user_input();
let response = conv.send(&input).await?;
println!("{}", response.text);
if conv.should_exit() {
println!("Goodbye!");
break;
}
}
You can also clear the exit flag for confirmation flows:
if conv.should_exit() {
println!("Are you sure you want to leave? (yes/no)");
let answer = read_user_input();
if answer != "yes" {
conv.clear_exit(); // Reset and continue
continue;
}
}
run_chat() -- the minimal chat loop
run_chat() handles stdin reading, streaming, exit detection, and EOF in a single call:
use acton_ai::prelude::*;
ActonAI::builder()
.app_name("chat")
.from_config()?
.with_builtins()
.launch()
.await?
.conversation()
.run_chat()
.await?;
This is equivalent to building a conversation and calling run_chat() on it. The exit tool is automatically enabled, and a default system prompt is used if none was set.
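Spelled out, the long form looks roughly like this (a sketch; per the note above, the one-liner also enables the exit tool and falls back to the default system prompt for you):
let runtime = ActonAI::builder()
    .app_name("chat")
    .from_config()?
    .with_builtins()
    .launch()
    .await?;
let conv = runtime.conversation()
    .with_exit_tool()
    .build()
    .await;
conv.run_chat().await?;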
run_chat_with() -- customized chat loops
Use ChatConfig to customize prompts and input processing:
use acton_ai::prelude::*;
use acton_ai::conversation::ChatConfig;
let conv = runtime.conversation()
.system("You are a coding assistant.")
.build()
.await;
conv.run_chat_with(
ChatConfig::new()
.user_prompt(">>> ") // Custom input prompt
.assistant_prompt("AI: ") // Custom response prefix
.map_input(|s| { // Transform input before sending
format!("[user:admin] {}", s)
})
).await?;
ChatConfig options
| Method | Default | Description |
|---|---|---|
| .user_prompt(">>> ") | "You: " | Prompt shown before user input |
| .assistant_prompt("AI: ") | "Assistant: " | Prefix before assistant responses |
| .map_input(fn) | None | Transform user input before sending to LLM |
The map_input callback is useful for injecting context, adding metadata, or preprocessing user input:
ChatConfig::new()
.map_input(|input| {
let timestamp = chrono::Local::now().format("%H:%M:%S");
format!("[{}] {}", timestamp, input)
})
Default system prompt
When run_chat() or run_chat_with() is called without a system prompt, this default is used:
You are a helpful assistant with access to various tools.
Use tools when appropriate to help the user.
When the user wants to end the conversation (says goodbye, bye, quit, exit, etc.),
use the exit_conversation tool.
Zero-mutex design
The Conversation handle achieves thread safety without any Mutex or RwLock:
| Data | Synchronization | Access pattern |
|---|---|---|
| Conversation history | Actor mailbox + watch::channel | Writes serialized by mailbox; reads via watch snapshot |
| History length | AtomicUsize | Lock-free read with Ordering::SeqCst |
| Exit flag | AtomicBool | Lock-free read/write |
| Exit tool enabled | AtomicBool | Lock-free read/write |
| System prompt | watch::channel | Reads via watch snapshot |
This design means:
- send() blocks the mailbox during the LLM call, guaranteeing ordering
- history() returns an instant snapshot without waiting for in-flight sends
- len(), is_empty(), and should_exit() are always non-blocking
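A small illustration of those guarantees (a sketch): while a send runs in another task, the read-side accessors return immediately with whatever the actor has committed so far:
let background = conv.clone();
let in_flight = tokio::spawn(async move {
    background.send("A long question...").await
});
// These never wait on the in-flight send.
println!("Messages so far: {}", conv.len());
let snapshot = conv.history();
println!("Snapshot size: {}", snapshot.len());
let _ = in_flight.await;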
Sharing conversations across tasks
Because Conversation is Clone + Send + 'static, you can share it across tokio tasks:
let conv = runtime.conversation()
.system("You are helpful.")
.build()
.await;
// Clone for use in another task
let conv_clone = conv.clone();
let handle = tokio::spawn(async move {
let response = conv_clone.send("Background question").await?;
Ok::<_, ActonAIError>(response.text)
});
// Meanwhile, use the original
let response = conv.send("Foreground question").await?;
Serialized sends
While Conversation is safe to share across tasks, sends are serialized through the actor mailbox. Two concurrent send() calls will execute one after the other, not in parallel. This is by design -- it guarantees history consistency.
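For example, two sends issued concurrently both complete, but the actor handles them one after the other, so each reply sees a consistent history:
let (first, second) = tokio::join!(
    conv.send("First question"),
    conv.send("Second question"),
);
println!("{}", first?.text);
println!("{}", second?.text);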
Next steps
- Multi-Agent Collaboration -- coordinate multiple conversations across agents
- Error Handling -- handle ActonAIError from conversation operations
- Testing Your Agents -- test conversation flows with mock providers