Conversation Management

The Conversation API provides automatic multi-turn history management backed by an actor. Every Conversation is Clone + Send + 'static, uses zero mutexes, and can be safely shared across async tasks.


Architecture overview

Conversation is a thin handle to a ConversationActor that owns the conversation history. All mutations are serialized through the actor's mailbox, which means:

  • No mutexes -- state is protected by the actor mailbox, not locks
  • Atomic reads -- len(), is_empty(), and should_exit() use atomics for lock-free access
  • Watch channels -- history() and system_prompt() use tokio::sync::watch for efficient snapshots
  • Clone + Send + 'static -- the handle can be cloned and shared freely across tasks

                  +-------------------+
  conv.send() ->  | ConversationActor |  (owns Vec<Message>)
                  |  - push user msg  |
                  |  - call LLM       |
                  |  - push assistant |
                  +-------------------+
                            |
                      watch channel
                            |
                  conv.history()  (snapshot)

Creating a conversation

Use the ConversationBuilder returned by ActonAI::conversation():

use acton_ai::prelude::*;

let runtime = ActonAI::builder()
    .app_name("my-app")
    .from_config()?
    .with_builtins()
    .launch()
    .await?;

let conv = runtime.conversation()
    .system("You are a helpful Rust tutor.")
    .build()
    .await;

Builder methods

  Method                         Description
  .system("prompt")              Set the system prompt for all messages
  .restore(messages)             Restore history from a previous session
  .with_exit_tool()              Enable the built-in exit detection tool
  .without_exit_tool()           Explicitly disable the exit tool
  .build().await                 Spawn the actor and return a Conversation
  .run_chat().await              Build and immediately start an interactive chat loop
  .run_chat_with(config).await   Build and start a chat loop with custom config

Sending messages and getting responses

The send() method is the primary way to interact with a conversation. It automatically:

  1. Adds the user message to the history
  2. Sends the full history to the LLM
  3. Adds the assistant's response to the history
  4. Returns the collected response

let response = conv.send("What is ownership in Rust?").await?;
println!("Assistant: {}", response.text);

// The conversation remembers context
let response = conv.send("How does borrowing relate to that?").await?;
println!("Assistant: {}", response.text);
// The LLM sees both the ownership question and its answer as context

The returned CollectedResponse includes:

  • text -- the full response text
  • token_count -- number of tokens used
  • stop_reason -- why the LLM stopped generating
  • tool_calls -- any tools the LLM invoked

Streaming within conversations

For real-time token delivery, use send_streaming() with a token-handling actor:

use acton_ai::prelude::*;
use acton_ai::conversation::StreamToken;
use std::io::Write;

// A minimal token-printing actor type (assuming the actor framework only
// needs a Default-constructible struct here)
#[derive(Default, Debug)]
struct MyTokenPrinter;

// Create a token-handling actor
let mut actor_runtime = runtime.runtime().clone();
let mut token_actor = actor_runtime.new_actor::<MyTokenPrinter>();
token_actor.mutate_on::<StreamToken>(|_actor, ctx| {
    print!("{}", ctx.message().text);
    std::io::stdout().flush().ok();
    Reply::ready()
});
let token_handle = token_actor.start().await;

// Stream tokens to the actor
let response = conv.send_streaming("Tell me about Rust's type system", &token_handle).await?;
println!(); // Newline after streaming

StreamToken message type

The StreamToken message has a single field text: String containing the token. Register a handler for this message type on any actor to receive streaming tokens from conversations.


History management

Getting the history

history() returns a snapshot of the current conversation:

let messages = conv.history();
for msg in &messages {
    println!("{:?}: {}", msg.role, msg.content);
}

Checking history size

Use the lock-free atomic accessors:

println!("Messages: {}", conv.len());
println!("Empty: {}", conv.is_empty());

Clearing history

Reset the conversation to start fresh while keeping the system prompt:

conv.send("Topic A discussion...").await?;
conv.clear();  // Fire-and-forget, processed after in-flight sends
conv.send("Topic B discussion...").await?;
// The LLM only sees Topic B, not Topic A

Restoring history

Load a previously saved conversation when building:

use acton_ai::messages::Message;

let saved_history = vec![
    Message::user("What is Rust?"),
    Message::assistant("Rust is a systems programming language..."),
];

let conv = runtime.conversation()
    .system("You are a Rust tutor.")
    .restore(saved_history)
    .build()
    .await;

// The conversation continues from where it left off
let response = conv.send("Tell me more about its memory model.").await?;

Context window management

Every turn of a Conversation sends the entire accumulated history to the LLM. Without bounds, a long-lived session eventually exceeds the provider's context window and fails at the model boundary, while token cost per turn grows linearly with session length.

By default, acton-ai truncates the history on each turn to fit a configurable token budget using the KeepRecent strategy -- the newest user message is always kept; older turns are dropped until everything fits. The system prompt is carried out-of-band by the prompt builder and is never subject to truncation.
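
The KeepRecent pass can be pictured with a short self-contained sketch. This is not the library's implementation -- the Msg type and per-message token counts here are simplified stand-ins for illustration:

```rust
#[derive(Debug, Clone, PartialEq)]
struct Msg {
    text: String,
    tokens: usize, // pre-computed estimate for this message
}

/// Drop the oldest messages until the history fits `budget` tokens.
/// The newest message (the current user turn) is always kept, even if
/// it alone exceeds the budget.
fn keep_recent(history: &[Msg], budget: usize) -> Vec<Msg> {
    let mut kept: Vec<Msg> = Vec::new();
    let mut used = 0;
    // Walk newest-to-oldest, keeping messages while they fit.
    for (i, msg) in history.iter().rev().enumerate() {
        if i == 0 || used + msg.tokens <= budget {
            used += msg.tokens;
            kept.push(msg.clone());
        } else {
            break; // everything older is dropped too
        }
    }
    kept.reverse(); // restore chronological order
    kept
}

fn main() {
    let history: Vec<Msg> = (1..=5)
        .map(|i| Msg { text: format!("turn {i}"), tokens: 100 })
        .collect();
    let kept = keep_recent(&history, 250);
    // Only the two newest turns fit a 250-token budget.
    assert_eq!(kept.len(), 2);
    assert_eq!(kept[0].text, "turn 4");
}
```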

Default budget

The runtime resolves the budget at launch() time with this precedence:

  1. Per-provider context_window_tokens for the default provider (if set).
  2. Global [context] max_tokens in acton-ai.toml.
  3. Built-in default of 8192 tokens (with 1024 reserved for the response).

The default token estimator is tiktoken-rs: o200k_base for GPT-4o / o-series models, cl100k_base for GPT-4 / GPT-3.5, and cl100k_base as a fallback for Anthropic and Ollama models (accurate ±10% -- sufficient for budgeting).
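
As a rough mental model for budgeting, the character-ratio fallback (the tokens_per_char field of ContextWindowConfig, 0.25, i.e. roughly 4 characters per token) behaves like this simplified estimator -- not the tiktoken-rs path the runtime actually uses by default:

```rust
// A simplified character-ratio token estimator, for intuition only.
// The library's default estimator is tiktoken-rs; this sketch mirrors
// the `tokens_per_char` fallback ratio instead.
fn estimate_tokens(text: &str, tokens_per_char: f64) -> usize {
    (text.chars().count() as f64 * tokens_per_char).ceil() as usize
}

fn main() {
    // 44 characters * 0.25 -> 11 estimated tokens
    let prompt = "The quick brown fox jumps over the lazy dog.";
    assert_eq!(prompt.chars().count(), 44);
    assert_eq!(estimate_tokens(prompt, 0.25), 11);
}
```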

Configuring via TOML

[providers.ollama]
type = "ollama"
model = "qwen2.5:7b"
context_window_tokens = 32000    # native limit for this model

[providers.claude]
type = "anthropic"
model = "claude-sonnet-4-20250514"
context_window_tokens = 200000   # Claude's native limit

[context]
max_tokens = 8192                # fallback when provider doesn't set it
reserved_for_response = 1024
strategy = "keep-recent"         # "keep-recent" | "keep-system-and-recent" | "keep-ends"

Overriding or opting out in code

use acton_ai::prelude::*;
use acton_ai::memory::{ContextWindow, ContextWindowConfig, TruncationStrategy};

// Custom window for the whole runtime
let cw = ContextWindow::new(
    ContextWindowConfig {
        max_tokens: 16_384,
        truncation_strategy: TruncationStrategy::KeepEnds,
        reserved_for_response: 2048,
        tokens_per_char: 0.25,
    }
);

let runtime = ActonAI::builder()
    .app_name("my-app")
    .ollama("qwen2.5:7b")
    .context_window(cw)
    .launch()
    .await?;

// Or opt out entirely (unbounded history per turn -- the pre-wiring behavior)
let runtime = ActonAI::builder()
    .app_name("my-app")
    .ollama("qwen2.5:7b")
    .without_context_window()
    .launch()
    .await?;

Per-Conversation overrides work the same way:

let conv = runtime.conversation()
    .system("You are brief.")
    .without_context_window()    // this conversation ships full history
    .build()
    .await;

Observing truncation

When history is actually clipped, a tracing::warn! fires once per turn with the drop counts. With -v on the CLI:

WARN acton_ai::conversation: truncated conversation history to fit context window
    dropped_messages=12 dropped_tokens=4231 kept_messages=8 kept_tokens=7801 max_tokens=8192

System prompt management

Setting the system prompt at build time

let conv = runtime.conversation()
    .system("You are a concise assistant. Answer in one sentence.")
    .build()
    .await;

Changing the system prompt mid-conversation

// Read the current prompt
if let Some(prompt) = conv.system_prompt() {
    println!("Current: {}", prompt);
}

// Change it (takes effect on the next send)
conv.set_system_prompt("You are now a creative writing assistant.");

// Clear it entirely
conv.clear_system_prompt();

Fire-and-forget updates

set_system_prompt() and clear_system_prompt() are fire-and-forget operations. They send a message to the actor and return immediately. The change takes effect on the next send() call.
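
The ordering guarantee behind "takes effect on the next send()" can be pictured with a plain channel-and-worker sketch. Here std::sync::mpsc stands in for the actor mailbox; Command and run_mailbox are illustrative, not library types:

```rust
use std::sync::mpsc;
use std::sync::mpsc::channel;
use std::thread;

// Illustrative mailbox commands; not library types.
#[derive(Debug)]
enum Command {
    SetSystemPrompt(String),
    Send(String),
}

/// Feed commands through a channel to a worker thread that applies them
/// strictly in arrival order; return (prompt_in_effect, text) per Send.
fn run_mailbox(commands: Vec<Command>) -> Vec<(String, String)> {
    let (tx, rx) = channel::<Command>();
    let worker = thread::spawn(move || {
        let mut prompt = String::new();
        let mut seen = Vec::new();
        for cmd in rx {
            match cmd {
                Command::SetSystemPrompt(p) => prompt = p,
                Command::Send(text) => seen.push((prompt.clone(), text)),
            }
        }
        seen
    });
    for cmd in commands {
        tx.send(cmd).unwrap(); // fire-and-forget: returns immediately
    }
    drop(tx); // close the mailbox so the worker drains and exits
    worker.join().unwrap()
}

fn main() {
    let seen = run_mailbox(vec![
        Command::SetSystemPrompt("creative writer".into()),
        Command::Send("hello".into()),
    ]);
    // The prompt change was queued first, so the send observes it.
    assert_eq!(seen, vec![("creative writer".to_string(), "hello".to_string())]);
}
```

Sending returns before the worker runs, yet the mailbox order guarantees every command queued earlier is applied first -- the same property the conversation actor relies on.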


Exit tool and interactive chat loops

The exit tool

The Conversation includes a built-in exit_conversation tool that the LLM can call when it detects the user wants to leave. When called, it sets an atomic flag you can check with should_exit().

let conv = runtime.conversation()
    .system("Help the user. Use exit_conversation when they say goodbye.")
    .with_exit_tool()
    .build()
    .await;

loop {
    let input = read_user_input();
    let response = conv.send(&input).await?;
    println!("{}", response.text);

    if conv.should_exit() {
        println!("Goodbye!");
        break;
    }
}

You can also clear the exit flag for confirmation flows:

if conv.should_exit() {
    println!("Are you sure you want to leave? (yes/no)");
    let answer = read_user_input();
    if answer != "yes" {
        conv.clear_exit();  // Reset and continue
        continue;
    }
}

run_chat() -- the minimal chat loop

run_chat() handles stdin reading, streaming, exit detection, and EOF in a single call:

use acton_ai::prelude::*;

ActonAI::builder()
    .app_name("chat")
    .from_config()?
    .with_builtins()
    .launch()
    .await?
    .conversation()
    .run_chat()
    .await?;

This is equivalent to building a conversation and calling run_chat() on it. The exit tool is automatically enabled, and a default system prompt is used if none was set.

run_chat_with() -- customized chat loops

Use ChatConfig to customize prompts and input processing:

use acton_ai::prelude::*;
use acton_ai::conversation::ChatConfig;

let conv = runtime.conversation()
    .system("You are a coding assistant.")
    .build()
    .await;

conv.run_chat_with(
    ChatConfig::new()
        .user_prompt(">>> ")           // Custom input prompt
        .assistant_prompt("AI: ")      // Custom response prefix
        .map_input(|s| {               // Transform input before sending
            format!("[user:admin] {}", s)
        })
).await?;

ChatConfig options

  Method                      Default        Description
  .user_prompt(">>> ")        "You: "        Prompt shown before user input
  .assistant_prompt("AI: ")   "Assistant: "  Prefix before assistant responses
  .map_input(fn)              None           Transform user input before sending to LLM

The map_input callback is useful for injecting context, adding metadata, or preprocessing user input:

ChatConfig::new()
    .map_input(|input| {
        let timestamp = chrono::Local::now().format("%H:%M:%S");
        format!("[{}] {}", timestamp, input)
    })

Default system prompt

When run_chat() or run_chat_with() is called without a system prompt, this default is used:

You are a helpful assistant with access to various tools.
Use tools when appropriate to help the user.
When the user wants to end the conversation (says goodbye, bye, quit, exit, etc.),
use the exit_conversation tool.

Zero-mutex design

The Conversation handle achieves thread safety without any Mutex or RwLock:

  Data                   Synchronization                  Access pattern
  Conversation history   Actor mailbox + watch::channel   Writes serialized by mailbox; reads via watch snapshot
  History length         AtomicUsize                      Lock-free read with Ordering::SeqCst
  Exit flag              AtomicBool                       Lock-free read/write
  Exit tool enabled      AtomicBool                       Lock-free read/write
  System prompt          watch::channel                   Reads via watch snapshot

This design means:

  • send() blocks the mailbox during the LLM call, guaranteeing ordering
  • history() returns an instant snapshot without waiting for in-flight sends
  • len(), is_empty(), and should_exit() are always non-blocking
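
The lock-free accessor pattern can be sketched in plain Rust. HandleState below is an illustrative stand-in, not the library's actual struct:

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;

// A stripped-down stand-in for the Conversation handle's atomic state.
#[derive(Clone)]
struct HandleState {
    len: Arc<AtomicUsize>,
    should_exit: Arc<AtomicBool>,
}

impl HandleState {
    fn new() -> Self {
        Self {
            len: Arc::new(AtomicUsize::new(0)),
            should_exit: Arc::new(AtomicBool::new(false)),
        }
    }
    // Writer side: the actor bumps the counter after each push.
    fn record_message(&self) {
        self.len.fetch_add(1, Ordering::SeqCst);
    }
    // Reader side: any clone, on any thread, reads without locking.
    fn len(&self) -> usize {
        self.len.load(Ordering::SeqCst)
    }
    fn request_exit(&self) {
        self.should_exit.store(true, Ordering::SeqCst);
    }
    fn should_exit(&self) -> bool {
        self.should_exit.load(Ordering::SeqCst)
    }
}

fn main() {
    let state = HandleState::new();
    let reader = state.clone(); // cheap Arc clone, like Conversation::clone

    state.record_message();
    state.record_message();
    state.request_exit();

    // Reads never block, regardless of which clone performed the writes.
    assert_eq!(reader.len(), 2);
    assert!(reader.should_exit());
}
```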

Sharing conversations across tasks

Because Conversation is Clone + Send + 'static, you can share it across tokio tasks:

let conv = runtime.conversation()
    .system("You are helpful.")
    .build()
    .await;

// Clone for use in another task
let conv_clone = conv.clone();

let handle = tokio::spawn(async move {
    let response = conv_clone.send("Background question").await?;
    Ok::<_, ActonAIError>(response.text)
});

// Meanwhile, use the original
let response = conv.send("Foreground question").await?;

Serialized sends

While Conversation is safe to share across tasks, sends are serialized through the actor mailbox. Two concurrent send() calls will execute one after the other, not in parallel. This is by design -- it guarantees history consistency.

