Multi-turn Conversations

A research session rarely ends after the first answer. Users refine, follow up, branch into a sibling line of inquiry, or back up and restart. This page covers the practical patterns for building those interactions on top of the Research Agent’s conversation primitives. The mechanics of chat_id, checkpoint_id, from_checkpoint_id, persistence_mode, and the "INITIAL" sentinel are documented in Conversation continuity. This page assumes you have read that reference and focuses on when and how to use each pattern in real applications.

The examples on this page set persistence_mode: "enabled" because they illustrate patterns that often outlast a single hour. With the default disabled mode the same patterns work, but ephemeral threads expire about an hour after the last turn; follow-ups after that point return HTTP 404 with detail "Cannot follow up: chat expired". Once set, the mode is immutable for the life of the conversation — a mismatched resume returns HTTP 400 on the Research Agent (HTTP 409 on Workflows). See Conversation continuity for the full persistence model.

A reusable turn function

The examples below share a single helper that issues one request and returns the assembled answer along with the updated chat_id and checkpoint_id. This is the same shape introduced in Conversation continuity, reproduced here for clarity.

import json
import requests


def turn(
    api_key: str,
    message: str,
    chat_id: str | None = None,
    checkpoint_id: str | None = None,
    research_effort: str = "standard",
) -> tuple[str, str, str]:
    """Send one message and return (answer, chat_id, checkpoint_id).

    Pass chat_id to continue a conversation and checkpoint_id to resume
    from a specific point (sent to the API as from_checkpoint_id).
    """
    endpoint = "https://agents.bigdata.com/v1/research-agent"
    headers = {"X-API-KEY": api_key, "Content-Type": "application/json"}
    payload: dict = {
        "message": message,
        "research_effort": research_effort,
        "persistence_mode": "enabled",
    }
    if chat_id:
        payload["chat_id"] = chat_id
    if checkpoint_id:
        payload["from_checkpoint_id"] = checkpoint_id

    answer = ""
    new_chat_id = chat_id or ""
    new_ckpt = ""  # stays empty if the stream ends without a COMPLETE event

    with requests.post(endpoint, headers=headers, json=payload, stream=True, timeout=300) as r:
        r.raise_for_status()  # HTTP-level errors (e.g. 404) surface as requests.HTTPError
        for raw_line in r.iter_lines(decode_unicode=True):
            # Skip blank lines and anything that is not an SSE data line.
            if not raw_line or not raw_line.startswith("data: "):
                continue
            try:
                event = json.loads(raw_line[6:])  # strip the "data: " prefix
            except json.JSONDecodeError:
                continue
            new_chat_id = event.get("chat_id") or new_chat_id
            msg = event.get("message") or {}
            if msg.get("type") == "ANSWER":
                # Answer text arrives in chunks; concatenate them in order.
                answer += msg.get("content") or ""
            elif msg.get("type") == "COMPLETE":
                new_ckpt = msg.get("checkpoint_id") or new_ckpt
            elif msg.get("type") == "ERROR":
                raise RuntimeError(msg.get("error"))

    return answer, new_chat_id, new_ckpt
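
The event loop inside the helper can also be factored into a pure function, which makes the parsing logic testable without a network call. The sketch below assumes the same event shape the helper reads (a top-level chat_id and a message object carrying type, content, and checkpoint_id). The name consume_events is hypothetical and not part of the API.

```python
import json
from typing import Iterable


def consume_events(lines: Iterable[str], chat_id: str = "") -> tuple[str, str, str]:
    """Fold SSE `data: ` lines into (answer, chat_id, checkpoint_id)."""
    answer, ckpt = "", ""
    prefix = "data: "
    for raw_line in lines:
        # Ignore blank lines, comments, and any non-data SSE fields.
        if not raw_line or not raw_line.startswith(prefix):
            continue
        try:
            event = json.loads(raw_line[len(prefix):])
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        chat_id = event.get("chat_id") or chat_id
        msg = event.get("message") or {}
        kind = msg.get("type")
        if kind == "ANSWER":
            answer += msg.get("content") or ""
        elif kind == "COMPLETE":
            ckpt = msg.get("checkpoint_id") or ckpt
        elif kind == "ERROR":
            raise RuntimeError(msg.get("error"))
    # An empty checkpoint here means no COMPLETE event was seen.
    return answer, chat_id, ckpt
```

Inside turn, the loop body could then become a single call such as consume_events(r.iter_lines(decode_unicode=True), chat_id or "").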

Pattern 1: Linear follow-ups

The most common multi-turn shape. Each turn picks up where the last one left off; the agent has the entire prior conversation in context.

api_key = "..."
chat_id = None
ckpt = None

answer, chat_id, ckpt = turn(api_key, "Summarize NVIDIA's Q2 FY26 results.", chat_id, ckpt)
answer, chat_id, ckpt = turn(api_key, "How does that compare to AMD?",        chat_id, ckpt)
answer, chat_id, ckpt = turn(api_key, "Which one has better margins?",         chat_id, ckpt)

After each call, persist chat_id and ckpt so the next turn can build on the previous state. The agent resolves “that” and “Which one” against the answers from prior turns, so your client does not need to resend the prompt history.

When to use

A chat-style UI where each user message is a follow-up to the previous response.
Tool-driven workflows where each step refines the previous result (e.g., “deepen the analysis on point 3”).
Any conversation where conversational context is the value.

Pattern 2: Branching from a checkpoint

A user edits a previous message (or you detect a mistake in an earlier question) and wants to re-run that turn with the corrected input, keeping everything before it intact. Pass the checkpoint_id returned by the turn immediately before the one you want to replace as from_checkpoint_id; the agent discards everything that came after that checkpoint and treats the new message as the replacement.

# Suppose we already have a conversation through three turns, and we stored
# the checkpoint after turn 1 (`ckpt_after_turn_1`).
#
# The user now realizes turn 2's question had a typo and wants to redo it.
answer, chat_id, ckpt = turn(
    api_key,
    "How does NVIDIA's data center segment compare to AMD's, not its gaming segment?",
    chat_id=chat_id,
    checkpoint_id=ckpt_after_turn_1,
)

After this call, the conversation history on the server consists of: turn 1 (unchanged), then the new turn 2. The original turns 2 and 3 no longer exist. The returned ckpt is your new “after turn 2” cursor; future follow-ups should use it.

When to use

An “edit message” affordance in a chat UI.
“Try a different question from here” branching.
Recovering from a context-poisoning mistake without scrubbing the entire conversation.

Storage implication

If you want to support arbitrary branching back to any prior turn, you must store the checkpoint_id returned at the end of every turn, not just the latest. Without per-turn checkpoints you can only branch back as far as the most recent stored one.

Pattern 3: Full reset with `"INITIAL"`

The user explicitly says “let’s start over,” or you detect a deeper context problem (wrong company, stale assumptions) that branching cannot cleanly fix. Pass from_checkpoint_id: "INITIAL" to truncate the entire conversation while keeping the same chat_id.

answer, chat_id, ckpt = turn(
    api_key,
    "Forget the previous topic. Let's analyze Apple's Vision Pro launch instead.",
    chat_id=chat_id,
    checkpoint_id="INITIAL",
)

This keeps any client-side state attached to chat_id (user metadata, thread title, etc.) while clearing the server-side context. There is no undo; gate this behind an explicit user action.

When to use

A “clear conversation” button.
Programmatic recovery after detecting deeply confused agent output.
Reusing a long-lived chat_id for sequential, independent research sessions (rare; usually a fresh chat_id is cleaner).

Pattern 4: Concurrent threads

A user often has multiple independent research threads in flight (one per ticker, one per project, one per industry). Each thread is its own chat_id and progresses independently.

# Maintain a registry on the client side.
threads = {
    "nvda-followups":   {"chat_id": None, "ckpt": None},
    "aapl-due-diligence": {"chat_id": None, "ckpt": None},
}

def ask(thread_key: str, message: str) -> str:
    state = threads[thread_key]
    answer, state["chat_id"], state["ckpt"] = turn(
        api_key, message, state["chat_id"], state["ckpt"]
    )
    return answer

ask("nvda-followups",      "Summarize NVIDIA's most recent earnings.")
ask("aapl-due-diligence",  "What is Apple's exposure to Chinese manufacturing?")
ask("nvda-followups",      "What was the gaming segment trend?")

Each thread has its own state. Do not share chat_id values across threads or users; they are server-side identifiers for a specific conversation.

App-side storage

Two common storage shapes:

Lightweight: persist `chat_id` only

If you only support linear follow-ups and never branching, storing just the latest chat_id per thread is sufficient. Omit from_checkpoint_id and the agent continues from the most recent server-side state:

# Pass chat_id but no checkpoint_id; agent continues from the latest state.
answer, chat_id, ckpt = turn(api_key, message, chat_id=chat_id, checkpoint_id=None)

Full history: persist `chat_id` plus per-turn `checkpoint_id`

If you support branching, store the checkpoint_id returned by every turn, keyed by your client-side turn identifier. A schema sketch:

thread:
  id: string                 # your app's thread id
  bigdata_chat_id: string    # the chat_id from the API
  turns:
    - turn_id: string        # your app's turn id
      checkpoint_id: string  # the checkpoint_id returned in COMPLETE
      user_message: string
      assistant_answer: string

Branching back to a specific turn means looking up its checkpoint_id and passing it as from_checkpoint_id on the next request.
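
The schema and the lookup can be sketched in Python as a pair of dataclasses. The field names mirror the schema above; the method names are hypothetical. The sketch also assumes that replacing the very first turn is done with "INITIAL", since no earlier checkpoint exists, which is an inference from the reset pattern rather than something this page states for branching.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    turn_id: str
    checkpoint_id: str  # the checkpoint_id returned in COMPLETE for this turn
    user_message: str
    assistant_answer: str


@dataclass
class Thread:
    id: str
    bigdata_chat_id: str
    turns: list[Turn] = field(default_factory=list)

    def _index(self, turn_id: str) -> int:
        for i, t in enumerate(self.turns):
            if t.turn_id == turn_id:
                return i
        raise KeyError(turn_id)

    def checkpoint_to_redo(self, turn_id: str) -> str:
        """Return the from_checkpoint_id to send when replacing this turn."""
        index = self._index(turn_id)
        # Assumption: with no earlier checkpoint, fall back to "INITIAL".
        return "INITIAL" if index == 0 else self.turns[index - 1].checkpoint_id

    def replace_from(self, turn_id: str, new_turn: Turn) -> None:
        """Mirror the server: drop this turn and all later ones, then append."""
        index = self._index(turn_id)
        self.turns[index:] = [new_turn]
```

Calling replace_from after a successful branch keeps the local history in step with the server, which no longer holds the discarded turns.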

Error recovery

`404` on `from_checkpoint_id`

The server returns 404 when a checkpoint or chat can no longer be resumed. Causes:

The conversation was reset with "INITIAL" since the checkpoint was stored.
The conversation was created with the default persistence_mode: "disabled" and more than an hour has elapsed since the last turn. Ephemeral threads expire with detail "Cannot follow up: chat expired". Set persistence_mode: "enabled" on turn 1 to avoid this.
A different chat_id is being passed than the one the checkpoint was created under.

Recovery flow:

import requests

try:
    answer, chat_id, ckpt = turn(api_key, message, chat_id, stored_ckpt)
except requests.HTTPError as exc:
    if exc.response is not None and exc.response.status_code == 404:
        # Checkpoint is gone; fall back to the latest state on this chat_id.
        answer, chat_id, ckpt = turn(api_key, message, chat_id, None)
    else:
        raise

For the full error model (HTTP-level errors, in-stream ERROR / TOOL_ERROR / LLM_RETRY), see Error handling.

Truncated stream mid-turn

If the SSE stream terminates without a COMPLETE event (connection drop, client crash, server-side abort), the conversation state on the server is ambiguous: the turn may or may not have been recorded. The safest recovery is to retry the turn using the previous checkpoint_id (the one stored before the failed turn began), which guarantees the server starts from the known-good state.

Anti-patterns

Sharing a chat_id across users. The chat_id is the conversation; giving the same one to two users mixes their messages on the server. Allocate one per logical conversation.
Parsing or comparing checkpoint_id. It is an opaque string. Treat it as a token to round-trip, not data to introspect.
Treating a stored checkpoint_id as permanent. Resets and ephemeral-thread expiry can invalidate it. Always be prepared to fall back gracefully.
Treating the default disabled mode as long-lived. Ephemeral threads expire about an hour after the last turn; conversations that need to outlast a single working session should set persistence_mode: "enabled" on turn 1. The mode is fixed at conversation creation and cannot be changed later.
Setting persistence_mode inconsistently across turns. Every turn must pass the same value used on turn 1. Mismatched resumes return HTTP 400 (Research Agent) or HTTP 409 (Workflows).
Mixing checkpoints across chat_ids. A checkpoint_id from one conversation is meaningless in another; pass it only with the chat_id it was returned from.

Tips

Prefer the implicit-latest behavior for linear chat UIs. If your product only supports “next message in this conversation,” you can skip per-turn checkpoint_id storage entirely and just pass chat_id — the agent uses the latest state by default.
Mark branch points explicitly in your UI. When a user backtracks and branches, surface that they’ve created an alternate thread (or that the later messages have been discarded). The behavior can otherwise be confusing.
Limit how far back branching is offered. Stored checkpoints accumulate storage cost on your side. Capping branching to the last N turns (e.g., last 10) is a reasonable default.

Next steps

Conversation continuity

The protocol-level reference for chat_id, checkpoint_id, and persistence.

Error handling

Recover from 404 on stale checkpoints and other failure modes.

Search tool configuration

Carry a scoped search config across follow-up turns of the same conversation.

Structured output extraction

Extract typed JSON from every turn of a multi-turn conversation.
