The structured_output_schema field on a request
asks the agent to produce a parallel JSON object that conforms to a schema you
provide, extracted from the same research material that produced the answer.
The schema acts as a typed contract between the agent and your code.
This page covers how structured_output_schema works end-to-end: when the
extraction runs, the schema constraints, a worked example, schema-design tips,
and how to combine structured output with the other Research Agent features.
How it works
structured_output_schema is a JSON Schema document attached to the request.
When set, the agent runs a dedicated extraction step after the main
answer has been produced. The extracted object is emitted as a single
STRUCTURED_OUTPUT message before the final COMPLETE event.
If the extraction cannot be completed, no STRUCTURED_OUTPUT event is emitted (see Schema conformance).
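The message order described above can be sketched on the client side as follows. This is a minimal sketch, not the documented wire format: the message shape (a type key and a content key), the ANSWER message type, and the split_stream helper are all assumptions made for illustration.

```python
from typing import Any, Iterable, Optional


def split_stream(messages: Iterable[dict[str, Any]]) -> tuple[Optional[dict], bool]:
    """Return (structured_object_or_None, completed_flag) from a message stream."""
    structured = None
    completed = False
    for message in messages:
        kind = message.get("type")
        if kind == "STRUCTURED_OUTPUT":
            # Emitted as a single message, before COMPLETE.
            structured = message.get("content")
        elif kind == "COMPLETE":
            completed = True
            break
    return structured, completed


# Hypothetical stream: answer text, then the extracted object, then COMPLETE.
example_stream = [
    {"type": "ANSWER", "content": "Revenue grew year over year."},
    {"type": "STRUCTURED_OUTPUT", "content": {"revenue_musd": 1200}},
    {"type": "COMPLETE"},
]
structured, completed = split_stream(example_stream)
```

If the stream reaches COMPLETE without a STRUCTURED_OUTPUT message, structured stays None, which is the case described under Schema conformance.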
Schema constraints
A few rules to keep in mind when authoring your schema:
- Top-level must be a JSON object. If you need a list of items, wrap them in an object with a single array property (see List outputs below).
- Nested objects, arrays, and primitives are all supported. Use type, properties, required, items, and enum as you would in any JSON Schema.
- Field descriptions matter. The agent reads each property’s description to know what to extract. Specific, instruction-shaped descriptions ("Net income in millions of USD for the most recent fiscal year") produce dramatically better extractions than generic ones ("net income").
- Required fields should reflect what must always be present. Marking every field as required can cause extraction failures when one piece of data is genuinely absent; mark only fields you can guarantee.
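As an illustration of these rules, here is a small schema written as a Python dict, plus a helper that flags properties lacking a description. The property names and the missing_descriptions helper are hypothetical; only the JSON Schema keywords come from the rules above.

```python
schema = {
    "type": "object",  # top level is an object
    "properties": {
        "net_income_musd": {
            "type": "number",
            "description": "Net income in millions of USD for the most recent fiscal year",
        },
        "outlook": {
            "type": "string",
            "enum": ["positive", "neutral", "negative"],
            "description": "Overall tone of management's forward guidance",
        },
    },
    # Only the field we expect to always be present is required.
    "required": ["net_income_musd"],
}


def missing_descriptions(schema: dict) -> list[str]:
    """List top-level property names that have no description."""
    properties = schema.get("properties", {})
    return sorted(name for name, spec in properties.items() if not spec.get("description"))


undocumented = missing_descriptions(schema)
```

Running a check like this before sending a request is one way to catch generic or empty descriptions early.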
Worked example: extract financial metrics
The request in this example asks for a one-paragraph credit summary and a parallel JSON object with quantitative metrics. The Markdown answer is for human readers; the JSON is for downstream code.
Once the request completes, the structured variable receives a dict matching the schema, ready to be
persisted or passed downstream.
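A sketch of what such a request and its result might look like in Python. Only structured_output_schema is named on this page; the query field, the property names, the company name, and the sample values are placeholders chosen for illustration, and no network call is made.

```python
import json

# Schema for the quantitative metrics; property names are illustrative.
metrics_schema = {
    "type": "object",
    "properties": {
        "revenue_musd": {
            "type": "number",
            "description": "Total revenue in millions of USD for the most recent fiscal year",
        },
        "net_income_musd": {
            "type": "number",
            "description": "Net income in millions of USD for the most recent fiscal year",
        },
        "credit_outlook": {
            "type": "string",
            "enum": ["positive", "neutral", "negative"],
            "description": "Overall credit outlook implied by the research",
        },
    },
    "required": ["revenue_musd", "net_income_musd"],
}

# Hypothetical request body: "query" is an assumed field name.
request_body = {
    "query": "Write a one-paragraph credit summary for Example Corp.",
    "structured_output_schema": metrics_schema,
}
encoded = json.dumps(request_body)

# Placeholder for the content of the STRUCTURED_OUTPUT message (made-up values).
raw_structured = '{"revenue_musd": 100.0, "net_income_musd": 12.5}'
structured = json.loads(raw_structured)
```

Here credit_outlook is optional, so the placeholder result omits it; downstream code should read it with structured.get("credit_outlook").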
List outputs
The schema’s top level must be an object, so to return a list of items, wrap the array in a single property: {"competitors": [...]}. Downstream
code accesses the array via structured["competitors"].
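A minimal sketch of the wrapping pattern. The item shape (an object with a name property) and the sample values are assumptions; only the competitors wrapper comes from the text above.

```python
competitors_schema = {
    "type": "object",
    "properties": {
        "competitors": {
            "type": "array",
            "description": "Up to five direct competitors, most significant first",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string", "description": "Company's common name"},
                },
                "required": ["name"],
            },
        }
    },
    "required": ["competitors"],
}

# Placeholder structured output, shaped like the schema above.
structured = {"competitors": [{"name": "Example A"}, {"name": "Example B"}]}

# Unwrap the array for downstream code.
names = [item["name"] for item in structured["competitors"]]
```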
Schema-design tips
- Lead with the description. A property’s description is the single most important field for extraction quality. Write it like an instruction to the agent: what to find, where to find it, and the expected unit or format.
- Use enum to constrain categorical fields. For sentiment, ratings, or classification fields, list the allowed values explicitly: {"type": "string", "enum": ["positive", "neutral", "negative"]}. The agent will pick one rather than inventing an out-of-band value.
- Prefer flat schemas. Deep nesting (3+ levels) tends to reduce extraction reliability. If you need rich structure, split it across multiple requests rather than packing one massive schema.
- Keep arrays short and bounded. Specify expected length in the description (“up to five”, “exactly three”) so the agent does not over- or under-fill.
- Mark required minimally. Only fields you are certain the research will surface should be required. Optional fields gracefully degrade to missing keys when the data is absent.
- Test on representative prompts. Different prompts can elicit different shapes of evidence; an extraction schema that works for “summarize NVIDIA’s earnings” may behave differently for “compare NVIDIA and AMD.”
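The "prefer flat schemas" tip can be checked mechanically before a schema is sent. The helper below is a hypothetical client-side lint, not part of the API: it counts levels of object nesting, looking through arrays to their item schemas.

```python
def nesting_depth(schema: dict) -> int:
    """Count levels of object nesting; a flat object schema has depth 1."""
    if schema.get("type") == "array":
        # Arrays do not add a level themselves; inspect their items.
        return nesting_depth(schema.get("items", {}))
    if schema.get("type") != "object":
        return 0
    children = schema.get("properties", {}).values()
    return 1 + max((nesting_depth(child) for child in children), default=0)


flat = {"type": "object", "properties": {"ticker": {"type": "string"}}}
deep = {
    "type": "object",
    "properties": {
        "company": {
            "type": "object",
            "properties": {
                "segments": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {"name": {"type": "string"}},
                    },
                }
            },
        }
    },
}

flat_depth = nesting_depth(flat)
deep_depth = nesting_depth(deep)
```

A schema whose depth reaches 3 or more is a candidate for splitting across multiple requests, as the tip suggests.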
Schema conformance
Structured output is produced with the model’s native structured-output mode, which enforces your schema. When a STRUCTURED_OUTPUT message is emitted, its
content already conforms to the schema you supplied, so you do not need to
re-validate it client-side, and you will not receive a partially-filled object.
Optional fields you did not mark required may simply be absent when the
research did not surface them; that is normal and expected.
Extraction is all-or-nothing. In the rare case the model cannot map the
research onto your schema, the request still completes but no
STRUCTURED_OUTPUT message is emitted. This is not a routine outcome to build
fallbacks around. If you hit it repeatedly for a given schema, simplify the
schema (see the design tips above) and report it, rather than absorbing it
silently.
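One way to reflect these two cases in client code, sketched under assumptions: the handle_result helper, the logger name, and the field names are hypothetical. Required keys are read directly, optional keys with .get(), and a missing object is logged so it can be noticed and reported rather than silently absorbed.

```python
import logging
from typing import Optional

logger = logging.getLogger("research_client")  # hypothetical logger name


def handle_result(structured: Optional[dict], schema_name: str) -> Optional[dict]:
    """Normalize a structured output, surfacing the no-output case loudly."""
    if structured is None:
        # No STRUCTURED_OUTPUT message arrived: surface it for follow-up.
        logger.warning("No structured output for schema %r", schema_name)
        return None
    return {
        # Required in the schema, so the key is always present.
        "net_income_musd": structured["net_income_musd"],
        # Optional in the schema, so the key may be absent.
        "outlook": structured.get("outlook"),
    }


record = handle_result({"net_income_musd": 12.5}, "financial_metrics")
```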
Combining with other features
With tools_configs
Use a tightly scoped search config alongside a strict schema to build
deterministic, narrow pipelines. For example, a “company snapshot” pipeline
might scope to one entity ID and extract a fixed financial-metrics schema:
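A sketch of such a pipeline's request body. The structure of tools_configs is not shown on this page, so the nested search and entity_ids keys, the query field, and the build_snapshot_request helper are all assumptions made for illustration; consult the search tool configuration reference for the real shape.

```python
snapshot_schema = {
    "type": "object",
    "properties": {
        "revenue_musd": {
            "type": "number",
            "description": "Total revenue in millions of USD for the most recent fiscal year",
        },
        "net_income_musd": {
            "type": "number",
            "description": "Net income in millions of USD for the most recent fiscal year",
        },
    },
    "required": ["revenue_musd"],
}


def build_snapshot_request(entity_id: str) -> dict:
    """Build a request body scoped to one entity with a fixed schema."""
    return {
        "query": "Produce a company snapshot.",  # assumed field name
        # Hypothetical shape: the real tools_configs structure may differ.
        "tools_configs": {"search": {"entity_ids": [entity_id]}},
        "structured_output_schema": snapshot_schema,
    }


request_body = build_snapshot_request("ENTITY_ID_PLACEHOLDER")
```

Keeping the schema constant and varying only the entity ID is what makes the pipeline narrow and repeatable.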
With conversation continuity
Structured output is computed once per request. A follow-up turn with the same chat_id runs its own extraction against its own answer; previous
turns’ structured outputs are not re-emitted. If you need a running structured
state across turns, accumulate the values on the client side.
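Client-side accumulation can be as simple as merging each turn's object into a running dict. The accumulate helper and the sample values are hypothetical; the merge policy (later turns overwrite earlier keys, turns without structured output leave the state unchanged) is one reasonable choice among several.

```python
from typing import Optional


def accumulate(state: dict, turn_output: Optional[dict]) -> dict:
    """Return a new state with this turn's structured output merged in."""
    merged = dict(state)
    if turn_output:
        # Later turns overwrite earlier values for the same key.
        merged.update(turn_output)
    return merged


state: dict = {}
state = accumulate(state, {"revenue_musd": 100.0})  # turn 1
state = accumulate(state, None)                     # turn 2: no structured output
state = accumulate(state, {"revenue_musd": 110.0, "net_income_musd": 12.5})  # turn 3
```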
Next steps
Streaming responses
See where STRUCTURED_OUTPUT falls in the message lifecycle.
Search tool configuration
Pair a strict schema with a tightly scoped search for deterministic pipelines.
Multi-turn conversations
Run structured extraction across multiple turns of a continuing conversation.
Conversation continuity
chat_id, checkpoint_id, and the persistence model for follow-up turns.