Search Tool Configuration

The Research Agent has a built-in search tool that retrieves source documents from the Bigdata.com content platform. By default the tool searches across the full corpus, with the agent choosing which queries to run. The tools_configs field on a request lets you constrain that search at the protocol level — restricting which entities, time windows, sources, or tags the tool may consider, and tuning how results are ranked. This page is the reference for tools_configs.search: the shape of the configuration, the semantics of each filter, and worked examples for the most common scoping patterns.

tools_configs only affects the Research Agent (/v1/research-agent). Workflows scope their search at the template level via content_filter on the template definition; see Creating templates.

The shape of a config

The tools_configs field on the Research Agent request accepts an object with exactly one key, "search", whose value is a SearchToolConfig:

{
  "message": "Analyze NVIDIA's data center segment.",
  "research_effort": "standard",
  "tools_configs": {
    "search": {
      "query_filters": { },
      "ranking_parameters": { }
    }
  }
}

SearchToolConfig has two optional members:

query_filters — narrows which content the search tool may return.
ranking_parameters — influences how matching content is ordered.

Either or both may be omitted. An empty config ({"search": {}}) is equivalent to no config at all.
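
As a minimal sketch of the shape described above, the helper below assembles a tools_configs object and leaves out empty members, so an unconstrained call sends no config at all. The function name build_tools_configs is hypothetical and not part of any Bigdata.com SDK.

```python
from typing import Optional


def build_tools_configs(
    query_filters: Optional[dict] = None,
    ranking_parameters: Optional[dict] = None,
) -> Optional[dict]:
    """Return a tools_configs object, or None when nothing is constrained.

    Hypothetical helper for illustration only.
    """
    search = {}
    if query_filters:
        search["query_filters"] = query_filters
    if ranking_parameters:
        search["ranking_parameters"] = ranking_parameters
    # An empty search config is equivalent to no config, so skip it entirely.
    return {"search": search} if search else None


payload = {
    "message": "Analyze NVIDIA's data center segment.",
    "research_effort": "standard",
}
configs = build_tools_configs(ranking_parameters={"freshness_boost": 8})
if configs is not None:
    payload["tools_configs"] = configs
```

Returning None rather than an empty object keeps the request body identical to an unconfigured request, which makes logged payloads easier to compare.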

`query_filters`

query_filters has four optional components. They are independent and may be combined freely; the search tool returns only documents that satisfy every component you specify.

Component	Restricts results to…
`entities`	Documents mentioning a specific set of entity IDs.
`period`	A specific time window.
`content`	A specific set of document IDs or sources.
`tags`	Your privately-uploaded documents carrying any of the given file tags.

`entities`

EntityFilter selects content based on the entities mentioned in a document. It uses three set-style operators that combine with AND:

Operator	Meaning
`all_of`	Document must mention all of these entity IDs.
`any_of`	Document must mention at least one of these entity IDs.
`none_of`	Document must mention none of these entity IDs (exclusion).

Entity IDs are short identifier strings such as D8442A (NVIDIA) or 4E4980 (Apple). Resolve them with the Knowledge Graph endpoints (find_companies, find_etfs, find_organizations, etc.) or with the find_companies / find_securities MCP tools.

payload = {
    "message": "What are the key competitive dynamics between NVIDIA and AMD in AI accelerators?",
    "research_effort": "standard",
    "tools_configs": {
        "search": {
            "query_filters": {
                "entities": {
                    "any_of": ["D8442A", "D58A18"]  # NVIDIA OR AMD
                }
            }
        }
    },
}
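
To make the way the three operators combine concrete, here is a small local model of the semantics in the table above. It is only an illustration: matching runs server-side, the function matches_entity_filter is hypothetical, and treating an omitted or empty operator as "no constraint" is an assumption.

```python
def matches_entity_filter(mentioned: set, entity_filter: dict) -> bool:
    """Local model of EntityFilter: every operator that is present must hold (AND)."""
    all_of = set(entity_filter.get("all_of", []))
    any_of = set(entity_filter.get("any_of", []))
    none_of = set(entity_filter.get("none_of", []))
    if not all_of <= mentioned:
        return False  # some all_of ID is not mentioned
    if any_of and not (any_of & mentioned):
        return False  # none of the any_of IDs is mentioned
    if none_of & mentioned:
        return False  # an excluded ID is mentioned
    return True


# NVIDIA or AMD, but never Apple.
flt = {"any_of": ["D8442A", "D58A18"], "none_of": ["4E4980"]}
print(matches_entity_filter({"D8442A"}, flt))            # True
print(matches_entity_filter({"D8442A", "4E4980"}, flt))  # False
print(matches_entity_filter({"4E4980"}, flt))            # False
```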

`period`

DateRange restricts content to a specific time window. Both start and end are ISO 8601 timestamps in UTC, and both are inclusive.

payload = {
    "message": "Summarize NVIDIA's earnings call commentary from February through April 2026.",
    "research_effort": "standard",
    "tools_configs": {
        "search": {
            "query_filters": {
                "period": {
                    "start": "2026-02-01T00:00:00Z",
                    "end":   "2026-04-30T23:59:59Z"
                }
            }
        }
    },
}

Either bound can be omitted to leave that side unconstrained: {"start": "2026-01-01T00:00:00Z"} searches everything from the start date forward. For the Workflows API, time scoping has an additional rolling-window option (last_24_hours, last_7_days, etc.) at the request level. The Research Agent does not currently expose those rolling values via tools_configs; use explicit start and end instead.
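
Because rolling values are not available here, a rolling window can be computed client-side and sent as explicit bounds. The sketch below assumes the timestamp format used in the examples on this page (seconds precision with a Z suffix); rolling_period is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def rolling_period(days: int, now: Optional[datetime] = None) -> dict:
    """Build an explicit period covering the last `days` days, ending at `now` (UTC)."""
    end = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    start = end - timedelta(days=days)
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # ISO 8601 in UTC with a literal Z suffix
    return {"start": start.strftime(fmt), "end": end.strftime(fmt)}


# A fixed `now` keeps the example reproducible; omit it to use the current time.
period = rolling_period(7, now=datetime(2026, 4, 30, 12, 0, 0, tzinfo=timezone.utc))
print(period)
# {'start': '2026-04-23T12:00:00Z', 'end': '2026-04-30T12:00:00Z'}
```

The resulting dictionary can be placed directly under query_filters["period"].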

`content`

ContentFilter selects content by document or source identifier. Like EntityFilter, it composes three operators:

Operator	Meaning
`all_of`	Match documents satisfying all of these item filters.
`any_of`	Match documents satisfying at least one of these item filters.
`none_of`	Exclude documents satisfying any of these item filters.

Each item inside all_of / any_of / none_of is one of:

{"type": "DOCUMENT_ID", "document_ids": [...]} — up to 1000 unique Bigdata.com document IDs.
{"type": "SOURCE_TYPE", "sources_ids": [...]} — up to 1000 unique source IDs. Resolve source IDs with find_sources (Knowledge Graph) or the find_securities MCP tool.

Restrict research to a curated set of sources:

payload = {
    "message": "Summarize policy signals from major central banks this month.",
    "research_effort": "standard",
    "tools_configs": {
        "search": {
            "query_filters": {
                "content": {
                    "any_of": [
                        {
                            "type": "SOURCE_TYPE",
                            "sources_ids": ["FED-PRESS", "ECB-PRESS", "BOJ-PRESS"]
                        }
                    ]
                }
            }
        }
    },
}

Exclude a noisy source while keeping everything else:

"content": {
    "none_of": [
        {"type": "SOURCE_TYPE", "sources_ids": ["LOW_QUALITY_SRC"]}
    ]
}
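
Since each item accepts at most 1000 unique IDs, it can help to de-duplicate and check the count before sending a request. The following is a sketch under that stated limit; source_item is a hypothetical helper, and the source IDs are the placeholder values used in the examples above.

```python
MAX_IDS_PER_ITEM = 1000  # limit stated above for both item types


def source_item(source_ids: list) -> dict:
    """Build a SOURCE_TYPE item, de-duplicating IDs while keeping their order."""
    unique = list(dict.fromkeys(source_ids))
    if len(unique) > MAX_IDS_PER_ITEM:
        raise ValueError(
            f"{len(unique)} source IDs exceeds the limit of {MAX_IDS_PER_ITEM}"
        )
    return {"type": "SOURCE_TYPE", "sources_ids": unique}


# An allowlist and an exclusion combined in one ContentFilter.
content = {
    "any_of": [source_item(["FED-PRESS", "ECB-PRESS", "FED-PRESS"])],
    "none_of": [source_item(["LOW_QUALITY_SRC"])],
}
```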

`tags`

tags filters on the user-defined labels you attach to your own privately-uploaded documents — the same file tags you set when uploading content (see Upload and search your content). It does not classify Bigdata.com’s public content by topic or document type. Provide up to 1000 tag names; a document is eligible if it carries any of them (OR within the set).

"query_filters": {
    # Tag names you assigned to your own uploaded documents.
    "tags": ["broker-research", "internal-memos"]
}

To scope by document type or category (earnings calls, filings, news), filter on content sources or entities instead — tags only applies to your uploaded content.

Combining filters

Filters compose with AND across components. The example below scopes research to a specific company within a date window:

payload = {
    "message": "Identify the primary risk factors NVIDIA disclosed in 2025.",
    "research_effort": "standard",
    "tools_configs": {
        "search": {
            "query_filters": {
                "entities": {"all_of": ["D8442A"]},
                "period": {
                    "start": "2025-01-01T00:00:00Z",
                    "end":   "2025-12-31T23:59:59Z"
                }
            }
        }
    },
}

The search tool will only return documents that mention NVIDIA and fall within 2025. Tightening the filter set generally improves answer relevance for narrow analytical questions and reduces token consumption, because the agent retrieves fewer irrelevant documents.

`ranking_parameters`

RankingParameters influences how matching documents are ordered (it does not filter content). Two knobs are available, each on a 1-10 scale:

Field	Effect
`freshness_boost`	Higher values prioritize more recently published content.
`source_boost`	Higher values prioritize results from high-quality, authoritative sources.

"tools_configs": {
    "search": {
        "ranking_parameters": {
            "freshness_boost": 8,
            "source_boost": 6
        }
    }
}

Defaults are tuned for general research; raise freshness_boost for fast-moving topics (breaking news, recent earnings) and source_boost for research that should lean on flagship publications and regulatory filings.
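
A client-side range check can catch out-of-scale values before a request is sent. This is a minimal sketch that assumes only the 1-10 scale stated above; whether non-integer values are accepted is not covered here, and the helper name ranking_parameters is hypothetical.

```python
def ranking_parameters(freshness_boost=None, source_boost=None) -> dict:
    """Build ranking_parameters, rejecting values outside the documented 1-10 scale."""
    params = {}
    knobs = (("freshness_boost", freshness_boost), ("source_boost", source_boost))
    for name, value in knobs:
        if value is None:
            continue  # omitted knobs are left to the service defaults
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
        params[name] = value
    return params


print(ranking_parameters(freshness_boost=8, source_boost=6))
# {'freshness_boost': 8, 'source_boost': 6}
```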

Common patterns

Single-company analysis

Scope the entire research session to one entity. Useful for company tearsheets and equity-research briefs.

"query_filters": {"entities": {"all_of": ["D8442A"]}}

Peer comparison

Allow any of a small set of companies; the agent decides which to retrieve per query.

"query_filters": {
    "entities": {"any_of": ["D8442A", "D58A18", "4E4980"]}  # NVIDIA, AMD, Apple
}

Recent-events focus

Restrict to the last quarter and bias ranking toward freshness.

"query_filters": {
    "period": {"start": "2026-02-01T00:00:00Z", "end": "2026-04-30T23:59:59Z"}
},
"ranking_parameters": {"freshness_boost": 9}

Authoritative-source mode

Lean on a curated allowlist of high-quality sources and boost source quality.

"query_filters": {
    "content": {
        "any_of": [
            {"type": "SOURCE_TYPE", "sources_ids": ["WSJ", "FT", "REUTERS", "BLOOMBERG"]}
        ]
    }
},
"ranking_parameters": {"source_boost": 9}

Hard exclusion

Rule out a noisy source for every search the agent runs while answering the request.

"query_filters": {
    "content": {"none_of": [{"type": "SOURCE_TYPE", "sources_ids": ["NOISY_SRC"]}]}
}
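
The patterns above can be kept as named presets and attached to a request when needed. The sketch below shows one possible arrangement; the preset names and the with_preset function are hypothetical, and only two of the patterns are included for brevity.

```python
import copy

NVIDIA = "D8442A"  # entity ID used throughout this page

# Hypothetical preset names; each value is a complete "search" config.
PRESETS = {
    "single_company": {"query_filters": {"entities": {"all_of": [NVIDIA]}}},
    "fresh_news": {"ranking_parameters": {"freshness_boost": 9}},
}


def with_preset(message: str, preset: str, research_effort: str = "standard") -> dict:
    """Return a request payload that uses the named search preset."""
    return {
        "message": message,
        "research_effort": research_effort,
        # Deep-copy so callers can tweak one payload without mutating the preset.
        "tools_configs": {"search": copy.deepcopy(PRESETS[preset])},
    }


payload = with_preset("Build a tearsheet for NVIDIA.", "single_company")
```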

Worked example: end-to-end

The script below runs a Research Agent request with a tightly scoped search config and prints the streamed answer.

import json
import os

import requests
from dotenv import load_dotenv

load_dotenv()
api_key = os.getenv("BIGDATA_API_KEY")

payload = {
    "message": "Summarize the key risk factors NVIDIA disclosed during 2025.",
    "research_effort": "standard",
    "tools_configs": {
        "search": {
            "query_filters": {
                "entities": {"all_of": ["D8442A"]},
                "period": {
                    "start": "2025-01-01T00:00:00Z",
                    "end":   "2025-12-31T23:59:59Z",
                },
            },
            "ranking_parameters": {"source_boost": 7},
        }
    },
}

headers = {"X-API-KEY": api_key, "Content-Type": "application/json"}

with requests.post(
    "https://agents.bigdata.com/v1/research-agent",
    headers=headers,
    json=payload,
    stream=True,
    timeout=300,
) as r:
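    r.raise_for_status()
    # Assumes the stream is UTF-8, as server-sent events are. Without an explicit
    # encoding, requests may guess ISO-8859-1 for text/* responses or yield bytes.
    r.encoding = "utf-8"
    for raw_line in r.iter_lines(decode_unicode=True):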
        if not raw_line or not raw_line.startswith("data: "):
            continue
        try:
            event = json.loads(raw_line[6:])
        except json.JSONDecodeError:
            continue
        msg = event.get("message", {})
        if msg.get("type") == "ANSWER":
            print(msg.get("content", ""), end="", flush=True)
        elif msg.get("type") == "ERROR":
            raise RuntimeError(msg.get("error"))

For the full streaming-handler pattern (including PLANNING, GROUNDING, errors, and forward-compatibility), see Streaming responses.
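
The parsing logic in the script above can also be factored into a generator, which makes it testable without a network call. This sketch assumes the same event shape the script relies on (a "data: " prefix and a message object with type, content, and error fields); the sample lines are invented for illustration and are not real API output.

```python
import json
from typing import Iterable, Iterator


def iter_answer_chunks(lines: Iterable[str]) -> Iterator[str]:
    """Yield ANSWER content from already-decoded 'data: ' lines."""
    prefix = "data: "
    for line in lines:
        if not line or not line.startswith(prefix):
            continue
        try:
            event = json.loads(line[len(prefix):])
        except json.JSONDecodeError:
            continue  # skip malformed lines
        msg = event.get("message", {})
        if msg.get("type") == "ANSWER":
            yield msg.get("content", "")
        elif msg.get("type") == "ERROR":
            raise RuntimeError(msg.get("error"))


# Invented sample input, shaped like the events the script above expects.
sample = [
    'data: {"message": {"type": "PLANNING"}}',
    "",
    'data: {"message": {"type": "ANSWER", "content": "Export controls "}}',
    'data: {"message": {"type": "ANSWER", "content": "remain a key risk."}}',
]
print("".join(iter_answer_chunks(sample)))
# Export controls remain a key risk.
```

With a live response, the same generator can be fed from r.iter_lines(decode_unicode=True).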

Tips and pitfalls

Filters are AND across components. Adding more filter components narrows the result set; if the agent reports “no relevant sources found,” the scope is likely too tight. Relax period or entities first.
all_of vs any_of for entities matters. all_of: ["D8442A", "D58A18"] requires documents that mention both NVIDIA and AMD — often empty for broad queries. any_of is usually what you want for peer comparisons.
none_of is a hard exclusion. It applies to every search the agent performs in this request; use it sparingly and only for sources you are certain are noise.
Entity and source IDs must be valid. The agent will not surface a helpful error if you pass a typo; it will simply find no matches. Resolve IDs through the Knowledge Graph endpoints before using them.
Date ranges are inclusive. A period.end of 2026-04-30T23:59:59Z includes documents published on April 30; 2026-05-01T00:00:00Z would include the first second of May as well.
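
Some of these pitfalls can be flagged before a request is sent. The pre-flight check below is a sketch only: lint_query_filters is a hypothetical helper, it covers just the multi-ID all_of case and inverted date ranges, and it cannot detect invalid entity or source IDs.

```python
from datetime import datetime


def _parse_utc(timestamp: str) -> datetime:
    # Older Python versions of fromisoformat() reject a trailing "Z".
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def lint_query_filters(query_filters: dict) -> list:
    """Return human-readable warnings for common scoping mistakes."""
    warnings = []
    entities = query_filters.get("entities", {})
    if len(entities.get("all_of", [])) > 1:
        warnings.append(
            "entities.all_of has several IDs: documents must mention all of them"
        )
    period = query_filters.get("period", {})
    if "start" in period and "end" in period:
        if _parse_utc(period["end"]) < _parse_utc(period["start"]):
            warnings.append("period.end is earlier than period.start")
    return warnings


print(lint_query_filters({"entities": {"all_of": ["D8442A", "D58A18"]}}))
# ['entities.all_of has several IDs: documents must mention all of them']
```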

Next steps

Streaming responses

Reference for every message type, including the canonical Python handler.

Structured output extraction

Combine tools_configs with structured_output_schema to extract typed JSON from scoped research.

Multi-turn conversations

Carry a search config across follow-up turns by persisting the conversation.

Knowledge Graph: find companies

Resolve entity IDs for use in entities.all_of and entities.any_of.
