
# Query filters

## Entity

A filter to match an entity by its "EntityID". Use the methods
provided in [Knowledge Graph](../knowledge_graph/introduction)
to identify the entities, topics, or sources of interest, then use the returned IDs
to build queries.

Example:

<CodeGroup>
  ```bash API highlight={7-11} theme={null}
  curl -X POST 'https://api.bigdata.com/v1/search' \
    -H 'Content-Type: application/json' \
    -H 'X-API-KEY: <your-api-key>' \
    --data '{
      "query": {
        "filters": {
          "entity": {
            "any_of": [
              "228D42",
              "D8442A"
            ]
          }
        },
        "max_chunks": 10
      }
    }'
  ```

  ```python Python SDK highlight={10} theme={null}
  from bigdata_client import Bigdata
  from bigdata_client.query import Entity

  bigdata = Bigdata()

  # Entity IDs:
  MICROSOFT = "228D42"
  APPLE = "D8442A"

  query = Entity(MICROSOFT) | Entity(APPLE)
  search = bigdata.search.new(query)

  for document in search.limit_documents(2):
      print(document)

  # or

  documents = search.run(2)
  for document in documents:
      print(document)
  ```
</CodeGroup>
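
When the entity IDs come from a list rather than being hard-coded, the request body can be assembled programmatically. The following is a minimal sketch that mirrors the JSON shape of the API example above; the helper name `entity_filter_payload` is hypothetical and not part of the SDK.

```python
import json


def entity_filter_payload(entity_ids, max_chunks=10):
    """Build a search request body that matches any of the given entity IDs.

    Hypothetical helper: the payload shape is copied from the API example above.
    """
    return {
        "query": {
            "filters": {"entity": {"any_of": list(entity_ids)}},
            "max_chunks": max_chunks,
        }
    }


payload = entity_filter_payload(["228D42", "D8442A"])
print(json.dumps(payload, indent=2))
```

The printed JSON can be passed as the `--data` argument of the `curl` call shown above.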

## Similarity

The Similarity filter calculates the embedding of the provided sentence and searches for the closest nodes in the proprietary Bigdata Vector Database.

The following example searches for chunks closely related to the sentence `Tariffs impacting US companies`.

<CodeGroup>
  ```bash API highlight={5-6} theme={null}
  curl -X POST 'https://api.bigdata.com/v1/search' \
    -H 'Content-Type: application/json' \
    -H 'X-API-KEY: <your-api-key>' \
    --data '{
    "query": {
      "text": "Tariffs impacting US companies",
      "filters": {},
      "max_chunks": 10
    }
  }'
  ```

  ```python Python SDK highlight={7} theme={null}
  from bigdata_client import Bigdata
  from bigdata_client.query import Similarity

  bigdata = Bigdata()

  query = (
      Similarity("Tariffs impacting US companies")
  )
  search = bigdata.search.new(query)
  documents = search.run(2)
  print(documents)
  ```
</CodeGroup>

When using the Similarity filter, you can also apply a second ranking phase to improve precision. See [Rerank search](../../how-to-guides/rerank_search) for details.

<Tip>
  For Python SDK users:

  We advise using a maximum of one Similarity filter per query. If you need to search for multiple sentences, you can create various queries, each with one Similarity filter.

  The operator AND (`&`) is supported, but the returned chunks must be closely related to all specified sentences in the Similarity filters.

  The operator OR (`|`) is not supported. Please create multiple queries with one Similarity filter each, and then you can combine their results.
</Tip>
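
Because OR is not supported between Similarity filters, results from several single-sentence queries have to be combined on the client side. The sketch below shows one way to merge result lists while dropping duplicates. It uses plain dictionaries with made-up IDs as stand-ins for search results, and `merge_unique` is a hypothetical helper; with the SDK, each list would come from its own `search.run(...)` call and the `key` function would read whatever identifier your results expose.

```python
def merge_unique(result_lists, key):
    """Merge several result lists, keeping the first occurrence of each key."""
    seen = set()
    merged = []
    for results in result_lists:
        for item in results:
            item_key = key(item)
            if item_key not in seen:
                seen.add(item_key)
                merged.append(item)
    return merged


# Stand-in results (illustrative IDs only), one list per Similarity query.
first_query_results = [{"id": "A1"}, {"id": "B2"}]
second_query_results = [{"id": "B2"}, {"id": "C3"}]

combined = merge_unique(
    [first_query_results, second_query_results],
    key=lambda doc: doc["id"],
)
print([doc["id"] for doc in combined])
```

Keeping the first occurrence preserves the ranking of the first query; a different merge policy (for example, interleaving) may suit other use cases better.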

## Keyword

You can refine the query with positive or negative Keyword filters. Keywords are matched against the document title or the chunk text.

For instance, the following query will retrieve chunks that mention "Announcement" and "2024" but not "2023" in either the chunk or the document's title.

Example:

<CodeGroup>
  ```bash API highlight={7-15} theme={null}
  curl -X POST 'https://api.bigdata.com/v1/search' \
    -H 'Content-Type: application/json' \
    -H 'X-API-KEY: <your-api-key>' \
    --data '{
    "query": {
      "filters": {
        "keyword": {
          "all_of": [
            "Announcement",
            "2024"
          ],
          "none_of": [
            "2023"
          ]
        }
      },
      "max_chunks": 10
    }
  }'
  ```

  ```python Python SDK theme={null}
  from bigdata_client import Bigdata
  from bigdata_client.query import Keyword

  bigdata = Bigdata()

  # Search for matches of Announcements that mention 2024 but not 2023
  query = (
      Keyword("Announcement")
      & Keyword("2024")
      & (~Keyword("2023"))
  )
  search = bigdata.search.new(query)
  documents = search.run(2)
  print(documents)
  ```
</CodeGroup>

<Note>
  The Keyword matching uses *stemming*, which means that the search will also
  match similar words. For example, searching for "resignation" will
  also match results containing the word "resignations".
</Note>

## Topic

Bigdata identifies topics in the unstructured data so you can filter by them and find the text where those events have been identified.

The Knowledge Graph defines 2.4k topics. The best way to get a list of relevant topics for your search is with the [Co-mentions > Connected Topics](../search/search_co_mentions#connected-topics) method. You can also explore them in the [Knowledge Graph > Find Topics](../knowledge_graph/find_topics) page or send us an email at [support@bigdata.com](mailto:support@bigdata.com) to request the whole taxonomy.

Once you have the list of topic IDs you want to monitor, you can add them to the Search as a filter.

Example:

<CodeGroup>
  ```bash API highlight={8-13} theme={null}

  curl -X POST 'https://api.bigdata.com/v1/search' \
    -H 'Content-Type: application/json' \
    -H 'X-API-KEY: <your-api-key>' \
    --data '{
    "query": {
      "filters": {
        "topic": {
          "any_of": [ 
            "business,labor-issues,executive-appointment,",
            "business,labor-issues,executive-resignation,",
            "business,labor-issues,executive-retirement,"
          ]
        }
      },
      "max_chunks": 10
    }
  }'

  ```

  ```python Python SDK theme={null}
  from bigdata_client import Bigdata
  from bigdata_client.query import Topic

  bigdata = Bigdata()

  query = (
      Topic("business,labor-issues,executive-appointment,,")
      | Topic("business,labor-issues,executive-resignation,,")
      | Topic("business,labor-issues,executive-retirement,,")
  )
  search = bigdata.search.new(query)

  documents = search.run(2)
  for document in documents:
      print(document)
  ```
</CodeGroup>

## Source

Bigdata's ecosystem comprises key high-quality content sources, including web content, premium news, press wires, call transcripts, and regulatory filings.

You can focus your search on a list of trusted sources to minimize the noise and ensure novel information in your results.

Example:

<CodeGroup>
  ```bash API highlight={7-12} theme={null}
  curl -X POST 'https://api.bigdata.com/v1/search' \
    -H 'Content-Type: application/json' \
    -H 'X-API-KEY: <your-api-key>' \
    --data '{
    "query": {
      "filters": {
        "source": {
          "mode": "INCLUDE",
          "values": [
            "E54C73"
          ]
        },
        "entity": {
          "any_of": [
            "228D42"
          ]
        }
      },
      "max_chunks": 10
    }
  }'
  ```

  ```python Python SDK theme={null}
  from bigdata_client import Bigdata
  from bigdata_client.query import Source, Entity

  bigdata = Bigdata()

  MICROSOFT = "228D42"
  ABC_NEWS = "E54C73"
  query = (
      Entity(MICROSOFT)
      & Source(ABC_NEWS)
  )
  search = bigdata.search.new(query)
  documents = search.run(2)
  print(documents)
  ```
</CodeGroup>

## SentimentRange

With Sentiment Ranges you can filter document chunks by specifying a
sentiment score range between -1.00 and +1.00. This score reflects the
sentiment of each chunk based on the language used in every sentence. A
score closer to -1.00 indicates negative sentiment, while a score closer
to +1.00 indicates positive sentiment.

<Info>
  The API supports three values: Positive, Neutral, and Negative.

  The Python SDK directly supports numerical values for sentiment.
</Info>

<CodeGroup>
  ```bash API highlight={13-17} theme={null}
  curl -X POST 'https://api.bigdata.com/v1/search' \
    -H 'Content-Type: application/json' \
    -H 'X-API-KEY: <your-api-key>' \
    --data '{
    "query": {
      "filters": {
        "entity": {
          "any_of": [
            "228D42",
            "D8442A"
          ]
        },
        "sentiment": {
          "values": [
            "negative"
          ]
        }
      },
      "max_chunks": 10
    }
  }'
  ```

  ```python Python SDK highlight={9-10} theme={null}
  from bigdata_client import Bigdata
  from bigdata_client.query import Entity, SentimentRange

  bigdata = Bigdata()

  MICROSOFT = "228D42"
  APPLE = "D8442A"

  positive_peak_microsoft = Entity(MICROSOFT) & SentimentRange([0.8,1])
  negative_peak_apple = Entity(APPLE) & SentimentRange([-1,-0.8])
  query = positive_peak_microsoft | negative_peak_apple

  search = bigdata.search.new(query)
  documents = search.run(2)
  print(documents)
  ```
</CodeGroup>
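
When range bounds come from user input or configuration, it can help to check them before building the query. The sketch below is a small guard based only on the documented score interval of -1.00 to +1.00. Whether the SDK performs its own validation is not covered here, and `sentiment_bounds` is a hypothetical helper.

```python
def sentiment_bounds(low, high):
    """Validate a sentiment range and return it as a [low, high] list."""
    if not (-1.0 <= low <= high <= 1.0):
        raise ValueError(f"expected -1 <= low <= high <= 1, got [{low}, {high}]")
    return [low, high]


print(sentiment_bounds(0.8, 1))    # strongly positive chunks
print(sentiment_bounds(-1, -0.8))  # strongly negative chunks

try:
    sentiment_bounds(0.5, 1.5)     # upper bound outside the documented interval
except ValueError as error:
    print(f"rejected: {error}")
```

The returned list has the same `[low, high]` shape that the SDK example above passes to `SentimentRange`.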

## Document

Restrict the search to a list of specified documents. Use document IDs (e.g. from a previous search response) to search only within those documents. You can use the **document** filter in the API or the **Document** component in the Python SDK.

<Tabs>
  <Tab title="API">
    In the [Search API](/api-reference/search/search-documents#body-query-filters-document), set `query.filters.document` with `mode` (`INCLUDE` or `EXCLUDE`) and `values` (array of document IDs).

    ```bash theme={null}
    curl -X POST 'https://api.bigdata.com/v1/search' \
      -H 'Content-Type: application/json' \
      -H 'X-API-KEY: YOUR_API_KEY' \
      --data '{
        "query": {
          "filters": {
            "document": {
              "mode": "INCLUDE",
              "values": [
                "3C54E042B2B19B244F57D3C6415439D1",
                "0B4EE52A6A611A8326D7EA3E8DC075E3"
              ]
            }
          },
          "max_chunks": 10
        }
      }'
    ```

    To exclude specific documents from the search, set `"mode": "EXCLUDE"` and provide the document IDs in `values`.
  </Tab>

  <Tab title="Python SDK">
    Use the `Document` query component with one or more document IDs. You can combine it with other filters (e.g. `Entity`) using the `&` operator.

    ```python theme={null}
    from bigdata_client import Bigdata
    from bigdata_client.query import Entity, Document

    bigdata = Bigdata()

    MICROSOFT = "228D42"

    query = Entity(MICROSOFT) & Document("0B4EE52A6A611A8326D7EA3E8DC075E3", "9C67269CD8747E33DDEE94554A13E6EC")

    search = bigdata.search.new(query)
    documents = search.run(2)
    print(documents)
    ```
  </Tab>
</Tabs>
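
One practical use of the `EXCLUDE` mode is to skip documents that an earlier search has already returned. The following is a minimal sketch of building the `document` filter for either mode; the helper name `document_filter` is hypothetical, and the field names are taken from the API example above.

```python
import json


def document_filter(document_ids, exclude=False):
    """Build the `document` filter in INCLUDE (default) or EXCLUDE mode."""
    mode = "EXCLUDE" if exclude else "INCLUDE"
    return {"document": {"mode": mode, "values": list(document_ids)}}


# IDs collected from a previous search response (copied from the example above).
already_seen = [
    "3C54E042B2B19B244F57D3C6415439D1",
    "0B4EE52A6A611A8326D7EA3E8DC075E3",
]

payload = {
    "query": {
        "filters": document_filter(already_seen, exclude=True),
        "max_chunks": 10,
    }
}
print(json.dumps(payload, indent=2))
```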

## Transcript

You can filter by a transcript subtype. The possible values are:

* `ANALYST_INVESTOR_SHAREHOLDER_MEETING`: Analyst, Investor and Shareholder meeting.
* `CONFERENCE_CALL`: General Conference Call.
  `Coming Soon`
* `GENERAL_PRESENTATION`: General Presentation.
* `EARNINGS_CALL`: Earnings Call.
* `EARNINGS_RELEASE`: Earnings Release.
  `Coming Soon`
* `GUIDANCE_CALL`: Guidance Call.
* `SALES_REVENUE_CALL`: Sales and Revenue Call.
* `SALES_REVENUE_RELEASE`: Sales and Revenue Release.
  `Coming Soon`
* `SPECIAL_SITUATION_MA`: Special Situation, M\&A and Other.
* `SHAREHOLDERS_MEETING`: Shareholders Meeting.
  `Coming Soon`
* `MANAGEMENT_PLAN_ANNOUNCEMENT`: Management Plan Announcement.
  `Coming Soon`
* `INVESTOR_CONFERENCE_CALL`: Investor Conference Call.
  `Coming Soon`

Example:

<CodeGroup>
  ```bash API highlight={9-14} theme={null}
  curl -X POST 'https://api.bigdata.com/v1/search' \
    -H 'Content-Type: application/json' \
    -H 'X-API-KEY: <your-api-key>' \
    --data '{
    "query": {
      "filters": {
        "document_type": {
          "mode": "INCLUDE",
          "values": [
            {
              "type": "TRANSCRIPT",
              "subtypes": ["EARNINGS_CALL"]
            }
          ]
        }
      },
      "max_chunks": 10
    }
  }'
  ```

  ```python Python SDK highlight={6} theme={null}
  from bigdata_client import Bigdata
  from bigdata_client.query import TranscriptTypes

  bigdata = Bigdata()

  query = TranscriptTypes.EARNINGS_CALL

  search = bigdata.search.new(query)
  documents = search.run(2)
  print(documents)
  ```
</CodeGroup>

<Note>`SectionMetadata` is not yet supported in the API.</Note>

* `SectionMetadata`: This filter lets you query specific sections of
  transcript documents. A
  `DocumentChunk` belongs to one or more sections, always
  within the following hierarchical structure:
  * `QA`: question and answer section. This section can be decomposed into:
    * `QUESTION`: a question put to a speaker during the session.
    * `ANSWER`: an answer from a speaker of the event.
  * `MANAGEMENT_DISCUSSION`: Management Discussion Section.

Example:

```python theme={null}
from bigdata_client import Bigdata
from bigdata_client.query import Entity, TranscriptTypes, SectionMetadata

bigdata = Bigdata()

MICROSOFT = "228D42"

query = Entity(MICROSOFT) & TranscriptTypes.EARNINGS_CALL & SectionMetadata.MANAGEMENT_DISCUSSION

search = bigdata.search.new(query)
documents = search.run(2)
print(documents)
```

## Filing

You can also query a specific Filing subtype. The possible values are:

* `SEC_10_K`: Annual report filing regarding a company's financial performance submitted to the Securities and Exchange Commission (SEC).
* `SEC_10_Q`: Quarterly report filing regarding a company's financial performance submitted to the SEC.
* `SEC_8_K`: Report filed whenever a significant corporate event takes place that triggers a disclosure submitted to the SEC.
* `SEC_20_F`: Annual report filing for non-U.S. and non-Canadian companies that have securities trading in the U.S.
* `SEC_S_1`: Filing needed to register the securities of companies that wish to go public in the U.S.
* `SEC_S_3`: Filing utilized when a company wishes to raise capital.
* `SEC_6_K`: Report of foreign private issuer pursuant to rules 13a-16 and 15d-16.
* `SEC_DEF_14A`: Definitive proxy statement the SEC requires before an annual meeting or shareholder vote; it includes material financial information and corporate governance details (for example, committee composition).

Example:

<CodeGroup>
  ```bash API highlight={9-14} theme={null}
  curl -X POST 'https://api.bigdata.com/v1/search' \
    -H 'Content-Type: application/json' \
    -H 'X-API-KEY: <your-api-key>' \
    --data '{
    "query": {
      "filters": {
        "document_type": {
          "mode": "INCLUDE",
          "values": [
            {
              "type": "FILING",
              "subtypes": ["SEC_10_K"]
            }
          ]
        }
      },
      "max_chunks": 10
    }
  }'
  ```

  ```python Python SDK highlight={6} theme={null}
  from bigdata_client import Bigdata
  from bigdata_client.query import FilingTypes

  bigdata = Bigdata()

  query = FilingTypes.SEC_10_K

  search = bigdata.search.new(query)
  documents = search.run(2)
  print(documents)
  ```
</CodeGroup>

<Tip>
  When querying Transcripts or Filings, it is helpful to narrow your search using the reporting details, such as fiscal year and quarter, as described in the next section.
</Tip>

## Investment Research

You can filter by investment research document subtypes. The possible values are:

* `COMPANY_REPORT`: Analysis of a single company's financials, strategy, performance, including forecasts, valuation, and investment recommendations.
* `COVERAGE_ANALYSIS`: Document defining an analyst's coverage universe by listing multiple companies or assets formally tracked on an ongoing basis.
* `ECONOMIC_REPORT`: Analysis of macroeconomic data (GDP, inflation, employment, central bank decisions) and forward-looking economic trends for countries or regions.
* `FIXED_INCOME_REPORT`: Analysis of debt securities like bonds or loans, covering performance, interest rates, risks, and investment recommendations.
* `FUND_REPORT`: Analytical report evaluating a fund's performance, including attribution, transactions, risk metrics, and commentary on contributors/detractors.
* `FX_AND_DERIVATIVES_REPORT`: Research focused on derivative instruments and currency markets.
* `GENERIC_REPORT`: Miscellaneous research that cannot be assigned to other categories.
* `INDEX_REPORT`: Analysis of stock market index performance, composition, and outlook to guide investment or asset allocation decisions.
* `INDUSTRY_REPORT`: Analysis of an entire industry's structure, trends, risks, and outlook to support investment decisions across companies in that sector.
* `MARKET_UPDATE`: Time-specific report summarizing recent market activity or price movements, often with tables or charts showing prices, returns, volume, or market breadth.
* `PORTFOLIO_STRATEGY`: Document outlining investment approach, objectives, asset allocation, equity strategies, and risk management plan for a portfolio.
* `PORTFOLIO_SUMMARY`: High-level snapshot of a portfolio showing current holdings, asset allocation, and key performance metrics.
* `RATING_REPORT`: Document assigning a buy, hold, or sell recommendation on a security, signaling structural changes like rating upgrades/downgrades or price target revisions.
* `RESEARCH_NOTE`: Brief, tactical commentary updating a broker's investment view on a company, sector, or market event, typically triggered by specific news.
* `THEMATIC_ANALYSIS`: Analysis focused on a specific trend, sector, or investment theme, exploring drivers, opportunities, and potential impact on related companies or markets.

Example:

```bash API highlight={9-14} theme={null}
curl -X POST 'https://api.bigdata.com/v1/search' \
  -H 'Content-Type: application/json' \
  -H 'X-API-KEY: <your-api-key>' \
  --data '{
  "query": {
    "filters": {
      "document_type": {
        "mode": "INCLUDE",
        "values": [
          {
            "type": "INVESTMENT-RESEARCH",
            "subtypes": ["COMPANY_REPORT"]
          }
        ]
      }
    },
    "max_chunks": 10
  }
}'
```
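
The transcript, filing, and investment research examples all use the same `document_type` structure, so the filter can be built by one small function. The sketch below is an illustration only: `document_type_filter` is a hypothetical helper, and passing more than one subtype assumes the API accepts several entries in the `subtypes` array, which the single-subtype examples above do not confirm.

```python
import json


def document_type_filter(doc_type, subtypes, mode="INCLUDE"):
    """Build a `document_type` filter for one document type and its subtypes."""
    return {
        "document_type": {
            "mode": mode,
            "values": [{"type": doc_type, "subtypes": list(subtypes)}],
        }
    }


filters = document_type_filter(
    "INVESTMENT-RESEARCH",
    ["COMPANY_REPORT", "RATING_REPORT"],  # assumption: several subtypes allowed
)
payload = {"query": {"filters": filters, "max_chunks": 10}}
print(json.dumps(payload, indent=2))
```

The same call with `"TRANSCRIPT"` or `"FILING"` as the first argument produces the filters used in the earlier sections.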

## Reporting details

Reporting details let you specify the reporting period and the reporting company.

* `FiscalYear`: Integer representing the annual reporting period.
* `FiscalQuarter`: Integer representing the fiscal quarter covered.
* `ReportingEntity`: Allows searching by the reporting company.

Example:

<CodeGroup>
  ```bash API highlight={16-24} theme={null}
  curl -X POST 'https://api.bigdata.com/v1/search' \
    -H 'Content-Type: application/json' \
    -H 'X-API-KEY: <your-api-key>' \
    --data '{
    "query": {
      "filters": {
        "document_type": {
          "mode": "INCLUDE",
          "values": [
            {
              "type": "TRANSCRIPT",
              "subtypes": ["EARNINGS_CALL"]
            }
          ]
        },
        "reporting_entities": [
          "228D42"
        ],
        "reporting_periods": [
          {
            "fiscal_year": 2024,
            "fiscal_quarter": 2
          }
        ]
      },
      "max_chunks": 10
    }
  }'
  ```

  ```python Python SDK theme={null}
  from bigdata_client import Bigdata
  from bigdata_client.query import TranscriptTypes, FiscalYear, FiscalQuarter, ReportingEntity

  bigdata = Bigdata()

  MICROSOFT = "228D42"

  query = (
      TranscriptTypes.EARNINGS_CALL
      & FiscalYear(2024) & FiscalQuarter(2)  # Filter by fiscal quarter 2 of 2024
      & ReportingEntity(MICROSOFT)           # Reported by the company itself
  )

  search = bigdata.search.new(query)
  documents = search.run(2)
  print(documents)
  ```
</CodeGroup>
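
For API requests that cover several quarters, the reporting filters can be generated rather than written by hand. The sketch below assumes that listing several objects in `reporting_periods` matches any of them; the example above uses a single period, so that behavior is not confirmed here. The helper name `reporting_filters` is hypothetical.

```python
import json


def reporting_filters(entity_id, fiscal_year, quarters=(1, 2, 3, 4)):
    """Build reporting filters for one company and selected fiscal quarters."""
    return {
        "reporting_entities": [entity_id],
        "reporting_periods": [
            {"fiscal_year": fiscal_year, "fiscal_quarter": quarter}
            for quarter in quarters
        ],
    }


MICROSOFT = "228D42"

# First half of fiscal year 2024 (assumption: multiple periods are OR-ed).
filters = reporting_filters(MICROSOFT, 2024, quarters=(1, 2))
print(json.dumps(filters, indent=2))
```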

## FileTag

You can restrict the search to privately uploaded documents that carry specific tags. Use the **tag** filter in the API or **FileTag** in the Python SDK.

<Tabs>
  <Tab title="API">
    In the [Search API](/api-reference/search/search-documents#body-query-filters-tag), set `query.filters.tag` with an `any_of` array of tag names. Documents matching any of the specified tags are included.

    ```bash theme={null}
    curl -X POST 'https://api.bigdata.com/v1/search' \
      -H 'Content-Type: application/json' \
      -H 'X-API-KEY: YOUR_API_KEY' \
      --data '{
        "query": {
          "text": "recommend stock",
          "filters": {
            "category": {
              "mode": "INCLUDE",
              "values": ["my_files"]
            },
            "tag": {
              "any_of": ["Data Science Research", "Cooking recipes"]
            }
          },
          "max_chunks": 10
        }
      }'
    ```
  </Tab>

  <Tab title="Python SDK">
    Use the `FileTag` query component to filter by tag names. Combine it with other filters (e.g. `Entity`, `Similarity`) with the `&` operator.

    ```python theme={null}
    from bigdata_client import Bigdata
    from bigdata_client.query import Entity, FileTag

    bigdata = Bigdata()

    MICROSOFT = "228D42"

    query = (
        Entity(MICROSOFT)
        & FileTag("tag_1", "tag_2")
    )

    search = bigdata.search.new(query)
    documents = search.run(2)
    print(documents)
    ```
  </Tab>
</Tabs>

## Query operators (SDK related)

The API requests can contain multiple filters; the SDK uses the Query operators to combine them.

For example, you can combine different query filters with the `&` (*AND*), `|` (*OR*), and `~` (*NOT*) operators.

```python theme={null}
from bigdata_client import Bigdata
from bigdata_client.query import Entity, Keyword, Topic, Similarity

bigdata = Bigdata()

TESLA = "DD3BB1"
APPLE = "D8442A"
GOOGLE = "D8C3A1"

tech_companies = Entity(TESLA) | Entity(APPLE) | Entity(GOOGLE)
keywords = Similarity("executive appointment") | Keyword("CEO resignation")
topics = (
    Topic("business,labor-issues,executive-appointment,,")
    | Topic("business,labor-issues,executive-resignation,,")
    | Topic("business,labor-issues,executive-retirement,,")
)
query = tech_companies & (keywords | topics)

search = bigdata.search.new(query)

for result in search.limit_documents(2):
    print(result)
```

This should be sufficient for most use cases, but sometimes the query is
built from an external list of entities, keywords, topics, etc. For
example, given a list of entity IDs, you could write:

```python theme={null}
from bigdata_client import Bigdata
from bigdata_client.query import Entity

bigdata = Bigdata()

entity_ids = read_entity_ids_from_file()  # Hypothetical helper, for illustration only
entities = [Entity(eid) for eid in entity_ids]
query = None
for entity in entities:
    if query is None:
        query = entity
    else:
        query = query | entity
search = bigdata.search.new(query)

documents = search.run(2)
print(documents)
```

This is a bit cumbersome, so we provide two helper functions to make it
easier: `All` and `Any`. `All` combines a list of
entities, keywords, topics, etc. with the AND operator, and `Any`
combines them with the OR operator. With
`Any`, the previous example can be rewritten as:

```python theme={null}
from bigdata_client import Bigdata
from bigdata_client.query import Entity, Any

bigdata = Bigdata()

entity_ids = read_entity_ids_from_file()  # Hypothetical helper, for illustration only
entities = [Entity(eid) for eid in entity_ids]
query = Any(entities)
search = bigdata.search.new(query)
documents = search.run(2)
print(documents)
```
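
Conceptually, combining a list with a single operator is a left fold over that operator. The sketch below illustrates the idea with a toy `Term` class that only records the resulting expression as a string. It is a stand-in written for this example and says nothing about how the SDK implements `Any` and `All` internally.

```python
from functools import reduce
import operator


class Term:
    """Toy stand-in for a query component; NOT the SDK's implementation."""

    def __init__(self, expr):
        self.expr = expr

    def __or__(self, other):
        return Term(f"({self.expr} OR {other.expr})")

    def __and__(self, other):
        return Term(f"({self.expr} AND {other.expr})")


terms = [Term("DD3BB1"), Term("D8442A"), Term("D8C3A1")]

# Folding with `|` gives the same result as the explicit loop shown earlier.
any_query = reduce(operator.or_, terms)
# Folding with `&` is the AND counterpart.
all_query = reduce(operator.and_, terms)

print(any_query.expr)
print(all_query.expr)
```

Note that `reduce` raises a `TypeError` on an empty list when no initial value is given, so an empty input needs separate handling.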

## Document Version

<Note>Document Versions are not yet supported in the API.</Note>

Search by Document Version.

Example:

```python theme={null}
from bigdata_client import Bigdata
from bigdata_client.query import DocumentVersion

bigdata = Bigdata()

VERSION = "RAW"

query = DocumentVersion(VERSION)
# Search for DocumentVersion
search = bigdata.search.new(query)

documents = search.run(2)
print(documents)
```

See class `DocumentVersion` for further details.

## Watchlist

If you want to retrieve insights about any of the entities in a Watchlist, you can add all the entities to the query with the `Any` operator.

<Note>Watchlists are not yet supported in the API.</Note>

```python theme={null}
from bigdata_client import Bigdata
from bigdata_client.query import Any

bigdata = Bigdata()

MY_WATCHLIST_ID = "c2356958-48f6-4380-bb1f-c588656fb2c0"

watchlist = bigdata.watchlists.get(MY_WATCHLIST_ID)
companies = bigdata.knowledge_graph.get_entities(watchlist.items)

query = Any(companies)
search = bigdata.search.new(query)

documents = search.run(2)
for doc in documents:
    print(doc)
```

<Tip>
  Check out the [Watchlist management](../watchlist_management) page for more information on how to create and manage Watchlists.
</Tip>
