Query filters

Entity

A filter that matches an entity by its EntityID. Use the Knowledge Graph methods to identify the entities, topics, or sources of interest, then build queries from the IDs they return.

Example:

from bigdata_client import Bigdata
from bigdata_client.query import Entity

bigdata = Bigdata()

# Entity IDs:
MICROSOFT = "228D42"
APPLE = "D8442A"

query = Entity(MICROSOFT) | Entity(APPLE)
search = bigdata.search.new(query)

for document in search.limit_documents(10):
    print(document)

# or

documents = search.run(10)
for document in documents:
    print(document)

If you don’t know the EntityID, you can use the autosuggest feature to find it, and use the returned entity to build the query:

from bigdata_client import Bigdata

bigdata = Bigdata()

microsoft = bigdata.knowledge_graph.autosuggest("Microsoft")[0]
apple = bigdata.knowledge_graph.autosuggest("Apple")[0]

query = microsoft | apple
search = bigdata.search.new(query)

documents = search.run(10)
for document in documents:
    print(document)
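Indexing the autosuggest result with [0] assumes that at least one suggestion came back; on an empty result it would raise an IndexError. The following is a minimal defensive sketch. It assumes only that the result behaves like a Python sequence, as the [0] indexing above implies. `first_match` is a hypothetical helper and is not part of bigdata_client.

```python
def first_match(results, label):
    """Return the first item of `results`, or raise a descriptive error if it is empty."""
    if not results:
        raise ValueError(f"No suggestions found for {label!r}")
    return results[0]


# Intended use with the client (not executed here):
# microsoft = first_match(bigdata.knowledge_graph.autosuggest("Microsoft"), "Microsoft")
```

Failing with a clear message makes it easier to spot a misspelled name than a bare IndexError would.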

Check out the Find companies page for more information on how to find companies’ EntityIDs.

The search object is of type Search, and the individual items returned by the search are instances of Document. See the Document class and the Search results page for the available attributes and methods. Also, see reference_entities for further details on each specific entity type.

Watchlist

To retrieve insights about any of the entities in a Watchlist, add all of its entities to the query with the Any operator.

from bigdata_client import Bigdata
from bigdata_client.query import Any

bigdata = Bigdata()

MY_WATCHLIST_ID = "c2356958-48f6-4380-bb1f-c588656fb2c0"

watchlist = bigdata.watchlists.get(MY_WATCHLIST_ID)
companies = bigdata.knowledge_graph.get_entities(watchlist.items)

query = Any(companies)
search = bigdata.search.new(query)

documents = search.run(2)
for doc in documents:
    print(doc)

Check out the Watchlist management page for more information on how to create and manage Watchlists.

Topic

A filter that matches content about macroeconomic, geopolitical, and business events. As with entities, you can use the TopicID directly if you know it:

from bigdata_client import Bigdata
from bigdata_client.query import Topic

bigdata = Bigdata()

query = (
    Topic("business,labor-issues,executive-appointment,,")
    | Topic("business,labor-issues,executive-resignation,,")
    | Topic("business,labor-issues,executive-retirement,,")
)
search = bigdata.search.new(query)

documents = search.run(10)
for document in documents:
    print(document)

Or look up the Topic objects by name with find_topics:

from bigdata_client import Bigdata
from bigdata_client.query import Any

bigdata = Bigdata()

topics = ["executive appointment", "executive resignation", "executive retirement"]
topic_list = [bigdata.knowledge_graph.find_topics(topic)[0] for topic in topics]

query = Any(topic_list)
search = bigdata.search.new(query)

documents = search.run(10)
for document in documents:
    print(document)

Source

Bigdata’s ecosystem comprises key high-quality content sources, including web content, premium news, press wires, call transcripts, and regulatory filings. Filter your search results by adding the target source to your query.

Example:

from bigdata_client import Bigdata
from bigdata_client.query import Source, Entity

bigdata = Bigdata()

MICROSOFT = "228D42"
ABC_NEWS = "E54C73"
query = (
    Entity(MICROSOFT)
    & Source(ABC_NEWS)
)
search = bigdata.search.new(query)
documents = search.run(2)
print(documents)

Keyword

A filter that matches a specific keyword. Stemming is applied to the keyword, so the search also matches related word forms. For example, searching for “resignation” also matches results containing the word “resignations”.

Example:

from bigdata_client import Bigdata
from bigdata_client.query import Keyword

bigdata = Bigdata()

# Search for matches of Announcements that mention 2024 but not 2023
query = (
    Keyword("Announcement")
    & Keyword("2024")
    & (~Keyword("2023"))
)
search = bigdata.search.new(query)
documents = search.run(2)
print(documents)

Similarity

Search for sentences after transforming them into embeddings.

Example:

from bigdata_client import Bigdata
from bigdata_client.query import Similarity

bigdata = Bigdata()

query = (
    Similarity("South Korea elections")
)
search = bigdata.search.new(query)
documents = search.run(2)
print(documents)
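This page does not describe the embedding model or the scoring function behind Similarity. As a generic illustration of embedding-based matching only, the sketch below ranks two text snippets against a query using cosine similarity. The three-dimensional vectors are hand-made for the example and are not real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made vectors standing in for embeddings (illustrative assumption).
query_vec = [0.9, 0.1, 0.0]
chunks = {
    "election results announced": [0.8, 0.2, 0.1],
    "quarterly earnings beat": [0.0, 0.1, 0.9],
}

# Rank snippets from most to least similar to the query vector.
ranked = sorted(
    chunks,
    key=lambda text: cosine_similarity(query_vec, chunks[text]),
    reverse=True,
)
print(ranked[0])
```

The point is that matching happens on vector closeness rather than on shared words, which is why a Similarity query can surface text that never contains the exact phrase.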

Notes:
- The OR operator (|) is not supported for Similarity. To search for multiple sentences, combine them with AND (&).
- Querying by a Watchlist and Similarity is not supported. We advise creating one query per Entity ID and Similarity filter.
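The one-query-per-entity approach advised above can be sketched as a small helper. `queries_per_entity` is a hypothetical name, not part of bigdata_client. It relies only on the & operator, so it works with any objects that support it.

```python
def queries_per_entity(entities, similarity):
    """Build one `entity & similarity` query for each entity."""
    return [entity & similarity for entity in entities]


# Intended use with the client (not executed here):
# entities = [Entity(eid) for eid in entity_ids]
# for query in queries_per_entity(entities, Similarity("South Korea elections")):
#     documents = bigdata.search.new(query).run(10)
```

Each query is then run as its own search, so results arrive grouped by entity instead of merged into a single list.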

SentimentRange

With Sentiment Ranges you can filter document chunks by specifying a sentiment score range between -1.00 and +1.00. This score reflects the sentiment of each chunk based on the language used in every sentence. A score closer to -1.00 indicates negative sentiment, while a score closer to +1.00 indicates positive sentiment.

from bigdata_client import Bigdata
from bigdata_client.query import Entity, SentimentRange

bigdata = Bigdata()

MICROSOFT = "228D42"
APPLE = "D8442A"

positive_peak_microsoft = Entity(MICROSOFT) & SentimentRange([0.8,1])
negative_peak_apple = Entity(APPLE) & SentimentRange([-1,-0.8])
query = positive_peak_microsoft | negative_peak_apple

search = bigdata.search.new(query)
documents = search.run(2)
print(documents)
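Because scores are bounded by -1.00 and +1.00, it can help to check the bounds before building the filter. The sketch below uses a hypothetical helper, `sentiment_bounds`, which is not part of bigdata_client. It returns the two-element list form that SentimentRange receives in the example above.

```python
def sentiment_bounds(low, high):
    """Validate a sentiment interval and return it as [low, high]."""
    if not -1.0 <= low <= high <= 1.0:
        raise ValueError(
            f"Expected -1.0 <= low <= high <= 1.0, got low={low}, high={high}"
        )
    return [low, high]


# Intended use with the client (not executed here):
# query = Entity(MICROSOFT) & SentimentRange(sentiment_bounds(0.8, 1))
```

How the library itself reacts to out-of-range or reversed bounds is not stated on this page, so the check fails early on the caller's side.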

Document

By providing a document ID, you can retrieve all the chunks within that document, or all the chunks that meet the criteria of your query statements.

Example:

from bigdata_client import Bigdata
from bigdata_client.query import Entity, Document

bigdata = Bigdata()

MICROSOFT = "228D42"

query = Entity(MICROSOFT) & Document("0B4EE52A6A611A8326D7EA3E8DC075E3","9C67269CD8747E33DDEE94554A13E6EC")

search = bigdata.search.new(query)
documents = search.run(2)
print(documents)

TranscriptTypes

At this point, you’re already familiar with the various components of a query and how to filter by specific types of content. Now, let’s delve into how to perform queries that allow you to discover transcripts with greater precision:

TranscriptTypes: This filter enables querying by the document type of the transcript. A DocumentChunk will be defined by a single document type at a time, with the possible values being:
- ANALYST_INVESTOR_SHAREHOLDER_MEETING: Analyst, Investor and Shareholder meeting.
- CONFERENCE_CALL: General Conference Call (coming soon).
- GENERAL_PRESENTATION: General Presentation.
- EARNINGS_CALL: Earnings Call.
- EARNINGS_RELEASE: Earnings Release (coming soon).
- GUIDANCE_CALL: Guidance Call.
- SALES_REVENUE_CALL: Sales and Revenue Call.
- SALES_REVENUE_RELEASE: Sales and Revenue Release (coming soon).
- SPECIAL_SITUATION_MA: Special Situation, M&A and Other.
- SHAREHOLDERS_MEETING: Shareholders Meeting (coming soon).
- MANAGEMENT_PLAN_ANNOUNCEMENT: Management Plan Announcement (coming soon).
- INVESTOR_CONFERENCE_CALL: Investor Conference Call (coming soon).

Example:

from bigdata_client import Bigdata
from bigdata_client.query import Entity, TranscriptTypes

bigdata = Bigdata()

MICROSOFT = "228D42"

query = Entity(MICROSOFT) & TranscriptTypes.EARNINGS_CALL

search = bigdata.search.new(query)
documents = search.run(2)
print(documents)

SectionMetadata: This filter allows querying for segments inside transcript documents. A DocumentChunk will be defined by one or more sections, always within its hierarchical structure:
- QA: Question and answer section. This section can be broken down into:
  - QUESTION: a question put to a speaker during the session.
  - ANSWER: an answer from a speaker at the event.
- MANAGEMENT_DISCUSSION: Management Discussion Section.

Example:

from bigdata_client import Bigdata
from bigdata_client.query import Entity, TranscriptTypes, SectionMetadata

bigdata = Bigdata()

MICROSOFT = "228D42"

query = Entity(MICROSOFT) & TranscriptTypes.EARNINGS_CALL & SectionMetadata.MANAGEMENT_DISCUSSION

search = bigdata.search.new(query)
documents = search.run(2)
print(documents)

FilingTypes

You can also query a specific Filing type:

FilingTypes: This filter enables querying by a filing type. A DocumentChunk will be defined by a single document type at a time, with the possible values being:
- SEC_10_K: Annual report filing regarding a company’s financial performance submitted to the Securities and Exchange Commission (SEC).
- SEC_10_Q: Quarterly report filing regarding a company’s financial performance submitted to SEC.
- SEC_8_K: Report filed whenever a significant corporate event takes place that triggers a disclosure submitted to SEC.
- SEC_20_F: Annual report filing for non-U.S. and non-Canadian companies that have securities trading in the U.S.
- SEC_S_1: Filing needed to register the securities of companies that wish to go public in the U.S.
- SEC_S_3: Filing utilized when a company wishes to raise capital.
- SEC_6_K: Report of foreign private issuer pursuant to rules 13a-16 and 15d-16.

Example:

from bigdata_client import Bigdata
from bigdata_client.query import Entity, FilingTypes

bigdata = Bigdata()

MICROSOFT = "228D42"

query = Entity(MICROSOFT) & FilingTypes.SEC_10_K

search = bigdata.search.new(query)
documents = search.run(2)
print(documents)

Reporting details

When querying TranscriptTypes or FilingTypes, you can also filter by reporting details like:

- FiscalYear: Integer representing the annual reporting period.
- FiscalQuarter: Integer representing the fiscal quarter covered.
- ReportingEntity: This field allows searching by the reporting company.

Example:

from bigdata_client import Bigdata
from bigdata_client.query import Entity, TranscriptTypes, SectionMetadata, FiscalYear, FiscalQuarter, ReportingEntity

bigdata = Bigdata()

MICROSOFT = "228D42"

query = (
    Entity(MICROSOFT) 
    & TranscriptTypes.EARNINGS_CALL 
    & SectionMetadata.MANAGEMENT_DISCUSSION
    & FiscalYear(2024) & FiscalQuarter(2)                   # filter by fiscal quarter 2, 2024
    # & FiscalQuarter(2)                                    # filter by fiscal quarter, any year
    # & FiscalYear(2024)                                    # filter by fiscal year only   
    & ReportingEntity(MICROSOFT)                            # Reported by the company itself
    )

search = bigdata.search.new(query)
documents = search.run(2)
print(documents)

FileTag

You can also add a tag to your query to filter by private documents that include that tag.

Example:

from bigdata_client import Bigdata
from bigdata_client.query import Entity, FileTag

bigdata = Bigdata()

MICROSOFT = "228D42"

query = (
    Entity(MICROSOFT)
    & FileTag("tag_1", "tag_2")
    )

search = bigdata.search.new(query)
documents = search.run(2)
print(documents)

Query operators

Bigdata lets you express complex queries by combining query filters with the & (AND), | (OR), and ~ (NOT) operators. For example:

from bigdata_client import Bigdata
from bigdata_client.query import Entity, Keyword, Topic, Similarity

bigdata = Bigdata()

TESLA = "DD3BB1"
APPLE = "D8442A"
GOOGLE = "D8C3A1"

tech_companies = Entity(TESLA) | Entity(APPLE) | Entity(GOOGLE)
keywords = Similarity("executive appointment") | Keyword("CEO resignation")
topics = (
    Topic("business,labor-issues,executive-appointment,,")
    | Topic("business,labor-issues,executive-resignation,,")
    | Topic("business,labor-issues,executive-retirement,,")
)
query = tech_companies & (keywords | topics)

search = bigdata.search.new(query)

for result in search.limit_documents(10):
    print(result)
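Because the filters are combined with Python's own operators, Python's precedence rules apply: ~ binds tighter than &, which binds tighter than |. The toy model below shows how such expressions group. It is not bigdata_client's implementation; `Expr`, `Term`, `And`, `Or`, and `Not` are stand-in classes invented for this sketch.

```python
from dataclasses import dataclass


class Expr:
    """Base class: the operators build a tree instead of computing a value."""

    def __and__(self, other):
        return And(self, other)

    def __or__(self, other):
        return Or(self, other)

    def __invert__(self):
        return Not(self)


@dataclass(frozen=True)
class Term(Expr):
    name: str

    def __str__(self):
        return self.name


@dataclass(frozen=True)
class And(Expr):
    left: Expr
    right: Expr

    def __str__(self):
        return f"({self.left} AND {self.right})"


@dataclass(frozen=True)
class Or(Expr):
    left: Expr
    right: Expr

    def __str__(self):
        return f"({self.left} OR {self.right})"


@dataclass(frozen=True)
class Not(Expr):
    operand: Expr

    def __str__(self):
        return f"(NOT {self.operand})"


a, b, c = Term("a"), Term("b"), Term("c")
print(a & (b | ~c))  # (a AND (b OR (NOT c)))
print(a | b & c)     # (a OR (b AND c)): & is applied before |
```

When mixing & and | in one expression, explicit parentheses, as in `tech_companies & (keywords | topics)` above, keep the intended grouping unambiguous.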

This should be sufficient for most use cases, but sometimes the query is built from an external list of entities, keywords, topics, etc. For example, given a list of entity IDs, you could write:

from bigdata_client import Bigdata
from bigdata_client.query import Entity

bigdata = Bigdata()

entity_ids = read_entity_ids_from_file()  # Just for explanation purposes 
entities = [Entity(eid) for eid in entity_ids]
query = None
for entity in entities:
    if query is None:
        query = entity
    else:
        query = query | entity
search = bigdata.search.new(query)

documents = search.run(2)
print(documents)

This is a bit cumbersome, so we provide two helper functions to make it easier: All and Any. All combines a list of entities, keywords, topics, etc. with the AND operator, and Any combines them with the OR operator. With Any, the previous example can be rewritten as:

from bigdata_client import Bigdata
from bigdata_client.query import Entity, Any

bigdata = Bigdata()

entity_ids = read_entity_ids_from_file()  # Just for explanation purposes 
entities = [Entity(eid) for eid in entity_ids]
query = Any(entities)
search = bigdata.search.new(query)
documents = search.run(2)
print(documents)
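The manual loop shown earlier is a left fold over the | operator. As a generic sketch of that idea (how All and Any are implemented internally is not stated here), the same fold can be written with functools.reduce. `combine_any` and `combine_all` are hypothetical names, and they work with any objects that support | and &.

```python
from functools import reduce
import operator


def combine_any(items):
    """Fold items with `|`: item0 | item1 | item2 | ..."""
    return reduce(operator.or_, items)


def combine_all(items):
    """Fold items with `&`: item0 & item1 & item2 & ..."""
    return reduce(operator.and_, items)


# Intended use with the client (not executed here):
# query = combine_any([Entity(eid) for eid in entity_ids])
```

Note that reduce raises a TypeError on an empty list, whereas the manual loop leaves query as None, so an empty input needs handling in either version.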

Document Version

Search by Document Version.

Example:

from bigdata_client import Bigdata
from bigdata_client.query import DocumentVersion

bigdata = Bigdata()

VERSION = "RAW"

query = DocumentVersion(VERSION)
# Search for DocumentVersion
search = bigdata.search.new(query)

documents = search.run(2)
print(documents)

See class DocumentVersion for further details.
