The Bigdata.com API provides retrieval capabilities for searching and
analyzing news articles, transcripts, corporate filings, and other
documents. It supports both keyword-based searches and similarity
searches, along with a range of other advanced search features.
In this notebook, we'll demonstrate how to use the Bigdata.com API to
perform a similarity search effectively.
```python
# Import required modules and classes
import html
from IPython.display import display, HTML
from bigdata_client import Bigdata
from bigdata_client.daterange import RollingDateRange
from bigdata_client.models.advanced_search_query import Similarity
from bigdata_client.models.search import DocumentType, SortBy

# Initialize the Bigdata client
# Make sure BIGDATA_USERNAME and BIGDATA_PASSWORD are set in the environment
# Alternatively, you can pass your credentials directly to the Bigdata class
bigdata = Bigdata()
```
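Because the client reads its credentials from the environment, it can help to confirm they are present before constructing it. The following is a minimal sketch using only the standard library; `missing_credentials` is a hypothetical helper written for this example, not part of the Bigdata client, and the variable names are the ones mentioned in the setup comment.

```python
import os

# Variable names taken from the setup comment above
REQUIRED_VARS = ("BIGDATA_USERNAME", "BIGDATA_PASSWORD")


def missing_credentials(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; pass a dict to check something
    other than the real process environment.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Set these environment variables before calling Bigdata():", ", ".join(missing))
else:
    print("Credentials found in the environment.")
```

Running this check first turns a missing variable into a clear message rather than an authentication error later on.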
We define our search parameters, including the query, time period, and
the number of documents to retrieve. In this example, we are searching
for articles related to the Federal Reserve's actions on inflation and
concerns about tariffs.
```python
# Create a similarity search query
query = Similarity('Fed addresses inflation amid tariff concerns')

# Search within a specific time frame
DATE_RANGE = RollingDateRange.LAST_WEEK

# Set the rerank threshold to improve search relevance
RERANK_THRESHOLD = 0.85

# This will limit the search to news articles only
chunk_relevance = ...

# Set the maximum number of documents to retrieve
DOCUMENT_LIMIT = 10
```
We now run the search using the specified parameters.
One of the key features of the Bigdata.com API is the ability to rerank
search results by relevance score. This cross-encoder reranking helps
surface the most relevant documents quickly. The official documentation
describes the reranking feature in more detail.
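As a rough illustration of what a rerank threshold does, the sketch below applies a cutoff to a list of already-scored chunks. It is not the Bigdata.com implementation: the texts and scores are made up, `apply_rerank_threshold` is a hypothetical helper, and it assumes the threshold simply drops chunks scoring below it and orders the rest by score.

```python
from typing import List, Tuple


def apply_rerank_threshold(
    scored_chunks: List[Tuple[str, float]], threshold: float
) -> List[Tuple[str, float]]:
    """Keep chunks scoring at or above the threshold, highest score first."""
    kept = [(text, score) for text, score in scored_chunks if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Made-up texts and relevance scores, for illustration only
candidates = [
    ("Chunk on the Fed's rate decision and tariffs", 0.93),
    ("Chunk on quarterly earnings", 0.41),
    ("Chunk on tariffs and inflation expectations", 0.88),
]

for text, score in apply_rerank_threshold(candidates, threshold=0.85):
    print(f"{score:.2f}  {text}")
```

Under these assumptions, a higher threshold trades recall for precision: fewer chunks come back, but each one is more closely related to the query.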
We activate this feature by setting the rerank_threshold:
```python
# Execute the search
# Configure and execute the search with specified parameters
search = bigdata.search.new(
    query=query,
    date_range=DATE_RANGE,
    rerank_threshold=RERANK_THRESHOLD,
    scope=DocumentType.NEWS,  # Limit to news articles
    sortby=SortBy.RELEVANCE,  # Sort by relevance score
)

# Run the search and get results
results = search.run(DOCUMENT_LIMIT)
```
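Once the search has run, the results can be inspected in the notebook. The sketch below shows one possible way to turn them into an HTML fragment. `results_to_html` is a hypothetical helper, and it assumes each returned document exposes `headline` and `chunks`, and each chunk exposes `text` and `relevance`; check these attribute names against the objects your client version returns. Stand-in objects are used here so the example runs without credentials.

```python
import html
from types import SimpleNamespace


def results_to_html(documents):
    """Build an HTML fragment listing each document headline and its chunks.

    Assumes documents have `headline` and `chunks`, and chunks have `text`
    and `relevance`; these names are assumptions, adjust them if needed.
    """
    parts = []
    for doc in documents:
        parts.append(f"<h4>{html.escape(doc.headline)}</h4>")
        parts.append("<ul>")
        for chunk in doc.chunks:
            # Escape the text so markup inside a chunk is shown literally
            parts.append(
                f"<li>[{chunk.relevance:.2f}] {html.escape(chunk.text)}</li>"
            )
        parts.append("</ul>")
    return "\n".join(parts)


# Stand-in objects with made-up content, not real search results
sample = [
    SimpleNamespace(
        headline="Fed & tariffs",
        chunks=[SimpleNamespace(text="Inflation <risks> remain.", relevance=0.91)],
    )
]
print(results_to_html(sample))
```

In the notebook, passing the fragment to `display(HTML(...))` with the imports from the setup cell would render it inline, for example `display(HTML(results_to_html(results)))`.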
There are many more filters that you can apply to narrow down your
search results. For more details on the Bigdata.com API, refer to the
official documentation.