Learn how to run your own financial risk analyzer service with a pre-built Docker image and the Bigdata API. The service is built on top of the bigdata-research-tools package, providing a deployable API for systematic corporate exposure analysis.
The service uses the Bigdata API to analyze corporate exposure to specific risk scenarios using unstructured data from news, earnings calls, and regulatory filings. It combines hybrid semantic search, risk factor taxonomies, and structured validation techniques to deliver targeted extraction of risk signals and supporting evidence from massive unstructured datasets.
If you prefer to work with the risk analyzer as a Python package directly, you can explore the Risk Analyzer, which provides Jupyter notebook cookbooks and detailed examples.
Once running, the service is available at http://localhost:8000/ and the documentation for the API at http://localhost:8000/docs.
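As a quick sketch, a request to the running service might be assembled like the snippet below. The /analyze route and the payload field names are assumptions for illustration; consult http://localhost:8000/docs for the actual routes and schema exposed by your deployment.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"

# Example payload using parameters described later on this page.
payload = {
    "main_theme": "US Import Tariffs against China",
    "companies": ["4A6F00", "D8442A"],  # RavenPack entity IDs
    "start_date": "2024-01-01",
    "end_date": "2024-06-30",
}

# Build the request without sending it; pass `req` to
# urllib.request.urlopen() once the container is up.
req = urllib.request.Request(
    f"{BASE_URL}/analyze",  # hypothetical route -- check /docs
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(req.get_method(), req.full_url)
```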
Visit http://localhost:8000/ for a user-friendly dashboard that lets you run risk analyses through a visual interface, and http://localhost:8000/docs for complete API documentation with interactive examples.

An analysis is configured with the following parameters:

main_theme
: The overarching risk scenario you want to analyze, specified as a single word or a short sentence. The risk analyzer will generate a list of sub-themes representing individual, self-contained components of the main risk. A theme can contain multiple core concepts, but we recommend not combining too many in the same run (e.g., “US Import Tariffs against China”, “Energy Transition”, “Regulatory Changes in AI”).

focus
: Use this parameter to pass additional, custom instructions to the LLM when breaking the theme down into sub-risks. It lets you guide the mindmap creation and tailor it to your needs: you can inject your own domain knowledge and your specific point of view, ensuring the mindmap focuses on the core concepts required to cover all risk dimensions.

companies
: The portfolio of companies you want to screen for exposure, either as a list of RavenPack entity IDs representing individual companies, e.g. ["4A6F00", "D8442A"], or a watchlist ID, e.g. "44118802-9104-4265-b97a-2e6d88d74893". Watchlists can be created programmatically with the Bigdata.com SDK or through the Bigdata app.

start_date / end_date
: The start and end of the time sample during which you want to screen your portfolio for thematic exposure. The value has to be specified as a string in YYYY-MM-DD format.

document_type
: The type of documents to search over. Use this parameter to point your screener at text data extracted from news, corporate transcripts, or corporate filings. Currently, only “TRANSCRIPTS” is supported.

fiscal_year
: When screening for exposure in transcripts and filings, documents can be further filtered by their reporting details. fiscal_year represents the annual reporting period of the transcript and can be used in combination with start_date and end_date to further limit queries to those that are time sensitive from a calendar-year and reporting-period perspective. This parameter does not apply to news, as news documents are not augmented with reporting metadata.

frequency
: This parameter allows you to break your sample range down into higher-frequency intervals. It is useful when running a screener on a large sample, since the document_limit parameter limits the ability of search to retrieve a representative sample of documents across many months. Instead of increasing the document limit, breaking the creation of a large archive down into smaller intervals gives you more control over the retrieval process and a more meaningful representation of exposure over time. The value must be one of: D, M, 3M, or Y.

keywords
: A list of keywords defining the core concepts of the theme (or sub-theme) to be included in the query. The keywords in the list are combined with an OR operator and joined to the Similarity and Entity filters with an AND.

control_entities
: A dict of entities that have to be co-mentioned with the sentence, companies, and keywords. You can specify other companies, places, people, concepts, and more, as a dictionary where entity types are the keys and the values are lists of names (one list per key). The tool resolves the entity IDs of these names, combines them with an OR filter, and appends the resulting logic to the base query with an AND operator. You can read more about it here.

rerank_threshold
: An optional parameter to be used with sentence search only. It acts as a second-step query filter that enforces a close cosine similarity between the embeddings of the sentences and those of the retrieved chunks. By default, this parameter is not applied, so as not to significantly reduce the recall of similarity search; accuracy can be boosted by applying entity and keyword filters instead. For most use cases, the one-step retrieval based on cosine similarity is more than enough. Further instructions on when the reranker can be useful are available here: https://docs.bigdata.com/how-to-guides/rerank_search.
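Putting the parameters above together, a request payload can be assembled and sanity-checked before submission. The helper below is illustrative and not part of the service: field names follow the parameter list on this page, and the validation (date format, frequency codes) mirrors the constraints described above.

```python
from datetime import datetime

# Allowed frequency codes, per the parameter list above.
FREQUENCIES = {"D", "M", "3M", "Y"}

def build_payload(main_theme, companies, start_date, end_date,
                  document_type="TRANSCRIPTS", fiscal_year=None,
                  frequency="M", focus=None, keywords=None,
                  control_entities=None, rerank_threshold=None):
    """Illustrative helper: validates inputs and assembles a payload."""
    # start_date / end_date must be YYYY-MM-DD strings.
    for d in (start_date, end_date):
        datetime.strptime(d, "%Y-%m-%d")  # raises ValueError if malformed
    if frequency not in FREQUENCIES:
        raise ValueError(f"frequency must be one of {sorted(FREQUENCIES)}")
    payload = {
        "main_theme": main_theme,
        "companies": companies,  # list of entity IDs or a watchlist ID
        "start_date": start_date,
        "end_date": end_date,
        "document_type": document_type,
        "frequency": frequency,
    }
    # Optional parameters are only included when provided.
    for key, value in [("focus", focus), ("fiscal_year", fiscal_year),
                       ("keywords", keywords),
                       ("control_entities", control_entities),
                       ("rerank_threshold", rerank_threshold)]:
        if value is not None:
            payload[key] = value
    return payload

payload = build_payload(
    main_theme="Energy Transition",
    companies="44118802-9104-4265-b97a-2e6d88d74893",  # watchlist ID
    start_date="2024-01-01",
    end_date="2024-12-31",
    fiscal_year=2024,
    keywords=["renewables", "decarbonization"],
    control_entities={"place": ["European Union"]},
)
print(sorted(payload))
```

Note that optional parameters such as rerank_threshold are simply omitted when unset, matching the default behavior described above in which the reranker is not applied.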