What is it?

Risk analysis is the systematic process of identifying, assessing, and quantifying potential threats that could impact investment portfolios, corporate operations, or financial markets. In today’s interconnected global economy, understanding how companies are exposed to geopolitical events, supply chain disruptions, regulatory changes, and economic shifts is critical for informed decision-making. Traditional risk assessment often relies on backward-looking metrics and structured data. However, emerging risks and their early signals are frequently embedded in unstructured data sources like news articles, earnings call transcripts, and regulatory filings. Modern risk analysis leverages AI and natural language processing to extract these signals and convert them into actionable intelligence.

Deploy your own risk analyzer service

To support users building with Bigdata, we’ve released a pre-built Docker image that lets you run your own financial risk analysis service. This service is built on top of the same Risk Analyzer technology available in our bigdata-research-tools package, providing a deployable API for systematic corporate exposure analysis. The service uses the Bigdata API to analyze corporate exposure to specific risk scenarios using unstructured data from news, earnings calls, and regulatory filings. It combines hybrid semantic search, risk factor taxonomies, and structured validation techniques to extract targeted risk signals and supporting evidence from massive unstructured datasets. If you prefer to work with the risk analyzer directly as a Python package, you can explore the Risk Analyzer, which provides Jupyter notebook cookbooks and detailed examples.

What the Risk Analyzer Service Provides

The risk analyzer service offers several key capabilities for professional risk assessment:
  • Automated Risk Taxonomy Generation: Uses AI to create comprehensive risk hierarchies and sub-scenarios for any given theme, ensuring systematic coverage of all risk dimensions
  • Standardized Exposure Metrics: Provides quantifiable risk scores that enable comparison across firms, sectors, or entire portfolios
  • Time-based Risk Monitoring: Tracks how exposure levels shift in response to world events, policy changes, or market developments
  • Multi-source Intelligence: Analyzes unstructured data from news articles, earnings call transcripts, and regulatory filings to capture comprehensive risk signals
  • RESTful API Interface: Offers programmatic access for integration into existing risk management systems and workflows
  • Interactive Web Interface: Provides a user-friendly dashboard for ad-hoc risk analysis and exploration
Current Limitation: The demo version is currently restricted to analyzing TRANSCRIPTS only. Support for additional document sources (NEWS and FILINGS) is planned for future releases.

Setup

The service has the following prerequisites:
  • A Bigdata.com account that supports programmatic access.
  • A Bigdata.com API key, which can be obtained from your account settings.
  • An LLM and embeddings provider. Currently, the service supports OpenAI.
To build and run the Docker image, you need to have Docker installed on your machine.

Quickstart

To quickly get started, you have two options:
  1. Build and run locally: You need to build the docker image first and then run it:
# Clone the repository and navigate to the folder
git clone git@github.com:Bigdata-com/bigdata-risk-analyzer.git
cd "bigdata-risk-analyzer"

# Build the docker image
docker build -t bigdata_risk_analyzer .

# Run the docker image
docker run -d \
  --name bigdata_risk_analyzer \
  -p 8000:8000 \
  -e BIGDATA_API_KEY=<bigdata-api-key-here> \
  -e OPENAI_API_KEY=<openai-api-key-here> \
  bigdata_risk_analyzer
  2. Run directly from GitHub Container Registry:
docker run -d \
  --name bigdata_risk_analyzer \
  -p 8000:8000 \
  -e BIGDATA_API_KEY=<bigdata-api-key-here> \
  -e OPENAI_API_KEY=<openai-api-key-here> \
  ghcr.io/bigdata-com/bigdata-risk-analyzer:latest
This will start the risk analyzer service locally on port 8000. You can then access the service at http://localhost:8000/ and the documentation for the API at http://localhost:8000/docs.
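Because the container is started in detached mode, the service may take a moment before it answers requests. The following is a minimal sketch of a readiness check, assuming the documentation page at http://localhost:8000/docs returns a 2xx status once the service is up. The helper names `is_up` and `wait_until_up` are illustrative and not part of the service.

```python
import time
import urllib.request


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # Covers connection refused, timeouts, and HTTP error statuses
        # (urllib.error.URLError and HTTPError are subclasses of OSError).
        return False


def wait_until_up(url: str, attempts: int = 10, delay: float = 1.0, probe=is_up) -> bool:
    """Poll `url` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Example (hypothetical usage): wait_until_up("http://localhost:8000/docs")
```

The `probe` argument exists only so the polling loop can be exercised without a running container.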
For custom enterprise-ready solutions, please contact us at support@bigdata.com. If you are interested in using a different LLM provider, whether enterprise-grade or self-hosted, let us know by opening an issue on the Bigdata.com GitHub repository or through our support channels.

Usage

The risk analyzer service provides a comprehensive API for generating systematic risk analysis across your investment universe. Once the service is running, you can access it on port 8000 by default through both programmatic endpoints and an interactive web interface.

Access Methods

  • Interactive Web Interface: Navigate to http://localhost:8000/ for a user-friendly dashboard that lets you run risk analyses through a visual interface
  • API Documentation: Visit http://localhost:8000/docs for complete API documentation with interactive examples
  • Programmatic Access: Use the RESTful API endpoints for integration into your existing workflows and systems

Core Parameters

To run a risk analysis, you’ll need to configure several key parameters that define the scope and focus of your analysis.
Risk Definition Parameters:
  • main_theme: The overarching risk scenario you want to analyze, given as a single word or a short sentence (e.g., “US Import Tariffs against China”, “Energy Transition”, “Regulatory Changes in AI”). The risk analyzer generates a list of sub-themes, each representing an individual, self-contained component of the main risk. The theme can contain multiple core concepts, but we recommend not combining too many in the same run.
  • focus: Additional custom instructions passed to the LLM when it breaks the theme down into sub-risks. Use this parameter to guide the mindmap creation: you can inject your own domain knowledge and point of view so that the mindmap concentrates on the core concepts needed to cover all risk dimensions.
Company Universe:
  • companies: The portfolio of companies you want to screen for exposure, either as a list of RavenPack entity IDs representing individual companies ["4A6F00", "D8442A"] or a watchlist ID "44118802-9104-4265-b97a-2e6d88d74893". Watchlists can be created programmatically using the Bigdata.com SDK or through the Bigdata app
Search Configuration:
  • start_date / end_date: The start and end of the time sample during which you want to screen your portfolio for thematic exposure. The value has to be specified as a string in YYYY-MM-DD format.
  • document_type: The type of documents to search over. Use this parameter to point your screener at text data extracted from news, corporate transcripts, or corporate filings. Currently, only “TRANSCRIPTS” is supported.
  • fiscal_year: When screening for exposure in Transcripts and Filings, documents can be further filtered by their reporting details. fiscal_year represents the annual reporting period of the transcript and can be combined with start_date and end_date to restrict queries to documents that match both the calendar window and the reporting period. Do not apply this parameter to News, because news documents are not augmented with reporting metadata.
  • frequency: Breaks your sample range down into shorter intervals. This is useful when running a screener on a large sample, because the document_limit parameter limits how representative a sample the search can retrieve across many months. Instead of increasing the document limit, splitting a long period into smaller intervals gives you more control over the retrieval process and a more meaningful picture of exposure over time. The value must be one of: D, M, 3M, or Y.
  • keywords: A list of keywords defining core concepts of the theme (or sub-theme) to be included in the query. The keywords from the list are combined with an OR operator and added to the Similarity and Entity filter with an AND.
  • control_entities: A dict of entities that have to be co-mentioned with the sentence, companies, and keywords. You can specify other companies, places, people, concepts, or more, and create a dictionary where entity types are the keys, and the values are lists of names (one per key). The tool will resolve the entity ids of these names, combine them together with an OR filter, and append the resulting logic to the base query with an AND operator. You can read more about it here.
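The way keywords and control_entities combine can be pictured as a predicate over a retrieved text chunk. The sketch below is a toy model only: the actual filtering happens inside the Bigdata search on resolved entity IDs, while this version compares plain strings and assumes the similarity step has already selected the chunk. The function name `matches_filters` and its arguments are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Set


def matches_filters(
    text: str,
    mentioned: Set[str],
    company: str,
    keywords: Iterable[str] = (),
    control_entities: Optional[Dict[str, List[str]]] = None,
) -> bool:
    """Toy version of: company AND (kw1 OR kw2 ...) AND (entity1 OR entity2 ...)."""
    # The screened company must be mentioned in the chunk.
    if company not in mentioned:
        return False

    # Keywords are OR-ed together; an empty list means no keyword filter.
    keywords = list(keywords)
    lowered = text.lower()
    if keywords and not any(kw.lower() in lowered for kw in keywords):
        return False

    # All control-entity names, regardless of type, are OR-ed together.
    names = [n for group in (control_entities or {}).values() for n in group]
    if names and not any(name in mentioned for name in names):
        return False

    return True
```

With keywords=["Tariffs"] and control_entities={"place": ["China"]}, a chunk passes only if it mentions the company, contains the word "tariffs", and co-mentions China.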
Advanced Options:
  • rerank_threshold: An optional parameter to be used in conjunction with sentence search only. It acts as a second-step query filter that ensures a close cosine similarity between the embeddings of the sentences and the ones of the chunk retrieved. By default, this parameter is not applied so as not to significantly reduce the recall of similarity search, as accuracy can be boosted by applying entity and keyword filters instead. For most use cases, the one-step retrieval based on cosine similarity is more than enough. Further instructions on when the reranker can be useful are available here: https://docs.bigdata.com/how-to-guides/rerank_search
For a complete list of parameters and their descriptions, refer to the API documentation at http://localhost:8000/docs.
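To make the effect of frequency concrete, the sketch below splits a date range into calendar-month intervals, which is roughly what a value of M implies. It is an illustration under that assumption, not the service's implementation, and the exact interval boundaries the service uses may differ. The function name `monthly_intervals` is hypothetical.

```python
from datetime import date, timedelta
from typing import List, Tuple


def monthly_intervals(start: str, end: str) -> List[Tuple[str, str]]:
    """Split [start, end] (YYYY-MM-DD strings, inclusive) into calendar-month chunks."""
    cur = date.fromisoformat(start)
    last = date.fromisoformat(end)
    if cur > last:
        raise ValueError("start_date must not be after end_date")

    intervals = []
    while cur <= last:
        # First day of the following month.
        if cur.month == 12:
            next_month = date(cur.year + 1, 1, 1)
        else:
            next_month = date(cur.year, cur.month + 1, 1)
        # The chunk ends on the last day of the month or on `end`, whichever is earlier.
        chunk_end = min(next_month - timedelta(days=1), last)
        intervals.append((cur.isoformat(), chunk_end.isoformat()))
        cur = next_month
    return intervals
```

Each interval can then be searched on its own, so a busy month cannot crowd out the documents retrieved for a quieter one.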

Example: Analyzing Tariff Risk Exposure

Here’s a practical example analyzing how US import tariffs against China might impact companies in your portfolio:
curl -X 'POST' \
  'http://localhost:8000/risk-analysis' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "main_theme": "US Import Tariffs against China",
  "focus": "Provide a detailed taxonomy of risks describing how new American import tariffs against China will impact US companies, their operations and strategy. Cover trade-relations risks, foreign market access risks, supply chain risks, US market sales and revenue risks (including price impacts), and intellectual property risks, provide at least 4 sub-scenarios for each risk factor.",
  "companies": "44118802-9104-4265-b97a-2e6d88d74893",
  "control_entities": {
    "place": ["China"]
  },
  "start_date": "2024-01-01",
  "end_date": "2024-12-31",
  "keywords": ["Tariffs"],
  "document_type": "TRANSCRIPTS",
  "fiscal_year": 2024,
  "frequency": "M"
}'
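
The same request can be issued from Python. The sketch below uses only the standard library and mirrors the curl call above. The helpers `build_payload`, `build_request`, and `send` are illustrative names, the client-side checks are a convenience rather than the service's own validation, and the response is assumed to be JSON because its schema is not described here.

```python
import json
import urllib.request
from datetime import date


def _check_date(value: str) -> str:
    """Accept only strict YYYY-MM-DD strings."""
    if date.fromisoformat(value).isoformat() != value:
        raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
    return value


def build_payload(main_theme, companies, start_date, end_date,
                  document_type="TRANSCRIPTS", **optional):
    """Assemble the JSON body; `optional` carries focus, keywords, frequency, etc."""
    if _check_date(start_date) > _check_date(end_date):
        raise ValueError("start_date must not be after end_date")
    if document_type != "TRANSCRIPTS":
        raise ValueError("only TRANSCRIPTS is currently supported")
    payload = {
        "main_theme": main_theme,
        "companies": companies,
        "start_date": start_date,
        "end_date": end_date,
        "document_type": document_type,
    }
    payload.update(optional)
    return payload


def build_request(payload, base_url="http://localhost:8000"):
    """Create the POST request without sending it."""
    return urllib.request.Request(
        f"{base_url}/risk-analysis",
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and parse the body, assuming the service returns JSON."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


payload = build_payload(
    main_theme="US Import Tariffs against China",
    companies="44118802-9104-4265-b97a-2e6d88d74893",
    start_date="2024-01-01",
    end_date="2024-12-31",
    control_entities={"place": ["China"]},
    keywords=["Tariffs"],
    fiscal_year=2024,
    frequency="M",
)
request = build_request(payload)
# result = send(request)  # requires the service to be running locally
```

Separating payload construction from sending keeps the validation testable without a running container.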

Build on top of the risk analyzer service

The risk analyzer service serves as a foundational component for building sophisticated risk management and investment applications. The use cases below illustrate what you can develop:

Portfolio Management Applications

  • Risk-Aware Portfolio Construction: Integrate risk scores into your optimization models to build portfolios that account for specific geopolitical or thematic risks
  • Dynamic Hedging Strategies: Automatically identify companies with inverse risk exposure to create natural hedges for your portfolio positions
  • Sector Rotation Models: Use industry-level risk analysis to time sector allocation decisions based on emerging risk scenarios
  • ESG Risk Integration: Combine traditional financial metrics with ESG risk exposure scores for comprehensive investment decision-making

Risk Management Systems

  • Automated Risk Monitoring: Set up continuous monitoring of your portfolio’s exposure to evolving risk scenarios with automated alerts
  • Stress Testing Frameworks: Build systematic stress testing capabilities that update with real-world events and policy changes
  • Regulatory Reporting: Generate quantitative risk reports that meet regulatory requirements with standardized exposure metrics
  • Risk Committee Dashboards: Create executive-level dashboards that translate complex risk analysis into actionable insights

Client-Facing Solutions

  • Custom Research Reports: Generate automated, client-specific risk analysis reports for your investment advisory services
  • Interactive Risk Analytics: Build client portals where investors can explore how different scenarios might impact their holdings
  • Risk-Based Investment Products: Develop thematic investment strategies based on systematic risk analysis (e.g., “Tariff-Resilient US Equity Fund”)
  • Due Diligence Automation: Integrate risk analysis into your investment research workflow for faster, more comprehensive company evaluation

Institutional Applications

  • Pension Fund Risk Management: Help institutional investors understand how their long-term portfolios are exposed to secular trends and policy shifts
  • Insurance Portfolio Analysis: Assess how different risk scenarios might impact insurance company investment portfolios
  • Corporate Treasury Risk Assessment: Enable corporations to understand how external risks might affect their vendor relationships, customer base, or supply chains
  • Sovereign Wealth Fund Analytics: Provide country-level investors with tools to assess how global developments might impact their domestic and international holdings

Integration Opportunities

  • Portfolio Management System APIs: Integrate risk scores directly into your proprietary portfolio management systems
  • CRM Integration: Connect risk insights to client relationship management systems for more informed client conversations
  • Compliance Monitoring: Build automated compliance monitoring that flags when portfolio risk exposure exceeds predefined thresholds
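
As a small illustration of the compliance monitoring idea, the sketch below flags companies whose exposure exceeds a threshold. It assumes you have already reduced the service output to a mapping from company name to a numeric score. That mapping, the scores, and the function name `flag_breaches` are hypothetical, since the response schema is not described here.

```python
from typing import Dict, List


def flag_breaches(scores: Dict[str, float], threshold: float) -> List[str]:
    """Return companies whose score exceeds `threshold`, highest score first."""
    breaches = [name for name, score in scores.items() if score > threshold]
    return sorted(breaches, key=lambda name: scores[name], reverse=True)


# Hypothetical scores, for illustration only.
example_scores = {"Company A": 0.2, "Company B": 0.9, "Company C": 0.6}
flagged = flag_breaches(example_scores, threshold=0.5)
```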