Why It Matters

Central bank announcements can trigger significant movements in currency, bond, and equity markets. Timely identification of emerging narratives around monetary policy, rate decisions, and economic outlooks is critical for traders, analysts, and policymakers seeking to anticipate market reactions and adjust strategies accordingly.

What It Does

This workflow identifies, verifies, clusters, and summarizes the most relevant and impactful news related to central bank announcements. It uses the Bigdata API for content retrieval and large language models for topic analysis, producing daily market reports and structured datasets for monitoring or backtesting.

How It Works

The notebook implements a four-step agentic workflow built on Bigdata API:
  • Lexicon Generation of monetary policy and central bank-specific terminology to maximize recall in news retrieval
  • Content Retrieval via the Bigdata API, splitting searches into daily windows and parallelizing keyword lookups for speed
  • Topic Clustering & Selection to verify, group, and summarize news into ranked trending topics, scoring each for trendiness, novelty, impact, and magnitude
  • Custom Report Generation in the form of a daily digest with a configurable ranking system, supported by granular news sources for verification

A Real-World Use Case

This cookbook illustrates the full workflow through a practical example: tracking the most discussed topics in central bank communications during the week of the 2025 Jackson Hole meeting. You’ll learn how to transform unstructured policy-related news into structured, ranked insights on market-moving announcements. Ready to get started? Let’s dive in!

Prerequisites

To run the Daily Digest Central Banks workflow, you can choose between two options:
  • 💻 GitHub cookbook
    • Use this if you prefer working locally or in a custom environment.
    • Follow the setup and execution instructions in the README.md.
    • API keys are required:
      • Option 1: Follow the key setup process described in the README.md
      • Option 2: Refer to this guide: How to initialise environment variables
      • ❗ When using this method, you must manually add the OpenAI API key (a loading sketch follows this list):
          # OpenAI credentials
          OPENAI_API_KEY = "<YOUR_OPENAI_API_KEY>"
          
  • 🐳 Docker Installation
    • Provides an alternative setup with containerized deployment, simplifying environment configuration for those who prefer Docker-based workflows.
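
If you go with environment variables (Option 2 above), note that the code cells below reference an OPENAI_API_KEY variable directly. A minimal loading sketch, assuming the python-dotenv package and a .env file in the project root:
import os

from dotenv import load_dotenv

# Read the .env file into the process environment
load_dotenv()

# Downstream cells reference this variable directly
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]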

Setup and Imports

Below is the Python code required for setting up our environment and importing necessary libraries.
import os

from bigdata_client import Bigdata
from bigdata_client.models.search import DocumentType
from src.lexicon_generator import LexiconGenerator
from src.search_topics import search_by_keywords
from src.topics_extractor import (process_all_reports,
                                  run_process_all_trending_topics,
                                  run_add_advanced_novelty_scores,
                                  add_market_impact_to_df)
from src.report_generator import (prepare_data_for_report, generate_html_report,
                                  save_html_report)

# Define output file paths for our results
output_dir = "output"
os.makedirs(output_dir, exist_ok=True)

export_path = f"{output_dir}/daily_digest_central_banks.csv"

Defining Your Daily Digest Context and Parameters

To perform a trending topics analysis, we need to define several key parameters:
  • Main Theme (main_theme): The main topic, asset class, or context to analyze (e.g. Central Bank Announcements)
  • Point of View (point_of_view): The additional instructions to the LLM-based Impact and Magnitude generation. Use this parameter to add your domain expertise and contextualize your own definition of Financial Materiality
  • Time Period (start_query and end_query): The date range over which to run the search
  • Document Type (document_type): Specify which documents to search over (transcripts, filings, news)
  • Model Selection (llm_model): The AI model used for semantic analysis and topic classification
  • Frequency (frequency): The frequency of the date ranges to search over. Supported values:
    • Y: Yearly intervals
    • M: Monthly intervals
    • W: Weekly intervals
    • D: Daily intervals (default)
  • Document Limit (document_limit): The maximum number of documents to return per query to Bigdata API
# ===== Context Definition =====
main_theme = "Central Bank Announcements"

point_of_view = 'Domestic Equity market'

# ===== Specify Time Range =====
start_query = '2025-08-18'
end_query = '2025-08-26'

# ===== Query Configuration =====
document_type = DocumentType.NEWS
document_limit = 10  # Maximum number of documents to retrieve per query
frequency = 'D'  # Query frequency: daily windows

# ===== LLM Specification =====
llm_model = "gpt-4o-mini"

Instantiating the Lexicon Generator

In this step, we identify the specialized industry jargon relevant to Central Bank Announcements to ensure high recall during content retrieval.
lexicon_generator = LexiconGenerator(
    openai_key=OPENAI_API_KEY,
    model="gpt-4o",
    seeds=[123, 123456, 123456789, 456789, 789],
)

keywords = lexicon_generator.generate(theme=main_theme)
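
Before searching, it is worth eyeballing the lexicon. Assuming generate returns a list of keyword strings (check your local implementation), a quick sanity check looks like this:
# Inspect lexicon size and a sample of terms before querying
print(f"Generated {len(keywords)} keywords")
print(keywords[:10])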

Content Retrieval from Bigdata Search API

In this section, we run a keyword search over news content with the Bigdata API, splitting the search into daily windows and multi-threading the per-keyword queries for speed. With the list of market-specific keywords, you can leverage the Search functionality in bigdata-research-tools, built on the Bigdata API, to run searches at scale against news documents.
results, daily_keyword_count = search_by_keywords(
    keywords=keywords,
    start_date=start_query,
    end_date=end_query,
    scope=document_type,
    freq=frequency,
    document_limit=document_limit)
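
The helper encapsulates the windowing and threading, but the underlying pattern is easy to sketch. The snippet below is illustrative only, with a hypothetical run_query callable standing in for the actual Bigdata API call made inside src/search_topics.py:
from concurrent.futures import ThreadPoolExecutor

import pandas as pd

def windowed_keyword_search(keywords, start_date, end_date, run_query, max_workers=8):
    """Sketch: fan out one query per (day, keyword) pair and merge the hits."""
    days = pd.date_range(start=start_date, end=end_date, freq="D")
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_query, kw, day) for day in days for kw in keywords]
        results = []
        for future in futures:
            results.extend(future.result())  # Each task returns a list of documents
    return results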

Topic Clustering and Summarization

In this step, we perform topic modelling with a large language model to verify and cluster the news. A summarization pass then selects the top trending topics for Central Bank Announcements, while deriving advanced analytics that quantify each topic's trendiness, novelty, impact, and magnitude.
semaphore_size = 1000 # Maximum number of concurrent requests to OpenAI API

# Apply verification layer to remove irrelevant news
filtered_reports = process_all_reports(results, llm_model, OPENAI_API_KEY, main_theme, semaphore_size)

# Perform topic modeling and clustering
flattened_trending_topics_df = run_process_all_trending_topics(
    unique_reports=filtered_reports,
    model=llm_model,
    start_query=start_query,
    end_query=end_query,
    api_key=OPENAI_API_KEY,
    main_theme=main_theme,
    batches=20 # Number of batches to process the reports in parallel
)
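
Before scoring, it is useful to inspect the clustered output. The exact column names are defined in src/topics_extractor, so treat the snippet below as a generic inspection step rather than a documented schema:
# Quick look at the clustered topics dataframe
print(flattened_trending_topics_df.shape)
print(flattened_trending_topics_df.columns.tolist())
print(flattened_trending_topics_df.head())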

Topic Scoring

Trendiness and Novelty Scores: We derive a trendiness score from news volume and a novelty score from day-over-day changes in the topic summaries, evaluating the uniqueness and freshness of each topic.
flattened_trending_topics_df = run_add_advanced_novelty_scores(flattened_trending_topics_df, api_key=OPENAI_API_KEY, main_theme=main_theme)
Price Impact: We infer each topic's market impact on equity prices, labelling its direction (Positive, Negative) and magnitude (High, Medium, Low). The inference is based on price mechanisms and the perceived sentiment and likely market reaction to the news.
flattened_trending_topics_df = add_market_impact_to_df(flattened_trending_topics_df, api_key=OPENAI_API_KEY, main_theme=main_theme, point_of_view=point_of_view)
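
To get a feel for the scored output before building the report, you can tabulate the LLM-assigned labels. The column names below ('impact', 'magnitude') are assumptions about the dataframe schema; adjust them to match the columns you see in your output:
# Distribution of assigned labels; the column names are assumptions
for col in ["impact", "magnitude"]:
    if col in flattened_trending_topics_df.columns:
        print(flattened_trending_topics_df[col].value_counts(), "\n")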

Generate a Custom Daily Digest

In this step, we rank the topics. The ranking system is customizable, letting you reorder the news by trendiness, novelty, and impact on equity prices.
specific_date = '2025-08-25'  # Example date; modify as needed
user_selected_ranking = ['novelty', 'volume', 'magnitude']  # Modify this list to change the ranking
# impact_filter = 'positive_impact'  # Optionally filter the report topics by impact direction

prepared_reports = prepare_data_for_report(flattened_trending_topics_df, user_selected_ranking, impact_filter=None, report_date=specific_date)

# Generate and display the HTML report for each date
for report in prepared_reports:
    html_content = generate_html_report(
        report['date'],
        report['day_in_review'],
        report['topics'],
        main_theme  # Pass the main theme to dynamically generate the title
    )
    save_html_report(html_content, report['date'], main_theme)
[Example: generated daily digest HTML report]

Export the Results

Export the data as a CSV file for further analysis or to share with your team.
flattened_trending_topics_df.to_csv(export_path, index=False)

Conclusion

The Daily Digest provides a comprehensive framework for identifying and quantifying trending topics in central bank communications. By leveraging advanced information retrieval and LLM-powered analysis, this workflow transforms unstructured data into actionable market intelligence. Through the automated analysis of central bank announcement dynamics, you can:
  1. Identify trending topics - Discover the most relevant and impactful news trends in central bank communications through systematic analysis
  2. Assess market impact - Use scoring methodology to evaluate the potential impact and magnitude of policy announcements on equity markets
  3. Generate daily reports - Create professional HTML reports with ranked topics and comprehensive policy summaries
  4. Export structured data - Obtain structured datasets for backtesting and further quantitative analysis
Whether you’re conducting policy analysis, building trading strategies, or monitoring monetary policy exposure, the Daily Digest automates the research process while providing the depth required for professional market analysis.