Identify cryptocurrencies positioned to benefit from institutional adoption ahead of broader market recognition. This workflow adapts our proven equity research screening methodology to digital assets, systematically identifying cryptocurrencies genuinely aligned with key institutional adoption drivers.

Why It Matters

Crypto institutional adoption is accelerating rapidly, but identifying which digital assets stand to benefit most requires analyzing vast amounts of information from news, regulatory updates, and market data. Manual analysis is inefficient and prone to bias. As institutional capital enters the market, the window to identify well-positioned cryptocurrencies before mass recognition is narrowing. A systematic, data-driven approach is essential to uncover emerging leaders ahead of the crowd.

What It Does

The ThematicScreener class delivers institutional-grade thematic intelligence at scale. Designed for analysts, PMs, and strategists managing thematic portfolios or scouting new ideas, it systematically connects different assets to investment themes using unstructured data from news, earnings calls, and filings.

How It Works

ThematicScreener follows a systematic 4-step workflow to transform investment themes into actionable intelligence:
  1. Mindmap Theme Taxonomy - LLM-powered breakdown of investment themes into specific, measurable sub-categories using automated theme tree generation
  2. Search with Premium Sources - Semantic content retrieval leveraging our dedicated Crypto Wire source, delivering ‘gold-standard’ quality, topicality, volume, source diversity, coverage and timeliness for comprehensive digital asset intelligence
  3. Label Results - LLM-based classification to analyze text chunks and determine relevance to sub-themes, filtering out content not explicitly linked to the main theme
  4. Post-process & Score - Qualitative-to-quantitative transformation that aggregates thematic signals into structured scoring methodologies for portfolio-level assessment

A Real-World Use Case

This cookbook demonstrates a systematic approach for identifying institutional adoption trends across 15 major cryptocurrencies, leveraging our dedicated Crypto Wire source. The workflow enables early detection of digital assets positioned to benefit from increasing institutional capital flows, delivering actionable insights before these trends become widely apparent. Ready to get started? Let’s dive in!

Prerequisites

To run the Crypto Thematic Exposure workflow, you can choose between two options:
  • 💻 GitHub cookbook
    • Use this if you prefer working locally or in a custom environment.
    • Follow the setup and execution instructions in the README.md.
    • API keys are required:
      • Option 1: Follow the key setup process described in the README.md
      • Option 2: Refer to this guide: How to initialise environment variables
        • ❗ When using this method, you must manually add the OpenAI API key:
          # OpenAI credentials
          OPENAI_API_KEY = "<YOUR_OPENAI_API_KEY>"
          
  • 🐳 Docker Installation
    • Docker installation is available for containerized deployment.
    • It offers an alternative setup that simplifies environment configuration for those who prefer a Docker-based solution.

Setup and Imports

Below is the Python code required for setting up our environment and importing necessary libraries.
from bigdata_client import Bigdata
from bigdata_client.models.search import DocumentType
from bigdata_research_tools.themes import (
    generate_theme_tree,
)
from bigdata_research_tools.labeler.screener_labeler import ScreenerLabeler
from bigdata_research_tools.workflows.utils import get_scored_df
from src.search_entities import search_by_entities, post_process_dataframe
import os  # needed for os.getenv below
import pandas as pd
from dotenv import load_dotenv
from pathlib import Path

# Load credentials
script_dir = Path(__file__).parent if '__file__' in globals() else Path.cwd()
load_dotenv(script_dir / '.env')

BIGDATA_USERNAME = os.getenv('BIGDATA_USERNAME')
BIGDATA_PASSWORD = os.getenv('BIGDATA_PASSWORD')
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')

if not all([BIGDATA_USERNAME, BIGDATA_PASSWORD, OPENAI_API_KEY]):
    raise ValueError("Missing required environment variables. Check your .env file.")

# Connect to Bigdata
bigdata = Bigdata(BIGDATA_USERNAME, BIGDATA_PASSWORD)

Defining your Screening Parameters

  • Main Theme (main_theme): The central concept to explore
  • Entity Universe (entities): The set of entities to screen
  • Control Entities (control_entities): The set of entities that must always be co-mentioned with your watchlist (e.g. people, places, organizations)
  • Time Period (start_date and end_date): The date range over which to run the search
  • Document Type (document_type): Specify which documents to search over (transcripts, filings, news)
  • Sources (sources): Specify the set of sources within a document type, for example which news outlets (available via Bigdata API) you wish to search over. For this crypto analysis, we leverage our dedicated Crypto Wire [D6D057] source, which represents the ‘gold-standard’ in terms of quality, topicality, volume, source diversity, coverage and timeliness for cryptocurrency market intelligence
  • Fiscal Year (fiscal_year): If the document type is transcripts or filings, fiscal year needs to be specified
  • Model Selection (llm_model): The LLM used to mindmap the theme and label the search result chunks
  • Rerank Threshold (rerank_threshold): By setting this value, you’re enabling the cross-encoder which reranks the results and selects those whose relevance is above the percentile you specify (0.7 being the 70th percentile). More information on the re-ranker can be found here.
  • Focus (focus): Specify a focus within the main theme. It is then used when the LLM generates the theme mind map
# ===== Theme Definition =====
main_theme = "Crypto Institutional Adoption"
focus = "Include know your customer (KYC) and anti-money laundering (AML) themes"

# ===== Entity Universe (from Watchlist) =====
# Get Top 15 Cryptos watchlist from Bigdata.com
top_15_cryptos = "812cb73c-55b1-410c-8fda-3fb3fd770967"
watchlist = bigdata.watchlists.get(top_15_cryptos)
entities = bigdata.knowledge_graph.get_entities(watchlist.items)
control_entities = None

# ===== LLM Specification =====
llm_model = "openai::gpt-4o-mini"

# ===== Docs Configuration =====
document_type = DocumentType.NEWS
fiscal_year = None

# ===== Enable/Disable Reranker =====
rerank_threshold = None

# ===== Specify Time Range =====
start_date = "2025-01-01"
end_date = "2025-09-08"

# ===== Source Selection =====
# Crypto Wire [D6D057] - Gold-standard crypto intelligence
# Premium dedicated source providing superior quality, topicality, volume, 
# source diversity, coverage and timeliness for cryptocurrency market analysis
sources = ["D6D057"]
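
To make the rerank_threshold description above more concrete, here is a minimal sketch of a percentile cutoff applied to relevance scores. It is an illustration only: the chunk texts, the scores, and the helper filter_by_percentile are hypothetical, and the actual cross-encoder and its cutoff rule live in the Bigdata API and may differ.

```python
import math

# Hypothetical (chunk_text, relevance_score) pairs. In the real workflow the
# scores would come from the cross-encoder, not from hand-written numbers.
scored_chunks = [
    ("Bank launches BTC custody desk", 0.91),
    ("Token price moves sideways", 0.15),
    ("Exchange tightens KYC checks", 0.62),
    ("Meme coin trends on social media", 0.48),
    ("Asset manager files for ETH fund", 0.77),
]


def filter_by_percentile(chunks, threshold):
    """Keep chunks scoring at or above the `threshold` percentile (nearest-rank)."""
    if not chunks:
        return []
    ranked = sorted(score for _, score in chunks)
    # Nearest-rank percentile: position ceil(threshold * N) in the sorted scores.
    rank = max(1, math.ceil(threshold * len(ranked)))
    cutoff = ranked[rank - 1]
    return [(text, score) for text, score in chunks if score >= cutoff]


# With threshold 0.7, only chunks at or above the 70th percentile survive.
kept = filter_by_percentile(scored_chunks, 0.7)
print(kept)
```

Leaving rerank_threshold = None, as in the configuration above, skips this filtering step entirely.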

Mindmap a Theme Taxonomy with Bigdata Research Tools

You can leverage Bigdata Research Tools to generate a comprehensive theme taxonomy with an LLM, breaking down a megatrend into smaller, well-defined concepts for more targeted analysis.
theme_tree = generate_theme_tree(
    main_theme=main_theme,
    focus=focus,
)

theme_tree.visualize()
[Figure: Theme tree visualization showing Crypto Institutional Adoption broken down into sub-themes]
The taxonomy tree includes descriptive sentences that explicitly connect each sub-theme back to the main theme, ensuring all search results remain contextually relevant to our central trend.
node_summaries = theme_tree.get_summaries()
labels = list(theme_tree.get_terminal_label_summaries().keys())
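
The difference between the two calls above can be easier to see on a toy tree. The sketch below uses a hypothetical ToyThemeNode class as a stand-in (it is not the Bigdata Research Tools implementation, and the labels and summaries are invented): every node contributes a summary sentence for searching, while only the leaf nodes supply the labels used later for classification.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyThemeNode:
    """Hypothetical stand-in for a theme tree node (not the library class)."""
    label: str
    summary: str
    children: List["ToyThemeNode"] = field(default_factory=list)

    def get_summaries(self) -> List[str]:
        # Collect the summary of this node and of every descendant.
        summaries = [self.summary]
        for child in self.children:
            summaries.extend(child.get_summaries())
        return summaries

    def get_terminal_label_summaries(self) -> Dict[str, str]:
        # Only leaves (nodes without children) count as terminal labels.
        if not self.children:
            return {self.label: self.summary}
        leaves: Dict[str, str] = {}
        for child in self.children:
            leaves.update(child.get_terminal_label_summaries())
        return leaves


toy_tree = ToyThemeNode(
    "Crypto Institutional Adoption",
    "Institutions are increasingly adopting crypto assets.",
    [
        ToyThemeNode(
            "Regulatory Compliance",
            "Compliance frameworks support institutional adoption of crypto.",
            [
                ToyThemeNode("KYC Programs", "KYC programs enable institutional adoption."),
                ToyThemeNode("AML Monitoring", "AML monitoring enables institutional adoption."),
            ],
        ),
        ToyThemeNode("Custody Solutions", "Custody solutions enable institutional adoption."),
    ],
)

toy_summaries = toy_tree.get_summaries()
toy_labels = list(toy_tree.get_terminal_label_summaries().keys())
print(len(toy_summaries), toy_labels)
```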

Retrieve Content

With the theme taxonomy and screening parameters, you can leverage the Bigdata API to run a search on news. We need to define 3 more parameters for searching:
  • Frequency (freq): The frequency of the date ranges to search over. Defaults to 3M. Supported values:
    • Y: Yearly intervals.
    • M: Monthly intervals.
    • W: Weekly intervals.
    • D: Daily intervals.
  • Document Limit (document_limit): The maximum number of documents to return per query to Bigdata API.
  • Batch Size (batch_size): The number of entities to include in a single batched query.
freq = "3M"
document_limit = 10
batch_size = 10

df_sentences = search_by_entities(
    entities=entities,
    sentences=node_summaries,
    start_date=start_date,
    end_date=end_date,
    scope=document_type,
    sources=sources,
    fiscal_year=fiscal_year,  # None for news, as set in the screening parameters
    freq=freq,
    document_limit=document_limit,
    batch_size=batch_size,
)
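
To show what freq and batch_size control, the sketch below splits the date range into calendar-aligned three-month windows and the entity list into batches. It is an assumption about the general idea only: the helpers split_into_windows and batched are hypothetical, and search_by_entities may slice dates and entities differently.

```python
from datetime import date, timedelta


def add_months(d, months):
    """Return the first day of the month that is `months` after d's month."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


def split_into_windows(start, end, months=3):
    """Split [start, end] into consecutive windows of at most `months` months."""
    start_d, end_d = date.fromisoformat(start), date.fromisoformat(end)
    windows = []
    cursor = start_d
    while cursor <= end_d:
        next_start = add_months(cursor, months)
        # Each window ends the day before the next one starts, capped at `end`.
        window_end = min(next_start - timedelta(days=1), end_d)
        windows.append((cursor.isoformat(), window_end.isoformat()))
        cursor = next_start
    return windows


def batched(items, batch_size):
    """Group items into consecutive lists of at most `batch_size` elements."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


windows = split_into_windows("2025-01-01", "2025-09-08", months=3)
batches = batched([f"entity_{i}" for i in range(15)], 10)  # placeholder entity names
print(windows)
print([len(b) for b in batches])
```

Under these assumptions, the 15-entity watchlist with batch_size = 10 yields two batches, and the configured date range yields three windows.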

Label the Results

Use an LLM to analyze each text chunk and determine its relevance to the sub-themes. Any chunks which aren’t explicitly linked to the main theme will be filtered out.
labeler = ScreenerLabeler(llm_model=llm_model)
df_labels = labeler.get_labels(
    main_theme=main_theme,
    labels=labels,  # terminal sub-theme labels computed from the theme tree
    texts=df_sentences["masked_text"].tolist(),
)

# Merge and process results
df = pd.merge(df_sentences, df_labels, left_index=True, right_index=True)

df = post_process_dataframe(df)

Assess Thematic Exposure

We’ll look at the 10 entities most exposed to our main theme. The function get_scored_df calculates a composite thematic score for each entity by summing its scores across the sub-themes; the result is stored in df_entity.
# Pivot the labelled chunks by entity and sub-theme, then score each entity
df_entity = get_scored_df(
    df, index_columns=["Entity"], pivot_column="Theme"
)

from src.visualization_tool import display_figures

display_figures(df_entity, interactive=False, n_entities=10)
[Figures: thematic exposure heatmap, thematic exposure score, top thematics, thematics scores]
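
The scoring step described above can be sketched in plain Python so the arithmetic is visible. This is an assumption about the general approach, not the code of get_scored_df: it supposes that each relevant chunk adds one point to its (entity, sub-theme) cell and that the composite score is the row sum. The entities, sub-theme names, and the helper score_entities are illustrative.

```python
from collections import Counter, defaultdict

# Hypothetical labelled chunks: one (entity, sub-theme) pair per relevant chunk.
labelled_chunks = [
    ("Bitcoin", "Custody Solutions"),
    ("Bitcoin", "Regulatory Compliance"),
    ("Bitcoin", "Custody Solutions"),
    ("Ethereum", "Regulatory Compliance"),
    ("Solana", "Custody Solutions"),
]


def score_entities(chunks):
    """Count chunks per (entity, sub-theme) and sum them into a composite score."""
    per_entity = defaultdict(Counter)
    for entity, theme in chunks:
        per_entity[entity][theme] += 1
    rows = [
        {"Entity": entity, **counts, "Composite Score": sum(counts.values())}
        for entity, counts in per_entity.items()
    ]
    # Most exposed entities first.
    return sorted(rows, key=lambda row: row["Composite Score"], reverse=True)


scores = score_entities(labelled_chunks)
for row in scores:
    print(row)
```

Because the score is a raw count under these assumptions, entities with more news coverage will tend to score higher, which is worth keeping in mind when comparing large and small assets.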

Conclusion

The Crypto Institutional Adoption Analysis provides a powerful way to identify cryptocurrencies positioned to benefit from institutional investment flows and regulatory acceptance. By leveraging our dedicated Crypto Wire [D6D057] source - the ‘gold-standard’ for cryptocurrency intelligence - and applying LLM-based classification, you can:
  1. Identify institutional adoption leaders - Find cryptocurrencies with the strongest regulatory compliance, institutional partnerships, and enterprise-grade infrastructure
  2. Track adoption readiness - Assess which digital assets are implementing KYC/AML frameworks, custody solutions, and institutional-grade security measures
  3. Monitor regulatory positioning - Evaluate how different cryptocurrencies are adapting to evolving regulatory requirements and institutional standards
  4. Discover early adoption signals - Spot cryptocurrencies gaining institutional traction before it becomes widely recognized in the market
Whether you’re building institutional-focused crypto portfolios, conducting due diligence for institutional clients, or tracking the evolution of crypto’s mainstream adoption, this workflow transforms institutional development signals into structured, investment-ready intelligence.