Why It Matters
Crypto institutional adoption is accelerating rapidly, but identifying which digital assets stand to benefit most requires analyzing vast amounts of information from news, regulatory updates, and market data. Manual analysis is inefficient and prone to bias. As institutional capital enters the market, the window to identify well-positioned cryptocurrencies before mass recognition is narrowing. A systematic, data-driven approach is essential to uncover emerging leaders ahead of the crowd.
What It Does
The ThematicScreener class delivers institutional-grade thematic intelligence at scale. Designed for analysts, PMs, and strategists managing thematic portfolios or scouting new ideas, it systematically connects different assets to investment themes using unstructured data from news, earnings calls, and filings.
How It Works
ThematicScreener follows a systematic 4-step workflow to transform investment themes into actionable intelligence:
- Mindmap Theme Taxonomy - LLM-powered breakdown of investment themes into specific, measurable sub-categories using automated theme tree generation
- Search with Premium Sources - Semantic content retrieval leveraging our dedicated Crypto Wire source, delivering ‘gold-standard’ quality, topicality, volume, source diversity, coverage and timeliness for comprehensive digital asset intelligence
- Label Results - LLM-based classification to analyze text chunks and determine relevance to sub-themes, filtering out content not explicitly linked to the main theme
- Post-process & Score - Qualitative-to-quantitative transformation that aggregates thematic signals into structured scoring methodologies for portfolio-level assessment
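To make the first step more concrete, a theme taxonomy can be pictured as a small tree whose leaves are the sub-themes used for search and labeling. The sketch below is only an illustration: the ThemeNode class is hypothetical (it is not the library's own data structure), and the grouping of sub-themes is an assumption built from topics named elsewhere on this page.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ThemeNode:
    """Hypothetical node in a theme tree (not the library's own class)."""
    label: str
    children: List["ThemeNode"] = field(default_factory=list)

    def leaves(self) -> List[str]:
        """Return the labels of all leaf nodes, i.e. the searchable sub-themes."""
        if not self.children:
            return [self.label]
        labels: List[str] = []
        for child in self.children:
            labels.extend(child.leaves())
        return labels


# Illustrative taxonomy; a real one would be generated by the LLM.
tree = ThemeNode(
    "Crypto Institutional Adoption",
    [
        ThemeNode("Regulation", [ThemeNode("KYC/AML Frameworks")]),
        ThemeNode(
            "Infrastructure",
            [
                ThemeNode("Custody Solutions"),
                ThemeNode("Institutional-Grade Security"),
            ],
        ),
    ],
)

print(tree.leaves())
```

Only the leaf labels would feed the later search and labeling steps; intermediate nodes exist to keep the breakdown organized.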
A Real-World Use Case
This cookbook demonstrates a systematic approach for identifying institutional adoption trends across 15 major cryptocurrencies, leveraging our dedicated Crypto Wire source. The workflow enables early detection of digital assets positioned to benefit from increasing institutional capital flows, delivering actionable insights before these trends become widely apparent. Ready to get started? Let’s dive in!
Prerequisites
To run the Crypto Thematic Exposure workflow, you can choose between two options:
- 💻 GitHub cookbook
  - Use this if you prefer working locally or in a custom environment.
  - Follow the setup and execution instructions in the README.md.
  - API keys are required:
    - Option 1: Follow the key setup process described in the README.md
    - Option 2: Refer to this guide: How to initialise environment variables
      - ❗ When using this method, you must manually add the OpenAI API key.
- 🐳 Docker Installation
  - Docker installation is available for containerized deployment.
  - Provides an alternative setup method that simplifies environment configuration for those preferring Docker-based solutions.
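For the environment-variable route in the GitHub option, a quick check like the one below could confirm that the OpenAI key is visible to Python before the workflow runs. This is a minimal sketch: OPENAI_API_KEY is the variable name conventionally read by the OpenAI Python client, while the helper missing_env_vars is hypothetical and the other keys required by the README.md are deliberately not guessed here.

```python
import os


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


# Only the OpenAI key is listed; add the other keys named in the README.md.
REQUIRED_VARS = ["OPENAI_API_KEY"]

missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```

Failing early on a missing key is cheaper than discovering it midway through the labeling step.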
Setup and Imports
Below is the Python code required for setting up our environment and importing necessary libraries.
Defining your Screening Parameters
- Main Theme (main_theme): The central concept to explore
- Entity Universe (entities): The set of entities to screen
- Control Entities (control_entities): The set of entities that must always be co-mentioned with your watchlist (e.g. people, places, organizations)
- Time Period (start_date and end_date): The date range over which to run the search
- Document Type (document_type): Specify which documents to search over (transcripts, filings, news)
- Sources (sources): Specify the set of sources within a document type, for example which news outlets (available via Bigdata API) you wish to search over. For this crypto analysis, we leverage our dedicated Crypto Wire [D6D057] source, which represents the ‘gold-standard’ in terms of quality, topicality, volume, source diversity, coverage and timeliness for cryptocurrency market intelligence
- Fiscal Year (fiscal_year): If the document type is transcripts or filings, the fiscal year needs to be specified
- Model Selection (llm_model): The LLM used to mindmap the theme and label the search result chunks
- Rerank Threshold (rerank_threshold): By setting this value, you enable the cross-encoder, which reranks the results and keeps those whose relevance is above the percentile you specify (0.7 being the 70th percentile). More information on the re-ranker can be found here.
- Focus (focus): Specify a focus within the main theme. This is then used when building the LLM-generated mindmap
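The parameters listed above can be gathered in one place and sanity-checked before any search is run. The sketch below is an assumption-laden illustration: the ScreeningParams class and its validate method are hypothetical and not part of the library, and the theme wording, entity names, dates, and model placeholder are example values only. The source ID D6D057 is the Crypto Wire identifier quoted above.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ScreeningParams:
    """Hypothetical container for the screening parameters (illustration only)."""
    main_theme: str
    entities: List[str]
    start_date: date
    end_date: date
    document_type: str = "news"  # "news", "transcripts" or "filings"
    sources: List[str] = field(default_factory=list)
    control_entities: List[str] = field(default_factory=list)
    fiscal_year: Optional[int] = None
    llm_model: str = "your-llm-model"  # placeholder, not a real model name
    rerank_threshold: Optional[float] = None
    focus: str = ""

    def validate(self) -> None:
        """Raise ValueError when the parameters contradict the rules above."""
        if self.start_date > self.end_date:
            raise ValueError("start_date must not be after end_date")
        if self.document_type in ("transcripts", "filings") and self.fiscal_year is None:
            raise ValueError("fiscal_year is required for transcripts and filings")
        if self.rerank_threshold is not None and not 0.0 <= self.rerank_threshold <= 1.0:
            raise ValueError("rerank_threshold must lie between 0 and 1")


params = ScreeningParams(
    main_theme="Crypto institutional adoption",  # illustrative wording
    entities=["Bitcoin", "Ethereum"],            # illustrative subset
    start_date=date(2025, 1, 1),                 # illustrative dates
    end_date=date(2025, 6, 30),
    sources=["D6D057"],                          # Crypto Wire source ID
    rerank_threshold=0.7,
)
params.validate()
```

Keeping the checks next to the values makes rules such as "fiscal year is required for transcripts and filings" fail fast instead of surfacing mid-run.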
Mindmap a Theme Taxonomy with Bigdata Research Tools
You can leverage Bigdata Research Tools to generate a comprehensive theme taxonomy with an LLM, breaking down a megatrend into smaller, well-defined concepts for more targeted analysis.
Retrieve Content
With the theme taxonomy and screening parameters, you can leverage the Bigdata API to run a search on news. We need to define 3 more parameters for searching:
- Frequency (freq): The frequency of the date ranges to search over. Supported values:
  - Y: Yearly intervals.
  - M: Monthly intervals.
  - W: Weekly intervals.
  - D: Daily intervals.
  Defaults to 3M.
- Document Limit (document_limit): The maximum number of documents to return per query to Bigdata API.
- Batch Size (batch_size): The number of entities to include in a single batched query.
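To see what freq and batch_size mean in practice, the sketch below splits a date range into consecutive intervals and groups entities into batches. It is one plausible reading of the parameters, not the library's internal logic: the helper names split_date_range and batch_entities are hypothetical, and the assumption that a number may prefix the unit comes from the documented default of 3M.

```python
import calendar
import re
from datetime import date, timedelta


def _add_months(d, months):
    """Shift a date by whole months, clamping the day to the month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def _shift(d, count, unit):
    """Move a date forward by `count` units of Y, M, W or D."""
    if unit == "D":
        return d + timedelta(days=count)
    if unit == "W":
        return d + timedelta(weeks=count)
    if unit == "M":
        return _add_months(d, count)
    return _add_months(d, 12 * count)  # unit == "Y"


def split_date_range(start, end, freq="3M"):
    """Split [start, end] into consecutive, inclusive intervals of length freq."""
    match = re.fullmatch(r"(\d*)([YMWD])", freq)
    if not match:
        raise ValueError(f"unsupported freq: {freq!r}")
    count = int(match.group(1) or 1)
    if count < 1:
        raise ValueError("freq multiplier must be at least 1")
    unit = match.group(2)
    intervals, cursor, k = [], start, 0
    while cursor <= end:
        k += 1
        upper = _shift(start, count * k, unit)  # always measured from start
        intervals.append((cursor, min(upper - timedelta(days=1), end)))
        cursor = upper
    return intervals


def batch_entities(entities, batch_size):
    """Group entities into lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [entities[i:i + batch_size] for i in range(0, len(entities), batch_size)]


intervals = split_date_range(date(2025, 1, 1), date(2025, 6, 30), "3M")
batches = batch_entities(["BTC", "ETH", "SOL", "XRP", "ADA"], 2)
```

Under this reading, a shorter freq or a smaller batch_size produces more, narrower queries, and document_limit then caps how many documents each of those queries returns.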
Label the Results
Use an LLM to analyze each text chunk and determine its relevance to the sub-themes. Any chunks which aren’t explicitly linked to the main theme will be filtered out.
Assess Thematic Exposure
We’ll look at the top 10 most exposed entities to our main theme. The function get_scored_df calculates the composite thematic score by summing the scores across the sub-themes for each entity (df_entity).
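The filtering and summing described in the last two sections can be sketched in a few lines. This is not the implementation of get_scored_df: the function score_entities is hypothetical, the "unclear" label for filtered chunks is an assumption, and so is the rule that each relevant chunk adds one point to its sub-theme. The entity names and labels are toy data.

```python
from collections import defaultdict


def score_entities(labeled_chunks, unclear_label="unclear", top_n=10):
    """Count relevant chunks per entity and sub-theme, then sum to a composite."""
    counts = defaultdict(lambda: defaultdict(int))
    for chunk in labeled_chunks:
        if chunk["label"] == unclear_label:
            continue  # drop chunks not explicitly linked to the theme
        counts[chunk["entity"]][chunk["label"]] += 1

    rows = [
        {
            "entity": entity,
            "sub_theme_scores": dict(by_theme),
            "composite_score": sum(by_theme.values()),
        }
        for entity, by_theme in counts.items()
    ]
    # Highest composite score first; ties broken alphabetically by entity.
    rows.sort(key=lambda row: (-row["composite_score"], row["entity"]))
    return rows[:top_n]


# Toy labeled chunks, for illustration only.
chunks = [
    {"entity": "Asset A", "label": "Custody Solutions"},
    {"entity": "Asset A", "label": "KYC/AML Frameworks"},
    {"entity": "Asset B", "label": "Custody Solutions"},
    {"entity": "Asset B", "label": "unclear"},
]

ranking = score_entities(chunks)
```

Here ranking lists Asset A first with a composite score of 2, since both of its chunks were labeled with a sub-theme, while Asset B keeps only one of its two chunks.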




Conclusion
The Crypto Institutional Adoption Analysis provides a powerful way to identify cryptocurrencies positioned to benefit from institutional investment flows and regulatory acceptance. By leveraging our dedicated Crypto Wire [D6D057] source - the ‘gold-standard’ for cryptocurrency intelligence - and applying LLM-based classification, you can:
- Identify institutional adoption leaders - Find cryptocurrencies with the strongest regulatory compliance, institutional partnerships, and enterprise-grade infrastructure
- Track adoption readiness - Assess which digital assets are implementing KYC/AML frameworks, custody solutions, and institutional-grade security measures
- Monitor regulatory positioning - Evaluate how different cryptocurrencies are adapting to evolving regulatory requirements and institutional standards
- Discover early adoption signals - Spot cryptocurrencies gaining institutional traction before it becomes widely recognized in the market