We are sunsetting our SDKs and will no longer add new features, security patches, bug fixes, or technical support for them. To access the latest capabilities and ongoing improvements, we encourage you to migrate to our RESTful API. SDK support will officially end on December 31, 2026. On this date, the underlying endpoints used by the SDKs and related documentation will be decommissioned. To avoid any disruption to your services, please ensure your migration is complete by that date. For migration assistance, please contact us at support@bigdata.com.

Labeler (class)

Base class for labeling operations using an LLM.
Parameters (constructor)
  • llm_model (str): Name of the LLM model to use (e.g., "openai::gpt-4o-mini").
  • unknown_label (str, optional): Label for unclear classifications (default: "unclear").
  • temperature (float, optional): Temperature for the LLM model (default: 0).
Key Methods
  • _deserialize_label_responses(responses): Deserialize LLM responses into a DataFrame.
  • _run_labeling_prompts(prompts, system_prompt, max_workers=100): Run prompts concurrently and collect LLM responses.
Example
from bigdata_research_tools.labeler import Labeler

labeler = Labeler(llm_model="openai::gpt-4o-mini")
# Use labeler._run_labeling_prompts(...) and labeler._deserialize_label_responses(...)
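
The concurrent fan-out that a method like _run_labeling_prompts performs can be sketched with the standard library alone. This is an illustration of the general pattern, not the library's implementation: `fake_llm_call` and `run_prompts_concurrently` are hypothetical names, and the stub returns a fixed string instead of contacting a model.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List


def fake_llm_call(system_prompt: str, prompt: str) -> str:
    """Hypothetical stand-in for a real LLM request; returns a fixed JSON string."""
    return '{"motivation": "stub", "label": "unclear"}'


def run_prompts_concurrently(
    prompts: List[str],
    system_prompt: str,
    call: Callable[[str, str], str] = fake_llm_call,
    max_workers: int = 100,
) -> Dict[int, str]:
    """Send each prompt on a worker thread and key the responses by prompt index."""
    responses: Dict[int, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the position of its prompt in the input list.
        futures = {pool.submit(call, system_prompt, p): i for i, p in enumerate(prompts)}
        for future in as_completed(futures):
            responses[futures[future]] = future.result()
    return responses


results = run_prompts_concurrently(["text a", "text b"], "You are a labeler.")
print(sorted(results))
```

Keying responses by prompt index keeps each answer tied to its text even though threads finish in arbitrary order.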

get_prompts_for_labeler

Generate a list of user messages, one for each text to be labeled by the labeling system.
Parameters
  • texts (List[str]): Texts to label.
  • textsconfig (Optional[List[Dict]], optional): Optional fields to include in the prompts in addition to the text.
Returns
  • List[str]: List of prompts for the labeling system.
Example
from bigdata_research_tools.labeler import get_prompts_for_labeler

texts = ["Chunk 0 text here", "Chunk 1 text here"]
prompts = get_prompts_for_labeler(texts)
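
The exact prompt format is not documented on this page. As a rough sketch of how per-text extras from textsconfig could be merged into one user message per text, the following assumes JSON-encoded messages; `build_prompts`, the `sentence_id` field, and the `entity` key are hypothetical and are not taken from the library.

```python
import json
from typing import Dict, List, Optional


def build_prompts(texts: List[str], textsconfig: Optional[List[Dict]] = None) -> List[str]:
    """Hypothetical illustration: one JSON user message per text, plus per-text extras."""
    configs = textsconfig or [{} for _ in texts]
    if len(configs) != len(texts):
        raise ValueError("textsconfig must have one entry per text")
    return [
        # The index lets a response be matched back to the text that produced it.
        json.dumps({"sentence_id": i, **config, "text": text})
        for i, (text, config) in enumerate(zip(texts, configs))
    ]


prompts = build_prompts(
    ["Chunk 0 text here", "Chunk 1 text here"],
    textsconfig=[{"entity": "Company A"}, {"entity": "Company B"}],
)
print(len(prompts))
```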

parse_labeling_response

Parse the response from the LLM model used for labeling.
Parameters
  • response (str): The response from the LLM model as a raw string.
Returns
  • dict: Parsed dictionary with keys:
    • motivation
    • label
Example
from bigdata_research_tools.labeler import parse_labeling_response

response = '{"motivation": "Relevant to AI", "label": "AI"}'
parsed = parse_labeling_response(response)
print(parsed["label"])
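
LLM output is not always valid JSON, so callers may want a defensive wrapper around parsing. The sketch below shows one way to fall back to an unknown label on malformed input. `safe_parse` is a hypothetical helper built on the standard `json` module; it does not describe how parse_labeling_response itself handles errors.

```python
import json
from typing import Dict


def safe_parse(response: str, unknown_label: str = "unclear") -> Dict[str, str]:
    """Hypothetical helper: parse a labeling response, falling back on bad input."""
    fallback = {"motivation": "", "label": unknown_label}
    try:
        data = json.loads(response)
    except json.JSONDecodeError:
        return fallback
    if not isinstance(data, dict):
        # Valid JSON that is not an object (e.g. a list) is treated as unusable.
        return fallback
    return {
        "motivation": str(data.get("motivation", "")),
        "label": str(data.get("label", unknown_label)),
    }


print(safe_parse('{"motivation": "Relevant to AI", "label": "AI"}')["label"])
print(safe_parse("not json")["label"])
```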

NarrativeLabeler (class)

LLM-powered labeler for narrative labeling.
Parameters (constructor)
  • llm_model (str): Name of the LLM model to use.
  • label_prompt (str, optional): Custom prompt for labeling (default: the built-in system prompt).
  • unknown_label (str, optional): Label for unclear classifications (default: "unclear").
  • temperature (float, optional): Temperature for the LLM model (default: 0).
Key Methods
  • get_labels(theme_labels, texts, max_workers=50): Label a list of texts with the provided theme labels.
  • post_process_dataframe(df): Post-process the labeled DataFrame for export.
Example
from bigdata_research_tools.labeler import NarrativeLabeler

labeler = NarrativeLabeler(llm_model="openai::gpt-4o-mini")
labels_df = labeler.get_labels(
    theme_labels=["AI", "Cloud"],
    texts=["AI is transforming business.", "Cloud adoption is accelerating."]
)
processed_df = labeler.post_process_dataframe(labels_df)
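
After labeling, a common follow-up is to drop rows that received the unknown label and count the rest. The column layout of the returned DataFrame is not documented here, so this sketch works on plain dictionaries and assumes each labeled row carries a `label` field; `summarize_labels` and the sample rows are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def summarize_labels(rows: List[Dict[str, str]], unknown_label: str = "unclear") -> Counter:
    """Hypothetical helper: count labels, skipping unlabeled and unknown rows."""
    kept = [row for row in rows if row.get("label") not in (None, unknown_label)]
    return Counter(row["label"] for row in kept)


rows = [
    {"text": "AI is transforming business.", "label": "AI"},
    {"text": "Cloud adoption is accelerating.", "label": "Cloud"},
    {"text": "The weather was mild.", "label": "unclear"},
]
counts = summarize_labels(rows)
print(counts["AI"], counts["Cloud"])
```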

ScreenerLabeler (class)

LLM-powered labeler for thematic screener labeling.
Parameters (constructor)
  • llm_model (str): Name of the LLM model to use.
  • label_prompt (str, optional): Custom prompt for labeling (default: the built-in system prompt).
  • unknown_label (str, optional): Label for unclear classifications (default: "unclear").
  • temperature (float, optional): Temperature for the LLM model (default: 0).
Key Methods
  • get_labels(main_theme, labels, texts, max_workers=50): Label a list of texts with the provided main theme and labels.
  • post_process_dataframe(df): Post-process the labeled DataFrame for export, including company/entity columns and placeholder replacement.
Example
from bigdata_research_tools.labeler import ScreenerLabeler

labeler = ScreenerLabeler(llm_model="openai::gpt-4o-mini")
labels_df = labeler.get_labels(
    main_theme="AI",
    labels=["AI", "Cloud"],
    texts=["AI is transforming business.", "Cloud adoption is accelerating."]
)
processed_df = labeler.post_process_dataframe(labels_df)
