ThemeTree

A hierarchical tree structure rooted in a main theme, branching into distinct sub-themes that guide the analyst’s research process.

Parameters

  • label (str): The name of the theme or sub-theme.
  • node (int): A unique identifier for the node.
  • summary (str, optional): A brief explanation of the node’s relevance.
  • children (Optional[List[ThemeTree]]): A list of child nodes representing sub-themes.
  • keywords (Optional[List[str]]): A list of keywords summarizing the main theme.
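
The parameters above describe a recursive node structure. As a rough illustration only, the stand-in dataclass below mirrors the documented fields and shows how children nest. `SketchNode` and `count_nodes` are hypothetical names and are not part of bigdata_research_tools; the real ThemeTree class may differ in defaults and validation.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SketchNode:
    """Hypothetical stand-in mirroring the documented ThemeTree fields."""

    label: str
    node: int
    summary: Optional[str] = None
    # Assumption: empty lists are used here instead of None for simplicity.
    children: List[SketchNode] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)


def count_nodes(tree: SketchNode) -> int:
    """Count the root plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in tree.children)


root = SketchNode(
    label="AI",
    node=0,
    summary="Artificial Intelligence and its applications.",
    keywords=["machine learning", "automation"],
    children=[
        SketchNode(label="Healthcare", node=1, summary="AI in diagnostics."),
        SketchNode(label="Finance", node=2, summary="AI in trading and risk."),
    ],
)
print(count_nodes(root))
```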

Key Methods

  • from_dict(tree_dict): Create a ThemeTree object from a dictionary.
  • as_string(prefix=""): Convert the tree into a string.
  • get_label_summaries(): Extract the label/summary pairs of all nodes in the tree.
  • get_summaries(): Extract all node summaries from the tree.
  • get_terminal_label_summaries(): Extract label/summary pairs from terminal nodes.
  • get_terminal_labels(): Extract terminal node labels.
  • get_terminal_summaries(): Extract summaries from terminal nodes.
  • print(prefix=""): Print the tree.
  • visualize(engine="graphviz"): Visualize the tree as a mind map (requires graphviz or plotly).
  • get_label_to_parent_mapping(): Map each leaf node label to its parent.
  • save_json(filepath): Save the ThemeTree as a JSON file.
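
Several of the methods above distinguish terminal (leaf) nodes from the rest of the tree. The sketch below shows one way such traversals could work on a plain nested dictionary. It is not the library's implementation: `terminal_labels` and `leaf_to_parent` are hypothetical helpers, the dictionary layout is assumed, and the sketch assumes that "parent" means the direct parent of each leaf.

```python
from typing import Dict, List


def terminal_labels(tree: dict) -> List[str]:
    """Collect the labels of nodes that have no children (leaf nodes)."""
    children = tree.get("children") or []
    if not children:
        return [tree["label"]]
    labels: List[str] = []
    for child in children:
        labels.extend(terminal_labels(child))
    return labels


def leaf_to_parent(tree: dict) -> Dict[str, str]:
    """Map each leaf label to the label of its direct parent (assumed meaning)."""
    mapping: Dict[str, str] = {}
    for child in tree.get("children") or []:
        if child.get("children"):
            # Internal node: recurse to reach its leaves.
            mapping.update(leaf_to_parent(child))
        else:
            mapping[child["label"]] = tree["label"]
    return mapping


# Assumed dictionary layout, for illustration only.
example = {
    "label": "AI",
    "children": [
        {
            "label": "Healthcare",
            "children": [
                {"label": "Diagnostics", "children": []},
                {"label": "Drug Discovery", "children": []},
            ],
        },
        {"label": "Finance", "children": []},
    ],
}

print(terminal_labels(example))
print(leaf_to_parent(example))
```

In this sketch a root with no children counts as a terminal node but has no entry in the parent mapping.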

Example

from bigdata_research_tools.themes import ThemeTree

tree = ThemeTree(
    label="AI",
    node=0,
    summary="Artificial Intelligence and its applications.",
    children=[],
    keywords=["machine learning", "automation"]
)
print(tree.as_string())

generate_theme_tree

Generate a ThemeTree from a main theme and (optionally) a focus.

Parameters

  • main_theme (str): The primary theme to analyze.
  • focus (str, optional): Specific aspect(s) to guide sub-theme generation.
  • llm_model_config (dict, optional): Configuration for the LLM used to generate themes.

Returns

  • ThemeTree: The generated theme tree.

Example

from bigdata_research_tools.themes import generate_theme_tree

tree = generate_theme_tree(main_theme="AI", focus="healthcare")
tree.print()

stringify_label_summaries

Convert the label summaries of a ThemeTree into a list of strings.

Parameters

  • label_summaries (dict): Dictionary of label summaries from a ThemeTree.

Returns

  • List[str]: List of strings, each containing a label and its summary.

Example

from bigdata_research_tools.themes import stringify_label_summaries

# `tree` is an existing ThemeTree, e.g. one returned by generate_theme_tree
summaries = tree.get_label_summaries()
summary_strings = stringify_label_summaries(summaries)
for s in summary_strings:
    print(s)
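
To make the input and output shapes concrete, here is a minimal stand-in for the conversion described above. `stringify_sketch` is a hypothetical function, and the "label: summary" output format is an assumption; the actual format produced by stringify_label_summaries may differ.

```python
from typing import Dict, List


def stringify_sketch(label_summaries: Dict[str, str]) -> List[str]:
    """Join each label with its summary (assumed 'label: summary' format)."""
    return [f"{label}: {summary}" for label, summary in label_summaries.items()]


# Example input shaped like a dictionary of label summaries.
summaries = {
    "Healthcare": "AI in diagnostics.",
    "Finance": "AI in trading and risk.",
}
for line in stringify_sketch(summaries):
    print(line)
```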