Why evaluate?
When financial institutions adopt agentic platforms for research automation, report quality is not optional. Accuracy, source provenance, analytical depth, and professional presentation must meet institutional standards across every report. To measure how well Bigdata MCP tools support this bar, we run structured comparative evaluations. Each evaluation takes the same prompt and generates two reports side by side: one grounded in Bigdata MCP tools (with access to SEC filings, earnings transcripts, analyst notes, and structured financial data) and one using general web search. Both reports are then scored across eight dimensions on a 1-10 scale. The goal is not to declare a winner in every category, but to surface where premium data access materially improves output quality and where it does not, giving teams a clear picture of the value MCP-grounded workflows deliver at scale.

Comparative Evaluation example
Figma Q4 2025 Earnings Digest
On February 21, 2026, we evaluated the following reports:

- Claude + Web Search: Generated using Claude and the general web search tool
- Claude + Bigdata MCP: Generated using Claude, Bigdata MCP tools and Bigdata skill workflow
1. Factual Accuracy of Financial Data
| Criterion | Claude + Web Search | Claude + Bigdata MCP |
|---|---|---|
| Q4 Revenue ($303.8M, +40% YoY) | Correct | Correct |
| FY Revenue ($1.056B, +41% YoY) | Correct | Correct |
| Non-GAAP EPS ($0.08 vs $0.07 est.) | States consensus was $(0.20), comparing GAAP vs non-GAAP. LSEG consensus for adj. EPS was $0.07, not $(0.20). The “$0.28 beat” is misleading. | Correct |
| GAAP Net Loss ($226.6M) | Correct | Correct |
| Non-GAAP Op Income ($44M, 14%) | Correct | Correct |
| Q4 2024 Non-GAAP Op Margin | States 26%. Plausible from prior quarter data but not independently verified as Q4 2024 specifically. | Not stated |
| NDR (136%, +5pp QoQ) | Correct | Correct |
| Guidance (Q1: $315-$317M; FY26: $1.366-$1.374B) | Correct | Correct |
| FY25 Non-GAAP Op Income ($130M) | Referenced indirectly | Correct (explicitly stated, sourced) |
| Cash position ($1.7B) | Correct | Correct |
| FY25 Adjusted FCF ($243.4M) | States $242.7M, a minor ~$0.7M discrepancy | Correct |
| GAAP Gross Margin decline (92% to 82%) | Correctly identified | Not broken out separately |
| Customer >$10K ARR (13,861) | Correct | Correct |
| Customer >$1M ARR (67, +68% YoY) | Correct | Correct |
| Prior-year Q4 revenue ($216.9M) | Correct | Correct |
| Stock after-hours move (~15%) | Stated 15% | Stated 15-18% |
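
The EPS row turns on a basis mismatch, and the arithmetic is easy to check. The sketch below uses only the figures from the table above; the function and variable names are illustrative and not part of any Bigdata tooling. It shows that the "$0.28 beat" appears only when the reported non-GAAP EPS is measured against the $(0.20) figure, while a like-for-like comparison against the LSEG adjusted consensus gives a one-cent beat.

```python
from decimal import Decimal


def eps_surprise(actual: str, estimate: str) -> Decimal:
    """Return actual minus estimate, using Decimal to avoid float rounding noise."""
    return Decimal(actual) - Decimal(estimate)


# Figures taken from the table above; names are hypothetical.
reported_non_gaap_eps = "0.08"
non_gaap_consensus = "0.07"    # LSEG adjusted EPS consensus
mismatched_estimate = "-0.20"  # the $(0.20) figure cited in the web search report

like_for_like = eps_surprise(reported_non_gaap_eps, non_gaap_consensus)
mismatched = eps_surprise(reported_non_gaap_eps, mismatched_estimate)

print(f"Like-for-like beat: ${like_for_like}")  # $0.01
print(f"Mismatched 'beat':  ${mismatched}")     # $0.28
```

A check of this kind could be run on any reported surprise, provided the actual and the estimate are confirmed to be on the same basis before they are subtracted.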
2. Source Quality and Attribution
| Criterion | Web Search | Bigdata MCP |
|---|---|---|
| Number of unique sources | 11 | 17 |
| Primary sources used | StockTitan (press release), CNBC, Yahoo Finance, Fortune | SEC filings (8-K), Quartr transcripts, earnings slides, press release |
| Analyst note access | Indirect: analyst actions sourced from aggregator sites (GuruFocus, Meyka) | Direct access to broker notes (Wells Fargo, JPM, MS, GS, Stifel, RBC, Piper) via Bigdata |
| Transcript depth | Earnings call highlights via MarketBeat/Yahoo summary | Full earnings call transcript cited with specific claims |
| Source diversity | Good: mix of press release, news coverage, analyst aggregators | Excellent: mix of filings, transcripts, slides, analyst notes, news |
| All links functional | All links are publicly accessible | Bigdata.com links require platform access; Quartr links are PDFs |
3. Completeness and Coverage
| Section | Web Search | Bigdata MCP |
|---|---|---|
| Executive Summary | Yes | Yes |
| Financial Results Table | Yes | Yes |
| Revenue & Margin Analysis | Detailed, includes GAAP gross margin trend | Detailed |
| Customer KPIs Table | Missing YoY for $10K and $100K cohorts | Full table with Q4 2024 comparisons |
| AI/Product Metrics | Slightly more detail (60% non-designers, 80% also use Design) | Good |
| Enterprise Anecdotes | Less specific | Specific examples |
| Forward Guidance | Full table | Full table with consensus comparisons |
| Strategic Commentary | 4 priorities, includes AI partnerships (MCP, model integrations) | 3 priorities well explained |
| Cash Flow & Balance Sheet | Includes deferred revenue ($595.3M), capital allocation (Weavy acquisition) | Detailed with derived metrics (net debt, working capital, current ratio) |
| Surprises vs. Expectations | 4 positive, 4 negative, slightly more granular | 4 positive, 3 negative |
| Analyst Reactions | 6 firms, includes Barclays upgrade, but fewer details | 7 firms with names, ratings, old/new PTs |
| Investment Implications | Well structured, includes specific valuation context (~12x revenue) | Well structured |
| Sources Section | 11 sources | 17 sources |
| Disclaimer | Included | Missing |
4. Analytical Depth and Insight
| Criterion | Web Search | Bigdata MCP |
|---|---|---|
| Revenue beat contextualization | Same | Quantifies vs. guidance and consensus |
| Margin trajectory analysis | Better: includes GAAP gross margin erosion from AI costs | FY25 to FY26 compression discussed |
| NDR analysis | Good: “highest in 10 quarters” framing | Good: 5pp QoQ and historical context |
| International opportunity sizing | Good but less emphasis on the gap | Excellent: 85% MAU vs 54% revenue gap quantified |
| Valuation context | Better: provides ~12x revenue multiple, $12B market cap | Limited: notes stock -77% from highs |
| Competitive risk framing | Names specific competitors (Anthropic, OpenAI) | “Vibe coding” risk noted |
| FCF trajectory analysis | Better: notes Q1 41% to Q4 13% sequential decline, explains drivers | Q4 only |
| Consumption model risk | Noted similarly | Noted as execution risk |
| SBC analysis | Better: quantifies $1.36B FY total, $975.7M one-time Q3 charge | Brief |
| Guidance conservatism assessment | Good but less explicit | Strong: notes 40% exit rate vs 30% guide |
5. Structure and Readability
Both reports follow nearly identical section structures, which is sensible for an earnings digest format.

| Criterion | Web Search | Bigdata MCP |
|---|---|---|
| Logical flow | Excellent | Excellent |
| Use of tables | Well-placed | Well-placed, informative |
| Scannability | Clear headers, numbered lists | Clear headers, numbered lists |
| Length appropriateness | ~220 lines, appropriate | ~210 lines, appropriate |
| Internal consistency | Minor: the EPS consensus mismatch creates internal confusion | Consistent |
| “Chain of thought” artifacts at top | Clean, no processing artifacts | Line 1 contains visible processing notes |
6. Analyst Coverage Quality
| Criterion | Web Search | Bigdata MCP |
|---|---|---|
| Number of firms covered | 6 | 7 |
| Named analysts | 1 (Jaluria) | 5 (Turrin, Murphy, Porter, Rudoff, Jaluria) |
| Prior PTs shown | Yes | Yes, enables change tracking |
| Rating changes noted | Explicit (maintained/upgraded) | Implicit (all maintained) |
| Unique coverage | Barclays upgrade to Neutral (not in Bigdata MCP report) | — |
| Consensus context | “Hold, avg $41-$48”; this range seems high relative to the individual targets shown (most are $30-$44) | “Hold, median $35, range $30-$44” |
| Narrative context | Explains SaaS derating | Explains paradox of lowered PTs despite beat |
7. Timeliness and Recency
Both reports are dated February 21, 2026 (3 days post-earnings on Feb 18).

| Criterion | Web Search | Bigdata MCP |
|---|---|---|
| Incorporates earnings data | Yes | Yes |
| Incorporates analyst reactions (Feb 19-20) | Yes | Yes |
| Incorporates stock reaction | Yes | Yes |
| Pre-earnings context (prior guidance) | Implied | Q3 guidance cited |
8. Professional Polish and Trust Signals
| Criterion | Web Search | Bigdata MCP |
|---|---|---|
| Disclaimer | Present | Missing |
| Processing artifacts | None | Chain-of-thought visible on line 1 |
| Branding | “Powered by Bigdata.com” | “Powered by Bigdata.com” |
| Tone | Professional, measured | Professional, measured |
| Balanced (bull/bear) | Both sides presented | Both sides presented |
| Data-to-claim traceability | Inline citations to 11 sources | Inline citations to 17 sources |
Summary Scorecard
| Evaluation Dimension | Claude + Web Search | Claude + Bigdata MCP |
|---|---|---|
| 1. Factual Accuracy | 7.5 | 9.0 |
| 2. Source Quality & Attribution | 7.5 | 9.5 |
| 3. Completeness & Coverage | 8.5 | 9.0 |
| 4. Analytical Depth & Insight | 9.0 | 8.5 |
| 5. Structure & Readability | 9.0 | 8.0 |
| 6. Analyst Coverage Quality | 7.5 | 9.0 |
| 7. Timeliness & Recency | 9.5 | 9.5 |
| 8. Professional Polish & Trust | 9.0 | 8.0 |
| Overall Average | 8.4 | 8.8 |
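
The overall averages can be reproduced from the eight dimension scores. The sketch below assumes an unweighted mean, which is consistent with the figures in the scorecard; the dictionary layout and variable names are illustrative only. It also lists which dimensions each report leads, in keeping with the goal of showing where premium data access helps and where it does not.

```python
from statistics import mean

# (Claude + Web Search, Claude + Bigdata MCP) scores from the scorecard above.
scores = {
    "Factual Accuracy": (7.5, 9.0),
    "Source Quality & Attribution": (7.5, 9.5),
    "Completeness & Coverage": (8.5, 9.0),
    "Analytical Depth & Insight": (9.0, 8.5),
    "Structure & Readability": (9.0, 8.0),
    "Analyst Coverage Quality": (7.5, 9.0),
    "Timeliness & Recency": (9.5, 9.5),
    "Professional Polish & Trust": (9.0, 8.0),
}

# Unweighted mean across the eight dimensions (an assumption about the method).
web_avg = mean(web for web, _ in scores.values())
mcp_avg = mean(mcp for _, mcp in scores.values())

# Split the dimensions by which report scored higher; ties are kept separate.
mcp_leads = [name for name, (web, mcp) in scores.items() if mcp > web]
web_leads = [name for name, (web, mcp) in scores.items() if web > mcp]
ties = [name for name, (web, mcp) in scores.items() if web == mcp]

print(f"Web Search average:  {web_avg:.1f}")  # 8.4
print(f"Bigdata MCP average: {mcp_avg:.1f}")  # 8.8
print("Bigdata MCP leads:", mcp_leads)
print("Web Search leads:", web_leads)
print("Tied:", ties)
```

On these scores the Bigdata MCP report leads on four dimensions, the web search report leads on three, and one is tied.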

