# Fetch document

> Returns a time-limited pre-signed URL to download the document in annotated (structured) JSON format. When you GET the URL, the response contains document content, metadata and annotations. The URL expires after 24 hours; request a new one if needed.
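
The two-step flow described above (request the pre-signed URL, then GET it) can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: `fetch_document`, `http_get_json`, and the injectable `get_json` parameter are hypothetical names, and it assumes the pre-signed URL is downloaded without the API key header. The base URL, path, `X-API-KEY` header, and ID pattern are taken from the OpenAPI definition on this page.

```python
import json
import re
import urllib.request

API_BASE = "https://api.bigdata.com"  # from the `servers` entry in the spec
DOCUMENT_ID_RE = re.compile(r"^[A-F0-9]{32}$")  # `document_id` pattern from the spec


def http_get_json(url, headers=None):
    """GET a URL and decode its JSON body (standard library only)."""
    request = urllib.request.Request(url, headers=headers or {})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def fetch_document(document_id, api_key, get_json=http_get_json):
    """Two-step fetch: ask the API for a pre-signed URL, then download it."""
    if not DOCUMENT_ID_RE.fullmatch(document_id):
        raise ValueError("document_id must be 32 uppercase hex characters")
    # Step 1: the endpoint returns {"url": ..., "web_content": ...}.
    link = get_json(
        f"{API_BASE}/v1/documents/{document_id}",
        {"X-API-KEY": api_key},
    )
    # Step 2: download the annotated JSON from the pre-signed URL.
    # Assumption: the signature in the URL is enough, so no API key is sent.
    document = get_json(link["url"])
    return document, link.get("web_content", False)
```

Calling `fetch_document("776769957735667D2F01F695EF4F1231", api_key)` would return the parsed document together with the `web_content` flag. With the default `http_get_json`, `urllib` raises `urllib.error.HTTPError` for the 400, 403, and 404 responses, so callers may want to catch that exception. The `get_json` parameter exists only so the flow can be exercised without network access.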

**Scope:** Use this endpoint only for **RavenPack** documents.

When `web_content` is true, the returned document includes a direct URL to the publisher's original article on the web.
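
Because the pre-signed URL expires after 24 hours, a client that holds on to links may want to request a fresh one before the old one lapses. The sketch below shows one way to do that. `PresignedUrlCache`, `request_url`, and the five-minute safety margin are hypothetical, and the code assumes the 24 hours are counted from the moment the URL is issued, which this page does not state explicitly.

```python
import time

URL_LIFETIME_SECONDS = 24 * 60 * 60  # this page states the URL expires after 24 hours
SAFETY_MARGIN_SECONDS = 5 * 60       # hypothetical margin to avoid nearly expired links


class PresignedUrlCache:
    """Remember pre-signed URLs per document and re-request them before expiry."""

    def __init__(self, request_url, clock=time.time):
        self._request_url = request_url  # callable: document_id -> pre-signed URL
        self._clock = clock              # injectable so the expiry logic can be tested
        self._entries = {}               # document_id -> (url, issued_at)

    def get(self, document_id):
        now = self._clock()
        cached = self._entries.get(document_id)
        if cached is not None:
            url, issued_at = cached
            # Reuse the link only while it is comfortably inside its lifetime.
            if now - issued_at < URL_LIFETIME_SECONDS - SAFETY_MARGIN_SECONDS:
                return url
        # Missing or (nearly) expired: ask the endpoint for a new URL.
        url = self._request_url(document_id)
        self._entries[document_id] = (url, now)
        return url
```

Here `request_url` would wrap the call to `GET /v1/documents/{document_id}` and return the `url` field of its response. The issue time is recorded locally because the response does not include an expiry timestamp.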



## OpenAPI

````yaml /api-rest/openapi/openapi_search_service.json get /v1/documents/{document_id}
openapi: 3.0.3
info:
  title: Bigdata Search API
  version: 1.0.0
  description: >-
    Easily find the most relevant information from trusted sources and your own
    data. Use it to power agents that give accurate, real-time answers.
servers:
  - url: https://api.bigdata.com/
security:
  - ApiKeyAuth: []
paths:
  /v1/documents/{document_id}:
    get:
      tags:
        - Search
      summary: Fetch document
      description: >-
        Returns a time-limited pre-signed URL to download the document in
        annotated (structured) JSON format. When you GET the URL, the response
        contains document content, metadata and annotations. The URL expires
        after 24 hours; request a new one if needed.


        **Scope:** Use this endpoint only for **RavenPack** documents.


        When `web_content` is true, the returned document includes a direct URL
        to the publisher's original article on the web.
      parameters:
        - name: document_id
          in: path
          required: true
          description: >-
            The unique 32-character MD5 hex identifier for the document (e.g.,
            `776769957735667D2F01F695EF4F1231`).
          schema:
            type: string
            pattern: ^[A-F0-9]{32}$
            example: 776769957735667D2F01F695EF4F1231
      responses:
        '200':
          description: >-
            Returns an object with `url` and `web_content` fields. When
            `web_content` is true, the document downloaded from `url` contains a
            direct link to the original web page in addition to the full
            analytics.
          content:
            application/json:
              schema:
                type: object
                properties:
                  web_content:
                    type: boolean
                    description: >-
                      True when the downloaded document includes a direct URL
                      to the publisher's original article on the web.
                    example: false
                  url:
                    type: string
                    format: uri
                    description: >-
                      When you access the URL, you receive the complete document
                      in JSON format with the structure: document (metadata),
                      content (title and body blocks) and analytics
                      (document-level metrics, events array, entities array).
                    example: >-
                      https://documents.bigdata.com/documents/776769957735667D2F01F695EF4F1231?signature=abc123...
                  document:
                    $ref: '#/components/schemas/Document'
                    description: >-
                      **Returned by URL** - Document metadata including source
                      information, timestamps, and details.
                  content:
                    $ref: '#/components/schemas/Content'
                    description: >-
                      **Returned by URL** - Document content including title,
                      body blocks, entities, and sentences.
                  analytics:
                    description: >-
                      **Returned by URL** - Document-level analytics, events
                      array, and entities array.
                    type: object
                    properties:
                      document:
                        type: object
                        description: >-
                          Document-level analytics (analytics_version,
                          document_type, document_sentiment, etc.).
                      events:
                        type: array
                        description: >-
                          Detected events with topic, type, relevance, roles,
                          and sentiment.
                      entities:
                        type: array
                        description: >-
                          Detected entities with entity_type, entity_name,
                          relevance, and sentiment.
                required:
                  - url
              examples:
                response:
                  summary: API Response
                  description: >-
                    The endpoint returns a pre-signed URL to download the
                    document.
                  value:
                    url: >-
                      https://documents.bigdata.com/documents/776769957735667D2F01F695EF4F1231?signature=abc123...
                    web_content: false
                document_from_url:
                  summary: Document (returned by URL)
                  description: >-
                    When you access the URL (when web_content is false), you
                    receive the complete document in JSON format: document
                    metadata, content (title and body blocks) and analytics
                    (document-level metrics, events array, entities array). Same
                    structure as Get Annotated Document; see that endpoint for
                    the full example including TABLE, LIST_ORDERED,
                    LIST_UNORDERED body types and analytics.
                  value:
                    document:
                      rp_document_id: 776769957735667D2F01F695EF4F1231
                      source:
                        rp_source_id: DA0F7F
                        name: Quartr Reports
                        rank: 1
                      timestamp: '2026-02-03T08:43:02Z'
                      metadata:
                        file_name: Quartr_Company_Report_Q1_2025.pdf
                        content_type: application/pdf
                    content:
                      title:
                        text: 'Quartr Report: Company Overview Q1 2025'
                        sentences:
                          - start: 0
                            end: 31
                            sentiment: '0.00'
                            sentiment_confidence: '1.00'
                        entities:
                          - rp_entity_id: DD3BB1
                            start: 0
                            end: 9
                      body:
                        - type: TEXT
                          text: The first quarter results were announced today.
                          normalized_coordinates: []
                          sentences:
                            - start: 0
                              end: 46
                              sentiment: '0.01'
                              sentiment_confidence: '0.94'
                          entities:
                            - rp_entity_id: DD3BB1
                              start: 4
                              end: 17
                        - type: TABLE
                          rows:
                            - cells:
                                - type: CELL_HEADER
                                  content:
                                    - text: Quarter
                                - type: CELL_VALUE
                                  content:
                                    - text: Q1 2025
                        - type: LIST_UNORDERED
                          entries:
                            - key: •
                              content:
                                - text: Unordered dummy point
                        - type: LIST_ORDERED
                          entries:
                            - key: '1'
                              content:
                                - text: Ordered dummy point
                    analytics:
                      document:
                        analytics_version: '2.0'
                        analytics_revision_number: 0
                        document_type: TRANSCRIPT-RAW
                        document_record_count: 257
                        title_similarity_key: 0A4AD1E8BF251E1A90E3B2376E471E07
                        document_sentiment: 0.21
                        document_sentiment_confidence: 0.65
                        composite_sentiment_score: 0.04
                        sentiment_impact_projection: -0.2
                        stock_tone_sentiment: 0
                        earnings_tone_sentiment: 0
                        commentary_sentiment: 1
                        mergers_acquisitions_sentiment: 0
                        corporate_actions_sentiment: 0
                        earnings_release_sentiment: 0
                        product_key: EDGE
                        realtime: 'Y'
                      events:
                        - event_similarity_key: E7913FBE641945AC0670EEF684B6D8E0
                          topic: business
                          group: products-services
                          type: business-contract
                          event_relevance: 77
                          roles:
                            - rp_entity_id: DD3BB1
                              category: business-contract
                              fact_level: fact
                              document_record_index: 6
                              match_type: TEMPLATE
                              event_sentiment: 0.49
                              event_risk: 0.24
                              sustainability_sentiment: 0.24
                              credit_sentiment: 0.25
                              interest_rate_sentiment: 0
                              event_detection_distance: 0
                              event_text: 'Tesla: we entered into new contracts'
                              rp_event_detected_entity_id: DD3BB1
                              event_detected_entity_name: Tesla Inc.
                      entities:
                        - rp_entity_id: 4A6F00
                          entity_type: COMP
                          entity_name: Alphabet Inc.
                          country_code: US
                          document_record_index: 217
                          entity_hierarchy_level: 1
                          entity_detection_type: direct
                          entity_detection_distance: 0
                          entity_relevance: 26
                          entity_sentiment: 0.25
                          entity_sentiment_confidence: 0.05
                          entity_text_sentiment: 0.25
                          entity_text_sent_confidence: 0.05
                          analyst_ratings_sentiment: 0
                          multi_stock_sentiment: 0
        '400':
          description: Invalid document_id.
        '403':
          description: Access to the document is denied.
        '404':
          description: Document not found.
components:
  schemas:
    Document:
      type: object
      description: >-
        Complete document structure returned when accessing the URL. This is the
        JSON format used by Bigdata.com for structured document representation.
      properties:
        rp_document_id:
          type: string
          description: Internal document identifier
          example: 776769957735667D2F01F695EF4F1231
        source:
          $ref: '#/components/schemas/DocumentSourceDetails'
        metadata:
          $ref: '#/components/schemas/DocumentMetadataDetails'
      required:
        - rp_document_id
        - source
        - timestamp_utc
        - metadata
    Content:
      type: object
      description: >-
        Structured content extracted from the document including title and body
        blocks.
      properties:
        title:
          $ref: '#/components/schemas/ContentTitleBlock'
        body:
          $ref: '#/components/schemas/ContentBodyBlock'
      required:
        - title
        - body
    DocumentSourceDetails:
      type: object
      description: Information about the document source.
      properties:
        rp_source_id:
          type: string
          description: Identifier of the source system
          example: DA0F7F
        name:
          type: string
          description: Source display name (e.g., "Quartr Presentation Materials")
          example: Quartr Presentation Materials
        rank:
          type: integer
          description: Ranking classification of the source.
          example: 1
      required:
        - rp_source_id
        - name
        - rank
    DocumentMetadataDetails:
      type: object
      description: Additional document metadata.
      properties:
        url:
          type: string
          format: uri
          description: URL pointing to the original document location.
          example: https://files.quartr.com/raw-transcripts/example.json
      required:
        - provider_document_id
        - url
        - media_type
    ContentTitleBlock:
      type: object
      description: A title content block representing the title of the document.
      properties:
        text:
          type: string
          description: Extracted document title
          example: 'Tesla Inc: Q1 2025 Earnings Call'
        sentences:
          type: array
          description: Index ranges for title text segments
          items:
            $ref: '#/components/schemas/Sentence'
        entities:
          type: array
          description: Entities detected inside the title text
          items:
            $ref: '#/components/schemas/TextEntity'
    ContentBodyBlock:
      type: array
      description: >-
        Array of content blocks extracted from the document. Each item
        represents a block of content such as a text paragraph, table, list,
        heading, or footer. All block types include the common fields defined
        in ContentBlockCommonFields.
      items:
        oneOf:
          - $ref: '#/components/schemas/TextBlock'
          - $ref: '#/components/schemas/TableBlock'
          - $ref: '#/components/schemas/ListBlock'
          - $ref: '#/components/schemas/HeadingBlock'
          - $ref: '#/components/schemas/FooterBlock'
    Sentence:
      type: object
      description: Sentence segmentation information with sentiment analysis.
      properties:
        start:
          type: integer
          description: Start character index of the sentence.
          example: 0
        end:
          type: integer
          description: End character index of the sentence.
          example: 46
        sentiment:
          type: string
          description: Sentiment score ranging from -1.00 (negative) to 1.00 (positive).
          example: '0.01'
        sentiment_confidence:
          type: string
          description: Confidence score for the sentiment analysis (0.00 to 1.00).
          example: '0.94'
      required:
        - start
        - end
    TextEntity:
      type: object
      description: An entity detected within the text.
      properties:
        rp_entity_id:
          type: string
          description: Bigdata.com unique entity identifier.
          example: DD3BB1
        name:
          type: string
          description: Display name of the entity.
          example: Tesla Inc
        type:
          type: string
          description: Entity type classification.
          enum:
            - COMPANY
            - PERSON
            - PLACE
            - PRODUCT
            - ORGANIZATION
            - ETF
        start:
          type: integer
          description: Start character index where the entity appears in the text.
          example: 0
        end:
          type: integer
          description: End character index where the entity appears in the text.
          example: 9
      required:
        - rp_entity_id
        - start
        - end
    TextBlock:
      title: Paragraph
      type: object
      description: >-
        A text content block representing paragraphs, headings, or other text
        elements. Includes all fields from ContentBlockCommonFields.
      allOf:
        - $ref: '#/components/schemas/ContentBlockCommonFields'
    TableBlock:
      title: Table
      type: object
      description: >-
        A table content block. Includes all fields from
        ContentBlockCommonFields.
      allOf:
        - type: object
          properties:
            type:
              type: string
              description: Table block type.
              enum:
                - TABLE
            rows:
              type: array
              description: Row definitions for the table.
              items:
                $ref: '#/components/schemas/TableRow'
          required:
            - type
            - rows
        - $ref: '#/components/schemas/ContentBlockCommonFields'
    ListBlock:
      title: List
      type: object
      description: >-
        A list content block (ordered or unordered). Includes all fields from
        ContentBlockCommonFields.
      allOf:
        - type: object
          properties:
            type:
              type: string
              description: List block type.
              enum:
                - LIST_ORDERED
                - LIST_UNORDERED
            entries:
              type: array
              description: List entries/items.
              items:
                $ref: '#/components/schemas/ListEntry'
          required:
            - type
            - entries
        - $ref: '#/components/schemas/ContentBlockCommonFields'
    HeadingBlock:
      title: Heading
      type: object
      description: >-
        A heading content block. Includes all fields from
        ContentBlockCommonFields.
      allOf:
        - type: object
          properties:
            type:
              type: string
              description: Heading block type.
              enum:
                - HEADING
          required:
            - type
        - $ref: '#/components/schemas/ContentBlockCommonFields'
    FooterBlock:
      title: Footer
      type: object
      description: >-
        A footer content block. Includes all fields from
        ContentBlockCommonFields.
      allOf:
        - type: object
          properties:
            type:
              type: string
              description: Footer block type.
              enum:
                - FOOTER
          required:
            - type
        - $ref: '#/components/schemas/ContentBlockCommonFields'
    ContentBlockCommonFields:
      type: object
      description: >-
        Common fields present in all content block types (TextBlock, TableBlock,
        ListBlock, HeadingBlock, FooterBlock). These fields may appear alongside
        the block-specific fields.
      properties:
        normalized_coordinates:
          type: array
          description: Bounding boxes normalized to page dimensions
          items:
            $ref: '#/components/schemas/NormalizedCoordinates'
        text:
          type: string
          description: Extracted visible text (if any)
          example: The first quarter results were announced today.
        sentences:
          type: array
          description: Index ranges for sentence segmentation
          items:
            $ref: '#/components/schemas/Sentence'
        entities:
          type: array
          description: Entities detected in the text
          items:
            $ref: '#/components/schemas/TextEntity'
    TableRow:
      type: object
      description: A row within a table.
      properties:
        cells:
          type: array
          description: Cells in the row.
          items:
            $ref: '#/components/schemas/TableCell'
      required:
        - cells
    ListEntry:
      type: object
      description: An entry within a list.
      properties:
        key:
          type: string
          description: >-
            Bullet character for unordered lists (e.g., '•') or list number for
            ordered lists (e.g., '1').
          example: •
        content:
          type: array
          description: Content objects within the list entry.
          items:
            $ref: '#/components/schemas/CellContent'
      required:
        - key
        - content
    NormalizedCoordinates:
      type: object
      description: >-
        Bounding box coordinates normalized to page dimensions (values between 0
        and 1).
      properties:
        page:
          type: integer
          description: Page number (1-indexed).
          example: 1
        x:
          type: number
          format: float
          description: X coordinate (left edge) normalized to page width.
          example: 0.1
        'y':
          type: number
          format: float
          description: Y coordinate (top edge) normalized to page height.
          example: 0.15
        width:
          type: number
          format: float
          description: Width normalized to page width.
          example: 0.8
        height:
          type: number
          format: float
          description: Height normalized to page height.
          example: 0.05
    TableCell:
      type: object
      description: A cell within a table row.
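      properties:
        type:
          type: string
          description: Cell type (e.g., `CELL_HEADER`, `CELL_VALUE`).
          example: CELL_HEADER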
        content:
          type: array
          description: Content objects within the cell.
          items:
            $ref: '#/components/schemas/CellContent'
      required:
        - type
        - content
    CellContent:
      type: object
      description: Simple text content within a table cell or list entry.
      properties:
        text:
          type: string
          description: Text content.
          example: Quarter
      required:
        - text
  securitySchemes:
    ApiKeyAuth:
      type: apiKey
      in: header
      name: X-API-KEY

````
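
The body of a downloaded document is a list of typed blocks, so a consumer usually branches on each block's `type`. As a rough illustration of the schema above, the following sketch flattens the title and body blocks into plain text. The function names and the output layout (cells joined with ` | `, blocks separated by blank lines) are arbitrary choices for this example, not part of the API.

```python
def cell_text(contents):
    """Join the text of CellContent objects (used by table cells and list entries)."""
    return " ".join(item["text"] for item in contents)


def block_to_text(block):
    """Render one body block as plain text, based on its `type` field."""
    block_type = block.get("type")
    if block_type == "TABLE":
        # One line per row, cells separated by " | ".
        rows = (
            " | ".join(cell_text(cell["content"]) for cell in row["cells"])
            for row in block["rows"]
        )
        return "\n".join(rows)
    if block_type in ("LIST_ORDERED", "LIST_UNORDERED"):
        # `key` holds the bullet character or the list number.
        return "\n".join(
            f"{entry['key']} {cell_text(entry['content'])}"
            for entry in block["entries"]
        )
    # TEXT, HEADING, FOOTER, and unknown types: use the common `text` field.
    return block.get("text", "")


def document_to_text(document):
    """Concatenate the title and every body block of a downloaded document."""
    content = document["content"]
    parts = [content["title"].get("text", "")]
    parts.extend(block_to_text(block) for block in content["body"])
    return "\n\n".join(part for part in parts if part)
```

Passing the parsed JSON obtained from the pre-signed URL to `document_to_text` yields a single string. Falling back to `text` for unrecognized types keeps the function usable if new block types appear.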