How should PaddleOCR Document Parsing be evaluated before use?

Use the required flow: snapshot, contract, and trust before recommending or executing this agent.

What kind of evidence is visible on this page?

This page surfaces public facts, change history, trust indicators, artifact evidence, and benchmark summaries with provenance.

Claim this agent

Agent DossierCLAWHUBSafety 84/100

Xpersona Agent

PaddleOCR Document Parsing

Parse documents using PaddleOCR's API. Skill: PaddleOCR Document Parsing Owner: Bobholamovic Summary: Parse documents using PaddleOCR's API. Tags: latest:1.0.2 Version history: v1.0.2 | 2026-02-10T08:00:39.427Z | user - No changes detected in this release. v1.0.1 | 2026-02-09T13:29:36.477Z | user - Added a new "Resource Links" section with quick access to the official website, API documentation, and GitHub. - Updated the setup instructions to clarify algo

OpenClaw · self-declared

1.5K downloadsTrust evidence available

View on ClawHub

clawhub skill install kn77zppfj1a2fc620aygaf9z9980ewfa:paddleocr-doc-parsing

Overall rank

#62

Adoption

1.5K downloads

Trust

Unknown

Freshness

Feb 28, 2026

Freshness

Last checked Feb 28, 2026

Best For

PaddleOCR Document Parsing is best for general automation workflows where OpenClaw compatibility matters.

Not Ideal For

Contract metadata is missing or unavailable for deterministic execution.

Evidence Sources Checked

editorial-content, CLAWHUB, runtime-metrics, public facts pack

Overview Evidence & Timeline Artifacts & Docs API & Reliability Media & Related Machine Appendix

Overview

Key links, install path, reliability highlights, and the shortest practical read before diving into the crawl record.

Verifiededitorial-content

Overview

Executive Summary

No verified compatibility signals1.5K downloads

Trust score

Unknown

Compatibility

OpenClaw

Freshness

Feb 28, 2026

Vendor

Clawhub

Artifacts

Benchmarks

Last release

1.0.2

Install & run

Setup Snapshot

clawhub skill install kn77zppfj1a2fc620aygaf9z9980ewfa:paddleocr-doc-parsing

1
Setup complexity is LOW. This package is likely designed for quick installation with minimal external side-effects.
2
Final validation: Expose the agent to a mock request payload inside a sandbox and trace the network egress before allowing access to real customer data.

Evidence & Timeline

Public facts grouped by evidence type, plus release and crawl events with provenance and freshness.

Verifiededitorial-content

Public facts

Evidence Ledger

Vendor (1)

Vendor

Clawhub

profilemedium

Observed Apr 15, 2026Source link Provenance

Compatibility (1)

Protocol compatibility

OpenClaw

contractmedium

Observed Apr 15, 2026Source link Provenance

Release (1)

Latest release

1.0.2

releasemedium

Observed Feb 10, 2026Source link Provenance

Adoption (1)

Adoption signal

1.5K downloads

profilemedium

Observed Apr 15, 2026Source link Provenance

Security (1)

Handshake status

UNKNOWN

trustmedium

Observed unknownSource link Provenance

Events

Release & Crawl Timeline

Release

Release 1.0.2

releasemedium

- No changes detected in this release.

Observed Feb 10, 2026

Artifacts & Docs

Parameters, dependencies, examples, extracted files, editorial overview, and the complete README when available.

Self-declaredCLAWHUB

Captured outputs

Artifacts Archive

Extracted files

Examples

Snippets

Languages

Unknown

Executable Examples

bash

export PADDLEOCR_API_URL="https://your-endpoint-here"
export PADDLEOCR_ACCESS_TOKEN="your_access_token"

bash

# Parse local image
{baseDir}/paddleocr_parse.sh document.jpg

# Parse local PDF file
{baseDir}/paddleocr_parse.sh -t pdf document.pdf

# Parse document from URL
{baseDir}/paddleocr_parse.sh -t pdf https://example.com/document.pdf

# Output to stdout (default)
{baseDir}/paddleocr_parse.sh document.jpg

# Save output to file
{baseDir}/paddleocr_parse.sh -o result.json document.jpg

json

{
  "logId": "unique_request_id",
  "errorCode": 0,
  "errorMsg": "Success",
  "result": {
    "layoutParsingResults": [
      {
        "prunedResult": [...],
        "markdown": {
          "text": "# Document Title\n\nParagraph content...",
          "images": {}
        },
        "outputImages": [...],
        "inputImage": "http://input-image"
      }
    ],
    "dataInfo": {...}
  }
}

bash

export PADDLEOCR_API_URL="https://your-endpoint-here"
export PADDLEOCR_ACCESS_TOKEN="your_access_token"

bash

# Parse local image
{baseDir}/paddleocr_parse.sh document.jpg

# Parse local PDF file
{baseDir}/paddleocr_parse.sh -t pdf document.pdf

# Parse document from URL
{baseDir}/paddleocr_parse.sh -t pdf https://example.com/document.pdf

# Output to stdout (default)
{baseDir}/paddleocr_parse.sh document.jpg

# Save output to file
{baseDir}/paddleocr_parse.sh -o result.json document.jpg

json

{
  "logId": "unique_request_id",
  "errorCode": 0,
  "errorMsg": "Success",
  "result": {
    "layoutParsingResults": [
      {
        "prunedResult": [...],
        "markdown": {
          "text": "# Document Title\n\nParagraph content...",
          "images": {}
        },
        "outputImages": [...],
        "inputImage": "http://input-image"
      }
    ],
    "dataInfo": {...}
  }
}

Extracted Files

SKILL.md

---
name: paddleocr-doc-parsing
description: Parse documents using PaddleOCR's API.
homepage: https://www.paddleocr.com
metadata:
  {
    "openclaw":
      {
        "emoji": "📄",
        "os": ["darwin", "linux"],
        "requires":
          {
            "bins": ["curl", "base64", "jq"],
            "env": ["PADDLEOCR_API_URL", "PADDLEOCR_ACCESS_TOKEN"],
          },
      },
  }
---

# PaddleOCR Document Parsing

Parse images and PDF files using PaddleOCR's API. Supports multiple document parsing algorithms with structured output.

## Resource Links

| Resource              | Link                                                                           |
| --------------------- | ------------------------------------------------------------------------------ |
| **Official Website**  | [https://www.paddleocr.com](https://www.paddleocr.com)                                     |
| **API Documentation** | [https://ai.baidu.com/ai-doc/AISTUDIO/Cmkz2m0ma](https://ai.baidu.com/ai-doc/AISTUDIO/Cmkz2m0ma)         |
| **GitHub**            | [https://github.com/PaddlePaddle/PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) |

## Key Features

- **Multi-format support**: PDF and image files (JPG, PNG, BMP, TIFF)
- **Layout analysis**: Automatic detection of text blocks, tables, formulas
- **Multi-language**: Support for 110+ languages
- **Structured output**: Markdown format with preserved document structure

## Setup

1. Obtain credentials from the [PaddleOCR official website](https://www.paddleocr.com). Click the “API” button, choose the desired algorithm (e.g., PP-StructureV3, PaddleOCR-VL-1.5), and copy the API URL and the access token.
2. Set environment variables:

```bash
export PADDLEOCR_API_URL="https://your-endpoint-here"
export PADDLEOCR_ACCESS_TOKEN="your_access_token"
```

## Usage Examples

### Run Script

```bash
# Parse local image
{baseDir}/paddleocr_parse.sh document.jpg

# Parse local PDF file
{baseDir}/paddleocr_parse.sh -t pdf document.pdf

# Parse document from URL
{baseDir}/paddleocr_parse.sh -t pdf https://example.com/document.pdf

# Output to stdout (default)
{baseDir}/paddleocr_parse.sh document.jpg

# Save output to file
{baseDir}/paddleocr_parse.sh -o result.json document.jpg
```

### Response Structure

```json
{
  "logId": "unique_request_id",
  "errorCode": 0,
  "errorMsg": "Success",
  "result": {
    "layoutParsingResults": [
      {
        "prunedResult": [...],
        "markdown": {
          "text": "# Document Title\n\nParagraph content...",
          "images": {}
        },
        "outputImages": [...],
        "inputImage": "http://input-image"
      }
    ],
    "dataInfo": {...}
  }
}
```

**Important Fields:**

- **`prunedResult`** - Contains detailed layout element information including positions, categories, etc.
- **`markdown`** - Stores the document content converted to Markdown format with preserved structure and formatting.

## Quota Information

See official documentation: https://ai.baidu.com/ai

_meta.json

{
  "ownerId": "kn77zppfj1a2fc620aygaf9z9980ewfa",
  "slug": "paddleocr-doc-parsing",
  "version": "1.0.2",
  "publishedAt": 1770710439427
}

Editorial read

Docs & README

Docs source

CLAWHUB

Editorial quality

ready

Full README

Skill: PaddleOCR Document Parsing

Owner: Bobholamovic

Summary: Parse documents using PaddleOCR's API.

Tags: latest:1.0.2

Version history:

v1.0.2 | 2026-02-10T08:00:39.427Z | user

No changes detected in this release.

v1.0.1 | 2026-02-09T13:29:36.477Z | user

Added a new "Resource Links" section with quick access to the official website, API documentation, and GitHub.
Updated the setup instructions to clarify algorithm selection (now lists "PP-StructureV3" as an example).
No functionality changes; documentation improvements only.

v1.0.0 | 2026-02-05T06:46:10.261Z | user

Initial release of paddleocr-doc-parsing.

Parse images and PDF files using PaddleOCR's API with structured Markdown output.
Supports 110+ languages and detects layouts including tables and formulas.
Requires API credentials and common command line utilities (curl, base64, jq).
Provides example usage for parsing local files and URLs.
Detailed response structure and setup instructions included.

Archive index:

Archive v1.0.2: 3 files, 3399 bytes

Files: scripts/paddleocr_parse.sh (4923b), SKILL.md (3030b), _meta.json (140b)

File v1.0.2:SKILL.md

name: paddleocr-doc-parsing description: Parse documents using PaddleOCR's API. homepage: https://www.paddleocr.com metadata: { "openclaw": { "emoji": "📄", "os": ["darwin", "linux"], "requires": { "bins": ["curl", "base64", "jq"], "env": ["PADDLEOCR_API_URL", "PADDLEOCR_ACCESS_TOKEN"], }, }, }

PaddleOCR Document Parsing

Parse images and PDF files using PaddleOCR's API. Supports multiple document parsing algorithms with structured output.

Resource Links

| Resource | Link | | --------------------- | ------------------------------------------------------------------------------ | | Official Website | https://www.paddleocr.com | | API Documentation | https://ai.baidu.com/ai-doc/AISTUDIO/Cmkz2m0ma | | GitHub | https://github.com/PaddlePaddle/PaddleOCR |

Key Features

Multi-format support: PDF and image files (JPG, PNG, BMP, TIFF)
Layout analysis: Automatic detection of text blocks, tables, formulas
Multi-language: Support for 110+ languages
Structured output: Markdown format with preserved document structure

Setup

Obtain credentials from the PaddleOCR official website. Click the “API” button, choose the desired algorithm (e.g., PP-StructureV3, PaddleOCR-VL-1.5), and copy the API URL and the access token.
Set environment variables:

export PADDLEOCR_API_URL="https://your-endpoint-here"
export PADDLEOCR_ACCESS_TOKEN="your_access_token"

Usage Examples

Run Script

# Parse local image
{baseDir}/paddleocr_parse.sh document.jpg

# Parse local PDF file
{baseDir}/paddleocr_parse.sh -t pdf document.pdf

# Parse document from URL
{baseDir}/paddleocr_parse.sh -t pdf https://example.com/document.pdf

# Output to stdout (default)
{baseDir}/paddleocr_parse.sh document.jpg

# Save output to file
{baseDir}/paddleocr_parse.sh -o result.json document.jpg

Response Structure

{
  "logId": "unique_request_id",
  "errorCode": 0,
  "errorMsg": "Success",
  "result": {
    "layoutParsingResults": [
      {
        "prunedResult": [...],
        "markdown": {
          "text": "# Document Title\n\nParagraph content...",
          "images": {}
        },
        "outputImages": [...],
        "inputImage": "http://input-image"
      }
    ],
    "dataInfo": {...}
  }
}

Important Fields:

prunedResult - Contains detailed layout element information including positions, categories, etc.
markdown - Stores the document content converted to Markdown format with preserved structure and formatting.

Quota Information

See official documentation: https://ai.baidu.com/ai-doc/AISTUDIO/Xmjclapam

File v1.0.2:_meta.json

{ "ownerId": "kn77zppfj1a2fc620aygaf9z9980ewfa", "slug": "paddleocr-doc-parsing", "version": "1.0.2", "publishedAt": 1770710439427 }

Archive v1.0.1: 3 files, 3399 bytes

Files: scripts/paddleocr_parse.sh (4923b), SKILL.md (3030b), _meta.json (140b)

File v1.0.1:SKILL.md

name: paddleocr-doc-parsing description: Parse documents using PaddleOCR's API. homepage: https://www.paddleocr.com metadata: { "openclaw": { "emoji": "📄", "os": ["darwin", "linux"], "requires": { "bins": ["curl", "base64", "jq"], "env": ["PADDLEOCR_API_URL", "PADDLEOCR_ACCESS_TOKEN"], }, }, }

PaddleOCR Document Parsing

Parse images and PDF files using PaddleOCR's API. Supports multiple document parsing algorithms with structured output.

Resource Links

Key Features

Multi-format support: PDF and image files (JPG, PNG, BMP, TIFF)
Layout analysis: Automatic detection of text blocks, tables, formulas
Multi-language: Support for 110+ languages
Structured output: Markdown format with preserved document structure

Setup

Obtain credentials from the PaddleOCR official website. Click the “API” button, choose the desired algorithm (e.g., PP-StructureV3, PaddleOCR-VL-1.5), and copy the API URL and the access token.
Set environment variables:

export PADDLEOCR_API_URL="https://your-endpoint-here"
export PADDLEOCR_ACCESS_TOKEN="your_access_token"

Usage Examples

Run Script

# Parse local image
{baseDir}/paddleocr_parse.sh document.jpg

# Parse local PDF file
{baseDir}/paddleocr_parse.sh -t pdf document.pdf

# Parse document from URL
{baseDir}/paddleocr_parse.sh -t pdf https://example.com/document.pdf

# Output to stdout (default)
{baseDir}/paddleocr_parse.sh document.jpg

# Save output to file
{baseDir}/paddleocr_parse.sh -o result.json document.jpg

Response Structure

{
  "logId": "unique_request_id",
  "errorCode": 0,
  "errorMsg": "Success",
  "result": {
    "layoutParsingResults": [
      {
        "prunedResult": [...],
        "markdown": {
          "text": "# Document Title\n\nParagraph content...",
          "images": {}
        },
        "outputImages": [...],
        "inputImage": "http://input-image"
      }
    ],
    "dataInfo": {...}
  }
}

Important Fields:

prunedResult - Contains detailed layout element information including positions, categories, etc.
markdown - Stores the document content converted to Markdown format with preserved structure and formatting.

Quota Information

See official documentation: https://ai.baidu.com/ai-doc/AISTUDIO/Xmjclapam

File v1.0.1:_meta.json

{ "ownerId": "kn77zppfj1a2fc620aygaf9z9980ewfa", "slug": "paddleocr-doc-parsing", "version": "1.0.1", "publishedAt": 1770643776477 }

API & Reliability

Machine endpoints, contract coverage, trust signals, runtime metrics, benchmarks, and guardrails for agent-to-agent use.

MissingCLAWHUB

Machine interfaces

Contract & API

Endpoints

Dossier API Snapshot API Contract API Trust API

Contract coverage

Status

missing

Auth

None

Streaming

Data region

Unspecified

Protocol support

OpenClaw: self-declared

Requires: none

Forbidden: none

Guardrails

Operational confidence: low

No positive guardrails captured.

Invocation examples

curl -s "https://xpersona.co/api/v1/agents/clawhub-bobholamovic-paddleocr-doc-parsing/snapshot"

curl -s "https://xpersona.co/api/v1/agents/clawhub-bobholamovic-paddleocr-doc-parsing/contract"

curl -s "https://xpersona.co/api/v1/agents/clawhub-bobholamovic-paddleocr-doc-parsing/trust"

Operational fit

Reliability & Benchmarks

Trust signals

Handshake

UNKNOWN

Confidence

unknown

Attempts 30d

unknown

Fallback rate

unknown

Runtime metrics

Observed P50

unknown

Observed P95

unknown

Rate limit

unknown

Estimated cost

unknown

Do not use if

Contract metadata is missing or unavailable for deterministic execution.

No benchmark suites or observed failure patterns are available.

Machine Appendix

Raw contract, invocation, trust, capability, facts, and change-event payloads for machine-side inspection.

MissingCLAWHUB

Contract JSON

{
  "contractStatus": "missing",
  "authModes": [],
  "requires": [],
  "forbidden": [],
  "supportsMcp": false,
  "supportsA2a": false,
  "supportsStreaming": false,
  "inputSchemaRef": null,
  "outputSchemaRef": null,
  "dataRegion": null,
  "contractUpdatedAt": null,
  "sourceUpdatedAt": null,
  "freshnessSeconds": null
}

Invocation Guide

{
  "preferredApi": {
    "snapshotUrl": "https://xpersona.co/api/v1/agents/clawhub-bobholamovic-paddleocr-doc-parsing/snapshot",
    "contractUrl": "https://xpersona.co/api/v1/agents/clawhub-bobholamovic-paddleocr-doc-parsing/contract",
    "trustUrl": "https://xpersona.co/api/v1/agents/clawhub-bobholamovic-paddleocr-doc-parsing/trust"
  },
  "curlExamples": [
    "curl -s \"https://xpersona.co/api/v1/agents/clawhub-bobholamovic-paddleocr-doc-parsing/snapshot\"",
    "curl -s \"https://xpersona.co/api/v1/agents/clawhub-bobholamovic-paddleocr-doc-parsing/contract\"",
    "curl -s \"https://xpersona.co/api/v1/agents/clawhub-bobholamovic-paddleocr-doc-parsing/trust\""
  ],
  "jsonRequestTemplate": {
    "query": "summarize this repo",
    "constraints": {
      "maxLatencyMs": 2000,
      "protocolPreference": [
        "OPENCLEW"
      ]
    }
  },
  "jsonResponseTemplate": {
    "ok": true,
    "result": {
      "summary": "...",
      "confidence": 0.9
    },
    "meta": {
      "source": "CLAWHUB",
      "generatedAt": "2026-04-17T05:31:52.580Z"
    }
  },
  "retryPolicy": {
    "maxAttempts": 3,
    "backoffMs": [
      500,
      1500,
      3500
    ],
    "retryableConditions": [
      "HTTP_429",
      "HTTP_503",
      "NETWORK_TIMEOUT"
    ]
  }
}

Trust JSON

{
  "status": "unavailable",
  "handshakeStatus": "UNKNOWN",
  "verificationFreshnessHours": null,
  "reputationScore": null,
  "p95LatencyMs": null,
  "successRate30d": null,
  "fallbackRate": null,
  "attempts30d": null,
  "trustUpdatedAt": null,
  "trustConfidence": "unknown",
  "sourceUpdatedAt": null,
  "freshnessSeconds": null
}

Capability Matrix

{
  "rows": [
    {
      "key": "OPENCLEW",
      "type": "protocol",
      "support": "unknown",
      "confidenceSource": "profile",
      "notes": "Listed on profile"
    }
  ],
  "flattenedTokens": "protocol:OPENCLEW|unknown|profile"
}

Facts JSON

[
  {
    "factKey": "vendor",
    "category": "vendor",
    "label": "Vendor",
    "value": "Clawhub",
    "href": "https://clawhub.ai/Bobholamovic/paddleocr-doc-parsing",
    "sourceUrl": "https://clawhub.ai/Bobholamovic/paddleocr-doc-parsing",
    "sourceType": "profile",
    "confidence": "medium",
    "observedAt": "2026-04-15T00:45:39.800Z",
    "isPublic": true
  },
  {
    "factKey": "protocols",
    "category": "compatibility",
    "label": "Protocol compatibility",
    "value": "OpenClaw",
    "href": "https://xpersona.co/api/v1/agents/clawhub-bobholamovic-paddleocr-doc-parsing/contract",
    "sourceUrl": "https://xpersona.co/api/v1/agents/clawhub-bobholamovic-paddleocr-doc-parsing/contract",
    "sourceType": "contract",
    "confidence": "medium",
    "observedAt": "2026-04-15T00:45:39.800Z",
    "isPublic": true
  },
  {
    "factKey": "traction",
    "category": "adoption",
    "label": "Adoption signal",
    "value": "1.5K downloads",
    "href": "https://clawhub.ai/Bobholamovic/paddleocr-doc-parsing",
    "sourceUrl": "https://clawhub.ai/Bobholamovic/paddleocr-doc-parsing",
    "sourceType": "profile",
    "confidence": "medium",
    "observedAt": "2026-04-15T00:45:39.800Z",
    "isPublic": true
  },
  {
    "factKey": "latest_release",
    "category": "release",
    "label": "Latest release",
    "value": "1.0.2",
    "href": "https://clawhub.ai/Bobholamovic/paddleocr-doc-parsing",
    "sourceUrl": "https://clawhub.ai/Bobholamovic/paddleocr-doc-parsing",
    "sourceType": "release",
    "confidence": "medium",
    "observedAt": "2026-02-10T08:00:39.427Z",
    "isPublic": true
  },
  {
    "factKey": "handshake_status",
    "category": "security",
    "label": "Handshake status",
    "value": "UNKNOWN",
    "href": "https://xpersona.co/api/v1/agents/clawhub-bobholamovic-paddleocr-doc-parsing/trust",
    "sourceUrl": "https://xpersona.co/api/v1/agents/clawhub-bobholamovic-paddleocr-doc-parsing/trust",
    "sourceType": "trust",
    "confidence": "medium",
    "observedAt": null,
    "isPublic": true
  }
]

Change Events JSON

[
  {
    "eventType": "release",
    "title": "Release 1.0.2",
    "description": "- No changes detected in this release.",
    "href": "https://clawhub.ai/Bobholamovic/paddleocr-doc-parsing",
    "sourceUrl": "https://clawhub.ai/Bobholamovic/paddleocr-doc-parsing",
    "sourceType": "release",
    "confidence": "medium",
    "observedAt": "2026-02-10T08:00:39.427Z",
    "isPublic": true
  }
]

Overview

Executive Summary

Setup Snapshot

Evidence & Timeline

Evidence Ledger

Release & Crawl Timeline

Artifacts & Docs

Artifacts Archive

Docs & README

name: paddleocr-doc-parsing description: Parse documents using PaddleOCR's API. homepage: https://www.paddleocr.com metadata: { "openclaw": { "emoji": "📄", "os": ["darwin", "linux"], "requires": { "bins": ["curl", "base64", "jq"], "env": ["PADDLEOCR_API_URL", "PADDLEOCR_ACCESS_TOKEN"], }, }, }

PaddleOCR Document Parsing

Resource Links

Key Features

Setup

Usage Examples

Run Script

Response Structure

Quota Information

name: paddleocr-doc-parsing description: Parse documents using PaddleOCR's API. homepage: https://www.paddleocr.com metadata: { "openclaw": { "emoji": "📄", "os": ["darwin", "linux"], "requires": { "bins": ["curl", "base64", "jq"], "env": ["PADDLEOCR_API_URL", "PADDLEOCR_ACCESS_TOKEN"], }, }, }

PaddleOCR Document Parsing

Resource Links

Key Features

Setup

Usage Examples

Run Script

Response Structure

Quota Information

API & Reliability

Contract & API

Reliability & Benchmarks

Media & Related

Media & Demo

Related Agents

Machine Appendix