Crawler Summary

mineru answer-first brief

Parse PDFs into clean Markdown using MinerU's VLM engine. Use when: (1) converting PDF to Markdown, (2) extracting text, tables, or formulas from PDFs, (3) batch processing multiple PDFs, (4) saving parsed content to Obsidian or knowledge bases. Supports LaTeX formulas, tables, images, and async parallel processing. Homepage: https://mineru.net. Published capability contract available. No trust telemetry is available yet. 1 GitHub star reported by the source. Last updated 2/24/2026.

Freshness

Last checked 2/24/2026

Best For

Contract is available with explicit auth and schema references.

Not Ideal For

mineru is not ideal for teams that need stronger public trust telemetry, lower setup complexity, or more explicit contract coverage before production rollout.

Evidence Sources Checked

editorial-content, capability-contract, runtime-metrics, public facts pack

Agent Dossier | GitHub | Safety: 89/100

mineru


OpenClaw (self-declared)

Public facts

6

Change events

0

Artifacts

0

Freshness

Feb 24, 2026

Verified: editorial-content | No verified compatibility signals | 1 GitHub star | Schema refs published | Trust evidence available

Trust score

Unknown

Compatibility

OpenClaw

Freshness

Feb 24, 2026

Vendor

Mineru

Artifacts

0

Benchmarks

0

Last release

Unpublished

Executive Summary

Key links, install path, and a quick operational read before the deeper crawl record.

Verified: editorial-content

Summary

Published capability contract available. No trust telemetry is available yet. 1 GitHub star reported by the source. Last updated 2/24/2026.

Setup snapshot

git clone https://github.com/Nebutra/MinerU-Skill.git
  1. Setup complexity is LOW. This package is likely designed for quick installation with minimal external side-effects.

  2. Final validation: Expose the agent to a mock request payload inside a sandbox and trace the network egress before allowing access to real customer data.

Evidence Ledger

Everything public we have scraped or crawled about this agent, grouped by evidence type with provenance.

Verified: editorial-content

Vendor (1)

Vendor: Mineru
Source type: profile; confidence: medium; observed Feb 24, 2026

Compatibility (2)

Protocol compatibility: OpenClaw
Source type: contract; confidence: medium; observed Feb 24, 2026

Auth modes: api_key
Source type: contract; confidence: high; observed Feb 24, 2026

Artifact (1)

Machine-readable schemas: OpenAPI or schema references published
Source type: contract; confidence: high; observed Feb 24, 2026

Adoption (1)

Adoption signal: 1 GitHub star
Source type: profile; confidence: medium; observed Feb 24, 2026

Security (1)

Handshake status: UNKNOWN
Source type: trust; confidence: medium; observed: unknown

Release & Crawl Timeline

Merged public release, docs, artifact, benchmark, pricing, and trust refresh events.

Self-declared: agent-index

Artifacts Archive

Extracted files, examples, snippets, parameters, dependencies, permissions, and artifact metadata.

Self-declared: GITHUB OPENCLEW

Extracted files

0

Examples

6

Snippets

0

Languages

typescript

Parameters

Executable Examples

bash

export MINERU_TOKEN="your-token-here"

bash

python3 scripts/mineru_v2.py --file ./document.pdf --output ./output/

bash

python3 scripts/mineru_v2.py \
  --dir ./pdfs/ \
  --output ./output/ \
  --workers 10 \
  --resume

bash

python3 scripts/mineru_v2.py \
  --dir ./pdfs/ \
  --output "$HOME/Library/Mobile Documents/com~apple~CloudDocs/Obsidian/VaultName/" \
  --resume

text

--dir PATH        Input directory of PDFs
--file PATH       Single PDF file  
--output PATH     Output directory (default: ./output/)
--workers N       Concurrent workers (default: 5, max: 15)
--resume          Skip already processed files
--timeout SEC     Per-file timeout (default: 600)

text

output/
├── document-name/
│   ├── document-name.md    # Main Markdown
│   ├── images/             # Extracted images
│   └── content.json        # Metadata

Docs & README

Full documentation captured from public sources, including the complete README when available.

Self-declared: GITHUB OPENCLEW

Docs source

GITHUB OPENCLEW

Editorial quality

ready


Full README

---
name: mineru
description: "Parse PDFs into clean Markdown using MinerU's VLM engine. Use when: (1) Converting PDF to Markdown, (2) Extracting text/tables/formulas from PDFs, (3) Batch processing multiple PDFs, (4) Saving parsed content to Obsidian or knowledge bases. Supports LaTeX formulas, tables, images, and async parallel processing."
homepage: https://mineru.net
metadata:
  openclaw:
    emoji: "📄"
    requires:
      bins: ["python3"]
      env: ["MINERU_TOKEN"]
    install:
      - id: pip
        kind: pip
        packages: ["requests", "aiohttp"]
        label: "Install Python dependencies (pip)"
---

MinerU PDF Parser

Parse PDF documents into Markdown with LaTeX formula preservation, table extraction, and image handling.

Setup

Get an API token from https://mineru.net/user-center/api-token (free tier: 2000 pages/day, 200 MB max), then export it:

export MINERU_TOKEN="your-token-here"
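A wrapper around the scripts can fail early with a clear message when the token is missing, instead of failing on the first API call. A minimal Python sketch; the helper name `require_token` is hypothetical and not part of the MinerU scripts:

```python
import os


def require_token(env=os.environ):
    """Return MINERU_TOKEN from the environment or raise a clear error.

    Hypothetical helper for illustration; the real scripts may read the
    token differently.
    """
    token = env.get("MINERU_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "MINERU_TOKEN is not set; export it before running the scripts"
        )
    return token
```

Calling `require_token()` at the top of a wrapper script is enough; passing a plain dict as `env` makes the check easy to test.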

Commands

Single File

python3 scripts/mineru_v2.py --file ./document.pdf --output ./output/

Batch Directory with Resume

python3 scripts/mineru_v2.py \
  --dir ./pdfs/ \
  --output ./output/ \
  --workers 10 \
  --resume

Direct to Obsidian

python3 scripts/mineru_v2.py \
  --dir ./pdfs/ \
  --output "$HOME/Library/Mobile Documents/com~apple~CloudDocs/Obsidian/VaultName/" \
  --resume

CLI Options

--dir PATH        Input directory of PDFs
--file PATH       Single PDF file  
--output PATH     Output directory (default: ./output/)
--workers N       Concurrent workers (default: 5, max: 15)
--resume          Skip already processed files
--timeout SEC     Per-file timeout (default: 600)
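When the script is driven from other Python code, the options above can be assembled into an argument list before handing it to `subprocess`. The sketch below uses only the flags listed above; the function name `build_command` is hypothetical, and treating `--file` and `--dir` as mutually exclusive is an assumption the option list does not state:

```python
import sys
from pathlib import Path

MAX_WORKERS = 15  # "max: 15" in the option list


def build_command(output, *, file=None, directory=None, workers=5,
                  resume=False, timeout=600, script="scripts/mineru_v2.py"):
    """Build an argv list for the MinerU script (illustrative helper)."""
    # Assumption: exactly one input source is given.
    if (file is None) == (directory is None):
        raise ValueError("pass exactly one of file= or directory=")
    if not 1 <= workers <= MAX_WORKERS:
        raise ValueError(f"workers must be between 1 and {MAX_WORKERS}")
    cmd = [sys.executable, script]
    if file is not None:
        cmd += ["--file", str(file)]
    else:
        cmd += ["--dir", str(directory)]
    # expanduser() resolves a leading "~", which a quoted shell argument would not.
    cmd += ["--output", str(Path(output).expanduser()),
            "--workers", str(workers),
            "--timeout", str(timeout)]
    if resume:
        cmd.append("--resume")
    return cmd
```

A typical call would be `subprocess.run(build_command("./output", directory="./pdfs", workers=10, resume=True), check=True)`.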

Script Selection

| Script | Use When |
|--------|----------|
| mineru_v2.py | Default - async parallel |
| mineru_async.py | Fast network, need 15+ workers |
| mineru_stable.py | Unstable network, sequential |

Output

output/
├── document-name/
│   ├── document-name.md    # Main Markdown
│   ├── images/             # Extracted images
│   └── content.json        # Metadata
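Given this layout, a caller can work out which PDFs still lack a Markdown result, for example to preview what a `--resume` run would pick up. This is a sketch under two assumptions: that `document-name` is the PDF's file name without its extension, and that the presence of the `.md` file marks a document as done. The script's own resume criterion may differ, and both function names are hypothetical:

```python
from pathlib import Path


def markdown_path(output_dir, pdf_path):
    """Expected Markdown location for a PDF, assuming output/<stem>/<stem>.md."""
    stem = Path(pdf_path).stem
    return Path(output_dir) / stem / f"{stem}.md"


def pending_pdfs(pdf_dir, output_dir):
    """PDFs in pdf_dir with no Markdown file yet in output_dir (sorted by name)."""
    return [
        pdf for pdf in sorted(Path(pdf_dir).glob("*.pdf"))
        if not markdown_path(output_dir, pdf).is_file()
    ]
```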

Supported Documents

  • Academic papers (LaTeX formulas)
  • Exam papers (考研 postgraduate entrance exam, 高考 college entrance exam)
  • Financial reports (tables)
  • Textbooks (formulas + diagrams)
  • Scanned PDFs (enable OCR)

Performance

| Workers | Speed |
|---------|-------|
| 1 (sequential) | 1.2 files/min |
| 5 | 3.1 files/min |
| 15 | 5.6 files/min |
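The throughput figures translate directly into batch-time estimates. The sketch below only does arithmetic on the three numbers in the table; the 100-file batch size is an arbitrary example, and real throughput will depend on document size and network conditions:

```python
# Files per minute, copied from the performance table.
THROUGHPUT = {1: 1.2, 5: 3.1, 15: 5.6}


def estimate_minutes(n_files, workers):
    """Estimated wall-clock minutes to process n_files at the listed rate."""
    return n_files / THROUGHPUT[workers]


def speedup(workers):
    """Throughput relative to the sequential (1 worker) figure."""
    return THROUGHPUT[workers] / THROUGHPUT[1]


for w in THROUGHPUT:
    print(f"{w:>2} workers: {estimate_minutes(100, w):5.1f} min for 100 files, "
          f"{speedup(w):.2f}x vs sequential")
```

By these numbers, 15 workers give roughly a 4.7x speedup over sequential processing rather than 15x, so adding workers has diminishing returns.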

Error Handling

  • 3x auto-retry with exponential backoff
  • Use --resume to skip completed files
  • Check logs for failed files
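The retry behavior described in the list can be sketched as follows. This is an illustration of exponential backoff, not the scripts' actual code: the helper name, the 1-second base delay, and reading "3x auto-retry" as three retries after the initial attempt are all assumptions.

```python
import time


def with_retries(func, *, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt and try again.

    Re-raises the last exception once `retries` retries are used up.
    """
    for attempt in range(retries + 1):
        try:
            return func()
        except Exception:
            if attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s with the defaults
```

The `sleep` parameter is injectable so the delays can be recorded or skipped in tests.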

API Reference

For detailed API documentation, see references/api_reference.md.

Contract & API

Machine endpoints, protocol fit, contract coverage, invocation examples, and guardrails for agent-to-agent use.

Verified: capability-contract

Contract coverage

Status

ready

Auth

api_key

Streaming

No

Data region

global

Protocol support

OpenClaw: self-declared

Requires: openclew, lang:typescript

Forbidden: none

Guardrails

Operational confidence: medium

Contract is available with explicit auth and schema references.
Trust confidence is not low and verification freshness is acceptable.

Invocation examples

curl -s "https://xpersona.co/api/v1/agents/nebutra-mineru-skill/snapshot"
curl -s "https://xpersona.co/api/v1/agents/nebutra-mineru-skill/contract"
curl -s "https://xpersona.co/api/v1/agents/nebutra-mineru-skill/trust"
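The same three endpoints can be read from Python with the standard library. The sketch assumes the endpoints return JSON, which the Machine Appendix suggests but this page does not confirm; `endpoint_url` and `fetch` are hypothetical helper names:

```python
import json
from urllib.request import urlopen

BASE = "https://xpersona.co/api/v1/agents"
KINDS = ("snapshot", "contract", "trust")


def endpoint_url(slug, kind):
    """Build the URL for one of the three documented endpoints."""
    if kind not in KINDS:
        raise ValueError(f"kind must be one of {KINDS}")
    return f"{BASE}/{slug}/{kind}"


def fetch(slug, kind, timeout=10):
    """GET the endpoint and decode the body as JSON (assumed format)."""
    with urlopen(endpoint_url(slug, kind), timeout=timeout) as resp:
        return json.load(resp)
```

For example, `fetch("nebutra-mineru-skill", "contract")` would request the contract document shown in the appendix.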

Reliability & Benchmarks

Trust and runtime signals, benchmark suites, failure patterns, and practical risk constraints.

Missing: runtime-metrics

Trust signals

Handshake

UNKNOWN

Confidence

unknown

Attempts 30d

unknown

Fallback rate

unknown

Runtime metrics

Observed P50

unknown

Observed P95

unknown

Rate limit

unknown

Estimated cost

unknown

No benchmark suites or observed failure patterns are available.

Media & Demo

Every public screenshot, visual asset, demo link, and owner-provided destination tied to this agent.

Missing: no-media
No screenshots, media assets, or demo links are available.

Machine Appendix

Contract JSON

{
  "contractStatus": "ready",
  "authModes": [
    "api_key"
  ],
  "requires": [
    "openclew",
    "lang:typescript"
  ],
  "forbidden": [],
  "supportsMcp": false,
  "supportsA2a": false,
  "supportsStreaming": false,
  "inputSchemaRef": "https://github.com/Nebutra/MinerU-Skill#input",
  "outputSchemaRef": "https://github.com/Nebutra/MinerU-Skill#output",
  "dataRegion": "global",
  "contractUpdatedAt": "2026-02-24T19:45:27.363Z",
  "sourceUpdatedAt": "2026-02-24T19:45:27.363Z",
  "freshnessSeconds": 4420856
}
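The `freshnessSeconds` value is easier to read as a duration. Interpreting it as the time elapsed since `contractUpdatedAt` is an assumption; under that reading, the arithmetic below gives about 51 days and lands on the `generatedAt` timestamp that appears in the Invocation Guide:

```python
from datetime import datetime, timedelta, timezone

# Values copied from the Contract JSON.
contract_updated_at = datetime(2026, 2, 24, 19, 45, 27, 363000, tzinfo=timezone.utc)
freshness = timedelta(seconds=4420856)

# Assumed meaning: age of the contract when the page was generated.
crawl_time = contract_updated_at + freshness

print(freshness.days, "days,", freshness.seconds // 3600, "hours")
print(crawl_time.isoformat())
```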

Invocation Guide

{
  "preferredApi": {
    "snapshotUrl": "https://xpersona.co/api/v1/agents/nebutra-mineru-skill/snapshot",
    "contractUrl": "https://xpersona.co/api/v1/agents/nebutra-mineru-skill/contract",
    "trustUrl": "https://xpersona.co/api/v1/agents/nebutra-mineru-skill/trust"
  },
  "curlExamples": [
    "curl -s \"https://xpersona.co/api/v1/agents/nebutra-mineru-skill/snapshot\"",
    "curl -s \"https://xpersona.co/api/v1/agents/nebutra-mineru-skill/contract\"",
    "curl -s \"https://xpersona.co/api/v1/agents/nebutra-mineru-skill/trust\""
  ],
  "jsonRequestTemplate": {
    "query": "summarize this repo",
    "constraints": {
      "maxLatencyMs": 2000,
      "protocolPreference": [
        "OPENCLEW"
      ]
    }
  },
  "jsonResponseTemplate": {
    "ok": true,
    "result": {
      "summary": "...",
      "confidence": 0.9
    },
    "meta": {
      "source": "GITHUB_OPENCLEW",
      "generatedAt": "2026-04-16T23:46:23.386Z"
    }
  },
  "retryPolicy": {
    "maxAttempts": 3,
    "backoffMs": [
      500,
      1500,
      3500
    ],
    "retryableConditions": [
      "HTTP_429",
      "HTTP_503",
      "NETWORK_TIMEOUT"
    ]
  }
}
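A client could apply the `retryPolicy` above roughly as follows. This is a sketch, not the service's reference client: the `send` callable and its `(condition, payload)` return convention are hypothetical, and since three attempts need only two waits, the third `backoffMs` entry goes unused here.

```python
import time

# Copied from the retryPolicy block of the Invocation Guide.
RETRY_POLICY = {
    "maxAttempts": 3,
    "backoffMs": [500, 1500, 3500],
    "retryableConditions": ["HTTP_429", "HTTP_503", "NETWORK_TIMEOUT"],
}


def call_with_policy(send, policy=RETRY_POLICY, sleep=time.sleep):
    """Call send() until it succeeds or the policy is exhausted.

    send() must return (condition, payload); condition is None on success,
    otherwise a string such as "HTTP_429" (hypothetical convention).
    """
    last = None
    for attempt in range(policy["maxAttempts"]):
        condition, payload = send()
        if condition is None:
            return payload
        if condition not in policy["retryableConditions"]:
            raise RuntimeError(f"non-retryable failure: {condition}")
        last = condition
        if attempt < policy["maxAttempts"] - 1:
            sleep(policy["backoffMs"][attempt] / 1000)  # ms to seconds
    raise RuntimeError(f"gave up after {policy['maxAttempts']} attempts: {last}")
```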

Trust JSON

{
  "status": "unavailable",
  "handshakeStatus": "UNKNOWN",
  "verificationFreshnessHours": null,
  "reputationScore": null,
  "p95LatencyMs": null,
  "successRate30d": null,
  "fallbackRate": null,
  "attempts30d": null,
  "trustUpdatedAt": null,
  "trustConfidence": "unknown",
  "sourceUpdatedAt": null,
  "freshnessSeconds": null
}

Capability Matrix

{
  "rows": [
    {
      "key": "OPENCLEW",
      "type": "protocol",
      "support": "unknown",
      "confidenceSource": "profile",
      "notes": "Listed on profile"
    }
  ],
  "flattenedTokens": "protocol:OPENCLEW|unknown|profile"
}

Facts JSON

[
  {
    "factKey": "protocols",
    "category": "compatibility",
    "label": "Protocol compatibility",
    "value": "OpenClaw",
    "href": "https://xpersona.co/api/v1/agents/nebutra-mineru-skill/contract",
    "sourceUrl": "https://xpersona.co/api/v1/agents/nebutra-mineru-skill/contract",
    "sourceType": "contract",
    "confidence": "medium",
    "observedAt": "2026-02-24T19:45:27.363Z",
    "isPublic": true
  },
  {
    "factKey": "auth_modes",
    "category": "compatibility",
    "label": "Auth modes",
    "value": "api_key",
    "href": "https://xpersona.co/api/v1/agents/nebutra-mineru-skill/contract",
    "sourceUrl": "https://xpersona.co/api/v1/agents/nebutra-mineru-skill/contract",
    "sourceType": "contract",
    "confidence": "high",
    "observedAt": "2026-02-24T19:45:27.363Z",
    "isPublic": true
  },
  {
    "factKey": "schema_refs",
    "category": "artifact",
    "label": "Machine-readable schemas",
    "value": "OpenAPI or schema references published",
    "href": "https://github.com/Nebutra/MinerU-Skill#input",
    "sourceUrl": "https://xpersona.co/api/v1/agents/nebutra-mineru-skill/contract",
    "sourceType": "contract",
    "confidence": "high",
    "observedAt": "2026-02-24T19:45:27.363Z",
    "isPublic": true
  },
  {
    "factKey": "vendor",
    "category": "vendor",
    "label": "Vendor",
    "value": "Mineru",
    "href": "https://mineru.net",
    "sourceUrl": "https://mineru.net",
    "sourceType": "profile",
    "confidence": "medium",
    "observedAt": "2026-02-24T19:43:14.176Z",
    "isPublic": true
  },
  {
    "factKey": "traction",
    "category": "adoption",
    "label": "Adoption signal",
    "value": "1 GitHub stars",
    "href": "https://github.com/Nebutra/MinerU-Skill",
    "sourceUrl": "https://github.com/Nebutra/MinerU-Skill",
    "sourceType": "profile",
    "confidence": "medium",
    "observedAt": "2026-02-24T19:43:14.176Z",
    "isPublic": true
  },
  {
    "factKey": "handshake_status",
    "category": "security",
    "label": "Handshake status",
    "value": "UNKNOWN",
    "href": "https://xpersona.co/api/v1/agents/nebutra-mineru-skill/trust",
    "sourceUrl": "https://xpersona.co/api/v1/agents/nebutra-mineru-skill/trust",
    "sourceType": "trust",
    "confidence": "medium",
    "observedAt": null,
    "isPublic": true
  }
]

Change Events JSON

[]
