Crawler Summary

webscrape answer-first brief

Collects and packages public target signals into file-based JSON and JSONL artifacts for OpenClaw analysis. Use when running daily target scraping, updating target lists, and generating last-N-days target reports.

Published capability contract available. No trust telemetry is available yet. Last updated 2/24/2026.

Freshness

Last checked 2/23/2026

Best For

Contract is available with explicit auth and schema references.

Not Ideal For

webscrape is not ideal for teams that need stronger public trust telemetry, lower setup complexity, or more explicit contract coverage before production rollout.

Evidence Sources Checked

editorial-content, capability-contract, runtime-metrics, public facts pack

Agent Dossier • GitHub • Safety: 100/100

webscrape

Collects and packages public target signals into file-based JSON and JSONL artifacts for OpenClaw analysis. Use when running daily target scraping, updating target lists, and generating last-N-days target reports.

OpenClaw • self-declared

Public facts

6

Change events

1

Artifacts

0

Freshness

Feb 23, 2026

Verified • editorial-content • No verified compatibility signals

Published capability contract available. No trust telemetry is available yet. Last updated 2/24/2026.

Schema refs published • Trust evidence available

Trust score

Unknown

Compatibility

OpenClaw

Freshness

Feb 23, 2026

Vendor

Amazeeio

Artifacts

0

Benchmarks

0

Last release

Unpublished

Executive Summary

Key links, install path, and a quick operational read before the deeper crawl record.

Verified • editorial-content

Summary

Published capability contract available. No trust telemetry is available yet. Last updated 2/24/2026.

Setup snapshot

git clone https://github.com/amazeeio/webscrape.git
  1. Setup complexity is LOW. This package is likely designed for quick installation with minimal external side-effects.

  2. Final validation: Expose the agent to a mock request payload inside a sandbox and trace the network egress before allowing access to real customer data.

Evidence Ledger

Everything public we have scraped or crawled about this agent, grouped by evidence type with provenance.

Verified • editorial-content

Vendor (1)

Vendor: Amazeeio
Source type: profile. Confidence: medium. Observed Feb 24, 2026.

Compatibility (2)

Protocol compatibility: OpenClaw
Source type: contract. Confidence: medium. Observed Feb 24, 2026.

Auth modes: api_key
Source type: contract. Confidence: high. Observed Feb 24, 2026.

Artifact (1)

Machine-readable schemas: OpenAPI or schema references published
Source type: contract. Confidence: high. Observed Feb 24, 2026.

Security (1)

Handshake status: UNKNOWN
Source type: trust. Confidence: medium. Observed: unknown.

Integration (1)

Crawlable docs: 6 indexed pages on the official domain
Source type: search_document. Confidence: medium. Observed Apr 15, 2026.

Release & Crawl Timeline

Merged public release, docs, artifact, benchmark, pricing, and trust refresh events.

Self-declared • agent-index

Artifacts Archive

Extracted files, examples, snippets, parameters, dependencies, permissions, and artifact metadata.

Self-declared • GITHUB OPENCLEW

Extracted files

0

Examples

6

Snippets

0

Languages

typescript

Parameters

Executable Examples

bash

python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install -r requirements.txt

bash

python3 -m collector.pipeline.scrape
# optional: run one profile only
python3 -m collector.pipeline.scrape --profile <profile>

bash

python3 -m collector.pipeline.report --profile <profile> --target <slug> --days 1
python3 -m collector.pipeline.report --profile <profile> --target <slug> --days 7

bash

python3 -m collector.pipeline.manage_targets add \
  --profile <profile> \
  --slug <slug> \
  --name "<display name>" \
  --source webpage=https://example.com \
  --source blog=https://example.com/blog \
  --source rss=https://example.com/feed.xml

bash

python3 -m collector.pipeline.manage_targets list --profile <profile>
python3 -m collector.pipeline.manage_targets list --profile <profile> --active-only

bash

python3 -m collector.pipeline.manage_targets remove --profile <profile> --slug <slug>

Docs & README

Full documentation captured from public sources, including the complete README when available.

Self-declared • GITHUB OPENCLEW

Docs source

GITHUB OPENCLEW

Editorial quality

ready

Collects and packages public target signals into file-based JSON and JSONL artifacts for OpenClaw analysis. Use when running daily target scraping, updating target lists, and generating last-N-days target reports.

Full README

name: webscrape
description: Collects and packages public target signals into file-based JSON and JSONL artifacts for OpenClaw analysis. Use when running daily target scraping, updating target lists, and generating last-N-days target reports.

WebScrape

Quick Start

Provide stable, file-based web scraping target data for OpenClaw.

  • Python scripts gather and normalize public target signals.
  • Scripts in this repository do not call LLMs.
  • OpenClaw reads generated artifacts and performs analysis/alerting.

Set up Python once from repository root:

python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install -r requirements.txt

Daily Operating Flow

Run this at roughly the same time each day:

python3 -m collector.pipeline.scrape
# optional: run one profile only
python3 -m collector.pipeline.scrape --profile <profile>

After scrape completes, generate report windows as needed:

python3 -m collector.pipeline.report --profile <profile> --target <slug> --days 1
python3 -m collector.pipeline.report --profile <profile> --target <slug> --days 7

Interpretation guideline:

  • Daily urgent review: analyze --days 1 reports.
  • Weekly deeper review: analyze --days 7 reports.
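The daily flow above can be sketched as a small helper that assembles the scrape and report command lines for one target. This is only an illustration: the function name `daily_commands` and the example profile and slug values are hypothetical, and the module names and flags are copied from the commands shown in this README.

```python
import shlex


def daily_commands(profile, slug, windows=(1, 7)):
    """Return argv lists: one scrape run, then one report per window."""
    commands = [
        ["python3", "-m", "collector.pipeline.scrape", "--profile", profile],
    ]
    for days in windows:
        commands.append([
            "python3", "-m", "collector.pipeline.report",
            "--profile", profile,
            "--target", slug,
            "--days", str(days),
        ])
    return commands


# Print the commands instead of running them, so the sketch has no side effects.
# "example-profile" and "example-slug" are placeholder values.
for argv in daily_commands("example-profile", "example-slug"):
    print(shlex.join(argv))
```

To execute the commands rather than print them, each argv list could be passed to `subprocess.run(argv, check=True)` from the repository root with the virtual environment active.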

Common Commands

Add target (auto-discovers initial sources):

python3 -m collector.pipeline.manage_targets add \
  --profile <profile> \
  --slug <slug> \
  --name "<display name>" \
  --source webpage=https://example.com \
  --source blog=https://example.com/blog \
  --source rss=https://example.com/feed.xml

Supported --source types include:

  • webpage
  • blog
  • rss
  • changelog
  • sitemap
  • linkedin_company
  • x_profile

List targets:

python3 -m collector.pipeline.manage_targets list --profile <profile>
python3 -m collector.pipeline.manage_targets list --profile <profile> --active-only

Deactivate target:

python3 -m collector.pipeline.manage_targets remove --profile <profile> --slug <slug>

Artifacts

Primary outputs:

  • data/runs/<profile>/latest.json
  • data/runs/<profile>/YYYY-MM-DD/run.json
  • data/runs/<profile>/YYYY-MM-DD/targets/<slug>/items.jsonl
  • data/reports/<profile>/YYYY-MM-DD/<slug>/last-<N>-days.json

Use report files for analysis, especially:

  • data/reports/<profile>/YYYY-MM-DD/<slug>/last-1-days.json for daily urgent change review
  • data/reports/<profile>/YYYY-MM-DD/<slug>/last-7-days.json for weekly behavior review
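As a rough sketch, the artifact layout above can be turned into path helpers plus a generic JSONL reader. The helper names (`report_path`, `items_path`, `read_jsonl`) are hypothetical and not part of the repository. The fields inside each `items.jsonl` record are not documented here, so the reader only parses one JSON object per line.

```python
import json
from datetime import date
from pathlib import Path


def report_path(root, profile, run_date, slug, days):
    """data/reports/<profile>/YYYY-MM-DD/<slug>/last-<N>-days.json"""
    return (Path(root) / "data" / "reports" / profile
            / run_date.isoformat() / slug / f"last-{days}-days.json")


def items_path(root, profile, run_date, slug):
    """data/runs/<profile>/YYYY-MM-DD/targets/<slug>/items.jsonl"""
    return (Path(root) / "data" / "runs" / profile
            / run_date.isoformat() / "targets" / slug / "items.jsonl")


def read_jsonl(path):
    """Yield one parsed JSON object per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


# Example: where the weekly report for a placeholder target would live.
print(report_path(".", "example-profile", date(2026, 2, 24), "example-slug", 7))
```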

Report payload shape (current):

{
  "profile": "<profile>",
  "windowDays": 1,
  "target": "acquia",
  "timeRange": {
    "startDate": "YYYY-MM-DD",
    "endDate": "YYYY-MM-DD"
  },
  "dayOverDayDiff": {
    "baselineDate": "YYYY-MM-DD",
    "currentDate": "YYYY-MM-DD",
    "totals": {
      "added": 0,
      "removed": 0,
      "changed": 0
    },
    "added": [],
    "removed": [],
    "changed": []
  }
}

How OpenClaw should interpret change sections:

  • added: new pages/posts/content not seen in the baseline date.
  • removed: previously visible content no longer present. This can be high-signal (for example: deleted announcements, removed docs, withdrawn claims, removed pricing/security language).
  • changed: same item identity, but content changed. This can be high-signal when messaging, release details, positioning, legal text, or product capabilities are updated.

Important: do not treat only added as relevant. In many cases, removed and changed carry equal or higher strategic importance.
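A consumer of the report could apply the guidance above with something like the following sketch, which counts all three change sections and flags a report for review if any of them is non-empty. The function names `summarize_diff` and `needs_review` are hypothetical. The sample payload mirrors the report shape shown earlier, with placeholder values left as-is.

```python
import json


def summarize_diff(report):
    """Return counts for added, removed, and changed from a report payload."""
    diff = report.get("dayOverDayDiff", {})
    totals = diff.get("totals", {})
    summary = {}
    for section in ("added", "removed", "changed"):
        entries = diff.get(section, [])
        # Prefer the explicit total; fall back to the list length if absent.
        summary[section] = totals.get(section, len(entries))
    return summary


def needs_review(report):
    """True when any section, not only 'added', contains at least one entry."""
    return any(count > 0 for count in summarize_diff(report).values())


sample = json.loads("""
{
  "profile": "<profile>",
  "windowDays": 1,
  "target": "acquia",
  "dayOverDayDiff": {
    "totals": {"added": 0, "removed": 0, "changed": 0},
    "added": [],
    "removed": [],
    "changed": []
  }
}
""")

print(summarize_diff(sample), needs_review(sample))
```

Treating the three sections symmetrically keeps removals and edits from being overlooked when a day has no new content.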

Important Rules

  • Use public, unauthenticated sources only.
  • Keep runs deterministic for stable diffs.
  • Continue runs even when one source fails; rely on run error artifacts.
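The last two rules can be illustrated with a minimal sketch of a collection loop that visits sources in a fixed order and records failures instead of aborting. This is not the repository's implementation: `run_sources`, the `fetch` callable, and the shape of the error records are all assumptions made for illustration.

```python
def run_sources(sources, fetch):
    """Fetch every (kind, url) pair; collect errors rather than stopping.

    `fetch` is any callable that takes a URL and returns content or raises.
    """
    items, errors = [], []
    # Sorting gives a fixed processing order, so repeated runs diff cleanly.
    for kind, url in sorted(sources):
        try:
            items.append({"source": kind, "url": url, "content": fetch(url)})
        except Exception as exc:  # one failing source must not end the run
            errors.append({"source": kind, "url": url, "error": str(exc)})
    return {"items": items, "errors": errors}


def fake_fetch(url):
    """Stand-in fetcher for the demo; fails for one URL, no network access."""
    if url.endswith("/feed.xml"):
        raise RuntimeError("feed unavailable")
    return "<html>...</html>"


result = run_sources(
    [("rss", "https://example.com/feed.xml"),
     ("webpage", "https://example.com")],
    fake_fetch,
)
print(len(result["items"]), "items,", len(result["errors"]), "errors")
```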

Advanced Operations

For maintenance and edge-case operations (rediscovery, cleanup, manual source edits, troubleshooting), see:

Contract & API

Machine endpoints, protocol fit, contract coverage, invocation examples, and guardrails for agent-to-agent use.

Verified • capability-contract

Contract coverage

Status

ready

Auth

api_key

Streaming

No

Data region

global

Protocol support

OpenClaw: self-declared

Requires: openclew, lang:typescript

Forbidden: none

Guardrails

Operational confidence: medium

Contract is available with explicit auth and schema references.

Trust confidence is not low and verification freshness is acceptable.

Invocation examples

curl -s "https://xpersona.co/api/v1/agents/amazeeio-webscrape/snapshot"
curl -s "https://xpersona.co/api/v1/agents/amazeeio-webscrape/contract"
curl -s "https://xpersona.co/api/v1/agents/amazeeio-webscrape/trust"

Reliability & Benchmarks

Trust and runtime signals, benchmark suites, failure patterns, and practical risk constraints.

Missing • runtime-metrics

Trust signals

Handshake

UNKNOWN

Confidence

unknown

Attempts 30d

unknown

Fallback rate

unknown

Runtime metrics

Observed P50

unknown

Observed P95

unknown

Rate limit

unknown

Estimated cost

unknown

No benchmark suites or observed failure patterns are available.

Media & Demo

Every public screenshot, visual asset, demo link, and owner-provided destination tied to this agent.

Missing • no-media
No screenshots, media assets, or demo links are available.

Related Agents

Neighboring agents from the same protocol and source ecosystem for comparison and shortlist building.

Self-declared • protocol-neighbors

GITHUB_REPOS • activepieces

Rank

70

AI Agents & MCPs & AI Workflow Automation • (~400 MCP servers for AI agents) • AI Automation / AI Agent with MCPs • AI Workflows & AI Agents • MCPs for AI Agents

Traction

No public download signal

Freshness

Updated 2d ago

OPENCLAW

GITHUB_REPOS • cherry-studio

Rank

70

AI productivity studio with smart chat, autonomous agents, and 300+ assistants. Unified access to frontier LLMs

Traction

No public download signal

Freshness

Updated 5d ago

MCP • OPENCLAW

GITHUB_REPOS • AionUi

Rank

70

Free, local, open-source 24/7 Cowork app and OpenClaw for Gemini CLI, Claude Code, Codex, OpenCode, Qwen Code, Goose CLI, Auggie, and more | 🌟 Star if you like it!

Traction

No public download signal

Freshness

Updated 6d ago

MCP • OPENCLAW

GITHUB_REPOS • CopilotKit

Rank

70

The Frontend for Agents & Generative UI. React + Angular

Traction

No public download signal

Freshness

Updated 23d ago

OPENCLAW

Machine Appendix

Contract JSON

{
  "contractStatus": "ready",
  "authModes": [
    "api_key"
  ],
  "requires": [
    "openclew",
    "lang:typescript"
  ],
  "forbidden": [],
  "supportsMcp": false,
  "supportsA2a": false,
  "supportsStreaming": false,
  "inputSchemaRef": "https://github.com/amazeeio/webscrape#input",
  "outputSchemaRef": "https://github.com/amazeeio/webscrape#output",
  "dataRegion": "global",
  "contractUpdatedAt": "2026-02-24T19:44:11.030Z",
  "sourceUpdatedAt": "2026-02-24T19:44:11.030Z",
  "freshnessSeconds": 4419925
}

Invocation Guide

{
  "preferredApi": {
    "snapshotUrl": "https://xpersona.co/api/v1/agents/amazeeio-webscrape/snapshot",
    "contractUrl": "https://xpersona.co/api/v1/agents/amazeeio-webscrape/contract",
    "trustUrl": "https://xpersona.co/api/v1/agents/amazeeio-webscrape/trust"
  },
  "curlExamples": [
    "curl -s \"https://xpersona.co/api/v1/agents/amazeeio-webscrape/snapshot\"",
    "curl -s \"https://xpersona.co/api/v1/agents/amazeeio-webscrape/contract\"",
    "curl -s \"https://xpersona.co/api/v1/agents/amazeeio-webscrape/trust\""
  ],
  "jsonRequestTemplate": {
    "query": "summarize this repo",
    "constraints": {
      "maxLatencyMs": 2000,
      "protocolPreference": [
        "OPENCLEW"
      ]
    }
  },
  "jsonResponseTemplate": {
    "ok": true,
    "result": {
      "summary": "...",
      "confidence": 0.9
    },
    "meta": {
      "source": "GITHUB_OPENCLEW",
      "generatedAt": "2026-04-16T23:29:36.242Z"
    }
  },
  "retryPolicy": {
    "maxAttempts": 3,
    "backoffMs": [
      500,
      1500,
      3500
    ],
    "retryableConditions": [
      "HTTP_429",
      "HTTP_503",
      "NETWORK_TIMEOUT"
    ]
  }
}
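One way a client might apply the `retryPolicy` above is sketched below. It rests on assumptions: the helper `call_with_retry` is hypothetical, the `request` callable is expected to return a `(condition, body)` pair using the same condition labels as the policy, and delays are applied only between attempts, so with three attempts the third backoff value goes unused.

```python
import time

# Values copied from the retryPolicy block in the Invocation Guide.
RETRY_POLICY = {
    "maxAttempts": 3,
    "backoffMs": [500, 1500, 3500],
    "retryableConditions": ["HTTP_429", "HTTP_503", "NETWORK_TIMEOUT"],
}


def call_with_retry(request, policy=RETRY_POLICY, sleep=time.sleep):
    """Call request() until it returns a non-retryable condition.

    request() returns (condition, body), where condition is "OK" or a
    label such as "HTTP_429". `sleep` is injectable for testing.
    """
    retryable = set(policy["retryableConditions"])
    delays = policy["backoffMs"]
    condition, body = None, None
    for attempt in range(policy["maxAttempts"]):
        condition, body = request()
        if condition not in retryable:
            return condition, body
        if attempt + 1 < policy["maxAttempts"]:
            # Wait before the next attempt; reuse the last delay if the list is short.
            sleep(delays[min(attempt, len(delays) - 1)] / 1000)
    return condition, body


# Demo with a scripted request that is rate limited once, then succeeds.
responses = iter([("HTTP_429", None), ("OK", {"ok": True})])
print(call_with_retry(lambda: next(responses), sleep=lambda seconds: None))
```

Passing `sleep` as a parameter keeps the sketch testable without real delays. A production client would wrap an actual HTTP call and map status codes and timeouts to these labels.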

Trust JSON

{
  "status": "unavailable",
  "handshakeStatus": "UNKNOWN",
  "verificationFreshnessHours": null,
  "reputationScore": null,
  "p95LatencyMs": null,
  "successRate30d": null,
  "fallbackRate": null,
  "attempts30d": null,
  "trustUpdatedAt": null,
  "trustConfidence": "unknown",
  "sourceUpdatedAt": null,
  "freshnessSeconds": null
}

Capability Matrix

{
  "rows": [
    {
      "key": "OPENCLEW",
      "type": "protocol",
      "support": "unknown",
      "confidenceSource": "profile",
      "notes": "Listed on profile"
    },
    {
      "key": "be",
      "type": "capability",
      "support": "supported",
      "confidenceSource": "profile",
      "notes": "Declared in agent profile metadata"
    }
  ],
  "flattenedTokens": "protocol:OPENCLEW|unknown|profile capability:be|supported|profile"
}

Facts JSON

[
  {
    "factKey": "docs_crawl",
    "category": "integration",
    "label": "Crawlable docs",
    "value": "6 indexed pages on the official domain",
    "href": "https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fopenclaw%2Fskills%2Ftree%2Fmain%2Fskills%2Fasleep123%2Fcaldav-calendar",
    "sourceUrl": "https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fopenclaw%2Fskills%2Ftree%2Fmain%2Fskills%2Fasleep123%2Fcaldav-calendar",
    "sourceType": "search_document",
    "confidence": "medium",
    "observedAt": "2026-04-15T05:03:46.393Z",
    "isPublic": true
  },
  {
    "factKey": "protocols",
    "category": "compatibility",
    "label": "Protocol compatibility",
    "value": "OpenClaw",
    "href": "https://xpersona.co/api/v1/agents/amazeeio-webscrape/contract",
    "sourceUrl": "https://xpersona.co/api/v1/agents/amazeeio-webscrape/contract",
    "sourceType": "contract",
    "confidence": "medium",
    "observedAt": "2026-02-24T19:44:11.030Z",
    "isPublic": true
  },
  {
    "factKey": "auth_modes",
    "category": "compatibility",
    "label": "Auth modes",
    "value": "api_key",
    "href": "https://xpersona.co/api/v1/agents/amazeeio-webscrape/contract",
    "sourceUrl": "https://xpersona.co/api/v1/agents/amazeeio-webscrape/contract",
    "sourceType": "contract",
    "confidence": "high",
    "observedAt": "2026-02-24T19:44:11.030Z",
    "isPublic": true
  },
  {
    "factKey": "schema_refs",
    "category": "artifact",
    "label": "Machine-readable schemas",
    "value": "OpenAPI or schema references published",
    "href": "https://github.com/amazeeio/webscrape#input",
    "sourceUrl": "https://xpersona.co/api/v1/agents/amazeeio-webscrape/contract",
    "sourceType": "contract",
    "confidence": "high",
    "observedAt": "2026-02-24T19:44:11.030Z",
    "isPublic": true
  },
  {
    "factKey": "vendor",
    "category": "vendor",
    "label": "Vendor",
    "value": "Amazeeio",
    "href": "https://github.com/amazeeio/webscrape",
    "sourceUrl": "https://github.com/amazeeio/webscrape",
    "sourceType": "profile",
    "confidence": "medium",
    "observedAt": "2026-02-24T19:43:14.176Z",
    "isPublic": true
  },
  {
    "factKey": "handshake_status",
    "category": "security",
    "label": "Handshake status",
    "value": "UNKNOWN",
    "href": "https://xpersona.co/api/v1/agents/amazeeio-webscrape/trust",
    "sourceUrl": "https://xpersona.co/api/v1/agents/amazeeio-webscrape/trust",
    "sourceType": "trust",
    "confidence": "medium",
    "observedAt": null,
    "isPublic": true
  }
]

Change Events JSON

[
  {
    "eventType": "docs_update",
    "title": "Docs refreshed: Sign in to GitHub · GitHub",
    "description": "Fresh crawlable documentation was indexed for the official domain.",
    "href": "https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fopenclaw%2Fskills%2Ftree%2Fmain%2Fskills%2Fasleep123%2Fcaldav-calendar",
    "sourceUrl": "https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fopenclaw%2Fskills%2Ftree%2Fmain%2Fskills%2Fasleep123%2Fcaldav-calendar",
    "sourceType": "search_document",
    "confidence": "medium",
    "observedAt": "2026-04-15T05:03:46.393Z",
    "isPublic": true
  }
]
