Rank
70
AI Agents & MCPs & AI Workflow Automation • (~400 MCP servers for AI agents) • AI Automation / AI Agent with MCPs • AI Workflows & AI Agents • MCPs for AI Agents
Traction
No public download signal
Freshness
Updated 2d ago
Xpersona Agent
Systematic tuning loop for any AI system. Use when asked to: (1) tune/optimize prompts, tools, or agent behavior, (2) improve system performance iteratively, (3) set up evaluation criteria for a system, (4) run optimization experiments. Collaboratively defines objectives and scoring with the user, then iterates with git checkpointing.
git clone https://github.com/vinceyyy/context-tuning-skill.git
Overall rank
#23
Adoption
No public adoption signal
Trust
Unknown
Freshness
Last checked Apr 15, 2026
Best For
context-tuning is best for workflows where OpenClaw compatibility matters.
Not Ideal For
Workflows that require deterministic execution: contract metadata is missing, so behavior guarantees cannot be verified.
Evidence Sources Checked
editorial-content, GITHUB_OPENCLEW, runtime-metrics, public facts pack
Key links, install path, reliability highlights, and the shortest practical read before diving into the crawl record.
Overview
Systematic tuning loop for any AI system. Use when asked to: (1) tune/optimize prompts, tools, or agent behavior, (2) improve system performance iteratively, (3) set up evaluation criteria for a system, (4) run optimization experiments. Collaboratively defines objectives and scoring with the user, then iterates with git checkpointing. Capability contract not published. No trust telemetry is available yet. Last updated Apr 15, 2026.
Trust score
Unknown
Compatibility
OpenClaw
Freshness
Apr 15, 2026
Vendor
Vinceyyy
Artifacts
0
Benchmarks
0
Last release
Unpublished
Install & run
git clone https://github.com/vinceyyy/context-tuning-skill.git
Setup complexity is LOW. This package is likely designed for quick installation with minimal external side-effects.
Final validation: Expose the agent to a mock request payload inside a sandbox and trace the network egress before allowing access to real customer data.
Public facts grouped by evidence type, plus release and crawl events with provenance and freshness.
Public facts
Vendor
Vinceyyy
Protocol compatibility
OpenClaw
Handshake status
UNKNOWN
Crawlable docs
6 indexed pages on the official domain
Parameters, dependencies, examples, extracted files, editorial overview, and the complete README when available.
Captured outputs
Extracted files
0
Examples
6
Snippets
0
Languages
typescript
Parameters
text
┌─────────────────────────────────────────────────────────────────┐
│ PHASE 1: OBJECTIVE DISCOVERY │
│ Understand what user wants to optimize → Refine through dialog │
└─────────────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────────┐
│ PHASE 2: SCORING SYSTEM DESIGN │
│ Propose dimensions & rubric → Refine with user feedback │
└─────────────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────────┐
│ PHASE 3: BASELINE & VALIDATION │
│ Run system once → Score with rubric → Validate with user │
└─────────────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────────┐
│ PHASE 4: CODEBASE ANALYSIS │
│ Map tunable components → Compare to best practices │
└─────────────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────────┐
│ PHASE 5: ITERATION LOOP │
│ Evaluate → Identify weakness → Apply ONE fix → Checkpoint │
└─────────────────────────────────────────────────────────────────┘
"So if I understand correctly, you want to optimize [system] to:
- [Primary goal]
- [Secondary goal]
- While avoiding [failure mode]
Is that right? Anything to add or adjust?"
# Tuning Session
**Started**: {timestamp}
**System**: {description of what's being tuned}
**Status**: Defining objectives
## Objectives
### Primary Goal
{what success looks like}
### Secondary Goals
- {goal 2}
- {goal 3}
### Known Issues
- {current problem 1}
- {current problem 2}
## Scoring System
(to be defined)
## Iteration Log
(to be added)
Based on your objectives, I propose evaluating on these dimensions:
1. **[Dimension Name]** (weight: X%)
- What it measures: [description]
- Why it matters: [maps to objective X]
2. **[Dimension Name]** (weight: X%)
- What it measures: [description]
- Why it matters: [maps to objective Y]
Does this capture what matters? Should we add, remove, or adjust anything?
For **[Dimension]**, I'd score like this:

| Score | Criteria |
|-------|----------|
| 9-10 | [excellent - specific description] |
| 7-8 | [good - specific description] |
| 4-6 | [needs work - specific description] |
| 1-3 | [poor - specific description] |
| 0 | [failure - specific description] |

Does this match your intuition? Any criteria to adjust?
## Scoring System
**Threshold**: {N.N}
**Max Iterations**: {N}
### Dimensions
#### {Dimension 1} ({weight}%)
{description}
| Score | Criteria |
|-------|----------|
| 9-10 | ... |
| 7-8 | ... |
| 4-6 | ... |
| 1-3 | ... |
#### {Dimension 2} ({weight}%)
...
Editorial read
Docs source
GITHUB_OPENCLEW
Editorial quality
ready
Systematic, evaluation-driven optimization for AI systems. Collaboratively define what "good" means, then iteratively improve until you get there.
┌─────────────────────────────────────────────────────────────────┐
│ PHASE 1: OBJECTIVE DISCOVERY │
│ Understand what user wants to optimize → Refine through dialog │
└─────────────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────────┐
│ PHASE 2: SCORING SYSTEM DESIGN │
│ Propose dimensions & rubric → Refine with user feedback │
└─────────────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────────┐
│ PHASE 3: BASELINE & VALIDATION │
│ Run system once → Score with rubric → Validate with user │
└─────────────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────────┐
│ PHASE 4: CODEBASE ANALYSIS │
│ Map tunable components → Compare to best practices │
└─────────────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────────┐
│ PHASE 5: ITERATION LOOP │
│ Evaluate → Identify weakness → Apply ONE fix → Checkpoint │
└─────────────────────────────────────────────────────────────────┘
Goal: Understand what the user wants to optimize.
Ask the user (one or two at a time, not all at once):
Based on answers, reflect back understanding:
"So if I understand correctly, you want to optimize [system] to:
- [Primary goal]
- [Secondary goal]
- While avoiding [failure mode]
Is that right? Anything to add or adjust?"
Once confirmed, create initial session notes at docs/tuning/{date}-session.md:
# Tuning Session
**Started**: {timestamp}
**System**: {description of what's being tuned}
**Status**: Defining objectives
## Objectives
### Primary Goal
{what success looks like}
### Secondary Goals
- {goal 2}
- {goal 3}
### Known Issues
- {current problem 1}
- {current problem 2}
## Scoring System
(to be defined)
## Iteration Log
(to be added)
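The session-notes file described above can be scaffolded from the shell. A minimal sketch: the `docs/tuning/{date}-session.md` layout comes from the skill, but the ISO date format and the placeholder contents are assumptions.

```shell
# Work in a throwaway directory so the sketch has no side effects.
workdir=$(mktemp -d)
cd "$workdir"

# Scaffold docs/tuning/{date}-session.md (ISO date format assumed).
mkdir -p docs/tuning
session="docs/tuning/$(date +%F)-session.md"
cat > "$session" <<'EOF'
# Tuning Session
**Started**: (timestamp)
**Status**: Defining objectives
EOF

echo "created $session"
```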
Goal: Create a custom rubric tailored to the user's objectives.
Based on objectives, propose 2-4 evaluation dimensions. Each dimension should:
Example proposal format:
Based on your objectives, I propose evaluating on these dimensions:
1. **[Dimension Name]** (weight: X%)
- What it measures: [description]
- Why it matters: [maps to objective X]
2. **[Dimension Name]** (weight: X%)
- What it measures: [description]
- Why it matters: [maps to objective Y]
Does this capture what matters? Should we add, remove, or adjust anything?
See references/rubric-templates.md for common dimension patterns.
For each dimension, propose specific scoring criteria:
For **[Dimension]**, I'd score like this:
| Score | Criteria |
|-------|----------|
| 9-10 | [excellent - specific description] |
| 7-8 | [good - specific description] |
| 4-6 | [needs work - specific description] |
| 1-3 | [poor - specific description] |
| 0 | [failure - specific description] |
Does this match your intuition? Any criteria to adjust?
Confirm with user:
Add to session notes:
## Scoring System
**Threshold**: {N.N}
**Max Iterations**: {N}
### Dimensions
#### {Dimension 1} ({weight}%)
{description}
| Score | Criteria |
|-------|----------|
| 9-10 | ... |
| 7-8 | ... |
| 4-6 | ... |
| 1-3 | ... |
#### {Dimension 2} ({weight}%)
...
Goal: Verify the scoring system works and establish baseline.
Run git status --porcelain. If dirty, ask user to commit or stash first.
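This precondition is easy to script. A sketch that demonstrates the guard in a throwaway repository (the git identity values are placeholders):

```shell
# Set up a throwaway repo so the guard can be demonstrated safely.
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "init"

# Phase 3 guard: refuse to baseline on a dirty working tree.
if [ -n "$(git status --porcelain)" ]; then
  state="dirty"   # ask the user to commit or stash first
else
  state="clean"   # safe to run the baseline
fi
echo "working tree is $state"
```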
Execute the system with a representative input. Capture full output/trace.
Apply the scoring system. Show work:
**Baseline Evaluation**
Input: {what was tested}
**{Dimension 1}**: {score}/10
- Evidence: {specific observation}
- Reasoning: {why this score}
**{Dimension 2}**: {score}/10
- Evidence: {specific observation}
- Reasoning: {why this score}
**Overall**: {weighted score}
Ask for confirmation:
"Does this scoring feel right?
- Does a {X}/10 on {Dimension 1} match your intuition?
- Is there anything the rubric missed or misjudged?
- Should we adjust the criteria before proceeding?"
If adjustments needed, return to Phase 2. Otherwise, proceed.
git add docs/tuning/{date}-session.md
git commit -m "tune: begin session - baseline {overall_score}"
Goal: Understand what can be tuned and identify opportunities.
Explore the codebase to identify:
| Component Type | What to Look For |
|----------------|------------------|
| System prompts | Main instructions, role definitions |
| Tool definitions | Names, descriptions, parameters |
| Tool implementations | Return values, error handling |
| Orchestration | Agent loops, routing logic, handoffs |
| Context management | What's included, summarization, memory |
Document findings:
## Tunable Components
### Prompts
- `path/to/prompt.py`: Main system prompt (~200 lines)
- `path/to/agent.py`: Agent instructions
### Tools
- `tool_name`: {purpose} - description could be clearer
- `other_tool`: {purpose} - parameters ambiguous
### Orchestration
- Single agent / Multi-agent with {pattern}
- Loop exits when: {conditions}
See references/component-checklist.md for what good looks like.
Identify gaps:
## Improvement Opportunities
### High Priority (likely impact on failing dimensions)
- [ ] {Specific issue}: {maps to Dimension X}
- [ ] {Specific issue}: {maps to Dimension Y}
### Medium Priority
- [ ] {Issue}
### Low Priority / Nice to Have
- [ ] {Issue}
Based on the baseline score and codebase analysis:
**Weakest dimension**: {dimension} at {score}
**Root cause hypothesis**: {what I think is causing it}
**Proposed first fix**: {specific change}
Does this plan make sense? Ready to start iterating?
Goal: Systematically improve until threshold met or plateau reached.
Run system 3x for stability. Score each dimension. Report:
**Iteration {N}**
| Dimension | Score | vs Threshold | Δ from Last |
|-----------|-------|--------------|-------------|
| {Dim 1} | X.X | {pass/fail} | +/-X.X |
| {Dim 2} | X.X | {pass/fail} | +/-X.X |
| **Overall** | X.X | {pass/fail} | +/-X.X |
| Condition | Criteria | Action |
|-----------|----------|--------|
| SUCCESS | All dimensions ≥ threshold | Go to Completion |
| PLATEAU | <0.3 improvement over 3 iterations | Go to Completion |
| MAX_ITER | Reached limit | Go to Completion |
| REGRESSION | Score dropped significantly | Revert and try different fix |
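The weighted overall score and the SUCCESS check can be computed mechanically. A sketch with illustrative per-dimension scores, weights, and threshold (none of these numbers come from the skill):

```shell
# Illustrative scores and weights (weights must sum to 1.0):
# dimension 1 scored 7.5 at weight 0.6, dimension 2 scored 9.0 at 0.4.
overall=$(awk 'BEGIN { printf "%.1f", 7.5 * 0.6 + 9.0 * 0.4 }')
threshold=8.0

# SUCCESS requires every dimension to meet the threshold;
# only the overall check is sketched here.
if awk -v o="$overall" -v t="$threshold" 'BEGIN { exit !(o >= t) }'; then
  decision="SUCCESS"
else
  decision="ITERATE"
fi
echo "overall=$overall decision=$decision"
```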
Find lowest dimension below threshold. Analyze evidence for failure pattern.
See references/failure-patterns.md for pattern catalog.
ONE change per iteration to isolate effects.
See references/fix-techniques.md for technique selection.
Before applying, self-review:
Update session notes with iteration entry:
### Iteration {N} - {timestamp}
**Scores**: {dim1}={X.X}, {dim2}={X.X}
**Target**: {dimension} (at {X.X})
**Pattern**: {what went wrong}
**Evidence**: {specific example}
**Change**:
- File: {path}
- Technique: {from fix-techniques}
```diff
- {old}
+ {new}
```
Result: {improved/no change/regression}
Commit:
```bash
git add -A
git commit -m "tune(iter-{N}): {description} [{dim}: {before}→{after}]"
```
## Summary
**Status**: {success/plateau/max_iterations}
**Iterations**: {N}
**Improvement**: {baseline} → {final} (+{delta})
### Score Progression
| Iter | {Dim1} | {Dim2} | Overall |
|------|--------|--------|---------|
| 0 | X.X | X.X | X.X |
| ... | ... | ... | ... |
### What Worked
- {technique}: {dimension} {before}→{after}
### What Didn't Work
- {technique}: {result}
### Recommendations
- {any remaining improvements to consider}
Final commit:
git commit -m "tune: complete - {status} [overall: {baseline}→{final}]"
git revert HEAD --no-edit
Record: **Result**: REGRESSION - reverted
Try different technique.
Read session notes, find last iteration, resume from Phase 5.
Machine endpoints, contract coverage, trust signals, runtime metrics, benchmarks, and guardrails for agent-to-agent use.
Machine interfaces
Contract coverage
Status
missing
Auth
None
Streaming
No
Data region
Unspecified
Protocol support
Requires: none
Forbidden: none
Guardrails
Operational confidence: low
curl -s "https://xpersona.co/api/v1/agents/vinceyyy-context-tuning-skill/snapshot"
curl -s "https://xpersona.co/api/v1/agents/vinceyyy-context-tuning-skill/contract"
curl -s "https://xpersona.co/api/v1/agents/vinceyyy-context-tuning-skill/trust"
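Because the handshake status is UNKNOWN, a calling agent should gate on the trust payload before delegating work. A sketch using a canned response so it runs offline; in practice, substitute the live `curl` call from above (the `handshakeStatus` field name comes from the published trust payload, and the naive string scrape stands in for a real JSON parser):

```shell
# Canned trust payload; in practice fetch it with:
#   trust=$(curl -s "https://xpersona.co/api/v1/agents/vinceyyy-context-tuning-skill/trust")
trust='{"handshakeStatus":"UNKNOWN","trustConfidence":"unknown"}'

# Extract handshakeStatus without jq (naive scrape, sketch only).
handshake=$(printf '%s' "$trust" \
  | sed -n 's/.*"handshakeStatus":"\([^"]*\)".*/\1/p')

# Gate: only delegate autonomously on a verified handshake.
if [ "$handshake" = "VERIFIED" ]; then
  gate="delegate"
else
  gate="hold"   # keep a human in the loop
fi
echo "handshake=$handshake gate=$gate"
```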
Operational fit
Trust signals
Handshake
UNKNOWN
Confidence
unknown
Attempts 30d
unknown
Fallback rate
unknown
Runtime metrics
Observed P50
unknown
Observed P95
unknown
Rate limit
unknown
Estimated cost
unknown
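With latency and rate limits unpublished, callers should wrap requests in the retry policy from the invocation guide (3 attempts, 500/1500/3500 ms backoff). A sketch with the network call stubbed out so it runs offline; swap the stub for the real `curl` call:

```shell
# Backoff schedule taken from the invocation guide's retryPolicy (ms).
attempt=0
succeeded=false
for backoff_ms in 500 1500 3500; do
  attempt=$((attempt + 1))
  # Stub for the real call, e.g.:
  #   curl -sf "https://xpersona.co/api/v1/agents/vinceyyy-context-tuning-skill/snapshot"
  if false; then
    succeeded=true
    break
  fi
  # Sleep between attempts (skip after the final one).
  if [ "$attempt" -lt 3 ]; then
    sleep "$(awk -v ms="$backoff_ms" 'BEGIN { print ms / 1000 }')"
  fi
done
echo "succeeded=$succeeded after $attempt attempts"
```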
Do not use if
Raw contract, invocation, trust, capability, facts, and change-event payloads for machine-side inspection.
Contract JSON
{
"contractStatus": "missing",
"authModes": [],
"requires": [],
"forbidden": [],
"supportsMcp": false,
"supportsA2a": false,
"supportsStreaming": false,
"inputSchemaRef": null,
"outputSchemaRef": null,
"dataRegion": null,
"contractUpdatedAt": null,
"sourceUpdatedAt": null,
"freshnessSeconds": null
}
Invocation Guide
{
"preferredApi": {
"snapshotUrl": "https://xpersona.co/api/v1/agents/vinceyyy-context-tuning-skill/snapshot",
"contractUrl": "https://xpersona.co/api/v1/agents/vinceyyy-context-tuning-skill/contract",
"trustUrl": "https://xpersona.co/api/v1/agents/vinceyyy-context-tuning-skill/trust"
},
"curlExamples": [
"curl -s \"https://xpersona.co/api/v1/agents/vinceyyy-context-tuning-skill/snapshot\"",
"curl -s \"https://xpersona.co/api/v1/agents/vinceyyy-context-tuning-skill/contract\"",
"curl -s \"https://xpersona.co/api/v1/agents/vinceyyy-context-tuning-skill/trust\""
],
"jsonRequestTemplate": {
"query": "summarize this repo",
"constraints": {
"maxLatencyMs": 2000,
"protocolPreference": [
"OPENCLEW"
]
}
},
"jsonResponseTemplate": {
"ok": true,
"result": {
"summary": "...",
"confidence": 0.9
},
"meta": {
"source": "GITHUB_OPENCLEW",
"generatedAt": "2026-04-17T04:47:33.639Z"
}
},
"retryPolicy": {
"maxAttempts": 3,
"backoffMs": [
500,
1500,
3500
],
"retryableConditions": [
"HTTP_429",
"HTTP_503",
"NETWORK_TIMEOUT"
]
}
}
Trust JSON
{
"status": "unavailable",
"handshakeStatus": "UNKNOWN",
"verificationFreshnessHours": null,
"reputationScore": null,
"p95LatencyMs": null,
"successRate30d": null,
"fallbackRate": null,
"attempts30d": null,
"trustUpdatedAt": null,
"trustConfidence": "unknown",
"sourceUpdatedAt": null,
"freshnessSeconds": null
}
Capability Matrix
{
"rows": [
{
"key": "OPENCLEW",
"type": "protocol",
"support": "unknown",
"confidenceSource": "profile",
"notes": "Listed on profile"
},
{
"key": "be",
"type": "capability",
"support": "supported",
"confidenceSource": "profile",
"notes": "Declared in agent profile metadata"
}
],
"flattenedTokens": "protocol:OPENCLEW|unknown|profile capability:be|supported|profile"
}
Facts JSON
[
{
"factKey": "vendor",
"category": "vendor",
"label": "Vendor",
"value": "Vinceyyy",
"href": "https://github.com/vinceyyy/context-tuning-skill",
"sourceUrl": "https://github.com/vinceyyy/context-tuning-skill",
"sourceType": "profile",
"confidence": "medium",
"observedAt": "2026-04-15T05:21:22.124Z",
"isPublic": true
},
{
"factKey": "protocols",
"category": "compatibility",
"label": "Protocol compatibility",
"value": "OpenClaw",
"href": "https://xpersona.co/api/v1/agents/vinceyyy-context-tuning-skill/contract",
"sourceUrl": "https://xpersona.co/api/v1/agents/vinceyyy-context-tuning-skill/contract",
"sourceType": "contract",
"confidence": "medium",
"observedAt": "2026-04-15T05:21:22.124Z",
"isPublic": true
},
{
"factKey": "docs_crawl",
"category": "integration",
"label": "Crawlable docs",
"value": "6 indexed pages on the official domain",
"href": "https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fopenclaw%2Fskills%2Ftree%2Fmain%2Fskills%2Fasleep123%2Fcaldav-calendar",
"sourceUrl": "https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fopenclaw%2Fskills%2Ftree%2Fmain%2Fskills%2Fasleep123%2Fcaldav-calendar",
"sourceType": "search_document",
"confidence": "medium",
"observedAt": "2026-04-15T05:03:46.393Z",
"isPublic": true
},
{
"factKey": "handshake_status",
"category": "security",
"label": "Handshake status",
"value": "UNKNOWN",
"href": "https://xpersona.co/api/v1/agents/vinceyyy-context-tuning-skill/trust",
"sourceUrl": "https://xpersona.co/api/v1/agents/vinceyyy-context-tuning-skill/trust",
"sourceType": "trust",
"confidence": "medium",
"observedAt": null,
"isPublic": true
}
]
Change Events JSON
[
{
"eventType": "docs_update",
"title": "Docs refreshed: Sign in to GitHub · GitHub",
"description": "Fresh crawlable documentation was indexed for the official domain.",
"href": "https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fopenclaw%2Fskills%2Ftree%2Fmain%2Fskills%2Fasleep123%2Fcaldav-calendar",
"sourceUrl": "https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fopenclaw%2Fskills%2Ftree%2Fmain%2Fskills%2Fasleep123%2Fcaldav-calendar",
"sourceType": "search_document",
"confidence": "medium",
"observedAt": "2026-04-15T05:03:46.393Z",
"isPublic": true
}
]