Rank
70
AI Agents & MCPs & AI Workflow Automation β’ (~400 MCP servers for AI agents) β’ AI Automation / AI Agent with MCPs β’ AI Workflows & AI Agents β’ MCPs for AI Agents
Traction
No public download signal
Freshness
Updated 2d ago
Xpersona Agent
Build beautiful HTML photo menus from restaurant URLs, PDFs, or photos using Gemini Vision and AI image generation --- name: menuvision description: "Build beautiful HTML photo menus from restaurant URLs, PDFs, or photos using Gemini Vision and AI image generation" version: 1.0.0 emoji: "π½οΈ" user-invocable: true metadata: {"openclaw": {"requires": {"env": ["GOOGLE_API_KEY"], "bins": ["python3"]}, "primaryEnv": "GOOGLE_API_KEY", "homepage": "https://github.com/ademczuk/MenuVision"}} --- MenuVision - Restaurant Menu Builder Build
clawhub skill install skills:ademczuk:menuvisionOverall rank
#62
Adoption
No public adoption signal
Trust
Unknown
Freshness
Feb 25, 2026
Freshness
Last checked Feb 25, 2026
Best For
menuvision is best for of, be, corrupt workflows where OpenClaw compatibility matters.
Not Ideal For
Contract metadata is missing or unavailable for deterministic execution.
Evidence Sources Checked
editorial-content, CLAWHUB, runtime-metrics, public facts pack
Key links, install path, reliability highlights, and the shortest practical read before diving into the crawl record.
Overview
Build beautiful HTML photo menus from restaurant URLs, PDFs, or photos using Gemini Vision and AI image generation --- name: menuvision description: "Build beautiful HTML photo menus from restaurant URLs, PDFs, or photos using Gemini Vision and AI image generation" version: 1.0.0 emoji: "π½οΈ" user-invocable: true metadata: {"openclaw": {"requires": {"env": ["GOOGLE_API_KEY"], "bins": ["python3"]}, "primaryEnv": "GOOGLE_API_KEY", "homepage": "https://github.com/ademczuk/MenuVision"}} --- MenuVision - Restaurant Menu Builder Build Capability contract not published. No trust telemetry is available yet. Last updated 4/15/2026.
Trust score
Unknown
Compatibility
OpenClaw
Freshness
Feb 25, 2026
Vendor
Openclaw
Artifacts
0
Benchmarks
0
Last release
Unpublished
Install & run
clawhub skill install skills:ademczuk:menuvisionSetup complexity is LOW. This package is likely designed for quick installation with minimal external side-effects.
Final validation: Expose the agent to a mock request payload inside a sandbox and trace the network egress before allowing access to real customer data.
Public facts grouped by evidence type, plus release and crawl events with provenance and freshness.
Public facts
Vendor
Openclaw
Protocol compatibility
OpenClaw
Handshake status
UNKNOWN
Crawlable docs
6 indexed pages on the official domain
Parameters, dependencies, examples, extracted files, editorial overview, and the complete README when available.
Captured outputs
Extracted files
0
Examples
0
Snippets
0
Languages
typescript
Parameters
Editorial read
Docs source
CLAWHUB
Editorial quality
ready
Build beautiful HTML photo menus from restaurant URLs, PDFs, or photos using Gemini Vision and AI image generation --- name: menuvision description: "Build beautiful HTML photo menus from restaurant URLs, PDFs, or photos using Gemini Vision and AI image generation" version: 1.0.0 emoji: "π½οΈ" user-invocable: true metadata: {"openclaw": {"requires": {"env": ["GOOGLE_API_KEY"], "bins": ["python3"]}, "primaryEnv": "GOOGLE_API_KEY", "homepage": "https://github.com/ademczuk/MenuVision"}} --- MenuVision - Restaurant Menu Builder Build
Build a beautiful HTML photo menu for any restaurant from URLs, PDFs, or photos.
When the user wants to create a digital menu for a restaurant. Triggers: "build a menu", "create restaurant menu", "menu from PDF", "menu from photos", "digital menu", "menuvision".
1. Extract: URL/PDF/photo β menu_data.json (Gemini Vision)
2. Generate: menu_data.json β images/*.jpg (Gemini Image)
3. Build: menu_data.json + images β Menu.html (CSS/JS inline, images relative)
The AI agent creates these scripts:
| Script | Purpose |
|--------|---------|
| extract_menu.py | Extract menu data from URL/PDF/photo β structured JSON |
| generate_images.py | Generate food photos via Gemini Image |
| build_menu.py | Build HTML menu from JSON + images (CSS/JS inline, images as relative paths) |
| publish_menu.py | (Optional) Publish HTML to GitHub Pages |
All three pipeline stages share this exact JSON schema. The AI agent MUST use these field names β any deviation breaks the pipeline.
{
"restaurant": {
"name": "Restaurant Name (if visible)",
"cuisine": "cuisine type (Chinese, Indian, Austrian, Japanese, etc.)",
"tagline": "any subtitle or tagline"
},
"sections": [
{
"title": "Section Name (in primary language)",
"title_secondary": "Section name in secondary language (if present, else empty string)",
"category": "food or drink",
"note": "Any section note (e.g. 'served with rice', 'Mon-Fri 11-15h')",
"items": [
{
"code": "M1",
"name": "Dish Name (primary language)",
"name_secondary": "Name in secondary language (if present)",
"description": "Brief description (primary language)",
"description_secondary": "Description in secondary language (if present)",
"price": "12,90",
"price_prefix": "",
"allergens": "A C F",
"dietary": ["vegan", "spicy"],
"variants": []
}
]
}
],
"allergen_legend": {
"A": "Gluten",
"B": "Crustaceans"
},
"metadata": {
"languages": ["German", "English"],
"currency": "EUR"
}
}
| Field | Type | Required | Notes |
|-------|------|----------|-------|
| restaurant.name | string | Yes | Display name in HTML header |
| restaurant.cuisine | string | Yes | Passed to build_food_prompt() as cuisine context |
| restaurant.tagline | string | No | Subtitle line in HTML header |
| sections[].title | string | Yes | Section heading in primary language |
| sections[].title_secondary | string | No | Section heading in secondary language |
| sections[].category | "food" or "drink" | Yes | Drives food grid vs drink list layout. Only "food" items get generated images. |
| sections[].note | string | No | Section-level note (e.g. "served with rice", "Mon-Fri 11-15h") |
| items[].code | string | Yes | Unique per item. Links to image filename. Use existing codes (M1, K2) or generate (A1, A2) |
| items[].name | string | Yes | Primary language. For CJK menus, this is the CJK name |
| items[].name_secondary | string | No | Secondary language. For CJK menus, this is the English/Latin name |
| items[].description | string | No | Brief description. Fed to build_food_prompt() for image generation |
| items[].description_secondary | string | No | Description in secondary language |
| items[].price | string | Yes | Preserve original format ("12,90" not "12.90") |
| items[].price_prefix | string | No | e.g. "ab" (starting from), "ca." |
| items[].variants | array | No | [{"label": "6 Stk", "price": "8,90"}, ...] β set main price to smallest variant |
| items[].allergens | string | No | Space-separated codes exactly as printed: "A C F" |
| items[].dietary | array | No | ["vegan", "vegetarian", "spicy", "gluten-free", "halal", "kosher"] |
| allergen_legend | object | No | Map of allergen codes to display names: {"A": "Gluten", ...} |
| metadata.currency | string | Yes | ISO code: "EUR", "USD", "JPY", "CNY", "THB", etc. |
| metadata.languages | array | No | Languages detected in the menu: ["German", "English"] |
Send this exact prompt to Gemini. It defines the schema AND the extraction rules. Do not paraphrase it.
You are a restaurant menu data extractor. Analyze this menu content and extract ALL items into structured JSON.
Return this exact JSON structure:
{
"restaurant": {
"name": "Restaurant Name (if visible)",
"cuisine": "cuisine type (Chinese, Indian, Austrian, Japanese, etc.)",
"tagline": "any subtitle or tagline"
},
"sections": [
{
"title": "Section Name (in primary language)",
"title_secondary": "Section name in secondary language (if present, else empty string)",
"category": "food or drink",
"note": "Any section note (e.g. 'served with rice', 'Mon-Fri 11-15h')",
"items": [
{
"code": "M1",
"name": "Dish Name (primary language)",
"name_secondary": "Name in secondary language (if present)",
"description": "Brief description (primary language)",
"description_secondary": "Description in secondary language (if present)",
"price": "12,90",
"price_prefix": "",
"allergens": "A C F",
"dietary": ["vegan", "spicy"],
"variants": []
}
]
}
],
"allergen_legend": {
"A": "Gluten",
"B": "Crustaceans"
},
"metadata": {
"languages": ["German", "English"],
"currency": "EUR"
}
}
CRITICAL RULES:
1. Extract EVERY item. Do not skip ANY dish, drink, or menu entry.
2. Preserve original item codes/numbers if present (M1, K2, S3, etc.). If none exist, generate sequential codes per section (e.g. A1, A2 for appetizers, M1, M2 for mains).
3. Extract prices EXACTLY as written (preserve comma/period format).
4. If an item has a price prefix like "ab" (starting from), capture it in "price_prefix".
5. If an item has multiple size/quantity variants (e.g. 6 Stk / 12 Stk / 18 Stk at different prices), use the "variants" array:
[{"label": "6 Stk", "price": "8,90"}, {"label": "12 Stk", "price": "15,90"}]
In this case, set the main "price" to the smallest variant's price.
6. Capture allergen codes exactly as shown (letters, numbers, or symbols).
7. If an allergen legend is visible anywhere, include it in "allergen_legend".
8. Identify dietary flags from descriptions/icons: vegan, vegetarian, spicy, gluten-free, halal, kosher.
9. If the menu is bilingual, capture BOTH languages. Put the primary/dominant language in name/description and the secondary in name_secondary/description_secondary.
10. For set menus or lunch specials with a fixed price covering multiple choices, create a section with note explaining the format, and list each choice as an item.
11. Classify each section as "food" or "drink".
12. For drinks, still extract name, price, and any size variants.
Return ONLY valid JSON. No markdown fences, no explanatory text.
For image-based inputs (screenshots, PDF pages, photos), prepend a context line before the base prompt:
EXTRACTION_PROMPT_VISION = (
"You are a restaurant menu data extractor. "
"This is a photo/scan of a restaurant menu page.\n\n"
"Return this exact JSON structure:"
+ EXTRACTION_PROMPT.split("Return this exact JSON structure:")[1]
)
Then each input type adds its own prefix:
| Input Type | Prefix prepended to EXTRACTION_PROMPT_VISION |
|---|---|
| Screenshot | "This is a screenshot of a restaurant menu webpage at {url}. Extract ALL visible menu items.\n\n" |
| PDF page | "This is page {n} of a restaurant menu PDF. Extract ALL menu items from this page.\n\n" |
| Photo | "This is a photograph of a restaurant menu. Extract ALL visible menu items.\n\n" |
| Text (static HTML) | Use EXTRACTION_PROMPT directly (no vision variant needed) |
import os
from google import genai
client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])
def gemini_config():
return genai.types.GenerateContentConfig(
max_output_tokens=65536, # 64K β needed for large menus
response_mime_type="application/json", # JSON mode β critical
)
# Model: gemini-2.5-flash (default)
response = client.models.generate_content(
model="gemini-2.5-flash",
contents=prompt_text, # or [image, prompt_text] for vision
config=gemini_config(),
)
# ALWAYS check for truncation
if response.candidates[0].finish_reason.name == "MAX_TOKENS":
print("WARNING: Response truncated. Menu may be incomplete.")
Use this exact function. It produces the casual phone-photo aesthetic that makes menus look authentic.
def build_food_prompt(name: str, description: str, cuisine: str = "") -> str:
cuisine_context = f" {cuisine}" if cuisine else ""
food_desc = f"{name}"
if description and description != name:
food_desc += f" ({description})"
return (
f"A photo of {food_desc} at a{cuisine_context} restaurant. "
f"Taken casually with a phone from across the table at a 45-degree angle. "
f"The plate sits on a dark wooden table and takes up only 30% of the frame. "
f"Lots of visible table surface around the plate. Chopsticks, napkins, "
f"a glass of water, and small side dishes scattered naturally nearby. "
f"Blurred restaurant interior in the background β other diners, pendant lights, "
f"wooden chairs visible but out of focus. Warm ambient lighting. "
f"NOT a close-up. NOT professional food photography. "
f"It looks like someone quickly snapped a photo before eating."
)
import os, io
from PIL import Image
from google import genai
client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])
def generate_gemini(client, name, description, output_path, cuisine=""):
prompt = build_food_prompt(name, description, cuisine)
response = client.models.generate_content(
model="gemini-2.5-flash-image", # NOT gemini-2.5-flash (that's text-only)
contents=prompt,
config=genai.types.GenerateContentConfig(
response_modalities=["TEXT", "IMAGE"], # critical β requests image output
),
)
# Extract generated image from response parts
for part in response.candidates[0].content.parts:
if part.inline_data is not None:
img = Image.open(io.BytesIO(part.inline_data.data)).convert("RGB")
# Center-crop to square, resize to 800x800
w, h = img.size
side = min(w, h)
left = (w - side) // 2
top = (h - side) // 2
img = img.crop((left, top, left + side, top + side))
img = img.resize((800, 800), Image.LANCZOS)
img.save(str(output_path), "JPEG", quality=82)
return
raise RuntimeError("No image in Gemini response")
Only generate images for category == "food" sections. Drinks get a text-only list in the HTML output.
Menus can be in ANY language. The pipeline handles this through bilingual fields and smart prompt routing.
name / description = primary language (whatever the menu is mostly written in)name_secondary / description_secondary = secondary language (if bilingual)CJK characters produce bad image prompts. Before calling build_food_prompt(), swap to the Latin name:
def prepare_for_image_gen(name, name_secondary, description):
"""Use Latin-script name for image prompts. CJK β use secondary name."""
display_name = name
if name_secondary:
if any(ord(c) > 0x2E80 for c in name): # CJK/Hangul/Kana detection
display_name = name_secondary
description = description or name
else:
description = description or name_secondary
return display_name, description
Unicode ranges covered by ord(c) > 0x2E80:
name renders as the large display textname_secondary renders below it in smaller textNoto Sans SC, Noto Sans JP, Noto Sans KR)All filenames are derived from the restaurant name or source URL:
stem = "shoyu" # derived from URL domain, PDF filename, or restaurant name
data_file = f"menu_data_{stem}.json"
images_dir = Path(f"images/{stem}")
html_file = f"{restaurant_name}_Menu.html" # e.g. "Shoyu_Menu.html"
images/{restaurant_stem}/{code}.jpg
# restaurant_stem = data filename minus "menu_data_" prefix
# Example: menu_data_shoyu.json β images/shoyu/M1.jpg
Returns POSIX-style string paths with ./ prefix for cross-platform HTML compatibility:
def find_image(code: str, images_dir: Path):
"""Return relative POSIX path string to image, or None."""
if not images_dir.is_dir():
return None
rel = images_dir.as_posix()
if not rel.startswith("./"):
rel = "./" + rel
# 1. Exact match
for ext in ("jpg", "jpeg", "webp", "png"):
candidate = images_dir / f"{code}.{ext}"
if candidate.exists():
return f"{rel}/{code}.{ext}"
# 2. Case-insensitive fallback
for f in images_dir.iterdir():
if f.stem.lower() == code.lower() and f.suffix.lower() in (".jpg", ".jpeg", ".webp", ".png"):
return f"{rel}/{f.name}"
return None
{RestaurantName}_Menu.html # CSS/JS inline, images as relative file paths
The build script uses find_image() to resolve each food item's photo, falling back to a gradient SVG placeholder when no image exists:
import base64
import html as html_mod
GRADIENT_COLORS = [
("#c41e3a", "#8b0000"), ("#ff6b6b", "#ee5a24"), ("#fdcb6e", "#e17055"),
("#00b894", "#00cec9"), ("#6c5ce7", "#a29bfe"), ("#e17055", "#d63031"),
("#00cec9", "#0984e3"), ("#fab1a0", "#e17055"), ("#e8a87c", "#d4956b"),
("#fd79a8", "#e84393"),
]
def make_placeholder_svg(code: str, name: str, secondary: str = "") -> str:
"""Generate a base64-encoded SVG placeholder when no image exists."""
idx = hash(code) % len(GRADIENT_COLORS)
c1, c2 = GRADIENT_COLORS[idx]
display = html_mod.escape(secondary[:12] if secondary else name[:12])
svg = f'''<svg xmlns="http://www.w3.org/2000/svg" width="220" height="180" viewBox="0 0 220 180">
<defs><linearGradient id="g" x1="0%" y1="0%" x2="100%" y2="100%">
<stop offset="0%" style="stop-color:{c1}"/>
<stop offset="100%" style="stop-color:{c2}"/>
</linearGradient></defs>
<rect width="220" height="180" fill="url(#g)" rx="12"/>
<text x="110" y="75" text-anchor="middle" fill="rgba(255,255,255,0.25)" font-size="56" font-family="serif">{html_mod.escape(code)}</text>
<text x="110" y="120" text-anchor="middle" fill="white" font-size="26" font-family="serif" opacity="0.9">{display}</text>
<text x="110" y="148" text-anchor="middle" fill="rgba(255,255,255,0.6)" font-size="11" font-family="sans-serif">{html_mod.escape(name[:30])}</text>
</svg>'''
b64 = base64.b64encode(svg.encode("utf-8")).decode("ascii")
return f"data:image/svg+xml;base64,{b64}"
def image_tag(code: str, name: str, secondary: str, images_dir: Path, portable: bool = False) -> str:
"""Return <img> tag β real image OR gradient SVG placeholder.
If portable=True, embed the real image as base64 data URI for single-file output."""
real = find_image(code, images_dir)
if real:
if portable:
img_path = images_dir.parent / real # resolve relative path
with open(img_path, "rb") as f:
b64 = base64.b64encode(f.read()).decode("ascii")
return f'<img src="data:image/jpeg;base64,{b64}" alt="{html_mod.escape(name)}">'
return f'<img src="{html_mod.escape(real)}" alt="{html_mod.escape(name)}" loading="lazy">'
else:
src = make_placeholder_svg(code, name, secondary)
return f'<img src="{src}" alt="{html_mod.escape(name)}">'
The HTML builder supports two output modes controlled by a --portable flag:
| Mode | Flag | Images | Output | Use Case |
|------|------|--------|--------|----------|
| Portable (default) | --portable or no GITHUB_* env vars | Base64 embedded in HTML | Single self-contained .html file | Open locally, email, drag-drop to any host |
| Deployable | --no-portable or GITHUB_* env vars set | Relative paths (./images/stem/code.jpg) | HTML + images/ directory | GitHub Pages, Netlify, any static host |
Portable mode embeds all food images as base64 data URIs directly in the HTML. File sizes are larger (~4-6MB for an 80-item menu) but the output is a single file that works everywhere with zero hosting setup. This is the default when no GITHUB_* environment variables are set.
Deployable mode uses relative image paths and requires the HTML file and images/ directory to be hosted together. Use this when publishing to GitHub Pages or any static hosting service.
All Gemini API calls should retry on transient failures:
import time
def call_with_retry(fn, *args, max_retries=3, **kwargs):
"""Retry API calls with exponential backoff."""
for attempt in range(max_retries):
try:
return fn(*args, **kwargs)
except Exception as e:
if attempt == max_retries - 1:
raise
wait = 2 ** attempt
print(f" Retry {attempt + 1}/{max_retries} in {wait}s: {e}")
time.sleep(wait)
Gemini sometimes wraps JSON in markdown fences or produces trailing commas. Parse defensively β try raw parse first, apply trailing comma fix only as last resort (unconditional fix can corrupt valid JSON strings containing ,] patterns):
import re, json
def parse_gemini_json(raw: str) -> dict:
"""Parse JSON from Gemini, handling markdown fences and quirks."""
text = raw.strip()
# Strip markdown code fences
if text.startswith("```"):
text = re.sub(r"^```\w*\n?", "", text)
text = re.sub(r"\n?```$", "", text)
text = text.strip()
# Try direct parse first
try:
return json.loads(text)
except json.JSONDecodeError:
pass
# Try extracting JSON object from surrounding text
match = re.search(r"\{.*\}", text, re.DOTALL)
if match:
candidate = match.group(0)
try:
return json.loads(candidate)
except json.JSONDecodeError:
pass
# Fix trailing commas and retry
candidate = re.sub(r",\s*([\]}])", r"\1", candidate)
try:
return json.loads(candidate)
except json.JSONDecodeError:
pass
# Last resort: fix trailing commas on original
text = re.sub(r",\s*([\]}])", r"\1", text)
return json.loads(text)
After extraction, run these cleanups:
def generate_codes(data: dict) -> dict:
"""Ensure every item has a unique code. Generates sequential codes per section
if items have empty/missing codes (e.g. A1, A2 for appetizers, M1, M2 for mains)."""
# ... assign prefix by section title, increment counter per section
return data
def normalize_prices(data: dict) -> dict:
"""Normalize price formats: numeric β string, strip currency symbols,
preserve comma/period format as-is."""
# ... convert float/int to string, strip β¬/$, etc.
return data
Maps ISO currency codes to display symbols for the HTML output:
CURRENCY_MAP = {
"EUR": "β¬", "USD": "$", "GBP": "Β£", "CHF": "CHF ",
"JPY": "Β₯", "CNY": "Β₯", "INR": "βΉ", "AUD": "A$",
"CAD": "C$", "SEK": "kr ", "NOK": "kr ", "DKK": "kr ",
"THB": "ΰΈΏ", "KRW": "β©", "HKD": "HK$", "SGD": "S$",
"CZK": "KΔ ", "HUF": "Ft ", "PLN": "zΕ ", "TRY": "βΊ",
}
requestsdensity = len(soup.get_text(strip=True)) / len(raw_html)r"[$β¬Β£Β₯βΉCHF]\s*\d+[.,]\d{2}|\d+[.,]\d{2}\s*[$β¬Β£Β₯βΉ]"), force density to 1.0 (treat as static)seen_codes = set() across chunks β for each item in each chunk's sections, skip if item["code"] already in seen_codes. Only append sections that still have items after dedup.β¬ pill) that cycles or opens a picker for: EUR, USD, AUD, CAD, GBP. Converts all displayed prices client-side using snapshot exchange rates embedded at build time. Updates grid overlays, receipt totals, drink prices, and variant prices. Source currency comes from metadata.currency../images/{stem}/{code}.jpg), only Google Fonts externalfamily=Noto+Sans+SC:wght@400;700&family=Noto+Sans+JP:wght@400;700&family=Noto+Sans+KR:wght@400;700font-family stack: primary font, then 'Noto Sans SC', 'Noto Sans JP', 'Noto Sans KR', sans-serifA minimalist currency toggle built into the HTML output. All client-side, no API calls at runtime.
Implementation:
RATES object with snapshot exchange rates (base: USD) at build timemetadata.currency in the JSON datadata-price attributes as numeric values (not raw strings like "12,90")β¬)β¬), USD ($), GBP (Β£), AUD (A$), CAD (C$)data-price values and updates displayed textconvertPrice() using SOURCE_CURRENCY and currentCurrencylocalStoragePrice parsing helper (build-time β converts string prices to numeric for data-price attributes):
import re
def _parse_price_numeric(price: str) -> str:
"""Parse price string to numeric float for data-price attribute."""
matches = re.findall(r"(\d+[.,]\d+)", price)
if matches:
return str(float(matches[-1].replace(",", ".")))
return "0"
# Usage in HTML template:
# <div class="price" data-price="{_parse_price_numeric(item['price'])}">β¬12,90</div>
// Snapshot rates embedded at build time (base: USD)
const RATES = { EUR: 0.92, USD: 1.00, GBP: 0.79, AUD: 1.54, CAD: 1.36 };
const SYMBOLS = { EUR: "β¬", USD: "$", GBP: "Β£", AUD: "A$", CAD: "C$" };
const SOURCE_CURRENCY = "EUR"; // from metadata.currency
function convertPrice(amount, fromCurrency, toCurrency) {
const inUSD = amount / RATES[fromCurrency];
return inUSD * RATES[toCurrency];
}
// Applied to: grid overlay prices, drink list prices, variant prices,
// AND receipt/selection tab totals (all elements with data-price attribute)
The build script should fetch current rates at build time (or use reasonable defaults if offline). Prices display with 2 decimal places in the target currency, using the target locale's format.
--name "Restaurant Name" # Header brand text
--tagline "Cuisine Β· City" # Subtitle
--accent "#ff6b00" # Primary color (pills, active tab, drink prices)
--bg "#0a0a0a" # Background color
| Component | Cost | |-----------|------| | Extraction (per page) | ~$0.001 | | Image generation (per food item) | $0.039 | | 80 food items | ~$3.12 | | Time (80 food items) | ~8 min |
Drinks are not image-generated (text-only list), so actual cost depends on food-to-drink ratio.
Requires Python 3.9+.
Required:
google-genai (extraction + image generation)Pillow (image processing)For HTML URLs:
requests (HTTP fetching)beautifulsoup4 (HTML parsing)For JS-rendered sites:
playwright (headless browser screenshots)For PDF files:
PyMuPDF (PDF to image conversion)pip install google-genai Pillow requests beautifulsoup4 PyMuPDF
pip install playwright && playwright install chromium
GOOGLE_API_KEY β Required for extraction and image generationGITHUB_PAT β Required for GitHub Pages publishingGITHUB_OWNER β Your GitHub username (default: reads from git config)GITHUB_REPO β Your GitHub Pages repo name (default: menus)When no GITHUB_* environment variables are set, the pipeline generates a self-contained HTML file with base64-embedded images. Users can:
No hosting setup, no API keys beyond GOOGLE_API_KEY, no git configuration needed.
For users who want a persistent gallery with multiple menus:
your-username/menus)main branchexport GITHUB_PAT="your-personal-access-token" # Required β used for git push auth
export GITHUB_OWNER="your-username" # Required β YOUR GitHub username
export GITHUB_REPO="menus" # Optional β defaults to "menus"
Important: publish_menu.py MUST read GITHUB_OWNER and GITHUB_REPO from environment variables. Never hardcode a specific user's repo. The generated code should construct the repo URL dynamically:
owner = os.environ["GITHUB_OWNER"]
repo = os.environ.get("GITHUB_REPO", "menus")
GITHUB_REPO = f"{owner}/{repo}"
GITHUB_PAGES_BASE = f"https://{owner}.github.io/{repo}"
python publish_menu.py Restaurant_Menu.html --name "Restaurant" --tagline "Cuisine Β· City" --cuisine Type
Gallery: https://<your-username>.github.io/<repo>/
publish_menu.py clones the menus repo to a temp directory on native filesystem (git clone --depth=1), copies files there, commits, and pushes. This avoids all NTFS bind mount permission issues that occur when operating directly on mounted volumes in Docker containers.
Key implementation details:
git clone --depth=1 to a tempfile.mkdtemp() directory (native FS, proper POSIX permissions)shutil.copy() (not copy2 β avoids os.chmod() EPERM on NTFS)find_image_dirs regex uses [^/"]+ (not [a-z_]+) to match Unicode chars in image dir names.meta_ JSON sidecar for gallery metadataindex.htmlGITHUB_PAT env var embedded in the clone URLMENUS_REPO_DIR (bind mount path) is only used for --list read-only queries| Endpoint | Data Sent | Purpose |
|----------|-----------|---------|
| generativelanguage.googleapis.com | Menu text, page screenshots, PDF page images, food photo prompts | Gemini API for extraction (JSON mode) and image generation |
| Target restaurant URL | HTTP GET only | Fetching the menu page HTML for extraction |
| api.github.com | Generated HTML file, image files | Publishing menu to GitHub Pages (optional, requires GITHUB_PAT) |
| fonts.googleapis.com | None (CSS link in HTML output) | Google Fonts loaded client-side when menu HTML is opened in browser |
No analytics, telemetry, or tracking. No data is sent to any endpoint beyond those listed above.
GOOGLE_API_KEY is read from environment variables, never hardcoded or loggedimages/ directory. When published, uploaded only to the user's own GitHub Pages repoMachine endpoints, contract coverage, trust signals, runtime metrics, benchmarks, and guardrails for agent-to-agent use.
Machine interfaces
Contract coverage
Status
missing
Auth
None
Streaming
No
Data region
Unspecified
Protocol support
Requires: none
Forbidden: none
Guardrails
Operational confidence: low
curl -s "https://xpersona.co/api/v1/agents/clawhub-skills-ademczuk-menuvision/snapshot"
curl -s "https://xpersona.co/api/v1/agents/clawhub-skills-ademczuk-menuvision/contract"
curl -s "https://xpersona.co/api/v1/agents/clawhub-skills-ademczuk-menuvision/trust"
Operational fit
Trust signals
Handshake
UNKNOWN
Confidence
unknown
Attempts 30d
unknown
Fallback rate
unknown
Runtime metrics
Observed P50
unknown
Observed P95
unknown
Rate limit
unknown
Estimated cost
unknown
Do not use if
Raw contract, invocation, trust, capability, facts, and change-event payloads for machine-side inspection.
Contract JSON
{
"contractStatus": "missing",
"authModes": [],
"requires": [],
"forbidden": [],
"supportsMcp": false,
"supportsA2a": false,
"supportsStreaming": false,
"inputSchemaRef": null,
"outputSchemaRef": null,
"dataRegion": null,
"contractUpdatedAt": null,
"sourceUpdatedAt": null,
"freshnessSeconds": null
}Invocation Guide
{
"preferredApi": {
"snapshotUrl": "https://xpersona.co/api/v1/agents/clawhub-skills-ademczuk-menuvision/snapshot",
"contractUrl": "https://xpersona.co/api/v1/agents/clawhub-skills-ademczuk-menuvision/contract",
"trustUrl": "https://xpersona.co/api/v1/agents/clawhub-skills-ademczuk-menuvision/trust"
},
"curlExamples": [
"curl -s \"https://xpersona.co/api/v1/agents/clawhub-skills-ademczuk-menuvision/snapshot\"",
"curl -s \"https://xpersona.co/api/v1/agents/clawhub-skills-ademczuk-menuvision/contract\"",
"curl -s \"https://xpersona.co/api/v1/agents/clawhub-skills-ademczuk-menuvision/trust\""
],
"jsonRequestTemplate": {
"query": "summarize this repo",
"constraints": {
"maxLatencyMs": 2000,
"protocolPreference": [
"OPENCLEW"
]
}
},
"jsonResponseTemplate": {
"ok": true,
"result": {
"summary": "...",
"confidence": 0.9
},
"meta": {
"source": "CLAWHUB",
"generatedAt": "2026-04-17T06:31:45.025Z"
}
},
"retryPolicy": {
"maxAttempts": 3,
"backoffMs": [
500,
1500,
3500
],
"retryableConditions": [
"HTTP_429",
"HTTP_503",
"NETWORK_TIMEOUT"
]
}
}Trust JSON
{
"status": "unavailable",
"handshakeStatus": "UNKNOWN",
"verificationFreshnessHours": null,
"reputationScore": null,
"p95LatencyMs": null,
"successRate30d": null,
"fallbackRate": null,
"attempts30d": null,
"trustUpdatedAt": null,
"trustConfidence": "unknown",
"sourceUpdatedAt": null,
"freshnessSeconds": null
}Capability Matrix
{
"rows": [
{
"key": "OPENCLEW",
"type": "protocol",
"support": "unknown",
"confidenceSource": "profile",
"notes": "Listed on profile"
},
{
"key": "of",
"type": "capability",
"support": "supported",
"confidenceSource": "profile",
"notes": "Declared in agent profile metadata"
},
{
"key": "be",
"type": "capability",
"support": "supported",
"confidenceSource": "profile",
"notes": "Declared in agent profile metadata"
},
{
"key": "corrupt",
"type": "capability",
"support": "supported",
"confidenceSource": "profile",
"notes": "Declared in agent profile metadata"
},
{
"key": "two",
"type": "capability",
"support": "supported",
"confidenceSource": "profile",
"notes": "Declared in agent profile metadata"
}
],
"flattenedTokens": "protocol:OPENCLEW|unknown|profile capability:of|supported|profile capability:be|supported|profile capability:corrupt|supported|profile capability:two|supported|profile"
}Facts JSON
[
{
"factKey": "docs_crawl",
"category": "integration",
"label": "Crawlable docs",
"value": "6 indexed pages on the official domain",
"href": "https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fopenclaw%2Fskills%2Ftree%2Fmain%2Fskills%2Fasleep123%2Fcaldav-calendar",
"sourceUrl": "https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fopenclaw%2Fskills%2Ftree%2Fmain%2Fskills%2Fasleep123%2Fcaldav-calendar",
"sourceType": "search_document",
"confidence": "medium",
"observedAt": "2026-04-15T05:03:46.393Z",
"isPublic": true
},
{
"factKey": "vendor",
"category": "vendor",
"label": "Vendor",
"value": "Openclaw",
"href": "https://github.com/openclaw/skills/tree/main/skills/ademczuk/menuvision",
"sourceUrl": "https://github.com/openclaw/skills/tree/main/skills/ademczuk/menuvision",
"sourceType": "profile",
"confidence": "medium",
"observedAt": "2026-04-15T00:45:39.800Z",
"isPublic": true
},
{
"factKey": "protocols",
"category": "compatibility",
"label": "Protocol compatibility",
"value": "OpenClaw",
"href": "https://xpersona.co/api/v1/agents/clawhub-skills-ademczuk-menuvision/contract",
"sourceUrl": "https://xpersona.co/api/v1/agents/clawhub-skills-ademczuk-menuvision/contract",
"sourceType": "contract",
"confidence": "medium",
"observedAt": "2026-04-15T00:45:39.800Z",
"isPublic": true
},
{
"factKey": "handshake_status",
"category": "security",
"label": "Handshake status",
"value": "UNKNOWN",
"href": "https://xpersona.co/api/v1/agents/clawhub-skills-ademczuk-menuvision/trust",
"sourceUrl": "https://xpersona.co/api/v1/agents/clawhub-skills-ademczuk-menuvision/trust",
"sourceType": "trust",
"confidence": "medium",
"observedAt": null,
"isPublic": true
}
]Change Events JSON
[
{
"eventType": "docs_update",
"title": "Docs refreshed: Sign in to GitHub Β· GitHub",
"description": "Fresh crawlable documentation was indexed for the official domain.",
"href": "https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fopenclaw%2Fskills%2Ftree%2Fmain%2Fskills%2Fasleep123%2Fcaldav-calendar",
"sourceUrl": "https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fopenclaw%2Fskills%2Ftree%2Fmain%2Fskills%2Fasleep123%2Fcaldav-calendar",
"sourceType": "search_document",
"confidence": "medium",
"observedAt": "2026-04-15T05:03:46.393Z",
"isPublic": true
}
]Sponsored
Ads related to menuvision and adjacent AI workflows.