Agent Dossier · CLAWHUB · Safety 84/100

Xpersona Agent

Video Agent

HeyGen AI video creation API. Use when: (1) Using Video Agent for one-shot prompt-to-video generation, (2) Generating AI avatar videos with /v2/video/generat...

4.1K downloads · Trust evidence available
clawhub skill install kn7dnc0jepdz3jy0rg589kcxns80dmr5:video-agent

Overall rank

#62

Adoption

4.1K downloads

Trust

Unknown

Freshness

Feb 28, 2026


Best For

Video Agent is best for general automation workflows where documented compatibility matters.

Not Ideal For

Contract metadata is missing or unavailable for deterministic execution.

Evidence Sources Checked

CLAWHUB, runtime-metrics, public facts pack

Overview

Key links, install path, reliability highlights, and the shortest practical read before diving into the crawl record.

Self-declared · CLAWHUB

Overview

Executive Summary

HeyGen AI video creation API. Use when: (1) Using Video Agent for one-shot prompt-to-video generation, (2) Generating AI avatar videos with /v2/video/generat... Capability contract not published. No trust telemetry is available yet. 4.1K downloads reported by the source. Last updated Apr 15, 2026.

No verified compatibility signals · 4.1K downloads

Trust score

Unknown

Compatibility

Profile only

Freshness

Feb 28, 2026

Vendor

Clawhub

Artifacts

0

Benchmarks

0

Last release

2.8.0

Install & run

Setup Snapshot

clawhub skill install kn7dnc0jepdz3jy0rg589kcxns80dmr5:video-agent
  1. Install using `clawhub skill install kn7dnc0jepdz3jy0rg589kcxns80dmr5:video-agent` in an isolated environment before connecting it to live workloads.

  2. No published capability contract is available yet, so validate auth and request/response behavior manually.

  3. Review the upstream CLAWHUB listing at https://clawhub.ai/michaelwang11394/video-agent before using production credentials.
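Step 2's manual validation can start with a minimal authenticated request. The sketch below assumes only what the skill's own authentication reference documents (the `X-Api-Key` header and the `/v2/avatars` listing endpoint); the helper names are illustrative, not part of the skill.

```typescript
// Minimal smoke test: confirm the API key authenticates before wiring the
// skill into live workloads.
function buildAuthHeaders(apiKey: string): Record<string, string> {
  return { "X-Api-Key": apiKey };
}

async function smokeTest(): Promise<void> {
  const response = await fetch("https://api.heygen.com/v2/avatars", {
    headers: buildAuthHeaders(process.env.HEYGEN_API_KEY ?? ""),
  });
  // 200 means auth works; 401/403 means the key is wrong or missing.
  console.log(`Auth check: HTTP ${response.status}`);
}
```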

Evidence & Timeline

Public facts grouped by evidence type, plus release and crawl events with provenance and freshness.

Self-declared · CLAWHUB

Public facts

Evidence Ledger

Vendor (1)

Vendor

Clawhub

profile · medium
Observed Apr 15, 2026 · Source link · Provenance
Release (1)

Latest release

2.8.0

release · medium
Observed Feb 23, 2026 · Source link · Provenance
Adoption (1)

Adoption signal

4.1K downloads

profile · medium
Observed Apr 15, 2026 · Source link · Provenance
Security (1)

Handshake status

UNKNOWN

trust · medium
Observed unknown · Source link · Provenance

Artifacts & Docs

Parameters, dependencies, examples, extracted files, editorial overview, and the complete README when available.

Self-declared · CLAWHUB

Captured outputs

Artifacts Archive

Extracted files

5

Examples

6

Snippets

0

Languages

Unknown

Executable Examples

bash

curl -X POST "https://upload.heygen.com/v1/asset" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: image/jpeg" \
  --data-binary '@./background.jpg'


typescript

import fs from "fs";

interface AssetUploadResponse {
  code: number;
  data: {
    id: string;
    name: string;
    file_type: string;
    url: string;
    image_key: string | null;
    folder_id: string;
    meta: string | null;
    created_ts: number;
  };
  msg: string | null;
  message: string | null;
}

async function uploadAsset(filePath: string, contentType: string): Promise<AssetUploadResponse["data"]> {
  const fileBuffer = fs.readFileSync(filePath);

  const response = await fetch("https://upload.heygen.com/v1/asset", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": contentType,
    },
    body: fileBuffer,
  });

  const json: AssetUploadResponse = await response.json();

  if (json.code !== 100) {
    throw new Error(json.message ?? "Upload failed");
  }

  return json.data;
}

// Usage
const asset = await uploadAsset("./background.jpg", "image/jpeg");
console.log(`Uploaded asset: ${asset.id}`);
console.log(`Asset URL: ${asset.url}`);

typescript

import fs from "fs";
import { stat } from "fs/promises";

async function uploadLargeAsset(filePath: string, contentType: string): Promise<AssetUploadResponse["data"]> {
  const fileStats = await stat(filePath);
  const fileStream = fs.createReadStream(filePath);

  const response = await fetch("https://upload.heygen.com/v1/asset", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": contentType,
      "Content-Length": fileStats.size.toString(),
    },
    body: fileStream as any,
    // @ts-ignore - duplex is needed for streaming
    duplex: "half",
  });

  const json: AssetUploadResponse = await response.json();

  if (json.code !== 100) {
    throw new Error(json.message ?? "Upload failed");
  }

  return json.data;
}

python

import requests
import os

def upload_asset(file_path: str, content_type: str) -> dict:
    with open(file_path, "rb") as f:
        response = requests.post(
            "https://upload.heygen.com/v1/asset",
            headers={
                "X-Api-Key": os.environ["HEYGEN_API_KEY"],
                "Content-Type": content_type
            },
            data=f
        )

    data = response.json()
    if data.get("code") != 100:
        raise Exception(data.get("message", "Upload failed"))

    return data["data"]


# Usage
asset = upload_asset("./background.jpg", "image/jpeg")
print(f"Uploaded asset: {asset['id']}")
print(f"Asset URL: {asset['url']}")

typescript

async function uploadFromUrl(sourceUrl: string, contentType: string): Promise<AssetUploadResponse["data"]> {
  // 1. Download the file
  const sourceResponse = await fetch(sourceUrl);
  const buffer = Buffer.from(await sourceResponse.arrayBuffer());

  // 2. Upload directly to HeyGen
  const response = await fetch("https://upload.heygen.com/v1/asset", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": contentType,
    },
    body: buffer,
  });

  const json: AssetUploadResponse = await response.json();

  if (json.code !== 100) {
    throw new Error(json.message ?? "Upload failed");
  }

  return json.data;
}

Extracted Files

SKILL.md

---
name: heygen
description: |
  HeyGen AI video creation API. Use when: (1) Using Video Agent for one-shot prompt-to-video generation, (2) Generating AI avatar videos with /v2/video/generate, (3) Working with HeyGen avatars, voices, backgrounds, or captions, (4) Creating transparent WebM videos for compositing, (5) Polling video status or handling webhooks, (6) Integrating HeyGen with Remotion for programmatic video, (7) Translating or dubbing existing videos, (8) Generating standalone TTS audio with the Starfish model via /v1/audio.
homepage: https://docs.heygen.com/reference/generate-video-agent
metadata:
  openclaw:
    requires:
      env:
        - HEYGEN_API_KEY
    primaryEnv: HEYGEN_API_KEY
---

# HeyGen API

AI avatar video creation API for generating talking-head videos, explainers, and presentations.

## Default Workflow

**Prefer Video Agent API** (`POST /v1/video_agent/generate`) for most video requests.
Always use [prompt-optimizer.md](references/prompt-optimizer.md) guidelines to structure prompts with scenes, timing, and visual styles.

Only use v2/video/generate when user explicitly needs:
- Exact script without AI modification
- Specific voice_id selection
- Different avatars/backgrounds per scene
- Precise per-scene timing control
- Programmatic/batch generation with exact specs

## Quick Reference

| Task | Read |
|------|------|
| Generate video from prompt (easy) | [prompt-optimizer.md](references/prompt-optimizer.md) → [visual-styles.md](references/visual-styles.md) → [video-agent.md](references/video-agent.md) |
| Generate video with precise control | [video-generation.md](references/video-generation.md), [avatars.md](references/avatars.md), [voices.md](references/voices.md) |
| Check video status / get download URL | [video-status.md](references/video-status.md) |
| Add captions or text overlays | [captions.md](references/captions.md), [text-overlays.md](references/text-overlays.md) |
| Transparent video for compositing | [video-generation.md](references/video-generation.md) (WebM section) |
| Generate standalone TTS audio | [text-to-speech.md](references/text-to-speech.md) |
| Translate/dub existing video | [video-translation.md](references/video-translation.md) |
| Use with Remotion | [remotion-integration.md](references/remotion-integration.md) |

## Reference Files

### Foundation
- [references/authentication.md](references/authentication.md) - API key setup and X-Api-Key header
- [references/quota.md](references/quota.md) - Credit system and usage limits
- [references/video-status.md](references/video-status.md) - Polling patterns and download URLs
- [references/assets.md](references/assets.md) - Uploading images, videos, audio

### Core Video Creation
- [references/avatars.md](references/avatars.md) - Listing avatars, styles, avatar_id selection
- [references/voices.md](references/voices.md) - Listing voices, locales, speed/pitch
- [references/scripts.md](references/scripts.md) - Writing scripts, pauses, pacing

_meta.json

{
  "ownerId": "kn7dnc0jepdz3jy0rg589kcxns80dmr5",
  "slug": "video-agent",
  "version": "2.8.0",
  "publishedAt": 1771867395703
}

references/assets.md

---
name: assets
description: Uploading images, videos, and audio for use in HeyGen video generation
---

# Asset Upload and Management

HeyGen allows you to upload custom assets (images, videos, audio) for use in video generation, such as backgrounds, talking photo sources, and custom audio.

## Upload Flow

Asset uploads are a single-step process: POST the raw file binary directly to the upload endpoint. The Content-Type header must match the file's MIME type.

## Uploading an Asset

**Endpoint:** `POST https://upload.heygen.com/v1/asset`

### Request

| Header | Required | Description |
|--------|:--------:|-------------|
| `X-Api-Key` | ✓ | Your HeyGen API key |
| `Content-Type` | ✓ | MIME type of the file (e.g. `image/jpeg`) |

The request body is the raw binary file data. No JSON or form fields are needed.

### Response

| Field | Type | Description |
|-------|------|-------------|
| `code` | number | Status code (`100` = success) |
| `data.id` | string | Unique asset ID for use in video generation |
| `data.name` | string | Asset name |
| `data.file_type` | string | `image`, `video`, or `audio` |
| `data.url` | string | Accessible URL for the uploaded file |
| `data.image_key` | string \| null | Key for creating uploaded photo avatars (images only) |
| `data.folder_id` | string | Folder ID (empty if not in a folder) |
| `data.meta` | string \| null | Asset metadata |
| `data.created_ts` | number | Unix timestamp of creation |

### curl

```bash
curl -X POST "https://upload.heygen.com/v1/asset" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: image/jpeg" \
  --data-binary '@./background.jpg'
```

### TypeScript

```typescript
import fs from "fs";

interface AssetUploadResponse {
  code: number;
  data: {
    id: string;
    name: string;
    file_type: string;
    url: string;
    image_key: string | null;
    folder_id: string;
    meta: string | null;
    created_ts: number;
  };
  msg: string | null;
  message: string | null;
}

async function uploadAsset(filePath: string, contentType: string): Promise<AssetUploadResponse["data"]> {
  const fileBuffer = fs.readFileSync(filePath);

  const response = await fetch("https://upload.heygen.com/v1/asset", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": contentType,
    },
    body: fileBuffer,
  });

  const json: AssetUploadResponse = await response.json();

  if (json.code !== 100) {
    throw new Error(json.message ?? "Upload failed");
  }

  return json.data;
}

// Usage
const asset = await uploadAsset("./background.jpg", "image/jpeg");
console.log(`Uploaded asset: ${asset.id}`);
console.log(`Asset URL: ${asset.url}`);
```

### TypeScript (with streams for large files)

```typescript
import fs from "fs";
import { stat } from "fs/promises";

async function uploadLargeAsset(filePath: string, contentType: string): Promise<AssetUploadResponse["data"]> {
  const fileStats = await stat(filePath);
  const fileStream = fs.createReadStream(filePath);
```

references/authentication.md

---
name: authentication
description: API key setup, X-Api-Key header, and authentication patterns for HeyGen
---

# HeyGen Authentication

All HeyGen API requests require authentication using an API key passed in the `X-Api-Key` header.

## Getting Your API Key

1. Go to https://app.heygen.com/settings?from=&nav=API
2. Log in if prompted
3. Copy your API key

## Environment Setup

Store your API key securely as an environment variable:

```bash
export HEYGEN_API_KEY="your-api-key-here"
```

For `.env` files:

```
HEYGEN_API_KEY=your-api-key-here
```

## Making Authenticated Requests

### curl

```bash
curl -X GET "https://api.heygen.com/v2/avatars" \
  -H "X-Api-Key: $HEYGEN_API_KEY"
```

### TypeScript/JavaScript (fetch)

```typescript
const response = await fetch("https://api.heygen.com/v2/avatars", {
  headers: {
    "X-Api-Key": process.env.HEYGEN_API_KEY!,
  },
});
const { data } = await response.json();
```

### TypeScript/JavaScript (axios)

```typescript
import axios from "axios";

const client = axios.create({
  baseURL: "https://api.heygen.com",
  headers: {
    "X-Api-Key": process.env.HEYGEN_API_KEY,
  },
});

const { data } = await client.get("/v2/avatars");
```

### Python (requests)

```python
import os
import requests

response = requests.get(
    "https://api.heygen.com/v2/avatars",
    headers={"X-Api-Key": os.environ["HEYGEN_API_KEY"]}
)
data = response.json()
```

### Python (httpx)

```python
import os
import httpx

async with httpx.AsyncClient() as client:
    response = await client.get(
        "https://api.heygen.com/v2/avatars",
        headers={"X-Api-Key": os.environ["HEYGEN_API_KEY"]}
    )
    data = response.json()
```

## Creating a Reusable API Client

### TypeScript

```typescript
class HeyGenClient {
  private baseUrl = "https://api.heygen.com";
  private apiKey: string;

  constructor(apiKey: string) {
    this.apiKey = apiKey;
  }

  async request<T>(endpoint: string, options: RequestInit = {}): Promise<T> {
    const response = await fetch(`${this.baseUrl}${endpoint}`, {
      ...options,
      headers: {
        "X-Api-Key": this.apiKey,
        "Content-Type": "application/json",
        ...options.headers,
      },
    });

    if (!response.ok) {
      const error = await response.json();
      throw new Error(error.message || `HTTP ${response.status}`);
    }

    return response.json();
  }

  get<T>(endpoint: string): Promise<T> {
    return this.request<T>(endpoint);
  }

  post<T>(endpoint: string, body: unknown): Promise<T> {
    return this.request<T>(endpoint, {
      method: "POST",
      body: JSON.stringify(body),
    });
  }
}

// Usage
const client = new HeyGenClient(process.env.HEYGEN_API_KEY!);
const avatars = await client.get("/v2/avatars");
```

## API Response Format

All HeyGen API responses follow this structure:

```typescript
interface ApiResponse<T> {
  error: null | string;
  data: T;
}
```

Successful response example:

```json
{
  "error": null,
  "data": {
    "avatars": [...]
  }
}
```

references/avatars.md

---
name: avatars
description: Listing avatars, avatar styles, and avatar_id selection for HeyGen
---

# HeyGen Avatars

Avatars are the AI-generated presenters in HeyGen videos. You can use public avatars provided by HeyGen or create custom avatars.

## Previewing Avatars Before Generation

Always preview avatars before generating a video to ensure they match user preferences. Each avatar has preview URLs that can be opened directly in the browser - no downloading required.

### Quick Preview: Open URL in Browser (Recommended)

The fastest way to preview avatars is to open the URL directly in the default browser. **Do not download the image first** - just pass the URL to `open`:

```bash
# macOS: Open URL directly in default browser (no download)
open "https://files.heygen.ai/avatar/preview/josh.jpg"

# Open preview video to see animation
open "https://files.heygen.ai/avatar/preview/josh.mp4"

# Linux: Use xdg-open
xdg-open "https://files.heygen.ai/avatar/preview/josh.jpg"

# Windows: Use start
start "https://files.heygen.ai/avatar/preview/josh.jpg"
```

The `open` command on macOS opens URLs directly in the default browser - it does not download the file. This is the quickest way to let users see avatar previews.

### List Avatars and Open Previews

```typescript
async function listAndPreviewAvatars(openInBrowser = true): Promise<void> {
  const response = await fetch("https://api.heygen.com/v2/avatars", {
    headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! },
  });
  const { data } = await response.json();

  for (const avatar of data.avatars.slice(0, 5)) {
    console.log(`\n${avatar.avatar_name} (${avatar.gender})`);
    console.log(`  ID: ${avatar.avatar_id}`);
    console.log(`  Preview: ${avatar.preview_image_url}`);
  }

  // Open preview URLs directly in browser (no download needed)
  if (openInBrowser) {
    const { execSync } = require("child_process");
    for (const avatar of data.avatars.slice(0, 3)) {
      // 'open' on macOS opens the URL in default browser - doesn't download
      execSync(`open "${avatar.preview_image_url}"`);
    }
  }
}
```

**Note:** The `open` command passes the URL to the browser - it does not download. The browser fetches and displays the image directly.

### Workflow: Preview Before Generate

1. **List available avatars** - get names, genders, and preview URLs
2. **Open previews in browser** - `open <preview_image_url>` for quick visual check
3. **User selects** preferred avatar by name or ID
4. **Get avatar details** for `default_voice_id`
5. **Generate video** with selected avatar

```bash
# Example workflow in terminal
# 1. List avatars (agent shows options)
# 2. Open preview for candidate
open "https://files.heygen.ai/avatar/preview/josh.jpg"
# 3. User says "use Josh"
# 4. Agent gets details and generates
```

### Preview Fields in API Response

| Field | Description |
|-------|-------------|
| `preview_image_url` | Static image of the avatar (JPG) - open in browser |
| `preview_video_url` | Shor

Editorial read

Docs & README

Docs source

CLAWHUB

Editorial quality

thin


Full README

Skill: Video Agent

Owner: michaelwang11394

Summary: HeyGen AI video creation API. Use when: (1) Using Video Agent for one-shot prompt-to-video generation, (2) Generating AI avatar videos with /v2/video/generat...

Tags: ai-avatar:2.8.0, ai-video:2.8.0, avatar:2.8.0, digital-human:2.8.0, heygen:2.8.0, latest:2.8.0, talking-head:2.8.0, text-to-video:2.8.0, video:2.8.0, video-generation:2.8.0

Version history:

v2.8.0 | 2026-02-23T17:23:15.703Z | user

Auto-publish from commit 1817bb7648735737457f1250bfb7513f04576b87

v2.6.0 | 2026-02-22T22:01:20.257Z | user

Auto-publish from commit 0456978d9cee307682ac9e2ef78eddfbdf600192

v2.5.0 | 2026-02-18T19:54:49.956Z | user

Auto-publish from commit a1f81720f25c4e8c0d9225e2c890b3a7e8d892fe

v2.2.0 | 2026-02-17T20:17:14.362Z | user

Auto-publish from commit 06389ef6b4c4d9f7108adc45334ff331f8fa9916

v1.2.1 | 2026-02-09T18:16:40.318Z | user

Reordered description to prioritize Video Agent one-shot prompt-to-video generation

v1.2.0 | 2026-02-09T18:15:03.186Z | user

Major update: Added comprehensive HeyGen API documentation including avatars, voices, streaming, translation, Remotion integration, and more.

v1.1.0 | 2026-02-09T18:14:33.101Z | user

Major update: Added comprehensive HeyGen API documentation including avatars, voices, streaming, translation, Remotion integration, and more.

v1.0.1 | 2026-02-02T18:06:25.082Z | user

Added prompt-optimizer reference guide

v1.0.0 | 2026-02-02T17:46:22.427Z | user

Initial release of video-agent skill.

  • Generate AI videos from a single prompt using HeyGen's Video Agent API.
  • Command-line tools to submit prompts, poll for video completion, and download results.
  • Simple API key setup via environment variable.
  • Includes usage examples and detailed options for video generation and status checking.
  • Automatically selects avatars and voices based on your prompt.

Archive index:

Archive v2.8.0: 24 files, 89312 bytes

Files: references/assets.md (8484b), references/authentication.md (4770b), references/avatars.md (16134b), references/backgrounds.md (6696b), references/captions.md (5631b), references/dimensions.md (6975b), references/photo-avatars.md (24366b), references/prompt-examples.md (8064b), references/prompt-optimizer.md (12384b), references/quota.md (4765b), references/remotion-integration.md (18550b), references/scripts.md (10143b), references/templates.md (9990b), references/text-overlays.md (6964b), references/text-to-speech.md (8693b), references/video-agent.md (9037b), references/video-generation.md (22105b), references/video-status.md (12892b), references/video-translation.md (11118b), references/visual-styles.md (14963b), references/voices.md (11892b), references/webhooks.md (9302b), SKILL.md (4603b), _meta.json (130b)

File v2.8.0:SKILL.md


name: heygen
description: |
  HeyGen AI video creation API. Use when: (1) Using Video Agent for one-shot prompt-to-video generation, (2) Generating AI avatar videos with /v2/video/generate, (3) Working with HeyGen avatars, voices, backgrounds, or captions, (4) Creating transparent WebM videos for compositing, (5) Polling video status or handling webhooks, (6) Integrating HeyGen with Remotion for programmatic video, (7) Translating or dubbing existing videos, (8) Generating standalone TTS audio with the Starfish model via /v1/audio.
homepage: https://docs.heygen.com/reference/generate-video-agent
metadata:
  openclaw:
    requires:
      env:
        - HEYGEN_API_KEY
    primaryEnv: HEYGEN_API_KEY

HeyGen API

AI avatar video creation API for generating talking-head videos, explainers, and presentations.

Default Workflow

Prefer Video Agent API (POST /v1/video_agent/generate) for most video requests. Always use prompt-optimizer.md guidelines to structure prompts with scenes, timing, and visual styles.

Only use v2/video/generate when user explicitly needs:

  • Exact script without AI modification
  • Specific voice_id selection
  • Different avatars/backgrounds per scene
  • Precise per-scene timing control
  • Programmatic/batch generation with exact specs
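The preferred one-shot call above can be sketched as follows. The endpoint path (`POST /v1/video_agent/generate`) and `X-Api-Key` header come from this skill's text; the single-field `prompt` request body and the response handling are assumptions for illustration — references/video-agent.md documents the actual schema.

```typescript
// Hypothetical one-shot Video Agent request. Only the endpoint and auth
// header are documented; the body shape is an assumed minimal example.
function buildVideoAgentRequest(prompt: string): {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
} {
  return {
    url: "https://api.heygen.com/v1/video_agent/generate",
    init: {
      method: "POST",
      headers: {
        "X-Api-Key": process.env.HEYGEN_API_KEY ?? "",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ prompt }),
    },
  };
}

async function generateFromPrompt(prompt: string): Promise<unknown> {
  const { url, init } = buildVideoAgentRequest(prompt);
  const response = await fetch(url, init);
  return response.json();
}
```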

Quick Reference

| Task | Read |
|------|------|
| Generate video from prompt (easy) | prompt-optimizer.md → visual-styles.md → video-agent.md |
| Generate video with precise control | video-generation.md, avatars.md, voices.md |
| Check video status / get download URL | video-status.md |
| Add captions or text overlays | captions.md, text-overlays.md |
| Transparent video for compositing | video-generation.md (WebM section) |
| Generate standalone TTS audio | text-to-speech.md |
| Translate/dub existing video | video-translation.md |
| Use with Remotion | remotion-integration.md |

Reference Files

Foundation

Core Video Creation

Video Customization

Advanced Features

Integration

File v2.8.0:_meta.json

{
  "ownerId": "kn7dnc0jepdz3jy0rg589kcxns80dmr5",
  "slug": "video-agent",
  "version": "2.8.0",
  "publishedAt": 1771867395703
}

File v2.8.0:references/assets.md


name: assets
description: Uploading images, videos, and audio for use in HeyGen video generation

Asset Upload and Management

HeyGen allows you to upload custom assets (images, videos, audio) for use in video generation, such as backgrounds, talking photo sources, and custom audio.

Upload Flow

Asset uploads are a single-step process: POST the raw file binary directly to the upload endpoint. The Content-Type header must match the file's MIME type.

Uploading an Asset

Endpoint: POST https://upload.heygen.com/v1/asset

Request

| Header | Required | Description |
|--------|:--------:|-------------|
| X-Api-Key | ✓ | Your HeyGen API key |
| Content-Type | ✓ | MIME type of the file (e.g. image/jpeg) |

The request body is the raw binary file data. No JSON or form fields are needed.

Response

| Field | Type | Description |
|-------|------|-------------|
| code | number | Status code (100 = success) |
| data.id | string | Unique asset ID for use in video generation |
| data.name | string | Asset name |
| data.file_type | string | image, video, or audio |
| data.url | string | Accessible URL for the uploaded file |
| data.image_key | string \| null | Key for creating uploaded photo avatars (images only) |
| data.folder_id | string | Folder ID (empty if not in a folder) |
| data.meta | string \| null | Asset metadata |
| data.created_ts | number | Unix timestamp of creation |

curl

curl -X POST "https://upload.heygen.com/v1/asset" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: image/jpeg" \
  --data-binary '@./background.jpg'

TypeScript

import fs from "fs";

interface AssetUploadResponse {
  code: number;
  data: {
    id: string;
    name: string;
    file_type: string;
    url: string;
    image_key: string | null;
    folder_id: string;
    meta: string | null;
    created_ts: number;
  };
  msg: string | null;
  message: string | null;
}

async function uploadAsset(filePath: string, contentType: string): Promise<AssetUploadResponse["data"]> {
  const fileBuffer = fs.readFileSync(filePath);

  const response = await fetch("https://upload.heygen.com/v1/asset", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": contentType,
    },
    body: fileBuffer,
  });

  const json: AssetUploadResponse = await response.json();

  if (json.code !== 100) {
    throw new Error(json.message ?? "Upload failed");
  }

  return json.data;
}

// Usage
const asset = await uploadAsset("./background.jpg", "image/jpeg");
console.log(`Uploaded asset: ${asset.id}`);
console.log(`Asset URL: ${asset.url}`);

TypeScript (with streams for large files)

import fs from "fs";
import { stat } from "fs/promises";

async function uploadLargeAsset(filePath: string, contentType: string): Promise<AssetUploadResponse["data"]> {
  const fileStats = await stat(filePath);
  const fileStream = fs.createReadStream(filePath);

  const response = await fetch("https://upload.heygen.com/v1/asset", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": contentType,
      "Content-Length": fileStats.size.toString(),
    },
    body: fileStream as any,
    // @ts-ignore - duplex is needed for streaming
    duplex: "half",
  });

  const json: AssetUploadResponse = await response.json();

  if (json.code !== 100) {
    throw new Error(json.message ?? "Upload failed");
  }

  return json.data;
}

Python

import requests
import os

def upload_asset(file_path: str, content_type: str) -> dict:
    with open(file_path, "rb") as f:
        response = requests.post(
            "https://upload.heygen.com/v1/asset",
            headers={
                "X-Api-Key": os.environ["HEYGEN_API_KEY"],
                "Content-Type": content_type
            },
            data=f
        )

    data = response.json()
    if data.get("code") != 100:
        raise Exception(data.get("message", "Upload failed"))

    return data["data"]


# Usage
asset = upload_asset("./background.jpg", "image/jpeg")
print(f"Uploaded asset: {asset['id']}")
print(f"Asset URL: {asset['url']}")

Supported Content Types

| Type | Content-Type | Use Case |
|------|--------------|----------|
| JPEG | image/jpeg | Backgrounds, talking photos |
| PNG | image/png | Backgrounds, overlays |
| MP4 | video/mp4 | Video backgrounds |
| WebM | video/webm | Video backgrounds |
| MP3 | audio/mpeg | Custom audio input |
| WAV | audio/wav | Custom audio input |
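Since the Content-Type header must match the file's MIME type exactly, a small lookup over the documented types avoids typos. The mapping mirrors the table above; the helper itself is illustrative and not part of the skill.

```typescript
// Map file extensions to the Content-Type values documented for uploads.
const CONTENT_TYPES: Record<string, string> = {
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
  png: "image/png",
  mp4: "video/mp4",
  webm: "video/webm",
  mp3: "audio/mpeg",
  wav: "audio/wav",
};

function contentTypeFor(filePath: string): string {
  const ext = filePath.split(".").pop()?.toLowerCase() ?? "";
  const type = CONTENT_TYPES[ext];
  if (!type) throw new Error(`Unsupported asset type: .${ext}`);
  return type;
}
```

For example, `contentTypeFor("./background.jpg")` yields the `image/jpeg` value used in the curl example earlier.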

Uploading from URL

If your asset is already hosted online:

async function uploadFromUrl(sourceUrl: string, contentType: string): Promise<AssetUploadResponse["data"]> {
  // 1. Download the file
  const sourceResponse = await fetch(sourceUrl);
  const buffer = Buffer.from(await sourceResponse.arrayBuffer());

  // 2. Upload directly to HeyGen
  const response = await fetch("https://upload.heygen.com/v1/asset", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": contentType,
    },
    body: buffer,
  });

  const json: AssetUploadResponse = await response.json();

  if (json.code !== 100) {
    throw new Error(json.message ?? "Upload failed");
  }

  return json.data;
}

Using Uploaded Assets

As Background Image

const videoConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Hello, this is a video with a custom background!",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
      background: {
        type: "image",
        url: asset.url,  // Use the URL from the upload response
      },
    },
  ],
};

As Talking Photo Source

const talkingPhotoConfig = {
  video_inputs: [
    {
      character: {
        type: "talking_photo",
        talking_photo_id: asset.id,  // Use the ID from the upload response
      },
      voice: {
        type: "text",
        input_text: "Hello from my talking photo!",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
    },
  ],
};

As Audio Input

const audioConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "audio",
        audio_url: asset.url,  // Use the URL from the upload response
      },
    },
  ],
};

Complete Upload Workflow

async function createVideoWithCustomBackground(
  backgroundPath: string,
  script: string
): Promise<string> {
  // 1. Upload background
  console.log("Uploading background...");
  const background = await uploadAsset(backgroundPath, "image/jpeg");

  // 2. Create video config
  const config = {
    video_inputs: [
      {
        character: {
          type: "avatar",
          avatar_id: "josh_lite3_20230714",
          avatar_style: "normal",
        },
        voice: {
          type: "text",
          input_text: script,
          voice_id: "1bd001e7e50f421d891986aad5158bc8",
        },
        background: {
          type: "image",
          url: background.url,
        },
      },
    ],
    dimension: { width: 1920, height: 1080 },
  };

  // 3. Generate video
  console.log("Generating video...");
  const response = await fetch("https://api.heygen.com/v2/video/generate", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(config),
  });

  const { data } = await response.json();
  return data.video_id;
}
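The `video_id` returned above must then be polled until the render finishes. The status endpoint path and response fields in this sketch are assumptions — references/video-status.md documents the real polling contract; only the loop structure is the point here.

```typescript
// Hypothetical polling loop for a generated video. Endpoint path and the
// status/video_url fields are assumed for illustration.
function buildStatusUrl(videoId: string): string {
  return `https://api.heygen.com/v1/video_status.get?video_id=${encodeURIComponent(videoId)}`;
}

async function waitForVideo(videoId: string, intervalMs = 5000): Promise<string> {
  for (;;) {
    const response = await fetch(buildStatusUrl(videoId), {
      headers: { "X-Api-Key": process.env.HEYGEN_API_KEY ?? "" },
    });
    const { data } = await response.json();
    if (data.status === "completed") return data.video_url; // download URL
    if (data.status === "failed") throw new Error("Video generation failed");
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```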

Asset Limitations

  • File size: 10MB maximum
  • Image dimensions: Recommended to match video dimensions
  • Audio duration: Should match expected video length
  • Retention: Assets may be deleted after a period of inactivity

Best Practices

  1. Optimize images - Resize to match video dimensions before uploading
  2. Use appropriate formats - JPEG for photos, PNG for graphics with transparency
  3. Validate before upload - Check file type and size locally first
  4. Handle upload errors - Implement retry logic for failed uploads
  5. Cache asset IDs - Reuse assets across multiple video generations
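Practices 3 and 4 above can start with a local pre-upload check. The sketch below mirrors the 10MB limit listed under Asset Limitations; the MIME allow-list and function name are illustrative assumptions, not part of the HeyGen API.

```typescript
// Local pre-upload validation (illustrative; the size limit mirrors the list above,
// the MIME allow-list is an assumption)
const MAX_ASSET_BYTES = 10 * 1024 * 1024; // 10MB maximum
const ALLOWED_MIME_TYPES = new Set([
  "image/jpeg",
  "image/png",
  "audio/mpeg",
  "audio/wav",
]);

function validateAsset(sizeBytes: number, mimeType: string): string | null {
  // Returns an error message, or null when the asset passes both checks
  if (sizeBytes > MAX_ASSET_BYTES) {
    return `File is ${sizeBytes} bytes; maximum is ${MAX_ASSET_BYTES}`;
  }
  if (!ALLOWED_MIME_TYPES.has(mimeType)) {
    return `Unsupported MIME type: ${mimeType}`;
  }
  return null;
}
```

Run this before calling the upload endpoint and surface the returned message instead of waiting for a server-side rejection.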

File v2.8.0:references/authentication.md


name: authentication
description: API key setup, X-Api-Key header, and authentication patterns for HeyGen

HeyGen Authentication

All HeyGen API requests require authentication using an API key passed in the X-Api-Key header.

Getting Your API Key

  1. Go to https://app.heygen.com/settings?from=&nav=API
  2. Log in if prompted
  3. Copy your API key

Environment Setup

Store your API key securely as an environment variable:

export HEYGEN_API_KEY="your-api-key-here"

For .env files:

HEYGEN_API_KEY=your-api-key-here

Making Authenticated Requests

curl

curl -X GET "https://api.heygen.com/v2/avatars" \
  -H "X-Api-Key: $HEYGEN_API_KEY"

TypeScript/JavaScript (fetch)

const response = await fetch("https://api.heygen.com/v2/avatars", {
  headers: {
    "X-Api-Key": process.env.HEYGEN_API_KEY!,
  },
});
const { data } = await response.json();

TypeScript/JavaScript (axios)

import axios from "axios";

const client = axios.create({
  baseURL: "https://api.heygen.com",
  headers: {
    "X-Api-Key": process.env.HEYGEN_API_KEY,
  },
});

const { data } = await client.get("/v2/avatars");

Python (requests)

import os
import requests

response = requests.get(
    "https://api.heygen.com/v2/avatars",
    headers={"X-Api-Key": os.environ["HEYGEN_API_KEY"]}
)
data = response.json()

Python (httpx)

import os
import httpx

async with httpx.AsyncClient() as client:
    response = await client.get(
        "https://api.heygen.com/v2/avatars",
        headers={"X-Api-Key": os.environ["HEYGEN_API_KEY"]}
    )
    data = response.json()

Creating a Reusable API Client

TypeScript

class HeyGenClient {
  private baseUrl = "https://api.heygen.com";
  private apiKey: string;

  constructor(apiKey: string) {
    this.apiKey = apiKey;
  }

  async request<T>(endpoint: string, options: RequestInit = {}): Promise<T> {
    const response = await fetch(`${this.baseUrl}${endpoint}`, {
      ...options,
      headers: {
        "X-Api-Key": this.apiKey,
        "Content-Type": "application/json",
        ...options.headers,
      },
    });

    if (!response.ok) {
      const error = await response.json();
      throw new Error(error.message || `HTTP ${response.status}`);
    }

    return response.json();
  }

  get<T>(endpoint: string): Promise<T> {
    return this.request<T>(endpoint);
  }

  post<T>(endpoint: string, body: unknown): Promise<T> {
    return this.request<T>(endpoint, {
      method: "POST",
      body: JSON.stringify(body),
    });
  }
}

// Usage
const client = new HeyGenClient(process.env.HEYGEN_API_KEY!);
const avatars = await client.get("/v2/avatars");

API Response Format

All HeyGen API responses follow this structure:

interface ApiResponse<T> {
  error: null | string;
  data: T;
}

Successful response example:

{
  "error": null,
  "data": {
    "avatars": [...]
  }
}

Error response example:

{
  "error": "Invalid API key",
  "data": null
}

Error Handling

Common authentication errors:

| Status Code | Error | Cause |
|-------------|-------|-------|
| 401 | Invalid API key | API key is missing or incorrect |
| 403 | Forbidden | API key doesn't have required permissions |
| 429 | Rate limit exceeded | Too many requests |

Handling Errors

async function makeRequest(endpoint: string) {
  const response = await fetch(`https://api.heygen.com${endpoint}`, {
    headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! },
  });

  const json = await response.json();

  if (!response.ok || json.error) {
    throw new Error(json.error || `HTTP ${response.status}`);
  }

  return json.data;
}

Rate Limiting

HeyGen enforces rate limits on API requests:

  • Standard rate limits apply per API key
  • Some endpoints (like video generation) have stricter limits
  • Use exponential backoff when receiving 429 errors

async function requestWithRetry(
  fn: () => Promise<Response>,
  maxRetries = 3
): Promise<Response> {
  for (let i = 0; i < maxRetries; i++) {
    const response = await fn();

    if (response.status === 429) {
      // Exponential backoff: 1s, 2s, 4s (prefer the Retry-After header if the API provides one)
      const waitTime = Math.pow(2, i) * 1000;
      await new Promise((resolve) => setTimeout(resolve, waitTime));
      continue;
    }

    return response;
  }

  throw new Error("Max retries exceeded");
}

Security Best Practices

  1. Never expose API keys in client-side code - Always make API calls from a backend server
  2. Use environment variables - Don't hardcode API keys in source code
  3. Rotate keys periodically - Generate new API keys regularly
  4. Monitor usage - Check your HeyGen dashboard for unusual activity

File v2.8.0:references/avatars.md


name: avatars
description: Listing avatars, avatar styles, and avatar_id selection for HeyGen

HeyGen Avatars

Avatars are the AI-generated presenters in HeyGen videos. You can use public avatars provided by HeyGen or create custom avatars.

Previewing Avatars Before Generation

Always preview avatars before generating a video to ensure they match user preferences. Each avatar has preview URLs that can be opened directly in the browser - no downloading required.

Quick Preview: Open URL in Browser (Recommended)

The fastest way to preview avatars is to open the URL directly in the default browser. Do not download the image first - just pass the URL to open:

# macOS: Open URL directly in default browser (no download)
open "https://files.heygen.ai/avatar/preview/josh.jpg"

# Open preview video to see animation
open "https://files.heygen.ai/avatar/preview/josh.mp4"

# Linux: Use xdg-open
xdg-open "https://files.heygen.ai/avatar/preview/josh.jpg"

# Windows: Use start (the empty "" is the window title; without it,
# start treats the quoted URL as the title and opens nothing)
start "" "https://files.heygen.ai/avatar/preview/josh.jpg"

The open command on macOS opens URLs directly in the default browser - it does not download the file. This is the quickest way to let users see avatar previews.
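The per-platform commands above can be wrapped in a single helper. This is a sketch, not part of the skill: the command strings mirror the shell examples, and `openUrl` simply shells out to the right one for the current platform.

```typescript
import { execSync } from "child_process";

// Build the platform-appropriate "open this URL" command
// (mirrors the shell examples above; on Windows the empty ""
// is the window title, which keeps start from eating the URL).
function openCommand(url: string, platform: string = process.platform): string {
  switch (platform) {
    case "darwin":
      return `open "${url}"`;
    case "win32":
      return `start "" "${url}"`;
    default:
      return `xdg-open "${url}"`;
  }
}

function openUrl(url: string): void {
  execSync(openCommand(url));
}
```

Passing the platform explicitly keeps the command builder testable without actually launching a browser.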

List Avatars and Open Previews

async function listAndPreviewAvatars(openInBrowser = true): Promise<void> {
  const response = await fetch("https://api.heygen.com/v2/avatars", {
    headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! },
  });
  const { data } = await response.json();

  for (const avatar of data.avatars.slice(0, 5)) {
    console.log(`\n${avatar.avatar_name} (${avatar.gender})`);
    console.log(`  ID: ${avatar.avatar_id}`);
    console.log(`  Preview: ${avatar.preview_image_url}`);
  }

  // Open preview URLs directly in browser (no download needed)
  if (openInBrowser) {
    const { execSync } = require("child_process");
    for (const avatar of data.avatars.slice(0, 3)) {
      // 'open' on macOS opens the URL in default browser - doesn't download
      execSync(`open "${avatar.preview_image_url}"`);
    }
  }
}

Note: The open command passes the URL to the browser - it does not download. The browser fetches and displays the image directly.

Workflow: Preview Before Generate

  1. List available avatars - get names, genders, and preview URLs
  2. Open previews in browser - open <preview_image_url> for quick visual check
  3. User selects preferred avatar by name or ID
  4. Get avatar details for default_voice_id
  5. Generate video with selected avatar

# Example workflow in terminal
# 1. List avatars (agent shows options)
# 2. Open preview for candidate
open "https://files.heygen.ai/avatar/preview/josh.jpg"
# 3. User says "use Josh"
# 4. Agent gets details and generates

Preview Fields in API Response

| Field | Description |
|-------|-------------|
| preview_image_url | Static image of the avatar (JPG) - open in browser |
| preview_video_url | Short video clip showing avatar animation |

Both URLs are publicly accessible - no authentication needed to view.

Listing Available Avatars

curl

curl -X GET "https://api.heygen.com/v2/avatars" \
  -H "X-Api-Key: $HEYGEN_API_KEY"

TypeScript

interface Avatar {
  avatar_id: string;
  avatar_name: string;
  gender: "male" | "female";
  preview_image_url: string;
  preview_video_url: string;
}

interface AvatarsResponse {
  error: null | string;
  data: {
    avatars: Avatar[];
    talking_photos: TalkingPhoto[];
  };
}

async function listAvatars(): Promise<Avatar[]> {
  const response = await fetch("https://api.heygen.com/v2/avatars", {
    headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! },
  });

  const json: AvatarsResponse = await response.json();

  if (json.error) {
    throw new Error(json.error);
  }

  return json.data.avatars;
}

Python

import requests
import os

def list_avatars() -> list:
    response = requests.get(
        "https://api.heygen.com/v2/avatars",
        headers={"X-Api-Key": os.environ["HEYGEN_API_KEY"]}
    )

    data = response.json()
    if data.get("error"):
        raise Exception(data["error"])

    return data["data"]["avatars"]

Response Format

{
  "error": null,
  "data": {
    "avatars": [
      {
        "avatar_id": "josh_lite3_20230714",
        "avatar_name": "Josh",
        "gender": "male",
        "preview_image_url": "https://files.heygen.ai/...",
        "preview_video_url": "https://files.heygen.ai/..."
      },
      {
        "avatar_id": "angela_expressive_20231010",
        "avatar_name": "Angela",
        "gender": "female",
        "preview_image_url": "https://files.heygen.ai/...",
        "preview_video_url": "https://files.heygen.ai/..."
      }
    ],
    "talking_photos": []
  }
}

Avatar Types

Public Avatars

HeyGen provides a library of public avatars that anyone can use:

// List only public avatars
const avatars = await listAvatars();
const publicAvatars = avatars.filter((a) => !a.avatar_id.startsWith("custom_"));

Private/Custom Avatars

Custom avatars created from your own training footage:

const customAvatars = avatars.filter((a) => a.avatar_id.startsWith("custom_"));

Avatar Styles

Avatars support different rendering styles:

| Style | Description |
|-------|-------------|
| normal | Full body shot, standard framing |
| closeUp | Close-up on face, more expressive |
| circle | Avatar in circular frame (talking head) |
| voice_only | Audio only, no video rendering |

When to Use Each Style

| Use Case | Recommended Style |
|----------|-------------------|
| Full-screen presenter video | normal |
| Personal/intimate content | closeUp |
| Picture-in-picture overlay | circle |
| Small corner widget | circle |
| Podcast/audio content | voice_only |
| Motion graphics with avatar overlay | normal or closeUp + transparent bg |

Using Avatar Styles

const videoConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal", // "normal" | "closeUp" | "circle" | "voice_only"
      },
      voice: {
        type: "text",
        input_text: "Hello, world!",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
    },
  ],
};

Circle Style for Talking Heads

Circle style is ideal for overlay compositions:

// Circle avatar for picture-in-picture
{
  character: {
    type: "avatar",
    avatar_id: "josh_lite3_20230714",
    avatar_style: "circle",
  },
  voice: { ... },
  background: {
    type: "color",
    value: "#00FF00", // Green for chroma key, or use webm endpoint
  },
}

Searching and Filtering Avatars

By Gender

function filterByGender(avatars: Avatar[], gender: "male" | "female"): Avatar[] {
  return avatars.filter((a) => a.gender === gender);
}

const maleAvatars = filterByGender(avatars, "male");
const femaleAvatars = filterByGender(avatars, "female");

By Name

function searchByName(avatars: Avatar[], query: string): Avatar[] {
  const lowerQuery = query.toLowerCase();
  return avatars.filter((a) =>
    a.avatar_name.toLowerCase().includes(lowerQuery)
  );
}

const results = searchByName(avatars, "josh");

Avatar Groups

Avatars are organized into groups for better management.

List Avatar Groups

curl -X GET "https://api.heygen.com/v2/avatar_group.list?include_public=true" \
  -H "X-Api-Key: $HEYGEN_API_KEY"

Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| include_public | bool | false | Include public avatars in results |

TypeScript

interface AvatarGroupItem {
  id: string;
  name: string;
  created_at: number;
  num_looks: number;
  preview_image: string;
  group_type: string;
  train_status: string;
  default_voice_id: string | null;
}

interface AvatarGroupListResponse {
  error: null | string;
  data: {
    avatar_group_list: AvatarGroupItem[];
  };
}

async function listAvatarGroups(
  includePublic = true
): Promise<AvatarGroupListResponse["data"]> {
  const params = new URLSearchParams({
    include_public: includePublic.toString(),
  });

  const response = await fetch(
    `https://api.heygen.com/v2/avatar_group.list?${params}`,
    { headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! } }
  );

  const json: AvatarGroupListResponse = await response.json();

  if (json.error) {
    throw new Error(json.error);
  }

  return json.data;
}

Get Avatars in a Group

curl -X GET "https://api.heygen.com/v2/avatar_group/{group_id}/avatars" \
  -H "X-Api-Key: $HEYGEN_API_KEY"
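A TypeScript equivalent of the curl call above might look like the following. The response shape for this endpoint is not documented in this file, so the sketch assumes the standard `{ error, data }` envelope used by the other endpoints and returns `data` untyped; `groupAvatarsUrl` and `listGroupAvatars` are illustrative names.

```typescript
// Sketch: fetch avatars in a group. The response is assumed to follow the
// standard { error, data } envelope; inspect data before relying on fields.
function groupAvatarsUrl(groupId: string): string {
  return `https://api.heygen.com/v2/avatar_group/${encodeURIComponent(groupId)}/avatars`;
}

async function listGroupAvatars(groupId: string): Promise<unknown> {
  const response = await fetch(groupAvatarsUrl(groupId), {
    headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! },
  });
  const json = await response.json();
  if (json.error) {
    throw new Error(json.error);
  }
  return json.data;
}
```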

Using Avatars in Video Generation

Basic Avatar Usage

const videoConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Welcome to our product demo!",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
    },
  ],
  dimension: { width: 1920, height: 1080 },
};

Multiple Scenes with Different Avatars

const multiSceneConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Hi, I'm Josh. Let me introduce my colleague.",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
    },
    {
      character: {
        type: "avatar",
        avatar_id: "angela_expressive_20231010",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Hello! I'm Angela. Nice to meet you!",
        voice_id: "2d5b0e6a8c3f47d9a1b2c3d4e5f60718",
      },
    },
  ],
};

Using Avatar's Default Voice

Many avatars have a default_voice_id that's pre-matched for natural results. This is the recommended approach rather than manually selecting voices.

Recommended Flow

1. GET /v2/avatars           → Get list of avatar_ids
2. GET /v2/avatar/{id}/details → Get default_voice_id for chosen avatar
3. POST /v2/video/generate   → Use avatar_id + default_voice_id

Get Avatar Details (v2 API)

Given an avatar_id, fetch its details including the default voice:

curl -X GET "https://api.heygen.com/v2/avatar/{avatar_id}/details" \
  -H "X-Api-Key: $HEYGEN_API_KEY"

Response Format

{
  "error": null,
  "data": {
    "type": "avatar",
    "id": "josh_lite3_20230714",
    "name": "Josh",
    "gender": "male",
    "preview_image_url": "https://files.heygen.ai/...",
    "preview_video_url": "https://files.heygen.ai/...",
    "premium": false,
    "is_public": true,
    "default_voice_id": "1bd001e7e50f421d891986aad5158bc8",
    "tags": ["AVATAR_IV"]
  }
}

TypeScript

interface AvatarDetails {
  type: "avatar";
  id: string;
  name: string;
  gender: "male" | "female";
  preview_image_url: string;
  preview_video_url: string;
  premium: boolean;
  is_public: boolean;
  default_voice_id: string | null;
  tags: string[];
}

async function getAvatarDetails(avatarId: string): Promise<AvatarDetails> {
  const response = await fetch(
    `https://api.heygen.com/v2/avatar/${avatarId}/details`,
    { headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! } }
  );

  const json = await response.json();

  if (json.error) {
    throw new Error(json.error);
  }

  return json.data;
}

// Usage: Get default voice for a known avatar
const details = await getAvatarDetails("josh_lite3_20230714");
if (details.default_voice_id) {
  console.log(`Using ${details.name} with default voice: ${details.default_voice_id}`);
} else {
  console.log(`${details.name} has no default voice, select manually`);
}

Complete Example: Generate Video with Any Avatar's Default Voice

async function generateWithAvatarDefaultVoice(
  avatarId: string,
  script: string
): Promise<string> {
  // 1. Get avatar details to find default voice
  const avatar = await getAvatarDetails(avatarId);

  if (!avatar.default_voice_id) {
    throw new Error(`Avatar ${avatar.name} has no default voice`);
  }

  // 2. Generate video with the avatar's default voice
  const videoId = await generateVideo({
    video_inputs: [{
      character: {
        type: "avatar",
        avatar_id: avatar.id,
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: script,
        voice_id: avatar.default_voice_id,
      },
    }],
    dimension: { width: 1920, height: 1080 },
  });

  return videoId;
}

Why Use Default Voice?

  1. Guaranteed gender match - Avatar and voice are pre-paired
  2. Natural lip sync - Default voices are optimized for the avatar
  3. Simpler code - No need to fetch and match voices separately
  4. Better quality - HeyGen has tested this combination

Selecting the Right Avatar

Avatar Categories

HeyGen avatars fall into distinct categories. Match the category to your use case:

| Category | Examples | Best For |
|----------|----------|----------|
| Business/Professional | Josh, Angela, Wayne | Corporate videos, product demos, training |
| Casual/Friendly | Lily, various lifestyle avatars | Social media, informal content |
| Themed/Seasonal | Holiday-themed, costume avatars | Specific campaigns, seasonal content |
| Expressive | Avatars with "expressive" in name | Engaging storytelling, dynamic content |

Selection Guidelines

For business/professional content:

  • Choose avatars with neutral attire (business casual or formal)
  • Avoid themed or seasonal avatars (holiday costumes, casual clothing)
  • Preview the avatar to verify professional appearance
  • Consider your audience demographics when selecting gender and appearance

For casual/social content:

  • More flexibility in avatar choice
  • Themed avatars can work for specific campaigns
  • Match avatar energy to content tone

Common Mistakes to Avoid

  1. Using themed avatars for business content - A holiday-themed avatar looks unprofessional in a product demo
  2. Not previewing before generation - Always open <preview_url> to verify appearance
  3. Ignoring avatar style - A circle style avatar may not work for full-screen presentations
  4. Mismatched voice gender - Always use the avatar's default_voice_id or match genders manually

Selection Checklist

Before generating a video:

  • [ ] Previewed avatar image/video in browser
  • [ ] Avatar appearance matches content tone (professional vs casual)
  • [ ] Avatar style (normal, closeUp, circle) fits the video format
  • [ ] Voice gender matches avatar gender
  • [ ] Using default_voice_id when available

Helper Functions

Get Avatar by ID

async function getAvatarById(avatarId: string): Promise<Avatar | null> {
  const avatars = await listAvatars();
  return avatars.find((a) => a.avatar_id === avatarId) || null;
}

Validate Avatar ID

async function isValidAvatarId(avatarId: string): Promise<boolean> {
  const avatar = await getAvatarById(avatarId);
  return avatar !== null;
}

Get Random Avatar

async function getRandomAvatar(gender?: "male" | "female"): Promise<Avatar> {
  let avatars = await listAvatars();

  if (gender) {
    avatars = avatars.filter((a) => a.gender === gender);
  }

  const randomIndex = Math.floor(Math.random() * avatars.length);
  return avatars[randomIndex];
}

Common Avatar IDs

Some commonly used public avatar IDs (availability may vary):

| Avatar ID | Name | Gender |
|-----------|------|--------|
| josh_lite3_20230714 | Josh | Male |
| angela_expressive_20231010 | Angela | Female |
| wayne_20240422 | Wayne | Male |
| lily_20230614 | Lily | Female |

Always verify avatar availability by calling the list endpoint before using.

File v2.8.0:references/backgrounds.md


name: backgrounds
description: Solid colors, images, and video backgrounds for HeyGen videos

Video Backgrounds

HeyGen supports various background types to customize the appearance of your avatar videos.

Background Types

| Type | Description |
|------|-------------|
| color | Solid color background |
| image | Static image background |
| video | Looping video background |

Color Backgrounds

The simplest option - use a solid color:

const videoConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Hello with a colored background!",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
      background: {
        type: "color",
        value: "#FFFFFF", // White background
      },
    },
  ],
};

Common Color Values

| Color | Hex Value | Use Case |
|-------|-----------|----------|
| White | #FFFFFF | Clean, professional |
| Black | #000000 | Dramatic, cinematic |
| Blue | #0066CC | Corporate, trustworthy |
| Green | #00FF00 | Chroma key (for compositing) |
| Gray | #808080 | Neutral, modern |

Using Transparent/Green Screen

For compositing in post-production:

background: {
  type: "color",
  value: "#00FF00", // Green screen
}

Image Backgrounds

Use a static image as background:

From URL

const videoConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Check out this custom background!",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
      background: {
        type: "image",
        url: "https://example.com/my-background.jpg",
      },
    },
  ],
};

From Uploaded Asset

First upload your image, then use the asset URL:

// 1. Upload the image
const assetId = await uploadFile("./background.jpg", "image/jpeg");

// 2. Use in video config
const videoConfig = {
  video_inputs: [
    {
      character: {...},
      voice: {...},
      background: {
        type: "image",
        url: `https://files.heygen.ai/asset/${assetId}`,
      },
    },
  ],
};

Image Requirements

  • Formats: JPEG, PNG
  • Recommended size: Match video dimensions (e.g., 1920x1080 for 1080p)
  • Aspect ratio: Should match video aspect ratio
  • File size: Under 10MB recommended
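The aspect-ratio recommendation above can be checked locally before uploading a background. This helper is illustrative, not part of the HeyGen API; the tolerance value is an assumption to absorb rounding in non-standard dimensions.

```typescript
// Check whether an image's aspect ratio matches the target video dimensions
// within a small tolerance (illustrative helper, not part of the HeyGen API)
function aspectRatioMatches(
  imageWidth: number,
  imageHeight: number,
  videoWidth: number,
  videoHeight: number,
  tolerance = 0.01
): boolean {
  const imageRatio = imageWidth / imageHeight;
  const videoRatio = videoWidth / videoHeight;
  return Math.abs(imageRatio - videoRatio) <= tolerance;
}
```

For example, a 1920x1080 image passes against a 1280x720 video (both 16:9), while a portrait image against a landscape video fails.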

Video Backgrounds

Use a looping video as background:

const videoConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Dynamic video background!",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
      background: {
        type: "video",
        url: "https://example.com/background-loop.mp4",
      },
    },
  ],
};

Video Requirements

  • Format: MP4 (H.264 codec recommended)
  • Looping: Video will loop if shorter than avatar content
  • Audio: Background video audio is typically muted
  • File size: Under 100MB recommended

Different Backgrounds Per Scene

Use different backgrounds for each scene:

const multiBackgroundConfig = {
  video_inputs: [
    // Scene 1: Office background
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Let me start with an introduction.",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
      background: {
        type: "image",
        url: "https://example.com/office-bg.jpg",
      },
    },
    // Scene 2: Product showcase
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "closeUp",
      },
      voice: {
        type: "text",
        input_text: "Now let me show you our product.",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
      background: {
        type: "image",
        url: "https://example.com/product-bg.jpg",
      },
    },
    // Scene 3: Call to action
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Get started today!",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
      background: {
        type: "color",
        value: "#1a1a2e",
      },
    },
  ],
};

Background Helper Functions

TypeScript

type BackgroundType = "color" | "image" | "video";

interface Background {
  type: BackgroundType;
  value?: string;
  url?: string;
}

function createColorBackground(hexColor: string): Background {
  return { type: "color", value: hexColor };
}

function createImageBackground(imageUrl: string): Background {
  return { type: "image", url: imageUrl };
}

function createVideoBackground(videoUrl: string): Background {
  return { type: "video", url: videoUrl };
}

// Preset backgrounds
const backgrounds = {
  white: createColorBackground("#FFFFFF"),
  black: createColorBackground("#000000"),
  greenScreen: createColorBackground("#00FF00"),
  corporate: createColorBackground("#0066CC"),
};

Best Practices

  1. Match dimensions - Background should match video dimensions
  2. Consider avatar position - Leave space where avatar will appear
  3. Use contrasting colors - Ensure avatar is visible against background
  4. Optimize file sizes - Compress images/videos for faster processing
  5. Test with green screen - For professional post-production workflows
  6. Keep backgrounds simple - Avoid distracting elements behind the avatar

Common Issues

Background Not Showing

// Wrong: missing url/value
background: {
  type: "image"
}

// Correct
background: {
  type: "image",
  url: "https://example.com/bg.jpg"
}

Aspect Ratio Mismatch

If your background doesn't match the video dimensions, it may be cropped or stretched. Always match your background aspect ratio to your video dimensions:

// For 1920x1080 video
// Use 1920x1080 background image

// For 1080x1920 portrait video
// Use 1080x1920 background image

Video Background Audio

Background video audio is typically muted to avoid conflicting with the avatar's voice. If you need background music, add it as a separate audio track in post-production.

File v2.8.0:references/captions.md


name: captions
description: Auto-generated captions and subtitle options for HeyGen videos

Video Captions

HeyGen can automatically generate captions (subtitles) for your videos, improving accessibility and engagement.

Enabling Captions

Captions can be enabled when generating a video:

const videoConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Hello! This video will have automatic captions.",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
    },
  ],
  // Caption settings (availability varies by plan)
  caption: true,
};

Caption Configuration Options

interface CaptionConfig {
  // Enable/disable captions
  enabled: boolean;

  // Caption style
  style?: {
    font_family?: string;
    font_size?: number;
    font_color?: string;
    background_color?: string;
    position?: "top" | "bottom";
  };

  // Language for caption generation
  language?: string;
}

Caption Styles

Basic Captions

const config = {
  video_inputs: [...],
  caption: true, // Enable with default styling
};

Styled Captions

const config = {
  video_inputs: [...],
  caption: {
    enabled: true,
    style: {
      font_family: "Arial",
      font_size: 32,
      font_color: "#FFFFFF",
      background_color: "rgba(0, 0, 0, 0.7)",
      position: "bottom",
    },
  },
};

Multi-Language Captions

For videos in different languages, captions are generated based on the voice language:

// Spanish video with Spanish captions
const spanishConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "¡Hola! Este video tendrá subtítulos en español.",
        voice_id: "spanish_voice_id",
      },
    },
  ],
  caption: true,
};

Working with SRT Files

SRT File Format

Standard SRT format:

1
00:00:00,000 --> 00:00:03,000
Hello! This video will have

2
00:00:03,000 --> 00:00:06,000
automatic captions generated.

3
00:00:06,000 --> 00:00:09,000
They sync with the audio.
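When assembling a custom SRT file programmatically, the timestamp format shown above (`HH:MM:SS,mmm`) can be produced with a small helper. This is a generic sketch for the SRT format, not a HeyGen API; the function names are illustrative.

```typescript
// Format a millisecond offset as an SRT timestamp: HH:MM:SS,mmm
function srtTimestamp(ms: number): string {
  const pad = (n: number, width = 2) => n.toString().padStart(width, "0");
  const hours = Math.floor(ms / 3_600_000);
  const minutes = Math.floor((ms % 3_600_000) / 60_000);
  const seconds = Math.floor((ms % 60_000) / 1_000);
  return `${pad(hours)}:${pad(minutes)}:${pad(seconds)},${pad(ms % 1_000, 3)}`;
}

// Build one numbered SRT cue (index, time range, then the caption text)
function srtEntry(
  index: number,
  startMs: number,
  endMs: number,
  text: string
): string {
  return `${index}\n${srtTimestamp(startMs)} --> ${srtTimestamp(endMs)}\n${text}\n`;
}
```

Joining the entries with blank lines yields a file in the standard format shown above.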

Using Custom SRT

For video translation, you can provide your own SRT:

const translationConfig = {
  input_video_id: "original_video_id",
  output_languages: ["es-ES", "fr-FR"],
  srt_key: "path/to/custom.srt", // Custom SRT file
  srt_role: "input", // "input" or "output"
};

Caption Positioning

Bottom (Default)

Standard position for most videos:

caption: {
  enabled: true,
  style: {
    position: "bottom"
  }
}

Top

For videos where bottom space is occupied:

caption: {
  enabled: true,
  style: {
    position: "top"
  }
}

Accessibility Best Practices

  1. Always enable captions - Improves accessibility for deaf/hard-of-hearing viewers
  2. Use high contrast - White text on dark background or vice versa
  3. Readable font size - At least 24px for standard video, larger for mobile
  4. Don't cover important content - Position captions away from key visual elements
  5. Sync timing - Ensure captions match audio timing accurately

Caption Helper Functions

interface CaptionStyle {
  font_family: string;
  font_size: number;
  font_color: string;
  background_color: string;
  position: "top" | "bottom";
}

const captionPresets: Record<string, CaptionStyle> = {
  default: {
    font_family: "Arial",
    font_size: 32,
    font_color: "#FFFFFF",
    background_color: "rgba(0, 0, 0, 0.7)",
    position: "bottom",
  },
  minimal: {
    font_family: "Arial",
    font_size: 28,
    font_color: "#FFFFFF",
    background_color: "transparent",
    position: "bottom",
  },
  bold: {
    font_family: "Arial",
    font_size: 36,
    font_color: "#FFFFFF",
    background_color: "rgba(0, 0, 0, 0.9)",
    position: "bottom",
  },
  branded: {
    font_family: "Roboto",
    font_size: 30,
    font_color: "#00D1FF",
    background_color: "rgba(26, 26, 46, 0.9)",
    position: "bottom",
  },
};

function createCaptionConfig(preset: keyof typeof captionPresets) {
  return {
    enabled: true,
    style: captionPresets[preset],
  };
}

Social Media Caption Considerations

TikTok / Instagram Reels

  • Position captions in center or upper portion
  • Avoid bottom 20% (covered by UI elements)
  • Use larger font sizes for mobile viewing

const socialCaptions = {
  enabled: true,
  style: {
    font_size: 42,
    position: "top", // Avoid bottom UI elements
  },
};

YouTube

  • Standard bottom captions work well
  • YouTube also supports closed captions upload

LinkedIn

  • Captions highly recommended (many watch without sound)
  • Professional styling preferred

Limitations

  • Caption styles may be limited depending on your subscription tier
  • Some advanced caption features may require the web interface
  • Multi-speaker caption detection may have limited availability
  • Caption accuracy depends on audio quality and speech clarity

Integration with Video Translation

When using video translation, captions are automatically handled:

// Video translation includes caption generation
const translationConfig = {
  input_video_id: "original_video_id",
  output_languages: ["es-ES"],
  // Captions generated in target language
};

See video-translation.md for more details.

File v2.8.0:references/dimensions.md


name: dimensions description: Resolution options (720p/1080p) and aspect ratios for HeyGen videos

Video Dimensions and Resolution

HeyGen supports various video dimensions and aspect ratios to fit different platforms and use cases.

Standard Resolutions

Landscape (16:9)

| Resolution | Width | Height | Use Case |
|------------|-------|--------|----------|
| 720p | 1280 | 720 | Standard quality, faster processing |
| 1080p | 1920 | 1080 | High quality, most common |

Portrait (9:16)

| Resolution | Width | Height | Use Case |
|------------|-------|--------|----------|
| 720p | 720 | 1280 | Mobile-first content |
| 1080p | 1080 | 1920 | High quality vertical |

Square (1:1)

| Resolution | Width | Height | Use Case |
|------------|-------|--------|----------|
| 720p | 720 | 720 | Social media posts |
| 1080p | 1080 | 1080 | High quality square |

Setting Dimensions

TypeScript

// Landscape 1080p
const landscapeConfig = {
  video_inputs: [...],
  dimension: {
    width: 1920,
    height: 1080
  }
};

// Portrait 1080p
const portraitConfig = {
  video_inputs: [...],
  dimension: {
    width: 1080,
    height: 1920
  }
};

// Square 1080p
const squareConfig = {
  video_inputs: [...],
  dimension: {
    width: 1080,
    height: 1080
  }
};

curl

# Landscape 1080p
curl -X POST "https://api.heygen.com/v2/video/generate" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "video_inputs": [...],
    "dimension": {
      "width": 1920,
      "height": 1080
    }
  }'

Dimension Helper Functions

type AspectRatio = "16:9" | "9:16" | "1:1" | "4:3" | "4:5";
type Quality = "720p" | "1080p";

interface Dimensions {
  width: number;
  height: number;
}

function getDimensions(aspectRatio: AspectRatio, quality: Quality): Dimensions {
  const configs: Record<AspectRatio, Record<Quality, Dimensions>> = {
    "16:9": {
      "720p": { width: 1280, height: 720 },
      "1080p": { width: 1920, height: 1080 },
    },
    "9:16": {
      "720p": { width: 720, height: 1280 },
      "1080p": { width: 1080, height: 1920 },
    },
    "1:1": {
      "720p": { width: 720, height: 720 },
      "1080p": { width: 1080, height: 1080 },
    },
    "4:3": {
      "720p": { width: 960, height: 720 },
      "1080p": { width: 1440, height: 1080 },
    },
    "4:5": {
      "720p": { width: 576, height: 720 },
      "1080p": { width: 864, height: 1080 },
    },
  };

  return configs[aspectRatio][quality];
}

// Usage
const youTubeDimensions = getDimensions("16:9", "1080p");
const tikTokDimensions = getDimensions("9:16", "1080p");
const instagramDimensions = getDimensions("1:1", "1080p");

Platform-Specific Recommendations

YouTube

const youtubeConfig = {
  video_inputs: [...],
  dimension: { width: 1920, height: 1080 }, // 16:9 landscape
};

TikTok / Instagram Reels / YouTube Shorts

const shortFormConfig = {
  video_inputs: [...],
  dimension: { width: 1080, height: 1920 }, // 9:16 portrait
};

Instagram Feed Post

const instagramFeedConfig = {
  video_inputs: [...],
  dimension: { width: 1080, height: 1080 }, // 1:1 square
};

LinkedIn

const linkedinConfig = {
  video_inputs: [...],
  dimension: { width: 1920, height: 1080 }, // 16:9 landscape preferred
};

Twitter/X

const twitterConfig = {
  video_inputs: [...],
  dimension: { width: 1280, height: 720 }, // 16:9, 720p is common
};

Avatar IV Dimensions

For Avatar IV (photo-based avatars), dimensions are set via orientation:

type VideoOrientation = "portrait" | "landscape" | "square";

function getAvatarIVDimensions(orientation: VideoOrientation): Dimensions {
  switch (orientation) {
    case "portrait":
      return { width: 720, height: 1280 };
    case "landscape":
      return { width: 1280, height: 720 };
    case "square":
      return { width: 720, height: 720 };
  }
}

Custom Dimensions

HeyGen supports custom dimensions within limits:

const customConfig = {
  video_inputs: [...],
  dimension: {
    width: 1600,
    height: 900  // Custom 16:9 at non-standard resolution
  }
};

Dimension Constraints

  • Minimum: 128px on any side
  • Maximum: 4096px on any side
  • Must be even numbers: Both width and height must be divisible by 2
function validateDimensions(width: number, height: number): boolean {
  if (width < 128 || height < 128) {
    throw new Error("Dimensions must be at least 128px");
  }
  if (width > 4096 || height > 4096) {
    throw new Error("Dimensions cannot exceed 4096px");
  }
  if (width % 2 !== 0 || height % 2 !== 0) {
    throw new Error("Dimensions must be even numbers");
  }
  return true;
}

Resolution vs. Credit Cost

Higher resolutions may consume more credits:

| Resolution | Relative Cost |
|------------|---------------|
| 720p | Base rate |
| 1080p | ~1.5x base rate |

Consider using 720p for drafts and testing, then 1080p for final output.
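The draft-then-final workflow can be expressed as a small helper. This is an illustrative sketch, not part of the HeyGen API; the 16:9 dimension values come from the resolution tables earlier in this document.

```typescript
// Sketch: render drafts at 720p to save credits, then re-render the
// approved cut at 1080p. Helper is illustrative, not a HeyGen API.
type Quality = "720p" | "1080p";

interface Dimensions {
  width: number;
  height: number;
}

function draftOrFinalDimensions(isDraft: boolean): Dimensions {
  const quality: Quality = isDraft ? "720p" : "1080p";
  return quality === "720p"
    ? { width: 1280, height: 720 }
    : { width: 1920, height: 1080 };
}

// Usage: iterate cheaply on drafts, pay the higher rate only once.
const draftDims = draftOrFinalDimensions(true);  // { width: 1280, height: 720 }
const finalDims = draftOrFinalDimensions(false); // { width: 1920, height: 1080 }
```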

Background Considerations

Match background image/video dimensions to your video dimensions:

// For 1080p landscape video
const config = {
  video_inputs: [
    {
      character: {...},
      voice: {...},
      background: {
        type: "image",
        url: "https://example.com/1920x1080-background.jpg" // Match video dimensions
      }
    }
  ],
  dimension: { width: 1920, height: 1080 }
};

Creating a Video Config Factory

interface VideoConfigOptions {
  script: string;
  avatarId: string;
  voiceId: string;
  platform: "youtube" | "tiktok" | "instagram_feed" | "instagram_story" | "linkedin";
  quality?: "720p" | "1080p";
}

function createVideoConfig(options: VideoConfigOptions) {
  const platformDimensions: Record<string, Dimensions> = {
    youtube: { width: 1920, height: 1080 },
    tiktok: { width: 1080, height: 1920 },
    instagram_feed: { width: 1080, height: 1080 },
    instagram_story: { width: 1080, height: 1920 },
    linkedin: { width: 1920, height: 1080 },
  };

  const dimension = platformDimensions[options.platform];

  // Scale down for 720p if requested
  if (options.quality === "720p") {
    dimension.width = Math.round((dimension.width * 720) / 1080);
    dimension.height = Math.round((dimension.height * 720) / 1080);
  }

  return {
    video_inputs: [
      {
        character: {
          type: "avatar",
          avatar_id: options.avatarId,
          avatar_style: "normal",
        },
        voice: {
          type: "text",
          input_text: options.script,
          voice_id: options.voiceId,
        },
      },
    ],
    dimension,
  };
}

// Usage
const tiktokVideo = createVideoConfig({
  script: "Hey everyone! Check this out!",
  avatarId: "josh_lite3_20230714",
  voiceId: "1bd001e7e50f421d891986aad5158bc8",
  platform: "tiktok",
  quality: "1080p",
});

File v2.8.0:references/photo-avatars.md


name: photo-avatars description: Creating avatars from photos (talking photos) for HeyGen

Photo Avatars (Talking Photos)

Photo avatars allow you to animate a static photo and make it speak. This is useful for creating personalized video content from portraits, headshots, or any suitable image.

Creating a Photo Avatar from an Uploaded Image

The workflow is: Upload Image → Create Avatar Group → Use in Video

Step 1: Upload the Image

Upload a portrait photo using the asset upload endpoint. The response includes an image_key which you'll use in the next step.

curl -X POST "https://upload.heygen.com/v1/asset" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: image/jpeg" \
  --data-binary '@./portrait.jpg'

Response:

{
  "code": 100,
  "data": {
    "id": "741299e941764988b432ed3a6757878f",
    "name": "741299e941764988b432ed3a6757878f",
    "file_type": "image",
    "url": "https://resource2.heygen.ai/image/.../original.jpg",
    "image_key": "image/741299e941764988b432ed3a6757878f/original.jpg"
  }
}

Important: Save the image_key field (not the id). The image_key is the S3 path used to create the photo avatar.

See assets.md for full upload details.

Step 2: Create Photo Avatar Group

Use the image_key from the upload response to create a photo avatar group. This processes the image and creates a usable photo avatar.

Endpoint: POST https://api.heygen.com/v2/photo_avatar/avatar_group/create

curl -X POST "https://api.heygen.com/v2/photo_avatar/avatar_group/create" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image_key": "image/741299e941764988b432ed3a6757878f/original.jpg",
    "name": "My Photo Avatar"
  }'

| Field | Type | Req | Description |
|-------|------|:---:|-------------|
| image_key | string | ✓ | S3 image key from upload response |
| name | string | ✓ | Display name for the avatar |
| generation_id | string | | If using AI-generated photo (see below) |

Response:

{
  "error": null,
  "data": {
    "id": "045c260bc0364727b2cbe50442c3a5bf",
    "image_url": "https://files2.heygen.ai/...",
    "created_at": 1771798135.777256,
    "name": "My Photo Avatar",
    "status": "pending",
    "group_id": "045c260bc0364727b2cbe50442c3a5bf",
    "is_motion": false,
    "business_type": "uploaded"
  }
}

The id (same as group_id) is your talking_photo_id for video generation.

Step 3: Wait for Processing

The photo avatar starts with status: "pending" and transitions to "completed" within seconds. Poll the status endpoint:

Endpoint: GET https://api.heygen.com/v2/photo_avatar/{id}

curl "https://api.heygen.com/v2/photo_avatar/045c260bc0364727b2cbe50442c3a5bf" \
  -H "X-Api-Key: $HEYGEN_API_KEY"

Wait until status is "completed" before using in video generation.

Step 4: Use in Video Generation

Use the photo avatar id as talking_photo_id:

const videoConfig = {
  video_inputs: [
    {
      character: {
        type: "talking_photo",
        talking_photo_id: "045c260bc0364727b2cbe50442c3a5bf",
      },
      voice: {
        type: "text",
        input_text: "Hello! This is my photo avatar speaking.",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
    },
  ],
  dimension: { width: 1920, height: 1080 },
};

TypeScript: Complete Workflow

import fs from "fs";

interface AssetUploadResponse {
  code: number;
  data: {
    id: string;
    image_key: string;
    url: string;
  };
}

interface PhotoAvatarResponse {
  error: string | null;
  data: {
    id: string;
    group_id: string;
    image_url: string;
    name: string;
    status: string;
    is_motion: boolean;
    business_type: string;
  };
}

async function createPhotoAvatar(
  imagePath: string,
  name: string
): Promise<string> {
  // 1. Upload image
  const fileBuffer = fs.readFileSync(imagePath);
  const uploadResponse = await fetch("https://upload.heygen.com/v1/asset", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": "image/jpeg",
    },
    body: fileBuffer,
  });

  const uploadJson: AssetUploadResponse = await uploadResponse.json();
  if (uploadJson.code !== 100) {
    throw new Error("Upload failed");
  }

  const imageKey = uploadJson.data.image_key;

  // 2. Create avatar group
  const createResponse = await fetch(
    "https://api.heygen.com/v2/photo_avatar/avatar_group/create",
    {
      method: "POST",
      headers: {
        "X-Api-Key": process.env.HEYGEN_API_KEY!,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ image_key: imageKey, name }),
    }
  );

  const createJson: PhotoAvatarResponse = await createResponse.json();
  if (createJson.error) {
    throw new Error(createJson.error);
  }

  const photoAvatarId = createJson.data.id;

  // 3. Wait for processing
  await waitForPhotoAvatar(photoAvatarId);

  return photoAvatarId;
}

async function waitForPhotoAvatar(id: string): Promise<void> {
  for (let i = 0; i < 30; i++) {
    const response = await fetch(
      `https://api.heygen.com/v2/photo_avatar/${id}`,
      { headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! } }
    );

    const json: PhotoAvatarResponse = await response.json();

    if (json.data.status === "completed") return;
    if (json.data.status === "failed") {
      throw new Error("Photo avatar processing failed");
    }

    await new Promise((r) => setTimeout(r, 2000));
  }

  throw new Error("Photo avatar processing timed out");
}

async function createVideoFromPhoto(
  photoPath: string,
  script: string,
  voiceId: string
): Promise<string> {
  // 1. Create photo avatar
  const talkingPhotoId = await createPhotoAvatar(photoPath, "Video Avatar");

  // 2. Generate video
  const response = await fetch("https://api.heygen.com/v2/video/generate", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      video_inputs: [
        {
          character: {
            type: "talking_photo",
            talking_photo_id: talkingPhotoId,
          },
          voice: {
            type: "text",
            input_text: script,
            voice_id: voiceId,
          },
        },
      ],
      dimension: { width: 1920, height: 1080 },
    }),
  });

  const { data } = await response.json();
  return data.video_id;
}

Python: Complete Workflow

import requests
import os
import time

def create_photo_avatar(image_path: str, name: str) -> str:
    api_key = os.environ["HEYGEN_API_KEY"]

    # 1. Upload image
    with open(image_path, "rb") as f:
        upload_resp = requests.post(
            "https://upload.heygen.com/v1/asset",
            headers={
                "X-Api-Key": api_key,
                "Content-Type": "image/jpeg",
            },
            data=f,
        )

    upload_data = upload_resp.json()
    if upload_data.get("code") != 100:
        raise Exception("Upload failed")

    image_key = upload_data["data"]["image_key"]

    # 2. Create avatar group
    create_resp = requests.post(
        "https://api.heygen.com/v2/photo_avatar/avatar_group/create",
        headers={
            "X-Api-Key": api_key,
            "Content-Type": "application/json",
        },
        json={"image_key": image_key, "name": name},
    )

    create_data = create_resp.json()
    if create_data.get("error"):
        raise Exception(create_data["error"])

    photo_avatar_id = create_data["data"]["id"]

    # 3. Wait for processing
    for _ in range(30):
        status_resp = requests.get(
            f"https://api.heygen.com/v2/photo_avatar/{photo_avatar_id}",
            headers={"X-Api-Key": api_key},
        )
        status = status_resp.json()["data"]["status"]
        if status == "completed":
            return photo_avatar_id
        if status == "failed":
            raise Exception("Photo avatar processing failed")
        time.sleep(2)

    raise Exception("Photo avatar processing timed out")

Listing Existing Talking Photos

Retrieve all talking photos in your account:

Endpoint: GET https://api.heygen.com/v1/talking_photo.list

curl "https://api.heygen.com/v1/talking_photo.list" \
  -H "X-Api-Key: $HEYGEN_API_KEY"

Response:

{
  "code": 100,
  "data": [
    {
      "id": "ef0ed70f72c6497793e5e36e434d2aea",
      "image_url": "https://files2.heygen.ai/talking_photo/.../image.WEBP",
      "circle_image": ""
    }
  ]
}

Each id can be used as talking_photo_id in video generation.
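The listing call can be wrapped in a fetch helper following the same conventions as the other TypeScript examples in this document. The response shape mirrors the JSON above.

```typescript
interface TalkingPhoto {
  id: string;
  image_url: string;
  circle_image: string;
}

interface TalkingPhotoListResponse {
  code: number;
  data: TalkingPhoto[];
}

// List all talking photos in the account. Each returned id can be
// used as talking_photo_id in /v2/video/generate.
async function listTalkingPhotos(): Promise<TalkingPhoto[]> {
  const response = await fetch("https://api.heygen.com/v1/talking_photo.list", {
    headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! },
  });

  const json: TalkingPhotoListResponse = await response.json();
  if (json.code !== 100) {
    throw new Error(`Listing talking photos failed with code ${json.code}`);
  }
  return json.data;
}
```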

Adding Photos to an Existing Group

Add additional photo looks to an existing avatar group:

Endpoint: POST https://api.heygen.com/v2/photo_avatar/avatar_group/add

async function addPhotosToGroup(
  groupId: string,
  imageKeys: string[],
  name: string
): Promise<void> {
  const response = await fetch(
    "https://api.heygen.com/v2/photo_avatar/avatar_group/add",
    {
      method: "POST",
      headers: {
        "X-Api-Key": process.env.HEYGEN_API_KEY!,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        group_id: groupId,
        image_keys: imageKeys,
        name,
      }),
    }
  );

  const json = await response.json();
  if (json.error) {
    throw new Error(json.error);
  }
}

Training a Photo Avatar Group

Train the avatar group for improved animation quality:

Endpoint: POST https://api.heygen.com/v2/photo_avatar/train

curl -X POST "https://api.heygen.com/v2/photo_avatar/train" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"group_id": "045c260bc0364727b2cbe50442c3a5bf"}'

Check training status:

Endpoint: GET https://api.heygen.com/v2/photo_avatar/train/status/{group_id}
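A polling loop for training status might follow the same pattern as the other status checks in this document. The response is assumed to carry a `data.status` string, and the `"ready"`/`"failed"` values are assumptions; confirm them against a real response before relying on this sketch.

```typescript
// Sketch of a training-status poll. The "ready" and "failed" status
// values are assumptions, not documented here -- verify against a
// real response from your account before using in production.
async function waitForTraining(groupId: string): Promise<void> {
  for (let i = 0; i < 60; i++) {
    const response = await fetch(
      `https://api.heygen.com/v2/photo_avatar/train/status/${groupId}`,
      { headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! } }
    );

    const json = await response.json();
    const status: string | undefined = json?.data?.status;

    if (status === "ready") return; // assumed success value
    if (status === "failed") {
      throw new Error("Avatar group training failed");
    }

    await new Promise((r) => setTimeout(r, 5000));
  }

  throw new Error("Avatar group training timed out");
}
```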

Avatar IV Video Generation

Avatar IV is HeyGen's latest photo avatar technology with improved quality and natural motion. It generates a video directly from an uploaded image, bypassing the avatar group creation step.

Endpoint: POST https://api.heygen.com/v2/video/av4/generate

curl -X POST "https://api.heygen.com/v2/video/av4/generate" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image_key": "image/741299e941764988b432ed3a6757878f/original.jpg",
    "script": "Hello! This is Avatar IV with enhanced quality.",
    "voice_id": "1bd001e7e50f421d891986aad5158bc8",
    "video_orientation": "landscape",
    "video_title": "My Avatar IV Video"
  }'

| Field | Type | Req | Description |
|-------|------|:---:|-------------|
| image_key | string | ✓ | S3 image key from asset upload |
| script | string | ✓ | Text for the avatar to speak |
| voice_id | string | ✓ | Voice to use |
| video_orientation | string | | "portrait", "landscape", or "square" |
| video_title | string | | Title for the video |
| fit | string | | "cover" or "contain" |
| custom_motion_prompt | string | | Motion/expression description |
| enhance_custom_motion_prompt | boolean | | Enhance the motion prompt with AI |

TypeScript

interface AvatarIVRequest {
  image_key: string;
  script: string;
  voice_id: string;
  video_orientation?: "portrait" | "landscape" | "square";
  video_title?: string;
  fit?: "cover" | "contain";
  custom_motion_prompt?: string;
  enhance_custom_motion_prompt?: boolean;
}

interface AvatarIVResponse {
  error: null | string;
  data: {
    video_id: string;
  };
}

async function generateAvatarIVVideo(
  config: AvatarIVRequest
): Promise<string> {
  const response = await fetch(
    "https://api.heygen.com/v2/video/av4/generate",
    {
      method: "POST",
      headers: {
        "X-Api-Key": process.env.HEYGEN_API_KEY!,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(config),
    }
  );

  const json: AvatarIVResponse = await response.json();

  if (json.error) {
    throw new Error(json.error);
  }

  return json.data.video_id;
}

Avatar IV Options

| Orientation | Dimensions | Use Case |
|-------------|------------|----------|
| portrait | 720x1280 | TikTok, Stories |
| landscape | 1280x720 | YouTube, Web |
| square | 720x720 | Instagram Feed |

| Fit | Description |
|-----|-------------|
| cover | Fill the frame, may crop edges |
| contain | Fit entire image, may show background |

Custom Motion Prompts

const videoId = await generateAvatarIVVideo({
  image_key: "image/.../original.jpg",
  script: "Let me tell you about our product.",
  voice_id: "1bd001e7e50f421d891986aad5158bc8",
  custom_motion_prompt: "nodding head and smiling",
  enhance_custom_motion_prompt: true,
});

Generating AI Photo Avatars

Generate synthetic photo avatars from text descriptions instead of uploading a photo.

Endpoint: POST https://api.heygen.com/v2/photo_avatar/photo/generate

IMPORTANT: All 8 fields are REQUIRED. The API will reject requests missing any field. When a user asks to "generate an AI avatar of a professional man", you need to ask for or select values for ALL fields below.

Required Fields (ALL must be provided)

| Field | Type | Allowed Values |
|-------|------|----------------|
| name | string | Name for the generated avatar |
| age | enum | "Young Adult", "Early Middle Age", "Late Middle Age", "Senior", "Unspecified" |
| gender | enum | "Woman", "Man", "Unspecified" |
| ethnicity | enum | "White", "Black", "Asian American", "East Asian", "South East Asian", "South Asian", "Middle Eastern", "Pacific", "Hispanic", "Unspecified" |
| orientation | enum | "square", "horizontal", "vertical" |
| pose | enum | "half_body", "close_up", "full_body" |
| style | enum | "Realistic", "Pixar", "Cinematic", "Vintage", "Noir", "Cyberpunk", "Unspecified" |
| appearance | string | Text prompt describing appearance (clothing, mood, lighting, etc). Max 1000 chars |

curl Example

curl -X POST "https://api.heygen.com/v2/photo_avatar/photo/generate" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Sarah Product Demo",
    "age": "Young Adult",
    "gender": "Woman",
    "ethnicity": "White",
    "orientation": "horizontal",
    "pose": "half_body",
    "style": "Realistic",
    "appearance": "Professional woman with a friendly smile, wearing a navy blue blazer over a white blouse, soft studio lighting, clean neutral background"
  }'

Response:

{
  "error": null,
  "data": {
    "generation_id": "6a7f7f2795de4599bec7cf1e06babe30"
  }
}

Check Generation Status

Endpoint: GET https://api.heygen.com/v2/photo_avatar/generation/{generation_id}

The response includes multiple generated images to choose from:

{
  "error": null,
  "data": {
    "id": "6a7f7f2795de4599bec7cf1e06babe30",
    "status": "success",
    "image_url_list": [
      "https://resource2.heygen.ai/photo_generation/.../image1.jpg",
      "https://resource2.heygen.ai/photo_generation/.../image2.jpg",
      "https://resource2.heygen.ai/photo_generation/.../image3.jpg",
      "https://resource2.heygen.ai/photo_generation/.../image4.jpg"
    ],
    "image_key_list": [
      "photo_generation/.../image1.jpg",
      "photo_generation/.../image2.jpg",
      "photo_generation/.../image3.jpg",
      "photo_generation/.../image4.jpg"
    ]
  }
}

TypeScript

interface GeneratePhotoAvatarRequest {
  name: string;
  age: "Young Adult" | "Early Middle Age" | "Late Middle Age" | "Senior" | "Unspecified";
  gender: "Woman" | "Man" | "Unspecified";
  ethnicity: "White" | "Black" | "Asian American" | "East Asian" | "South East Asian" | "South Asian" | "Middle Eastern" | "Pacific" | "Hispanic" | "Unspecified";
  orientation: "square" | "horizontal" | "vertical";
  pose: "half_body" | "close_up" | "full_body";
  style: "Realistic" | "Pixar" | "Cinematic" | "Vintage" | "Noir" | "Cyberpunk" | "Unspecified";
  appearance: string;
}

interface GeneratePhotoAvatarResponse {
  error: string | null;
  data: {
    generation_id: string;
  };
}

interface PhotoGenerationStatus {
  error: string | null;
  data: {
    id: string;
    status: "pending" | "processing" | "success" | "failed";
    msg: string | null;
    image_url_list?: string[];
    image_key_list?: string[];
  };
}

async function generatePhotoAvatar(
  config: GeneratePhotoAvatarRequest
): Promise<string> {
  const response = await fetch(
    "https://api.heygen.com/v2/photo_avatar/photo/generate",
    {
      method: "POST",
      headers: {
        "X-Api-Key": process.env.HEYGEN_API_KEY!,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(config),
    }
  );

  const json: GeneratePhotoAvatarResponse = await response.json();

  if (json.error) {
    throw new Error(`Photo avatar generation failed: ${json.error}`);
  }

  return json.data.generation_id;
}

async function waitForPhotoGeneration(
  generationId: string
): Promise<string[]> {
  for (let i = 0; i < 60; i++) {
    const response = await fetch(
      `https://api.heygen.com/v2/photo_avatar/generation/${generationId}`,
      { headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! } }
    );

    const json: PhotoGenerationStatus = await response.json();

    if (json.error) throw new Error(json.error);

    if (json.data.status === "success") {
      return json.data.image_key_list!;
    }

    if (json.data.status === "failed") {
      throw new Error(json.data.msg ?? "Photo generation failed");
    }

    await new Promise((r) => setTimeout(r, 5000));
  }

  throw new Error("Photo generation timed out");
}

AI Photo → Avatar Group → Video

Use a generated AI photo to create an avatar group, then generate a video:

// 1. Generate AI photo
const generationId = await generatePhotoAvatar({
  name: "Product Demo Host",
  age: "Young Adult",
  gender: "Woman",
  ethnicity: "Unspecified",
  orientation: "horizontal",
  pose: "half_body",
  style: "Realistic",
  appearance: "Professional woman, navy blazer, friendly smile, soft lighting",
});

// 2. Wait for generation and pick first result
const imageKeys = await waitForPhotoGeneration(generationId);
const selectedImageKey = imageKeys[0];

// 3. Create avatar group from the AI photo
const createResponse = await fetch(
  "https://api.heygen.com/v2/photo_avatar/avatar_group/create",
  {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      image_key: selectedImageKey,
      name: "Product Demo Host",
      generation_id: generationId,
    }),
  }
);

const { data } = await createResponse.json();
const talkingPhotoId = data.id;

// 4. Generate video (after status is "completed")
const videoId = await generateVideo({
  video_inputs: [{
    character: {
      type: "talking_photo",
      talking_photo_id: talkingPhotoId,
    },
    voice: {
      type: "text",
      input_text: "Welcome to our product demo!",
      voice_id: "1bd001e7e50f421d891986aad5158bc8",
    },
  }],
  dimension: { width: 1920, height: 1080 },
});

Pre-Generation Checklist

Before calling the AI generation API, ensure you have values for ALL fields:

| # | Field | Question to Ask / Default |
|---|-------|---------------------------|
| 1 | name | What should we call this avatar? |
| 2 | age | Young Adult / Early Middle Age / Late Middle Age / Senior? |
| 3 | gender | Woman / Man? |
| 4 | ethnicity | Which ethnicity? (see enum values above) |
| 5 | orientation | horizontal (landscape) / vertical (portrait) / square? |
| 6 | pose | half_body (recommended) / close_up / full_body? |
| 7 | style | Realistic (recommended) / Cinematic / other? |
| 8 | appearance | Describe clothing, expression, lighting, background |

If the user only provides a vague request like "create a professional looking man", ask them to specify the missing fields or apply reasonable defaults (e.g., "Early Middle Age", "Realistic" style, "half_body" pose, "horizontal" orientation).
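One way to handle vague requests programmatically is a defaults helper that completes any unspecified fields before calling the generation API. This is a sketch: the interface mirrors the required fields documented above, and the default choices are just the checklist's suggestions, not API-mandated values.

```typescript
// Hypothetical defaults helper: fills unspecified fields with the
// checklist's suggested defaults. The defaults are editorial
// suggestions, not values required by the API.
interface PhotoAvatarFields {
  name: string;
  age: string;
  gender: string;
  ethnicity: string;
  orientation: string;
  pose: string;
  style: string;
  appearance: string;
}

function withAvatarDefaults(
  partial: Partial<PhotoAvatarFields> &
    Pick<PhotoAvatarFields, "name" | "appearance">
): PhotoAvatarFields {
  return {
    age: "Early Middle Age",
    gender: "Unspecified",
    ethnicity: "Unspecified",
    orientation: "horizontal",
    pose: "half_body",
    style: "Realistic",
    ...partial, // caller-supplied values override the defaults
  };
}

// Usage: "a professional looking man" becomes a complete request.
const request = withAvatarDefaults({
  name: "Professional Man",
  gender: "Man",
  appearance: "Professional man in a charcoal suit, confident smile",
});
```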

Appearance Prompt Tips

The appearance field is a free-text prompt; be descriptive:

Good prompts:

  • "Professional woman with shoulder-length brown hair, wearing a light blue button-down shirt, warm friendly smile, soft studio lighting, clean white background"
  • "Young man with short black hair, casual tech startup style, wearing a dark hoodie, confident expression, modern office background with plants"

Avoid:

  • Vague descriptions: "a nice person"
  • Conflicting attributes
  • Requesting specific real people

Managing Photo Avatars

Get Photo Avatar Details

Endpoint: GET https://api.heygen.com/v2/photo_avatar/{id}

async function getPhotoAvatar(id: string): Promise<PhotoAvatarResponse> {
  const response = await fetch(
    `https://api.heygen.com/v2/photo_avatar/${id}`,
    { headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! } }
  );
  return response.json();
}

Delete Photo Avatar

Endpoint: DELETE https://api.heygen.com/v2/photo_avatar/{id}

async function deletePhotoAvatar(id: string): Promise<void> {
  const response = await fetch(
    `https://api.heygen.com/v2/photo_avatar/${id}`,
    {
      method: "DELETE",
      headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! },
    }
  );

  if (!response.ok) {
    throw new Error("Failed to delete photo avatar");
  }
}

Delete Photo Avatar Group

Endpoint: DELETE https://api.heygen.com/v2/photo_avatar_group/{group_id}

async function deletePhotoAvatarGroup(groupId: string): Promise<void> {
  const response = await fetch(
    `https://api.heygen.com/v2/photo_avatar_group/${groupId}`,
    {
      method: "DELETE",
      headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! },
    }
  );

  if (!response.ok) {
    throw new Error("Failed to delete photo avatar group");
  }
}

API Reference

| Endpoint | Method | Description |
|----------|--------|-------------|
| upload.heygen.com/v1/asset | POST | Upload image (returns image_key) |
| /v2/photo_avatar/avatar_group/create | POST | Create photo avatar from image_key |
| /v2/photo_avatar/avatar_group/add | POST | Add photos to existing group |
| /v2/photo_avatar/train | POST | Train avatar group |
| /v2/photo_avatar/train/status/{group_id} | GET | Check training status |
| /v2/photo_avatar/{id} | GET | Get photo avatar details/status |
| /v2/photo_avatar/{id} | DELETE | Delete photo avatar |
| /v2/photo_avatar_group/{id} | DELETE | Delete avatar group |
| /v2/photo_avatar/photo/generate | POST | Generate AI photo from text |
| /v2/photo_avatar/generation/{id} | GET | Check AI generation status |
| /v2/video/av4/generate | POST | Avatar IV video from image_key |
| /v1/talking_photo.list | GET | List all existing talking photos |
| /v2/video/generate | POST | Generate video with talking_photo_id |

Photo Requirements

Technical Requirements

| Aspect | Requirement |
|--------|-------------|
| Format | JPEG, PNG |
| Resolution | Minimum 512x512px |
| File size | Under 10MB |
| Face visibility | Clear, front-facing |
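Part of these requirements can be checked client-side before spending an upload. This sketch covers only what a quick local check can verify (extension and file size); resolution and face visibility need an image library or the API itself.

```typescript
import fs from "fs";
import path from "path";

// Client-side pre-upload check against the technical requirements
// above. Only extension and file size are validated here; resolution
// and face visibility are left to the API.
function checkPhotoFile(filePath: string): void {
  const ext = path.extname(filePath).toLowerCase();
  if (![".jpg", ".jpeg", ".png"].includes(ext)) {
    throw new Error("Photo must be JPEG or PNG");
  }

  const sizeBytes = fs.statSync(filePath).size;
  if (sizeBytes > 10 * 1024 * 1024) {
    throw new Error("Photo must be under 10MB");
  }
}
```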

Quality Guidelines

  1. Lighting - Even, natural lighting on face
  2. Expression - Neutral or slight smile
  3. Background - Simple, uncluttered
  4. Face position - Centered, not cut off
  5. Clarity - Sharp, in focus
  6. Angle - Straight-on or slight angle

Best Practices

  1. Use high-quality photos - Better input = better output
  2. Front-facing portraits - Work best for animation
  3. Neutral expressions - Allow for more natural animation
  4. Use Avatar IV for best quality - Latest generation technology
  5. Train avatar groups - Improves animation quality
  6. Reuse photo avatar IDs - Once created, use the same talking_photo_id across multiple videos

Limitations

  • Photo quality significantly affects output
  • Side-profile photos have limited support
  • Full-body photos may not animate properly
  • Some expressions may look unnatural
  • Processing time varies by complexity

File v2.8.0:references/prompt-examples.md


name: prompt-examples description: Full production prompt examples and ready-to-use templates for Video Agent

Video Agent Prompt Examples

Full Example: Brief to Production Prompt

Input Brief

Topic: Monthly company report for a SaaS startup
Key data: $141M ARR (up from $54M), 1.85M signups (+28%), 3M paid videos/month
Customer story: Creator built AI character, 2.5M followers, 20 min/video
Challenge: Organic traffic volatile, -16% last week
Duration: ~90 seconds
Tone: Confident CEO, data-backed

Output Prompt

FORMAT: Bloomberg-style company report. 90 seconds. Fast-paced, data-dense.
Record-breaking month. Proud but analytical.

TONE: Confident, direct, data-backed. Highlights hit hard with numbers.
Customer stories are the emotional core. Challenges are honest — no spin.

AVATAR: Man in simple black crew-neck tee, standing in a modern glass-walled
office at golden hour. Behind him, a wall-mounted display shows the company logo
in soft blue glow. Monitor to his right shows a dashboard with upward-trending
charts. Desk beside him: laptop, half-empty flat white, scattered sticky notes.
Warm afternoon light through floor-to-ceiling windows, long shadows on polished
concrete. Minimal, focused startup HQ.

STYLE — SWISS PULSE (Müller-Brockmann): Grid-locked compositions. Black (#1a1a1a),
white, electric blue (#0066FF), warm amber (#FF9500) for records. Helvetica Bold
headlines, Regular labels. Numbers LARGE. Animated counters count up from 0.
Diagonal compositions on accent moments. Grid wipe transitions. No dissolves.

CRITICAL ON-SCREEN TEXT (display literally):
- "1.85M SIGNUPS — +28% MoM"
- "$2.12M NEW SUBSCRIPTION REVENUE"
- "$54M → $141M ARR"
- "2.5M FOLLOWERS" and "20 MIN / VIDEO"
- Quote: "Use technology to serve the message, not distract from it."
- "ORGANIC: 65% OF SUBS — VOLATILE"

MUSIC: Upbeat electronic with a driving beat. Tycho meets Bloomberg opening theme.
Builds through highlights, warms for customer story, softens for challenges, peaks
on close.

---

SCENE 1 — A-ROLL (8s)
[Avatar center-frame, energetic, leaning slightly forward]
VOICEOVER: "January was a record month. New highs across acquisition, revenue,
and product velocity. Here's the full picture."
Lower-third SLIDES in: "COMPANY NAME | JANUARY 2026" white on blue bar.
Grid wipe.

SCENE 2 — FULL SCREEN B-ROLL (12s)
[NO AVATAR — motion graphic only]
VOICEOVER: "One-point-eight-five million signups — twenty-eight percent month
over month. Two-point-one-two million in new subscription revenue. Both all-time
highs."
LAYER 1: Dark #1a1a1a background with thin grid lines pulsing at 8% opacity.
LAYER 2: "1.85M" SLAMS in from left, white Bold 140pt. "SIGNUPS" types on
         in electric blue 32pt uppercase. "+28% MoM" appears in amber.
LAYER 3: Three stat cards CASCADE from top-right, staggered 0.3s:
         "$2.12M New Revenue" — "$3.4M Business ARR" — "$3M Pro ARR."
         Each number COUNTS UP from 0.
LAYER 4: Bottom ticker scrolls: "Non-brand search +36% • Brand impressions 9.2M
         • Weekly subs +20.5%"
LAYER 5: Grid lines RIPPLE outward on "1.85M" slam. Diagonal amber bar behind
         stat cards.
Hard cut.

SCENE 3 — FULL SCREEN B-ROLL (12s)
[NO AVATAR — motion graphic only]
VOICEOVER: "Zoom out. Twelve months ago — fifty-four million ARR. Today —
one hundred forty-one million. Nearly three X in a single year."
LAYER 1: Dark background, subtle grid scrolling upward.
LAYER 2: Animated line chart DRAWS ITSELF left to right. Y-axis: $50M to $150M.
         Final point "$140.84M" glows amber and pulses.
LAYER 3: Milestone annotations float in at key data points.
LAYER 4: Second smaller chart below — "Paid Videos" 0.91M to 2.97M, same style.
LAYER 5: Thin grid lines converge toward final data point. Scan line sweeps.
Grid wipe.

SCENE 4 — A-ROLL (8s)
[Avatar center-frame, warm tone, genuine smile]
VOICEOVER: "But the numbers only tell half the story. The other half is the
people building on the platform."
Lower-third: "Customer Spotlight"

SCENE 5 — FULL SCREEN B-ROLL (12s)
[NO AVATAR — warm palette]
VOICEOVER: "An AI character built entirely on the platform. Twenty minutes
per video. Two-point-five million Instagram followers. The creator's principle:
use technology to serve the message, not distract from it."
LAYER 1: Dark background with warm amber grid lines at low opacity.
LAYER 2: "CHARACTER NAME" in large white, center-top, 80pt.
LAYER 3: Stats cascade from right: "2.5M Followers" COUNTS UP in amber —
         "20 min/video" — "7x Faster." Each a glowing node.
LAYER 4: Quote card SLIDES UP: "Use technology to serve the message, not
         distract from it." Types on word by word.
LAYER 5: Warm light bloom. Grid lines soften into curved arcs.
Grid wipe.

SCENE 6 — A-ROLL (10s)
[Avatar center-frame, serious/candid]
VOICEOVER: "Now the honest part. Organic drives sixty-five percent of
subscriptions and it's volatile. Non-brand traffic dropped sixteen percent
last week. We've rebuilt attribution and we're investing in SEO."
Lower-third: "Challenges"

SCENE 7 — A-ROLL (7s)
[Avatar center-frame, energy lifts, direct eye contact]
VOICEOVER: "Fifty-four million to one-forty-one in twelve months. Three million
paid videos a month. January set the bar — now we raise it."
End card: Logo centered, blue glow fade-in. Grid lines converge. Music peaks.

---

NARRATION STYLE: CEO energy — conviction backed by data. Fast on highlights.
Warm on customer stories. Candid on challenges. Close with forward momentum.

Ready-to-Use Templates

Tech News Briefing

FORMAT: 75-second high-energy tech briefing. Think: Bloomberg meets Vice.

AVATAR: [Presenter in tech-casual at a multi-monitor station.
Describe clothing, monitor content, desk items, lighting.]

STYLE — DECONSTRUCTED (Brody): Dark grey #1a1a1a, rust orange #D4501E.
Type at angles, overlapping. Gritty textures. Smash cut transitions.

CRITICAL ON-SCREEN TEXT:
- [List every stat, quote, handle that must appear]

SCENE 1 — A-ROLL (8s): Hook with energy. State what's happening.
SCENE 2 — B-ROLL (12s): First story with layered visuals (L1-L5).
SCENE 3 — A-ROLL + OVERLAY (10s): Second story, split frame.
SCENE 4 — B-ROLL (10s): Third story or dramatic data point.
SCENE 5 — A-ROLL (8s): Wrap-up and forward look.

Product Comparison

FORMAT: 60-second comparison. [Product A] vs [Product B]. Data-driven.

AVATAR: [Presenter in review studio. Desk with both products visible.]

STYLE — DIGITAL GRID (Crouwel): Dark #0a0a0a, cyan #00D4FF and amber #FFB800.
Two-color coding: cyan = Product A, amber = Product B. Monospaced type.

CRITICAL ON-SCREEN TEXT:
- [Key stats for each product]
- [Pricing, features, differentiators]

Use SPLIT FRAME B-roll: Product A left, Product B right.

Strategy Presentation

FORMAT: 90-second strategy briefing. Bloomberg meets board meeting.

AVATAR: [Executive in blazer over tee. Conference room with whiteboard frameworks.]

STYLE — SWISS PULSE (Müller-Brockmann): Black/white + blue #0066FF.
Grid-locked. Helvetica. Animated counters. Grid wipe transitions.

CRITICAL ON-SCREEN TEXT:
- [Framework labels, quadrant labels, key quotes]

Build frameworks visually: draw axes, plot positions, animate labels.

Social Ad (30 seconds)

FORMAT: 30-second social ad. Maximum energy. Portrait 9:16.

AVATAR: [Creator-style presenter. Ring light, colorful background.]

STYLE — CARNIVAL SURGE (Lins): Hot pink, yellow, teal. Collage layering.
Text MASSIVE at angles. Confetti. Smash cuts.

Three scenes: Hook (8s) → Value prop (12s) → CTA (10s).
Text fills 50-80% of every frame. Numbers SLAM.

Premium Report

FORMAT: 120-second investor-grade report. Understated authority.

AVATAR: [Tailored merino sweater. Architectural room, diffused natural light.]

STYLE — VELVET STANDARD (Vignelli): Black, white, gold #c9a84c.
Thin ALL CAPS, wide spacing. Generous negative space.
Slow cross-dissolves. Numbers fade in with weight.

File v2.8.0:references/prompt-optimizer.md


name: prompt-optimizer
description: Write production-quality prompts for HeyGen Video Agent — from basic ideas to fully art-directed scene-by-scene scripts

Video Agent Prompt Optimizer

Write effective prompts for the HeyGen Video Agent API. Based on patterns from 40+ produced videos.

The core insight: Video Agent is an HTML interpreter. It renders layouts, typography, and structured content natively. Describe B-roll as layered text motion graphics with action verbs ("slams in," "types on," "counts up") — not layout specs ("upper-left, 48pt").

Reference Files

| File | Load when... |
|------|--------------|
| visual-styles.md | Choosing a visual style (20 styles with full specs) |
| prompt-examples.md | Writing a prompt from scratch (full production example + templates) |

Workflow: Brief to Prompt

  1. Pull data — Research the topic: web search, APIs, internal docs. Gather real quotes, stats, handles
  2. Synthesize a thesis — Not a list. A story. "X is happening because Y — here's the proof." Group into 3-5 themes with a narrative arc
  3. Choose a style — Match mood first, content second. Ask: "What should the viewer FEEL?" See visual-styles.md
  4. Write the avatar — Thematic wardrobe matching content's emotional context. Brand logos and content-specific props in the set (see Avatar Guide below)
  5. Extract critical text — List every number, quote, handle, and label that must appear literally
  6. Break into scenes — One concept per scene. Rotate scene types. Never 3+ of same type in a row. At least 2 pure B-roll scenes
  7. Write voiceover — Spell out numbers in VO ("one-point-eight-five million"), use figures on screen ("1.85M"). Narration on EVERY scene including B-roll
  8. Layer each B-roll scene — L1 background, L2 hero, L3 supporting, L4 info bar, L5 effects. Every element must MOVE
  9. Add music direction — Reference artists, describe energy arc
  10. Add narration style — How to deliver: fast/slow, where to pause, emotional register per section

Prompt Anatomy

Every production-quality prompt follows this structure:

FORMAT:    What kind of video, how long, what energy
TONE:      Emotional register, references
AVATAR:    Detailed physical + environment description (60-100 words)
STYLE:     Named aesthetic with colors, typography, motion rules, transitions
CRITICAL ON-SCREEN TEXT:  Exact strings that must appear
SCENE-BY-SCENE:  Individual scene breakdowns with VO and layered visuals
MUSIC:     Genre, reference artists, energy arc
NARRATION STYLE:  How to deliver the voiceover

FORMAT

FORMAT: 75-second high-energy tech daily briefing. Think: a creator who just got amazing news.
FORMAT: Bloomberg-style strategy briefing. 100-120 seconds. CEO-delivered.

TONE

TONE: Confident, direct, data-backed. Highlights hit hard. Lowlights are honest — no spin.
TONE: Edgy, punk tech commentary. Vice News meets The Face magazine — raw, confrontational.

CRITICAL ON-SCREEN TEXT

List every exact string that must appear on screen. Without this, the agent may summarize, round numbers, or rephrase quotes.

CRITICAL ON-SCREEN TEXT (display literally):
- "$141M ARR — All-Time High"
- "1.85M Signups — +28% MoM"
- Quote: "Use technology to serve the message, not distract from it." — Shalev Hani
- "@username" — exact social handle

MUSIC & NARRATION

MUSIC: Driving electronic, heavy bass drops on key numbers. Run the Jewels meets
a tech keynote. Builds relentlessly, only softens for customer stories.

NARRATION STYLE: High energy throughout. Let numbers PUNCH — pause before big ones,
then deliver hard. Customer stories get warmth. The close should feel like a mic drop.

Avatar Description Guide

The avatar is NOT a fixed headshot — design it for each video like a movie character. Think costume designer + set designer.

Thematic Wardrobe Rule

The avatar's outfit and environment MUST match the content's emotional/cultural context:

| Content Type | Avatar Design | NOT This |
|--------------|---------------|----------|
| Chinese New Year | Red qipao with gold embroidery, lantern-lit courtyard | "Reporter in a blazer" |
| Breaking tech news | Field reporter, windswept hair, earpiece, city skyline | "Anchor at a desk" |
| Sleep science | Oversized cream knit, cross-legged on bed, warm lamp | "Analyst in a lab" |
| Reddit community | Messy desk, Reddit alien on monitors, upvote arrows on wall | "Researcher in a studio" |

What to Specify

| Element | Weak | Strong |
|---------|------|--------|
| Clothing | "Business casual" | "Black ribbed merino turtleneck, high collar framing jaw" |
| Environment | "An office" | "Glass-walled conference room. Whiteboard with hand-drawn tier pyramid" |
| Monitor content | "Computer screens" | "Monitor shows scrolling green terminal text and red security alerts" |
| Lighting | "Well lit" | "Cool blue monitor glow from left, warm amber desk lamp from right" |

Template

AVATAR: [Clothing — fabric, color, fit, accessories, posture].
[Setting — specific props, brand logos, what's on the walls].
[Monitors/desk — content visible on screens, items on desk].
[Lighting — direction, color temperature]. [Mood of the space].
60-100 words. 3+ content-specific props. Brand elements visible.

Scene Types

| Type | Format | When to Use |
|------|--------|-------------|
| A-ROLL | Avatar speaking to camera | Intros, key insights, CTAs, emotional beats |
| FULL SCREEN B-ROLL | No avatar — motion graphics only | Data visualization, information-dense content |
| A-ROLL + OVERLAY | Split frame: avatar + content | Presenting data while maintaining human connection |

Rotation is mandatory. Never 3+ of the same type in a row. Every prompt needs at least 2 pure B-roll scenes.

Voiceover on EVERY scene. Every B-roll scene MUST include a VOICEOVER: line. Silent B-roll = broken video.
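The rotation and voiceover rules above are mechanical enough to check before submitting a prompt. The sketch below is illustrative only — the `Scene` shape and `validateScenes` helper are assumptions for a local pre-flight check, not part of the HeyGen API:

```typescript
type SceneType = "A-ROLL" | "B-ROLL" | "A-ROLL + OVERLAY";

interface Scene {
  type: SceneType;
  voiceover: string;
}

// Returns a list of rule violations; an empty list means the scene plan
// satisfies the rotation and voiceover rules described above.
function validateScenes(scenes: Scene[]): string[] {
  const problems: string[] = [];

  // Never 3+ of the same type in a row.
  for (let i = 2; i < scenes.length; i++) {
    if (
      scenes[i].type === scenes[i - 1].type &&
      scenes[i].type === scenes[i - 2].type
    ) {
      problems.push(
        `Scenes ${i - 1}-${i + 1} repeat ${scenes[i].type} three times in a row`
      );
    }
  }

  // At least 2 pure B-roll scenes.
  const bRollCount = scenes.filter((s) => s.type === "B-ROLL").length;
  if (bRollCount < 2) {
    problems.push(`Only ${bRollCount} pure B-roll scene(s); need at least 2`);
  }

  // Voiceover on every scene, including B-roll.
  scenes.forEach((s, i) => {
    if (!s.voiceover.trim()) {
      problems.push(`Scene ${i + 1} has no voiceover`);
    }
  });

  return problems;
}
```

Running this over a drafted scene list before prompt assembly catches silent B-roll and monotonous rotations early, when they are cheap to fix.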

Scene Anatomy

A-ROLL:

SCENE 1 — A-ROLL (10s)
[Avatar center-frame, excited, hands gesturing]
VOICEOVER: "The exact script for this scene."
Lower-third: "TITLE TEXT" white on blue bar.

B-ROLL with layers:

SCENE 2 — FULL SCREEN B-ROLL (12s)
[NO AVATAR — motion graphic only]
VOICEOVER: "The exact script for this scene."
LAYER 1: Dark #1a1a1a background with subtle grid lines pulsing.
LAYER 2: "HEADLINE" SLAMS in from left in white Bold 100pt at -5 degrees.
LAYER 3: Three data cards CASCADE from right, staggered 0.3s.
LAYER 4: Bottom ticker SLIDES in: "supporting text scrolling continuously."
LAYER 5: Grid lines RIPPLE outward from impact point.
Hard cut.

A-ROLL + OVERLAY:

SCENE 3 — A-ROLL + OVERLAY (10s)
[SPLIT — Avatar LEFT 35%. Content RIGHT 65%. NO overlap.]
Avatar gestures toward content side.
VOICEOVER: "The exact script for this scene."
RIGHT SIDE: "HEADLINE" in cyan 60pt. Three stats COUNT UP below.

Alternate which side the avatar appears on between overlay scenes.

The Visual Layer System

Break B-roll into 5 stacked layers. This is the most powerful technique for motion graphics scenes.

| Layer | Purpose | Examples |
|-------|---------|----------|
| L1 | Background | Textured surface, grid, gradient, color field |
| L2 | Hero content | Main headline/number that dominates the frame |
| L3 | Supporting data | Cards, stats, bullet points, secondary information |
| L4 | Information bar | Tickers, labels, source attributions, quotes |
| L5 | Effects | Particles, glitches, grid animations, ambient motion |

Every B-roll: 4+ layers. Every overlay content side: 3+ layers. Every element must MOVE.

Motion Vocabulary

High Energy

| Verb | Example |
|------|---------|
| SLAMS | "$95M" SLAMS in from left at -5 degrees |
| CRASHES | Title CRASHES in from right, screen-shake on impact |
| PUNCHES | Quote card PUNCHES up from bottom |
| STAMPS | Data blocks STAMP in staggered 0.4s |
| SHATTERS | Text SHATTERS after 1.5s, revealing number underneath |

Medium Energy

| Verb | Example |
|------|---------|
| CASCADE | Three cards CASCADE from top, staggered 0.3s |
| SLIDES | Ticker SLIDES in from right — continuous scroll |
| DROPS | "TIER 1" DROPS in with white flash |
| FILLS | Progress bar FILLS 0 to 90% in orange |
| DRAWS | Chart line DRAWS itself left to right |

Low Energy

| Verb | Example |
|------|---------|
| types on | Quote types on word by word in italic white |
| fades in | Logo fades in at center, held for 3 seconds |
| FLOATS | Bokeh orbs FLOAT across frame at different speeds |
| morphs | Number morphs from 17 to 18.9 |
| COUNTS UP | "1.85M" COUNTS UP from 0 in amber 96pt |

Transition Types

| Transition | Energy | Styles It Fits |
|------------|--------|----------------|
| Smash cut | Aggressive | Deconstructed, Maximalist, Carnival Surge |
| White flash frame | Punchy | Deconstructed, Maximalist |
| Grid wipe | Systematic | Swiss Pulse, Digital Grid |
| Hard cut | Clean | Swiss Pulse, Shadow Cut |
| Liquid dissolve | Elegant | Data Drift, Dream State |
| Slow cross-dissolve | Refined | Velvet Standard |
| Pop cut / bounce | Fun | Play Mode, Carnival Surge |
| Snap cut | Urgent | Red Wire, Contact Sheet |
| Soft dissolve | Warm | Soft Signal, Warm Grain, Quiet Drama |
| Iris wipe | Nostalgic | Heritage Reel |

Timing Guidelines

| Content Type | Duration |
|--------------|----------|
| Hook/Intro (A-roll) | 6-10 seconds |
| Data-heavy B-roll | 10-15 seconds (NEVER ≤5s — causes black frames) |
| A-roll + Overlay | 8-12 seconds |
| CTA / Close (A-roll) | 6-8 seconds |

Common video lengths: Social clip: 30-45s (5-7 scenes) | Briefing: 60-75s (7-9 scenes) | Deep dive: 90-120s (10-13 scenes)

Speaking pace: ~150 words/minute. Calculate: words / 150 * 60 = seconds
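The pacing formula above can be wrapped in a small helper for sizing scenes from a drafted voiceover. Only the ~150 wpm constant comes from the guideline; the function itself is an illustrative sketch:

```typescript
const WORDS_PER_MINUTE = 150; // speaking-pace guideline from above

// Estimate a scene's duration in seconds from its voiceover script:
// seconds = words / 150 * 60, rounded to the nearest second.
function estimateSeconds(voiceover: string): number {
  const words = voiceover.trim().split(/\s+/).filter(Boolean).length;
  return Math.round((words / WORDS_PER_MINUTE) * 60);
}
```

A 25-word voiceover works out to roughly 10 seconds, which sits comfortably in the 10-15s band recommended for data-heavy B-roll.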

What Doesn't Work

Patterns that consistently produce poor results:

Layout language — Screen coordinates cause empty/black B-roll:

❌ "UPPER-LEFT: headline in 48pt Helvetica"
❌ "CENTER-SCREEN: display at coordinates (400, 300)"
✅ "135K" SLAMS in from left, white Impact 120pt, fills 40% of frame.

Named artists without specs — "Ikko Tanaka style" means nothing to Video Agent. Translate to concrete rules:

❌ "Use an Ikko Tanaka style"
✅ "Flat color blocks, maximum 3 colors per frame, 60% negative space, typography as primary element"

Style examples injected into prompts — Full example scenes from a style library confuse the agent. Use the style's rules, not example scenes.

Forced short B-roll (≤5 seconds) — Too short for rendering. Every tested video with 5s B-roll had empty/black screens. Use 10-15s.

Content as a list, not a story — "Here are 5 tweets" produces flat videos. Always synthesize: "X is happening because Y — here's the proof."

Production Insights

Style Performance (from 40+ videos)

| Rank | Style | Strength |
|------|-------|----------|
| 1 | Deconstructed (Brody) | Most reliable across all topics |
| 2 | Swiss Pulse (Müller-Brockmann) | Best for data-heavy content |
| 3 | Digital Grid (Crouwel) | Strong for tech topics |
| 4 | Geometric Bold (Tanaka) | Elegant and versatile |
| 5 | Maximalist Type (Scher) | High energy, use sparingly |

Duration by Approach

| Approach | Avg Duration | Quality |
|----------|--------------|---------|
| Natural storyboard + custom avatar | ~106s | Best |
| Natural storyboard, no custom avatar | ~69s | Good |
| Forced short scenes + custom avatar | ~71s | Mixed |
| Layout language prompts | ~48s | Poor |

Quality Checklist

  • [ ] Thesis-driven — story, not bullet points
  • [ ] Style named with colors, typography, motion, transitions (see visual-styles.md)
  • [ ] Avatar has thematic wardrobe + branded environment (60-100 words)
  • [ ] Critical text listed — every stat, quote, label
  • [ ] Scenes rotate types — never 3+ same type. At least 2 B-roll scenes
  • [ ] Every scene has VOICEOVER — including B-roll
  • [ ] B-roll scenes have 4+ layers, every element has motion verbs
  • [ ] B-roll scenes are 10-15 seconds (never ≤5s)
  • [ ] Brand logos appear when discussing companies
  • [ ] Every element moves — no static frames

File v2.8.0:references/quota.md


name: quota
description: Credit system, usage limits, and checking remaining quota for HeyGen

HeyGen Quota and Credits

HeyGen uses a credit-based system for video generation. Understanding quota management helps prevent failed video generation requests.

Checking Remaining Quota

curl

curl -X GET "https://api.heygen.com/v2/user/remaining_quota" \
  -H "X-Api-Key: $HEYGEN_API_KEY"

TypeScript

interface QuotaResponse {
  error: null | string;
  data: {
    remaining_quota: number;
    used_quota: number;
  };
}

const response = await fetch("https://api.heygen.com/v2/user/remaining_quota", {
  headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! },
});

const { data }: QuotaResponse = await response.json();
console.log(`Remaining credits: ${data.remaining_quota}`);

Python

import requests
import os

response = requests.get(
    "https://api.heygen.com/v2/user/remaining_quota",
    headers={"X-Api-Key": os.environ["HEYGEN_API_KEY"]}
)

data = response.json()["data"]
print(f"Remaining credits: {data['remaining_quota']}")

Response Format

{
  "error": null,
  "data": {
    "remaining_quota": 450,
    "used_quota": 50
  }
}

Credit Consumption

Different operations consume different amounts of credits:

| Operation | Credit Cost | Notes |
|-----------|-------------|-------|
| Standard video (1 min) | ~1 credit per minute | Varies by resolution |
| 720p video | Base rate | Standard quality |
| 1080p video | ~1.5x base rate | Higher quality |
| Video translation | Varies | Depends on video length |
| Streaming avatar | Per session | Real-time usage |
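The table's rates can be folded into a rough pre-flight estimator. Treat this as a sketch under the table's approximations (~1 credit/minute base, ~1.5x for 1080p) — actual billing may differ by account and plan:

```typescript
// Rough credit estimate from the consumption table above.
// These multipliers are approximations, not a billing contract.
function estimateCredits(
  durationSeconds: number,
  resolution: "720p" | "1080p" = "720p"
): number {
  const baseCredits = durationSeconds / 60; // ~1 credit per minute
  const multiplier = resolution === "1080p" ? 1.5 : 1.0;
  return Math.ceil(baseCredits * multiplier);
}
```

For example, a 90-second video estimates to 2 credits at 720p and 3 credits at 1080p; comparing that against `remaining_quota` before generation avoids mid-batch failures.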

Pre-Generation Quota Check

Always verify sufficient quota before generating videos:

async function generateVideoWithQuotaCheck(videoConfig: VideoConfig) {
  // Check quota first
  const quotaResponse = await fetch(
    "https://api.heygen.com/v2/user/remaining_quota",
    { headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! } }
  );

  const { data: quota } = await quotaResponse.json();

  // Estimate required credits (rough estimate: 1 credit per minute)
  const estimatedMinutes = videoConfig.estimatedDuration / 60;
  const requiredCredits = Math.ceil(estimatedMinutes);

  if (quota.remaining_quota < requiredCredits) {
    throw new Error(
      `Insufficient credits. Need ${requiredCredits}, have ${quota.remaining_quota}`
    );
  }

  // Proceed with video generation
  return generateVideo(videoConfig);
}

Quota Management Best Practices

1. Monitor Usage Regularly

async function logQuotaUsage() {
  const response = await fetch(
    "https://api.heygen.com/v2/user/remaining_quota",
    { headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! } }
  );

  const { data } = await response.json();

  console.log({
    remaining: data.remaining_quota,
    used: data.used_quota,
    percentUsed: (
      (data.used_quota / (data.remaining_quota + data.used_quota)) *
      100
    ).toFixed(1),
  });
}

2. Set Up Alerts

const QUOTA_WARNING_THRESHOLD = 50;

async function checkQuotaWithAlert() {
  const response = await fetch(
    "https://api.heygen.com/v2/user/remaining_quota",
    { headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! } }
  );

  const { data } = await response.json();

  if (data.remaining_quota < QUOTA_WARNING_THRESHOLD) {
    // Send alert (email, Slack, etc.)
    await sendAlert(`Low HeyGen quota: ${data.remaining_quota} credits remaining`);
  }

  return data;
}

3. Use Test Mode for Development

When available, use test mode to avoid consuming credits during development:

const videoConfig = {
  test: true, // Use test mode during development
  video_inputs: [...],
};

// Test videos may have watermarks but don't consume credits

Subscription Tiers

Different subscription tiers have different quota allocations and features:

| Tier | Features |
|------|----------|
| Free | Limited credits, basic features |
| Creator | More credits, standard avatars |
| Team | Higher limits, team collaboration |
| Enterprise | Custom limits, API access, priority support |

API access typically requires Enterprise tier or higher.

Error Handling for Quota Issues

async function handleQuotaError(error: any) {
  if (error.message.includes("quota") || error.message.includes("credit")) {
    console.error("Quota exceeded. Consider:");
    console.error("1. Upgrading your subscription");
    console.error("2. Waiting for quota reset");
    console.error("3. Purchasing additional credits");

    // Check current quota
    const quota = await getQuota();
    console.error(`Current remaining: ${quota.remaining_quota}`);
  }

  throw error;
}

Archive v2.6.0: 22 files, 86738 bytes

Files: references/assets.md (8484b), references/authentication.md (4770b), references/avatars.md (16134b), references/backgrounds.md (6696b), references/captions.md (5631b), references/dimensions.md (6975b), references/photo-avatars.md (15355b), references/prompt-optimizer.md (40889b), references/quota.md (4765b), references/remotion-integration.md (18550b), references/scripts.md (10143b), references/templates.md (9990b), references/text-overlays.md (6964b), references/text-to-speech.md (8693b), references/video-agent.md (9037b), references/video-generation.md (22105b), references/video-status.md (12892b), references/video-translation.md (11118b), references/voices.md (11892b), references/webhooks.md (9302b), SKILL.md (4302b), _meta.json (130b)

File v2.6.0:SKILL.md


name: heygen
description: |
  HeyGen AI video creation API. Use when: (1) Using Video Agent for one-shot prompt-to-video generation, (2) Generating AI avatar videos with /v2/video/generate, (3) Working with HeyGen avatars, voices, backgrounds, or captions, (4) Creating transparent WebM videos for compositing, (5) Polling video status or handling webhooks, (6) Integrating HeyGen with Remotion for programmatic video, (7) Translating or dubbing existing videos, (8) Generating standalone TTS audio with the Starfish model via /v1/audio.
homepage: https://docs.heygen.com/reference/generate-video-agent
metadata:
  openclaw:
    requires:
      env:
        - HEYGEN_API_KEY
    primaryEnv: HEYGEN_API_KEY

HeyGen API

AI avatar video creation API for generating talking-head videos, explainers, and presentations.

Default Workflow

Prefer Video Agent API (POST /v1/video_agent/generate) for most video requests. Always use prompt-optimizer.md guidelines to structure prompts with scenes, timing, and visual styles.

Only use /v2/video/generate when the user explicitly needs:

  • Exact script without AI modification
  • Specific voice_id selection
  • Different avatars/backgrounds per scene
  • Precise per-scene timing control
  • Programmatic/batch generation with exact specs

Quick Reference

| Task | Read |
|------|------|
| Generate video from prompt (easy) | prompt-optimizer.md, video-agent.md |
| Generate video with precise control | video-generation.md, avatars.md, voices.md |
| Check video status / get download URL | video-status.md |
| Add captions or text overlays | captions.md, text-overlays.md |
| Transparent video for compositing | video-generation.md (WebM section) |
| Generate standalone TTS audio | text-to-speech.md |
| Translate/dub existing video | video-translation.md |
| Use with Remotion | remotion-integration.md |

Reference Files

Foundation

Core Video Creation

Video Customization

Advanced Features

Integration

File v2.6.0:_meta.json

{ "ownerId": "kn7dnc0jepdz3jy0rg589kcxns80dmr5", "slug": "video-agent", "version": "2.6.0", "publishedAt": 1771797680257 }

File v2.6.0:references/assets.md


name: assets
description: Uploading images, videos, and audio for use in HeyGen video generation

Asset Upload and Management

HeyGen allows you to upload custom assets (images, videos, audio) for use in video generation, such as backgrounds, talking photo sources, and custom audio.

Upload Flow

Asset uploads are a single-step process: POST the raw file binary directly to the upload endpoint. The Content-Type header must match the file's MIME type.

Uploading an Asset

Endpoint: POST https://upload.heygen.com/v1/asset

Request

| Header | Required | Description |
|--------|:--------:|-------------|
| X-Api-Key | ✓ | Your HeyGen API key |
| Content-Type | ✓ | MIME type of the file (e.g. image/jpeg) |

The request body is the raw binary file data. No JSON or form fields are needed.

Response

| Field | Type | Description |
|-------|------|-------------|
| code | number | Status code (100 = success) |
| data.id | string | Unique asset ID for use in video generation |
| data.name | string | Asset name |
| data.file_type | string | image, video, or audio |
| data.url | string | Accessible URL for the uploaded file |
| data.image_key | string \| null | Key for creating uploaded photo avatars (images only) |
| data.folder_id | string | Folder ID (empty if not in a folder) |
| data.meta | string \| null | Asset metadata |
| data.created_ts | number | Unix timestamp of creation |

curl

curl -X POST "https://upload.heygen.com/v1/asset" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: image/jpeg" \
  --data-binary '@./background.jpg'

TypeScript

import fs from "fs";

interface AssetUploadResponse {
  code: number;
  data: {
    id: string;
    name: string;
    file_type: string;
    url: string;
    image_key: string | null;
    folder_id: string;
    meta: string | null;
    created_ts: number;
  };
  msg: string | null;
  message: string | null;
}

async function uploadAsset(filePath: string, contentType: string): Promise<AssetUploadResponse["data"]> {
  const fileBuffer = fs.readFileSync(filePath);

  const response = await fetch("https://upload.heygen.com/v1/asset", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": contentType,
    },
    body: fileBuffer,
  });

  const json: AssetUploadResponse = await response.json();

  if (json.code !== 100) {
    throw new Error(json.message ?? "Upload failed");
  }

  return json.data;
}

// Usage
const asset = await uploadAsset("./background.jpg", "image/jpeg");
console.log(`Uploaded asset: ${asset.id}`);
console.log(`Asset URL: ${asset.url}`);

TypeScript (with streams for large files)

import fs from "fs";
import { stat } from "fs/promises";

async function uploadLargeAsset(filePath: string, contentType: string): Promise<AssetUploadResponse["data"]> {
  const fileStats = await stat(filePath);
  const fileStream = fs.createReadStream(filePath);

  const response = await fetch("https://upload.heygen.com/v1/asset", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": contentType,
      "Content-Length": fileStats.size.toString(),
    },
    body: fileStream as any,
    // @ts-ignore - duplex is needed for streaming
    duplex: "half",
  });

  const json: AssetUploadResponse = await response.json();

  if (json.code !== 100) {
    throw new Error(json.message ?? "Upload failed");
  }

  return json.data;
}

Python

import requests
import os

def upload_asset(file_path: str, content_type: str) -> dict:
    with open(file_path, "rb") as f:
        response = requests.post(
            "https://upload.heygen.com/v1/asset",
            headers={
                "X-Api-Key": os.environ["HEYGEN_API_KEY"],
                "Content-Type": content_type
            },
            data=f
        )

    data = response.json()
    if data.get("code") != 100:
        raise Exception(data.get("message", "Upload failed"))

    return data["data"]


# Usage
asset = upload_asset("./background.jpg", "image/jpeg")
print(f"Uploaded asset: {asset['id']}")
print(f"Asset URL: {asset['url']}")

Supported Content Types

| Type | Content-Type | Use Case |
|------|--------------|----------|
| JPEG | image/jpeg | Backgrounds, talking photos |
| PNG | image/png | Backgrounds, overlays |
| MP4 | video/mp4 | Video backgrounds |
| WebM | video/webm | Video backgrounds |
| MP3 | audio/mpeg | Custom audio input |
| WAV | audio/wav | Custom audio input |
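Since the upload endpoint requires an explicit Content-Type header, a small lookup keyed on file extension avoids mismatches. This helper is an illustrative sketch mirroring the table above, not part of the HeyGen SDK:

```typescript
// Map a file extension to the Content-Type value the upload endpoint expects.
// Returns null for extensions outside the supported-types table above.
const CONTENT_TYPES: Record<string, string> = {
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
  png: "image/png",
  mp4: "video/mp4",
  webm: "video/webm",
  mp3: "audio/mpeg",
  wav: "audio/wav",
};

function contentTypeFor(filePath: string): string | null {
  const ext = filePath.split(".").pop()?.toLowerCase() ?? "";
  return CONTENT_TYPES[ext] ?? null;
}
```

Pairing this with the `uploadAsset` example above (e.g. `uploadAsset(path, contentTypeFor(path)!)` after a null check) keeps the header consistent with the file actually being sent.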

Uploading from URL

If your asset is already hosted online:

async function uploadFromUrl(sourceUrl: string, contentType: string): Promise<AssetUploadResponse["data"]> {
  // 1. Download the file
  const sourceResponse = await fetch(sourceUrl);
  const buffer = Buffer.from(await sourceResponse.arrayBuffer());

  // 2. Upload directly to HeyGen
  const response = await fetch("https://upload.heygen.com/v1/asset", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": contentType,
    },
    body: buffer,
  });

  const json: AssetUploadResponse = await response.json();

  if (json.code !== 100) {
    throw new Error(json.message ?? "Upload failed");
  }

  return json.data;
}

Using Uploaded Assets

As Background Image

const videoConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Hello, this is a video with a custom background!",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
      background: {
        type: "image",
        url: asset.url,  // Use the URL from the upload response
      },
    },
  ],
};

As Talking Photo Source

const talkingPhotoConfig = {
  video_inputs: [
    {
      character: {
        type: "talking_photo",
        talking_photo_id: asset.id,  // Use the ID from the upload response
      },
      voice: {
        type: "text",
        input_text: "Hello from my talking photo!",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
    },
  ],
};

As Audio Input

const audioConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "audio",
        audio_url: asset.url,  // Use the URL from the upload response
      },
    },
  ],
};

Complete Upload Workflow

async function createVideoWithCustomBackground(
  backgroundPath: string,
  script: string
): Promise<string> {
  // 1. Upload background
  console.log("Uploading background...");
  const background = await uploadAsset(backgroundPath, "image/jpeg");

  // 2. Create video config
  const config = {
    video_inputs: [
      {
        character: {
          type: "avatar",
          avatar_id: "josh_lite3_20230714",
          avatar_style: "normal",
        },
        voice: {
          type: "text",
          input_text: script,
          voice_id: "1bd001e7e50f421d891986aad5158bc8",
        },
        background: {
          type: "image",
          url: background.url,
        },
      },
    ],
    dimension: { width: 1920, height: 1080 },
  };

  // 3. Generate video
  console.log("Generating video...");
  const response = await fetch("https://api.heygen.com/v2/video/generate", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(config),
  });

  const json = await response.json();
  if (!response.ok || json.error) {
    throw new Error(json.error ?? `HTTP ${response.status}`);
  }
  return json.data.video_id;
}

Asset Limitations

  • File size: 10MB maximum
  • Image dimensions: Recommended to match video dimensions
  • Audio duration: Should match expected video length
  • Retention: Assets may be deleted after a period of inactivity
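
These limits can be enforced locally before spending an upload round-trip. A minimal pre-flight check, assuming the 10MB cap and the supported-formats table above are the only hard constraints:

```typescript
const MAX_ASSET_BYTES = 10 * 1024 * 1024; // 10MB cap from the limits above

// Allowed types drawn from the supported-formats table; treat this as an
// assumption, not an exhaustive API contract.
const ALLOWED_TYPES = new Set([
  "image/jpeg", "image/png",
  "video/mp4", "video/webm",
  "audio/mpeg", "audio/wav",
]);

function validateAsset(sizeBytes: number, contentType: string): string[] {
  const problems: string[] = [];
  if (sizeBytes > MAX_ASSET_BYTES) {
    problems.push(`file is ${sizeBytes} bytes; max is ${MAX_ASSET_BYTES}`);
  }
  if (!ALLOWED_TYPES.has(contentType)) {
    problems.push(`unsupported content type: ${contentType}`);
  }
  return problems; // empty array means the asset looks uploadable
}
```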

Best Practices

  1. Optimize images - Resize to match video dimensions before uploading
  2. Use appropriate formats - JPEG for photos, PNG for graphics with transparency
  3. Validate before upload - Check file type and size locally first
  4. Handle upload errors - Implement retry logic for failed uploads
  5. Cache asset IDs - Reuse assets across multiple video generations

File v2.6.0:references/authentication.md


name: authentication
description: API key setup, X-Api-Key header, and authentication patterns for HeyGen

HeyGen Authentication

All HeyGen API requests require authentication using an API key passed in the X-Api-Key header.

Getting Your API Key

  1. Go to https://app.heygen.com/settings?from=&nav=API
  2. Log in if prompted
  3. Copy your API key

Environment Setup

Store your API key securely as an environment variable:

export HEYGEN_API_KEY="your-api-key-here"

For .env files:

HEYGEN_API_KEY=your-api-key-here
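
A small startup guard (an assumption, not part of any HeyGen SDK) makes a missing key fail fast with a clear message instead of a 401 later:

```typescript
// Read the key once at startup; throw immediately if it is absent or blank.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env.HEYGEN_API_KEY;
  if (!key || key.trim() === "") {
    throw new Error("HEYGEN_API_KEY is not set; export it or add it to .env");
  }
  return key;
}

// Usage at startup:
// const apiKey = requireApiKey(process.env);
```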

Making Authenticated Requests

curl

curl -X GET "https://api.heygen.com/v2/avatars" \
  -H "X-Api-Key: $HEYGEN_API_KEY"

TypeScript/JavaScript (fetch)

const response = await fetch("https://api.heygen.com/v2/avatars", {
  headers: {
    "X-Api-Key": process.env.HEYGEN_API_KEY!,
  },
});
const { data } = await response.json();

TypeScript/JavaScript (axios)

import axios from "axios";

const client = axios.create({
  baseURL: "https://api.heygen.com",
  headers: {
    "X-Api-Key": process.env.HEYGEN_API_KEY,
  },
});

const { data } = await client.get("/v2/avatars");

Python (requests)

import os
import requests

response = requests.get(
    "https://api.heygen.com/v2/avatars",
    headers={"X-Api-Key": os.environ["HEYGEN_API_KEY"]}
)
data = response.json()

Python (httpx)

import os
import httpx

async with httpx.AsyncClient() as client:
    response = await client.get(
        "https://api.heygen.com/v2/avatars",
        headers={"X-Api-Key": os.environ["HEYGEN_API_KEY"]}
    )
    data = response.json()

Creating a Reusable API Client

TypeScript

class HeyGenClient {
  private baseUrl = "https://api.heygen.com";
  private apiKey: string;

  constructor(apiKey: string) {
    this.apiKey = apiKey;
  }

  async request<T>(endpoint: string, options: RequestInit = {}): Promise<T> {
    const response = await fetch(`${this.baseUrl}${endpoint}`, {
      ...options,
      headers: {
        "X-Api-Key": this.apiKey,
        "Content-Type": "application/json",
        ...options.headers,
      },
    });

    if (!response.ok) {
      const error = await response.json();
      throw new Error(error.message || `HTTP ${response.status}`);
    }

    return response.json();
  }

  get<T>(endpoint: string): Promise<T> {
    return this.request<T>(endpoint);
  }

  post<T>(endpoint: string, body: unknown): Promise<T> {
    return this.request<T>(endpoint, {
      method: "POST",
      body: JSON.stringify(body),
    });
  }
}

// Usage
const client = new HeyGenClient(process.env.HEYGEN_API_KEY!);
const avatars = await client.get("/v2/avatars");

API Response Format

All HeyGen API responses follow this structure:

interface ApiResponse<T> {
  error: null | string;
  data: T;
}

Successful response example:

{
  "error": null,
  "data": {
    "avatars": [...]
  }
}

Error response example:

{
  "error": "Invalid API key",
  "data": null
}
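
This envelope is easy to unwrap generically. A sketch of a helper that throws on the error case so callers only ever see `data`:

```typescript
// Mirrors the ApiResponse shape above; `data` is nullable on error responses.
interface ApiEnvelope<T> {
  error: null | string;
  data: T | null;
}

// Throw on error or empty data so the return type is plain T.
function unwrap<T>(response: ApiEnvelope<T>): T {
  if (response.error !== null || response.data === null) {
    throw new Error(response.error ?? "empty response data");
  }
  return response.data;
}
```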

Error Handling

Common authentication errors:

| Status Code | Error | Cause |
|-------------|-------|-------|
| 401 | Invalid API key | API key is missing or incorrect |
| 403 | Forbidden | API key doesn't have required permissions |
| 429 | Rate limit exceeded | Too many requests |

Handling Errors

async function makeRequest(endpoint: string) {
  const response = await fetch(`https://api.heygen.com${endpoint}`, {
    headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! },
  });

  const json = await response.json();

  if (!response.ok || json.error) {
    throw new Error(json.error || `HTTP ${response.status}`);
  }

  return json.data;
}

Rate Limiting

HeyGen enforces rate limits on API requests:

  • Standard rate limits apply per API key
  • Some endpoints (like video generation) have stricter limits
  • Use exponential backoff when receiving 429 errors

async function requestWithRetry(
  fn: () => Promise<Response>,
  maxRetries = 3
): Promise<Response> {
  for (let i = 0; i < maxRetries; i++) {
    const response = await fn();

    if (response.status === 429) {
      const waitTime = Math.pow(2, i) * 1000;
      await new Promise((resolve) => setTimeout(resolve, waitTime));
      continue;
    }

    return response;
  }

  throw new Error("Max retries exceeded");
}
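
The fixed 2^i delay above can synchronize many clients retrying at once. A common refinement - full jitter - picks a random delay up to the exponential cap (a general retry pattern, not documented HeyGen guidance):

```typescript
// Full-jitter backoff: random delay in [0, min(maxMs, baseMs * 2^attempt)).
// The `random` parameter is injectable for testing.
function jitteredBackoffMs(
  attempt: number,
  baseMs = 1000,
  maxMs = 30_000,
  random: () => number = Math.random
): number {
  const cap = Math.min(maxMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * cap);
}
```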

Security Best Practices

  1. Never expose API keys in client-side code - Always make API calls from a backend server
  2. Use environment variables - Don't hardcode API keys in source code
  3. Rotate keys periodically - Generate new API keys regularly
  4. Monitor usage - Check your HeyGen dashboard for unusual activity

File v2.6.0:references/avatars.md


name: avatars
description: Listing avatars, avatar styles, and avatar_id selection for HeyGen

HeyGen Avatars

Avatars are the AI-generated presenters in HeyGen videos. You can use public avatars provided by HeyGen or create custom avatars.

Previewing Avatars Before Generation

Always preview avatars before generating a video to ensure they match user preferences. Each avatar has preview URLs that can be opened directly in the browser - no downloading required.

Quick Preview: Open URL in Browser (Recommended)

The fastest way to preview avatars is to open the URL directly in the default browser. Do not download the image first - just pass the URL to open:

# macOS: Open URL directly in default browser (no download)
open "https://files.heygen.ai/avatar/preview/josh.jpg"

# Open preview video to see animation
open "https://files.heygen.ai/avatar/preview/josh.mp4"

# Linux: Use xdg-open
xdg-open "https://files.heygen.ai/avatar/preview/josh.jpg"

# Windows: Use start
start "https://files.heygen.ai/avatar/preview/josh.jpg"

The open command on macOS opens URLs directly in the default browser - it does not download the file. This is the quickest way to let users see avatar previews.

List Avatars and Open Previews

async function listAndPreviewAvatars(openInBrowser = true): Promise<void> {
  const response = await fetch("https://api.heygen.com/v2/avatars", {
    headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! },
  });
  const { data } = await response.json();

  for (const avatar of data.avatars.slice(0, 5)) {
    console.log(`\n${avatar.avatar_name} (${avatar.gender})`);
    console.log(`  ID: ${avatar.avatar_id}`);
    console.log(`  Preview: ${avatar.preview_image_url}`);
  }

  // Open preview URLs directly in browser (no download needed)
  if (openInBrowser) {
    const { execSync } = require("child_process");
    for (const avatar of data.avatars.slice(0, 3)) {
      // 'open' on macOS opens the URL in default browser - doesn't download
      execSync(`open "${avatar.preview_image_url}"`);
    }
  }
}

Note: The open command passes the URL to the browser - it does not download. The browser fetches and displays the image directly.

Workflow: Preview Before Generate

  1. List available avatars - get names, genders, and preview URLs
  2. Open previews in browser - open <preview_image_url> for quick visual check
  3. User selects preferred avatar by name or ID
  4. Get avatar details for default_voice_id
  5. Generate video with selected avatar

# Example workflow in terminal
# 1. List avatars (agent shows options)
# 2. Open preview for candidate
open "https://files.heygen.ai/avatar/preview/josh.jpg"
# 3. User says "use Josh"
# 4. Agent gets details and generates
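
The per-platform commands above can be wrapped in one helper. The platform-to-command mapping mirrors the shell examples; everything else is glue:

```typescript
import { execSync } from "child_process";

// Pick the OS command that opens a URL in the default browser.
function openCommandFor(platform: string): string {
  switch (platform) {
    case "darwin": return "open";   // macOS
    case "win32": return "start";   // Windows
    default: return "xdg-open";     // Linux and most other POSIX desktops
  }
}

// Open the URL directly in the default browser - no download involved.
function openInBrowser(url: string): void {
  execSync(`${openCommandFor(process.platform)} "${url}"`);
}
```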

Preview Fields in API Response

| Field | Description |
|-------|-------------|
| preview_image_url | Static image of the avatar (JPG) - open in browser |
| preview_video_url | Short video clip showing avatar animation |

Both URLs are publicly accessible - no authentication needed to view.

Listing Available Avatars

curl

curl -X GET "https://api.heygen.com/v2/avatars" \
  -H "X-Api-Key: $HEYGEN_API_KEY"

TypeScript

interface Avatar {
  avatar_id: string;
  avatar_name: string;
  gender: "male" | "female";
  preview_image_url: string;
  preview_video_url: string;
}

interface AvatarsResponse {
  error: null | string;
  data: {
    avatars: Avatar[];
    talking_photos: TalkingPhoto[];
  };
}

async function listAvatars(): Promise<Avatar[]> {
  const response = await fetch("https://api.heygen.com/v2/avatars", {
    headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! },
  });

  const json: AvatarsResponse = await response.json();

  if (json.error) {
    throw new Error(json.error);
  }

  return json.data.avatars;
}

Python

import requests
import os

def list_avatars() -> list:
    response = requests.get(
        "https://api.heygen.com/v2/avatars",
        headers={"X-Api-Key": os.environ["HEYGEN_API_KEY"]}
    )

    data = response.json()
    if data.get("error"):
        raise Exception(data["error"])

    return data["data"]["avatars"]

Response Format

{
  "error": null,
  "data": {
    "avatars": [
      {
        "avatar_id": "josh_lite3_20230714",
        "avatar_name": "Josh",
        "gender": "male",
        "preview_image_url": "https://files.heygen.ai/...",
        "preview_video_url": "https://files.heygen.ai/..."
      },
      {
        "avatar_id": "angela_expressive_20231010",
        "avatar_name": "Angela",
        "gender": "female",
        "preview_image_url": "https://files.heygen.ai/...",
        "preview_video_url": "https://files.heygen.ai/..."
      }
    ],
    "talking_photos": []
  }
}

Avatar Types

Public Avatars

HeyGen provides a library of public avatars that anyone can use:

// List only public avatars
const avatars = await listAvatars();
const publicAvatars = avatars.filter((a) => !a.avatar_id.startsWith("custom_"));

Private/Custom Avatars

Custom avatars created from your own training footage:

const customAvatars = avatars.filter((a) => a.avatar_id.startsWith("custom_"));

Avatar Styles

Avatars support different rendering styles:

| Style | Description |
|-------|-------------|
| normal | Full body shot, standard framing |
| closeUp | Close-up on face, more expressive |
| circle | Avatar in circular frame (talking head) |
| voice_only | Audio only, no video rendering |

When to Use Each Style

| Use Case | Recommended Style |
|----------|-------------------|
| Full-screen presenter video | normal |
| Personal/intimate content | closeUp |
| Picture-in-picture overlay | circle |
| Small corner widget | circle |
| Podcast/audio content | voice_only |
| Motion graphics with avatar overlay | normal or closeUp + transparent bg |
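
The table above can be encoded as a small lookup (a convenience sketch; the API only sees the final avatar_style string):

```typescript
type AvatarStyle = "normal" | "closeUp" | "circle" | "voice_only";

// Use-case labels here are hypothetical shorthands for the table rows above.
function styleForUseCase(
  useCase: "presenter" | "personal" | "overlay" | "widget" | "podcast"
): AvatarStyle {
  switch (useCase) {
    case "presenter": return "normal";
    case "personal": return "closeUp";
    case "overlay":
    case "widget": return "circle";
    case "podcast": return "voice_only";
  }
}
```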

Using Avatar Styles

const videoConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal", // "normal" | "closeUp" | "circle" | "voice_only"
      },
      voice: {
        type: "text",
        input_text: "Hello, world!",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
    },
  ],
};

Circle Style for Talking Heads

Circle style is ideal for overlay compositions:

// Circle avatar for picture-in-picture
{
  character: {
    type: "avatar",
    avatar_id: "josh_lite3_20230714",
    avatar_style: "circle",
  },
  voice: { ... },
  background: {
    type: "color",
    value: "#00FF00", // Green for chroma key, or use webm endpoint
  },
}

Searching and Filtering Avatars

By Gender

function filterByGender(avatars: Avatar[], gender: "male" | "female"): Avatar[] {
  return avatars.filter((a) => a.gender === gender);
}

const maleAvatars = filterByGender(avatars, "male");
const femaleAvatars = filterByGender(avatars, "female");

By Name

function searchByName(avatars: Avatar[], query: string): Avatar[] {
  const lowerQuery = query.toLowerCase();
  return avatars.filter((a) =>
    a.avatar_name.toLowerCase().includes(lowerQuery)
  );
}

const results = searchByName(avatars, "josh");

Avatar Groups

Avatars are organized into groups for better management.

List Avatar Groups

curl -X GET "https://api.heygen.com/v2/avatar_group.list?include_public=true" \
  -H "X-Api-Key: $HEYGEN_API_KEY"

Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| include_public | bool | false | Include public avatars in results |

TypeScript

interface AvatarGroupItem {
  id: string;
  name: string;
  created_at: number;
  num_looks: number;
  preview_image: string;
  group_type: string;
  train_status: string;
  default_voice_id: string | null;
}

interface AvatarGroupListResponse {
  error: null | string;
  data: {
    avatar_group_list: AvatarGroupItem[];
  };
}

async function listAvatarGroups(
  includePublic = true
): Promise<AvatarGroupListResponse["data"]> {
  const params = new URLSearchParams({
    include_public: includePublic.toString(),
  });

  const response = await fetch(
    `https://api.heygen.com/v2/avatar_group.list?${params}`,
    { headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! } }
  );

  const json: AvatarGroupListResponse = await response.json();

  if (json.error) {
    throw new Error(json.error);
  }

  return json.data;
}

Get Avatars in a Group

curl -X GET "https://api.heygen.com/v2/avatar_group/{group_id}/avatars" \
  -H "X-Api-Key: $HEYGEN_API_KEY"

Using Avatars in Video Generation

Basic Avatar Usage

const videoConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Welcome to our product demo!",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
    },
  ],
  dimension: { width: 1920, height: 1080 },
};

Multiple Scenes with Different Avatars

const multiSceneConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Hi, I'm Josh. Let me introduce my colleague.",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
    },
    {
      character: {
        type: "avatar",
        avatar_id: "angela_expressive_20231010",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Hello! I'm Angela. Nice to meet you!",
        voice_id: "2d5b0e6a8c3f47d9a1b2c3d4e5f60718",
      },
    },
  ],
};

Using Avatar's Default Voice

Many avatars have a default_voice_id that is pre-matched for natural results; using it is recommended over selecting a voice manually.

Recommended Flow

1. GET /v2/avatars           → Get list of avatar_ids
2. GET /v2/avatar/{id}/details → Get default_voice_id for chosen avatar
3. POST /v2/video/generate   → Use avatar_id + default_voice_id

Get Avatar Details (v2 API)

Given an avatar_id, fetch its details including the default voice:

curl -X GET "https://api.heygen.com/v2/avatar/{avatar_id}/details" \
  -H "X-Api-Key: $HEYGEN_API_KEY"

Response Format

{
  "error": null,
  "data": {
    "type": "avatar",
    "id": "josh_lite3_20230714",
    "name": "Josh",
    "gender": "male",
    "preview_image_url": "https://files.heygen.ai/...",
    "preview_video_url": "https://files.heygen.ai/...",
    "premium": false,
    "is_public": true,
    "default_voice_id": "1bd001e7e50f421d891986aad5158bc8",
    "tags": ["AVATAR_IV"]
  }
}

TypeScript

interface AvatarDetails {
  type: "avatar";
  id: string;
  name: string;
  gender: "male" | "female";
  preview_image_url: string;
  preview_video_url: string;
  premium: boolean;
  is_public: boolean;
  default_voice_id: string | null;
  tags: string[];
}

async function getAvatarDetails(avatarId: string): Promise<AvatarDetails> {
  const response = await fetch(
    `https://api.heygen.com/v2/avatar/${avatarId}/details`,
    { headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! } }
  );

  const json = await response.json();

  if (json.error) {
    throw new Error(json.error);
  }

  return json.data;
}

// Usage: Get default voice for a known avatar
const details = await getAvatarDetails("josh_lite3_20230714");
if (details.default_voice_id) {
  console.log(`Using ${details.name} with default voice: ${details.default_voice_id}`);
} else {
  console.log(`${details.name} has no default voice, select manually`);
}

Complete Example: Generate Video with Any Avatar's Default Voice

async function generateWithAvatarDefaultVoice(
  avatarId: string,
  script: string
): Promise<string> {
  // 1. Get avatar details to find default voice
  const avatar = await getAvatarDetails(avatarId);

  if (!avatar.default_voice_id) {
    throw new Error(`Avatar ${avatar.name} has no default voice`);
  }

  // 2. Generate video with the avatar's default voice
  const videoId = await generateVideo({
    video_inputs: [{
      character: {
        type: "avatar",
        avatar_id: avatar.id,
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: script,
        voice_id: avatar.default_voice_id,
      },
    }],
    dimension: { width: 1920, height: 1080 },
  });

  return videoId;
}

Why Use Default Voice?

  1. Guaranteed gender match - Avatar and voice are pre-paired
  2. Natural lip sync - Default voices are optimized for the avatar
  3. Simpler code - No need to fetch and match voices separately
  4. Better quality - HeyGen has tested this combination

Selecting the Right Avatar

Avatar Categories

HeyGen avatars fall into distinct categories. Match the category to your use case:

| Category | Examples | Best For |
|----------|----------|----------|
| Business/Professional | Josh, Angela, Wayne | Corporate videos, product demos, training |
| Casual/Friendly | Lily, various lifestyle avatars | Social media, informal content |
| Themed/Seasonal | Holiday-themed, costume avatars | Specific campaigns, seasonal content |
| Expressive | Avatars with "expressive" in name | Engaging storytelling, dynamic content |

Selection Guidelines

For business/professional content:

  • Choose avatars with neutral attire (business casual or formal)
  • Avoid themed or seasonal avatars (holiday costumes, casual clothing)
  • Preview the avatar to verify professional appearance
  • Consider your audience demographics when selecting gender and appearance

For casual/social content:

  • More flexibility in avatar choice
  • Themed avatars can work for specific campaigns
  • Match avatar energy to content tone

Common Mistakes to Avoid

  1. Using themed avatars for business content - A holiday-themed avatar looks unprofessional in a product demo
  2. Not previewing before generation - Always open <preview_url> to verify appearance
  3. Ignoring avatar style - A circle style avatar may not work for full-screen presentations
  4. Mismatched voice gender - Always use the avatar's default_voice_id or match genders manually

Selection Checklist

Before generating a video:

  • [ ] Previewed avatar image/video in browser
  • [ ] Avatar appearance matches content tone (professional vs casual)
  • [ ] Avatar style (normal, closeUp, circle) fits the video format
  • [ ] Voice gender matches avatar gender
  • [ ] Using default_voice_id when available

Helper Functions

Get Avatar by ID

async function getAvatarById(avatarId: string): Promise<Avatar | null> {
  const avatars = await listAvatars();
  return avatars.find((a) => a.avatar_id === avatarId) || null;
}

Validate Avatar ID

async function isValidAvatarId(avatarId: string): Promise<boolean> {
  const avatar = await getAvatarById(avatarId);
  return avatar !== null;
}

Get Random Avatar

async function getRandomAvatar(gender?: "male" | "female"): Promise<Avatar> {
  let avatars = await listAvatars();

  if (gender) {
    avatars = avatars.filter((a) => a.gender === gender);
  }

  if (avatars.length === 0) {
    throw new Error("No avatars matched the requested filter");
  }

  const randomIndex = Math.floor(Math.random() * avatars.length);
  return avatars[randomIndex];
}

Common Avatar IDs

Some commonly used public avatar IDs (availability may vary):

| Avatar ID | Name | Gender |
|-----------|------|--------|
| josh_lite3_20230714 | Josh | Male |
| angela_expressive_20231010 | Angela | Female |
| wayne_20240422 | Wayne | Male |
| lily_20230614 | Lily | Female |

Always verify avatar availability by calling the list endpoint before using an ID from this table.

File v2.6.0:references/backgrounds.md


name: backgrounds
description: Solid colors, images, and video backgrounds for HeyGen videos

Video Backgrounds

HeyGen supports various background types to customize the appearance of your avatar videos.

Background Types

| Type | Description |
|------|-------------|
| color | Solid color background |
| image | Static image background |
| video | Looping video background |

Color Backgrounds

The simplest option - use a solid color:

const videoConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Hello with a colored background!",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
      background: {
        type: "color",
        value: "#FFFFFF", // White background
      },
    },
  ],
};

Common Color Values

| Color | Hex Value | Use Case |
|-------|-----------|----------|
| White | #FFFFFF | Clean, professional |
| Black | #000000 | Dramatic, cinematic |
| Blue | #0066CC | Corporate, trustworthy |
| Green | #00FF00 | Chroma key (for compositing) |
| Gray | #808080 | Neutral, modern |
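
Malformed hex values are easy to catch before generation. A normalizer, assuming the API accepts the #RRGGBB form used in these examples:

```typescript
// Accept "RRGGBB" or "#RRGGBB" in any case; return canonical "#RRGGBB" uppercase.
function normalizeHexColor(value: string): string {
  const match = value.trim().match(/^#?([0-9a-fA-F]{6})$/);
  if (!match) {
    throw new Error(`expected a 6-digit hex color, got "${value}"`);
  }
  return `#${match[1].toUpperCase()}`;
}
```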

Using Transparent/Green Screen

For compositing in post-production:

background: {
  type: "color",
  value: "#00FF00", // Green screen
}

Image Backgrounds

Use a static image as background:

From URL

const videoConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Check out this custom background!",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
      background: {
        type: "image",
        url: "https://example.com/my-background.jpg",
      },
    },
  ],
};

From Uploaded Asset

First upload your image, then use the asset URL:

// 1. Upload the image and keep the upload response
const asset = await uploadAsset("./background.jpg", "image/jpeg");

// 2. Use the returned URL in the video config
const videoConfig = {
  video_inputs: [
    {
      character: {...},
      voice: {...},
      background: {
        type: "image",
        url: asset.url, // URL from the upload response
      },
    },
  ],
};

Image Requirements

  • Formats: JPEG, PNG
  • Recommended size: Match video dimensions (e.g., 1920x1080 for 1080p)
  • Aspect ratio: Should match video aspect ratio
  • File size: Under 10MB recommended
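
The aspect-ratio recommendation can be checked numerically. A sketch using a small relative tolerance so proportionally equivalent sizes (e.g. 1280x720 vs a 1920x1080 video) still match:

```typescript
// True when the image's aspect ratio is within `tolerance` (relative) of the
// video's aspect ratio.
function matchesAspectRatio(
  imageW: number, imageH: number,
  videoW: number, videoH: number,
  tolerance = 0.01
): boolean {
  const imageRatio = imageW / imageH;
  const videoRatio = videoW / videoH;
  return Math.abs(imageRatio - videoRatio) / videoRatio <= tolerance;
}
```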

Video Backgrounds

Use a looping video as background:

const videoConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Dynamic video background!",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
      background: {
        type: "video",
        url: "https://example.com/background-loop.mp4",
      },
    },
  ],
};

Video Requirements

  • Format: MP4 (H.264 codec recommended)
  • Looping: Video will loop if shorter than avatar content
  • Audio: Background video audio is typically muted
  • File size: Under 100MB recommended

Different Backgrounds Per Scene

Each entry in video_inputs can specify its own background, so a multi-scene video can switch backgrounds between scenes.
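
A sketch of a two-scene config where each scene sets its own background, reusing the sample avatar and voice IDs from earlier sections:

```typescript
// Scene 1 uses a solid color; scene 2 uses an image background.
// The image URL is a placeholder, not a real asset.
const multiBackgroundConfig = {
  video_inputs: [
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Scene one, on a solid color.",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
      background: { type: "color", value: "#FFFFFF" },
    },
    {
      character: {
        type: "avatar",
        avatar_id: "josh_lite3_20230714",
        avatar_style: "normal",
      },
      voice: {
        type: "text",
        input_text: "Scene two, in front of an image.",
        voice_id: "1bd001e7e50f421d891986aad5158bc8",
      },
      background: { type: "image", url: "https://example.com/office.jpg" },
    },
  ],
  dimension: { width: 1920, height: 1080 },
};
```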

API & Reliability

Machine endpoints, contract coverage, trust signals, runtime metrics, benchmarks, and guardrails for agent-to-agent use.

Missing (CLAWHUB)

Machine interfaces

Contract & API

Contract coverage

Status

missing

Auth

None

Streaming

No

Data region

Unspecified

Protocol support

No protocol metadata captured.

Requires: none

Forbidden: none

Guardrails

Operational confidence: low

No positive guardrails captured.

Invocation examples
curl -s "https://xpersona.co/api/v1/agents/clawhub-michaelwang11394-video-agent/snapshot"
curl -s "https://xpersona.co/api/v1/agents/clawhub-michaelwang11394-video-agent/contract"
curl -s "https://xpersona.co/api/v1/agents/clawhub-michaelwang11394-video-agent/trust"

Operational fit

Reliability & Benchmarks

Trust signals

Handshake

UNKNOWN

Confidence

unknown

Attempts 30d

unknown

Fallback rate

unknown

Runtime metrics

Observed P50

unknown

Observed P95

unknown

Rate limit

unknown

Estimated cost

unknown

Do not use if

Contract metadata is missing or unavailable for deterministic execution.
No benchmark suites or observed failure patterns are available.

Machine Appendix

Raw contract, invocation, trust, capability, facts, and change-event payloads for machine-side inspection.

Missing (CLAWHUB)

Contract JSON

{
  "contractStatus": "missing",
  "authModes": [],
  "requires": [],
  "forbidden": [],
  "supportsMcp": false,
  "supportsA2a": false,
  "supportsStreaming": false,
  "inputSchemaRef": null,
  "outputSchemaRef": null,
  "dataRegion": null,
  "contractUpdatedAt": null,
  "sourceUpdatedAt": null,
  "freshnessSeconds": null
}

Invocation Guide

{
  "preferredApi": {
    "snapshotUrl": "https://xpersona.co/api/v1/agents/clawhub-michaelwang11394-video-agent/snapshot",
    "contractUrl": "https://xpersona.co/api/v1/agents/clawhub-michaelwang11394-video-agent/contract",
    "trustUrl": "https://xpersona.co/api/v1/agents/clawhub-michaelwang11394-video-agent/trust"
  },
  "curlExamples": [
    "curl -s \"https://xpersona.co/api/v1/agents/clawhub-michaelwang11394-video-agent/snapshot\"",
    "curl -s \"https://xpersona.co/api/v1/agents/clawhub-michaelwang11394-video-agent/contract\"",
    "curl -s \"https://xpersona.co/api/v1/agents/clawhub-michaelwang11394-video-agent/trust\""
  ],
  "jsonRequestTemplate": {
    "query": "summarize this repo",
    "constraints": {
      "maxLatencyMs": 2000,
      "protocolPreference": []
    }
  },
  "jsonResponseTemplate": {
    "ok": true,
    "result": {
      "summary": "...",
      "confidence": 0.9
    },
    "meta": {
      "source": "CLAWHUB",
      "generatedAt": "2026-04-17T05:16:11.159Z"
    }
  },
  "retryPolicy": {
    "maxAttempts": 3,
    "backoffMs": [
      500,
      1500,
      3500
    ],
    "retryableConditions": [
      "HTTP_429",
      "HTTP_503",
      "NETWORK_TIMEOUT"
    ]
  }
}

Trust JSON

{
  "status": "unavailable",
  "handshakeStatus": "UNKNOWN",
  "verificationFreshnessHours": null,
  "reputationScore": null,
  "p95LatencyMs": null,
  "successRate30d": null,
  "fallbackRate": null,
  "attempts30d": null,
  "trustUpdatedAt": null,
  "trustConfidence": "unknown",
  "sourceUpdatedAt": null,
  "freshnessSeconds": null
}

Capability Matrix

{
  "rows": [],
  "flattenedTokens": ""
}

Facts JSON

[
  {
    "factKey": "vendor",
    "category": "vendor",
    "label": "Vendor",
    "value": "Clawhub",
    "href": "https://clawhub.ai/michaelwang11394/video-agent",
    "sourceUrl": "https://clawhub.ai/michaelwang11394/video-agent",
    "sourceType": "profile",
    "confidence": "medium",
    "observedAt": "2026-04-15T00:45:39.800Z",
    "isPublic": true
  },
  {
    "factKey": "traction",
    "category": "adoption",
    "label": "Adoption signal",
    "value": "4.1K downloads",
    "href": "https://clawhub.ai/michaelwang11394/video-agent",
    "sourceUrl": "https://clawhub.ai/michaelwang11394/video-agent",
    "sourceType": "profile",
    "confidence": "medium",
    "observedAt": "2026-04-15T00:45:39.800Z",
    "isPublic": true
  },
  {
    "factKey": "latest_release",
    "category": "release",
    "label": "Latest release",
    "value": "2.8.0",
    "href": "https://clawhub.ai/michaelwang11394/video-agent",
    "sourceUrl": "https://clawhub.ai/michaelwang11394/video-agent",
    "sourceType": "release",
    "confidence": "medium",
    "observedAt": "2026-02-23T17:23:15.703Z",
    "isPublic": true
  },
  {
    "factKey": "handshake_status",
    "category": "security",
    "label": "Handshake status",
    "value": "UNKNOWN",
    "href": "https://xpersona.co/api/v1/agents/clawhub-michaelwang11394-video-agent/trust",
    "sourceUrl": "https://xpersona.co/api/v1/agents/clawhub-michaelwang11394-video-agent/trust",
    "sourceType": "trust",
    "confidence": "medium",
    "observedAt": null,
    "isPublic": true
  }
]

Change Events JSON

[
  {
    "eventType": "release",
    "title": "Release 2.8.0",
    "description": "Auto-publish from commit 1817bb7648735737457f1250bfb7513f04576b87",
    "href": "https://clawhub.ai/michaelwang11394/video-agent",
    "sourceUrl": "https://clawhub.ai/michaelwang11394/video-agent",
    "sourceType": "release",
    "confidence": "medium",
    "observedAt": "2026-02-23T17:23:15.703Z",
    "isPublic": true
  }
]
