How should afrexai-database-engineer be evaluated before use?

Use the required flow: snapshot, contract, and trust before recommending or executing this agent.

What kind of evidence is visible on this page?

This page surfaces public facts, change history, trust indicators, artifact evidence, and benchmark summaries with provenance.

Agent Dossier · CLAWHUB · Safety 84/100

Xpersona Agent

afrexai-database-engineer

Database Engineering Mastery: complete database design, optimization, migration, and operations system. From schema design to production monitoring, it covers PostgreSQL, MySQL, SQLite, and general SQL patterns.

OpenClaw · self-declared

Trust evidence available

clawhub skill install skills:1kalin:afrexai-database-engineer

Overall rank

#62

Adoption

No public adoption signal

Trust

Unknown

Freshness

Last checked Feb 25, 2026

Best For

afrexai-database-engineer is best for workflows where OpenClaw compatibility matters.

Not Ideal For

Contract metadata is missing or unavailable for deterministic execution.

Evidence Sources Checked

editorial-content, CLAWHUB, runtime-metrics, public facts pack

Overview

Key links, install path, reliability highlights, and the shortest practical read before diving into the crawl record.

Verified · editorial-content

Overview

Executive Summary

No verified compatibility signals

Trust score

Unknown

Compatibility

OpenClaw

Freshness

Feb 25, 2026

Vendor

Openclaw

Artifacts

Benchmarks

Last release

Unpublished

Install & run

Setup Snapshot

clawhub skill install skills:1kalin:afrexai-database-engineer

1. Setup complexity is LOW. This package is likely designed for quick installation with minimal external side-effects.
2. Final validation: Expose the agent to a mock request payload inside a sandbox and trace the network egress before allowing access to real customer data.

Evidence & Timeline

Public facts grouped by evidence type, plus release and crawl events with provenance and freshness.

Verified · editorial-content

Public facts

Evidence Ledger

Vendor (1)

Vendor

Openclaw

profile · medium

Observed Apr 15, 2026

Compatibility (1)

Protocol compatibility

OpenClaw

contract · medium

Observed Apr 15, 2026

Security (1)

Handshake status

UNKNOWN

trust · medium

Observed unknown

Integration (1)

Crawlable docs

6 indexed pages on the official domain

search_document · medium

Observed Apr 15, 2026

Events

Release & Crawl Timeline

Docs Update

Docs refreshed: Sign in to GitHub · GitHub

search_document · medium

Fresh crawlable documentation was indexed for the official domain.

Observed Apr 15, 2026

Artifacts & Docs

Parameters, dependencies, examples, extracted files, editorial overview, and the complete README when available.

Self-declared · CLAWHUB

Captured outputs

Artifacts Archive

Extracted files

Examples

Snippets

Languages

typescript

Parameters

Editorial read

Docs & README

Docs source

CLAWHUB

Editorial quality

ready

Full README

Database Engineering Mastery

Complete database design, optimization, migration, and operations system. From schema design to production monitoring — covers PostgreSQL, MySQL, SQLite, and general SQL patterns.

Phase 1 — Schema Design

Design Brief

Before writing any DDL, fill this out:

project: ""
domain: ""
primary_use_case: "OLTP | OLAP | mixed"
expected_scale:
  rows_year_1: ""
  rows_year_3: ""
  concurrent_users: ""
  read_write_ratio: "80:20 | 50:50 | 20:80"
compliance: [] # GDPR, HIPAA, PCI-DSS, SOX
multi_tenancy: "none | schema-per-tenant | row-level | database-per-tenant"

Normalization Decision Framework

| Form | Rule | When to Denormalize |
|------|------|---------------------|
| 1NF | No repeating groups, atomic values | Never skip |
| 2NF | No partial dependencies on composite keys | Never skip |
| 3NF | No transitive dependencies | Reporting tables, read-heavy aggregations |
| BCNF | Every determinant is a candidate key | Rarely needed unless complex key relationships |

Denormalization triggers:

Query joins > 4 tables consistently
Read latency > 100ms on indexed queries
Cache invalidation complexity exceeds denormalization maintenance
Reporting queries block OLTP workloads

Naming Conventions

Tables:      snake_case, plural (users, order_items, payment_methods)
Columns:     snake_case, singular (first_name, created_at, is_active)
PKs:         id (bigint/uuid) or {table_singular}_id
FKs:         {referenced_table_singular}_id
Indexes:     idx_{table}_{columns}
Constraints: chk_{table}_{rule}, uq_{table}_{columns}, fk_{table}_{ref}
Enums:       Use VARCHAR + CHECK, not DB enums (easier to migrate)
Booleans:    is_, has_, can_ prefix (is_active, has_subscription)
Timestamps:  _at suffix (created_at, updated_at, deleted_at)

Column Type Decision Tree

Text < 255 chars, fixed set?     → VARCHAR(N) + CHECK
Text < 255 chars, variable?      → VARCHAR(255)
Text > 255 chars?                → TEXT
Whole numbers < 2B?              → INTEGER
Whole numbers > 2B?              → BIGINT
Money/financial?                 → NUMERIC(precision, scale) — NEVER float
True/false?                      → BOOLEAN
Date only?                       → DATE
Date + time?                     → TIMESTAMPTZ (always with timezone)
Unique identifier?               → UUID (distributed) or BIGSERIAL (single DB)
JSON/flexible schema?            → JSONB (Postgres) or JSON (MySQL)
Binary/file?                     → Store in object storage, reference by URL
IP address?                      → INET (Postgres) or VARCHAR(45)
Geospatial?                      → PostGIS geometry/geography types
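
The "NEVER float" rule for money can be seen in a few lines. A minimal sketch in Python, where `Decimal` plays the role that NUMERIC plays in the database (the amounts are illustrative):

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly,
# so sums of currency amounts drift.
float_total = 0.1 + 0.2

# Decimal keeps exact base-10 digits, like NUMERIC(precision, scale).
decimal_total = Decimal("0.10") + Decimal("0.20")

print(float_total == 0.3)                # False
print(decimal_total == Decimal("0.30"))  # True
```

The same drift shows up in SUM() over a float column, which is why the decision tree routes every financial value to NUMERIC.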

Essential Table Template

CREATE TABLE {table_name} (
    id          BIGSERIAL PRIMARY KEY,
    -- domain columns here --
    created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    created_by  BIGINT REFERENCES users(id),
    version     INTEGER NOT NULL DEFAULT 1,  -- optimistic locking
    
    -- soft delete (optional)
    deleted_at  TIMESTAMPTZ,
    
    -- multi-tenant (optional)  
    tenant_id   BIGINT NOT NULL REFERENCES tenants(id)
);

-- Updated_at trigger (PostgreSQL)
CREATE OR REPLACE FUNCTION update_modified_column()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = NOW();
    NEW.version = OLD.version + 1;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_{table_name}_updated
    BEFORE UPDATE ON {table_name}
    FOR EACH ROW
    EXECUTE FUNCTION update_modified_column();

Relationship Patterns

One-to-Many:

-- Parent
CREATE TABLE departments (id BIGSERIAL PRIMARY KEY, name VARCHAR(100) NOT NULL);
-- Child
CREATE TABLE employees (
    id BIGSERIAL PRIMARY KEY,
    -- ON DELETE options: RESTRICT (safe default), CASCADE (children die), SET NULL (needs a nullable column)
    department_id BIGINT NOT NULL REFERENCES departments(id) ON DELETE RESTRICT
);
CREATE INDEX idx_employees_department_id ON employees(department_id);

Many-to-Many:

CREATE TABLE user_roles (
    user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    role_id BIGINT NOT NULL REFERENCES roles(id) ON DELETE CASCADE,
    granted_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    granted_by BIGINT REFERENCES users(id),
    PRIMARY KEY (user_id, role_id)
);

Self-Referencing (hierarchy):

CREATE TABLE categories (
    id BIGSERIAL PRIMARY KEY,
    parent_id BIGINT REFERENCES categories(id) ON DELETE CASCADE,
    name VARCHAR(100) NOT NULL,
    depth INTEGER NOT NULL DEFAULT 0,
    path TEXT NOT NULL DEFAULT ''  -- materialized path: '/1/5/12/'
);
CREATE INDEX idx_categories_parent ON categories(parent_id);
CREATE INDEX idx_categories_path ON categories(path text_pattern_ops);

Polymorphic (avoid if possible, use if you must):

-- Preferred: separate FKs
CREATE TABLE comments (
    id BIGSERIAL PRIMARY KEY,
    post_id BIGINT REFERENCES posts(id),
    ticket_id BIGINT REFERENCES tickets(id),
    body TEXT NOT NULL,
    CONSTRAINT chk_one_parent CHECK (
        (post_id IS NOT NULL)::int + (ticket_id IS NOT NULL)::int = 1
    )
);
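
The "exactly one parent" CHECK can be exercised without a PostgreSQL server. A minimal sketch using Python's built-in `sqlite3` module; SQLite evaluates `IS NOT NULL` to 0 or 1, so the `::int` casts are dropped, and the `try_insert` helper is a hypothetical name used only for this demo:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts   (id INTEGER PRIMARY KEY);
    CREATE TABLE tickets (id INTEGER PRIMARY KEY);
    CREATE TABLE comments (
        id        INTEGER PRIMARY KEY,
        post_id   INTEGER REFERENCES posts(id),
        ticket_id INTEGER REFERENCES tickets(id),
        body      TEXT NOT NULL,
        CONSTRAINT chk_one_parent CHECK (
            (post_id IS NOT NULL) + (ticket_id IS NOT NULL) = 1
        )
    );
    INSERT INTO posts (id) VALUES (1);
    INSERT INTO tickets (id) VALUES (1);
""")

def try_insert(post_id, ticket_id):
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        conn.execute(
            "INSERT INTO comments (post_id, ticket_id, body) VALUES (?, ?, ?)",
            (post_id, ticket_id, "hello"),
        )
        return True
    except sqlite3.IntegrityError:
        return False

results = {
    "post only": try_insert(1, None),
    "ticket only": try_insert(None, 1),
    "both parents": try_insert(1, 1),
    "no parent": try_insert(None, None),
}
print(results)
```

Only the two single-parent rows are accepted; the constraint rejects a comment attached to both parents or to neither.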

Phase 2 — Indexing Strategy

Index Type Selection

| Index Type | Use When | Example |
|-----------|----------|---------|
| B-tree (default) | Equality, range, sorting, LIKE 'prefix%' | CREATE INDEX idx_users_email ON users(email) |
| Hash | Equality only, no range | CREATE INDEX idx_sessions_token ON sessions USING hash(token) |
| GIN | JSONB, full-text search, arrays, tsvector | CREATE INDEX idx_products_tags ON products USING gin(tags) |
| GiST | Geospatial, range types, nearest-neighbor | CREATE INDEX idx_locations_geom ON locations USING gist(geom) |
| BRIN | Very large tables with natural ordering (time-series) | CREATE INDEX idx_events_created ON events USING brin(created_at) |
| Partial | Subset of rows | CREATE INDEX idx_orders_pending ON orders(created_at) WHERE status = 'pending' |
| Covering | Include columns to avoid table lookup | CREATE INDEX idx_orders_user ON orders(user_id) INCLUDE (status, total) |

Indexing Rules

Always index: Foreign keys, columns in WHERE/JOIN/ORDER BY
Never index: Low-cardinality columns alone (boolean, status with 3 values) — combine in composite
Composite order: Most selective column first, then left-to-right matches query patterns
Watch write overhead: Each index slows INSERT/UPDATE. >8 indexes on a write-heavy table = review
Unused index audit: Run monthly — drop indexes with 0 scans
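
One way to confirm that an index is actually picked up is to compare the query plan before and after creating it. A minimal sketch with Python's built-in `sqlite3` module, where SQLite's `EXPLAIN QUERY PLAN` stands in for PostgreSQL's `EXPLAIN`; the `orders` table, its columns, and the `plan` helper are illustrative assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, total_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (user_id, total_cents) VALUES (?, ?)",
    [(i % 100, i * 10) for i in range(1000)],
)

def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

query = "SELECT id, total_cents FROM orders WHERE user_id = 42"

before = plan(query)  # full table scan: no index on user_id yet
conn.execute("CREATE INDEX idx_orders_user_id ON orders(user_id)")
after = plan(query)   # the plan now names idx_orders_user_id

print("before:", before)
print("after: ", after)
```

The exact plan wording differs between SQLite versions, so the sketch only relies on the index name appearing in the second plan.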

Find Unused Indexes (PostgreSQL)

-- pg_stat_user_indexes exposes relname / indexrelname (tablename / indexname belong to pg_indexes)
SELECT schemaname, relname AS tablename, indexrelname AS indexname, idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) as size
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexrelid NOT IN (
    SELECT conindid FROM pg_constraint WHERE contype IN ('p', 'u')
)
ORDER BY pg_relation_size(indexrelid) DESC;

Find Missing Indexes (PostgreSQL)

SELECT relname, seq_scan, seq_tup_read, 
       idx_scan, seq_tup_read / GREATEST(seq_scan, 1) as avg_tuples_per_scan
FROM pg_stat_user_tables
WHERE seq_scan > 100 AND seq_tup_read > 10000
ORDER BY seq_tup_read DESC;
-- High seq_scan + high seq_tup_read = missing index candidate

Phase 3 — Query Optimization

EXPLAIN Interpretation

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) SELECT ...;

Red flags in query plans:

| Pattern | Problem | Fix |
|---------|---------|-----|
| Seq Scan on large table | Missing index | Add appropriate index |
| Nested Loop with large outer | O(n×m) join | Add index on join column, consider Hash Join |
| Sort with high cost | Missing index for ORDER BY | Add index matching sort order |
| Hash Join spilling to disk | work_mem too low | Increase work_mem or reduce result set |
| Bitmap Heap Scan with many recheck | Low selectivity index | More selective index or partial index |
| SubPlan (correlated subquery) | Executes per row | Rewrite as JOIN or lateral |
| Rows estimate wildly wrong | Stale statistics | ANALYZE table |

Query Anti-Patterns & Fixes

1. SELECT * in production:

-- Bad: fetches all columns, breaks covering indexes
SELECT * FROM orders WHERE user_id = 123;
-- Good: explicit columns
SELECT id, status, total, created_at FROM orders WHERE user_id = 123;

2. N+1 queries:

-- Bad: 1 query for users + N queries for orders
SELECT id FROM users WHERE active = true;  -- returns 100 rows
SELECT * FROM orders WHERE user_id = ?;     -- called 100 times

-- Good: single JOIN or IN
SELECT u.id, o.id, o.total 
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.active = true;

3. Functions on indexed columns:

-- Bad: can't use index on created_at
WHERE EXTRACT(YEAR FROM created_at) = 2025
-- Good: range scan uses index
WHERE created_at >= '2025-01-01' AND created_at < '2026-01-01'

-- Bad: can't use index on email  
WHERE LOWER(email) = 'user@example.com'
-- Good: expression index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));

4. OR conditions killing indexes:

-- Risky: an OR chain can push the planner toward a BitmapOr or a Seq Scan
WHERE status = 'pending' OR status = 'processing'
-- Good: IN is a single condition the planner can match to one index scan
WHERE status IN ('pending', 'processing')

5. Pagination with OFFSET:

-- Bad: OFFSET 10000 scans and discards 10000 rows
SELECT * FROM products ORDER BY id LIMIT 20 OFFSET 10000;
-- Good: keyset pagination
SELECT * FROM products WHERE id > :last_seen_id ORDER BY id LIMIT 20;

6. COUNT(*) on large tables:

-- Bad: full table scan
SELECT COUNT(*) FROM events;
-- Good: approximate count (PostgreSQL)
SELECT reltuples::bigint FROM pg_class WHERE relname = 'events';
-- Or maintain a counter cache table
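
Keyset pagination (anti-pattern 5 above) needs the caller to carry the last seen id forward as a cursor. A minimal sketch with Python's built-in `sqlite3` module; the `products` data and the `fetch_page` / `iterate_all` helpers are illustrative assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO products (name) VALUES (?)",
    [(f"item-{i}",) for i in range(1, 51)],
)

def fetch_page(last_seen_id, page_size=20):
    """Fetch the next page of rows whose id is greater than last_seen_id."""
    return conn.execute(
        "SELECT id, name FROM products WHERE id > ? ORDER BY id LIMIT ?",
        (last_seen_id, page_size),
    ).fetchall()

def iterate_all(page_size=20):
    """Walk the table page by page, using the last id of each page as the cursor."""
    last_seen_id = 0
    while True:
        page = fetch_page(last_seen_id, page_size)
        if not page:
            break
        yield page
        last_seen_id = page[-1][0]

page_sizes = [len(page) for page in iterate_all()]
print(page_sizes)  # [20, 20, 10]
```

Each page is an index range scan starting at the cursor, so page 500 costs about the same as page 1. The trade-off is that callers can only move forward from a known id rather than jump to an arbitrary page number.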

Window Functions Reference

-- Running total
SELECT id, amount, SUM(amount) OVER (ORDER BY created_at) as running_total FROM payments;

-- Rank within group
SELECT *, RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) as dept_rank FROM employees;

-- Previous/next row
SELECT *, LAG(amount) OVER (ORDER BY created_at) as prev_amount,
          LEAD(amount) OVER (ORDER BY created_at) as next_amount FROM payments;

-- Moving average
SELECT *, AVG(amount) OVER (ORDER BY created_at ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as ma_7 FROM daily_sales;

-- Percent of total (100.0 avoids integer division when amount is an integer column)
SELECT *, 100.0 * amount / SUM(amount) OVER () as pct_of_total FROM line_items WHERE order_id = 1;

CTE Patterns

-- Recursive: org chart traversal
WITH RECURSIVE org AS (
    SELECT id, name, manager_id, 1 as depth FROM employees WHERE manager_id IS NULL
    UNION ALL
    SELECT e.id, e.name, e.manager_id, o.depth + 1
    FROM employees e JOIN org o ON e.manager_id = o.id
    WHERE o.depth < 10  -- safety limit
)
SELECT * FROM org ORDER BY depth, name;

-- Data pipeline: clean → transform → aggregate
WITH cleaned AS (
    SELECT *, TRIM(LOWER(email)) as clean_email FROM raw_signups WHERE email IS NOT NULL
),
deduped AS (
    SELECT DISTINCT ON (clean_email) * FROM cleaned ORDER BY clean_email, created_at DESC
)
SELECT DATE_TRUNC('week', created_at) as week, COUNT(*) FROM deduped GROUP BY 1 ORDER BY 1;
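
The recursive org-chart query above also runs unchanged on SQLite, which makes it easy to try locally. A minimal sketch with Python's built-in `sqlite3` module and a four-person sample hierarchy (the names are made up for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "manager_id INTEGER REFERENCES employees(id))"
)
conn.executemany(
    "INSERT INTO employees (id, name, manager_id) VALUES (?, ?, ?)",
    [(1, "Ada", None), (2, "Ben", 1), (3, "Cy", 1), (4, "Dee", 2)],
)

ORG_CHART_SQL = """
WITH RECURSIVE org AS (
    -- anchor: people with no manager
    SELECT id, name, manager_id, 1 AS depth FROM employees WHERE manager_id IS NULL
    UNION ALL
    -- recursive step: direct reports of everyone found so far
    SELECT e.id, e.name, e.manager_id, o.depth + 1
    FROM employees e JOIN org o ON e.manager_id = o.id
    WHERE o.depth < 10  -- safety limit against cycles
)
SELECT name, depth FROM org ORDER BY depth, name
"""

org_rows = conn.execute(ORG_CHART_SQL).fetchall()
print(org_rows)  # [('Ada', 1), ('Ben', 2), ('Cy', 2), ('Dee', 3)]
```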

Phase 4 — Migrations

Migration Safety Rules

Never rename columns/tables in production without a multi-step process
Never add NOT NULL without a DEFAULT on existing tables with data
Never drop columns that application code still references
Always test migrations on a copy of production data first
Always have a rollback plan (down migration)
Always take a backup before schema changes in production

Safe Migration Patterns

Add column (safe):

-- Step 1: Add nullable column
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
-- Step 2: Backfill (in batches!)
UPDATE users SET phone = '' WHERE phone IS NULL AND id BETWEEN 1 AND 10000;
-- Step 3: Add NOT NULL after backfill
ALTER TABLE users ALTER COLUMN phone SET NOT NULL;
ALTER TABLE users ALTER COLUMN phone SET DEFAULT '';

Rename column (safe multi-step):

-- Step 1: Add new column
ALTER TABLE users ADD COLUMN full_name VARCHAR(200);
-- Step 2: Dual-write in application code (write to both old + new)
-- Step 3: Backfill
UPDATE users SET full_name = name WHERE full_name IS NULL;
-- Step 4: Switch application to read from new column
-- Step 5: Drop old column (after confirming no reads)
ALTER TABLE users DROP COLUMN name;

Add index without locking (PostgreSQL):

CREATE INDEX CONCURRENTLY idx_orders_customer ON orders(customer_id);
-- Takes longer but doesn't block writes; cannot run inside a transaction block (BEGIN ... COMMIT)

Large table backfill (batched):

-- Don't: UPDATE millions of rows in one transaction
-- Do: batch it (COMMIT inside DO needs PostgreSQL 11+ and no surrounding transaction)
DO $$
DECLARE
    batch_size INT := 5000;
    affected INT;
BEGIN
    LOOP
        UPDATE users SET normalized_email = LOWER(email)
        WHERE id IN (
            SELECT id FROM users
            -- skip NULL emails: LOWER(NULL) stays NULL and the loop would never end
            WHERE normalized_email IS NULL AND email IS NOT NULL
            LIMIT batch_size
        );
        GET DIAGNOSTICS affected = ROW_COUNT;
        RAISE NOTICE 'Updated % rows', affected;
        EXIT WHEN affected = 0;
        COMMIT;
    END LOOP;
END $$;
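
The same batching loop can be driven from application code, which is often easier to pause, resume, and monitor. A minimal sketch with Python's built-in `sqlite3` module; the tiny batch size and the sample rows are for illustration only, and the `email IS NOT NULL` filter is an assumption added here so rows that can never be filled do not keep the loop running:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, normalized_email TEXT)"
)
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"User{i}@Example.com",) for i in range(12)] + [(None,)],
)
conn.commit()

def backfill_normalized_email(batch_size=5):
    """Backfill in small batches, committing after each; return per-batch row counts."""
    batch_counts = []
    while True:
        cur = conn.execute(
            """
            UPDATE users SET normalized_email = LOWER(email)
            WHERE id IN (
                SELECT id FROM users
                WHERE normalized_email IS NULL AND email IS NOT NULL
                LIMIT ?
            )
            """,
            (batch_size,),
        )
        conn.commit()  # keep each transaction short
        if cur.rowcount == 0:
            break
        batch_counts.append(cur.rowcount)
    return batch_counts

batches = backfill_normalized_email()
print(batches)  # [5, 5, 2]
```

Twelve rows are filled in three batches, and the one row with a NULL email is left untouched.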

Migration File Template

-- Migration: YYYYMMDDHHMMSS_description.sql
-- Author: [name]
-- Ticket: [JIRA/Linear ID]
-- Risk: low|medium|high
-- Rollback: see DOWN section
-- Estimated time: [for production data volume]
-- Requires: [prerequisite migrations]

-- ========== UP ==========
BEGIN;

-- [DDL/DML here]

COMMIT;

-- ========== DOWN ==========
-- BEGIN;
-- [Rollback DDL/DML here]
-- COMMIT;

-- ========== VERIFY ==========
-- [Queries to confirm migration succeeded]
-- SELECT COUNT(*) FROM ... WHERE ...;

Phase 5 — Performance Monitoring

Key Metrics Dashboard

health_metrics:
  connections:
    active: "SELECT count(*) FROM pg_stat_activity WHERE state = 'active'"
    idle: "SELECT count(*) FROM pg_stat_activity WHERE state = 'idle'"
    max: "SHOW max_connections"
    threshold: "active > 80% of max = ALERT"
    
  cache_hit_ratio:
    query: |
      SELECT ROUND(100.0 * sum(heap_blks_hit) / 
             NULLIF(sum(heap_blks_hit) + sum(heap_blks_read), 0), 2) as ratio
      FROM pg_statio_user_tables
    healthy: "> 99%"
    warning: "< 95%"
    critical: "< 90%"
    
  index_hit_ratio:
    query: |
      SELECT ROUND(100.0 * sum(idx_blks_hit) / 
             NULLIF(sum(idx_blks_hit) + sum(idx_blks_read), 0), 2) as ratio
      FROM pg_statio_user_indexes
    healthy: "> 99%"
    
  table_bloat:
    query: |
      SELECT relname, n_dead_tup, n_live_tup,
             ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup, 0), 2) as dead_pct
      FROM pg_stat_user_tables WHERE n_dead_tup > 10000
      ORDER BY n_dead_tup DESC LIMIT 10
    action: "VACUUM ANALYZE {table} when dead_pct > 20%"
    
  slow_queries:
    query: |
      SELECT query, calls, mean_exec_time, total_exec_time
      FROM pg_stat_statements
      ORDER BY total_exec_time DESC LIMIT 20
    action: "Optimize top 5 by total_exec_time first"
    
  replication_lag:
    query: |
      SELECT EXTRACT(EPOCH FROM replay_lag) as lag_seconds
      FROM pg_stat_replication
    warning: "> 5 seconds"
    critical: "> 30 seconds"
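
The cache-hit thresholds above translate directly into an alerting rule. A minimal sketch in Python; both function names are hypothetical, and the "watch" label for the 95-99% band is an assumption, since the source does not name that range:

```python
def cache_hit_ratio(blocks_hit, blocks_read):
    """Percentage of block requests served from cache; None when there is no traffic."""
    total = blocks_hit + blocks_read
    if total == 0:
        return None  # mirrors NULLIF(..., 0) in the SQL version
    return round(100.0 * blocks_hit / total, 2)

def classify_cache_hit_ratio(ratio):
    """Map a ratio (percent) onto the thresholds listed under cache_hit_ratio."""
    if ratio is None:
        return "no data"
    if ratio < 90:
        return "critical"
    if ratio < 95:
        return "warning"
    if ratio > 99:
        return "healthy"
    return "watch"  # 95-99% is not named in the source; this label is an assumption

print(classify_cache_hit_ratio(cache_hit_ratio(995, 5)))  # healthy
print(classify_cache_hit_ratio(cache_hit_ratio(93, 7)))   # warning
```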

Table Size Analysis

SELECT 
    relname as table_name,
    pg_size_pretty(pg_total_relation_size(relid)) as total_size,
    pg_size_pretty(pg_relation_size(relid)) as table_size,
    pg_size_pretty(pg_indexes_size(relid)) as index_size,  -- indexes only; total minus table would also count TOAST
    n_live_tup as row_count
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 20;

Lock Monitoring

-- Find blocking queries
SELECT 
    blocked.pid as blocked_pid,
    blocked.query as blocked_query,
    blocking.pid as blocking_pid,
    blocking.query as blocking_query,
    NOW() - blocked.query_start as blocked_duration
FROM pg_stat_activity blocked
JOIN pg_locks bl ON bl.pid = blocked.pid
JOIN pg_locks kl ON kl.locktype = bl.locktype AND kl.relation = bl.relation AND kl.pid != bl.pid
JOIN pg_stat_activity blocking ON blocking.pid = kl.pid
WHERE NOT bl.granted AND kl.granted;  -- only report sessions that actually hold the conflicting lock

Phase 6 — Backup & Recovery

Backup Strategy Decision

| Method | RPO | Speed | Use When |
|--------|-----|-------|----------|
| pg_dump (logical) | Time of last dump (no PITR) | Slow for >50GB | Small-medium DBs, cross-version migration |
| pg_basebackup (physical) | Continuous (with WAL) | Fast | Large DBs, same-version restore |
| WAL archiving (PITR) | Seconds | N/A (continuous) | Production with near-zero RPO |
| Replica promotion | Seconds | Instant | HA failover |

Backup Commands

# Logical backup (compressed, custom format; parallel dump with -j requires directory format -Fd)
pg_dump -Fc -Z 9 -d mydb -f backup_$(date +%Y%m%d_%H%M%S).dump

# Restore
pg_restore -d mydb -j 4 --clean --if-exists backup_20260216.dump

# Schema only
pg_dump -s -d mydb -f schema.sql

# Single table
pg_dump -t orders -d mydb -f orders_backup.dump

# Physical backup
pg_basebackup -D /backup/base -Ft -z -P -X stream

Backup Verification Checklist

[ ] Backup completes without errors
[ ] Backup file size is within expected range (not suspiciously small)
[ ] Restore to a test database succeeds
[ ] Row counts match production (spot check 5 tables)
[ ] Application can connect and query the restored database
[ ] Run automated test suite against restored backup
[ ] Backup encryption verified (if required)
[ ] Offsite copy confirmed
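
The row-count spot check from the list above is easy to automate. A minimal sketch with Python's built-in `sqlite3` module, where two in-memory databases stand in for production and the restored copy; the helper names and the sample tables are illustrative assumptions:

```python
import sqlite3

def row_counts(conn, tables):
    """Count rows per table. Table names must come from a trusted list, not user input."""
    return {t: conn.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0] for t in tables}

def compare_row_counts(source, restored, tables):
    """Return {table: (source_count, restored_count)} for every table that differs."""
    src, dst = row_counts(source, tables), row_counts(restored, tables)
    return {t: (src[t], dst[t]) for t in tables if src[t] != dst[t]}

# Demo: a "production" database and a restore that lost one row in orders.
production = sqlite3.connect(":memory:")
restored = sqlite3.connect(":memory:")
for db, order_rows in ((production, 3), (restored, 2)):
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
    db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
    db.executemany("INSERT INTO users (id) VALUES (?)", [(1,), (2,)])
    db.executemany("INSERT INTO orders (id) VALUES (?)", [(i,) for i in range(order_rows)])

mismatches = compare_row_counts(production, restored, ["users", "orders"])
print(mismatches)  # {'orders': (3, 2)}
```

An empty result means the spot-checked tables match. On a live system the counts can differ slightly because of writes made after the backup, so a tolerance may be needed.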

Phase 7 — Security

Access Control Checklist

-- Create application role (least privilege)
CREATE ROLE app_user LOGIN PASSWORD 'use-vault-not-plaintext';
GRANT CONNECT ON DATABASE mydb TO app_user;
GRANT USAGE ON SCHEMA public TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_user;
-- BIGSERIAL inserts also need access to the backing sequences
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO app_user;
-- NO: GRANT ALL, superuser, CREATE, DROP

-- Read-only role for analytics
CREATE ROLE analyst LOGIN PASSWORD 'use-vault';
GRANT CONNECT ON DATABASE mydb TO analyst;
GRANT USAGE ON SCHEMA public TO analyst;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO analyst;

-- Row-Level Security (multi-tenant)
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON orders
    USING (tenant_id = current_setting('app.tenant_id')::bigint);

SQL Injection Prevention

RULE 1: NEVER concatenate user input into SQL strings
RULE 2: Always use parameterized queries / prepared statements
RULE 3: Validate and whitelist table/column names if dynamic
RULE 4: Use ORMs for CRUD, raw SQL only for complex queries
RULE 5: Audit logs for unusual query patterns (UNION, DROP, --)
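
Rules 1 to 3 can be sketched in a few lines. The example below uses Python's built-in `sqlite3` module; the `users` table, the `ALLOWED_SORT_COLUMNS` set, and `find_users_by_email` are hypothetical names for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL, name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [("a@example.com", "Ann"), ("b@example.com", "Bob")],
)

ALLOWED_SORT_COLUMNS = {"id", "email", "name"}  # hypothetical allowlist for this table

def find_users_by_email(email, sort_by="id"):
    """Bind the value as a parameter; check the identifier against an allowlist."""
    if sort_by not in ALLOWED_SORT_COLUMNS:
        raise ValueError(f"unsupported sort column: {sort_by!r}")
    # Identifiers cannot be bound as parameters, so only allowlisted names reach the SQL text.
    sql = f"SELECT id, email FROM users WHERE email = ? ORDER BY {sort_by}"
    return conn.execute(sql, (email,)).fetchall()

print(find_users_by_email("a@example.com"))  # [(1, 'a@example.com')]
print(find_users_by_email("' OR '1'='1"))    # []
```

The injection attempt is treated as a literal email string and matches nothing, because the driver never splices the value into the SQL text.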

Data Protection

-- Encrypt sensitive columns (pgcrypto extension; the key is supplied by the application)
-- Store: pgp_sym_encrypt(data, key) 
-- Read: pgp_sym_decrypt(encrypted_col, key)

-- Audit trail table
CREATE TABLE audit_log (
    id BIGSERIAL PRIMARY KEY,
    table_name VARCHAR(100) NOT NULL,
    record_id BIGINT NOT NULL,
    action VARCHAR(10) NOT NULL, -- INSERT, UPDATE, DELETE
    old_data JSONB,
    new_data JSONB,
    changed_by BIGINT REFERENCES users(id),
    changed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    ip_address INET
);

-- Generic audit trigger
CREATE OR REPLACE FUNCTION audit_trigger() RETURNS TRIGGER AS $$
BEGIN
    INSERT INTO audit_log (table_name, record_id, action, old_data, new_data, changed_by)
    VALUES (
        TG_TABLE_NAME,
        COALESCE(NEW.id, OLD.id),
        TG_OP,
        CASE WHEN TG_OP != 'INSERT' THEN to_jsonb(OLD) END,
        CASE WHEN TG_OP != 'DELETE' THEN to_jsonb(NEW) END,
        current_setting('app.user_id', true)::bigint
    );
    RETURN COALESCE(NEW, OLD);
END;
$$ LANGUAGE plpgsql;

Phase 8 — PostgreSQL Configuration Tuning

Essential Settings by Server Size

| Setting | Small (4GB RAM) | Medium (16GB) | Large (64GB+) |
|---------|-----------------|---------------|---------------|
| shared_buffers | 1GB | 4GB | 16GB |
| effective_cache_size | 3GB | 12GB | 48GB |
| work_mem | 16MB | 64MB | 256MB |
| maintenance_work_mem | 256MB | 1GB | 2GB |
| max_connections | 100 | 200 | 300 |
| wal_buffers | 64MB | 128MB | 256MB |
| random_page_cost | 1.1 (SSD) | 1.1 (SSD) | 1.1 (SSD) |
| effective_io_concurrency | 200 (SSD) | 200 (SSD) | 200 (SSD) |
| max_parallel_workers_per_gather | 2 | 4 | 8 |
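
The first two rows of the table appear to follow roughly 25% of RAM for shared_buffers and 75% for effective_cache_size. That reading is inferred from the three columns rather than stated by the source, so the sketch below is a starting point to validate against real workload metrics; the function names are hypothetical:

```python
def _format_mb(mb):
    """Render a size in whole GB when possible, otherwise in MB."""
    return f"{mb // 1024}GB" if mb % 1024 == 0 else f"{mb}MB"

def suggest_memory_settings(ram_gb):
    """Derive two memory settings from the 25% / 75% ratios inferred from the table."""
    if ram_gb <= 0:
        raise ValueError("ram_gb must be positive")
    ram_mb = ram_gb * 1024
    return {
        "shared_buffers": _format_mb(ram_mb // 4),            # ~25% of RAM
        "effective_cache_size": _format_mb(ram_mb * 3 // 4),  # ~75% of RAM
    }

for ram in (4, 16, 64):
    print(ram, suggest_memory_settings(ram))
```

For 4, 16, and 64 GB this reproduces the shared_buffers and effective_cache_size values in the table. The remaining settings do not follow a single ratio and are left to the table.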

Connection Pooling (PgBouncer)

[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
; PgBouncer only treats ";" or "#" as a comment at the start of a line,
; so each note sits on its own line above the setting it describes.
; transaction pooling (best for most apps)
pool_mode = transaction
; accept up to 1000 app connections
max_client_conn = 1000
; 25 server connections per user/database pair
default_pool_size = 25
; extra connections for burst
reserve_pool_size = 5
; seconds before using reserve
reserve_pool_timeout = 3
; close idle server connections after 5 min
server_idle_timeout = 300

Phase 9 — Common Patterns

Soft Delete

-- Add to table
ALTER TABLE users ADD COLUMN deleted_at TIMESTAMPTZ;
CREATE INDEX idx_users_active ON users(id) WHERE deleted_at IS NULL;

-- Application queries always filter
SELECT * FROM users WHERE deleted_at IS NULL AND ...;

-- Or use a view
CREATE VIEW active_users AS SELECT * FROM users WHERE deleted_at IS NULL;

Optimistic Locking

UPDATE products SET 
    price = 29.99, 
    version = version + 1, 
    updated_at = NOW()
WHERE id = 123 AND version = 5;  -- expected version
-- If 0 rows affected → concurrent modification → retry or error
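
On the application side, the "0 rows affected" check is a row-count test after the UPDATE. A minimal sketch with Python's built-in `sqlite3` module; the `update_price` helper and the integer `price_cents` column are assumptions made for the demo:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products ("
    "id INTEGER PRIMARY KEY, price_cents INTEGER NOT NULL, "
    "version INTEGER NOT NULL DEFAULT 1)"
)
conn.execute("INSERT INTO products (id, price_cents) VALUES (123, 1999)")
conn.commit()

def update_price(product_id, new_price_cents, expected_version):
    """Apply the update only if the row still has the version the caller read."""
    cur = conn.execute(
        "UPDATE products SET price_cents = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_price_cents, product_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # False means another writer changed the row first

first = update_price(123, 2999, expected_version=1)  # succeeds, version becomes 2
stale = update_price(123, 2499, expected_version=1)  # rejected: version is now 2
print(first, stale)  # True False
```

A caller that gets `False` would typically re-read the row, reapply its change to the fresh data, and retry, or surface a conflict to the user.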

Event Sourcing Table

CREATE TABLE events (
    id BIGSERIAL PRIMARY KEY,
    aggregate_type VARCHAR(50) NOT NULL,
    aggregate_id UUID NOT NULL,
    event_type VARCHAR(100) NOT NULL,
    event_data JSONB NOT NULL,
    metadata JSONB DEFAULT '{}',
    version INTEGER NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE (aggregate_id, version)
);
-- UNIQUE (aggregate_id, version) already creates an index on those columns, so no separate index is needed
CREATE INDEX idx_events_type ON events(event_type, created_at);

Time-Series Optimization

-- Partitioned by month
CREATE TABLE metrics (
    id BIGSERIAL,
    sensor_id INTEGER NOT NULL,
    value NUMERIC(12,4) NOT NULL,
    recorded_at TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (recorded_at);

CREATE TABLE metrics_2026_01 PARTITION OF metrics
    FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE metrics_2026_02 PARTITION OF metrics
    FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

-- Auto-create future partitions via cron or pg_partman
-- Use BRIN index for time-series
CREATE INDEX idx_metrics_time ON metrics USING brin(recorded_at);
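
If partitions are created from a cron job rather than pg_partman, the DDL can be generated from a start month. A minimal sketch in Python that only builds the statements and leaves execution to the caller; `month_start` and `monthly_partition_ddl` are hypothetical helper names, and the table name is assumed to be trusted input:

```python
from datetime import date

def month_start(year, month, offset):
    """Return the first day of the month that is `offset` months after year-month."""
    index = year * 12 + (month - 1) + offset
    return date(index // 12, index % 12 + 1, 1)

def monthly_partition_ddl(table, year, month, count):
    """Build CREATE TABLE ... PARTITION OF statements for `count` consecutive months."""
    statements = []
    for i in range(count):
        start = month_start(year, month, i)
        end = month_start(year, month, i + 1)  # upper bound is exclusive
        statements.append(
            f"CREATE TABLE IF NOT EXISTS {table}_{start:%Y_%m} PARTITION OF {table}\n"
            f"    FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}');"
        )
    return statements

ddl = monthly_partition_ddl("metrics", 2026, 11, 3)
print("\n".join(ddl))
```

Starting in November shows the year rollover: the third statement creates `metrics_2027_01`. `IF NOT EXISTS` keeps the job safe to re-run.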

Full-Text Search (PostgreSQL)

-- Add search column
ALTER TABLE articles ADD COLUMN search_vector tsvector;
CREATE INDEX idx_articles_search ON articles USING gin(search_vector);

-- Populate
UPDATE articles SET search_vector = 
    setweight(to_tsvector('english', COALESCE(title, '')), 'A') ||
    setweight(to_tsvector('english', COALESCE(body, '')), 'B');

-- Search with ranking
SELECT id, title, ts_rank(search_vector, query) as rank
FROM articles, plainto_tsquery('english', 'database optimization') query
WHERE search_vector @@ query
ORDER BY rank DESC LIMIT 20;

JSONB Patterns

-- Store flexible attributes
CREATE TABLE products (
    id BIGSERIAL PRIMARY KEY,
    name VARCHAR(200) NOT NULL,
    attributes JSONB NOT NULL DEFAULT '{}',
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Index specific JSON paths
CREATE INDEX idx_products_color ON products((attributes->>'color'));
-- Or GIN for any key lookups
CREATE INDEX idx_products_attrs ON products USING gin(attributes);

-- Query patterns
SELECT * FROM products WHERE attributes->>'color' = 'red';
SELECT * FROM products WHERE attributes @> '{"size": "large"}';
SELECT * FROM products WHERE attributes ? 'warranty';

Phase 10 — Operational Runbooks

Emergency: Database Overloaded

-- 1. Find and kill long-running queries
SELECT pid, NOW() - query_start as duration, query 
FROM pg_stat_activity WHERE state = 'active' AND query_start < NOW() - INTERVAL '5 minutes'
ORDER BY duration DESC;

-- Kill a specific query
SELECT pg_cancel_backend(pid);    -- graceful
SELECT pg_terminate_backend(pid); -- force

-- 2. Check for lock contention (see Phase 5)

-- 3. Reduce max connections temporarily
-- In pgbouncer: pause database, reduce pool, resume

-- 4. Check if VACUUM is needed
SELECT relname, n_dead_tup, last_autovacuum FROM pg_stat_user_tables 
WHERE n_dead_tup > 100000 ORDER BY n_dead_tup DESC;

Emergency: Disk Full

# 1. Check what's consuming space (shell)
du -sh /var/lib/postgresql/*/main/ 2>/dev/null || du -sh /var/lib/mysql/

# 2. Clean up WAL files (PostgreSQL): CAREFUL
# Run in psql. Check replication slot status first
SELECT slot_name, active FROM pg_replication_slots;
# Drop inactive slots consuming WAL
SELECT pg_drop_replication_slot('unused_slot');

# 3. VACUUM FULL largest tables (run in psql; locks the table and rewrites it,
#    so it needs free space for a full copy: free space with steps 2 and 4 first)
VACUUM FULL large_table;

# 4. Remove old backups / logs (shell)
find /backups -name "*.dump" -mtime +7 -delete

Weekly Maintenance Checklist

[ ] Review slow query log (top 10 by total time)
[ ] Check index usage stats — drop unused, add missing
[ ] Verify backup success and test restore
[ ] Check table bloat — schedule VACUUM where needed
[ ] Review connection count trends
[ ] Check disk space trajectory
[ ] Review replication lag
[ ] Update table statistics: ANALYZE;

Phase 11 — Database Comparison Quick Reference

| Feature | PostgreSQL | MySQL (InnoDB) | SQLite |
|---------|-----------|----------------|--------|
| Best for | Complex queries, extensions | Web apps, read-heavy | Embedded, dev, small apps |
| Max size | Unlimited (practical) | Unlimited (practical) | 281 TB (practical ~1TB) |
| JSON support | JSONB (indexable, fast) | JSON (limited indexing) | JSON1 extension |
| Full-text search | Built-in (tsvector) | Built-in (FULLTEXT) | FTS5 extension |
| Window functions | Full support | Full support (8.0+) | Full support (3.25+) |
| CTEs | Recursive + materialized | Recursive (8.0+) | Recursive (3.8+) |
| Partitioning | Declarative + list/range/hash | Range/list/hash/key | None |
| Row-level security | Yes | No (use views) | No |
| Replication | Streaming + logical | Binary log | None (use Litestream) |
| Connection model | Process per connection | Thread per connection | In-process |

Quality Scoring Rubric (0-100)

| Dimension | Weight | 0 (Poor) | 5 (Good) | 10 (Excellent) |
|-----------|--------|----------|----------|-----------------|
| Schema Design | 20% | No normalization, no constraints | 3NF, FKs, proper types | Optimal normal form, all constraints, audit fields |
| Indexing | 15% | No indexes beyond PK | Indexes on FKs and common queries | Covering indexes, partials, no unused indexes |
| Query Quality | 20% | SELECT *, N+1, no EXPLAIN | Specific columns, JOINs, basic optimization | Keyset pagination, window functions, optimized plans |
| Migration Safety | 10% | Raw DDL, no rollback | Versioned files, up/down | Zero-downtime, batched backfills, concurrent indexes |
| Security | 15% | Superuser access, no audit | Least privilege, parameterized queries | RLS, encryption, audit triggers, regular access review |
| Monitoring | 10% | No monitoring | Basic alerts on connections/disk | Full dashboard, slow query analysis, proactive tuning |
| Backup/Recovery | 10% | No backups | Daily dumps | PITR, tested restores, offsite copies |

Score interpretation: <40 = Critical risk | 40-60 = Needs work | 60-80 = Solid | 80-90 = Professional | 90+ = Expert
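
The rubric does not spell out how the per-dimension scores combine. The Python sketch below assumes the simplest reading: each dimension is scored 0-10 and contributes its weight as a share of a 0-100 total. The function names are hypothetical, and treating a boundary value such as 60 or 80 as part of the higher band is also an assumption, since the published ranges overlap at their edges.

```python
# Weights copied from the rubric table (percent of the total).
WEIGHTS = {
    "Schema Design": 20,
    "Indexing": 15,
    "Query Quality": 20,
    "Migration Safety": 10,
    "Security": 15,
    "Monitoring": 10,
    "Backup/Recovery": 10,
}


def overall_score(scores):
    """Combine per-dimension scores (each 0-10) into a 0-100 total."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= scores[name] <= 10:
            raise ValueError(f"{name} must be between 0 and 10")
    # A weight of 20 with a score of 10 contributes 20 points.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 10


def interpret(total):
    """Map a 0-100 total to a band (assumption: boundaries go to the higher band)."""
    if total < 40:
        return "Critical risk"
    if total < 60:
        return "Needs work"
    if total < 80:
        return "Solid"
    if total < 90:
        return "Professional"
    return "Expert"


if __name__ == "__main__":
    scores = {name: 5 for name in WEIGHTS}  # "Good" on every dimension
    total = overall_score(scores)
    print(total, interpret(total))
```

Under these assumptions, a database rated 5 (Good) on every dimension totals 50 and falls in the "Needs work" band.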

Natural Language Commands

"Design a schema for [domain]" → Phase 1 full design process
"Optimize this query: [SQL]" → EXPLAIN analysis + rewrite
"Add an index for [query pattern]" → Index type selection + creation
"Write a migration to [change]" → Safe migration with rollback
"Audit this database" → Full scoring across all dimensions
"Set up monitoring for [database]" → Phase 5 dashboard queries
"Review this schema" → Naming, types, constraints, relationships check
"Help me with [PostgreSQL/MySQL/SQLite] [topic]" → Platform-specific guidance
"Troubleshoot slow queries" → pg_stat_statements analysis + top fixes
"Plan a backup strategy" → Phase 6 decision framework
"Make this table multi-tenant" → RLS + tenant_id pattern
"Convert this to use partitioning" → Phase 9 time-series pattern

API & Reliability

Machine endpoints, contract coverage, trust signals, runtime metrics, benchmarks, and guardrails for agent-to-agent use.

Missing · CLAWHUB

Machine interfaces

Contract & API

Endpoints

Dossier API, Snapshot API, Contract API, Trust API

Contract coverage

Status

missing

Auth

None

Streaming

No

Data region

Unspecified

Protocol support

OpenClaw: self-declared

Requires: none

Forbidden: none

Guardrails

Operational confidence: low

No positive guardrails captured.

Invocation examples

curl -s "https://xpersona.co/api/v1/agents/clawhub-skills-1kalin-afrexai-database-engineer/snapshot"

curl -s "https://xpersona.co/api/v1/agents/clawhub-skills-1kalin-afrexai-database-engineer/contract"

curl -s "https://xpersona.co/api/v1/agents/clawhub-skills-1kalin-afrexai-database-engineer/trust"

Operational fit

Reliability & Benchmarks

Trust signals

Handshake

UNKNOWN

Confidence

unknown

Attempts 30d

unknown

Fallback rate

unknown

Runtime metrics

Observed P50

unknown

Observed P95

unknown

Rate limit

unknown

Estimated cost

unknown

Do not use if

Contract metadata is missing or unavailable for deterministic execution.

No benchmark suites or observed failure patterns are available.

Machine Appendix

Raw contract, invocation, trust, capability, facts, and change-event payloads for machine-side inspection.

Missing · CLAWHUB

Contract JSON

{
  "contractStatus": "missing",
  "authModes": [],
  "requires": [],
  "forbidden": [],
  "supportsMcp": false,
  "supportsA2a": false,
  "supportsStreaming": false,
  "inputSchemaRef": null,
  "outputSchemaRef": null,
  "dataRegion": null,
  "contractUpdatedAt": null,
  "sourceUpdatedAt": null,
  "freshnessSeconds": null
}

Invocation Guide

{
  "preferredApi": {
    "snapshotUrl": "https://xpersona.co/api/v1/agents/clawhub-skills-1kalin-afrexai-database-engineer/snapshot",
    "contractUrl": "https://xpersona.co/api/v1/agents/clawhub-skills-1kalin-afrexai-database-engineer/contract",
    "trustUrl": "https://xpersona.co/api/v1/agents/clawhub-skills-1kalin-afrexai-database-engineer/trust"
  },
  "curlExamples": [
    "curl -s \"https://xpersona.co/api/v1/agents/clawhub-skills-1kalin-afrexai-database-engineer/snapshot\"",
    "curl -s \"https://xpersona.co/api/v1/agents/clawhub-skills-1kalin-afrexai-database-engineer/contract\"",
    "curl -s \"https://xpersona.co/api/v1/agents/clawhub-skills-1kalin-afrexai-database-engineer/trust\""
  ],
  "jsonRequestTemplate": {
    "query": "summarize this repo",
    "constraints": {
      "maxLatencyMs": 2000,
      "protocolPreference": [
        "OPENCLEW"
      ]
    }
  },
  "jsonResponseTemplate": {
    "ok": true,
    "result": {
      "summary": "...",
      "confidence": 0.9
    },
    "meta": {
      "source": "CLAWHUB",
      "generatedAt": "2026-04-17T05:41:20.621Z"
    }
  },
  "retryPolicy": {
    "maxAttempts": 3,
    "backoffMs": [
      500,
      1500,
      3500
    ],
    "retryableConditions": [
      "HTTP_429",
      "HTTP_503",
      "NETWORK_TIMEOUT"
    ]
  }
}
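
The `retryPolicy` block above can be sketched in Python as follows. This is one possible reading, not the platform's client: `fetch_with_retry`, `http_get`, and `is_retryable` are hypothetical names, the 5-second timeout is arbitrary, and `maxAttempts` is assumed to count total calls, so only the first two backoff delays are ever used.

```python
import socket
import time
import urllib.error
import urllib.request

# Values copied from the retryPolicy block.
MAX_ATTEMPTS = 3
BACKOFF_MS = [500, 1500, 3500]
RETRYABLE_HTTP = {429, 503}


def http_get(url, timeout=5.0):
    """Fetch url and return the response body as bytes (timeout is an assumption)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def is_retryable(exc):
    """True for HTTP 429/503 and for network timeouts."""
    if isinstance(exc, urllib.error.HTTPError):
        return exc.code in RETRYABLE_HTTP
    if isinstance(exc, urllib.error.URLError):
        return isinstance(exc.reason, socket.timeout)
    return isinstance(exc, socket.timeout)


def fetch_with_retry(url, fetch=http_get, sleep=time.sleep):
    """Call fetch(url), waiting BACKOFF_MS[attempt] between retryable failures."""
    for attempt in range(MAX_ATTEMPTS):
        try:
            return fetch(url)
        except (urllib.error.URLError, socket.timeout) as exc:
            out_of_attempts = attempt == MAX_ATTEMPTS - 1
            if out_of_attempts or not is_retryable(exc):
                raise
            sleep(BACKOFF_MS[attempt] / 1000)
```

Passing `fetch` and `sleep` as parameters keeps the retry logic testable without network access or real delays; in normal use, call `fetch_with_retry(url)` with one of the endpoint URLs listed under `preferredApi`.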

Trust JSON

{
  "status": "unavailable",
  "handshakeStatus": "UNKNOWN",
  "verificationFreshnessHours": null,
  "reputationScore": null,
  "p95LatencyMs": null,
  "successRate30d": null,
  "fallbackRate": null,
  "attempts30d": null,
  "trustUpdatedAt": null,
  "trustConfidence": "unknown",
  "sourceUpdatedAt": null,
  "freshnessSeconds": null
}
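
The "Do not use if" guidance can be turned into a small pre-flight check over the Contract JSON and Trust JSON payloads. The Python sketch below is illustrative only: `blocking_reasons` is a hypothetical helper, and it treats the values seen on this page ("missing", "unavailable", "UNKNOWN", and null schema references) as blocking, without assuming what the healthy values look like.

```python
def blocking_reasons(contract, trust):
    """Return reasons to hold off on using the agent; an empty list means none found."""
    reasons = []
    if contract.get("contractStatus", "missing") == "missing":
        reasons.append("contract metadata is missing")
    if contract.get("inputSchemaRef") is None or contract.get("outputSchemaRef") is None:
        reasons.append("no input/output schema reference")
    if trust.get("status", "unavailable") == "unavailable":
        reasons.append("trust data is unavailable")
    if trust.get("handshakeStatus", "UNKNOWN") == "UNKNOWN":
        reasons.append("handshake status is unknown")
    return reasons


if __name__ == "__main__":
    # Abbreviated copies of the payloads shown above.
    contract = {"contractStatus": "missing", "inputSchemaRef": None, "outputSchemaRef": None}
    trust = {"status": "unavailable", "handshakeStatus": "UNKNOWN"}
    for reason in blocking_reasons(contract, trust):
        print("blocked:", reason)
```

For the payloads on this page, all four checks fire, which matches the dossier's low operational confidence.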

Capability Matrix

{
  "rows": [
    {
      "key": "OPENCLEW",
      "type": "protocol",
      "support": "unknown",
      "confidenceSource": "profile",
      "notes": "Listed on profile"
    },
    {
      "key": "from",
      "type": "capability",
      "support": "supported",
      "confidenceSource": "profile",
      "notes": "Declared in agent profile metadata"
    },
    {
      "key": "on",
      "type": "capability",
      "support": "supported",
      "confidenceSource": "profile",
      "notes": "Declared in agent profile metadata"
    },
    {
      "key": "with",
      "type": "capability",
      "support": "supported",
      "confidenceSource": "profile",
      "notes": "Declared in agent profile metadata"
    },
    {
      "key": "uses",
      "type": "capability",
      "support": "supported",
      "confidenceSource": "profile",
      "notes": "Declared in agent profile metadata"
    },
    {
      "key": "where",
      "type": "capability",
      "support": "supported",
      "confidenceSource": "profile",
      "notes": "Declared in agent profile metadata"
    },
    {
      "key": "select",
      "type": "capability",
      "support": "supported",
      "confidenceSource": "profile",
      "notes": "Declared in agent profile metadata"
    },
    {
      "key": "connect",
      "type": "capability",
      "support": "supported",
      "confidenceSource": "profile",
      "notes": "Declared in agent profile metadata"
    }
  ],
  "flattenedTokens": "protocol:OPENCLEW|unknown|profile capability:from|supported|profile capability:on|supported|profile capability:with|supported|profile capability:uses|supported|profile capability:where|supported|profile capability:select|supported|profile capability:connect|supported|profile"
}
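
The `flattenedTokens` string appears to encode each row as `type:key|support|confidenceSource`, with rows separated by spaces. Under that assumption, and assuming keys never contain spaces or pipes, a Python sketch for turning it back into rows could look like this; `parse_flattened_tokens` is a hypothetical name.

```python
def parse_flattened_tokens(text):
    """Split space-separated 'type:key|support|source' tokens into dicts."""
    rows = []
    for token in text.split():
        head, support, source = token.split("|")
        row_type, key = head.split(":", 1)
        rows.append(
            {
                "type": row_type,
                "key": key,
                "support": support,
                "confidenceSource": source,
            }
        )
    return rows


if __name__ == "__main__":
    sample = "protocol:OPENCLEW|unknown|profile capability:select|supported|profile"
    for row in parse_flattened_tokens(sample):
        print(row)
```

The parsed rows carry the same fields as the `rows` array, minus the free-text `notes`.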

Facts JSON

[
  {
    "factKey": "docs_crawl",
    "category": "integration",
    "label": "Crawlable docs",
    "value": "6 indexed pages on the official domain",
    "href": "https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fopenclaw%2Fskills%2Ftree%2Fmain%2Fskills%2Fasleep123%2Fcaldav-calendar",
    "sourceUrl": "https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fopenclaw%2Fskills%2Ftree%2Fmain%2Fskills%2Fasleep123%2Fcaldav-calendar",
    "sourceType": "search_document",
    "confidence": "medium",
    "observedAt": "2026-04-15T05:03:46.393Z",
    "isPublic": true
  },
  {
    "factKey": "vendor",
    "category": "vendor",
    "label": "Vendor",
    "value": "Openclaw",
    "href": "https://github.com/openclaw/skills/tree/main/skills/1kalin/afrexai-database-engineer",
    "sourceUrl": "https://github.com/openclaw/skills/tree/main/skills/1kalin/afrexai-database-engineer",
    "sourceType": "profile",
    "confidence": "medium",
    "observedAt": "2026-04-15T00:45:39.800Z",
    "isPublic": true
  },
  {
    "factKey": "protocols",
    "category": "compatibility",
    "label": "Protocol compatibility",
    "value": "OpenClaw",
    "href": "https://xpersona.co/api/v1/agents/clawhub-skills-1kalin-afrexai-database-engineer/contract",
    "sourceUrl": "https://xpersona.co/api/v1/agents/clawhub-skills-1kalin-afrexai-database-engineer/contract",
    "sourceType": "contract",
    "confidence": "medium",
    "observedAt": "2026-04-15T00:45:39.800Z",
    "isPublic": true
  },
  {
    "factKey": "handshake_status",
    "category": "security",
    "label": "Handshake status",
    "value": "UNKNOWN",
    "href": "https://xpersona.co/api/v1/agents/clawhub-skills-1kalin-afrexai-database-engineer/trust",
    "sourceUrl": "https://xpersona.co/api/v1/agents/clawhub-skills-1kalin-afrexai-database-engineer/trust",
    "sourceType": "trust",
    "confidence": "medium",
    "observedAt": null,
    "isPublic": true
  }
]

Change Events JSON

[
  {
    "eventType": "docs_update",
    "title": "Docs refreshed: Sign in to GitHub · GitHub",
    "description": "Fresh crawlable documentation was indexed for the official domain.",
    "href": "https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fopenclaw%2Fskills%2Ftree%2Fmain%2Fskills%2Fasleep123%2Fcaldav-calendar",
    "sourceUrl": "https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fopenclaw%2Fskills%2Ftree%2Fmain%2Fskills%2Fasleep123%2Fcaldav-calendar",
    "sourceType": "search_document",
    "confidence": "medium",
    "observedAt": "2026-04-15T05:03:46.393Z",
    "isPublic": true
  }
]
