🧪 Skills
Product Demo Video Creator
Create product demo videos with voiceover, text overlays, and real browser interactions. Fully automated, zero cost. Uses Puppeteer (headless Chrome), edge-t...
v1.0.0
Description
name: product-demo-video description: > Create product demo videos with voiceover, text overlays, and real browser interactions. Fully automated, zero cost. Uses Puppeteer (headless Chrome), edge-tts (Microsoft Neural TTS), PIL (text overlays), and FFmpeg (video encoding). Use when: user wants a demo video, product walkthrough, launch video, Product Hunt video, app showcase, or screen recording of a web application. Triggers: "demo video", "product video", "record demo", "launch video", "walkthrough video", "showcase video", "screen recording".
Product Demo Video Creator
Create polished demo videos with voiceover and text overlays — fully automated, zero cost.
Stack
| Tool | Purpose | Install |
|---|---|---|
| Puppeteer | Headless browser recording | npm i -g puppeteer |
| edge-tts | Microsoft Neural TTS (free) | pip3 install edge-tts |
| PIL/Pillow | Text overlays on frames | pip3 install Pillow (usually pre-installed) |
| FFmpeg | Video encoding | Static build from johnvansickle.com or package manager |
| Chromium | Browser engine | System package |
Quick Start
- Install dependencies (see
scripts/install-deps.sh) - Define scenes in
scripts/record-demo.mjs(copy template, customize) - Run:
node scripts/record-demo.mjs - Output: MP4 with voiceover + text overlays
Workflow
Define Scenes → Generate Voiceover → Record Browser → Add Text Overlays → Compile Video
↓ ↓ ↓ ↓ ↓
scenes[] edge-tts MP3 Puppeteer frames PIL drawtext FFmpeg concat
Step 1: Define Scenes
Each scene has: id, title, subtitle, narration, url, type, actions.
{
id: 'json',
title: 'JSON Formatter',
subtitle: 'Paste messy JSON, get formatted output',
narration: 'Paste any messy JSON and get it formatted instantly.',
url: 'https://example.com/json-formatter/',
type: 'tool', // 'intro' | 'tool' | 'outro'
actions: async (page) => {
await reactSetValue(page, 'textarea', '{"key":"value"}');
await wait(1000);
await clickButton(page, ['Format']);
await wait(2000);
}
}
Step 2: Scene Types and Overlay Positioning
| Type | Overlay Position | Use For |
|---|---|---|
intro |
Bottom bar (centered, large title) | Opening scene with product name |
tool |
Bottom bar (left-aligned title + right badge) | Individual feature demos |
outro |
Bottom bar (centered CTA) | Closing with call-to-action |
Critical: Use bottom bar, not top bar — top overlays conflict with website navigation.
Step 3: React App Interaction
React controlled components ignore direct .value assignment. Use reactSetValue():
async function reactSetValue(page, selector, value) {
await page.evaluate((sel, val) => {
const el = document.querySelector(sel);
const setter = Object.getOwnPropertyDescriptor(
el.tagName === 'TEXTAREA'
? window.HTMLTextAreaElement.prototype
: window.HTMLInputElement.prototype, 'value'
)?.set;
if (setter) setter.call(el, val);
else el.value = val;
el.dispatchEvent(new Event('input', { bubbles: true }));
el.dispatchEvent(new Event('change', { bubbles: true }));
}, selector, value);
}
Step 4: Voiceover
edge-tts --voice en-US-AndrewNeural --rate=+5% --text "Your narration" --write-media output.mp3
Recommended voices:
en-US-AndrewNeural— Male, warm, confident (best for product demos)en-US-AriaNeural— Female, positive, confidenten-US-BrianNeural— Male, approachable, casual
Step 5: Text Overlays
PIL adds text to frames. Key design rules:
- Solid dark background (alpha ≥ 230) — semi-transparent looks cheap
- Green accent line (2px) separating bar from content
- "100% Client-Side" or equivalent badge in green (#4ade80)
- Font: Noto Sans Bold for titles, Regular for subtitles
Step 6: Video Compilation
# Frames to video per scene
ffmpeg -framerate 6 -i frames/frame_%05d.png -c:v libx264 -preset slow -crf 20 -pix_fmt yuv420p -r 24 scene.mp4
# Add audio
ffmpeg -i scene.mp4 -i narration.mp3 -c:v copy -c:a aac -b:a 128k -shortest scene_final.mp4
# Normalize for concatenation
ffmpeg -i scene_final.mp4 -c:v libx264 -preset slow -crf 20 -r 24 -c:a aac -b:a 128k -ar 44100 -ac 2 scene_norm.mp4
# Concatenate
ffmpeg -f concat -safe 0 -i concat.txt -c copy output.mp4
Timing Guidelines
| Action | Wait Time |
|---|---|
| Page load | 600ms |
| After setting input value | 800-1000ms |
| After clicking action button | 2000-2500ms |
| Between password generations | 1500ms |
| QR code / image rendering | 2500ms |
| Show final result | 2000-3000ms |
Scene duration = max(8s, ceil(audio_duration) + 2s)
Capture Settings
| Setting | Value | Notes |
|---|---|---|
| Resolution | 1280×720 | Standard for PH/social |
| Capture FPS | 6 | Balance quality/performance |
| Output FPS | 24 | Smooth playback |
| CRF | 20 | Good quality |
| JPEG quality | N/A (PNG frames) | Lossless capture |
Troubleshooting
- Empty tool outputs: React apps need
reactSetValue(), not.value = - Dropdown menus open: Click
bodyafter interactions to dismiss - FFmpeg "No such filter: drawtext": Static FFmpeg builds lack it — use PIL instead
- edge-tts fails: Check network; it calls Microsoft servers
- Chromium won't start: Need
--no-sandbox --disable-gpu --disable-dev-shm-usage
References
references/demo-planning.md— Demo structure, pacing, what makes demos compellingscripts/record-demo.mjs— Complete working template (customize scenes for your app)scripts/overlay.py— PIL text overlay processorscripts/install-deps.sh— One-command dependency setup
Reviews (0)
Sign in to write a review.
No reviews yet. Be the first to review!
Comments (0)
No comments yet. Be the first to share your thoughts!