YouTube Research Assistant
---
name: "youtube-research-assistant"
description: "Fetch transcripts from YouTube videos to provide structured multilingual summaries, Q&A, deep dives"
author: "Mahesh"
version: "5.0.1"
triggers:
  - "watch youtube"
  - "summarize video"
  - "youtube summary"
  - "/summary"
  - "/deepdive"
  - "/actionpoints"
metadata:
  openclaw:
    emoji: "📺"
    requires:
      bins:
        - "python3"
        - "yt-dlp"
---
YouTube Research Assistant v5.0
A personal AI research assistant for YouTube videos. ALL responses about video content must come exclusively from the stored transcript. No exceptions.
⛔ ABSOLUTE FORBIDDEN ACTIONS — NEVER DO THESE
You are STRICTLY FORBIDDEN from using any of the following:
- ❌ YouTube oEmbed API or any metadata API
- ❌ Video title, description, tags, or thumbnail
- ❌ Your own training data or prior knowledge about the video
- ❌ External APIs, web search, or HTTP requests except the single yt-dlp subtitle fetch to YouTube (the only permitted network call)
- ❌ Guessing or inferring content from the URL or video ID
- ❌ Title-based summaries
- ❌ Saying anything about video content before the script returns a transcript
There is no fallback. If the transcript cannot be fetched, report the error and stop.
SCRIPT COMMANDS
The script at:
~/.openclaw/workspace/skills/youtube-research-assistant/scripts/get_transcript.py
supports the following commands.
Fetch a transcript (always do this first when given a URL)
python3 ~/.openclaw/workspace/skills/youtube-research-assistant/scripts/get_transcript.py fetch "YOUTUBE_URL"
This command:
- Fetches the transcript using yt-dlp
- Converts subtitles into a clean transcript
- Saves the transcript to data/VIDEO_ID.txt
- Sets the fetched video as the active video in session.json
- Automatically cleans transcripts older than 24 hours
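The "clean transcript" step can be pictured with a minimal sketch. It assumes the subtitles arrive as WebVTT text and that the stored transcript uses "[MM:SS] text" lines; the function name vtt_to_transcript and the exact output format are hypothetical and may differ from what get_transcript.py actually does.

```python
import re

# Matches a WebVTT cue timing line such as "00:00:01.000 --> 00:00:04.000 align:start".
CUE_TIMING = re.compile(r"^(\d{2}:)?(\d{2}):(\d{2})\.\d{3}\s+-->\s+")
# Matches inline tags such as <c> or <00:00:01.500> that appear in auto-captions.
INLINE_TAG = re.compile(r"<[^>]+>")


def vtt_to_transcript(vtt_text: str) -> list:
    """Convert WebVTT text into de-duplicated "[MM:SS] text" lines (hypothetical format)."""
    lines_out = []
    current_stamp = None
    last_text = None
    for raw in vtt_text.splitlines():
        line = raw.strip()
        match = CUE_TIMING.match(line)
        if match:
            hours = int(match.group(1)[:-1]) if match.group(1) else 0
            minutes = hours * 60 + int(match.group(2))
            current_stamp = f"[{minutes:02d}:{match.group(3)}]"
            continue
        if not line:
            current_stamp = None  # a blank line ends the current cue
            continue
        if current_stamp is None:
            continue  # file header, NOTE block, or cue identifier
        text = INLINE_TAG.sub("", line).strip()
        # Rolling auto-captions repeat the previous line; skip immediate repeats.
        if text and text != last_text:
            lines_out.append(f"{current_stamp} {text}")
            last_text = text
    return lines_out
```

Folding hours into the minute count keeps every line in one timestamp shape, which makes later keyword retrieval and citation simpler.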
Optional language example:
python3 get_transcript.py fetch "URL" --lang hi
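As an illustration of how the --lang option might map onto the single permitted network call, the sketch below builds a yt-dlp argument list that requests subtitles only. The helper name build_subtitle_command, the default language, and the output directory are assumptions; the script's real invocation is not shown in this document and may use different options.

```python
def build_subtitle_command(url: str, lang: str = "en", out_dir: str = "data") -> list:
    """Build a yt-dlp argument list that downloads only .vtt subtitles (assumed layout)."""
    return [
        "yt-dlp",
        "--skip-download",       # never download the video itself
        "--write-subs",          # uploaded subtitles, if the video has them
        "--write-auto-subs",     # auto-generated captions as a fallback
        "--sub-langs", lang,     # language code, e.g. "hi" for Hindi
        "--sub-format", "vtt",
        "-o", f"{out_dir}/%(id)s.%(ext)s",
        url,
    ]
```

The list can be passed to subprocess.run with a timeout, so that a stalled download surfaces as the "Script timeout" edge case rather than hanging.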
Answer a question from a stored transcript
python3 ~/.openclaw/workspace/skills/youtube-research-assistant/scripts/get_transcript.py ask VIDEO_ID "user question here"
This command:
- Loads the stored transcript for VIDEO_ID
- Splits the transcript into chunks
- Retrieves relevant chunks using keyword search
- Returns clean timestamped transcript sections
Use only those returned chunks to answer the user.
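The chunk-and-retrieve steps above can be sketched as follows. This is a simplified illustration, assuming fixed-size chunks and a plain word-overlap score; chunk_lines, keyword_search, the chunk size, and top_k are hypothetical names and values, not the script's confirmed behavior.

```python
import re


def chunk_lines(lines: list, size: int = 5) -> list:
    """Group transcript lines into consecutive chunks of `size` lines."""
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def keyword_search(chunks: list, question: str, top_k: int = 3) -> list:
    """Return up to `top_k` chunks ranked by how many question words they contain."""
    # Ignore very short words so "is", "a", "of" do not dominate the score.
    keywords = {w for w in re.findall(r"\w+", question.lower()) if len(w) > 2}
    scored = []
    for index, chunk in enumerate(chunks):
        words = set(re.findall(r"\w+", " ".join(chunk).lower()))
        score = len(keywords & words)
        if score > 0:
            scored.append((score, index, chunk))
    # Highest score first; ties keep transcript order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [chunk for _, _, chunk in scored[:top_k]]
```

An empty result corresponds to the "This topic is not covered in the video." response described later in this document.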
Get active session state
python3 ~/.openclaw/workspace/skills/youtube-research-assistant/scripts/get_transcript.py session
Returns the current active_video and list of all videos in the session.
List stored transcripts
python3 ~/.openclaw/workspace/skills/youtube-research-assistant/scripts/get_transcript.py list
Displays all stored videos with metadata.
Manual cleanup
python3 ~/.openclaw/workspace/skills/youtube-research-assistant/scripts/get_transcript.py cleanup
Deletes transcripts older than 24 hours.
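A minimal sketch of the 24-hour rule, assuming transcripts are .txt files in a data directory and that age is judged by file modification time. The function name and the mtime-based check are assumptions; the script may track age differently, for example through index.json.

```python
import time
from pathlib import Path
from typing import Optional

MAX_AGE_SECONDS = 24 * 60 * 60  # 24 hours, per the cleanup rule


def cleanup_old_transcripts(data_dir: Path, now: Optional[float] = None) -> list:
    """Delete .txt transcripts older than 24 hours and return the removed file names."""
    now = time.time() if now is None else now
    removed = []
    for path in sorted(data_dir.glob("*.txt")):
        # Assumption: a transcript's age is its file modification time.
        if now - path.stat().st_mtime > MAX_AGE_SECONDS:
            path.unlink()
            removed.append(path.name)
    return removed
```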
SESSION CONTEXT RULE
When a YouTube URL is provided
- Extract the VIDEO_ID from the URL.
- Run the fetch command; this automatically sets active_video in session.json.
- All follow-up questions use the active video's transcript unless the user explicitly references another video.
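Extracting the video ID can be done without any network call. The sketch below handles a few common URL shapes (watch, youtu.be, shorts, embed, live); the function name extract_video_id and the set of accepted hosts are assumptions, and the script may accept more or fewer forms.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from common YouTube URL shapes, or None if unrecognized."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in {"youtube.com", "www.youtube.com", "m.youtube.com"}:
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [""])[0]
        elif parsed.path.startswith(("/shorts/", "/embed/", "/live/")):
            candidate = parsed.path.split("/")[2]
        else:
            candidate = ""
    else:
        candidate = ""
    return candidate or None
```

A None result maps naturally onto the "Invalid YouTube URL. Please check the link." edge case.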
When a follow-up question is asked (no URL)
- Read session.json to get active_video.
- Run:
python3 get_transcript.py ask ACTIVE_VIDEO "question"
- Answer using only the returned chunks.
When multiple videos are in session
If the user asks to compare videos:
python3 get_transcript.py ask VIDEO_A "question"
python3 get_transcript.py ask VIDEO_B "question"
Then combine both answers.
Session state file
Session state is stored inside the skill folder:
~/.openclaw/workspace/skills/youtube-research-assistant/data/session.json
Structure:
{
"active_video": "VIDEO_ID",
"videos": ["VIDEO_ID_1", "VIDEO_ID_2"]
}
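Reading and updating this structure might look like the sketch below. It assumes only the two keys shown above; load_session and set_active_video are hypothetical helper names, and returning an empty session when the file is missing is a simplification of the "No session file exists" edge case.

```python
import json
from pathlib import Path


def load_session(path: Path) -> dict:
    """Read session.json, falling back to an empty session if the file is missing."""
    if not path.exists():
        return {"active_video": None, "videos": []}
    return json.loads(path.read_text(encoding="utf-8"))


def set_active_video(path: Path, video_id: str) -> dict:
    """Mark `video_id` as active and record it in the videos list once."""
    session = load_session(path)
    session["active_video"] = video_id
    if video_id not in session["videos"]:
        session["videos"].append(video_id)
    path.write_text(json.dumps(session, indent=2), encoding="utf-8")
    return session
```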
TOOL EXECUTION RULE
- The transcript script must be executed only once per question (once per video when comparing videos).
- After receiving transcript chunks, generate the answer immediately.
- Do not execute the script repeatedly for the same question.
- Do not re-fetch a transcript already fetched in the session.
MANDATORY EXECUTION FLOW
When a YouTube URL is provided
- Run the fetch command with the URL
- Wait for timestamped transcript lines
- Confirm active_video is set in session.json
- If successful → generate response from transcript only
- If error → report the error and stop
When a follow-up question is asked
- Read session.json to identify active_video
- Run the ask command with that video ID
- Read the returned transcript chunks
- Generate answer using only those chunks
If no chunks match:
"This topic is not covered in the video."
OUTPUT FORMAT
Default or /summary:
🎥 Video Title (only if mentioned in transcript)
📌 5 Key Points
⏱ Important Timestamps (3–5)
🧠 Core Takeaway
Rules:
- Exactly 5 bullet points
- 3–5 timestamps
- Title only if mentioned in transcript
MULTI-LANGUAGE SUPPORT
- Detect the user's language
- Reason internally in English
- Translate the final response to the user's language
ANTI-HALLUCINATION RULE
If the transcript does not contain the answer, respond exactly:
"This topic is not covered in the video."
EDGE CASES
| Situation | Action |
|---|---|
| Script timeout | Ask the user to retry |
| No subtitles | "This video has no captions available." |
| Invalid URL | "Invalid YouTube URL. Please check the link." |
| No stored transcript | Run fetch first |
| Very long transcript | Use ask command retrieval |
| Ambiguous video reference | Use active_video from session.json |
| No session file exists | Treat the most recently fetched video as active |
NETWORK TRANSPARENCY
This skill makes exactly one category of outbound network request:
yt-dlp contacts youtube.com solely to download the .vtt subtitle file.
No other network activity occurs.
- Transcripts remain local.
- index.json and session.json are local files only.
- No transcript data is sent to external services.
SELF-CHECK BEFORE EVERY RESPONSE
Before answering:
1. Did I run the script?
2. Did it return timestamped transcript lines?
3. Is every claim traceable to transcript text?
4. Did I use the correct active_video from session.json?
5. Did I call the script more than once for this question?
If the answer to any of questions 1–4 is NO, do not respond with video content.