From obsidian-vault-agent
Extracts transcript and key slides from a local video file using mlx-whisper, then creates a vault-formatted lecture note with embedded screenshots. Works with any language.
npx claudepluginhub tuan3w/obsidian-vault-agent --plugin obsidian-vault-agent
How this skill is triggered (by the user, by Claude, or both)
Slash command
/obsidian-vault-agent:lecture <path-to-video-file>
This skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
<Purpose>
<Use_When>
<Do_Not_Use_When>
<Execution_Policy>
Parse the video file path from $ARGUMENTS. If no path provided, ask the user. Verify the file exists and is a video format (mp4, mov, mkv, avi, webm).
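The path check in this step can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the skill's actual implementation: `validate_video_path` and `VIDEO_EXTENSIONS` are hypothetical names, and the extension set is simply the list given above.

```python
from pathlib import Path

# Extensions listed in this step; the constant name is hypothetical.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".mkv", ".avi", ".webm"}


def validate_video_path(raw_path: str) -> Path:
    """Return the expanded path, or raise ValueError with a user-facing message."""
    if not raw_path or not raw_path.strip():
        raise ValueError("No video path provided; ask the user for one.")
    path = Path(raw_path.strip()).expanduser()
    if not path.is_file():
        raise ValueError(f"File not found: {path}")
    # Compare case-insensitively so "Talk.MP4" is accepted
    if path.suffix.lower() not in VIDEO_EXTENSIONS:
        raise ValueError(f"Unsupported video format: {path.suffix or '(none)'}")
    return path
```

Raising a single exception type keeps the "ask the user or stop" decision in one place for the caller.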
Run the extraction script:
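SKILL_DIR="${CLAUDE_SKILL_DIR}"
LECTURE_OUTPUT="temp/lecture-extract-output.json"
# Make sure the output directory exists before redirecting into it
mkdir -p "$(dirname "$LECTURE_OUTPUT")"
# VIDEO_PATH is a placeholder for the path parsed from $ARGUMENTS
uv run "$SKILL_DIR/scripts/extract_lecture.py" "VIDEO_PATH" > "$LECTURE_OUTPUT" 2>&1 &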
IMPORTANT: This script takes time (several minutes for a 30-60 min video). Inform the user: "Extracting audio and transcribing — this will take a few minutes for a [duration] video."
Run it and wait for completion. Then read the output JSON.
The JSON contains:
- filename, duration, duration_seconds, width, height
- transcript.full_text, transcript.segments (with start/end times), transcript.language
- transcript.error (null if success)
- frames[]: array of {path, timestamp_seconds, timestamp} for each extracted slide
- output_dir: temp directory with extracted frames
If transcript.error is not null: inform the user and stop. Check if mlx-whisper is installed.
If transcript is very long (>80,000 chars): warn the user. Send first 60,000 chars to the agent with a note about total length.
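The error check and the length rule above might look like this in Python. It is only a sketch: `prepare_transcript` and the two constants are hypothetical names, the wording of the length note is an assumption, and the input dict is assumed to be the parsed extraction JSON with the fields listed above.

```python
from typing import Optional, Tuple

WARN_THRESHOLD = 80_000  # characters; above this, warn the user
TRUNCATE_TO = 60_000     # characters sent to the agent when truncating


def prepare_transcript(extraction: dict) -> Tuple[str, Optional[str]]:
    """Return (transcript_text, length_note_or_None) from the extraction JSON."""
    transcript = extraction["transcript"]
    if transcript.get("error") is not None:
        # Stop condition: the caller should inform the user and check mlx-whisper
        raise RuntimeError(f"Transcription failed: {transcript['error']}")
    full_text = transcript["full_text"]
    if len(full_text) <= WARN_THRESHOLD:
        return full_text, None
    note = (
        f"[Transcript truncated: showing first {TRUNCATE_TO:,} "
        f"of {len(full_text):,} characters]"
    )
    return full_text[:TRUNCATE_TO], note
```

When the note is not `None`, it can be shown to the user as the warning and also placed next to the transcript in the agent prompt.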
Copy the extracted frames to the vault's assets directory with a descriptive naming scheme:
# Generate a slug from the video filename
SLUG=$(echo "VIDEO_FILENAME" | sed 's/\.[^.]*$//' | tr '[:upper:]' '[:lower:]' | tr ' ' '-' | sed 's/[^a-z0-9-]//g' | cut -c1-30)
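ASSETS_DIR="assets"
mkdir -p "$ASSETS_DIR"
# FRAME_PATHS is a placeholder for the frame paths listed in the extraction JSON
for frame in FRAME_PATHS; do
  # Use the last run of digits in the filename as the frame number
  FRAME_NUM=$(basename "$frame" | grep -oE '[0-9]+' | tail -n 1)
  cp "$frame" "$ASSETS_DIR/lecture-${SLUG}-${FRAME_NUM}.jpg"
done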
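For comparison, the slug pipeline above can be mirrored in Python. This is a sketch for ASCII filenames, not part of the skill: `make_slug` and `asset_name` are hypothetical helpers, and the two-digit zero padding is an assumption based on the example filenames in this document.

```python
import re


def make_slug(video_filename: str, max_len: int = 30) -> str:
    """Strip the extension, lowercase, turn spaces into hyphens, drop other characters."""
    stem = re.sub(r"\.[^.]*$", "", video_filename)  # same pattern as the sed call
    slug = stem.lower().replace(" ", "-")
    slug = re.sub(r"[^a-z0-9-]", "", slug)
    return slug[:max_len]  # equivalent of cut -c1-30


def asset_name(slug: str, frame_number: int) -> str:
    """Build the vault asset filename for one frame (two-digit padding assumed)."""
    return f"lecture-{slug}-{frame_number:02d}.jpg"
```

Keeping the slug logic in one function makes it easy to check that the names copied into the assets directory match the names later given to the agent.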
Build a frame manifest for the noter agent. Each frame gets its asset filename (the name used for ![[embedding]]) and its timestamp.
Example manifest:
FRAMES WITH TIMESTAMPS:
- lecture-risk-mgmt-01.jpg (timestamp: 0:10)
- lecture-risk-mgmt-02.jpg (timestamp: 2:00)
- lecture-risk-mgmt-03.jpg (timestamp: 4:00)
...
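A manifest in the format above could be generated from the `frames[]` array roughly as follows. This is a sketch: `build_manifest` is a hypothetical helper, and it assumes frames are numbered sequentially from 1 in array order, which matches the example but is not confirmed by the extraction script.

```python
def build_manifest(frames: list, slug: str) -> str:
    """Render the FRAMES WITH TIMESTAMPS block from the extraction JSON's frames[]."""
    lines = ["FRAMES WITH TIMESTAMPS:"]
    for index, frame in enumerate(frames, start=1):
        # Assumed naming: lecture-<slug>-<two-digit index>.jpg
        filename = f"lecture-{slug}-{index:02d}.jpg"
        lines.append(f"- {filename} (timestamp: {frame['timestamp']})")
    return "\n".join(lines)
```

Only the human-readable `timestamp` field is used here; `timestamp_seconds` stays available if the frames need sorting first.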
Read the agent definition:
Read("${CLAUDE_SKILL_DIR}/agents/lecture-noter.md")
Search the vault for existing notes related to the lecture's topics using the MCP tool:
search_notes(query="KEYWORD", limit=20)
Or fall back to Grep if MCP is unavailable:
Grep(pattern="KEYWORD", path="notes/", glob="*.md", head_limit=20)
Review each frame using the Read tool to see what's on each slide. Build a brief description of each frame's content (1 line each) to include in the agent prompt.
Launch the lecture-noter agent:
Agent(
subagent_type="general-purpose",
model="sonnet",
run_in_background=false,
prompt="You are Lecture Noter. Follow these instructions exactly:
[INSERT FULL CONTENT OF agents/lecture-noter.md HERE]
VIDEO METADATA:
- Filename: [filename]
- Duration: [duration]
- Transcript language: [language from extraction JSON]
FRAMES WITH TIMESTAMPS AND DESCRIPTIONS:
- lecture-slug-01.jpg (timestamp: 0:10) — Title slide showing course name
- lecture-slug-02.jpg (timestamp: 2:00) — Diagram of risk framework
[... one line per frame with what you see on it]
EXISTING VAULT NOTES ON RELATED TOPICS:
[List any matching notes found in grep search]
TRANSCRIPT:
[full_text]
Produce the note body following the Output Format. Do NOT include frontmatter —
only the body starting from the # title line.
Use the exact filenames from FRAMES list for ![[embedding]] — do not invent filenames."
)
Generate the note ID from the current timestamp:
date +%Y%m%d%H%M%S
Determine the best subfolder for the note:
- notes/ml/
- notes/startup/
- notes/finance/
- notes/design/
- notes/psychology/
- notes/
Create the note file with frontmatter + agent output:
---
id: YYYYMMDDHHMMSS
type: lecture
processing_status: inbox
created_date: YYYY-MM-DD
updated_date: YYYY-MM-DD
---
[AGENT OUTPUT HERE — starts with # title, includes embedded screenshots]
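Assembling the file from the frontmatter template above and the agent's body might be sketched like this. `build_note` is a hypothetical helper; it assumes the ID and both date fields come from the same local timestamp, as the `date +%Y%m%d%H%M%S` command suggests.

```python
from datetime import datetime
from typing import Optional


def build_note(agent_output: str, now: Optional[datetime] = None) -> str:
    """Prepend the lecture frontmatter to the agent's note body."""
    now = now or datetime.now()
    today = now.strftime("%Y-%m-%d")
    frontmatter = "\n".join([
        "---",
        f"id: {now.strftime('%Y%m%d%H%M%S')}",
        "type: lecture",
        "processing_status: inbox",
        f"created_date: {today}",
        f"updated_date: {today}",
        "---",
    ])
    # The body is expected to start at the "# title" line
    return f"{frontmatter}\n{agent_output.strip()}\n"
```

Passing `now` explicitly keeps the ID and the two date fields consistent and makes the function easy to test.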
# OUTPUT_DIR is output_dir from the extraction JSON; :? aborts if it is unset or empty
rm -rf "${OUTPUT_DIR:?}"
rm -f "$LECTURE_OUTPUT"
<Tool_Usage>
<Escalation_And_Stop_Conditions>
If uv is not installed: tell the user to install it (brew install uv) and stop
$ARGUMENTS