
Claude-real-video - any LLM can watch a video


Let Claude — or any LLM — actually watch a video.

Most AI tools don't really see a video. Paste a YouTube link into ChatGPT and it reads the transcript, not the picture. Claude won't take a video file at all. Even Gemini, which can read video natively, has to send it up to Google and samples frames at a fixed interval (1 fps by default), so fast cuts slip past.

claude-real-video does it differently, and locally: point it at a URL or a file, and it pulls the frames that actually matter (every scene change, not a fixed quota), throws away the near-duplicates, transcribes the audio, and hands you a clean folder any LLM can read. Everything runs on your own machine; nothing is uploaded.

crv "https://www.youtube.com/watch?v=..."
# → crv-out/frames/*.jpg + crv-out/transcript.txt + crv-out/MANIFEST.txt

Then drop the frames + MANIFEST.txt into Claude / ChatGPT / Gemini and ask away.
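If you would rather assemble the upload from a script than by hand, the output folder can be gathered in a few lines. This is a minimal sketch that assumes only the layout shown in the usage line above (frames/*.jpg, transcript.txt, MANIFEST.txt); the function name `load_crv_output` is hypothetical, and sorting frames by file name is an assumption about how they are ordered.

```python
from pathlib import Path


def load_crv_output(out_dir: str = "crv-out") -> dict:
    """Collect frames and text files from an output folder.

    The folder layout is taken from the usage line; nothing here is
    part of the tool itself.
    """
    root = Path(out_dir)

    # Assumption: frame file names sort into playback order.
    frames = sorted((root / "frames").glob("*.jpg"))

    def read_text(name: str) -> str:
        # Return an empty string when a file is absent (for example,
        # no transcript because audio transcription was not installed).
        path = root / name
        return path.read_text(encoding="utf-8") if path.exists() else ""

    return {
        "frames": frames,
        "transcript": read_text("transcript.txt"),
        "manifest": read_text("MANIFEST.txt"),
    }


if __name__ == "__main__":
    bundle = load_crv_output()
    print(len(bundle["frames"]), "frames found")
```

The returned paths and strings can then be attached to whichever chat client or API you use.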

Why not just sample frames?

Most "let an LLM watch a video" scripts (and Gemini's own pipeline) grab frames at a fixed interval, e.g. one per second. That over-samples a static screencast and under-samples a fast-cut reel. claude-real-video is smarter:

| | fixed-interval sampling | claude-real-video |
|---|---|---|
| Frame selection | every N seconds | scene-change detection + density floor |
| Repeated shots (A-B-A cuts) | sent again every time | sliding-window dedup sends each shot once |
| Static slide (10 min) | ~600 near-identical frames | collapses to 1 (dedup) |
| Fast-cut reel | misses frames between samples | catches each visual change |
| Audio | often ignored | Whisper transcript w/ language detect |
| Where the video goes | often uploaded to a cloud | stays on your machine |
| Input | usually local file only | URL (yt-dlp) or local file |
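To make the "scene-change detection + density floor" row concrete, here is a rough sketch of one way such a selection rule could work. It is not taken from the project's code: the function name `select_frames`, the `threshold` and `max_gap` parameters, and the reading of "density floor" as "never go longer than `max_gap` seconds without keeping a frame" are all assumptions. The per-frame change scores are treated as given; ffmpeg's `select` filter exposes a `scene` score that could supply them, though the source does not say which detector the tool uses.

```python
def select_frames(timestamps, scores, threshold=0.3, max_gap=10.0):
    """Return indices of frames to keep.

    A frame is kept when it is the first frame, when its change score
    exceeds `threshold` (a scene change), or when at least `max_gap`
    seconds have passed since the last kept frame (the assumed
    density floor).
    """
    kept = []
    last_kept_time = None
    for index, (time, score) in enumerate(zip(timestamps, scores)):
        is_first = last_kept_time is None
        is_scene_change = score > threshold
        gap_too_long = (not is_first) and (time - last_kept_time >= max_gap)
        if is_first or is_scene_change or gap_too_long:
            kept.append(index)
            last_kept_time = time
    return kept


if __name__ == "__main__":
    # 30 seconds sampled once per second, with a single cut at t = 12 s.
    times = [float(t) for t in range(30)]
    change = [0.0] * 30
    change[12] = 0.9
    print(select_frames(times, change))
```

With these inputs the rule keeps four frames out of thirty: the opening frame, the cut, and two floor frames that cover the long static stretches.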

You feed the model fewer, more meaningful frames — cheaper context, better understanding.
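The sliding-window dedup idea from the comparison can be illustrated in the same spirit. This sketch assumes frames have already been reduced to small flat lists of grayscale values, and it swaps in a simple average hash with Hamming distance as the similarity test; the project's actual comparison method is not described in the source. The names `average_hash`, `hamming`, and `dedup_sliding_window`, and the `window` and `max_distance` parameters, are hypothetical.

```python
from collections import deque


def average_hash(pixels):
    """Hash a flat list of grayscale values: 1 where a pixel is above the mean."""
    mean = sum(pixels) / len(pixels)
    return tuple(1 if p > mean else 0 for p in pixels)


def hamming(a, b):
    """Count the positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def dedup_sliding_window(frames, window=5, max_distance=2):
    """Return indices of frames that differ from the last `window` kept frames."""
    recent = deque(maxlen=window)  # hashes of the most recently kept frames
    kept = []
    for index, pixels in enumerate(frames):
        frame_hash = average_hash(pixels)
        if any(hamming(frame_hash, seen) <= max_distance for seen in recent):
            continue  # near-duplicate of something kept recently
        kept.append(index)
        recent.append(frame_hash)
    return kept


if __name__ == "__main__":
    shot_a = [0] * 8 + [255] * 8
    shot_b = [255] * 8 + [0] * 8
    print(dedup_sliding_window([shot_a, shot_b, shot_a]))  # A-B-A cut
    print(len(dedup_sliding_window([shot_a] * 600)))       # static slide
```

Comparing against a window of kept frames, rather than only the previous frame, is what lets a return to shot A be recognized after shot B has intervened. With `window=1` the second A would be kept again.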

Install

pip install claude-real-video               # core (frames + dedup)
pip install "claude-real-video[whisper]"    # + audio transcription

System requirement: ffmpeg (ffmpeg / ffprobe)
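A quick way to confirm the system requirement before running anything is to check that both binaries are on your PATH. The helper below is a small sketch using only the standard library; `missing_tools` is a hypothetical name and not part of the package.

```python
import shutil


def missing_tools(tools=("ffmpeg", "ffprobe")):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("ffmpeg and ffprobe are available")
```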
