January 22, 2026

YouTube + Telegram: searchable transcripts on demand

Lisa Granqvist, Partner, Workflow Automation Expert

You find a great YouTube video. Then the real work starts: scrubbing the timeline, pausing, rewatching, copying quotes into a doc, and hoping you didn’t miss the good part.

This YouTube transcript automation hits content marketers first, honestly. But agency strategists building client briefs and founders doing competitive research feel it too. The outcome is simple: you get a clean transcript plus key metadata, and you can ask Telegram for the exact quote or takeaway you need.

Below you’ll see how the workflow pulls the video data, stitches the transcript, and turns Telegram into a “search bar” for the video’s ideas.

How This Automation Works

See how this solves the problem:


The Challenge: Finding answers inside long videos

When a video is 20 minutes (or 2 hours), “just watch it” stops being a plan. You end up taking messy notes in three places, losing timestamps, and redoing the same search every time someone asks, “Where did they say that?” It’s also deceptively expensive. If you’re turning videos into briefs, posts, or research summaries, the time cost shows up in the worst way: context switching, half-finished docs, and lots of second-guessing. And if you’re doing this for a team, the knowledge doesn’t stick. It evaporates.

It adds up fast. Here’s where it breaks down in day-to-day work.

  • You pause and rewind constantly, which means a “quick scan” turns into an hour.
  • Quotes get copied without enough context, so you later rewatch to confirm what they meant.
  • Video metadata (title, upload date, description) ends up missing from your notes, so your source tracking is shaky.
  • When you need one specific answer, you still have to sift through everything again.

The Fix: Turn any YouTube video into a Telegram “ask me anything”

This workflow takes a YouTube video ID, grabs the video’s metadata, extracts the transcript, and bundles it into one clean JSON response that an AI agent can actually use. After that, you interact through Telegram like you’re chatting with a researcher who already watched the whole thing. Ask for quotes, summaries, key takeaways, or clarification on a specific segment. The agent pulls its answers from the transcript and the video details, so responses stay grounded in what was actually said. You’re no longer hunting inside a video. You’re querying it. That’s the shift that makes this practical for content briefs and research work.

The workflow starts with a video ID coming in (often from a parent workflow). It then fetches metadata via the YouTube Data API, pulls and stitches the transcript into readable text, and hands both to a Telegram-based AI agent. Finally, your answer comes back as a chat reply you can copy straight into a doc.

Real-World Impact

Say you review 5 competitor or industry videos a week and each one is around 30–60 minutes. Manually, you might spend about 45 minutes watching, then another 20 minutes pulling quotes and writing takeaways, so call it roughly 5–6 hours weekly. With this workflow, it’s closer to 2 minutes to paste a video ID and ask what you need, plus a few minutes of AI response time. Even if you still skim the video later, you’ve already got the highlights and the exact lines to jump to.

Requirements

  • n8n instance (try n8n Cloud free)
  • Self-hosting option if you prefer (Hostinger works well)
  • Telegram for chat-based questions and answers
  • YouTube Data API to pull the title, description, and dates
  • OpenAI (or a compatible chat model) to generate grounded answers
  • YouTube API key (get it from Google Cloud Console credentials)

Skill level: Intermediate. You’ll connect accounts, add API keys, and be comfortable testing the workflow with a few real video IDs.

Need help implementing this? Talk to an automation expert (free 15-minute consultation).

The Workflow Flow

A YouTube video ID comes in. This template is designed to be triggered by a parent workflow, but the input is straightforward: one video ID that identifies what you want to analyze.

The workflow fetches the source data. One branch builds the YouTube API request and retrieves metadata (title, description, upload details). Another branch extracts the transcript, splits it into pieces, and concatenates it into a single readable text block.

An AI agent becomes your interface. A Telegram chat trigger captures your question, conversation memory keeps context for follow-ups, and the agent uses a chat model to answer based on the combined metadata and transcript.

You get a clean response you can reuse. The workflow aggregates everything into a single JSON payload (handy for storing later in Google Sheets, Excel, or Drive) and returns the agent’s answer back into Telegram.

You can easily modify where the transcript gets saved (Sheets, Excel, Drive, or Gmail) based on your needs. See the full implementation guide below for customization options.

Step-by-Step Implementation Guide

Step 1: Configure the Execute Workflow Trigger

Set up the parent workflow trigger and prepare the incoming data structure used throughout the automation.

  1. Open Triggered by Parent Flow and set Input Source to jsonExample.
  2. Paste the example JSON into JSON Example: { "query": { "videoId": "YouTube video id" } }.
  3. Open Set Workflow Inputs and set GOOGLE_API_KEY to your actual API key (replace [CONFIGURE_YOUR_API_KEY]).
  4. In Set Workflow Inputs, set VIDEO_ID to {{ $json.query.videoId }}.
  5. Confirm the connection from Triggered by Parent Flow → Set Workflow Inputs.

⚠️ Common Pitfall: Leaving [CONFIGURE_YOUR_API_KEY] unchanged will cause Build YouTube API Link to throw “The Google API Key is missing.”
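If you want to sanity-check this mapping outside n8n, the input handling can be sketched as plain JavaScript. The node and field names (GOOGLE_API_KEY, VIDEO_ID, query.videoId) follow this guide; the function itself is illustrative, not part of the template:

```javascript
// Sketch of what "Set Workflow Inputs" does: take the parent workflow's
// input (matching the JSON Example above) and pair the video ID with
// your API key. Names mirror the guide; shapes are assumptions.
function setWorkflowInputs(parentInput, googleApiKey) {
  if (!parentInput || !parentInput.query || !parentInput.query.videoId) {
    throw new Error("Missing query.videoId from parent workflow");
  }
  return {
    GOOGLE_API_KEY: googleApiKey,        // replace [CONFIGURE_YOUR_API_KEY]
    VIDEO_ID: parentInput.query.videoId, // i.e. {{ $json.query.videoId }}
  };
}
```

Running this with the example payload from step 2 confirms the video ID flows through before you wire up the real nodes.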

Step 2: Connect YouTube Data API and Transcript Retrieval

Build the YouTube API URL and fetch the transcript. These branches run simultaneously.

  1. Verify Build YouTube API Link uses the provided JavaScript to construct the API URL.
  2. Ensure Retrieve Video Metadata sets URL to {{ $json.youtubeUrl }}.
  3. Review Fetch Video Transcript to confirm it expects VIDEO_ID from input and uses the YouTube transcript fetch logic.
  4. Confirm the parallel routing: Set Workflow Inputs outputs to both Build YouTube API Link and Fetch Video Transcript in parallel.

Tip: The parallel execution speeds up processing by fetching metadata and transcript at the same time.
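The JavaScript inside Build YouTube API Link roughly amounts to constructing a videos.list request against the YouTube Data API v3. This is a hedged sketch of that logic, not the template's exact code; the input field names are the ones set in step 1:

```javascript
// Illustrative version of "Build YouTube API Link": construct a
// YouTube Data API v3 videos.list URL. part=snippet returns the
// title, description, and publishedAt metadata used downstream.
function buildYouTubeApiLink(input) {
  const { GOOGLE_API_KEY, VIDEO_ID } = input;
  if (!GOOGLE_API_KEY || GOOGLE_API_KEY === "[CONFIGURE_YOUR_API_KEY]") {
    // Matches the pitfall called out in step 1
    throw new Error("The Google API Key is missing.");
  }
  const params = new URLSearchParams({
    part: "snippet",
    id: VIDEO_ID,
    key: GOOGLE_API_KEY,
  });
  // Retrieve Video Metadata reads this as {{ $json.youtubeUrl }}
  return { youtubeUrl: `https://www.googleapis.com/youtube/v3/videos?${params}` };
}
```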

Step 3: Set Up Transcript Processing

Split the transcript into pieces and concatenate it into a single text block for analysis.

  1. In Separate Transcript Pieces, set Field to Split Out to transcript.
  2. In Concatenate Transcript Text, ensure Fields to Summarize uses field text with aggregation concatenate, and set Separate By so the pieces join cleanly (a single space works well).
  3. Verify the path Fetch Video Transcript → Separate Transcript Pieces → Concatenate Transcript Text.
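Conceptually, the split-then-concatenate step reduces to joining the text of every transcript piece. This sketch assumes the transcript arrives as an array of { text } objects, which is a common transcript shape but not guaranteed by every source:

```javascript
// Illustrative equivalent of Separate Transcript Pieces +
// Concatenate Transcript Text: flatten an array of transcript
// pieces into one readable text block. The { text } piece shape
// is an assumption about the fetched transcript format.
function concatenateTranscript(payload) {
  const pieces = payload.transcript || [];
  // Split Out emits one item per piece; Concatenate joins the
  // `text` field of each item with a space separator.
  return pieces.map((p) => p.text).join(" ");
}
```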

Step 4: Configure Merge and Aggregate Outputs

Combine metadata with transcript text and build a single JSON payload.

  1. In Combine Details and Transcript, set Mode to combine and Combine By to combineByPosition.
  2. Ensure Retrieve Video Metadata and Concatenate Transcript Text both connect to Combine Details and Transcript.
  3. In Aggregate into Single JSON, set Aggregate to aggregateAllItemData.
  4. In Return Video Data Response, set the response value to {{ $json.data }}.
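To picture what these merge settings produce, here is a hedged JavaScript sketch of combine-by-position followed by aggregate-all-item-data. The field names in the sample items (title, concatenated_text) are assumptions for illustration:

```javascript
// Sketch of "Combine Details and Transcript" in combineByPosition mode:
// item N from the metadata branch is merged with item N from the
// transcript branch.
function combineByPosition(metadataItems, transcriptItems) {
  return metadataItems.map((meta, i) => ({ ...meta, ...transcriptItems[i] }));
}

// Sketch of "Aggregate into Single JSON" in aggregateAllItemData mode:
// every item is wrapped into one `data` array, which
// "Return Video Data Response" then returns as {{ $json.data }}.
function aggregateAllItemData(items) {
  return { data: items };
}
```

Because there is exactly one video per run, each branch carries a single item, so the combined payload is one object holding both the metadata and the full transcript text.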

Step 5: Set Up the AI Assistant and Tools

Configure the agent, memory, and tool workflow used to analyze YouTube video content.

  1. Open Incoming Chat Event to enable chat-triggered analysis.
  2. In YouTube Insight Agent, set Text to {{ $json.chatInput }} and keep the provided System Message content.
  3. Connect Conversation Memory Window to YouTube Insight Agent as memory (credentials are added on the parent AI node, not here).
  4. Connect YouTube Analysis Tool to YouTube Insight Agent as a tool and keep Name set to youtube_video_analyzer with Workflow ID {{ $workflow.id }}.
  5. Open Compact GPT Model and confirm Model is gpt-4o-mini with Temperature 0.1.
  6. Open Utility: DeepSeek Chat Model and set Model to deepseek-chat if used for alternate responses.

Credential Required: Connect your openAiApi credentials in Compact GPT Model and Utility: DeepSeek Chat Model. Conversation Memory Window and YouTube Analysis Tool are sub-nodes—credentials should be added to YouTube Insight Agent via its connected language model.

Step 6: Test and Activate Your Workflow

Run a manual test to validate transcript retrieval, metadata fetch, and AI analysis.

  1. Use Triggered by Parent Flow to supply a sample videoId and run the workflow.
  2. Confirm successful execution shows a combined payload in Return Video Data Response with transcript and metadata.
  3. Trigger Incoming Chat Event with a prompt containing a YouTube URL or ID to verify YouTube Insight Agent produces a structured summary.
  4. Once verified, toggle the workflow to Active for production use.

Watch Out For

  • YouTube Data API credentials can expire or lack the right API enabled. If metadata stops loading, check your Google Cloud Console project and API restrictions first.
  • If you’re using Wait nodes or external rendering, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
  • Default prompts in AI nodes are generic. Add your brand voice early or you’ll be editing outputs forever.

Common Questions

How quickly can I implement this YouTube transcript automation?

About 30 minutes if you already have your API keys and Telegram ready.

Can non-technical teams implement this transcript automation?

Yes. You won’t write code, but you will connect accounts and paste in API keys.

Is n8n free to use for this YouTube transcript automation workflow?

Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in OpenAI API costs, which are usually a few cents per run depending on transcript length and how many questions you ask.

Where can I host n8n to run this automation?

Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.

How do I adapt this YouTube transcript automation solution to my specific challenges?

Start by adjusting what gets returned in the “Return Video Data Response” step so the agent always sees the fields you care about (for example, channel name, upload date, or a cleaned description). If you want a searchable archive, add a save step right after “Aggregate into Single JSON” to write the transcript and metadata into Google Sheets or Microsoft Excel 365. For different answer styles, edit the agent instructions in “YouTube Insight Agent” so it produces brief-ready bullets, verbatim quotes, or a summary format your team already uses. You can also swap the “Compact GPT Model” for another model if cost or tone is a concern.

Why is my YouTube connection failing in this workflow?

Usually it’s an API key issue. Make sure the YouTube Data API is enabled in the same Google Cloud project as your key, then confirm the key restrictions allow the requests. If transcript extraction fails but metadata works, the video may not have transcripts available (or it’s region/permission limited), so test with a known video that has captions. Rate limits can also show up if you run many videos back-to-back.

What’s the capacity of this YouTube transcript automation solution?

On a typical n8n Cloud plan, you can run thousands of executions per month, and each video analysis is usually one execution plus your chat interactions. If you self-host, there’s no execution cap; the practical limit is your server and how quickly your AI provider handles requests. Transcript length is the real bottleneck, so very long videos can take longer to process and cost a bit more per query.

Is this YouTube transcript automation better than using Zapier or Make?

Often, yes. This workflow benefits from n8n’s ability to merge multiple data branches (metadata + transcript) and run agent-style logic without turning every fork into a pricing upgrade. Zapier or Make can be simpler for tiny automations, but once you want conversational Q&A, memory, and a “tool” workflow pattern, n8n stays flexible. Self-hosting is also a big deal if you want predictable costs. If you’re torn, talk to an automation expert and we’ll map it to your exact volume and use case.

Set this up once, and you stop treating videos like a black box. Your research gets faster, cleaner, and way easier to reuse.

Need Help Setting This Up?

Our automation experts can build and customize this workflow for your specific needs. Free 15-minute consultation—no commitment required.

Lisa Granqvist

Workflow Automation Expert

Expert in workflow automation and no-code tools.
