YouTube to Telegram, summaries and Q&A on demand
You save a YouTube video “for later,” then later never happens. Or worse, you half-watch it while answering emails, miss the good parts, and still have no notes you can actually use.
This is where YouTube summaries in Telegram help. Marketers use them to pull angles and quotes fast, founders use them to stay sharp without losing a morning, and students rely on them when one lecture video turns into five.
This workflow turns a YouTube link into a clean summary inside Telegram, saves the transcript in Google Docs, and lets you ask follow-up questions any time. You’ll see what it does, what you need, and how to make it fit your setup.
How This Automation Works
The full n8n workflow, from trigger to final output:
n8n Workflow Template: YouTube to Telegram, summaries and Q&A on demand
flowchart LR
subgraph sg0["Trigger on Telegram Message Flow"]
direction LR
n0@{ icon: "mdi:swap-vertical", form: "rounded", label: "Split Transcript into Segments", pos: "b", h: 48 }
n1@{ icon: "mdi:swap-vertical", form: "rounded", label: "Extract YouTube URL from Input", pos: "b", h: 48 }
n2["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/code.svg' width='40' height='40' /></div><br/>Extract Video ID from URL"]
n3@{ icon: "mdi:brain", form: "rounded", label: "gpt-4o-mini", pos: "b", h: 48 }
n4@{ icon: "mdi:robot", form: "rounded", label: "Generate Summary with GPT-4o..", pos: "b", h: 48 }
n5@{ icon: "mdi:cog", form: "rounded", label: "Concatenate Transcript Segme..", pos: "b", h: 48 }
n6@{ icon: "mdi:play-circle", form: "rounded", label: "Trigger on Telegram Message", pos: "b", h: 48 }
n7@{ icon: "mdi:code-braces", form: "rounded", label: "Extract YouTube Transcript", pos: "b", h: 48 }
n8["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/telegram.svg' width='40' height='40' /></div><br/>Send Summary via Telegram"]
n9["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/webhook.dark.svg' width='40' height='40' /></div><br/>Receive YouTube URL via Webh.."]
n10["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/webhook.dark.svg' width='40' height='40' /></div><br/>Send Response to Webhook"]
n12@{ icon: "mdi:cog", form: "rounded", label: "Retrieve Transcript from Goo..", pos: "b", h: 48 }
n13@{ icon: "mdi:cog", form: "rounded", label: "Update Transcript in Google ..", pos: "b", h: 48 }
n3 -.-> n4
n2 --> n7
n8 --> n10
n7 --> n0
n6 --> n1
n1 --> n2
n0 --> n5
n5 --> n4
n5 --> n12
n9 --> n1
n4 --> n8
n12 --> n13
end
subgraph sg1["Telegram Flow"]
direction LR
n11["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/telegram.svg' width='40' height='40' /></div><br/>Telegram Trigger"]
n14@{ icon: "mdi:robot", form: "rounded", label: "Handle User Questions via AI", pos: "b", h: 48 }
n15@{ icon: "mdi:brain", form: "rounded", label: "OpenAI Chat Model", pos: "b", h: 48 }
n17@{ icon: "mdi:cog", form: "rounded", label: "Google Docs2", pos: "b", h: 48 }
n18["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/telegram.svg' width='40' height='40' /></div><br/>Send AI Response via Telegram"]
n17 -.-> n14
n11 --> n14
n15 -.-> n14
n14 --> n18
end
subgraph sg2["Window Buffer Memory Flow"]
direction LR
n16@{ icon: "mdi:memory", form: "rounded", label: "Window Buffer Memory", pos: "b", h: 48 }
end
%% Styling
classDef trigger fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
classDef ai fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef aiModel fill:#e8eaf6,stroke:#3f51b5,stroke-width:2px
classDef decision fill:#fff8e1,stroke:#f9a825,stroke-width:2px
classDef database fill:#fce4ec,stroke:#c2185b,stroke-width:2px
classDef api fill:#fff3e0,stroke:#e65100,stroke-width:2px
classDef code fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef disabled stroke-dasharray: 5 5,opacity: 0.5
class n6,n11 trigger
class n4,n14 ai
class n3,n15 aiModel
class n16 ai
class n9,n10 api
class n2,n7 code
classDef customIcon fill:none,stroke:none
class n2,n8,n9,n10,n11,n18 customIcon
The Problem: YouTube research disappears the moment you close the tab
If you use YouTube for research, you already know the loop. Someone shares a “must watch” link. You open it, skim the comments, and realize it’s 42 minutes long. You either bail, or you watch it and tell yourself you’ll remember the key points. Then you need that one quote, that one framework, that one timestamp… and it’s gone. Rewatching is a time tax, and taking manual notes is slow enough that you stop doing it consistently, which means your “research” never becomes a reusable asset.
The friction compounds. Here’s where it breaks down.
- You end up rewatching the same 30–60 minute videos just to find one segment.
- Notes live in random places (or nowhere), so your team can’t search or reuse them.
- When you do take notes, they’re inconsistent because you’re rushing and multitasking.
- Follow-up questions turn into another full watch-through instead of a quick answer.
The Solution: Send a link, get a summary, then ask questions later
This n8n workflow gives you a simple habit: drop a YouTube link into Telegram (or send it via a webhook), then let the workflow do the heavy lifting. It pulls the video transcript, stitches it into one usable text, and saves it into a Google Docs document so you have a permanent, searchable record. Next, GPT-4o-mini creates a structured summary with a clear overview and key moments, then sends that summary right back to your Telegram chat. Later, when you want specifics, you can ask questions in Telegram and the AI answers using the stored transcript as context, so you’re not guessing from memory.
The workflow starts with a Telegram message trigger or a webhook intake. It extracts the transcript, writes it to Google Docs, generates a summary, and posts it to Telegram. After that, your Q&A messages go through an AI agent that looks up the transcript and replies with grounded answers.
What You Get: Automation vs. Results
| What This Workflow Automates | Results You’ll Get |
|---|---|
|
|
Example: What This Looks Like
Say you review five YouTube videos a week for competitive research, and each one is around 40 minutes. Manually, most people spend about 10 minutes skimming, then another 20 minutes jumping around for the “good parts,” so that’s roughly 2.5 hours weekly (and you still don’t have a clean transcript saved). With this workflow, you send five links in Telegram in about 5 minutes total, wait a bit for transcript + summary, and you’re done. When a teammate asks, “Did they mention pricing?” you ask the bot and get an answer without reopening the video.
What You’ll Need
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- Telegram to send links and receive summaries.
- Google Docs (Google Drive) to store transcripts for search.
- OpenAI API key (get it from your OpenAI API dashboard).
Skill level: Intermediate. You’ll connect accounts, add credentials, and (if self-hosting) install one community node.
Don’t want to set this up yourself? Talk to an automation expert (free 15-minute consultation).
How It Works
A YouTube link comes in. You either message the link to Telegram (chat trigger) or send it to an n8n webhook (handy for mobile shortcuts and share sheets).
The workflow extracts the transcript. It derives the video identifier, pulls the transcript using the youtubeTranscripter community node, then divides and merges chunks so the text is usable for AI prompts.
AI writes a structured summary. GPT-4o-mini generates an overview plus key moments (and instructions when the video is a tutorial). That summary gets posted back to your Telegram chat so you can read it where you already are.
The transcript is saved for Q&A. The merged transcript is written into Google Docs, and later Telegram questions are answered by an AI agent that looks up that document before replying.
You can modify the summary format to match your note style. See the full implementation guide below for customization options.
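The divide-and-merge step described above amounts to a split followed by a join. Here is a minimal Python sketch of that idea, assuming the transcript node returns an object with a `transcript` list whose entries carry a `text` field (the field names come from the guide; the exact payload shape, the helper names, and the single-space separator are assumptions for illustration):

```python
from typing import Any, Dict, Iterable, List


def split_out(payload: Dict[str, Any], field: str = "transcript") -> List[Dict[str, Any]]:
    """Turn one item holding a list into one item per list entry."""
    return list(payload.get(field, []))


def merge_segments(segments: Iterable[Dict[str, Any]], separator: str = " ") -> Dict[str, str]:
    """Join every segment's text into a single string for the AI prompt."""
    # Skip segments with missing or empty text so the separator is not doubled.
    texts = [seg["text"].strip() for seg in segments if seg.get("text")]
    return {"concatenated_text": separator.join(texts)}


# Hypothetical payload, shaped the way this sketch assumes.
payload = {"transcript": [{"text": "Hello"}, {"text": "world"}]}
merged = merge_segments(split_out(payload))
```

The result carries a `concatenated_text` key because that is the field the summary prompt and the Google Docs update read later in the workflow.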
Step-by-Step Implementation Guide
Step 1: Configure the Telegram and Webhook Triggers
Set up the inbound trigger paths so the workflow can accept YouTube links and user questions.
- Add Telegram Message Trigger to capture chat-based inputs used for summaries.
- Add Webhook URL Intake and set Path to 8f0beaaf-b2c3-4148-8006-3b73fa146f60, with Response Mode set to responseNode.
- Add Telegram Updates Trigger to listen for viewer questions and set Updates to message.
- Credential Required: Connect your telegramApi credentials on Telegram Updates Trigger.
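If you plan to use the webhook path rather than Telegram, a small client script is one way to send links. The sketch below assumes a GET request with the link in a `url` query parameter (the workflow later reads the link from `query.url`), and it assumes n8n's usual `/webhook/` and `/webhook-test/` URL prefixes. The host name and function names are placeholders, not part of the template.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Path configured on the webhook node in this step.
WEBHOOK_PATH = "8f0beaaf-b2c3-4148-8006-3b73fa146f60"


def build_webhook_url(base_url: str, youtube_url: str, test_mode: bool = False) -> str:
    """Build the webhook URL with the YouTube link as a query parameter."""
    prefix = "webhook-test" if test_mode else "webhook"
    query = urlencode({"url": youtube_url})  # percent-encodes the link
    return f"{base_url.rstrip('/')}/{prefix}/{WEBHOOK_PATH}?{query}"


def send_link(base_url: str, youtube_url: str, timeout: float = 120.0) -> str:
    """Call the webhook and return the response body as text."""
    # The workflow replies only after the summary is posted, so allow a long timeout.
    with urlopen(build_webhook_url(base_url, youtube_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Hypothetical instance URL; replace with your own n8n host.
example = build_webhook_url("https://n8n.example.com/", "https://youtu.be/dQw4w9WgXcQ")
```

Calling `send_link("https://n8n.example.com", "<video link>")` would then trigger the summary path, provided the workflow is active.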
Step 2: Connect Google Docs
Configure the transcript document so the workflow can read and update the video transcript in Google Docs.
- Add Fetch Transcript Document and set Operation to get with Document URL set to [YOUR_ID].
- Credential Required: Connect your googleDocsOAuth2Api credentials on Fetch Transcript Document.
- Add Update Transcript Document and set Operation to update.
- Set Document URL to ={{ $json.documentId }}.
- In Actions, add a replaceAll action with Text set to ={{ $json.content }} and Replace Text set to ={{ $('Merge Transcript Pieces').item.json.concatenated_text }}.
- Credential Required: Connect your googleDocsOAuth2Api credentials on Update Transcript Document.
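To see what the replaceAll action is doing, it helps to look at the equivalent request in the Google Docs API: a `documents.batchUpdate` call containing a `replaceAllText` request that swaps the document's current content for the new transcript. The sketch below only builds that request body. Whether the n8n node sends exactly this payload is an assumption, and the function name is illustrative.

```python
from typing import Any, Dict


def build_replace_all_request(old_text: str, new_text: str) -> Dict[str, Any]:
    """Build a Docs API batchUpdate body that replaces old_text with new_text."""
    if not old_text:
        # An empty search string has nothing to match, so this sketch refuses it.
        raise ValueError("old_text must not be empty; an empty document has nothing to replace")
    return {
        "requests": [
            {
                "replaceAllText": {
                    "containsText": {"text": old_text, "matchCase": True},
                    "replaceText": new_text,
                }
            }
        ]
    }


# Old content comes from the fetch step, new content from the merged transcript.
body = build_replace_all_request("previous transcript", "new merged transcript")
```

One practical consequence of this pattern is that the target document probably needs some existing text for the first run, since the replacement is keyed on the current content.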
Step 3: Set Up Transcript Processing and AI Nodes
Extract the YouTube URL, pull the transcript, merge it, and configure the AI nodes that generate summaries and answer questions.
- In Pull YouTube Link, add an assignment named youtubeUrl with the value ={{ $json.chatInput || $json.query.url}}.
- In Derive Video Identifier, replace the placeholder Python with logic that extracts a videoId from the YouTube URL (this value is required by later nodes).
- Set Fetch YouTube Transcript Video ID to ={{ $json.videoId}}.
- Configure Divide Transcript Chunks with Field To Split Out set to transcript.
- In Merge Transcript Pieces, set Fields To Summarize to the text field with Aggregation set to concatenate and a whitespace separator.
- In Compose Video Summary, set Text to =Please analyze the given text and create a structured summary following these guidelines: 1. *General Summary*: - Provide a concise overview of the main topic or purpose of the text in one paragraph. - Focus on the essence of the content without excessive detail. 2. *Key Moments*: - List the most important points, events, or concepts from the text. - Use bullet points for clarity. - Keep each point short and focused. - Highlight key terms using HTML bold tags (<b>term</b>). 3. *Instructions (if applicable)*: - If the text is a tutorial or instructional, list the steps in a clear order. - Use numbered points for steps. - If not applicable, state: "This text does not contain instructions." 4. *Format requirements*: - Use markdown for headers (e.g., ## General Summary) and bullet points. - Use HTML bold tags (<b>term</b>) for emphasis instead of markdown bold. - Do not use tables; use simple text for lists or comparisons (e.g., "Element: opis"). - Ensure the message is simple and displays correctly in the Telegram app, avoiding unsupported features like nested lists or tables. Here is the text: {{ $json.concatenated_text }}.
- Connect Mini GPT Chat Model as the language model for Compose Video Summary. Credential Required: Connect your openAiApi credentials on Mini GPT Chat Model.
- Configure Answer Viewer Questions with Text set to ={{ $json.message.text }} and confirm the system message matches your Q&A behavior.
- Connect OpenAI Chat Engine as the language model for Answer Viewer Questions. Credential Required: Connect your openAiApi credentials on OpenAI Chat Engine.
- Add Utility: Windowed Memory and set Session Key to ={{ $json.message.text }} with Session ID Type set to customKey.
- Docs Lookup Tool is connected as a tool for Answer Viewer Questions. Credential Required: Connect your googleDocsOAuth2Api credentials on Answer Viewer Questions (the tool uses the parent credentials).
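The windowed memory in this step keeps only the most recent exchanges per session key, so the agent sees recent context without the history growing without bound. This is a rough Python sketch of that behavior, not n8n's implementation; the class name and window size are invented for illustration.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple

Message = Tuple[str, str]  # (role, text)


class WindowedMemory:
    """Keep the last `window_size` messages for each session key."""

    def __init__(self, window_size: int = 5) -> None:
        # deque(maxlen=...) silently drops the oldest entry once it is full.
        self._sessions: Dict[str, Deque[Message]] = defaultdict(
            lambda: deque(maxlen=window_size)
        )

    def add(self, session_key: str, role: str, text: str) -> None:
        self._sessions[session_key].append((role, text))

    def history(self, session_key: str) -> List[Message]:
        return list(self._sessions[session_key])


memory = WindowedMemory(window_size=2)
memory.add("chat-1", "user", "What is the video about?")
memory.add("chat-1", "ai", "It covers pricing strategy.")
memory.add("chat-1", "user", "Did they mention discounts?")
recent = memory.history("chat-1")  # only the last two messages remain
```

Because history is grouped by session key, the choice of key matters. With the message text as the key, each distinct question gets its own window. A stable value such as the Telegram chat ID (`message.chat.id` in the update payload) would keep one conversation together, if that is the behavior you want.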
Merge Transcript Pieces outputs to both Compose Video Summary and Fetch Transcript Document in parallel.
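For the Derive Video Identifier node, the placeholder needs real parsing logic. The function below is one possible starting point, written as plain Python so it can be tested outside n8n. It assumes the common URL shapes (`watch?v=`, `youtu.be/`, `/shorts/`, `/embed/`, `/live/`) and assumes video IDs are 11 characters drawn from letters, digits, `-`, and `_`. How you read the incoming item and return `videoId` inside the Code node depends on your n8n version, so that wiring is left out.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumed ID shape: 11 URL-safe characters.
_ID_PATTERN = re.compile(r"^[A-Za-z0-9_-]{11}$")
_PATH_PREFIXES = ("shorts", "embed", "live", "v")


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or None if none is found."""
    url = url.strip()
    if "://" not in url:
        url = "https://" + url  # tolerate links pasted without a scheme
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    for prefix in ("www.", "m."):
        if host.startswith(prefix):
            host = host[len(prefix):]

    candidate = None
    if host == "youtu.be":
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in ("youtube.com", "music.youtube.com", "youtube-nocookie.com"):
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        else:
            parts = [p for p in parsed.path.split("/") if p]
            if len(parts) >= 2 and parts[0] in _PATH_PREFIXES:
                candidate = parts[1]

    if candidate and _ID_PATTERN.match(candidate):
        return candidate
    return None


video_id = extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=42s")
```

Returning `None` for unrecognized links makes it easy to stop the workflow early with a clear message instead of letting the transcript node fail on an empty ID.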
Update the Python code in Derive Video Identifier to parse the YouTube URL and return a videoId, or Fetch YouTube Transcript will fail.
Step 4: Configure Output and Reply Actions
Send the AI summary back to Telegram and reply to webhook requests.
- In Post Summary to Telegram, set Text to ={{ $json.text }} {{ $('Pull YouTube Link').item.json.youtubeUrl}} and keep Parse Mode set to HTML.
- Credential Required: Connect your telegramApi credentials on Post Summary to Telegram.
- Connect Return Webhook Reply to send a response after Post Summary to Telegram completes.
- In Send AI Reply to Telegram, set Text to ={{ $json.output }} and keep Parse Mode set to HTML.
- Credential Required: Connect your telegramApi credentials on Send AI Reply to Telegram.
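Telegram caps a single text message at 4096 characters, so a long summary or answer can be rejected if sent in one piece. The template does not include a splitting step. If you hit that limit, a helper along these lines could sit before the send nodes. It is a sketch with an illustrative function name, and it uses Python's `len()`, which only approximates how Telegram counts characters.

```python
from typing import List

TELEGRAM_MAX_CHARS = 4096  # Bot API limit for one text message


def split_for_telegram(text: str, limit: int = TELEGRAM_MAX_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters, breaking at line ends."""
    chunks: List[str] = []
    current = ""
    for line in text.splitlines(keepends=True):
        # A single line longer than the limit has to be cut mid-line.
        while len(line) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:limit])
            line = line[limit:]
        # Start a new chunk when adding this line would overflow the current one.
        if len(current) + len(line) > limit:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks


sample = "a" * 3000 + "\n" + "b" * 3000
parts = split_for_telegram(sample)
```

Breaking at line ends keeps each `<b>...</b>` pair inside one chunk as long as bold terms do not span lines, which matters because Parse Mode is HTML and an unclosed tag can make Telegram reject the message.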
Step 5: Test and Activate Your Workflow
Validate both the summary and Q&A paths before turning the workflow on.
- Use Telegram Message Trigger to send a YouTube URL and confirm a summary is posted by Post Summary to Telegram.
- Send a question via Telegram Updates Trigger and confirm Answer Viewer Questions returns a response in Send AI Reply to Telegram.
- Verify the Google Doc is updated by Update Transcript Document with the merged transcript content.
- In n8n, click Execute Workflow to test manually, then toggle Active to enable production runs.
Common Gotchas
- Google Docs credentials can expire or need specific permissions. If things break, check your Google Cloud OAuth consent screen and the connected account access in n8n first.
- If you’re using Wait nodes or external transcript fetching, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
- Default prompts in AI nodes are generic. Add your brand voice early or you’ll be editing outputs forever.
Frequently Asked Questions
Plan on about 30–60 minutes if your Google and Telegram accounts are ready.
No. You will mostly paste credentials and adjust prompts. The only “technical” part is installing a community node if you self-host.
Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in OpenAI API usage, which is usually small for summaries but depends on transcript length and how much Q&A you do.
Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.
Yes, and it’s mostly prompt work. Update the “Compose Video Summary” prompt to output in your target language, then mirror the same instruction in the “Answer Viewer Questions” agent so replies match. Many teams also add a short style line like “use bullet points, avoid slang, include timestamps when possible” to make summaries consistently useful.
Most of the time it’s a bot token issue or the bot hasn’t been added to the right chat. Recreate the Telegram credentials in n8n, confirm the bot can read messages (privacy mode can block this), and double-check you’re using the correct chat ID. If it works for summaries but not Q&A, look at the “Telegram Updates Trigger” node first because that’s what listens for ongoing questions.
A lot, but it depends on your plan and how long the videos are. On n8n Cloud you’re mainly limited by monthly executions, while self-hosting is limited by your server resources. In practice, most small teams run dozens of videos a week comfortably, then scale up by batching links and keeping transcripts organized in Google Docs folders.
Often, yes. This workflow uses an AI agent, memory, transcript lookup, and conditional logic, and n8n tends to handle that kind of “chat plus context” setup with fewer workarounds. Zapier or Make can still do simple “send link, get summary” flows, but ongoing Q&A backed by stored transcripts gets messy fast. If you want this to become a team-wide research system, the Google Docs storage plus agent approach is honestly the point. Talk to an automation expert if you’re not sure which fits.
You’re not trying to watch more videos. You’re trying to keep the useful parts. Set this up once, and your next “send me that link” turns into searchable knowledge you can actually build on.