YouTube to Google Docs, shareable summaries ready
You grab a YouTube link, then the real work starts. Find a transcript, clean it up, summarize it, translate it, paste it into a doc, and hope you didn’t miss the one key point your audience actually cared about.
Marketing managers feel this when content calendars get tight. A founder trying to stay visible feels it too. And if you run client work, you know the “can we get this in Spanish by tomorrow?” request shows up at the worst time. This YouTube summary automation turns one video URL into a shareable Google Doc summary, in the language you choose.
Below, you’ll see how the workflow runs end to end, what it replaces, and the results you can expect once it’s set up.
How This Automation Works
The full n8n workflow, from trigger to final output:
n8n Workflow Template: YouTube to Google Docs, shareable summaries ready
flowchart LR
subgraph sg0["On form submission Flow"]
direction LR
n0["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/form.svg' width='40' height='40' /></div><br/>On form submission"]
n1@{ icon: "mdi:brain", form: "rounded", label: "Google Gemini Chat Model", pos: "b", h: 48 }
n2@{ icon: "mdi:cog", form: "rounded", label: "Google Docs", pos: "b", h: 48 }
n3@{ icon: "mdi:swap-vertical", form: "rounded", label: "Mapper", pos: "b", h: 48 }
n4["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/code.svg' width='40' height='40' /></div><br/>Formator"]
n5["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/code.svg' width='40' height='40' /></div><br/>Optimizer"]
n6@{ icon: "mdi:robot", form: "rounded", label: "AI Agent1", pos: "b", h: 48 }
n7["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>YouTube Transcript AI"]
n3 --> n7
n4 --> n6
n6 --> n5
n5 --> n2
n0 --> n3
n7 --> n4
n1 -.-> n6
end
%% Styling
classDef trigger fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
classDef ai fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef aiModel fill:#e8eaf6,stroke:#3f51b5,stroke-width:2px
classDef decision fill:#fff8e1,stroke:#f9a825,stroke-width:2px
classDef database fill:#fce4ec,stroke:#c2185b,stroke-width:2px
classDef api fill:#fff3e0,stroke:#e65100,stroke-width:2px
classDef code fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef disabled stroke-dasharray: 5 5,opacity: 0.5
class n0 trigger
class n6 ai
class n1 aiModel
class n7 api
class n4,n5 code
classDef customIcon fill:none,stroke:none
class n0,n4,n5,n7 customIcon
The Problem: Turning Videos Into Summaries Is Weirdly Manual
Summarizing a YouTube video sounds simple until you do it more than twice. You hunt for a transcript (or copy captions), realize it’s messy, fix timestamps and weird line breaks, then rewrite it into something readable. Next comes formatting for reuse: a clean doc someone can skim, pull quotes from, and turn into posts. If you need multilingual versions, it’s another round of copying, translating, and double-checking tone. Honestly, the friction isn’t one big task. It’s the pile of tiny tasks that keeps stealing your best creative hours.
It adds up fast. Here’s where it usually breaks down in real teams.
- You end up spending about 30 minutes per video just getting to a “usable” transcript and summary.
- Summaries drift in quality because each person writes them differently, so your repurposed content feels inconsistent.
- Language requests turn into last-minute fire drills, because translating and rewriting still needs human attention.
- Even when you finish, the output lives in random places, which makes it hard to share or build a repeatable process.
The Solution: YouTube Transcript → AI Summary → Google Doc
This workflow gives you a clean, repeatable path from “here’s a video link” to “here’s a summary I can share.” It starts with a simple form submission (your team pastes in a YouTube URL and chooses the output language). n8n then retrieves the transcript via the YouTube Transcript API on RapidAPI, so you’re not manually copying captions. Next, it formats that transcript into something AI can work with (even when the transcript isn’t perfect). From there, an AI agent powered by a chat model generates a concise summary in your chosen language. Finally, the workflow inserts the finished summary into a predefined Google Doc, so it’s immediately editable, shareable, and ready to repurpose.
The workflow begins when you submit the YouTube link in the intake form. It pulls and cleans the transcript, then the AI agent creates the summary with the language instruction you provided. The final output is saved straight into Google Docs, which means your “source of truth” is already organized.
What You Get: Automation vs. Results
| What This Workflow Automates | Results You’ll Get |
|---|---|
|
|
Example: What This Looks Like
Say you summarize 5 YouTube videos each week for a newsletter and social posts. Manually, you might spend about 30 minutes per video between transcript wrangling, writing, and formatting, which is roughly 2.5 hours a week. With this workflow, submitting the URL and language takes maybe 2 minutes, then you wait for processing and the Google Doc is updated automatically. You still review and tweak, but you’re starting from a clean draft instead of a blank page.
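The arithmetic behind that example can be checked in a few lines. This is only a back-of-the-envelope sketch: the 5 videos and 30 minutes come from the example above, while applying the 2-minute submission time to each video is an assumption.

```javascript
// Figures from the example above.
const videosPerWeek = 5;
const manualMinutesPerVideo = 30;
// Assumption: the ~2 minutes of form filling applies to each video.
const submitMinutesPerVideo = 2;

const manualMinutes = videosPerWeek * manualMinutesPerVideo;    // 150
const automatedMinutes = videosPerWeek * submitMinutesPerVideo; // 10
const savedHours = (manualMinutes - automatedMinutes) / 60;

console.log(`Manual effort: ${manualMinutes / 60} hours/week`);          // 2.5
console.log(`Hands-on time with the workflow: ${automatedMinutes} min`); // 10
console.log(`Time freed up: about ${savedHours.toFixed(1)} hours/week`); // 2.3
```

Review and editing time is not included, so treat the result as an upper bound on the savings.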
What You’ll Need
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- RapidAPI for YouTube Transcript API access.
- Google Docs to store and share summaries.
- Google Gemini API key (get it from Google AI Studio / Google Cloud console).
Skill level: Intermediate. You’ll connect accounts, add API keys, and test with a few real video links.
Don’t want to set this up yourself? Talk to an automation expert (free 15-minute consultation).
How It Works
A form submission kicks things off. You paste a YouTube URL and specify the language you want the summary written in. This keeps requests consistent across your team, even if multiple people use the same workflow.
The transcript gets pulled automatically. n8n sends an HTTP request to the YouTube Transcript API (via RapidAPI) and receives the transcript data back. No jumping between tools, no manual copy.
The text is cleaned and prepared for summarization. A formatting step reshapes the transcript so the AI agent has fewer distractions like broken lines or odd separators. Then the summarization agent uses the chat model (Gemini in this workflow) to generate a concise version in the requested language.
The summary is saved where you actually use it. The workflow updates a predefined Google Doc with the final summary, so it’s instantly shareable and easy to edit. If you want an audit trail, you can extend the workflow to log each run in Google Sheets or notify someone via email.
You can easily modify the summary format to match your brand voice, or change the target document based on campaign, client, or language. See the full implementation guide below for customization options.
Step-by-Step Implementation Guide
Step 1: Configure the Form Trigger
Set up the form that collects the YouTube URL and language input from users.
- Add and open Form Intake Trigger.
- Set Form Title to `summarize youtube videos from transcript for social media`.
- Under Form Fields, add videoUrl with placeholder `full video url` and mark it required.
- Add language with placeholder `English` and mark it required.
- Connect Form Intake Trigger to Map Input Fields.
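The form only enforces that both fields are filled in. If you also want to reject links that are not YouTube URLs, a check along these lines could run in a Code node right after the trigger. It is a minimal sketch, not part of the template: the `validateSubmission` name and the list of accepted hosts are assumptions.

```javascript
// Hypothetical helper: validates the two form fields before the transcript call.
function validateSubmission(fields) {
  const errors = [];
  const videoUrl = String(fields.videoUrl || '').trim();
  const language = String(fields.language || '').trim();

  let host = '';
  try {
    // new URL() throws on anything that is not a full URL.
    host = new URL(videoUrl).hostname.replace(/^www\./, '');
  } catch {
    // Leave host empty so the check below fails.
  }

  // Assumption: these are the hosts you want to accept.
  const allowedHosts = ['youtube.com', 'm.youtube.com', 'youtu.be'];
  if (!allowedHosts.includes(host)) {
    errors.push('videoUrl must be a full YouTube URL');
  }
  if (!language) {
    errors.push('language is required');
  }

  return { valid: errors.length === 0, errors, videoUrl, language };
}

console.log(validateSubmission({
  videoUrl: 'https://www.youtube.com/watch?v=abc123',
  language: 'Spanish',
}).valid);
```

Failing early here is cheaper than spending a RapidAPI call on a link that can never return a transcript.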
Step 2: Connect the Transcript API Request
Map the form fields and call the transcript API to fetch the raw transcript data.
- Open Map Input Fields and confirm the assignments: videoUrl set to `{{ $json.videoUrl }}` and language set to `{{ $json.language }}`.
- Open Transcript API Request and set URL to `https://youtube-transcriptor-pro.p.rapidapi.com/yt/index.php`.
- Set Method to `POST`, enable Send Body, and set Content Type to `multipart-form-data`.
- Under Body Parameters, set name to `=videoUrl` and value to `{{ $json.videoUrl }}`.
- Under Header Parameters, set x-rapidapi-host to `youtube-transcriptor-pro.p.rapidapi.com` and add your API key in x-rapidapi-key.
- Connect Map Input Fields to Transcript API Request.
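For debugging outside n8n, the same request can be reproduced in plain JavaScript. The sketch below mirrors the node settings from this step (URL, `POST`, multipart body, the two RapidAPI headers). The function names are hypothetical, it assumes Node 18+ for the built-in `fetch` and `FormData`, and it assumes the endpoint answers with JSON, which this guide does not confirm.

```javascript
const TRANSCRIPT_URL = 'https://youtube-transcriptor-pro.p.rapidapi.com/yt/index.php';
const RAPIDAPI_HOST = 'youtube-transcriptor-pro.p.rapidapi.com';

// Hypothetical helper: describes the request exactly as the node is configured.
function buildTranscriptRequest(videoUrl, apiKey) {
  return {
    url: TRANSCRIPT_URL,
    method: 'POST',
    headers: {
      'x-rapidapi-host': RAPIDAPI_HOST,
      'x-rapidapi-key': apiKey,
    },
    // Sent as multipart form fields, matching the Body Parameters above.
    fields: { videoUrl },
  };
}

// Hypothetical helper: sends the request (requires Node 18+).
async function fetchTranscript(videoUrl, apiKey) {
  const request = buildTranscriptRequest(videoUrl, apiKey);
  const body = new FormData();
  for (const [name, value] of Object.entries(request.fields)) {
    body.append(name, value);
  }
  // fetch sets the multipart Content-Type header (with boundary) itself.
  const response = await fetch(request.url, {
    method: request.method,
    headers: request.headers,
    body,
  });
  if (!response.ok) {
    throw new Error(`Transcript request failed with status ${response.status}`);
  }
  // Assumption: the API responds with JSON.
  return response.json();
}
```

Calling `fetchTranscript(url, process.env.RAPIDAPI_KEY)` from a scratch script is a quick way to tell an API problem apart from an n8n configuration problem.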
Step 3: Set Up Transcript Processing and AI Summarization
Format the transcript text, send it to the AI model, and extract the summary section.
- Open Transcript Formatter and keep the provided JavaScript that parses, decodes, and outputs chatInput.
- Open Summarization Agent and confirm the System Message uses `{{ $('Map Input Fields').item.json.language }}` to enforce language-specific output.
- Open Gemini Chat Model and set Model Name to `models/gemini-2.0-flash`.
- Credential Required: Connect your googlePalmApi credentials in Gemini Chat Model.
- Ensure Gemini Chat Model is connected as the language model for Summarization Agent (credentials should be added to Gemini Chat Model, not the agent).
- Open Summary Extractor and keep the JavaScript that extracts the `🎬 **Summary**` section into summary.
- Connect Transcript API Request → Transcript Formatter → Summarization Agent → Summary Extractor.
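The template ships with its own scripts for the two Code nodes, and you should keep those. To make their job concrete, here is a rough sketch of what such scripts might do. Everything in it is an assumption: the function names, the transcript shape (an array of segments with a `text` field), and the rule that a section ends at the next line consisting only of a bold heading.

```javascript
// Hypothetical stand-in for Transcript Formatter.
// Assumed input: an array of caption segments, each with a `text` field.
function formatTranscript(segments) {
  const entities = { '&amp;': '&', '&#39;': "'", '&quot;': '"', '&lt;': '<', '&gt;': '>' };
  const text = segments
    .map((segment) => segment.text || '')
    .join(' ')
    // Decode a few common HTML entities in a single pass.
    .replace(/&amp;|&#39;|&quot;|&lt;|&gt;/g, (match) => entities[match])
    // Collapse line breaks and repeated spaces.
    .replace(/\s+/g, ' ')
    .trim();
  return { chatInput: text };
}

// Hypothetical stand-in for Summary Extractor.
function extractSummary(agentOutput) {
  const marker = '🎬 **Summary**';
  const start = agentOutput.indexOf(marker);
  if (start === -1) {
    // No marker found: fall back to the whole reply.
    return { summary: agentOutput.trim() };
  }
  // Assumption: the next section starts on a line that is only a bold heading,
  // optionally preceded by an emoji or similar token.
  const headingLine = /^\s*(\S+\s+)?\*\*[^*]+\*\*\s*$/u;
  const body = [];
  for (const line of agentOutput.slice(start + marker.length).split('\n')) {
    if (headingLine.test(line)) break;
    body.push(line);
  }
  return { summary: body.join('\n').replace(/^[:\s]+/, '').trim() };
}
```

Inside an n8n Code node, each result would be wrapped as `[{ json: result }]` before being returned, so the next node can read `{{ $json.chatInput }}` or `{{ $json.summary }}`.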
Step 4: Configure the Google Docs Update
Insert the extracted summary into a Google Doc.
- Open Update Docs Record and set Operation to `update`.
- Set Authentication to `serviceAccount`.
- Set Document URL to your target Google Doc.
- In Action Fields, set Text to `{{ $json.summary }}` and Action to `insert`.
- Credential Required: Connect your googleApi credentials in Update Docs Record.
- Connect Summary Extractor to Update Docs Record.
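The Google Docs node handles the API call for you, but it can help to see roughly what an insert looks like at the API level. The sketch below builds a payload for the Google Docs API `documents.batchUpdate` method using an `insertText` request. The helper names are hypothetical, and inserting at index 1 (the start of the document body) is an illustrative choice, not necessarily where the node places the text.

```javascript
// Hypothetical helper: pulls the document ID out of a Google Docs URL.
function extractDocumentId(documentUrl) {
  const match = documentUrl.match(/\/document\/d\/([a-zA-Z0-9_-]+)/);
  if (!match) {
    throw new Error('Not a Google Docs document URL');
  }
  return match[1];
}

// Hypothetical helper: builds a batchUpdate payload that inserts the summary.
function buildInsertRequest(documentUrl, summary) {
  return {
    documentId: extractDocumentId(documentUrl),
    body: {
      requests: [
        {
          insertText: {
            // Assumption: insert at the start of the body (index 1).
            location: { index: 1 },
            text: `${summary}\n`,
          },
        },
      ],
    },
  };
}

const example = buildInsertRequest(
  'https://docs.google.com/document/d/1AbC_def-456/edit',
  'Short recap.'
);
console.log(example.documentId);
```

If the node reports a permissions error, confirm that the target doc is shared with the service account’s email address, since a service account can only edit documents it has access to.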
Step 5: Test and Activate Your Workflow
Validate each step and then turn the workflow on for production use.
- Click Execute Workflow and submit the form in Form Intake Trigger with a valid YouTube URL and language.
- Check that Transcript API Request returns transcript data and Transcript Formatter outputs chatInput.
- Confirm Summarization Agent produces a formatted response and Summary Extractor outputs a summary value.
- Verify Update Docs Record inserts the summary into your Google Doc.
- Toggle the workflow to Active to enable live form submissions.
Common Gotchas
- RapidAPI credentials can expire or your plan may block certain endpoints. If transcript pulls fail, check your RapidAPI dashboard logs and subscription status first.
- If you’re using Wait nodes or external rendering, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
- Default prompts in AI nodes are generic. Add your brand voice early or you’ll be editing outputs forever.
Frequently Asked Questions
About 30 minutes if your APIs and Google account are ready.
No. You’ll mostly connect accounts and paste in API keys. There’s a little testing involved, but it’s guided.
Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in RapidAPI usage fees and your Gemini API costs, which depend on how many videos you process.
Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.
Yes, but you’ll want to make the target document dynamic. In practice, you add a “client” field to the form, then route to different Google Docs in the “Update Docs Record” step based on that value. Common customizations include changing the summary length, adding bullet points for social hooks, and saving a copy into a dated folder structure.
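One way to picture that routing is a small lookup from the “client” form value to a document URL, for example in a Code node placed before “Update Docs Record”. This is a sketch only: the client names, the placeholder URLs, and the `resolveDocumentUrl` helper are all hypothetical.

```javascript
// Hypothetical mapping from the form's "client" value to a target document.
// Replace the placeholder URLs with your own docs.
const docsByClient = {
  acme: 'https://docs.google.com/document/d/ACME_DOC_ID/edit',
  globex: 'https://docs.google.com/document/d/GLOBEX_DOC_ID/edit',
};

// Hypothetical helper: picks the doc for a client, or a fallback doc.
function resolveDocumentUrl(client, mapping, fallbackUrl) {
  const key = String(client || '').trim().toLowerCase();
  return Object.prototype.hasOwnProperty.call(mapping, key)
    ? mapping[key]
    : fallbackUrl;
}

const fallbackDoc = 'https://docs.google.com/document/d/DEFAULT_DOC_ID/edit';
console.log(resolveDocumentUrl(' Acme ', docsByClient, fallbackDoc));
```

A fallback document keeps unrecognized client names from failing the run; you could also throw an error instead if you would rather catch typos early.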
Usually it’s a missing or expired RapidAPI key, or the request is hitting a plan limit. Open the HTTP Request node, confirm the header value is correct, then check your RapidAPI dashboard for errors and quota. If the video has transcripts disabled or the language track isn’t available, the API can also return empty data, which looks like a “failure” downstream.
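To make the empty-data case fail loudly instead of producing a blank summary, a guard like the following could sit between the HTTP Request node and the formatter. It is a minimal sketch that assumes the transcript arrives as an array of segments with a `text` field; the real response shape may differ, and the function name is hypothetical.

```javascript
// Hypothetical guard: stops the run when the API returned no usable text.
// Assumed input: an array of caption segments, each with a `text` field.
function assertTranscriptPresent(segments) {
  const hasText =
    Array.isArray(segments) &&
    segments.some(
      (segment) =>
        segment && typeof segment.text === 'string' && segment.text.trim() !== ''
    );
  if (!hasText) {
    throw new Error(
      'Transcript is empty: captions may be disabled or the language track is unavailable'
    );
  }
  return segments;
}

try {
  assertTranscriptPresent([]);
} catch (error) {
  console.log(error.message);
}
```

An explicit error message in the execution log points you at the video itself, so you don’t end up debugging the AI step.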
A lot, as long as your API quotas can keep up.
Often, yes, because this workflow benefits from a bit more control. n8n makes it easier to format transcripts, run an AI agent step, and handle odd cases like missing transcript chunks without paying “per step” penalties. Self-hosting is also a big deal if you’re processing lots of videos, since you’re not boxed in by task pricing the same way. Zapier or Make can still be fine for very simple flows, but this one tends to grow once teams start adding doc routing, approvals, and logging. If you want a second opinion, talk to an automation expert.
Once this is running, turning a YouTube link into a shareable, multilingual summary becomes a routine task, not a mini-project. The workflow handles the repeatable parts so you can spend your time polishing the message and publishing.