YouTube to Gmail, ready-made summaries in your inbox
You rewatch a 40-minute YouTube video because you missed one key detail. Then you try to pull out quotes, timestamps, and “the good parts” for a brief, and it turns into a messy doc you don’t trust.
This problem hits video marketers first, but content leads and research folks feel it too. This YouTube summary automation turns a single YouTube link into a clean summary, transcript, and timestamped notes that land in Gmail and get archived automatically.
Below you’ll see how the workflow runs in n8n, what it produces, and how to adapt it so the output matches your team’s exact use case.
How This Automation Works
The full n8n workflow, from trigger to final output:
n8n Workflow Template: YouTube to Gmail, ready-made summaries in your inbox
flowchart LR
subgraph sg0["Initiate Workflow Form Flow"]
direction LR
n0["Initiate Workflow Form"]
n1["Setup Variables"]
n2["Build YouTube API URL"]
n3["Retrieve Video Details"]
n4["Draft Audience Meta Prompt"]
n5["Fetch Audience Metadata"]
n6["Parse Metadata Payload"]
n7["Combine Streams"]
n8["Assemble Prompt Templates"]
n9["Select Prompt by Type"]
n10["Request Video Info by Type"]
n11["Render Markdown as HTML"]
n12["Store Text to Drive"]
n13["Send Gmail HTML Report"]
n14["Return HTML to User"]
n0 --> n1
n1 --> n2
n1 --> n4
n2 --> n3
n3 --> n7
n4 --> n5
n5 --> n6
n6 --> n7
n7 --> n8
n8 --> n9
n9 --> n10
n10 --> n11
n10 --> n12
n11 --> n13
n11 --> n14
end
classDef trigger fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
classDef api fill:#fff3e0,stroke:#e65100,stroke-width:2px
classDef code fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
class n0 trigger
class n3,n5,n10 api
class n2,n9 code
The Problem: Turning YouTube into usable notes is slow
Watching a video is easy. Extracting something you can use is the painful part. You pause, rewind, copy a line into a doc, lose the timestamp, then realize the “one important point” was actually spread across five different moments. If you’re doing this for competitive research, campaign ideas, training, or client work, the cost isn’t just time. It’s also missed insights, inconsistent briefs, and the nagging feeling you should rewatch it “to be safe.”
It adds up fast. Here’s where it breaks down in real life.
- You end up rewatching the same sections just to capture accurate wording and context.
- Timestamps get recorded inconsistently, which makes it hard to reference clips later.
- Different people summarize differently, so your team’s notes aren’t comparable from one video to the next.
- Even when you do a good write-up, it lives in someone’s inbox or desktop and disappears.
The Solution: Automated YouTube summaries sent to Gmail (and saved)
This n8n workflow turns a YouTube video ID into a structured, ready-to-use report without you scrubbing through the timeline. You start by submitting a YouTube ID in a simple form and choosing what you want back (summary, transcript, timestamps, scene descriptions, or clip ideas). The workflow pulls the video’s details through the YouTube API, then asks an AI model to extract “audience metadata” like tone, topics, and what drives engagement. Next it uses the prompt type you picked to generate the final output, renders that output into a clean HTML email, and sends it through Gmail. At the same time, it saves a text copy into Google Drive so you have an archive you can search later.
The workflow starts with a form submission and a prompt-type dropdown. Then it gathers video details, generates the right analysis, and formats it into something you can skim. Finally, Gmail delivers the report and Google Drive stores the same content for future reuse.
Example: What This Looks Like
Say you review 5 competitor videos a week for hooks and structure. Manually, you might spend about 45 minutes per video between watching, rewinding, and writing notes, so you lose roughly 4 hours weekly. With this workflow, you submit the YouTube ID and pick “Clips” or “Timestamps” in under a minute, wait a few minutes for processing, then the full report shows up in Gmail and gets saved to Drive. That’s most of an afternoon back, and you still have the receipts.
What You’ll Need
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- Gmail to email the final HTML report.
- Google Drive to store the report as a text file.
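- Google API key (create it in Google Cloud Console). The workflow uses the same key for the YouTube Data API and the Gemini API, so it needs access to both.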
Skill level: Intermediate. You’ll connect Google credentials and add an API key as an environment variable, but you won’t be writing a bunch of code.
Don’t want to set this up yourself? Talk to an automation expert (free 15-minute consultation).
How It Works
You submit a YouTube ID and choose an output type. The workflow starts from a form trigger, so you can run it on demand instead of waiting for a schedule.
Video details and “audience metadata” get collected. n8n builds the YouTube API URL, pulls the video’s information, then sends a second request to generate structured metadata (topic, tone, format, engagement drivers) that improves the final write-up.
The workflow picks the right prompt template. A small logic step selects the correct prompt for “Summary,” “Transcribe,” “Timestamps,” “Scene,” “Clips,” or the default analysis, then requests the final content from the AI model.
The report is formatted and delivered. The output is rendered from markdown into HTML, emailed through Gmail, and also saved as a text file in Google Drive. You can read it immediately, and you can find it again later.
You can easily modify the prompt templates to match your brand voice or change where reports are stored based on your needs. See the full implementation guide below for customization options.
Step-by-Step Implementation Guide
Step 1: Configure the Form Trigger
Set up the form that initiates the workflow and captures the YouTube video ID and analysis type.
- Add the Initiate Workflow Form node and keep Form Title set to `Extract Information from YouTube Videos`.
- Under Form Fields, keep the Prompt Type dropdown options (default, transcribe, timestamps, summary, scene, clips).
- Ensure the YouTube Video Id field is required and has the placeholder `wBuULAoJxok`.
- Set Response Mode to `lastNode` so the final HTML response is shown to the user.
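The form accepts free text for the video ID, so it can help to validate the input before any API call is made. The following is a minimal sketch of such a check, not part of the template: `validateFormInput` is a hypothetical helper, and the 11-character ID pattern is an assumption based on how YouTube IDs commonly look.

```javascript
// Hypothetical helper, not part of the workflow template.
// The allowed values mirror the Prompt Type dropdown options from the form.
const PROMPT_TYPES = ['default', 'transcribe', 'timestamps', 'summary', 'scene', 'clips'];

// Assumption: video IDs are 11 characters drawn from A-Z, a-z, 0-9, "-" and "_".
const VIDEO_ID_PATTERN = /^[A-Za-z0-9_-]{11}$/;

function validateFormInput(form) {
  // Field names match the form labels used by the workflow.
  const videoId = String(form['YouTube Video Id'] ?? '').trim();
  const promptType = String(form['Prompt Type'] ?? 'default').trim().toLowerCase();

  if (!VIDEO_ID_PATTERN.test(videoId)) {
    throw new Error(`Invalid YouTube video ID: "${videoId}"`);
  }
  if (!PROMPT_TYPES.includes(promptType)) {
    throw new Error(`Unknown prompt type: "${promptType}"`);
  }
  return { videoId, promptType };
}
```

A check like this would catch the common mistake of pasting a full watch URL into the ID field.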
Step 2: Connect Input Variables and YouTube Data Fetching
Map form input into working variables and prepare the YouTube Data API request. Setup Variables outputs to both Build YouTube API URL and Draft Audience Meta Prompt in parallel.
- In Setup Variables, set google_api_key to `{{ $env.GOOGLE_API_KEY }}`.
- Set youtube_url to `=https://www.youtube.com/watch?v={{ $json["YouTube Video Id"] }}`.
- Set prompt_type to `{{ $json["Prompt Type"] }}` and video_id to `{{ $json["YouTube Video Id"] }}`.
- In Build YouTube API URL, keep the provided JavaScript that constructs the API URL using the dynamic key and ID.
- In Retrieve Video Details, set URL to `{{ $json.youtubeUrl }}`.
Make sure the GOOGLE_API_KEY environment variable is set. If it’s missing, Build YouTube API URL will throw an error.
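The template ships its own JavaScript for Build YouTube API URL, which is not reproduced here. As a rough illustration of what such a step does, here is a sketch under stated assumptions: the function name `buildYouTubeApiUrl` and the `part` list are hypothetical, while the endpoint is the YouTube Data API v3 `videos` resource and the `youtubeUrl` output name matches the field read by Retrieve Video Details.

```javascript
// Sketch only: the real Code node in the template may differ.
function buildYouTubeApiUrl(videoId, apiKey) {
  // Fail early, mirroring the error described when GOOGLE_API_KEY is missing.
  if (!apiKey) {
    throw new Error('GOOGLE_API_KEY is not set');
  }
  if (!videoId) {
    throw new Error('YouTube video ID is missing');
  }

  // Assumption: these three parts are enough for title, thumbnail, and stats.
  const params = new URLSearchParams({
    part: 'snippet,contentDetails,statistics',
    id: videoId,
    key: apiKey,
  });

  // Output field name matches what Retrieve Video Details reads.
  return { youtubeUrl: `https://www.googleapis.com/youtube/v3/videos?${params.toString()}` };
}
```

Using `URLSearchParams` keeps the key and ID correctly encoded without manual string concatenation.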
Step 3: Generate Audience Metadata and Merge Streams
Create a metadata analysis prompt, call the Gemini API, parse the response, and merge it with YouTube details.
- In Draft Audience Meta Prompt, keep the long-form prompt text that outputs meta_prompt as a JSON-only response.
- Configure Fetch Audience Metadata with URL set to `=https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key={{ $('Setup Variables').item.json.google_api_key }}` and Method set to `POST`.
- Set the JSON Body to the existing expression that includes `$json.meta_prompt` and `{{ $('Setup Variables').item.json.youtube_url }}`.
- In Parse Metadata Payload, set text to `{{ $json.candidates[0].content.parts[0].text.replaceAll('```json', '').replaceAll('```', '') }}`.
- In Combine Streams, keep Mode set to `combine` and Combine By set to `combineByPosition`.
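To make the parsing and merging steps concrete, the sketch below mimics them in plain JavaScript. It is an approximation, not the workflow's implementation: `parseMetadataPayload` and `combineByPosition` are hypothetical names, the JSON parsing step is an assumption (later steps read fields such as `text.video_tone`, which suggests the text ends up as an object), and the merge here simply pairs items by index and stops at the shorter input.

```javascript
// Build the fence marker at runtime so this example contains no literal code fence.
const FENCE = '`'.repeat(3);

// Pull the model's text out of a generateContent-style response,
// strip the Markdown fence around the JSON, and parse it.
function parseMetadataPayload(response) {
  const raw = response?.candidates?.[0]?.content?.parts?.[0]?.text;
  if (typeof raw !== 'string') {
    throw new Error('Gemini response has no text part');
  }
  const cleaned = raw.replaceAll(`${FENCE}json`, '').replaceAll(FENCE, '').trim();
  return { text: JSON.parse(cleaned) };
}

// Pair items from two inputs by index and merge their fields,
// roughly what a combine-by-position merge does.
function combineByPosition(left, right) {
  const length = Math.min(left.length, right.length);
  return Array.from({ length }, (_, i) => ({ ...left[i], ...right[i] }));
}
```

If the model returns something that is not valid JSON, `JSON.parse` throws, which is a useful place to stop the run rather than email a broken report.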
Step 4: Assemble and Select Prompt Templates
Create XML-based prompt templates and extract the correct prompt and model based on the user’s selection.
- In Assemble Prompt Templates, keep the full XML template in content that references `{{ $json.text.content_purpose }}`, `{{ $json.text.key_topics[0] }}`, `{{ $json.text.video_tone }}`, and related fields.
- In Select Prompt by Type, keep the JavaScript that reads `$node["Setup Variables"].json.prompt_type` and extracts the matching `<prompt>` and `<model>`.
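The selection logic can be pictured with a small sketch. The template's actual XML layout is not shown in this guide, so the structure below is purely hypothetical: it assumes each option is wrapped in an `<option name="...">` element containing one `<model>` and one `<prompt>`. The function name `selectPromptByType` and the fallback to a `default` option are also assumptions.

```javascript
// Hypothetical layout assumed by this sketch:
// <option name="summary"><model>...</model><prompt>...</prompt></option>

// Escape a value so it can be embedded safely in a regular expression.
function escapeRegExp(value) {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function selectPromptByType(xml, promptType) {
  // Find the body of the <option> whose name matches.
  const findOption = (name) => {
    const pattern = new RegExp(`<option\\s+name="${escapeRegExp(name)}"\\s*>([\\s\\S]*?)</option>`);
    return xml.match(pattern)?.[1];
  };

  // Assumption: unknown types fall back to the "default" option.
  const block = findOption(promptType) ?? findOption('default');
  if (block === undefined) {
    throw new Error(`No prompt template found for type "${promptType}"`);
  }

  // Extract the text of a child tag from the selected option.
  const readTag = (tag) => block.match(new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`))?.[1].trim();

  const prompt = readTag('prompt');
  const model = readTag('model');
  if (!prompt || !model) {
    throw new Error(`Template for "${promptType}" is missing <prompt> or <model>`);
  }
  return { prompt, model };
}
```

Regex extraction is fine for a small, controlled template like this; a real XML parser would be the safer choice if the templates grow or include nested markup.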
Step 5: Request AI Output and Render HTML
Send the selected prompt to Gemini, convert the response to HTML, and prepare output destinations. Request Video Info by Type outputs to both Render Markdown as HTML and Store Text to Drive in parallel.
- In Request Video Info by Type, set URL to `=https://generativelanguage.googleapis.com/v1beta/models/{{ $json.model }}:generateContent?key={{$('Setup Variables').item.json.google_api_key }}`.
- Keep the JSON Body expression that injects `{{ JSON.stringify($json.prompt) }}` and the YouTube file URI `{{ $('Setup Variables').item.json.youtube_url }}`.
- In Render Markdown as HTML, set Mode to `markdownToHtml` and Markdown to `{{ $json.candidates[0].content.parts[0].text }}`.
- Render Markdown as HTML outputs to both Send Gmail HTML Report and Return HTML to User in parallel.
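For readers who want to see roughly what the HTTP Request node sends, here is a sketch that assembles the request without performing it. The exact body used by the template is not shown in this guide, so the shape below is an assumption: a text part carrying the prompt and a `file_data.file_uri` part carrying the YouTube URL. `buildGeminiRequest` and `extractMarkdown` are hypothetical helper names.

```javascript
// Sketch only: builds the request description, does not send it.
function buildGeminiRequest({ model, prompt, youtubeUrl, apiKey }) {
  const url =
    `https://generativelanguage.googleapis.com/v1beta/models/${model}` +
    `:generateContent?key=${encodeURIComponent(apiKey)}`;

  // Assumed body shape: one text part plus one file reference to the video.
  const body = {
    contents: [
      {
        parts: [
          { text: prompt },
          { file_data: { file_uri: youtubeUrl } },
        ],
      },
    ],
  };

  return {
    url,
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  };
}

// Read the generated Markdown from the same response path the workflow uses.
function extractMarkdown(response) {
  return response?.candidates?.[0]?.content?.parts?.[0]?.text ?? '';
}
```

Returning an empty string from `extractMarkdown` when the response has no text makes it easy to detect an empty report before it reaches the email step.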
Step 6: Configure Output Destinations
Store the analysis in Google Drive, email the HTML report, and return a response to the form user.
- In Store Text to Drive, set Name to `{{ $('Initiate Workflow Form').item.json['YouTube Video Id'] }} - {{ $now }}` and keep Operation as `createFromText`.
- Set Content to the existing expression that includes the parsed metadata, AI response, and `{{ $('Combine Streams').item.json.items.toJsonString() }}`.
- Credential Required: Connect your googleDriveOAuth2Api credentials in Store Text to Drive.
- In Send Gmail HTML Report, replace Send To with your email and keep Subject set to `{{ $('Initiate Workflow Form').item.json['YouTube Video Id'] }} - {{ $('Parse Metadata Payload').item.json.text.key_topics[0] }}`.
- Credential Required: Connect your gmailOAuth2 credentials in Send Gmail HTML Report.
- In Return HTML to User, keep Respond With set to `showText` and Response Text set to the existing HTML expression.
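The file name and subject expressions above both depend on upstream data being present. The sketch below shows the same naming scheme in plain JavaScript with a fallback for a missing topic. It is illustrative only: `buildOutputNames` is a hypothetical helper, the `untitled` fallback is an assumption, and `Date.prototype.toISOString()` stands in for n8n's `$now`.

```javascript
// Hypothetical helper mirroring the Name and Subject expressions.
function buildOutputNames(videoId, metadata, now = new Date()) {
  // Assumption: fall back to a placeholder when the model returned no topics.
  const firstTopic = metadata?.key_topics?.[0] ?? 'untitled';

  return {
    // Stand-in for "{{ video id }} - {{ $now }}".
    driveFileName: `${videoId} - ${now.toISOString()}`,
    // Stand-in for "{{ video id }} - {{ first key topic }}".
    emailSubject: `${videoId} - ${firstTopic}`,
  };
}
```

Without a fallback like this, a metadata response that lacks `key_topics` would produce an expression error or a subject ending in `undefined`.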
Step 7: Test and Activate Your Workflow
Run a manual test to confirm the workflow generates the report, stores it, and returns HTML to the form.
- Click Execute Workflow and submit the Initiate Workflow Form with a valid YouTube video ID and prompt type.
- Confirm that Retrieve Video Details and Fetch Audience Metadata return data without errors.
- Verify that Send Gmail HTML Report sends an email containing the thumbnail, title, and HTML output.
- Check Google Drive for a new file created by Store Text to Drive with the expected content.
- Once successful, toggle the workflow to Active to allow form submissions in production.
Common Gotchas
- Google (Gmail/Drive) credentials can expire or need specific permissions. If things break, check the n8n Credentials screen and your Google account’s connected-app access first.
- YouTube API responses sometimes differ by video (age-restricted, disabled captions, limited metadata). If a request fails, confirm the video ID is valid and that your Google API key has YouTube Data API enabled.
- Default prompts in AI nodes are generic. Add your brand voice and the exact format you want (bullets, sections, clip rules) early or you will be tweaking every email by hand.
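One way to surface the YouTube API gotcha early is to check the response before it flows into the prompt steps. The following is a minimal sketch, assuming the usual YouTube Data API response shape (an `items` array on success, an `error` object on failure). `assertVideoFound` is a hypothetical helper, not part of the template.

```javascript
// Hypothetical guard to run right after Retrieve Video Details.
function assertVideoFound(response, videoId) {
  // API-level failures (bad key, API not enabled, quota) come back as an error object.
  if (response?.error) {
    throw new Error(`YouTube API error: ${response.error.message ?? 'unknown error'}`);
  }

  // An invalid or unavailable video ID typically yields an empty items array.
  const items = response?.items;
  if (!Array.isArray(items) || items.length === 0) {
    throw new Error(`No video found for ID "${videoId}"`);
  }

  return items[0];
}
```

Failing here gives a clear message in the execution log instead of a confusing error several nodes later.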
Frequently Asked Questions
How long does it take to set up this workflow?
About 30 minutes if your Google credentials and API key are ready.
Do I need coding skills to set this up?
No. You’ll mostly connect accounts and paste in an API key. There is a small “prompt type” selector already built, so you’re configuring, not programming.
Is n8n free to use for this workflow?
Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in AI API usage, which is usually a few cents per run for summaries and transcripts.
Where can I host n8n to run this automation?
Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.
Can I send the report to Slack or Notion instead of Gmail?
Yes, but you’ll swap the delivery step. Replace the Send Gmail HTML Report node with a Slack message node or a Notion page-create node, then keep the same Render Markdown as HTML output (or switch to plain text). Common tweaks include adding a “channel” dropdown, changing the summary structure, and saving the Drive file into a client-specific folder.
Why is the Gmail step failing?
Usually it’s expired Google OAuth credentials inside n8n, so the Gmail node can’t send. Reconnect the Google credential, confirm Gmail send permissions, and make sure you’re using the right Google account. If it fails only on certain videos, check upstream nodes too, because an empty or malformed HTML payload can cause Gmail to reject the message.
How many videos can this workflow handle?
On n8n Cloud you’re mainly limited by monthly executions on your plan, and on self-hosted you’re limited by your server. Practically, most small teams run dozens to a few hundred videos a month without thinking about it, as long as your YouTube API quota and AI usage are sized appropriately.
Is n8n a better fit than Zapier or Make for this?
Often, yes. This workflow has branching logic (different prompt types), multi-step data assembly, and formatting, which n8n handles comfortably without forcing you into pricey “premium” steps. Self-hosting also changes the economics if you plan to run this a lot. Zapier or Make can still be fine for a simple “YouTube link in, summary out” flow, but once you want metadata, multiple output formats, and Drive storage, complexity creeps in fast. If you want a quick recommendation for your situation, talk to an automation expert.
This is the kind of workflow you set up once and quietly benefit from every week. Let the automation handle the rewinds, the formatting, and the filing so you can focus on what the video actually means.