YouTube + OpenAI: instant comment and thumbnail insights
You already have the data. It’s just trapped in YouTube. Comments are scattered, transcripts take effort to pull, and “is this thumbnail actually good?” becomes a guessing game you repeat for every upload.
This YouTube insight automation is built for content marketers first, but creators running a weekly pipeline and analysts asked for “quick competitive takeaways” will get value from it too. Instead of reading 300 comments and hoping you didn’t miss the point, you get structured summaries you can use.
This guide breaks down what the workflow does, what you need, and how to adapt it so the insights match your channel, your niche, and your decisions.
How This Automation Works
Here’s the complete workflow you’ll be setting up:
n8n Workflow Template: YouTube + OpenAI: instant comment and thumbnail insights
flowchart LR
subgraph sg0["When chat message received Flow"]
direction LR
n0@{ icon: "mdi:brain", form: "rounded", label: "OpenAI Chat Model", pos: "b", h: 48 }
n1@{ icon: "mdi:wrench", form: "rounded", label: "get_channel_details", pos: "b", h: 48 }
n2@{ icon: "mdi:wrench", form: "rounded", label: "get_video_description", pos: "b", h: 48 }
n3@{ icon: "mdi:wrench", form: "rounded", label: "get_list_of_videos", pos: "b", h: 48 }
n4@{ icon: "mdi:wrench", form: "rounded", label: "get_list_of_comments", pos: "b", h: 48 }
n5@{ icon: "mdi:wrench", form: "rounded", label: "search", pos: "b", h: 48 }
n6@{ icon: "mdi:wrench", form: "rounded", label: "analyze_thumbnail", pos: "b", h: 48 }
n7@{ icon: "mdi:wrench", form: "rounded", label: "video_transcription", pos: "b", h: 48 }
n8@{ icon: "mdi:memory", form: "rounded", label: "Postgres Chat Memory", pos: "b", h: 48 }
n9@{ icon: "mdi:robot", form: "rounded", label: "AI Agent", pos: "b", h: 48 }
n10@{ icon: "mdi:play-circle", form: "rounded", label: "When chat message received", pos: "b", h: 48 }
n5 -.-> n9
n0 -.-> n9
n6 -.-> n9
n3 -.-> n9
n1 -.-> n9
n7 -.-> n9
n8 -.-> n9
n4 -.-> n9
n2 -.-> n9
n10 --> n9
end
subgraph sg1["Execute Workflow Flow"]
direction LR
n11["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Get Comments"]
n12@{ icon: "mdi:play-circle", form: "rounded", label: "Execute Workflow Trigger", pos: "b", h: 48 }
n13["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Get Channel Details"]
n14["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Get Video Description"]
n15@{ icon: "mdi:swap-vertical", form: "rounded", label: "Edit Fields", pos: "b", h: 48 }
n16["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Run Query"]
n17["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Get Videos by Channel"]
n18@{ icon: "mdi:swap-vertical", form: "rounded", label: "Response", pos: "b", h: 48 }
n19@{ icon: "mdi:swap-horizontal", form: "rounded", label: "Switch", pos: "b", h: 48 }
n20["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Get Video Transcription"]
n21@{ icon: "mdi:robot", form: "rounded", label: "OpenAI", pos: "b", h: 48 }
n21 --> n18
n19 --> n13
n19 --> n14
n19 --> n11
n19 --> n16
n19 --> n17
n19 --> n21
n19 --> n20
n16 --> n18
n15 --> n18
n11 --> n15
n13 --> n18
n14 --> n18
n17 --> n18
n20 --> n18
n12 --> n19
end
%% Styling
classDef trigger fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
classDef ai fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef aiModel fill:#e8eaf6,stroke:#3f51b5,stroke-width:2px
classDef decision fill:#fff8e1,stroke:#f9a825,stroke-width:2px
classDef database fill:#fce4ec,stroke:#c2185b,stroke-width:2px
classDef api fill:#fff3e0,stroke:#e65100,stroke-width:2px
classDef code fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef disabled stroke-dasharray: 5 5,opacity: 0.5
class n10,n12 trigger
class n9,n21 ai
class n0 aiModel
class n1,n2,n3,n4,n5,n6,n7 ai
class n8 ai
class n19 decision
class n11,n13,n14,n16,n17,n20 api
classDef customIcon fill:none,stroke:none
class n11,n13,n14,n16,n17,n20 customIcon
Why This Matters: Turning YouTube noise into usable signals
When a video underperforms, you usually don’t lack opinions. You lack clarity. Comments hint at what people liked, what confused them, and what they wanted next, but scrolling them manually turns into a time sink fast. Transcripts are even worse. You know the best “next video ideas” are buried in what the audience reacted to, yet pulling transcripts, scanning for themes, and comparing competitor thumbnails becomes a mini research project every time. After a while, you stop doing it consistently. And then your strategy drifts.
The friction compounds. Here’s where it breaks down.
- You end up making content decisions from gut feel because reading every comment on multiple videos is exhausting.
- Transcript-based planning gets skipped, so your next outline isn’t anchored to what viewers actually responded to.
- Thumbnail feedback stays subjective, which means you keep repeating the same design mistakes.
- Insights live in random docs and tabs, so it’s hard to build a repeatable process your team can follow.
What You’ll Build: YouTube research summaries you can trust
This workflow turns a messy set of YouTube inputs into a clean, consistent insight report using OpenAI. It starts from a simple chat-style trigger, then routes your request based on what you ask for (comments, channel info, video details, search, transcripts, or thumbnail feedback). In the middle, it pulls the right data from YouTube using API requests, shapes the output so it’s readable, and hands the context to an AI agent that can summarize patterns instead of repeating raw text back at you. For thumbnail evaluation, it sends the thumbnail image to an OpenAI image review step so you get feedback that’s closer to “creative direction” than generic advice. At the end, you receive a response payload you can copy into briefs, planning docs, or reporting.
The workflow begins when you message the assistant and choose what you want analyzed. It then fetches the matching YouTube data and compiles it into an analysis-ready format. Finally, OpenAI returns a summary that’s designed to drive a decision, not just describe what happened.
What You’re Building
| What Gets Automated | What You’ll Achieve |
|---|---|
| Pulling comments, channel data, video details, and transcripts from YouTube | Research that took hours per video arrives in minutes |
| Summarizing comment threads and transcript themes with OpenAI | Content decisions anchored to what viewers actually said |
| Reviewing thumbnails with an OpenAI image step | Specific, repeatable design feedback instead of guesswork |
Expected Results
Say you publish two videos a week and review three things each time: top comments, transcript themes, and thumbnail feedback. Manually, you might spend about 30 minutes pulling comments and skimming, another 30 minutes tracking down transcript text, then 20 minutes debating the thumbnail. That’s roughly an hour and 20 minutes per video. With this workflow, you send one chat request per task and wait for the summary, which is usually a few minutes of “hands-on” time total. You get most of that block back.
Before You Start
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- YouTube Data API for pulling channel, video, and comments data.
- OpenAI to summarize text and review thumbnail images.
- YouTube API key (get it from Google Cloud Console after enabling YouTube Data API).
Skill level: Intermediate. You’ll connect API keys, test a few requests, and tweak prompts, but you don’t need to write code.
Want someone to build this for you? Talk to an automation expert (free 15-minute consultation).
Step by Step
A chat message triggers the request. You kick things off through the workflow’s chat trigger, which hands the request to a YouTube-focused AI agent.
Your request gets routed by intent. A simple switch routes commands to the right path: fetch channel info, pull a video’s details, collect comments, run a search, request a transcript, or analyze a thumbnail.
YouTube data is collected and cleaned. HTTP requests pull the raw data, then “set/edit fields” steps shape it into a predictable output (so summaries don’t change format every run).
OpenAI produces the insights. Text summaries run through the OpenAI chat model, while thumbnail critique uses an image review step, then everything is returned in a response payload you can save or forward.
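The routing step can be sketched as a plain lookup from command name to API endpoint. Only the channels and transcript endpoints are confirmed by this template; the other YouTube paths shown here are assumptions based on the standard YouTube Data API v3, so treat this as a sketch of the idea, not the template’s exact configuration.

```javascript
// Sketch of the "Route by Command" step as a lookup table.
// Endpoints other than /channels and the Apify transcript actor are
// assumed from the standard YouTube Data API v3, not the template itself.
const ROUTES = {
  get_channel_details: "https://www.googleapis.com/youtube/v3/channels",
  video_details: "https://www.googleapis.com/youtube/v3/videos",
  comments: "https://www.googleapis.com/youtube/v3/commentThreads",
  search: "https://www.googleapis.com/youtube/v3/search",
  videos: "https://www.googleapis.com/youtube/v3/search", // list a channel's uploads
  video_transcription:
    "https://api.apify.com/v2/acts/dB9f4B02ocpTICIEY/run-sync-get-dataset-items",
  // Note: analyze_thumbnail is not an HTTP route; it goes to the OpenAI
  // image review step instead.
};

function routeCommand(command) {
  const endpoint = ROUTES[command];
  if (!endpoint) throw new Error(`Unknown command: ${command}`);
  return endpoint;
}
```

This is also why adding your own command later is cheap: one new entry in the router, one new branch in the flow.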
You can easily modify the commands and output format to match your workflow, like pushing summaries into Google Sheets instead of returning them in chat. See the full implementation guide below for customization options.
Step-by-Step Implementation Guide
Step 1: Configure the Incoming Chat Trigger
Set up the chat entry point so user prompts can start the assistant and carry session context.
- Add the Incoming Chat Trigger node as the workflow trigger.
- Leave Options empty (default) unless you need custom behavior.
- Ensure the workflow is configured to accept chat messages; the node will expose a webhook ID for chat inputs.
Step 2: Connect OpenAI and Memory Services
Attach the language model and conversation memory that power the assistant’s responses.
- Open OpenAI Chat Engine and select the model credentials. Credential Required: Connect your openAiApi credentials.
- Open Postgres Conversation Memory and connect the database for chat history. Credential Required: Connect your postgres credentials.
- Set Session Key to `{{ $('Incoming Chat Trigger').item.json.sessionId }}` and keep Session ID Type as `customKey`.
- In YouTube Assistant Agent, ensure Text is set to `{{ $('Incoming Chat Trigger').item.json.chatInput }}` so the agent reads the user message.
Step 3: Set Up the YouTube Assistant Agent and Tools
Connect the AI agent to its tool workflows so it can fetch channel details, video data, comments, search results, thumbnails, and transcriptions.
- Open YouTube Assistant Agent and confirm Agent is `openAiFunctionsAgent` and Prompt Type is `define`.
- Review the System Message to ensure it matches your use case for YouTube analysis.
- Attach all tool nodes to the agent: Fetch Channel Details Tool, Retrieve Video Details Tool, List Channel Videos Tool, Collect Video Comments Tool, Video Search Tool, Thumbnail Analysis Tool, and Transcribe Video Tool.
- Confirm each tool node’s Name matches its command (e.g., Fetch Channel Details Tool uses `get_channel_details`).
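Because the agent calls tools by name, a mismatch between a tool node’s Name and the router’s command key fails silently. A quick sanity check can be sketched as below; only the Fetch Channel Details pairing is stated in the guide, the rest of the tool-to-command mapping is inferred from the node and command names.

```javascript
// Hypothetical check that each agent tool's Name matches a router command key.
// Only get_channel_details is confirmed by the guide; the other pairings
// are inferred from the naming convention.
const EXPECTED = {
  "Fetch Channel Details Tool": "get_channel_details",
  "Retrieve Video Details Tool": "video_details",
  "List Channel Videos Tool": "videos",
  "Collect Video Comments Tool": "comments",
  "Video Search Tool": "search",
  "Thumbnail Analysis Tool": "analyze_thumbnail",
  "Transcribe Video Tool": "video_transcription",
};

// tools: [{ node: "Video Search Tool", name: "search" }, ...] from the agent config
function mismatchedTools(tools) {
  return tools.filter((t) => EXPECTED[t.node] !== t.name).map((t) => t.node);
}
```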
Step 4: Configure the Subflow Router and HTTP Requests
Route tool commands to the correct API calls and ensure each request is authenticated.
- Use Subflow Trigger Start to receive tool requests, then send them to Route by Command.
- In Route by Command, verify each rule checks `{{ $('Subflow Trigger Start').item.json.command }}` against the command keys like `get_channel_details`, `video_details`, `comments`, `search`, `videos`, `analyze_thumbnail`, and `video_transcription`.
- For all HTTP request nodes (Fetch Channel Info API, Fetch Video Info API, Fetch Comments API, Execute Search Request, Fetch Channel Videos API, Request Video Transcript), set Authentication to `genericCredentialType` and select `httpQueryAuth`. Credential Required: Connect your httpQueryAuth credentials.
- Confirm the API endpoints are correct; for example, Fetch Channel Info API uses `https://www.googleapis.com/youtube/v3/channels` and Request Video Transcript uses `https://api.apify.com/v2/acts/dB9f4B02ocpTICIEY/run-sync-get-dataset-items`.
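With `httpQueryAuth`, the credential is appended to the request as a query parameter rather than a header. A minimal sketch of what the resulting request URL looks like, assuming the standard YouTube Data API parameters (`part`, `id`) and a key parameter named `key`:

```javascript
// Builds a YouTube Data API request URL the way httpQueryAuth does:
// the API key rides along as a query parameter named "key".
function buildRequestUrl(endpoint, params, apiKey) {
  const url = new URL(endpoint);
  for (const [k, v] of Object.entries(params)) url.searchParams.set(k, v);
  url.searchParams.set("key", apiKey); // httpQueryAuth appends the credential here
  return url.toString();
}

// Example (hypothetical key and channel ID):
// buildRequestUrl("https://www.googleapis.com/youtube/v3/channels",
//                 { part: "snippet,statistics", id: "UC_CHANNEL_ID" },
//                 "YOUR_API_KEY");
```

This is also the easiest way to debug a failing node: paste the fully-built URL into a browser and read the error body directly.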
Step 5: Build Response Formatting and Image Analysis
Format outputs and enable image-based analysis for thumbnail reviews.
- In Compose Comment Output, keep the response assignment as the JSON string formatter: `` {{ JSON.stringify(` Comments: ${$json.items.map(item => { const topLevelComment = `${item.snippet.topLevelComment.snippet.authorDisplayName}: ${item.snippet.topLevelComment.snippet.textOriginal}`; const replies = item.replies?.comments.map(reply => `${reply.snippet.authorDisplayName}: ${reply.snippet.textOriginal}` ).join('\n') || ''; return [topLevelComment, replies].filter(Boolean).join('\n'); }).join('\n\n')} `) }} ``
- In OpenAI Image Review, set Text to `{{ $('Subflow Trigger Start').item.json.query.prompt }}` and Image URLs to `{{ $('Subflow Trigger Start').item.json.query.url }}`.
- Connect OpenAI Image Review to Return Response Payload to send the analysis back. Credential Required: Connect your openAiApi credentials.
- In Return Response Payload, keep the response assignment set to `{{ $json }}` so each API output passes through unmodified.
Step 6: Test and Activate Your Workflow
Validate the end-to-end behavior and enable the workflow for production use.
- Click Execute Workflow and send a test chat message to Incoming Chat Trigger, such as a command for search or channel details.
- Confirm that YouTube Assistant Agent invokes the right tool, then Subflow Trigger Start passes to Route by Command and the matching HTTP node runs.
- Verify the final output appears in Return Response Payload with a populated `response` object.
- When testing is successful, switch the workflow to Active to accept live chat requests.
Troubleshooting Tips
- YouTube Data API credentials can expire or be restricted by project settings. If things break, check your Google Cloud Console API key restrictions and quota page first.
- If you’re using Wait nodes or external rendering, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
- OpenAI prompts that stay generic produce generic insights. Add a short “brand + niche + goal” block early so you aren’t rewriting every summary.
Quick Answers
**How long does this take to set up?**
About an hour if you already have your API keys.
**Do I need to know how to code?**
No. You’ll mainly connect accounts, add API keys, and test a few sample videos. The “work” is choosing good prompts and outputs.
**Is it free to run?**
Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in OpenAI API costs and YouTube Data API quota usage.
**Where should I host n8n?**
Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.
**Can I customize the workflow?**
Yes, and you probably should. Most teams customize the Switch “Route by Command” step to match their own commands, then adjust the “Return Response Payload” formatting so it fits a brief template. You can also swap the destination entirely by sending the final output to Google Sheets or Airtable instead of returning it in chat. Thumbnail critique can be tightened by adding a few rules in the OpenAI Image Review prompt (for example, “compare against my last three thumbnails” or “optimize for mobile readability”).
**Why are my YouTube API requests failing?**
Usually it’s an API key issue: the YouTube Data API isn’t enabled, the key is restricted incorrectly, or you’ve hit quota for the day. Check the Google Cloud Console quota metrics, then re-test the exact HTTP request with one known video ID. If only transcript requests fail, it’s often because the video has no transcript available or it’s region-restricted.
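When a request does fail, the error body usually tells you which of those causes you’re looking at. A hedged sketch of classifying it, assuming the standard Google API error shape (`error.errors[0].reason`); the exact set of reason strings varies by endpoint:

```javascript
// Classifies a failed YouTube Data API response by its machine-readable
// reason. The reason strings below are common Google API values, but the
// full set varies by endpoint, so keep the fallback branch.
function classifyApiError(errorBody) {
  const reason = errorBody?.error?.errors?.[0]?.reason ?? "unknown";
  switch (reason) {
    case "quotaExceeded":
      return "Daily quota exhausted; check the Google Cloud Console quota page.";
    case "keyInvalid":
      return "API key is wrong or restricted for this API.";
    case "accessNotConfigured":
      return "YouTube Data API is not enabled for this project.";
    default:
      return `Unhandled error reason: ${reason}`;
  }
}
```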
**Are there limits on how often I can run it?**
If you self-host n8n, there’s no execution limit, but you’re still limited by YouTube API quota and how much data you ask for per run. On n8n Cloud, your plan sets monthly executions, and this workflow can burn through them if you batch lots of videos at once. Practically, many teams run it a few times per week per channel, then schedule a bigger “research sweep” once a month.
**Is n8n a better fit than Zapier or Make for this?**
Often, yes. n8n is better when you want branching logic (“if the command is X, do Y”), more control over how data is shaped, and the option to self-host. It’s also a more natural fit for AI-agent style workflows because you can keep context and memory in the flow. Zapier or Make can still work if you only want one simple action, like “new comment → send email,” and you never plan to expand it. If you’re on the fence, Talk to an automation expert and describe what “done” looks like for your team.
Once this is running, YouTube research stops being a “someday” task. You get cleaner decisions, faster, and your content planning gets a lot less chaotic.
Need Help Setting This Up?
Our automation experts can build and customize this workflow for your specific needs. Free 15-minute consultation—no commitment required.