OpenAI + Google Drive: ready to upload YouTube videos
Publishing YouTube videos sounds simple until you’re stuck juggling scripts, stock clips, voiceovers, renders, and file exports. One missed step and you’re redoing work you already “finished” yesterday.
Content creators feel it first. A marketing manager trying to keep a channel consistent feels it too. And frankly, an agency running faceless channels can drown in it. This YouTube video automation turns that messy chain into a repeatable pipeline.
You’ll set up a workflow that writes a narration script with OpenAI, generates voice audio, renders a full video with transitions, and drops the final file into Google Drive ready to upload.
How This Automation Works
Here’s the complete workflow you’ll be setting up:
n8n Workflow Template: OpenAI + Google Drive: ready to upload YouTube videos
flowchart LR
subgraph sg0["Schedule Flow"]
direction LR
n0["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Search Pixabay Videos"]
n1@{ icon: "mdi:cog", form: "rounded", label: "Upload Voiceover to Drive", pos: "b", h: 48 }
n2@{ icon: "mdi:cog", form: "rounded", label: "Make Voiceover Public", pos: "b", h: 48 }
n3@{ icon: "mdi:swap-vertical", form: "rounded", label: "Set Video Parameters1", pos: "b", h: 48 }
n4["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/code.svg' width='40' height='40' /></div><br/>Extract Video URLs1"]
n5["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/code.svg' width='40' height='40' /></div><br/>Build Shotstack JSON1"]
n6["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Submit Render Job1"]
n7@{ icon: "mdi:cog", form: "rounded", label: "Wait for Render1", pos: "b", h: 48 }
n8["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Check Render Status1"]
n9@{ icon: "mdi:swap-horizontal", form: "rounded", label: "Is Render Complete?1", pos: "b", h: 48 }
n10["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Download Final Video1"]
n11@{ icon: "mdi:cog", form: "rounded", label: "Save to Google Drive1", pos: "b", h: 48 }
n12@{ icon: "mdi:play-circle", form: "rounded", label: "Schedule Trigger", pos: "b", h: 48 }
n13["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>HTTP Request"]
n14["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/openAi.dark.svg' width='40' height='40' /></div><br/>Generate Script2"]
n15["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/merge.svg' width='40' height='40' /></div><br/>Merge"]
n16@{ icon: "mdi:swap-vertical", form: "rounded", label: "Loop Over Items", pos: "b", h: 48 }
n15 --> n5
n13 --> n1
n16 --> n15
n16 --> n2
n14 --> n13
n12 --> n3
n12 --> n14
n7 --> n8
n6 --> n7
n4 --> n15
n8 --> n9
n9 --> n10
n9 --> n7
n5 --> n6
n10 --> n11
n2 --> n16
n0 --> n4
n3 --> n0
n1 --> n16
end
%% Styling
classDef trigger fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
classDef ai fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef aiModel fill:#e8eaf6,stroke:#3f51b5,stroke-width:2px
classDef decision fill:#fff8e1,stroke:#f9a825,stroke-width:2px
classDef database fill:#fce4ec,stroke:#c2185b,stroke-width:2px
classDef api fill:#fff3e0,stroke:#e65100,stroke-width:2px
classDef code fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef disabled stroke-dasharray: 5 5,opacity: 0.5
class n12 trigger
class n9 decision
class n0,n6,n8,n10,n13 api
class n4,n5 code
classDef customIcon fill:none,stroke:none
class n0,n4,n5,n6,n8,n10,n13,n14,n15 customIcon
Why This Matters: YouTube Production Gets Repetitive Fast
Most YouTube “production” time isn’t creative work. It’s the annoying glue steps between tools: writing a script, finding stock clips that fit, generating a voiceover, stitching everything together, waiting for a render, downloading the export, uploading to storage, and labeling files so you can find them later. Do that once and it’s fine. Do it three times a week and the friction becomes the job. Worse, you end up cutting corners: reusing weak scripts, grabbing random B‑roll, or publishing late because the render failed and you didn’t notice for an hour.
It adds up fast. Here’s where it breaks down in real life.
- Stock footage hunting turns into endless tabs, and you still second-guess the choices.
- Voiceover creation is “quick” until you’re exporting files, renaming them, and re-uploading because something didn’t sync.
- Rendering introduces dead time, so you context-switch, then forget to check status and lose the day.
- Final video files end up scattered, which makes it harder to build a consistent publishing cadence.
What You’ll Build: An Automated Faceless Video Pipeline
This workflow gives you a start-to-finish production loop that runs on a schedule inside n8n. It begins by generating a narration script with OpenAI, then sends that script to ElevenLabs to create professional voice audio. In parallel, it fetches relevant stock clips from Pixabay, prepares them with consistent clip settings, and builds a render payload for Shotstack. Shotstack handles the heavy lifting by assembling clips, adding transitions, syncing the voiceover, and rendering a finished video. Once the render is complete, the workflow retrieves the final file and archives it to Google Drive so you have a clean, ready-to-upload deliverable waiting for you.
The workflow starts on a scheduled trigger. From there, OpenAI generates the script, ElevenLabs creates the audio, and Pixabay supplies the visuals. Finally, Shotstack renders the full video and n8n saves it into Google Drive so you can publish without manual editing.
What You’re Building
| What Gets Automated | What You'll Achieve |
|---|---|
| Scriptwriting and voiceover (OpenAI + ElevenLabs) | Consistent narration without manual recording, exporting, or renaming |
| Stock footage and rendering (Pixabay + Shotstack) | A fully assembled video with transitions and synced audio |
| File handling (Google Drive) | Ready-to-upload deliverables in one predictable folder |
Expected Results
Say you publish three 5–10 minute videos a week. Manually, it’s common to spend about 2 hours per video between script, stock footage selection, voiceover export, editing, rendering, and file handling. That’s roughly 6 hours weekly, and it’s easy for it to creep higher. With this workflow, your “hands-on” time can drop to about 15 minutes per video to review the script and spot-check the output, while rendering happens in the background. You get most of a workday back every week.
Before You Start
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- OpenAI for narration script generation.
- Google Drive to store voiceovers and final videos.
- Pixabay API key (get it from your Pixabay developer dashboard).
- ElevenLabs API key (get it from your ElevenLabs account settings).
- Shotstack API key (get it from the Shotstack developer dashboard).
Skill level: Intermediate. You’ll connect APIs, paste credentials, and test a few runs, but you won’t be writing an app.
Want someone to build this for you? Talk to an automation expert (free 15-minute consultation).
Step by Step
A scheduled run kicks everything off. The workflow uses a schedule trigger so your production can run daily, a few times a week, or whenever your content plan demands it.
The narration is generated and turned into audio. OpenAI drafts a voiceover script, then ElevenLabs converts that text into voice audio. The workflow stores the audio in Google Drive and publishes an accessible file link so the renderer can use it.
Stock clips are fetched and prepared for the video. n8n requests clips from Pixabay, applies consistent settings, and derives usable clip links. A merge step combines the “voice branch” and the “video branch” so everything is ready for assembly.
Shotstack renders, then the final file is archived. The workflow builds a render payload, submits it to Shotstack, waits, polls status, and only proceeds once rendering is complete. Then it downloads the finished video and saves it back to Google Drive as a clean deliverable.
You can easily modify the schedule and video style settings to fit your niche and publishing cadence. See the full implementation guide below for customization options.
Step-by-Step Implementation Guide
Step 1: Configure the Schedule Trigger
Set the workflow cadence so video production runs automatically.
- Add and open Scheduled Run Trigger.
- Define the schedule in `rule.interval` based on your desired frequency (the template ships with it empty, so set it before activating).
- Note the parallel execution: Scheduled Run Trigger outputs to both Assign Clip Settings and Draft Narration Script in parallel.
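For reference, a daily 9 a.m. schedule in the Schedule Trigger's parameters looks roughly like this. The field names follow recent n8n versions and may differ in yours, so treat this as a sketch rather than an exact config:

```json
{
  "rule": {
    "interval": [
      {
        "field": "days",
        "triggerAtHour": 9,
        "triggerAtMinute": 0
      }
    ]
  }
}
```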
Step 2: Connect Pixabay and Define Clip Parameters
This step sets the topic and clip count and fetches matching footage from Pixabay.
- Open Assign Clip Settings and set the fields: `topic` to `productivity business workflow`, `videoDuration` to `300`, and `numberOfClips` to `10`.
- Open Fetch Pixabay Clips and set URL to `https://pixabay.com/api/videos/` and Method to `POST`.
- Under Query Parameters, set `key` to `[CONFIGURE_YOUR_API_KEY]`, `q` to `{{ $json.topic }}`, and `per_page` to `{{ $json.numberOfClips }}`.
- Leave `video_type` as `all`.
- Open Derive Clip Links to confirm it parses Pixabay results into a list of clip URLs and metadata.
Step 3: Set Up AI Script Generation and Voice Synthesis
This branch creates the narration script and converts it to audio.
- Open Draft Narration Script and review the prompt content for your desired tone and length.
- Credential Required: connect your `openAiApi` credentials in Draft Narration Script.
- Open Generate Voice Audio and set URL to `https://api.elevenlabs.io/v1/text-to-speech/[YOUR_ID]`.
- Set the body parameters: `text` to `{{ $json.choices[0].message.content }}`, `voice_settings.stability` to `0.5`, `voice_settings.similarity_boost` to `0.75`, `voice_settings.style` to `0.0`, and `voice_settings.use_speaker_boost` to `true`.
- Credential Required: connect your `httpHeaderAuth` credentials in Generate Voice Audio for ElevenLabs.
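The request the HTTP node sends is easier to reason about as plain data. This sketch assembles the same ElevenLabs payload the step describes; `buildTtsRequest` and both bracketed placeholders are illustrative, and `xi-api-key` is the header ElevenLabs expects when you pass the key directly instead of via n8n's header-auth credential.

```javascript
// Build the ElevenLabs text-to-speech request described in Step 3.
// [YOUR_VOICE_ID] and [YOUR_ELEVENLABS_KEY] are placeholders, as in the workflow.
function buildTtsRequest(openAiResponse, voiceId) {
  return {
    method: "POST",
    url: `https://api.elevenlabs.io/v1/text-to-speech/${voiceId}`,
    headers: {
      "xi-api-key": "[YOUR_ELEVENLABS_KEY]",
      "Content-Type": "application/json",
    },
    body: {
      // Same expression the node uses: the first chat completion's text.
      text: openAiResponse.choices[0].message.content,
      voice_settings: {
        stability: 0.5,
        similarity_boost: 0.75,
        style: 0.0,
        use_speaker_boost: true,
      },
    },
  };
}

const req = buildTtsRequest(
  { choices: [{ message: { content: "Welcome to today's video..." } }] },
  "[YOUR_VOICE_ID]"
);
console.log(req.body.text); // → Welcome to today's video...
```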
Step 4: Store and Share Voiceover Files in Drive
This step saves the generated audio and creates a shareable link for rendering.
- Open Store Voiceover in Drive and set Name to `voiceover_{{ $now.toFormat('yyyy-MM-dd_HH-mm-ss') }}.mp3`.
- Credential Required: connect your `googleDriveOAuth2Api` credentials in Store Voiceover in Drive.
- Open Publish Voiceover Access and confirm the operation is set to `share` with permissions role `reader` and type `anyone`.
- Credential Required: connect your `googleDriveOAuth2Api` credentials in Publish Voiceover Access.
- Verify Iterate Voice Files is connected to both Store Voiceover in Drive and Publish Voiceover Access so it passes the file ID to the merge step.
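Why share the file as role `reader`, type `anyone`? Because the renderer has to fetch the soundtrack over plain HTTP. A common pattern for that (an assumption on my part, not something the workflow spells out) is turning the Drive file ID into a direct-download URL:

```javascript
// Convert a Google Drive file ID into a direct-download URL a renderer can fetch.
// The uc?export=download form is a widely used pattern for publicly shared files;
// it only works once the file's permission is role "reader", type "anyone".
function driveDirectDownloadUrl(fileId) {
  return `https://drive.google.com/uc?export=download&id=${encodeURIComponent(fileId)}`;
}

console.log(driveDirectDownloadUrl("1AbC_dEf"));
// → https://drive.google.com/uc?export=download&id=1AbC_dEf
```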
Step 5: Build and Submit the Render Payload
Combine the clip URLs with the voiceover to generate the Shotstack render request.
- Open Combine Branches and ensure mode is set to `chooseBranch` to merge the clip list with the voiceover output.
- Open Compose Render Payload and confirm it references the Drive file ID via `$('Store Voiceover in Drive').first().json.id`.
- Open Submit Render Request and set URL to `https://api.shotstack.io/v1/render` and Method to `POST`.
- Set `jsonBody` to `{{ JSON.stringify($json) }}` and confirm `specifyBody` is `json`.
- Credential Required: connect your `httpHeaderAuth` credentials in Submit Render Request (Shotstack API key).
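For orientation, here is roughly what Compose Render Payload has to produce. The workflow's actual template may differ in details; this sketch follows Shotstack's edit schema (a timeline with tracks and clips, plus an output block), laying clips end to end with fade transitions and attaching the voiceover as the soundtrack. The function name, example URLs, and fixed per-clip length are all assumptions for illustration.

```javascript
// Sketch of a Shotstack edit payload: clips laid end to end on one track,
// fades between them, and the Drive-hosted voiceover as the soundtrack.
function buildShotstackPayload(clipUrls, voiceoverUrl, clipLength) {
  const clips = clipUrls.map((src, i) => ({
    asset: { type: "video", src },
    start: i * clipLength, // each clip begins where the previous one ends
    length: clipLength,
    fit: "crop",
    transition: { in: "fade", out: "fade" },
  }));

  return {
    timeline: {
      soundtrack: { src: voiceoverUrl, effect: "fadeOut" },
      tracks: [{ clips }],
    },
    output: { format: "mp4", resolution: "hd" },
  };
}

const payload = buildShotstackPayload(
  ["https://cdn.example.com/a.mp4", "https://cdn.example.com/b.mp4"], // hypothetical clip URLs
  "https://drive.google.com/uc?export=download&id=FILE_ID",
  5
);
console.log(payload.timeline.tracks[0].clips[1].start); // → 5
```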
Step 6: Poll Render Status and Archive the Final Video
This loop waits for rendering to finish, then downloads and archives the final video.
- Open Delay for Render and set unit to `seconds` and amount to `60`.
- Open Query Render Status and set URL to `https://api.shotstack.io/v1/render/{{ $json.response.id }}`.
- Credential Required: connect your `httpHeaderAuth` credentials in Query Render Status.
- Open Render Completion Check and ensure the condition compares `{{ $json.success }}` to `true`.
- Open Retrieve Final Video and set URL to `{{ $json.response.url }}` with response format `file`.
- Open Archive Video to Drive and set Name to `final_video_{{ $now.toFormat('yyyy-MM-dd_HH-mm-ss') }}.mp4`.
- Credential Required: connect your `googleDriveOAuth2Api` credentials in Archive Video to Drive, and update `folderId` from `[YOUR_ID]` to a valid Drive folder.
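The poll loop hinges on reading Shotstack's status response correctly. The GET `/render/{id}` call returns `{ success, response: { status, url } }`, where `status` moves through values like `queued` and `rendering` before landing on `done` (with `url` populated). Note that `success: true` only means the status lookup itself worked, so a stricter completion check, offered here as a suggested refinement rather than what the workflow ships with, also inspects `response.status`:

```javascript
// Stricter completion check for Shotstack's GET /render/{id} response.
// success === true only means the status lookup succeeded; the render itself
// is finished when response.status is "done" and a download URL is present.
function isRenderComplete(statusResponse) {
  return (
    statusResponse.success === true &&
    statusResponse.response?.status === "done" &&
    typeof statusResponse.response.url === "string"
  );
}

console.log(isRenderComplete({ success: true, response: { status: "rendering" } })); // → false
console.log(
  isRenderComplete({
    success: true,
    response: { status: "done", url: "https://cdn.shotstack.io/output.mp4" },
  })
); // → true
```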
Step 7: Test and Activate Your Workflow
Validate the full pipeline before enabling scheduled runs.
- Click Execute Workflow to run a manual test from Scheduled Run Trigger.
- Confirm that Fetch Pixabay Clips returns videos and Derive Clip Links outputs valid URLs.
- Verify Draft Narration Script produces a script and Generate Voice Audio returns audio data.
- Check that Submit Render Request returns a render ID and Retrieve Final Video downloads the file.
- Ensure Archive Video to Drive uploads the final MP4 to your target folder.
- Toggle the workflow to Active to allow scheduled execution.
Troubleshooting Tips
- Google Drive credentials can expire or need specific permissions. If things break, check the Google connection inside n8n’s Credentials list first.
- If you’re using Wait nodes or external rendering, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
- Default prompts in AI nodes are generic. Add your brand voice early or you’ll be editing outputs forever.
Quick Answers
How long does setup take?
About 45 minutes if your API keys are ready.
Do I need to code anything?
No. You will connect accounts, add API keys, and adjust a few fields. The workflow already includes the logic for merging branches and preparing the render payload.
Is it free?
Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You'll also need to factor in OpenAI, ElevenLabs, and Shotstack usage-based API costs.
Where should I host n8n?
Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.
Can I customize the workflow?
Yes, and you should. Most people tweak the OpenAI "Draft Narration Script" prompt for niche tone, adjust "Assign Clip Settings" to match their visual pacing, and swap Pixabay queries to fit the topic. If you want different voices, you can change the ElevenLabs request in "Generate Voice Audio" without rebuilding the workflow.
Why does the Google Drive step fail?
Usually it's expired credentials or missing Drive scopes. Reconnect Google Drive in n8n, then re-check the "Store Voiceover in Drive," "Publish Voiceover Access," and "Archive Video to Drive" nodes to confirm they point at the refreshed credential. If it fails only on some runs, look for file/folder permissions on the specific Drive location you selected.
How many videos can I produce per day?
If you self-host n8n, there's no execution cap (it mainly depends on your server and API limits). On n8n Cloud, your monthly executions depend on your plan, and video rendering time will be your practical bottleneck because Shotstack jobs can queue. In most setups, running a few videos per day is realistic as long as you pace requests and handle render waiting properly.
Is n8n the right tool for this, versus Zapier or Make?
For multi-step video creation, yes, but it depends on your tolerance for complexity. n8n is better when you need branching, merging, waiting for renders, and repeatedly checking status without paying extra for every tiny step. It also gives you a self-hosting option, which matters once you're running frequent scheduled jobs. Zapier or Make can be fine for simpler "upload this file" automations, but full rendering pipelines get expensive and awkward there. If you want help choosing, Talk to an automation expert.
Once this is running, you stop “producing videos” the hard way and start reviewing finished drafts. Set it up, let it run, and use the extra hours on topics, thumbnails, and distribution.
Need Help Setting This Up?
Our automation experts can build and customize this workflow for your specific needs. Free 15-minute consultation—no commitment required.