YouTube to ElevenLabs, voice IDs logged in Sheets
Voice cloning sounds simple until you try to operationalize it. Someone downloads a YouTube clip, someone else converts it, a third person uploads it to ElevenLabs, and then the all-important voice ID gets lost in a Slack thread you’ll never find again.
This ElevenLabs voice automation hits content leads first, but marketing ops and small agency owners feel the same pain. You want a clean “voice library” your team can reuse without second-guessing what’s approved or redoing work.
This workflow takes YouTube URLs from Google Sheets, creates cloned voices in ElevenLabs, then writes each voice ID back to the exact row it came from. You’ll see what it solves, how it runs, and what you need to launch it.
How This Automation Works
The full n8n workflow, from trigger to final output:
n8n Workflow Template: YouTube to ElevenLabs, voice IDs logged in Sheets
Workflow diagram (flowchart): When clicking ‘Execute workflow’ → Get videos → Loop Over Items → Get Video ID → From video to audio → Download audio → Create voice → Update row in sheet → back to Loop Over Items.
The Problem: Voice IDs Get Lost (and Work Gets Repeated)
When you’re building a repeatable content machine, “we cloned the voice already” is only useful if everyone can find it. In practice, teams create a voice in ElevenLabs, copy the ID somewhere “temporary,” and then move on to the next task. A week later, somebody needs that same voice for a new script, can’t locate the ID, and clones it again from scratch. Now you have duplicates, inconsistent naming, and a growing sense that voice cloning is more hassle than help, honestly.
The friction compounds. Here’s where it breaks down most often:
- Manual downloading and format conversion takes about 10 minutes per clip, and it’s easy to forget a step.
- Teams end up using random “voice names” because nobody knows the naming convention.
- Voice IDs live in chats, Notion pages, or someone’s bookmarks, which means reuse depends on memory.
- Duplicated voices rack up account clutter and force you to QA which one is “the right” clone.
The Solution: Clone Voices from YouTube and Track Them in Sheets
This workflow turns Google Sheets into your voice intake form and your voice registry at the same time. You add a YouTube URL and a voice name into a row, then the workflow looks for entries that have not been processed yet (specifically, where the “ELEVENLABS VOICE ID” cell is still empty). For each row, it extracts the YouTube video ID, converts that video to an audio file through RapidAPI, downloads the audio, and sends it straight into ElevenLabs to create a new voice. Once ElevenLabs responds with a voice_id, the workflow writes it back into the sheet so the ID is never separated from the source link and the intended name.
The workflow starts with a manual run in n8n, pulls only the rows that need work, then processes them in batches so you can feed it a list without babysitting. At the end, Google Sheets becomes the single place your team checks for “what voices exist” and “which ID do we use.”
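The flow described above can be sketched outside n8n as a plain loop. This is only an illustration of the control flow, not the template's actual node code: `pendingRows`, `runPipeline`, and the injected `steps` functions are hypothetical names, while the column names come from the sheet described in this article.

```javascript
// Hypothetical sketch of the workflow's control flow. The step functions are
// injected so the loop can be exercised without calling any real API.
const VOICE_ID_COLUMN = "ELEVENLABS VOICE ID";

// Keep only rows whose voice ID cell is still empty.
function pendingRows(rows) {
  return rows.filter((row) => !String(row[VOICE_ID_COLUMN] ?? "").trim());
}

// Process rows one at a time, mirroring the row-by-row loop in the workflow.
async function runPipeline(rows, steps) {
  const results = [];
  for (const row of pendingRows(rows)) {
    const videoId = steps.extractVideoId(row["YOUTUBE VIDEO"]);
    const downloadUrl = await steps.convertToAudio(videoId);
    const audio = await steps.downloadAudio(downloadUrl);
    const voiceId = await steps.createVoice(audio, row["VOICE NAME"]);
    // Write the ID back against the row it came from.
    await steps.updateRow(row.row_number, voiceId);
    results.push({ row_number: row.row_number, voice_id: voiceId });
  }
  return results;
}
```

Because each row is finished (including the sheet update) before the next one starts, a failure halfway through a run leaves the earlier rows already logged, and the next run picks up only the rows that are still empty.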
What You Get: Automation vs. Results
| What This Workflow Automates | Results You’ll Get |
|---|---|
Example: What This Looks Like
Say your team wants to create 10 new voices for a quarter’s worth of ads. Manually, it’s usually about 10 minutes per voice to find the right section of the YouTube video, download it, convert it, upload it, name it, and then store the voice ID somewhere, so you’re at roughly 100 minutes, or close to 2 hours in total (and that’s before rework). With this workflow, you paste 10 YouTube links and 10 names into Google Sheets in about 10 minutes, run the workflow, and come back when it’s done. Your sheet now contains 10 voice IDs, ready to reuse, with no scavenger hunt later.
What You’ll Need
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- Google Sheets to store URLs, names, and voice IDs.
- RapidAPI (YouTube MP3 2025) to convert YouTube video to audio.
- ElevenLabs API key (generate it in your ElevenLabs account settings).
Skill level: Intermediate. You’ll paste API keys, connect Google, and test a run with a couple of rows.
How It Works
You trigger a run. This workflow uses a manual start inside n8n, which is great when you want control (run it after you’ve added a batch of new rows).
Google Sheets is scanned for “missing IDs.” It retrieves rows where your “ELEVENLABS VOICE ID” cell is empty, so it doesn’t waste time reprocessing voices you already created.
YouTube becomes clean audio. For each row, the workflow extracts the YouTube video identifier, sends it to RapidAPI for conversion, then downloads the resulting audio file for the next step.
ElevenLabs creates the voice and your sheet gets updated. n8n uploads the audio to ElevenLabs, captures the returned voice ID, and writes it back into the same row. That ID is now “official” for the team.
You can easily modify the voice naming to use the “VOICE NAME” column instead of a hardcoded name based on your needs. See the full implementation guide below for customization options.
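As a rough sketch of that naming change, the helper below picks the name from the “VOICE NAME” column and falls back to the hardcoded value only when the cell is empty. `voiceNameFor` and `FALLBACK_NAME` are hypothetical names used for illustration; inside n8n the equivalent would likely be an expression on the name field that reads the column from the loop item, which is an assumption rather than something the template ships with.

```javascript
// Hypothetical helper: choose the voice name sent to ElevenLabs.
// "Sample Voice" is the hardcoded name used in the template's request.
const FALLBACK_NAME = "Sample Voice";

function voiceNameFor(row) {
  // Treat missing, empty, and whitespace-only cells the same way.
  const name = String(row["VOICE NAME"] ?? "").trim();
  return name || FALLBACK_NAME;
}
```

Keeping a fallback means a row with a forgotten name still produces a voice, although you may prefer to skip such rows so unnamed voices never reach your account.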
Step-by-Step Implementation Guide
Step 1: Configure the Manual Trigger
Set the workflow to start on demand so you can test the full pipeline before activating it.
- Add the Manual Start Trigger node as the entry point of the workflow.
- Confirm Manual Start Trigger has no parameters to configure.
- Connect Manual Start Trigger to Retrieve Video Rows.
Step 2: Connect Google Sheets
Pull YouTube URLs from your sheet and later write back the generated voice ID.
- Open Retrieve Video Rows and set Document to 1pZt5RZy6JkcnnxoSG1MFuIrNTLa0P4pVCptuk8uFJdI.
- Set Sheet to Foglio1 (gid0).
- Keep the filter on ELEVENLABS VOICE ID so only unprocessed rows are retrieved.
- Credential Required: Connect your googleSheetsOAuth2Api credentials to Retrieve Video Rows.
- Open Modify Sheet Row and confirm Operation is set to update.
- Set row_number to {{ $('Batch Iterate Items').item.json.row_number }} and ELEVENLABS VOICE ID to {{ $json.voice_id }}.
- Credential Required: Connect your googleSheetsOAuth2Api credentials to Modify Sheet Row.
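The two expressions in Modify Sheet Row read from different places: by the time that node runs, the current item is the ElevenLabs response, so the row number has to be taken from the loop item instead. A minimal sketch of that mapping, with `buildRowUpdate` as a hypothetical name:

```javascript
// Hypothetical model of the Modify Sheet Row mapping.
// loopItem      ~ the item from the loop node (the original sheet row)
// voiceResponse ~ the current item, i.e. the ElevenLabs response
function buildRowUpdate(loopItem, voiceResponse) {
  if (!voiceResponse || !voiceResponse.voice_id) {
    throw new Error("ElevenLabs response did not include a voice_id");
  }
  return {
    row_number: loopItem.row_number, // matches the row to update
    "ELEVENLABS VOICE ID": voiceResponse.voice_id,
  };
}
```

Matching on row_number is what keeps each ID attached to the row it came from, even when the sheet contains duplicate URLs or names.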
Step 3: Set Up Processing and Audio Conversion
Iterate rows, extract video IDs, convert YouTube to audio, and fetch the audio file for voice cloning.
- Connect Retrieve Video Rows to Batch Iterate Items to process rows one at a time.
- Connect Batch Iterate Items (output 1) to Extract Video Identifier.
- In Extract Video Identifier, keep the JavaScript that parses $json['YOUTUBE VIDEO'] into video_id.
- In Convert Video to Audio, set URL to https://youtube-mp3-2025.p.rapidapi.com/v1/social/youtube/audio and Method to POST.
- In Convert Video to Audio, set body parameter id to {{ $json.video_id }} and set headers x-rapidapi-host to youtube-mp3-2025.p.rapidapi.com and x-rapidapi-key to [CONFIGURE_YOUR_API_KEY].
- In Fetch Audio File, set URL to {{ $json.linkDownload }} and keep the response format as file.
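The template's own parsing code is not reproduced here, so the following is only one plausible way to do what these steps describe. `extractVideoId` and `buildConversionRequest` are hypothetical helpers, and the set of URL shapes handled is an assumption; the endpoint, headers, and id parameter are the ones listed above.

```javascript
const RAPIDAPI_HOST = "youtube-mp3-2025.p.rapidapi.com";

// Hypothetical parser: return the video ID from common YouTube URL shapes,
// or null when the input is not a recognizable YouTube link.
function extractVideoId(url) {
  let parsed;
  try {
    parsed = new URL(url);
  } catch {
    return null; // not a valid URL at all
  }
  const host = parsed.hostname.replace(/^(www|m)\./, "");
  if (host === "youtu.be") {
    return parsed.pathname.slice(1).split("/")[0] || null;
  }
  if (host === "youtube.com" || host === "music.youtube.com") {
    if (parsed.pathname === "/watch") return parsed.searchParams.get("v");
    const match = parsed.pathname.match(/^\/(?:shorts|embed|live)\/([^/?#]+)/);
    return match ? match[1] : null;
  }
  return null;
}

// Describe the conversion call without sending it. How the body is encoded
// depends on how the HTTP Request node is configured.
function buildConversionRequest(videoId, apiKey) {
  return {
    url: `https://${RAPIDAPI_HOST}/v1/social/youtube/audio`,
    method: "POST",
    headers: { "x-rapidapi-host": RAPIDAPI_HOST, "x-rapidapi-key": apiKey },
    body: { id: videoId },
  };
}
```

Returning null for unrecognized links makes it easy to skip or flag a bad row before any API credits are spent on it.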
Tip: If Fetch Audio File fails, confirm that the response format is set to file and that {{ $json.linkDownload }} is present in the prior response.
Step 4: Configure Voice Generation and Sheet Update
Send the audio to ElevenLabs, then write the returned voice ID back to your sheet.
- Connect Fetch Audio File to Generate Voice Profile.
- In Generate Voice Profile, set URL to https://api.elevenlabs.io/v1/voices/add and Method to POST.
- Keep Content Type as multipart-form-data and set body parameter name to Sample Voice.
- Set body parameter files to use formBinaryData with Input Data Field Name as data.
- Credential Required: Connect your httpHeaderAuth credentials to Generate Voice Profile.
- Connect Generate Voice Profile to Modify Sheet Row, then connect Modify Sheet Row back to Batch Iterate Items so the loop continues with the next row.
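For reference, the request that Generate Voice Profile sends can be approximated in plain JavaScript (Node 18 or later, which provides FormData and Blob globally). This is a sketch rather than the node's internals: `buildCreateVoiceRequest` is a hypothetical helper, and the audio/mpeg type and the sample.mp3 file name are assumptions about what the conversion step returns.

```javascript
// Hypothetical helper: build (but do not send) the multipart request that
// creates a voice. The name and files fields match the node's body parameters.
function buildCreateVoiceRequest(audioBytes, voiceName, apiKey) {
  const form = new FormData();
  form.append("name", voiceName);
  // Assumption: the downloaded file is MP3 audio.
  form.append("files", new Blob([audioBytes], { type: "audio/mpeg" }), "sample.mp3");
  return {
    url: "https://api.elevenlabs.io/v1/voices/add",
    options: {
      method: "POST",
      // No Content-Type header here: fetch adds the multipart boundary itself.
      headers: { "xi-api-key": apiKey },
      body: form,
    },
  };
}

// Usage (not executed here):
//   const { url, options } = buildCreateVoiceRequest(bytes, "Narrator", key);
//   const response = await fetch(url, options);
//   const { voice_id } = await response.json();
```

The xi-api-key header is what the httpHeaderAuth credential supplies in n8n, so the key itself never needs to appear in the node's parameters.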
Step 5: Test and Activate Your Workflow
Run a manual test to validate every step from Google Sheets to ElevenLabs, then enable the workflow for production use.
- Click Execute Workflow from Manual Start Trigger.
- Verify that Retrieve Video Rows returns rows where ELEVENLABS VOICE ID is empty.
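- Confirm Generate Voice Profile returns a voice_id and that Modify Sheet Row updates the correct row_number with the new ID.
- When results look correct, toggle the workflow to Active to use it in production. n8n only activates workflows that start with a non-manual trigger, so swap Manual Start Trigger for something like a Schedule Trigger first if you want unattended runs.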
Common Gotchas
- Google Sheets credentials can expire or need specific permissions. If things break, check the n8n “Credentials” section and confirm the connected Google account still has access to that spreadsheet.
- If you’re using Wait nodes or external rendering, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
- ElevenLabs voice cloning requires decent source audio. If you feed it a clip with multiple speakers or heavy background music, you’ll get a weak clone and end up redoing it anyway.
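One way to catch bad inputs before they reach ElevenLabs is a small guard between the download and the upload. The sketch below is an assumption about how such a check might look, not part of the template: `checkAudio` and the 1024-byte threshold are hypothetical, and the returned object is shaped so it could feed optional “Status” and “Error message” columns.

```javascript
// Hypothetical threshold: anything smaller is treated as a failed download.
const MIN_AUDIO_BYTES = 1024;

// Return a status object instead of throwing, so the result can be logged
// to the sheet and the loop can move on to the next row.
function checkAudio(conversion, audioBytes) {
  if (!conversion || typeof conversion.linkDownload !== "string" || !conversion.linkDownload) {
    return { status: "error", message: "Conversion response has no linkDownload" };
  }
  const size = audioBytes ? audioBytes.byteLength : 0;
  if (size < MIN_AUDIO_BYTES) {
    return { status: "error", message: `Downloaded audio is only ${size} bytes` };
  }
  return { status: "ok", message: "" };
}
```

A size check cannot tell whether a clip has multiple speakers or background music, so it complements listening to the source audio rather than replacing it.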
Frequently Asked Questions
How long does it take to set up?
About 30 minutes if your API keys and Sheet are ready.
Do I need coding skills?
No. You’ll connect accounts, paste API keys, and test with a couple of rows first.
Is n8n free to use for this workflow?
Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in RapidAPI and ElevenLabs usage costs, which depend on how many voices you create.
Where can I host n8n?
Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.
Can I customize how voices are named?
Yes, and you should. Replace the hardcoded voice name inside the “Generate Voice Profile” HTTP request with the “VOICE NAME” value coming from Google Sheets. People also customize the workflow to skip certain rows (like “do not clone”), enforce naming rules, or write extra columns such as “Status” and “Error message.”
Why is the ElevenLabs step failing?
Most of the time it’s an invalid or expired ElevenLabs API key, so regenerate it and update the xi-api-key header in n8n. It can also fail if your plan doesn’t support voice cloning, or if the audio file download step returns an empty file. If you’re processing a big batch, rate limits can show up too, so slow the batch size down.
How many voices can it process?
Dozens per run is typical, and more if you keep batch sizes reasonable and your APIs don’t throttle you. On n8n Cloud, capacity depends on your execution limits; on self-hosting, there’s no execution cap, but your server and API rate limits become the bottleneck. If you’re planning to process hundreds, run it in smaller batches and monitor failures so you don’t burn time retrying bad source clips.
Is n8n a better fit than Zapier or Make for this?
Often, yes, because this flow depends on binary file handling (downloading audio, uploading audio) and looping through rows, which is where n8n tends to be more flexible and cost-effective. Zapier and Make can do it, but you may run into limits or pay more once you scale beyond a handful of items. If you only need one voice occasionally, those tools can feel simpler. If this is part of a production pipeline, n8n is usually the more comfortable long-term choice. Talk to an automation expert if you want help choosing.
Once your voice IDs are reliably logged, voice cloning stops being a one-off experiment and becomes a reusable asset. Set it up, run it in batches, and let Google Sheets be the source of truth.