January 22, 2026

FileFlows + OpenAI Whisper, transcripts emailed via Gmail

Lisa Granqvist Partner Workflow Automation Expert

You finally get the audio file. Then you remember the file size limit, the upload failures, and the awkward “part 1 / part 2 / part 3” transcript stitching that always breaks at the worst moment.

This pain hits podcasters hardest, but marketers repurposing interviews and ops folks capturing internal calls feel it too. With this Whisper transcript email automation the outcome is simple: long recordings turn into one clean transcript in your inbox, without babysitting the process.

Below you’ll see how the workflow handles splitting, transcription, and delivery, plus what you need to run it reliably at “real-world” audio lengths.

How This Automation Works

[Workflow diagram: the full n8n workflow, from trigger to final output. n8n Workflow Template: FileFlows + OpenAI Whisper, transcripts emailed via Gmail]

The Problem: Long audio breaks “simple” transcription

Whisper is great, until your recording is longer than a quick clip. A one-hour MP3 can easily exceed the 25 MB upload limit, so you end up hunting for tools to split the file, guessing chunk lengths, and hoping nothing drifts out of order. If you’re doing this for clients or a team, it gets worse: people email you huge attachments, you re-upload the wrong version, and suddenly a “fast transcript” task turns into an afternoon of retries and cleanup. It’s not hard work. It’s fragile work.

The friction compounds. Here’s where it breaks down.

  • A single recording can require multiple manual splits just to fit the API limit.
  • Upload failures force you to restart, and you often don’t notice until much later.
  • Transcripts arrive as separate chunks, so you spend time stitching and reformatting.
  • Delivery becomes another job: copying text, saving files, and emailing the right person.

The Solution: Split, transcribe, merge, and email automatically

This workflow turns “long audio transcription” into a simple intake form and an automatic delivery. It starts when someone uploads an MP3 via a web form and includes the email address where the transcript should go. n8n then prepares the file for safe handling by splitting it into small 4 MiB upload parts and sending them to FileFlows. FileFlows, using FFmpeg, segments the audio into 15-minute chunks so that, at typical spoken-audio MP3 bitrates, every piece stays under Whisper’s 25 MB size limit. Each segment is transcribed through the OpenAI Whisper API (French by default, but you can change the language), and then n8n merges the text back into one coherent transcript. Finally, Gmail sends the finished transcript automatically, or sends a clear error email if something fails.

The workflow starts with a form submission in n8n. FileFlows does the heavy lifting for audio splitting, so Whisper only sees safe-sized segments. When transcription is complete, n8n assembles a single text file and emails it out, which means the “last mile” is handled too.

What You Get: Automation vs. Results

Example: What This Looks Like

Say you transcribe two 60-minute interviews each week for a podcast. Manually, you might spend about 30 minutes per file splitting it, uploading parts, waiting, and stitching text back together, so that’s roughly 1 hour of admin work weekly before you even edit the content. With this workflow, the “human time” is closer to 5 minutes per file (upload + email), or about 10 minutes a week, and processing runs in the background. You get the transcript in about 10–15 minutes per hour of audio, delivered automatically to the right inbox.

What You’ll Need

  • n8n instance (try n8n Cloud free)
  • Self-hosting option if you prefer (Hostinger works well)
  • FileFlows for audio splitting and orchestration.
  • OpenAI Whisper API to transcribe each audio segment.
  • Gmail account to email transcripts and error notices.
  • OpenAI API key (get it from your OpenAI dashboard).

Skill level: Intermediate. You’ll connect credentials, set a few URLs/paths, and confirm FileFlows can reach your storage and n8n.

How It Works

A user submits a form with an MP3 and an email address. That form trigger kicks off the run and stores basic settings (like the target language and the callback details for FileFlows).

The MP3 is prepared for upload and sent to FileFlows in safe-sized parts. n8n splits the binary file into 4 MiB chunks, loops over them in batches, and uploads each part via HTTP so huge files don’t fail halfway through.

FileFlows splits the audio, and n8n transcribes each segment with Whisper. After FileFlows creates 15-minute segments, n8n waits for the callback, extracts the segment list, then loops through it. Whisper transcribes each piece, and a throttle wait helps avoid rate spikes when you’re processing a lot.

Everything is merged, turned into a text file, and emailed via Gmail. When all segments are done, n8n combines the transcript text in order, converts it into a downloadable file, and sends the final email. If splitting or transcription fails, the workflow sends an error notice instead of leaving you guessing.

You can easily modify the language and segment duration to match your needs. See the full implementation guide below for customization options.

Step-by-Step Implementation Guide

Step 1: Configure the Form Trigger

Set up the public form that collects the audio file and recipient email address to start the workflow.

  1. Add the Incoming Form Capture node as the trigger.
  2. Set Form Title to Audio Transcription.
  3. Set Form Description to Select an audio file to transcribe and an email address to receive the result.
  4. In Form Fields, add a file field labeled file (required) that accepts .mp3, and an email field labeled email (required).
  5. Confirm the response message in Respond With Options is set to Your file has been received; an email will be sent to you upon completion of transcription or in case of error.

Tip: The Create 4MiB Segments code expects the uploaded file field to be named file. If you rename the field, update the code accordingly.

Step 2: Configure Workflow Constants and File Chunking

Define the chunk size and FileFlows API settings, then split the incoming audio into 4MiB chunks for upload.

  1. Open Set Workflow Config and set chunk_size to {{ 4 * 1024 * 1024 }}.
  2. Set fileflows_url to your FileFlows base URL, e.g., http://0.0.0.0:5000 (replace 0.0.0.0 with the hostname or IP address that n8n can actually reach).
  3. Set flowUid to your FileFlows flow ID, replacing [YOUR_ID].
  4. In Create 4MiB Segments, keep the provided JavaScript to read the binary upload and generate chunk metadata and binary parts.
  5. Ensure Iterate Chunk Batches is connected after Create 4MiB Segments to control chunk uploads.
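
The chunking idea in this step can be sketched in plain Node.js. This is not the template's actual Create 4MiB Segments code, only a minimal illustration: `splitIntoChunks` and the `data` property are hypothetical names, and the 1-based `chunkNumber` is an assumption you should check against your FileFlows version. The `fileName`, `chunkNumber`, and `totalChunks` fields mirror the parameters the upload node maps in the next step.

```javascript
// Hypothetical helper: slice one uploaded file into fixed-size parts.
// Only fileName, chunkNumber, and totalChunks come from the workflow;
// the function name and the `data` property are illustrative.
const CHUNK_SIZE = 4 * 1024 * 1024; // 4 MiB, same value as chunk_size

function splitIntoChunks(buffer, fileName, chunkSize = CHUNK_SIZE) {
  if (!Buffer.isBuffer(buffer)) {
    throw new TypeError('buffer must be a Node.js Buffer');
  }
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  const totalChunks = Math.ceil(buffer.length / chunkSize);
  const chunks = [];
  for (let i = 0; i < totalChunks; i++) {
    const start = i * chunkSize;
    chunks.push({
      fileName,
      chunkNumber: i + 1, // assumed 1-based numbering
      totalChunks,
      // subarray clamps at the end of the buffer, so the last part may be shorter
      data: buffer.subarray(start, start + chunkSize),
    });
  }
  return chunks;
}

// Usage: a 10 MiB buffer yields three parts (4 + 4 + 2 MiB).
const parts = splitIntoChunks(Buffer.alloc(10 * 1024 * 1024), 'interview.mp3');
console.log(parts.map((p) => p.data.length));
```

Carrying `totalChunks` on every part is what lets the receiving side know when the upload is complete, even if parts are sent one at a time in a loop.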

⚠️ Common Pitfall: If flowUid or fileflows_url is incorrect, Upload Segment Part and Initiate Audio Split will fail silently, and the process will stall at Pause for Callback.

Step 3: Connect FileFlows Upload and Split Requests

Upload each chunk to FileFlows, validate the response, and trigger the server-side split operation.

  1. In Upload Segment Part, set URL to ={{ $('Set Workflow Config').item.json.fileflows_url }}/api/library-file/upload and Method to POST.
  2. Set Content Type to multipart-form-data and map body parameters: fileName to {{$json["fileName"]}}, chunkNumber to {{$json["chunkNumber"]}}, totalChunks to {{$json["totalChunks"]}}, and file to binary field chunk.
  3. Use Exclude Temp Records to filter out temporary items with condition {{ $json.data }} notEndsWith .temp.
  4. In Success Check, confirm the condition {{ $json.data }} exists to route valid uploads to Initiate Audio Split.
  5. Configure Initiate Audio Split with URL ={{ $('Set Workflow Config').item.json.fileflows_url }}/api/library-file/manually-add and JSON body { "FlowUid": "{{ $('Set Workflow Config').first().json.flowUid }}", "Files": [ "{{ $json.data }}" ], "CustomVariables": { "callbackUrl": "{{$execution.resumeUrl}}" } }.
  6. Keep Pause for Callback set to Resume webhook, HTTP Method POST, and Resume Amount 30 minutes to wait for FileFlows to send back split audio.
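
To make the filter and the split request easier to reason about, here is a rough sketch of the same logic as plain JavaScript. The helper names (`isFinalUpload`, `buildSplitRequest`), the host, the flow ID, and the resume URL are all placeholders, and it assumes `data` holds the uploaded file path as a string. The `FlowUid`, `Files`, and `CustomVariables.callbackUrl` keys are the ones used in the JSON body above.

```javascript
// Hypothetical helpers mirroring the Exclude Temp Records filter and the
// Initiate Audio Split body. `data` is assumed to be the uploaded file path.
function isFinalUpload(response) {
  return typeof response.data === 'string' && !response.data.endsWith('.temp');
}

function buildSplitRequest(config, uploadedPath, resumeUrl) {
  // Strip trailing slashes so the joined URL never contains "//api".
  const baseUrl = config.fileflows_url.replace(/\/+$/, '');
  return {
    url: `${baseUrl}/api/library-file/manually-add`,
    body: {
      FlowUid: config.flowUid,
      Files: [uploadedPath],
      CustomVariables: { callbackUrl: resumeUrl },
    },
  };
}

// Example values are placeholders, not real endpoints or IDs.
const config = { fileflows_url: 'http://fileflows.local:5000/', flowUid: 'example-flow-uid' };
const responses = [
  { data: '/tmp/uploads/interview.mp3.temp' },
  { data: '/tmp/uploads/interview.mp3' },
];
const finished = responses.filter(isFinalUpload);
const request = buildSplitRequest(
  config,
  finished[0].data,
  'https://n8n.example/webhook-waiting/123'
);
console.log(JSON.stringify(request, null, 2));
```

Building the body as an object and serializing it, rather than pasting values into a JSON string, avoids broken requests when a file path contains quotes or backslashes.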

Step 4: Set Up Audio Extraction and Transcription Loop

Transform the callback binaries into items, loop through each audio part, and send them to OpenAI for transcription.

  1. In Extract Audio Parts, keep the provided JavaScript to convert all returned binaries into individual items with a binary field named Audio.
  2. Ensure Process Segment Loop follows Extract Audio Parts to iterate each segment.
  3. Set up OpenAI Transcribe with Resource audio, Operation transcribe, and Binary Property Name Audio. Keep Language set to fr under options if you want French transcription.
  4. Credential Required: Connect your openAiApi credentials in OpenAI Transcribe.
  5. Use Throttle Pause after OpenAI Transcribe to control request pacing before looping back to Process Segment Loop.

Tip: The Segment Marker and Transcription Output nodes are noOp markers to keep the loop structured and easy to debug.
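
If you want a mental model of what Extract Audio Parts does, the reshaping might look roughly like this. It is a sketch written as a plain function so it runs outside n8n, not the template's own code. The `{ json, binary }` item shape follows n8n's item format, but the binary key names (`part1`, `part2`, ...), the `segmentIndex` and `sourceKey` fields, and the assumption that key order matches audio order are all illustrative.

```javascript
// Hypothetical sketch: turn one callback item carrying many binaries into
// one item per segment, each with a single binary field named "Audio".
function extractAudioParts(callbackItem) {
  const binary = callbackItem.binary || {};
  return Object.keys(binary)
    // Numeric-aware sort so "part10" lands after "part2" and order is kept.
    .sort((a, b) => a.localeCompare(b, undefined, { numeric: true }))
    .map((key, index) => ({
      json: { segmentIndex: index, sourceKey: key },
      binary: { Audio: binary[key] },
    }));
}

// Usage with placeholder binary metadata (no real audio involved).
const callbackItem = {
  json: {},
  binary: {
    part10: { fileName: 'seg_010.mp3' },
    part2: { fileName: 'seg_002.mp3' },
    part1: { fileName: 'seg_001.mp3' },
  },
};
const segments = extractAudioParts(callbackItem);
console.log(segments.map((s) => s.binary.Audio.fileName));
```

The explicit sort matters because the final transcript is only as ordered as the segments going into the loop.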

Step 5: Combine Transcripts and Deliver Results

Aggregate all transcription pieces into a single text file and email it to the requester.

  1. In Combine Transcripts, keep the JavaScript that concatenates item.json.text into a single transcription field.
  2. Configure Build Text File with Operation toText and Source Property transcription.
  3. Set Build Text File options to File Name transcription.txt and Encoding utf8.
  4. In Email Transcript Delivery, set Send To to ={{ $('Incoming Form Capture').first().json.email }}, Subject to Your transcription is ready, and include the provided message.
  5. Credential Required: Connect your gmailOAuth2 credentials in Email Transcript Delivery.
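
The merge in Combine Transcripts boils down to a join over `item.json.text`. A minimal sketch, assuming items arrive in segment order; `combineTranscripts` is a hypothetical name and the blank-line separator is a choice made here, not necessarily what the template does.

```javascript
// Hypothetical version of the merge step: join each item's json.text, in
// order, into a single `transcription` field on one output item.
function combineTranscripts(items, separator = '\n\n') {
  const transcription = items
    .map((item) =>
      item.json && typeof item.json.text === 'string' ? item.json.text.trim() : ''
    )
    // Drop empty segments (for example, silence) so no blank gaps appear.
    .filter((text) => text.length > 0)
    .join(separator);
  return [{ json: { transcription } }];
}

// Usage with placeholder segment texts.
const merged = combineTranscripts([
  { json: { text: ' First fifteen minutes. ' } },
  { json: {} },
  { json: { text: 'Second fifteen minutes.' } },
]);
console.log(merged[0].json.transcription);
```

Returning a single item with a `transcription` property matches what Build Text File expects as its Source Property.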

Step 6: Add Error Handling for Split and Transcription Failures

Ensure users receive feedback when FileFlows or transcription fails.

  1. From Success Check, verify the false branch routes to Split Error Email.
  2. In Split Error Email, keep Send To set to ={{ $('Incoming Form Capture').first().json.email }} and confirm the error message content.
  3. From OpenAI Transcribe, ensure the error output connects to Email Error Notice to notify transcription issues.
  4. Credential Required: Connect your gmailOAuth2 credentials in both Email Error Notice and Split Error Email.

⚠️ Common Pitfall: If you remove the error output from OpenAI Transcribe, failed transcriptions will not trigger Email Error Notice, leaving users without feedback.

Step 7: Test and Activate Your Workflow

Validate the full pipeline with a test submission before enabling it in production.

  1. Click Execute Workflow and submit the Incoming Form Capture form with a small .mp3 file and a valid email.
  2. Verify that Upload Segment Part, Initiate Audio Split, and Pause for Callback execute without errors.
  3. Confirm OpenAI Transcribe produces text outputs and Combine Transcripts creates the transcription field.
  4. Check that Email Transcript Delivery sends an email with transcription.txt attached to the provided address.
  5. When testing is successful, toggle the workflow to Active to accept live form submissions.

Common Gotchas

  • Gmail credentials can expire or need specific permissions. If things break, check your connected Google account status in n8n credentials first.
  • The Pause for Callback Wait node depends on how long FileFlows takes to split the audio, and processing times vary. Increase the wait duration if long recordings time out or downstream nodes fail on empty responses.
  • OpenAI prompts and defaults matter more than people expect. The Whisper node is configured for French by default here, so confirm language settings early or you’ll be correcting mistakes after the fact.

Frequently Asked Questions

How long does it take to set up this Whisper transcript email automation?

About 45 minutes if FileFlows and Gmail are already working.

Do I need coding skills to automate Whisper transcript email delivery?

No. You’ll mainly paste credentials and adjust a few settings in the form, FileFlows endpoint, and the Whisper node.

Is n8n free to use for this Whisper transcript email workflow?

Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in OpenAI Whisper API costs at $0.006 per minute (so a 1-hour recording is about $0.36).
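
For budgeting, the arithmetic is easy to script. A small sketch using the $0.006 per minute rate quoted above (check current OpenAI pricing before relying on it); `estimateWhisperCost` is a hypothetical helper name.

```javascript
// Rate quoted in this article; treat it as an input, not a guarantee.
const WHISPER_RATE_PER_MINUTE = 0.006; // USD

function estimateWhisperCost(minutes, ratePerMinute = WHISPER_RATE_PER_MINUTE) {
  // Round to whole cents to avoid floating-point noise in the result.
  return Math.round(minutes * ratePerMinute * 100) / 100;
}

console.log(estimateWhisperCost(60)); // one 60-minute recording
console.log(estimateWhisperCost(2 * 60)); // two 60-minute interviews in a week
```

At that rate, one 60-minute file comes out at 0.36 and two of them at 0.72.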

Where can I host n8n to run this automation?

Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.

Can I customize this Whisper transcript email workflow for English audio and different chunk sizes?

Yes, but you need to change two places. Update the language setting in the OpenAI Transcribe node, then adjust the FileFlows split settings so your segments stay under the 25 MB limit. Common tweaks include switching French to English, shortening segments for noisy audio, and changing the email template to include speaker labels or a summary link.
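
Before changing the segment duration, it helps to estimate how big a segment will be. The sketch below assumes constant-bitrate MP3 and ignores headers and tags, treats “25 MB” as 25,000,000 bytes (the conservative reading), and applies a 10% safety margin. The function names and the margin are illustrative choices, not part of the template.

```javascript
// Assumption: "25 MB" read as decimal megabytes, the stricter interpretation.
const WHISPER_LIMIT_BYTES = 25 * 1000 * 1000;

// Approximate size of a constant-bitrate MP3 segment (no headers or tags).
function estimateSegmentBytes(minutes, bitrateKbps) {
  return (bitrateKbps * 1000 / 8) * minutes * 60;
}

// Longest whole-minute segment that stays under the limit with a margin.
function maxSegmentMinutes(bitrateKbps, limitBytes = WHISPER_LIMIT_BYTES, margin = 0.9) {
  const bytesPerMinute = estimateSegmentBytes(1, bitrateKbps);
  return Math.floor((limitBytes * margin) / bytesPerMinute);
}

for (const kbps of [64, 128, 192, 320]) {
  const mb = estimateSegmentBytes(15, kbps) / 1e6;
  console.log(`${kbps} kbps: 15 min is about ${mb.toFixed(1)} MB, max about ${maxSegmentMinutes(kbps)} min`);
}
```

By this estimate, a 15-minute segment at 128 kbps is about 14.4 MB, well inside the limit, while the same duration at 320 kbps is about 36 MB and would need shorter segments (roughly 9 minutes with the margin used here).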

Why is my FileFlows connection failing in this workflow?

Usually it’s network reachability or a wrong endpoint URL between n8n and FileFlows. Confirm n8n can reach the FileFlows host from its network, then re-check the HTTP Request node settings and any required headers. If FileFlows is running in Docker, port mapping and internal DNS names are common culprits. Also make sure the storage path FileFlows uses is writable, or the split job can “succeed” but produce nothing.

How many audio files can this Whisper transcript email automation handle?

On n8n Cloud Starter, you’re limited by monthly executions, while self-hosting is mainly limited by your server and how fast FileFlows can process jobs. A practical approach is to start with a few files per day, then increase concurrency once you’re confident in your queueing and wait times. If you expect bursts (like 20 uploads after an event), consider adding longer waits and slightly more throttling so Whisper and FileFlows don’t get overwhelmed.

Is this Whisper transcript email automation better than using Zapier or Make?

For long-audio transcription, yes. Zapier and Make struggle once you need chunked uploads, callbacks, looping over segments, and reliable error emails in one flow. n8n handles branching and loops cleanly, and you can self-host to avoid per-task pricing surprises when volume grows. The tradeoff is setup: you’ll spend a bit more time connecting FileFlows and testing the wait/callback logic. If you want someone to sanity-check it, talk to an automation expert.

Once this is running, long-audio transcription stops being a recurring chore. You upload, you wait a bit, you get the transcript, and you move on.
