January 22, 2026

Telegram + OpenAI: voice notes into searchable text

Lisa Granqvist Partner Workflow Automation Expert

Your Telegram chats are full of decisions, client details, and “don’t forget this” moments. And then they disappear into voice notes that nobody can search, skim, or copy into docs.

Marketing managers end up replaying audio to pull quotes. Agency owners lose action items across client threads. Even ops leads feel it when approvals live inside 45 seconds of mumbled context. This Telegram transcription automation turns voice into clean text you can actually use.

You’ll set up an n8n workflow that listens for voice notes, transcribes them with OpenAI, falls back to Gemini if needed, and posts readable text back to the same chat (even when messages are long).

How This Automation Works

[Workflow diagram: n8n Workflow Template, Telegram + OpenAI: voice notes into searchable text]

Why This Matters: Voice Notes Aren’t Searchable

Voice notes feel fast in the moment. But later, they’re friction. Someone asks, “What did the client say about pricing?” and now you’re scrubbing through audio, turning volume up, rewinding, and hoping you caught the important part. Multiply that by a busy week of team chats and you get a quiet tax on your time. The worst part is the context loss: decisions don’t make it into your docs, and follow-ups get missed because the “real info” lived in audio.

It adds up fast. Here’s where it usually breaks down.

  • People stop documenting because replaying audio is annoying, so knowledge stays trapped in Telegram.
  • Manual transcription is slow and error-prone, especially when multiple people talk or accents vary.
  • One long voice message can exceed Telegram’s 4,096-character message limit when transcribed, so the text gets cut off or never sent.
  • Without access control, anyone can trigger transcriptions and burn through AI credits (sometimes accidentally).

What You’ll Build: Secure Voice-to-Text in Telegram

This workflow turns Telegram voice messages into readable text replies, automatically. It starts when someone posts a voice note or audio file in your Telegram group, then verifies the sender is allowed to use the transcription service. If the message contains supported audio, n8n downloads the file and sends it to OpenAI for transcription (with a quick “transcription started” notice so people aren’t left guessing). If OpenAI errors out, the workflow routes the same file to Gemini as a backup. Finally, the transcribed text is posted back into the chat, and long transcripts are split into multiple messages so nothing gets chopped.

The flow is simple in practice. Telegram triggers the run, access control keeps usage clean, and AI handles the transcription. Then n8n formats the result for Telegram’s limits and delivers it right where your team already works.

Expected Results

Say your team gets 10 voice notes a day and each one takes about 3 minutes to replay, pause, and type into something usable. That’s roughly 30 minutes daily, and it’s usually the worst 30 minutes because it interrupts real work. With this workflow, you drop the voice note as usual, get a “started” notification, then receive the transcript back in chat. Your manual time becomes close to zero, and long messages still arrive as multiple chunks instead of failing.

Before You Start

  • n8n instance (try n8n Cloud free)
  • Self-hosting option if you prefer (Hostinger works well)
  • Telegram for receiving voice notes and posting transcripts.
  • OpenAI to transcribe audio with Whisper.
  • Google Gemini as backup transcription if OpenAI fails.
  • OpenAI API key (get it from your OpenAI dashboard).

Skill level: Intermediate. You’ll connect Telegram + AI credentials, then adjust a few rules (authorized users, formats, and output behavior).

Want someone to build this for you? Talk to an automation expert (free 15-minute consultation).

Step by Step

A Telegram message comes in. The workflow triggers on new messages in your group and captures sender details plus the message type (voice note, audio file, or just text).

Access gets checked immediately. An “if” rule verifies the sender against an approved list. If they’re not authorized, n8n replies with an access denied message and stops, which means you don’t spend AI credits on random requests.

Audio is detected and validated. The workflow figures out whether there’s a file to transcribe, pulls the correct Telegram file ID, and checks the audio format (OGG voice messages, MP3, M4A/MP4, and other supported types). If there’s no audio or an unknown format, it sends a clear warning back to the chat.

Transcription runs, with a fallback. n8n downloads the file and sends it to OpenAI for transcription while also posting a quick “started” notification. If OpenAI errors, the workflow automatically routes the same file to Gemini, then maps the resulting text into a single output variable so the rest of the workflow behaves the same.

The transcript is delivered safely. If the transcript is under 4,000 characters (a safe margin below Telegram’s 4,096-character message limit), it posts once. If it’s longer, a code step splits it into readable chunks and sends multiple messages in sequence.

You can easily modify the authorized user list to match your team, or adjust how chunking behaves based on your chat style. See the full implementation guide below for customization options.

Step-by-Step Implementation Guide

Step 1: Configure the Telegram Trigger

Set up the entry point that listens for incoming Telegram messages and starts the workflow.

  1. Add and open Incoming Telegram Trigger.
  2. Set Updates to message.
  3. Credential Required: Connect your telegramApi credentials.

Tip: Ensure your Telegram bot is already created and allowed to receive messages from the users you plan to authorize.

Step 2: Validate Sender and Detect Message Type

Gate access to the bot and route messages based on whether they contain voice or audio files.

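  1. Open Validate Sender Access and set the conditions to allow only authorized users: compare ={{ $json.message.from.username }} against your allowed usernames. The template ships with the placeholder values User 1 and User 2, so replace them with real Telegram usernames.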
  2. Confirm Validate Sender Access routes to Detect Message Type on the true branch and to Send Access Denied on the false branch.
  3. In Detect Message Type, verify the rules check for voice and audio objects using ={{ $json.message.voice }} and ={{ $json.message.audio }}, and keep Fallback Output set to extra.
  4. Confirm Detect Message Type routes to Set Voice File ID, Set Audio File ID, or Alert Missing File depending on the message content.

⚠️ Common Pitfall: If the username in Validate Sender Access does not exactly match the sender’s Telegram username (case-sensitive), the workflow will always route to Send Access Denied.
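
If exact matching keeps tripping you up, one option is to do the check in a Code node instead. The sketch below is not part of the template: the function name isAuthorized and the usernames in ALLOWED_USERNAMES are made up for illustration. It assumes the message has the same shape the trigger passes along (message.from.username) and compares names case-insensitively.

```javascript
// Hypothetical allow-list; replace with your team's Telegram usernames.
const ALLOWED_USERNAMES = ["alice_ops", "bob_marketing"];

// Returns true when the sender's username is on the allow-list.
// The comparison ignores case and tolerates a leading "@".
function isAuthorized(message, allowed = ALLOWED_USERNAMES) {
  const username = message?.from?.username;
  // Telegram users without a public username are rejected.
  if (typeof username !== "string") return false;
  const normalize = (name) => name.replace(/^@/, "").toLowerCase();
  return allowed.map(normalize).includes(normalize(username));
}
```

The trade-off: a normalized check is more forgiving of typos in capitalization, while the template’s exact match is stricter. Either way, senders without a username are denied.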

Step 3: Prepare File IDs and Validate Audio Format

Extract the file ID from the message and enforce accepted audio MIME types before downloading the file.

  1. In Set Voice File ID, set the assignment file_id to ={{ $json.message.voice.file_id }} and enable Include Other Fields.
  2. In Set Audio File ID, set the assignment file_id to ={{ $json.message.audio.file_id }} and enable Include Other Fields.
  3. In Validate Audio Format, keep the MIME checks for audio/ogg, audio/mpeg, audio/mp4, and audio/m4a using the expressions referencing Incoming Telegram Trigger.
  4. Confirm the false branch of Validate Audio Format routes to Warn Unrecognized File.
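
To see what Steps 2 and 3 do together, here is a rough sketch of the same logic as one function. It is an illustration, not the template’s implementation: pickAudioFile and ACCEPTED_MIME_TYPES are invented names, and the message shape (voice or audio objects carrying file_id and mime_type) is assumed to match what the Telegram trigger delivers.

```javascript
// MIME types named in the Validate Audio Format step.
const ACCEPTED_MIME_TYPES = ["audio/ogg", "audio/mpeg", "audio/mp4", "audio/m4a"];

// Picks the file to transcribe from a Telegram message object.
// Returns null when the message has neither a voice note nor an audio file.
function pickAudioFile(message) {
  const media = message?.voice ?? message?.audio;
  if (!media) return null;
  return {
    file_id: media.file_id,
    mime_type: media.mime_type,
    // A missing or unlisted MIME type counts as unsupported.
    supported: ACCEPTED_MIME_TYPES.includes(media.mime_type),
  };
}
```

A null result corresponds to the Alert Missing File branch, and supported: false corresponds to Warn Unrecognized File.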

Credential Required: Connect your telegramApi credentials to all Telegram action nodes (8 total) including Send Access Denied, Alert Missing File, Warn Unrecognized File, Notify Start Transcription, Retrieve File for GPT, Fetch File for Gemini, Send Transcript Output, and Send Chunked Output.

Step 4: Retrieve the File and Run Parallel Transcription

Download the media file and launch both the transcription process and a user notification in parallel.

  1. In Retrieve File for GPT, set Resource to file and File ID to ={{ $json.file_id }}.
  2. Retrieve File for GPT outputs to both OpenAI Transcription and Notify Start Transcription in parallel.
  3. In Notify Start Transcription, set Text to Starting transcription. Please wait. and Chat ID to ={{ $('Incoming Telegram Trigger').item.json.message.chat.id }}.
  4. Credential Required: Connect your openAiApi credentials to OpenAI Transcription.

Tip: Parallel execution ensures the user is notified immediately while transcription starts in the background.

Step 5: Configure Gemini Fallback and Map Transcription Text

Send the file to Gemini and normalize both transcription outputs into a consistent text field.

  1. In Fetch File for Gemini, set Resource to file and File ID to ={{ $('Retrieve File for GPT').item.json.result.file_id }}.
  2. In Gemini Transcription, set Resource to audio, Input Type to binary, and Binary Property Name to =data.
  3. Credential Required: Connect your googlePalmApi credentials to Gemini Transcription.
  4. In Map Text Variable, set text to ={{ $json.text }}.
  5. In Map Text Variable 2, set text to ={{ $json.content.parts[0].text }}.
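
The two Map Text Variable nodes exist because each provider returns the transcript in a different place. The sketch below shows that idea, plus the fallback order, as plain JavaScript. The function names are illustrative, and primary and backup are hypothetical async functions standing in for the OpenAI Transcription and Gemini Transcription nodes. The output shapes are taken from the expressions in this step.

```javascript
// Pulls the transcript out of either provider's output shape.
// OpenAI shape:  { text: "..." }
// Gemini shape:  { content: { parts: [{ text: "..." }] } }
function extractTranscript(output) {
  if (typeof output?.text === "string") return output.text;
  const geminiText = output?.content?.parts?.[0]?.text;
  if (typeof geminiText === "string") return geminiText;
  return ""; // neither shape matched; treat as an empty transcript
}

// Tries the primary transcriber and calls the backup only if the primary throws.
// `primary` and `backup` are placeholders for the two transcription nodes.
async function transcribeWithFallback(file, primary, backup) {
  let output;
  try {
    output = await primary(file);
  } catch (error) {
    output = await backup(file);
  }
  return extractTranscript(output);
}
```

Because both paths end in the same text value, everything after this step (length check, sending) does not need to know which provider produced the transcript.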

Step 6: Route by Text Length and Send Output

Send short transcripts as a single message and split long transcripts into chunks.

  1. In Check Text Length, keep the condition ={{ $json["text"].length }} lt 4000.
  2. On the true branch, send output via Send Transcript Output with Text set to ={{ $json.text }} and Chat ID set to ={{ $('Incoming Telegram Trigger').item.json.message.chat.id }}.
  3. On the false branch, use Split Text Chunks with the provided JavaScript to split into 4000-character chunks.
  4. In Send Chunked Output, set Text to ={{ $json.body }} and Chat ID to ={{ $('Incoming Telegram Trigger').item.json.message.chat.id }}.

⚠️ Common Pitfall: If Split Text Chunks is modified to return a different field than body, Send Chunked Output will send empty messages.
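
If you want to adjust the chunking, it helps to see the logic spelled out. The following is a minimal sketch of one way to split text, not the JavaScript that ships with the template. The names splitIntoChunks and toItems are assumptions. It keeps each chunk at or under 4000 characters, prefers to break at a space, and returns items with a body field so that Send Chunked Output can keep reading ={{ $json.body }}.

```javascript
const MAX_CHUNK_LENGTH = 4000; // stays under Telegram's 4096-character message limit

// Splits text into chunks no longer than maxLength,
// breaking at the last space inside each window when one exists.
function splitIntoChunks(text, maxLength = MAX_CHUNK_LENGTH) {
  const chunks = [];
  let remaining = text;
  while (remaining.length > maxLength) {
    let cut = remaining.lastIndexOf(" ", maxLength);
    if (cut <= 0) cut = maxLength; // no usable space: hard cut
    const chunk = remaining.slice(0, cut).trimEnd();
    if (chunk) chunks.push(chunk); // skip chunks that were only whitespace
    remaining = remaining.slice(cut).trimStart();
  }
  if (remaining.length > 0) chunks.push(remaining);
  return chunks;
}

// Wraps each chunk as an n8n-style item with the text in `body`.
function toItems(chunks) {
  return chunks.map((body) => ({ json: { body } }));
}
```

Inside a Code node you would typically end with something like return toItems(splitIntoChunks(text)), where text comes from the incoming item. For smaller replies, pass a lower maxLength.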

Step 7: Test & Activate Your Workflow

Verify end-to-end behavior with real Telegram messages before going live.

  1. Click Execute Workflow and send a voice note or audio file to your Telegram bot.
  2. Confirm that Notify Start Transcription sends the “Starting transcription. Please wait.” message immediately.
  3. Check that transcription output appears via Send Transcript Output for short text or via multiple messages from Send Chunked Output for long text.
  4. Once verified, toggle the workflow to Active for production use.

Troubleshooting Tips

  • Telegram credentials can expire or require the right bot permissions. If messages aren’t being received or sent, check your Telegram bot access in n8n’s Credentials first.
  • Transcription time varies with audio length and provider load. If you add Wait nodes around the transcription steps, increase the wait duration when downstream nodes fail on empty responses.
  • Default prompts and settings in AI nodes can be generic. If you want consistent formatting (bullets, action items, speaker labels), define that output style early or you will be cleaning transcripts by hand.

Quick Answers

What’s the setup time for this Telegram transcription automation?

About 30 minutes if your Telegram bot and API keys are ready.

Is coding required for this voice-to-text outcome?

No. You’ll mostly connect accounts and edit a few simple rules. The only “code” piece is already included for splitting long text.

Is n8n free to use for this Telegram transcription automation workflow?

Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in OpenAI and Gemini API usage (most teams find it cheap for voice notes).

Where can I host n8n to run this automation?

Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.

Can I modify this Telegram transcription automation workflow for different use cases?

Yes, and you should. You can change the authorized user check in the “Validate Sender Access” step, swap transcription providers by adjusting the OpenAI and Gemini nodes, and tweak the “Split Text Chunks” logic if you want smaller replies or different formatting (like action items first).

Why is my Telegram connection failing in this workflow?

Usually it’s the bot permissions or an expired credential in n8n. Confirm your bot is in the group, can read messages, and can post replies, then re-save the Telegram credential. If it triggers but can’t download files, the file ID mapping may not match the message type (voice vs audio), so check the message detection and “Set Voice File ID / Set Audio File ID” steps.

What volume can this Telegram transcription automation workflow process?

On n8n Cloud, it depends on your plan’s monthly executions, and each transcription is typically one execution. If you self-host, there’s no execution cap, so the real limit becomes your server and API rate limits. Practically, most small teams can handle dozens of voice notes per day without thinking about it. If you’re transcribing long audio all day, you’ll want queueing and stronger monitoring.

Is this Telegram transcription automation better than using Zapier or Make?

Often, yes. n8n handles branching logic (access control, message type detection, fallback to Gemini, and chunking) without turning your automation into a spaghetti monster. You also get a self-hosting option, which is a big deal if you expect high volume or want predictable costs. Zapier and Make can still work if your needs are simple, but this workflow has enough “ifs” that the pricing and complexity can creep up. If you’re torn, talk to an automation expert and describe your volume plus your security needs.

Once this is running, voice notes stop being “lost context” and start becoming usable documentation. Set it up once, then let the workflow do the boring part.
