Telegram to Slack, voice notes turned bilingual text

Voice notes are fast. The follow-up isn’t. Someone has to replay the audio, type it out, translate it, then paste it into Slack (and you still get questions).

This Telegram-to-Slack translation automation helps remote team leads most, but project managers and agency operators benefit too. For short clips, you get a clean Slack update with the transcript and its translation in under 10 seconds.

You’ll learn what the workflow does, what you need to run it, and how the pieces fit together so you can customize the language direction and formatting.

How This Automation Works

Here’s the complete workflow you’ll be setting up:

n8n Workflow Template: Telegram to Slack, voice notes turned bilingual text


Workflow diagram (nodes in execution order): Telegram Trigger → Is Voice? → Get File ID → Add Bot Token → Telegram getFile → Build File URL → Download Voice File → Prepare Whisper Input1 → Transcribe a recording → Extract Transcript → Detect Language → Translate (OpenAI) → Build Slack Message → Post to Slack

Why This Matters: Voice updates that don’t reach the right place

In a lot of teams, Telegram is where quick updates happen and Slack is where work gets tracked. So a simple voice message creates a weird gap: the people who need the info are in Slack, but the info is stuck in audio on Telegram. Then the pings start. “Can someone summarize that?” “What did they say?” “Was that in Japanese or English?” It’s not just time. It’s context loss, slower decisions, and the subtle frustration of repeating yourself.

It adds up fast. Here’s where it breaks down in day-to-day operations.

Someone has to stop what they’re doing, listen, and manually write a transcript (and they usually paraphrase).
Translation happens late or not at all, which means half the team is always one message behind.
Slack ends up with “FYI see Telegram,” so the searchable record of decisions is incomplete.
Misheard details create rework, especially with tasks, names, dates, and numbers spoken quickly.

What You’ll Build: Telegram voice-to-Slack transcript + translation

This workflow watches a Telegram chat for voice messages posted through your bot. When a voice note arrives, it checks that it’s actually audio, grabs the Telegram file ID, and uses the Telegram Bot API to fetch the audio file. Next, it sends that audio to OpenAI Whisper to create a transcript. With the transcript in hand, the workflow detects whether the source language is Japanese or English, then calls GPT-4o-mini to translate into the other language. Finally, it formats a Slack message with clear labeling (including flags), the original sender attribution, and both the transcript and translated text, then posts it straight into your chosen Slack channel.

The workflow starts in Telegram and ends in Slack. In the middle, Whisper turns audio into text, then GPT-4o-mini handles the bilingual translation based on detected language. The result is a Slack-friendly update your team can read, search, and act on.

What You’re Building

What Gets Automated

Detect Telegram voice messages and ignore non-audio posts.
Download the audio file automatically using the Telegram file lookup endpoint.
Transcribe audio with OpenAI Whisper, then isolate the clean transcript text.
Detect JA vs EN and translate using a GPT-4o-mini HTTP request before posting to Slack.

What You’ll Achieve

Turn a 5–10 minute manual relay into a Slack post in about 10 seconds for short clips.
Keep Japanese and English speakers aligned without asking someone to “translate please.”
Create a searchable Slack record, which makes later handoffs much easier.
Reduce miscommunication by sharing exact transcript text, not memory-based summaries.
Let people keep using voice notes naturally while Slack stays the system of record.

Expected Results

Say your team posts 10 Telegram voice updates per day. Handled manually, each one takes about 6 minutes to open, replay, transcribe, translate, and format for Slack, which comes to about 1 hour a day (10 × 6 minutes). With this workflow, your team spends roughly 1 minute in total, because the only manual step is recording the voice note as usual; the automation handles transcription, translation, and Slack posting in under 10 seconds for short clips. That’s close to an hour back on a normal day, plus fewer “what did that say?” interruptions.

Before You Start

n8n instance (try n8n Cloud free)
Self-hosting option if you prefer (Hostinger works well)
Telegram bot to capture voice notes
Slack app with a bot token to post transcripts and translations to a channel
OpenAI API key (get it from your OpenAI API dashboard)

Skill level: Intermediate. You’ll copy credentials into n8n and tweak a couple of fields, but you won’t be writing an app.

Want someone to build this for you? Talk to an automation expert (free 15-minute consultation).

Step by Step

A Telegram voice note comes in. The Telegram trigger listens for new messages, then an “is this a voice message?” check filters out everything else so your Slack channel doesn’t get noisy.

The workflow fetches the audio from Telegram. It extracts the file identifier from the message, looks up the real file path via the Telegram Bot API, builds a downloadable link, and pulls the audio as a binary file into n8n.

Whisper transcribes, then the text gets translated. A prep step formats the audio correctly for Whisper, Whisper returns a transcript, and the workflow detects whether the transcript is Japanese or English. From there, an OpenAI translation call generates the opposite language so both audiences get a readable version.

Slack receives a clean, labeled update. The final formatting step composes a Slack-ready payload with username attribution and clear flags (JA → EN or EN → JA), then sends it using your Slack bot token.
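
The article does not show the code inside the formatting step, so the following is only a minimal sketch of what such a step might look like. The field names source_lang, target_lang, original_text, and from_username come from the guide; translated_text, the "ja"/"en" language codes, the flag choices, and the three-line layout are assumptions made for illustration.

```javascript
// Hypothetical sketch of a "compose Slack message" step.
// Assumes language codes "ja" and "en"; the real template may differ.
const FLAGS = { ja: "🇯🇵", en: "🇺🇸" };

function composeSlackText({ from_username, source_lang, target_lang, original_text, translated_text }) {
  // Header line: who spoke and which direction the translation runs.
  const header = `Voice note from ${from_username} (${source_lang.toUpperCase()} → ${target_lang.toUpperCase()})`;
  return [
    header,
    `${FLAGS[source_lang] ?? ""} Original: ${original_text}`,
    `${FLAGS[target_lang] ?? ""} Translation: ${translated_text}`,
  ].join("\n");
}
```

The returned string would be what the Slack step sends as its text field. Keeping the layout in one function makes it easy to add a project tag or summary line later.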

You can easily modify the translation direction or message style based on your needs. See the full implementation guide below for customization options.

Step-by-Step Implementation Guide

Step 1: Configure the Telegram Trigger

Set up the workflow to listen for Telegram voice messages and gate the flow to voice-only events.

Add and open Telegram Voice Hook and set Updates to message.
Credential Required: Connect your telegramApi credentials in Telegram Voice Hook.
Open Voice Message Check and set the condition Value 1 to {{$json["message"]["voice"] !== undefined ? "voice" : "other"}} and Value 2 to voice.
Confirm the connection flow: Telegram Voice Hook → Voice Message Check → Extract File Identifier.
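
For readers who prefer to see the gate as plain code, here is a small sketch that mirrors the Voice Message Check condition. The function name classifyUpdate is made up for this example; the optional chaining is an addition so that updates without a message object fall through to "other" instead of throwing.

```javascript
// Mirrors the Voice Message Check expression: an update counts as "voice"
// only when message.voice is present. classifyUpdate is a hypothetical name.
function classifyUpdate(update) {
  return update?.message?.voice !== undefined ? "voice" : "other";
}

// Example inputs shaped like Telegram updates.
const voiceUpdate = { message: { voice: { file_id: "abc" }, chat: { id: 1 } } };
const textUpdate = { message: { text: "hello", chat: { id: 1 } } };
console.log(classifyUpdate(voiceUpdate), classifyUpdate(textUpdate));
```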

Step 2: Connect Telegram File Retrieval

Extract the file identifier, inject the bot token, and retrieve the voice file URL from Telegram.

In Extract File Identifier, enable Keep Only Set and set fields:
- file_id to {{$json["message"]["voice"]["file_id"]}}
- chat_id to {{$json["message"]["chat"]["id"]}}
- from_username to {{$json["message"]["from"]["username"] || $json["message"]["from"]["first_name"] || "unknown"}}
In Insert Bot Token, set bot_token to [CONFIGURE_YOUR_TOKEN] and enable Include Other Fields.
In Telegram File Lookup, set URL to https://api.telegram.org/bot{{$json["bot_token"]}}/getFile?file_id={{$json["file_id"]}}.
Keep Assemble File Link as-is to build the downloadable file URL and pass chat_id and from_username.
In Fetch Audio Binary, set URL to {{$json["file_url"]}} and Response Format to file.

⚠️ Common Pitfall: Replace [CONFIGURE_YOUR_TOKEN] in Insert Bot Token with your actual Telegram bot token, or the file lookup will fail.
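
The retrieval chain in this step can be sketched as three small functions. This is an illustration under stated assumptions, not the template's actual code: the article does not show what Assemble File Link contains, so buildFileUrl is a guess based on the Telegram Bot API download URL format (https://api.telegram.org/file/bot<token>/<file_path>). The function names and the sample token are hypothetical.

```javascript
// Pull the three fields that Extract File Identifier keeps.
function extractVoiceFields(update) {
  const { message } = update;
  return {
    file_id: message.voice.file_id,
    chat_id: message.chat.id,
    from_username: message.from.username || message.from.first_name || "unknown",
  };
}

// URL for the getFile lookup, matching the Telegram File Lookup node.
function buildGetFileUrl(botToken, fileId) {
  return `https://api.telegram.org/bot${botToken}/getFile?file_id=${encodeURIComponent(fileId)}`;
}

// Assumed behavior of Assemble File Link: turn the getFile response
// ({ ok, result: { file_path } }) into a downloadable URL.
function buildFileUrl(botToken, getFileResponse) {
  if (!getFileResponse.ok || !getFileResponse.result?.file_path) {
    throw new Error("getFile did not return a file_path");
  }
  return `https://api.telegram.org/file/bot${botToken}/${getFileResponse.result.file_path}`;
}
```

Failing loudly when file_path is missing makes a wrong token or an expired file_id visible at this step rather than later as an empty download.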

Step 3: Set Up Whisper Transcription

Prepare the audio binary and run the Whisper transcription to extract text.

In Prep Whisper Audio, keep the JavaScript that maps binary.data to binary.audio for Whisper input.
In Whisper Transcription, set Resource to audio, Operation to transcribe, and Binary Property Name to audio.
Credential Required: Connect your openAiApi credentials in Whisper Transcription.
In Isolate Transcript Text, enable Keep Only Set and set:
- transcript_text to {{$json.text}}
- from_username to {{$items("Assemble File Link")[0].json.from_username}}
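
The template's Prep Whisper Audio code is not reproduced in this article. As a rough sketch of the mapping it describes, the function below moves the downloaded file from binary.data to binary.audio on an n8n-style item ({ json, binary }). The name prepWhisperItem and the error message are illustrative assumptions.

```javascript
// Moves the downloaded file from binary.data to binary.audio so that a node
// configured with Binary Property Name = audio can find it.
// prepWhisperItem is a hypothetical helper, not the template's own code.
function prepWhisperItem(item) {
  if (!item.binary?.data) {
    throw new Error("No binary.data on item; did the download return a file?");
  }
  const { data, ...otherBinary } = item.binary;
  return { json: item.json, binary: { ...otherBinary, audio: data } };
}

// Example with an item shaped like the output of a file download.
const prepared = prepWhisperItem({
  json: { from_username: "aki" },
  binary: { data: { mimeType: "audio/ogg", fileName: "voice.oga" } },
});
console.log(Object.keys(prepared.binary));
```

In an n8n Code node, the same mapping would be applied to each incoming item before returning the list.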

Step 4: Configure Translation and Slack Output

Detect the source language, call OpenAI for translation, and send the formatted message to Slack.

In Identify Source Language, keep the function code that sets source_lang, target_lang, and original_text.
In OpenAI Translation Call, set:
- URL to https://api.openai.com/v1/chat/completions
- Request Method to POST
- Authentication to headerAuth
- JSON Parameters to true
- Body Parameters JSON to { "model": "gpt-4o-mini", "temperature": 0.2, "messages": [ { "role": "system", "content": "You are a translation engine. Translate exactly, preserving meaning and tone. Output ONLY the translated text." }, { "role": "user", "content": "Source language: {{$json.source_lang}}\nTarget language: {{$json.target_lang}}\nText:\n{{$json.original_text}}" } ] }
- Header Parameters JSON to { "Authorization": "Bearer {{$credentials.httpHeaderAuth.headerValue}}", "Content-Type": "application/json" }
Credential Required: Connect your httpHeaderAuth credentials in OpenAI Translation Call.
Keep Compose Slack Payload as-is to format the Slack message with flags, username, original text, and translation.
In Send Slack Post, set URL to https://slack.com/api/chat.postMessage, Request Method to POST, Authentication to headerAuth, and Body Parameters:
- channel to [YOUR_ID]
- text to {{$json.slack_text}}
Credential Required: Connect your httpHeaderAuth credentials in Send Slack Post.

Tip: Use a Slack channel ID (e.g., C0123456789) for [YOUR_ID] to avoid posting failures.
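
One thing worth knowing about the translation call: when the request body is written as a JSON string with the transcript templated into it, quotes or line breaks in the transcript can produce invalid JSON. The sketch below builds the same body as an object and lets JSON.stringify handle escaping. The function names are hypothetical, and extractTranslation assumes the standard chat completions response shape (choices[0].message.content).

```javascript
// Build the chat completions body from the fields set by Identify Source Language.
function buildTranslationRequest({ source_lang, target_lang, original_text }) {
  return {
    model: "gpt-4o-mini",
    temperature: 0.2,
    messages: [
      {
        role: "system",
        content: "You are a translation engine. Translate exactly, preserving meaning and tone. Output ONLY the translated text.",
      },
      {
        role: "user",
        content: `Source language: ${source_lang}\nTarget language: ${target_lang}\nText:\n${original_text}`,
      },
    ],
  };
}

// Read the translated text out of a chat completions response.
function extractTranslation(response) {
  const content = response?.choices?.[0]?.message?.content;
  if (typeof content !== "string") {
    throw new Error("Unexpected chat completions response shape");
  }
  return content.trim();
}

// JSON.stringify escapes quotes and newlines in the transcript safely.
const body = JSON.stringify(
  buildTranslationRequest({ source_lang: "en", target_lang: "ja", original_text: 'He said "ship it" today' })
);
console.log(body.length > 0);
```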

Step 5: Test and Activate Your Workflow

Validate the full voice-to-translation pipeline and then enable production execution.

Click Execute Workflow and send a Telegram voice message to your bot to trigger Telegram Voice Hook.
Confirm that Fetch Audio Binary returns a file and Whisper Transcription outputs text.
Verify that OpenAI Translation Call returns a translated response and Send Slack Post posts to the target channel.
When successful, toggle the workflow to Active for production use.


Troubleshooting Tips

Telegram credentials can fail if the bot token is wrong or the bot isn’t in the chat. Double-check the bot token and confirm the bot has access to the conversation you’re triggering from.
Longer audio can fail or time out. The Telegram Bot API only lets bots download files up to 20 MB, and OpenAI’s transcription endpoint rejects uploads over 25 MB, so keep voice notes well under 20 MB and raise the timeout on your HTTP Request nodes if the download or transcription call is flaky.
Slack posting errors are usually permissions. Check your Slack app scopes first (chat:write is the minimum for chat.postMessage; chat:write.public lets the bot post to public channels it has not joined), then confirm the bot has been invited to the target channel.
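
A detail that often hides Slack failures: the Slack Web API usually answers with HTTP 200 even when a post is rejected, and signals the problem through ok: false plus an error code in the response body. The sketch below shows one way to turn that body into a readable hint. The function name and the hint wording are illustrative; the error codes listed are common chat.postMessage errors.

```javascript
// Hypothetical helper: map a chat.postMessage response body to a hint.
const SLACK_HINTS = {
  not_in_channel: "Invite the bot to the channel (or add the chat:write.public scope).",
  channel_not_found: "Check the channel ID and that the bot can see the channel.",
  missing_scope: "Add the chat:write scope and reinstall the app.",
  invalid_auth: "Check the bot token stored in the header auth credential.",
};

function checkSlackResponse(body) {
  if (body.ok) return { ok: true };
  const hint = SLACK_HINTS[body.error] ?? "Look up the error code in the chat.postMessage documentation.";
  return { ok: false, error: body.error, hint };
}

console.log(checkSlackResponse({ ok: false, error: "not_in_channel" }).hint);
```

When testing in n8n, open the output of Send Slack Post and look at the ok and error fields rather than relying on the node showing green.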

Quick Answers

What’s the setup time for this Telegram Slack translation automation?

About 30 minutes once your Telegram, Slack, and OpenAI credentials are ready.

Is coding required for this voice note translation?

No. You’ll paste credentials and adjust a few fields like the Slack channel and translation direction.

Is n8n free to use for this Telegram Slack translation workflow?

Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in OpenAI API usage for Whisper transcription and GPT-4o-mini translation (usually pennies for short clips).

Where can I host n8n to run this automation?

Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.

Can I modify this Telegram Slack translation workflow for different use cases?

Yes, and it’s meant to be tweaked. You can change the language logic in the “Identify Source Language” function, which sets source_lang and target_lang, and adjust the prompt in the “OpenAI Translation Call” node that reads those fields. Many teams also adjust the “Compose Slack Payload” step to add bullet points, a short summary line, or a project tag at the top. If you want the same automation for other languages, you can extend the detection and routing instead of hardcoding JA/EN.
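
The detection code itself is not shown in this article, so here is one plausible way a JA/EN check could work, offered as a sketch rather than the template's method. It treats any hiragana, katakana, or CJK ideograph as a sign of Japanese and falls back to English otherwise. The "ja"/"en" codes are assumptions, and because the ideograph range is shared with Chinese, this heuristic is only suitable for a strictly two-language setup.

```javascript
// Hiragana (U+3040-309F), katakana (U+30A0-30FF), CJK ideographs (U+4E00-9FFF).
const JAPANESE_CHARS = /[\u3040-\u30ff\u4e00-\u9fff]/;

// Hypothetical stand-in for the Identify Source Language function.
function identifySourceLanguage(transcriptText) {
  const isJapanese = JAPANESE_CHARS.test(transcriptText);
  return {
    source_lang: isJapanese ? "ja" : "en",
    target_lang: isJapanese ? "en" : "ja",
    original_text: transcriptText,
  };
}

console.log(identifySourceLanguage("明日の会議は10時です").target_lang);
console.log(identifySourceLanguage("The meeting is at 10 tomorrow").target_lang);
```

To support more languages, the two-way switch would need to be replaced with a real detection step and a lookup of the desired target per source language.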

Why is my Telegram connection failing in this workflow?

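Most of the time it’s a bot token problem, or the bot doesn’t have access to the chat where the voice note is posted. Re-copy or regenerate your Telegram bot token (the bot_token value in Insert Bot Token and the telegramApi credential), then verify the bot is actually present in the group and allowed to read messages. In group chats, a bot with privacy mode enabled mostly receives only commands and replies to its own messages, so disable privacy mode in BotFather or make the bot an admin. If file downloads fail, check the Telegram File Lookup (getFile) request and confirm you’re passing the right file_id from the voice message payload.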

What volume can this Telegram Slack translation workflow process?

Plenty for most small teams: dozens to a few hundred voice notes a day is realistic, assuming short clips and normal API limits. On n8n Cloud, volume depends on your plan’s monthly executions; if you self-host, you’re mostly limited by your server and OpenAI request throughput. If you expect spikes (like standup time), add basic retry handling on the HTTP Request nodes so one slow transcription doesn’t cause a cascade of failures.
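
In n8n, the simplest form of retry handling is usually the Retry On Fail option in a node’s settings. If you want to see the idea in code, the sketch below shows a generic retry with exponential backoff. The helper name withRetry, the default of 3 retries, and the 500 ms base delay are arbitrary choices for illustration.

```javascript
// Generic retry with exponential backoff: waits baseDelayMs, then 2x, 4x, ...
// between attempts. The sleep function is injectable so it can be stubbed.
async function withRetry(
  fn,
  { retries = 3, baseDelayMs = 500, sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)) } = {}
) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      // Only wait if another attempt is coming.
      if (attempt < retries) await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}
```

A call such as withRetry(() => transcribe(audio)) would wrap a flaky request, where transcribe stands in for whatever function performs the call. Retrying only the slow step keeps one stalled transcription from failing the whole run.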

Is this Telegram Slack translation automation better than using Zapier or Make?

Often, yes, because voice transcription and bilingual handling usually need multiple steps and conditional logic. n8n makes it easier to pull a binary audio file from Telegram, pass it to Whisper, then route translation based on detected language without paying extra for every branch. You also get the option to self-host, which matters if you’re processing lots of messages. Zapier or Make can still work if your flow is tiny, but once you add file handling and AI calls, n8n tends to stay simpler (and cheaper). Talk to an automation expert if you want help picking the best approach.

Once this is running, voice notes stop being a dead end. Slack gets clean bilingual updates, and your team gets momentum back.
