Telegram to Slack, voice notes turned bilingual text
Voice notes are fast. The follow-up isn’t. Someone has to replay the audio, type it out, translate it, then paste it into Slack (and you still get questions).
The pain hits remote team leads first, but project managers and agency operators feel it too. This Telegram-to-Slack translation automation gives you a clean Slack update with the transcript and a bilingual translation in under 10 seconds for short clips.
You’ll learn what the workflow does, what you need to run it, and how the pieces fit together so you can customize the language direction and formatting.
How This Automation Works
Here’s the complete workflow you’ll be setting up:
n8n Workflow Template: Telegram to Slack, voice notes turned bilingual text
flowchart LR
subgraph sg0["Telegram Flow"]
direction LR
n0["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/telegram.svg' width='40' height='40' /></div><br/>Telegram Trigger"]
n1@{ icon: "mdi:swap-horizontal", form: "rounded", label: "Is Voice?", pos: "b", h: 48 }
n2@{ icon: "mdi:swap-vertical", form: "rounded", label: "Get File ID", pos: "b", h: 48 }
n3["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Telegram getFile"]
n4@{ icon: "mdi:code-braces", form: "rounded", label: "Build File URL", pos: "b", h: 48 }
n5["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Download Voice File"]
n6@{ icon: "mdi:swap-vertical", form: "rounded", label: "Extract Transcript", pos: "b", h: 48 }
n7@{ icon: "mdi:code-braces", form: "rounded", label: "Detect Language", pos: "b", h: 48 }
n8["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Translate (OpenAI)"]
n9@{ icon: "mdi:code-braces", form: "rounded", label: "Build Slack Message", pos: "b", h: 48 }
n10["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Post to Slack"]
n11@{ icon: "mdi:swap-vertical", form: "rounded", label: "Add Bot Token", pos: "b", h: 48 }
n12["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/code.svg' width='40' height='40' /></div><br/>Prepare Whisper Input1"]
n13@{ icon: "mdi:robot", form: "rounded", label: "Transcribe a recording", pos: "b", h: 48 }
n1 --> n2
n2 --> n11
n11 --> n3
n4 --> n5
n7 --> n8
n0 --> n1
n3 --> n4
n6 --> n7
n8 --> n9
n9 --> n10
n5 --> n12
n12 --> n13
n13 --> n6
end
%% Styling
classDef trigger fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
classDef ai fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef aiModel fill:#e8eaf6,stroke:#3f51b5,stroke-width:2px
classDef decision fill:#fff8e1,stroke:#f9a825,stroke-width:2px
classDef database fill:#fce4ec,stroke:#c2185b,stroke-width:2px
classDef api fill:#fff3e0,stroke:#e65100,stroke-width:2px
classDef code fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef disabled stroke-dasharray: 5 5,opacity: 0.5
class n0 trigger
class n13 ai
class n1 decision
class n3,n5,n8,n10 api
class n4,n7,n9,n12 code
classDef customIcon fill:none,stroke:none
class n0,n3,n5,n8,n10,n12 customIcon
Why This Matters: Voice updates that don’t reach the right place
In a lot of teams, Telegram is where quick updates happen and Slack is where work gets tracked. So a simple voice message creates a weird gap: the people who need the info are in Slack, but the info is stuck in audio on Telegram. Then the pings start. “Can someone summarize that?” “What did they say?” “Was that in Japanese or English?” It’s not just time. It’s context loss, slower decisions, and the subtle frustration of repeating yourself.
It adds up fast. Here’s where it breaks down in day-to-day operations.
- Someone has to stop what they’re doing, listen, and manually write a transcript (and they usually paraphrase).
- Translation happens late or not at all, which means half the team is always one message behind.
- Slack ends up with “FYI see Telegram,” so the searchable record of decisions is incomplete.
- Misheard details create rework, especially with tasks, names, dates, and numbers spoken quickly.
What You’ll Build: Telegram voice-to-Slack transcript + translation
This workflow watches a Telegram chat for voice messages posted through your bot. When a voice note arrives, it checks that it’s actually audio, grabs the Telegram file ID, and uses the Telegram Bot API to fetch the audio file. Next, it sends that audio to OpenAI Whisper to create a transcript. With the transcript in hand, the workflow detects whether the source language is Japanese or English, then calls GPT-4o-mini to translate into the other language. Finally, it formats a Slack message with clear labeling (including flags), the original sender attribution, and both the transcript and translated text, then posts it straight into your chosen Slack channel.
The workflow starts in Telegram and ends in Slack. In the middle, Whisper turns audio into text, then GPT-4o-mini handles the bilingual translation based on detected language. The result is a Slack-friendly update your team can read, search, and act on.
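The article doesn't show the code inside the language-detection step, so here is a minimal sketch of one way it could work. It assumes that any Hiragana, Katakana, or CJK ideograph in the transcript means Japanese and that everything else is English. The function name `detectLanguage` and the label values `"Japanese"` and `"English"` are assumptions; the output fields `source_lang`, `target_lang`, and `original_text` mirror the ones the workflow sets.

```javascript
// Hypothetical helper (not the template's actual code).
// Assumption: any kana or CJK ideograph means the transcript is Japanese.
const JAPANESE_CHARS = /[\u3040-\u309F\u30A0-\u30FF\u4E00-\u9FFF]/;

function detectLanguage(transcript) {
  const text = String(transcript ?? "").trim();
  const isJapanese = JAPANESE_CHARS.test(text);
  return {
    source_lang: isJapanese ? "Japanese" : "English",
    target_lang: isJapanese ? "English" : "Japanese",
    original_text: text,
  };
}

console.log(detectLanguage("明日の会議は10時からです"));
console.log(detectLanguage("The meeting starts at 10 tomorrow"));
```

A character-range check like this is cheap, but it would also classify Chinese text as Japanese, so treat it as a starting point for a two-language team rather than general language detection.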
Expected Results
Say your team posts 10 Telegram voice updates per day. Manually, it’s usually about 6 minutes to open the message, replay it, type a decent transcript, translate it, and format something readable for Slack, which is about 1 hour daily. With this workflow, you spend basically 1 minute total (someone just records the voice note like usual) and the automation handles transcription, translation, and Slack posting in under 10 seconds for short clips. That’s close to an hour back on a normal day, plus fewer “what did that say?” interruptions.
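If you want to plug in your own numbers, the estimate above is simple arithmetic. This sketch uses the figures from the paragraph as defaults; the function name `estimateDailySavings` is made up for illustration.

```javascript
// Rough time-savings estimate using the article's example figures.
// All inputs are estimates; swap in your own team's numbers.
function estimateDailySavings(notesPerDay = 10, manualMinutesPerNote = 6, automatedMinutesPerDay = 1) {
  const manualMinutesPerDay = notesPerDay * manualMinutesPerNote;
  const savedMinutesPerDay = manualMinutesPerDay - automatedMinutesPerDay;
  return { manualMinutesPerDay, savedMinutesPerDay };
}

const estimate = estimateDailySavings();
console.log(`Manual: ${estimate.manualMinutesPerDay} min/day, saved: ${estimate.savedMinutesPerDay} min/day`);
```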
Before You Start
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- Telegram for capturing voice notes via bot.
- Slack to post transcripts and translations to a channel.
- OpenAI API key (get it from your OpenAI API dashboard)
Skill level: Intermediate. You’ll copy credentials into n8n and tweak a couple of fields, but you won’t be writing an app.
Step by Step
A Telegram voice note comes in. The Telegram trigger listens for new messages, then an “is this a voice message?” check filters out everything else so your Slack channel doesn’t get noisy.
The workflow fetches the audio from Telegram. It extracts the file identifier from the message, looks up the real file path via the Telegram Bot API, builds a downloadable link, and pulls the audio as a binary file into n8n.
Whisper transcribes, then the text gets translated. A prep step formats the audio correctly for Whisper, Whisper returns a transcript, and the workflow detects whether the transcript is Japanese or English. From there, an OpenAI translation call generates the opposite language so both audiences get a readable version.
Slack receives a clean, labeled update. The final formatting step composes a Slack-ready payload with username attribution and clear flags (JA → EN or EN → JA), then sends it using your Slack bot token.
You can easily modify the translation direction or message style based on your needs. See the full implementation guide below for customization options.
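To give a feel for what "message style" means here, the following is a sketch of how a formatting step might compose the Slack text. The template's real code is not shown in this article, so the layout, the emoji shortcodes, and the function name `composeSlackText` are assumptions. Only the output field `slack_text` and the input field names come from the workflow.

```javascript
// Hypothetical composer for the Slack message body.
// Assumption: language labels are "Japanese" / "English"; adjust to your detection step.
const FLAGS = { Japanese: ":jp:", English: ":us:" };

function composeSlackText({ from_username, source_lang, target_lang, original_text, translated_text }) {
  // Fall back to the plain language name if no flag is defined for it.
  const direction = `${FLAGS[source_lang] ?? source_lang} → ${FLAGS[target_lang] ?? target_lang}`;
  return {
    slack_text: [
      `*Voice note from ${from_username}* (${direction})`,
      `*Original:* ${original_text}`,
      `*Translation:* ${translated_text}`,
    ].join("\n"),
  };
}

console.log(
  composeSlackText({
    from_username: "aki",
    source_lang: "Japanese",
    target_lang: "English",
    original_text: "おはようございます",
    translated_text: "Good morning",
  }).slack_text
);
```

Adding a project tag or a summary line would just be another entry in the array before the `join`.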
Step-by-Step Implementation Guide
Step 1: Configure the Telegram Trigger
Set up the workflow to listen for Telegram voice messages and gate the flow to voice-only events.
- Add and open Telegram Voice Hook and set Updates to `message`.
- Credential Required: Connect your telegramApi credentials in Telegram Voice Hook.
- Open Voice Message Check and set the condition Value 1 to `{{$json["message"]["voice"] !== undefined ? "voice" : "other"}}` and Value 2 to `voice`.
- Confirm the connection flow: Telegram Voice Hook → Voice Message Check → Extract File Identifier.
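The voice check above boils down to one condition. Written as plain JavaScript it looks roughly like this; the function name `classifyUpdate` and the sample payloads are illustrative, and the optional chaining is an extra safeguard that the n8n expression itself does not have.

```javascript
// Same logic as the n8n expression: "voice" only when message.voice exists.
// Optional chaining avoids a crash if an update has no message at all.
function classifyUpdate(update) {
  return update?.message?.voice !== undefined ? "voice" : "other";
}

// Illustrative payloads, trimmed to the fields that matter here.
const voiceUpdate = {
  message: { chat: { id: 1 }, voice: { file_id: "abc123", duration: 4 } },
};
const textUpdate = { message: { chat: { id: 1 }, text: "hello" } };

console.log(classifyUpdate(voiceUpdate)); // "voice"
console.log(classifyUpdate(textUpdate)); // "other"
```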
Step 2: Connect Telegram File Retrieval
Extract the file identifier, inject the bot token, and retrieve the voice file URL from Telegram.
- In Extract File Identifier, enable Keep Only Set and set fields:
  - file_id to `{{$json["message"]["voice"]["file_id"]}}`
  - chat_id to `{{$json["message"]["chat"]["id"]}}`
  - from_username to `{{$json["message"]["from"]["username"] || $json["message"]["from"]["first_name"] || "unknown"}}`
- In Insert Bot Token, set bot_token to `[CONFIGURE_YOUR_TOKEN]` and enable Include Other Fields.
- In Telegram File Lookup, set URL to `=https://api.telegram.org/bot{{$json["bot_token"]}}/getFile?file_id={{$json["file_id"]}}`.
- Keep Assemble File Link as-is to build the downloadable file URL and pass chat_id and from_username.
- In Fetch Audio Binary, set URL to `{{$json["file_url"]}}` and Response Format to `file`.
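The code inside Assemble File Link isn't reproduced in this guide, so the following is a sketch of the two URLs involved rather than the node's actual contents. It relies on the Telegram Bot API convention that `getFile` returns a `file_path`, which is then downloaded from `https://api.telegram.org/file/bot<token>/<file_path>`. The function names and the sample token are hypothetical.

```javascript
const TELEGRAM_API = "https://api.telegram.org";

// Step 1: URL for the getFile lookup (matches the Telegram File Lookup node).
function buildGetFileUrl(botToken, fileId) {
  return `${TELEGRAM_API}/bot${botToken}/getFile?file_id=${encodeURIComponent(fileId)}`;
}

// Step 2: turn the getFile response into a downloadable link.
function buildFileUrl(botToken, getFileResponse) {
  const filePath = getFileResponse?.result?.file_path;
  if (!getFileResponse?.ok || !filePath) {
    throw new Error("Telegram getFile did not return a file_path");
  }
  return `${TELEGRAM_API}/file/bot${botToken}/${filePath}`;
}

// Illustrative values only; never hardcode a real token in shared code.
const sampleToken = "123456:EXAMPLE";
const sampleResponse = { ok: true, result: { file_id: "abc123", file_path: "voice/file_0.oga" } };

console.log(buildGetFileUrl(sampleToken, "abc123"));
console.log(buildFileUrl(sampleToken, sampleResponse));
```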
Replace `[CONFIGURE_YOUR_TOKEN]` in Insert Bot Token with your actual Telegram bot token, or the file lookup will fail.
Step 3: Set Up Whisper Transcription
Prepare the audio binary and run the Whisper transcription to extract text.
- In Prep Whisper Audio, keep the JavaScript that maps `binary.data` to `binary.audio` for Whisper input.
- In Whisper Transcription, set Resource to `audio`, Operation to `transcribe`, and Binary Property Name to `audio`.
- Credential Required: Connect your openAiApi credentials in Whisper Transcription.
- In Isolate Transcript Text, enable Keep Only Set and set:
  - transcript_text to `{{$json.text}}`
  - from_username to `{{$items("Assemble File Link")[0].json.from_username}}`
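For reference, the `binary.data` to `binary.audio` mapping in the prep step can be sketched like this. It is an approximation of what such a Code node might contain, not the template's exact code; the function name `prepWhisperInput` and the sample item are hypothetical.

```javascript
// Re-expose the downloaded file under binary.audio, the property name
// the transcription node is configured to read.
function prepWhisperInput(items) {
  return items.map((item) => {
    if (!item.binary?.data) {
      throw new Error("No binary.data found; did the download step return a file?");
    }
    return { json: item.json, binary: { audio: item.binary.data } };
  });
}

// Illustrative item shaped like n8n's { json, binary } structure.
const downloaded = [
  {
    json: { from_username: "aki" },
    binary: { data: { mimeType: "audio/ogg", fileName: "file_0.oga" } },
  },
];

console.log(prepWhisperInput(downloaded)[0].binary.audio.fileName);
```

Failing loudly when `binary.data` is missing makes a broken download show up in this step instead of as a vague transcription error later.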
Step 4: Configure Translation and Slack Output
Detect the source language, call OpenAI for translation, and send the formatted message to Slack.
- In Identify Source Language, keep the function code that sets source_lang, target_lang, and original_text.
- In OpenAI Translation Call, set:
  - URL to `https://api.openai.com/v1/chat/completions`
  - Request Method to `POST`
  - Authentication to `headerAuth`
  - JSON Parameters to `true`
  - Body Parameters JSON to `{ "model": "gpt-4o-mini", "temperature": 0.2, "messages": [ { "role": "system", "content": "You are a translation engine. Translate exactly, preserving meaning and tone. Output ONLY the translated text." }, { "role": "user", "content": "Source language: {{$json.source_lang}}\nTarget language: {{$json.target_lang}}\nText:\n{{$json.original_text}}" } ] }`
  - Header Parameters JSON to `{ "Authorization": "Bearer {{$credentials.httpHeaderAuth.headerValue}}", "Content-Type": "application/json" }`
- Credential Required: Connect your httpHeaderAuth credentials in OpenAI Translation Call.
- Keep Compose Slack Payload as-is to format the Slack message with flags, username, original text, and translation.
- In Send Slack Post, set URL to `https://slack.com/api/chat.postMessage`, Request Method to `POST`, Authentication to `headerAuth`, and Body Parameters:
  - channel to `[YOUR_ID]`
  - text to `{{$json.slack_text}}`
- Credential Required: Connect your httpHeaderAuth credentials in Send Slack Post.
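If you ever move the translation call into a Code node, it may help to see the request body as a plain object. The sketch below rebuilds the same payload and reads the translated text from the Chat Completions response shape (`choices[0].message.content`). The function names are hypothetical. One design note: serializing an object with `JSON.stringify` escapes quotes and line breaks in the transcript, which templating raw text into a JSON string may not do.

```javascript
// Build the same request body the OpenAI Translation Call node sends.
function buildTranslationRequest({ source_lang, target_lang, original_text }) {
  return {
    model: "gpt-4o-mini",
    temperature: 0.2,
    messages: [
      {
        role: "system",
        content:
          "You are a translation engine. Translate exactly, preserving meaning and tone. Output ONLY the translated text.",
      },
      {
        role: "user",
        content: `Source language: ${source_lang}\nTarget language: ${target_lang}\nText:\n${original_text}`,
      },
    ],
  };
}

// Pull the translated text out of a Chat Completions response.
function extractTranslation(response) {
  const text = response?.choices?.[0]?.message?.content;
  if (typeof text !== "string") {
    throw new Error("No translation found in the OpenAI response");
  }
  return text.trim();
}

const body = buildTranslationRequest({
  source_lang: "English",
  target_lang: "Japanese",
  original_text: 'He said "ship it"\nby Friday',
});
console.log(JSON.stringify(body)); // safe to send even with quotes and newlines
```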
Use your real Slack channel ID (for example, C0123456789) for `[YOUR_ID]` to avoid posting failures.
Step 5: Test and Activate Your Workflow
Validate the full voice-to-translation pipeline and then enable production execution.
- Click Execute Workflow and send a Telegram voice message to your bot to trigger Telegram Voice Hook.
- Confirm that Fetch Audio Binary returns a file and Whisper Transcription outputs text.
- Verify that OpenAI Translation Call returns a translated response and Send Slack Post posts to the target channel.
- When successful, toggle the workflow to Active for production use.
Troubleshooting Tips
- Telegram credentials can fail if the bot token is wrong or the bot isn’t in the chat. Double-check the bot token and confirm the bot has access to the conversation you’re triggering from.
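- If you’re sending longer audio, the download or the Whisper call can fail or time out. Telegram’s Bot API only lets bots download files up to 20 MB, and OpenAI’s transcription endpoint accepts files up to 25 MB, so keep clips well under those limits and increase the timeout on your HTTP Request nodes if the download or transcription call is flaky.
- Slack posting errors are usually permissions. Check your Slack app scopes first (chat:write is the minimum for chat.postMessage; you only need others such as files:write or channels:history if you extend the workflow), then confirm the bot has been invited to the target channel.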
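One thing that trips people up: Slack's Web API typically answers with HTTP 200 and an `ok: false` body when a post is rejected, so the HTTP Request node can look green while nothing reaches the channel. Here is a sketch of a check you could run on the response. The function name and the hint wording are made up for illustration; the error codes are ones Slack documents for `chat.postMessage`.

```javascript
// Map common chat.postMessage error codes to a next step.
const SLACK_HINTS = {
  not_in_channel: "Invite the bot to the channel, or add the chat:write.public scope for public channels.",
  channel_not_found: "Check the channel ID and confirm the bot can see that channel.",
  missing_scope: "Add the chat:write scope and reinstall the app.",
  invalid_auth: "Re-copy the bot token into the credential.",
};

function checkSlackResponse(body) {
  if (body?.ok) {
    return { ok: true, ts: body.ts };
  }
  const error = body?.error ?? "unknown_error";
  return {
    ok: false,
    error,
    hint: SLACK_HINTS[error] ?? "Look up this code in the Slack API error reference.",
  };
}

console.log(checkSlackResponse({ ok: false, error: "not_in_channel" }));
```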
Quick Answers
How long does it take to set up?
About 30 minutes once your Telegram, Slack, and OpenAI credentials are ready.
Do I need coding skills?
No. You’ll paste credentials and adjust a few fields like the Slack channel and translation direction.
Is n8n free to use?
Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in OpenAI API usage for Whisper transcription and GPT-4o-mini translation (usually pennies for short clips).
Where can I host n8n?
Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.
Can I customize this workflow?
Yes, and it’s meant to be tweaked. You can change the language logic in the “Identify Source Language” function and swap the target language in the “OpenAI Translation Call” node. Many teams also adjust the “Compose Slack Payload” step to add bullet points, a short summary line, or a project tag at the top. If you want the same automation for other languages, you can extend the detection and routing instead of hardcoding JA/EN.
Why is my Telegram connection failing?
Most of the time it’s a bot token problem or the bot doesn’t have access to the chat where the voice note is posted. Regenerate or re-copy your Telegram bot token, then verify the bot is actually present in the group and allowed to read messages. If file downloads fail, check the Telegram “getFile” HTTP request and confirm you’re passing the right file_id from the voice message payload.
How many voice notes can this handle?
Plenty for most small teams: dozens to a few hundred voice notes a day is realistic, assuming short clips and normal API limits. On n8n Cloud, volume depends on your plan’s monthly executions; if you self-host, you’re mostly limited by your server and OpenAI request throughput. If you expect spikes (like standup time), add basic retry handling on the HTTP Request nodes so one slow transcription doesn’t cause a cascade of failures.
Is n8n a better fit than Zapier or Make for this?
Often, yes, because voice transcription and bilingual handling usually need multiple steps and conditional logic. n8n makes it easier to pull a binary audio file from Telegram, pass it to Whisper, then route translation based on detected language without paying extra for every branch. You also get the option to self-host, which matters if you’re processing lots of messages. Zapier or Make can still work if your flow is tiny, but once you add file handling and AI calls, n8n tends to stay simpler (and cheaper).
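On the retry handling mentioned for traffic spikes: n8n nodes have a built-in Retry On Fail setting, which is usually the first thing to try. If you end up calling an API from a Code node instead, the same idea can be sketched as a small helper with exponential backoff. The name `withRetry` and its default values are assumptions, not part of the template.

```javascript
// Retry an async task with exponential backoff: 500 ms, 1 s, 2 s, ...
async function withRetry(task, { retries = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      if (attempt === retries) break; // out of attempts, give up
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}

// Usage sketch: wrap a flaky call (simulated here) so one slow
// transcription does not fail the whole run.
let calls = 0;
const demo = withRetry(
  async () => {
    calls += 1;
    if (calls < 3) throw new Error("temporary failure");
    return "transcribed";
  },
  { retries: 3, baseDelayMs: 10 }
);
demo.then((result) => console.log(result, "after", calls, "attempts"));
```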
Once this is running, voice notes stop being a dead end. Slack gets clean bilingual updates, and your team gets momentum back.