Gmail to Telegram, voice summaries for every email
Your inbox is loud, but most emails don’t deserve full attention. You still open them, scan them, mentally file them, and lose your place in whatever you were doing.
This problem hits founders first, honestly. Consultants and marketers feel it too, because email keeps interrupting deep work and client delivery. The goal of this Gmail-to-Telegram audio setup is simple: you listen to a 2–3 sentence voice summary in Telegram and decide fast.
You’ll see how the workflow turns each new Gmail message into a spoken recap (generated by AI, then converted to voice) and sends it to your Telegram chat automatically.
How This Automation Works
Here’s the complete workflow you’ll be setting up:
n8n Workflow Template: Gmail to Telegram, voice summaries for every email
[Workflow diagram with two flows]
- When Email Received Flow: When Email Received → Prepare Email Data → Generate Summary with AI → Convert Text to Speech → Get Telegram Chat ID → Download Audio File → Send Audio to Telegram
- When Telegram Message Received Flow: When Telegram Message Received → Save Chat ID to Database
Why This Matters: Email Steals Attention in Tiny, Expensive Chunks
Checking email isn’t one task. It’s dozens of micro-decisions that break your day into little pieces. You open a message “just to see,” then you’re reading a thread, searching for context, wondering if it’s urgent, and switching back to the work you were doing. Repeat that 20 times and you’ve basically donated an hour. Add mobile notifications, and it’s worse. The real cost isn’t time alone; it’s the momentum you lose, plus the mistakes that happen when you respond half-focused.
It adds up fast. Here’s where it breaks down in real life:
- You keep opening messages that only needed a quick “ignore / later / now” decision.
- Important threads get buried because everything looks equally urgent at a glance.
- Inbox scanning turns into a loop, especially on your phone between meetings.
- Manual triage is inconsistent, so you miss follow-ups or reply too late.
What You’ll Build: Gmail to Telegram Voice Summaries
This workflow watches your Gmail inbox for new emails, then immediately turns each message into something you can consume without looking at a screen. When a new email lands, n8n extracts the useful parts (sender, subject, date, and a snippet). That cleaned-up payload goes to an AI model (GPT-5 via the AI/ML API) which produces a short, natural summary designed to be spoken out loud. Next, the workflow sends that text to a text-to-speech model (Inworld TTS-1-Max) to generate a lifelike voice message. Finally, the audio is downloaded and delivered to you in Telegram via your bot, so you can listen and decide what to do next.
The workflow starts with a Gmail trigger. Then AI writes a 2–3 sentence spoken recap and converts it into audio. Telegram receives the voice message in the same chat every time, using a saved chat_id so you don’t have to hardcode anything.
Expected Results
Say you get about 30 emails a day that you feel compelled to check. Manually, even a quick scan is maybe 1 minute each once you unlock your phone, read the subject, open it, and back out, so you’re losing about 30 minutes daily. With this workflow, you spend close to zero time “checking,” because summaries come to Telegram automatically. If you listen to only the 10 that sound important and each voice note is under 30 seconds, you’re down to roughly 5 minutes, with far fewer context switches.
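If you want to plug in your own numbers, the estimate above can be reproduced with a few lines of JavaScript. This is only a back-of-the-envelope sketch; the variable names are made up for illustration, and the figures are the ones from the paragraph above.

```javascript
// Rough daily time estimate using the figures from the paragraph above.
const emailsPerDay = 30;          // emails you feel compelled to check
const manualSecondsEach = 60;     // about 1 minute per manual check
const listenedPerDay = 10;        // summaries you actually play
const voiceNoteSecondsEach = 30;  // upper bound for one voice note

// Convert "count x seconds each" into minutes per day.
function minutesPerDay(count, secondsEach) {
  return (count * secondsEach) / 60;
}

const manualMinutes = minutesPerDay(emailsPerDay, manualSecondsEach);
const listeningMinutes = minutesPerDay(listenedPerDay, voiceNoteSecondsEach);
const savedMinutes = manualMinutes - listeningMinutes;

console.log({ manualMinutes, listeningMinutes, savedMinutes });
// → { manualMinutes: 30, listeningMinutes: 5, savedMinutes: 25 }
```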
Before You Start
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- Gmail for inbox access via OAuth2
- Telegram to receive voice summaries via a bot
- AI/ML API key (get it from your AI/ML API dashboard)
Skill level: Beginner. You’ll connect accounts, paste an API key, and run one test message in Telegram to capture your chat_id.
Step by Step
A new email arrives in Gmail. The Gmail trigger watches your inbox and fires when a fresh message comes in, so you don’t have to forward anything or tag it.
The email is cleaned up for summarization. A formatting step builds a simple “emailData” block (sender, subject, date, snippet) so the AI model isn’t guessing what matters.
AI writes a short spoken recap, then turns it into audio. The workflow calls the AI/ML API to create a 2–3 sentence summary, then sends that text to the text-to-speech voice model to synthesize an audio file.
Telegram receives a voice note in the right chat. The workflow retrieves your saved Telegram chat_id from a Data Table, downloads the generated audio file, and delivers it via your Telegram bot.
You can easily modify the summarization prompt to match your tone, or add filters so only specific senders get voice notes. See the full implementation guide below for customization options.
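As one possible way to add such a filter, here is a minimal sketch of the decision logic in plain JavaScript. It is not part of the template: the rules object, its field names, and the shouldSendVoiceNote helper are all hypothetical, and it assumes the email item exposes From and Subject fields as used elsewhere in this guide. In n8n, logic like this could live in a Code or IF node placed between the Gmail trigger and the summary step.

```javascript
// Hypothetical filter rules; none of these names come from the template itself.
const rules = {
  allowedSenders: ["client@example.com", "@important-partner.com"],
  blockedSubjectWords: ["newsletter", "unsubscribe"],
  workHours: { start: 9, end: 18 }, // local hours, end is exclusive
};

// Returns true when an email should get a voice note.
// Assumes the email item carries From and Subject fields.
function shouldSendVoiceNote(email, rules, now = new Date()) {
  const from = String(email.From || "").toLowerCase();
  const subject = String(email.Subject || "").toLowerCase();

  const senderAllowed = rules.allowedSenders.some((s) =>
    from.includes(s.toLowerCase())
  );
  const subjectBlocked = rules.blockedSubjectWords.some((w) =>
    subject.includes(w.toLowerCase())
  );
  const hour = now.getHours();
  const inWorkHours =
    hour >= rules.workHours.start && hour < rules.workHours.end;

  return senderAllowed && !subjectBlocked && inWorkHours;
}

// Example: a client email at 10:00 local time passes the filter.
const example = shouldSendVoiceNote(
  { From: "Jane <client@example.com>", Subject: "Contract update" },
  rules,
  new Date(2024, 0, 15, 10, 0)
);
console.log(example);
```

Keeping the rules in one object makes it easy to loosen or tighten the filter without touching the rest of the workflow.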
Step-by-Step Implementation Guide
Step 1: Configure the Gmail Trigger
This workflow starts when a new email arrives in Gmail, then formats the email content for AI summarization.
- Add the Gmail Intake Trigger node as your trigger.
- Set Poll Times to everyMinute.
- Credential Required: Connect your gmailOAuth2 credentials.
- Connect Gmail Intake Trigger to Format Email Payload to match the execution flow.
Step 2: Connect Telegram
Telegram is used both to capture the chat ID and deliver the audio summary back to users.
- Add Telegram Message Trigger and set Updates to message.
- Credential Required: Connect your telegramApi credentials to Telegram Message Trigger.
- Connect Telegram Message Trigger to Store Chat ID Record to persist incoming chat IDs.
- In Deliver Audio to Telegram, set Operation to sendAudio and Chat ID to {{ $json.chat_id }}.
- Credential Required: Connect your telegramApi credentials to Deliver Audio to Telegram.
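To make the chat ID capture less abstract, here is a small sketch of what happens conceptually when your bot receives a message. It relies on the Telegram Bot API's Update shape (message.chat.id); everything else is an assumption for illustration: the in-memory Map stands in for the n8n Data Table, and the helper names and the updated_at column are hypothetical.

```javascript
// In-memory stand-in for the Data Table (assumption: one row per chat).
const chatTable = new Map();

// Pull the chat ID out of a Telegram Update object.
function extractChatId(update) {
  const chatId =
    update && update.message && update.message.chat && update.message.chat.id;
  if (chatId === undefined || chatId === null) {
    throw new Error("Update has no message.chat.id");
  }
  return chatId;
}

// Insert or overwrite the row for this chat (an "upsert").
// updated_at is a hypothetical column used to find the latest record.
function upsertChatId(table, chatId, seenAt = new Date().toISOString()) {
  const row = { chat_id: chatId, updated_at: seenAt };
  table.set(String(chatId), row);
  return row;
}

// Return the most recently stored chat_id, or null if the table is empty.
function latestChatId(table) {
  let latest = null;
  for (const row of table.values()) {
    if (latest === null || row.updated_at > latest.updated_at) {
      latest = row;
    }
  }
  return latest ? latest.chat_id : null;
}

// Example update, shaped like what Telegram sends for a text message.
const sampleUpdate = {
  update_id: 1,
  message: { message_id: 5, chat: { id: 123456789, type: "private" }, text: "hi" },
};
upsertChatId(chatTable, extractChatId(sampleUpdate), "2024-01-15T10:00:00.000Z");
console.log(latestChatId(chatTable));
```

Upserting rather than appending means that messaging the bot again refreshes your record instead of creating duplicates.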
Step 3: Set Up AI Summary Generation
This step transforms the formatted email into a concise, spoken-style summary.
- In Format Email Payload, keep the JavaScript Code as provided to build emailData from From, Subject, date, and snippet.
- Connect Format Email Payload to Draft AI Summary.
- In Draft AI Summary, set Model to openai/gpt-5-1-chat-latest.
- Set Prompt to:
  You are a voice assistant that creates brief, natural-sounding email summaries for audio playback. Create a concise summary (2-3 sentences max) in English. Make it sound natural when spoken aloud. Provide ONLY the summary text, nothing else. Received email: {{ $json.emailData }}
- Credential Required: Connect your aimlApi credentials to Draft AI Summary.
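The template ships with its own Code node, so you don't need to write anything here. If you're curious what a formatting step like this might look like, the sketch below shows one plausible version. It is an illustration, not the template's actual code: the function names, fallback strings, and the exact layout of emailData are assumptions. Only the field names (From, Subject, date, snippet) and the prompt text come from the steps above.

```javascript
// Build a compact text block from the fields the workflow uses.
// The fallback strings and line layout are assumptions for illustration.
function buildEmailData(email) {
  const from = email.From || "Unknown sender";
  const subject = email.Subject || "(no subject)";
  const date = email.date || "";
  // Collapse whitespace so the snippet reads as a single line.
  const snippet = String(email.snippet || "").replace(/\s+/g, " ").trim();

  return [
    `From: ${from}`,
    `Subject: ${subject}`,
    `Date: ${date}`,
    `Snippet: ${snippet}`,
  ].join("\n");
}

// Prompt text from Step 3; emailData is appended where the
// {{ $json.emailData }} expression sits in n8n.
const PROMPT_PREFIX =
  "You are a voice assistant that creates brief, natural-sounding email " +
  "summaries for audio playback. Create a concise summary (2-3 sentences max) " +
  "in English. Make it sound natural when spoken aloud. Provide ONLY the " +
  "summary text, nothing else. Received email: ";

function buildPrompt(emailData) {
  return PROMPT_PREFIX + emailData;
}

// Inside an n8n Code node, the equivalent would be roughly:
//   return $input.all().map((item) => ({
//     json: { emailData: buildEmailData(item.json) },
//   }));

const sampleEmailData = buildEmailData({
  From: "Jane <jane@example.com>",
  Subject: "Invoice",
  date: "2024-01-15",
  snippet: "  Please   see\nattached. ",
});
console.log(buildPrompt(sampleEmailData));
```

Passing the model a small labeled block like this, rather than the raw message, keeps the summary focused on who wrote, about what, and when.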
Step 4: Configure the Audio Synthesis and Delivery Pipeline
The AI summary is converted into speech, retrieved as a file, and sent to Telegram.
- Connect Draft AI Summary to Synthesize Speech Audio.
- In Synthesize Speech Audio, set URL to https://api.aimlapi.com/v1/tts and Method to POST.
- Enable Send Body and set Body Parameters: model to inworld/tts-1-max and text to {{ $json.content }}.
- Credential Required: Connect your aimlApi credentials to Synthesize Speech Audio.
- Connect Synthesize Speech Audio to Fetch Telegram Chat ID, then to Retrieve Audio File.
- In Retrieve Audio File, set URL to {{ $('Synthesize Speech Audio').item.json.audio.url }} and keep Response Format as file.
- In Deliver Audio to Telegram, set Binary Data to true and Binary Property Name to data.
- Set Additional Fields → Title to {{ $('Gmail Intake Trigger').item.json.From }} | {{ $('Gmail Intake Trigger').item.json.Subject }}.
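For reference, the request and response handling configured above can be sketched in plain JavaScript. The URL, method, model name, body fields, and the audio.url response field are taken from the steps above. The Bearer authorization header and all helper names are assumptions for illustration; in n8n, the credential handles authentication for you. The sketch only builds and inspects objects, so it makes no network calls.

```javascript
const TTS_URL = "https://api.aimlapi.com/v1/tts";

// Describe the HTTP request the Synthesize Speech Audio step sends.
// Assumption: the API key is passed as a Bearer token.
function buildTtsRequest(summaryText, apiKey) {
  return {
    url: TTS_URL,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: "inworld/tts-1-max", text: summaryText }),
  };
}

// Read audio.url from the TTS response, failing loudly if it is missing
// so the download step never runs against an empty URL.
function extractAudioUrl(ttsResponse) {
  const url = ttsResponse && ttsResponse.audio && ttsResponse.audio.url;
  if (typeof url !== "string" || url.length === 0) {
    throw new Error("TTS response has no audio.url");
  }
  return url;
}

// Mirror the Title expression: "<From> | <Subject>".
function buildAudioTitle(email) {
  return `${email.From} | ${email.Subject}`;
}

const request = buildTtsRequest("Jane sent an invoice due Friday.", "YOUR_API_KEY");
console.log(request.method, request.url);
console.log(buildAudioTitle({ From: "Jane <jane@example.com>", Subject: "Invoice" }));
```

Checking for audio.url before downloading gives you a clear error at the TTS step instead of a confusing failure in Telegram delivery.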
Note: Replace the [YOUR_ID] placeholder in both Fetch Telegram Chat ID and Store Chat ID Record, or the chat lookup will fail.
Step 5: Test and Activate Your Workflow
Verify the full flow from email intake to Telegram audio delivery before enabling production usage.
- With Telegram Message Trigger listening, send a test message to your bot and confirm Store Chat ID Record upserts a record.
- Fire Gmail Intake Trigger by sending an email to the connected inbox.
- Check that Format Email Payload outputs emailData and Draft AI Summary returns a short summary.
- Confirm Synthesize Speech Audio returns audio.url, Retrieve Audio File downloads the file, and Deliver Audio to Telegram sends the audio.
- When successful, toggle the workflow to Active so it runs continuously.
Troubleshooting Tips
- Gmail OAuth credentials can expire or lack the right scopes. If new emails aren’t triggering, check the Gmail credential status in n8n and re-authenticate first.
- If you’re using Wait behavior or external TTS rendering, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
- Telegram delivery usually fails because the chat_id was never stored. Run the Telegram Trigger once, send any message to your bot, then confirm the Data Table has your latest chat_id record.
Quick Answers
About 30 minutes if your Gmail and Telegram accounts are ready.
No. You’ll import the workflow, connect credentials, and test your Telegram bot once.
Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in AI/ML API usage for the LLM summary and text-to-speech (it’s typically a few cents per batch of emails, depending on your volume and model choice).
Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.
Yes, and you should. You can change the summary style in the “Draft AI Summary” prompt (for example: more direct, more detailed, different language), then switch the voice by updating the TTS request in “Synthesize Speech Audio.” Common tweaks include only summarizing starred emails, skipping newsletters, sending voice notes only during work hours, and logging each summary to Google Sheets for later search.
Most of the time it’s an invalid bot token or the workflow doesn’t have your current chat_id saved. Re-check the Telegram credentials in n8n, then run the Telegram Trigger and send your bot a message to capture chat_id again. If it still fails, make sure the bot is allowed to message you (start the chat) and that you’re sending to the same chat type you stored.
Plenty for normal inbox traffic. On n8n Cloud, your practical limit is your plan’s monthly executions, while self-hosting mainly depends on your server and API rate limits. If you’re processing hundreds of emails a day, add filters (sender rules, labels, or time windows) so you don’t generate audio for low-value messages.
Often, yes. This workflow combines multi-step AI calls (summary, then TTS), file download handling, and a reusable chat_id capture flow, which is where simpler tools start to feel cramped or expensive at volume. n8n also lets you self-host, so you’re not paying per tiny step forever. That said, if you just need a basic “send email text to Telegram” alert, Zapier or Make can be quicker to click together. If you want help picking, talk to an automation expert.
Once this is running, your inbox becomes something you can listen to, not something that constantly pulls you back to a screen. Set it up once, and let Telegram do the tapping on your shoulder.