Telegram to Google Sheets, voice notes become searchable
You record a quick voice note, then it vanishes into your chat history. Later, you remember “that one idea” and spend 10 minutes scrolling, re-listening, and still can’t find it.
This problem hits content creators first, honestly. But journalists and busy ops folks feel it too, because “I’ll transcribe it later” turns into a backlog fast.
In this guide, you’ll see how a Telegram bot transcribes voice messages, saves the original audio to Google Drive, and logs everything into Google Sheets so you can search, filter, and share in seconds.
How This Automation Works
Here’s the complete workflow you’ll be setting up:
n8n Workflow Template: Telegram to Google Sheets, voice notes become searchable
Workflow diagram (Telegram Voice Message Flow), described as text:
- Telegram Voice Message Trigger → Is audio message?
- Is audio message? → Download audio message (voice notes) or Un-supported message type (everything else)
- Download audio message → Transcribe a recording and Upload file, in parallel
- Transcribe a recording and Upload file → Merge → Transform the output of voic..
- Transform the output of voic.. → Log voice record to google s.. and Inform user via Telegram
Why This Matters: Voice Notes That You Can’t Find Again
Voice notes are great in the moment. They’re fast, they’re effortless, and they capture thoughts while you’re walking, driving, or bouncing between meetings. The problem shows up later. Telegram isn’t built to help you search inside audio, so the “library” you think you’re creating turns into a pile of recordings with no index. If you’re collecting ideas, interview clips, customer quotes, or meeting takeaways, that pile quietly becomes unusable. You stop trusting your own system, and you start redoing work you already did.
It adds up fast. Here’s where it breaks down in real life.
- Finding one specific idea often means replaying a bunch of voice notes, which is frustrating and slow.
- Manual transcription gets postponed, then forgotten, and the best details disappear behind “I’ll do it later.”
- Sharing an audio snippet with a teammate is awkward when there’s no clean link, no context, and no text.
- Without a structured log (date, duration, transcript, URL), you can’t sort, filter, or build a reliable content pipeline.
What You’ll Build: Telegram Voice Notes Logged to Sheets Automatically
This workflow turns Telegram into your “capture inbox” and Google Sheets into your searchable archive. It starts when someone sends a voice message to your Telegram bot. n8n checks the message type so you only process real voice notes, then downloads the audio file from Telegram. Next, the audio gets transcribed using OpenAI Whisper, while the original file is uploaded to Google Drive for safekeeping. After that, the transcript and the Drive file details are merged into one clean record, formatted into columns, and appended to a Google Sheet. Finally, the bot replies in Telegram with a confirmation message that includes the transcript and a download link to the saved audio.
The flow is simple to understand, which is why it’s so useful. Telegram triggers it, OpenAI handles the transcription, Google Drive stores the original, and Google Sheets becomes the place you actually search and organize your notes.
Expected Results
Say you capture 10 voice notes a week, and each is 30 seconds to 2 minutes. Manually, you’d download the audio (a few minutes), upload it somewhere (a few more), then type out a rough transcript (often 10 minutes per note). That adds up to 2 hours or more a week just to make your own notes usable. With this workflow, you send the voice note like normal and wait for the reply; the log entry and Drive backup happen in the background, usually within a couple of minutes.
Before You Start
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- Telegram bot (created with BotFather) to receive voice messages.
- Google Drive to store the original audio files.
- Google Sheets to store a searchable transcript log.
- OpenAI API key (get it from the OpenAI dashboard)
Skill level: Beginner. You’ll connect accounts, paste an API key, and choose a target Sheet and Drive folder.
Want someone to build this for you? Talk to an automation expert (free 15-minute consultation).
Step by Step
A Telegram bot receives a new message. The workflow starts with a Telegram Trigger that listens for new messages sent to your bot, so you don’t need to open n8n to “run” anything.
The workflow checks if it’s actually a voice note. A simple conditional step validates the message type. If someone sends text or an image, the bot replies politely and doesn’t waste your transcription credits.
The audio is downloaded, transcribed, and backed up. n8n fetches the .oga file from Telegram, sends it to OpenAI Whisper for transcription, and uploads the original audio to Google Drive so you always have the source file.
Everything gets merged into one clean log entry. The transcript, date/time, duration, and the Google Drive download URL are mapped into a structured row and appended to Google Sheets. A confirmation message is then sent back in Telegram with the transcript and link.
You can easily modify the logging format to include tags, speaker names, or a short summary based on your needs. See the full implementation guide below for customization options.
Step-by-Step Implementation Guide
Step 1: Configure the Telegram Trigger
Start by setting up the Telegram listener so the workflow can capture incoming voice messages.
- Add and open Telegram Voice Trigger.
- Set Updates to `message`.
- Under Additional Fields, enable Download and set it to `true`.
- Credential Required: Connect your telegramApi credentials.
Step 2: Validate Input and Handle Non-Voice Messages
Ensure only voice messages are processed and provide a helpful reply for unsupported inputs.
- Open Validate Voice Type and confirm the condition uses String → Contains with Left Value set to `={{ $json.message.toJsonString() }}` and Right Value set to `audio/ogg`.
- Connect the false output of Validate Voice Type to Unsupported Input Reply.
- In Unsupported Input Reply, set Text to `=Sorry, I can’t read your input right now. Please send me a voice message, and I’ll help you transcribe and track it! 🎙️💬`.
- Set Chat ID to `={{ $('Telegram Voice Trigger').item.json.message.chat.id }}`.
- Credential Required: Connect your telegramApi credentials for Unsupported Input Reply.
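The check in Validate Voice Type can be sketched in plain JavaScript. This is an illustration under stated assumptions, not the node's internal code: `looksLikeVoice` mirrors the string-contains test described above, while `isVoiceMessage` is a hypothetical stricter variant that reads the `voice.mime_type` field Telegram attaches to voice messages. The sample payloads are made up.

```javascript
// Mirrors the workflow's condition: serialize the message and look for "audio/ogg".
function looksLikeVoice(message) {
  return JSON.stringify(message).includes("audio/ogg");
}

// Hypothetical stricter variant: accept only a real voice payload.
function isVoiceMessage(message) {
  return Boolean(
    message && message.voice && message.voice.mime_type === "audio/ogg"
  );
}

// Sample payloads shaped like Telegram's Message object (values are invented).
const voiceNote = {
  chat: { id: 111 },
  voice: { file_id: "abc", duration: 12, mime_type: "audio/ogg" },
};
const textNote = { chat: { id: 111 }, text: "what does audio/ogg mean?" };

console.log(looksLikeVoice(voiceNote), isVoiceMessage(voiceNote)); // true true
console.log(looksLikeVoice(textNote), isVoiceMessage(textNote)); // true false
```

The second line of output shows the trade-off: a text message that merely mentions "audio/ogg" passes the contains check but has no `voice.file_id` to download. If that edge case matters to you, a condition on the `voice` field itself may be the safer choice.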
Step 3: Fetch the Voice File and Process in Parallel
Download the voice file, then transcribe and store it simultaneously.
- Configure Fetch Voice File with Resource set to `file`.
- Set File ID to `={{ $json.message.voice.file_id }}`.
- Credential Required: Connect your telegramApi credentials for Fetch Voice File.
- Fetch Voice File outputs to both Transcribe Audio Clip and Store Audio in Drive in parallel.
- In Transcribe Audio Clip, set Resource to `audio` and Operation to `transcribe`.
- Credential Required: Connect your openAiApi credentials for Transcribe Audio Clip.
- In Store Audio in Drive, set Name to `=audio-{{ $now.toFormat("yyyyLLdd-HHmmss") }}-{{$binary.data.fileName}}`.
- Set Drive to `My Drive` and Folder to your destination folder ID.
- Credential Required: Connect your googleDriveOAuth2Api credentials for Store Audio in Drive.
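To see what the Name expression produces, here is a minimal sketch that rebuilds the same pattern with a plain `Date`. The format string `yyyyLLdd-HHmmss` means four-digit year, two-digit month, two-digit day, then 24-hour time. `buildAudioName` and the sample file name are hypothetical, and the sketch uses UTC so the result is reproducible, whereas `$now` in n8n follows the workflow's timezone.

```javascript
// Pad a number to two digits, e.g. 7 -> "07".
const pad2 = (n) => String(n).padStart(2, "0");

// Build "audio-<yyyyLLdd>-<HHmmss>-<original file name>".
// Uses UTC getters so the output does not depend on the machine's timezone.
function buildAudioName(date, originalFileName) {
  const stamp =
    String(date.getUTCFullYear()) +
    pad2(date.getUTCMonth() + 1) + // getUTCMonth() is zero-based
    pad2(date.getUTCDate()) +
    "-" +
    pad2(date.getUTCHours()) +
    pad2(date.getUTCMinutes()) +
    pad2(date.getUTCSeconds());
  return `audio-${stamp}-${originalFileName}`;
}

const exampleName = buildAudioName(
  new Date(Date.UTC(2025, 0, 5, 9, 3, 7)),
  "file_12.oga"
);
console.log(exampleName); // audio-20250105-090307-file_12.oga
```

Because the timestamp sorts lexicographically, files in the Drive folder line up in recording order when sorted by name.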
Step 4: Merge Outputs and Map the Final Record
Combine transcription and file metadata, then structure the output for logging and notifications.
- Connect both Transcribe Audio Clip and Store Audio in Drive to Combine Transcript & File.
- Open Map Voice Record Output and paste the JavaScript shown in the node to map `DateTime`, `Duration`, `Transcript`, `AudioURL`, and `ChatID`.
- Confirm the code references the trigger chat ID using `$('Telegram Voice Trigger').first().json.message.chat.id`.
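The node's actual JavaScript is not reproduced in this guide, so the following is only a sketch of what such a mapping could look like. `mapVoiceRecord` is a hypothetical helper. The input field names are assumptions: `text` for the transcript and `webContentLink` / `webViewLink` for the Drive link may differ in your merged item, while `date` (Unix seconds), `voice.duration` (seconds), and `chat.id` follow Telegram's Message object.

```javascript
// Hypothetical mapper: `merged` is the combined transcript + Drive item,
// `message` is the original Telegram message from the trigger.
function mapVoiceRecord(merged, message) {
  return {
    // Telegram sends `date` as Unix seconds; convert to an ISO 8601 string.
    DateTime: new Date(message.date * 1000).toISOString(),
    Duration: message.voice.duration, // length of the voice note in seconds
    Transcript: (merged.text || "").trim(), // assumed transcript field
    AudioURL: merged.webContentLink || merged.webViewLink || "", // assumed Drive fields
    ChatID: message.chat.id,
  };
}

// Example with invented values.
const sampleRecord = mapVoiceRecord(
  {
    text: " Call the printer on Monday. ",
    webContentLink: "https://drive.example/download",
  },
  { date: 1700000000, chat: { id: 111 }, voice: { duration: 12 } }
);
console.log(sampleRecord);
```

Inside an n8n Code node, a helper like this would be called roughly as `return [{ json: mapVoiceRecord($input.first().json, $('Telegram Voice Trigger').first().json.message) }];`. Keeping the five keys identical to your Sheet's column headers makes the append step easier to map.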
Step 5: Configure Output Actions
Log the transcription to Google Sheets and notify the user in Telegram.
- Map Voice Record Output outputs to both Append Voice Log and Notify User in Telegram in parallel.
- In Append Voice Log, set Operation to `append`, choose your Document, and select the Sheet (e.g., `Sheet1`).
- Credential Required: Connect your googleSheetsOAuth2Api credentials for Append Voice Log.
- In Notify User in Telegram, set Text to `=✅ Voice Transcription Complete Your voice recording (⏱️ {{ $json.Duration }} seconds, recorded at {{ $json.DateTime }}) has been successfully transcribed and securely stored. 📎 Original audio stored here: {{ $json.AudioURL }} Thank you for using VoiceScribe AI! 🎙️`.
- Set Chat ID to `={{ $json.ChatID }}`.
- Credential Required: Connect your telegramApi credentials for Notify User in Telegram.
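As a rough illustration of how the placeholders in the Text field resolve, the sketch below fills the same template from one mapped record. `renderConfirmation` is a hypothetical helper, the line breaks are an assumption, and the `includeTranscript` option is an illustrative addition for readers who want the transcript echoed back in the reply.

```javascript
// Fill the confirmation template from one mapped record.
function renderConfirmation(record, { includeTranscript = false } = {}) {
  const lines = [
    "✅ Voice Transcription Complete",
    `Your voice recording (⏱️ ${record.Duration} seconds, recorded at ${record.DateTime}) has been successfully transcribed and securely stored.`,
  ];
  if (includeTranscript) {
    // Hypothetical extra line: echo the transcript back to the user.
    lines.push(`📝 ${record.Transcript}`);
  }
  lines.push(`📎 Original audio stored here: ${record.AudioURL}`);
  lines.push("Thank you for using VoiceScribe AI! 🎙️");
  return lines.join("\n");
}

// Example with invented values.
const confirmation = renderConfirmation(
  {
    Duration: 12,
    DateTime: "2023-11-14T22:13:20.000Z",
    Transcript: "Call the printer on Monday.",
    AudioURL: "https://drive.example/download",
  },
  { includeTranscript: true }
);
console.log(confirmation);
```

Previewing the message this way makes it easy to check wording and length before sending test voice notes through the live bot.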
Step 6: Test and Activate Your Workflow
Verify the end-to-end flow and then turn on the automation.
- Click Execute Workflow and send a voice message to your Telegram bot.
- Confirm Append Voice Log adds a new row containing `DateTime`, `Duration`, `Transcript`, and `AudioURL`.
- Verify that Notify User in Telegram responds with the formatted confirmation message and link.
- If everything succeeds, toggle the workflow to Active to enable production use.
Troubleshooting Tips
- Google Drive credentials can expire or need specific permissions. If things break, check the n8n Credentials screen and confirm the Drive scope and target folder access first.
- If you’re using Wait nodes or external rendering, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
- OpenAI prompts and defaults can be a little generic. Add your preferred language and formatting early (for example, “return clean paragraphs with punctuation”) or you will keep polishing transcripts manually.
Quick Answers
How long does setup take?
About 30 minutes if your accounts are ready.
Do I need coding skills?
No. You’ll connect Telegram, Google, and OpenAI, then pick the Sheet and Drive folder.
Is n8n free to use for this workflow?
Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in OpenAI Whisper API costs (usually pennies per minute of audio).
Where can I host n8n?
Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.
Can I customize this workflow?
Yes, pretty easily. You can replace the “Transcribe Audio Clip” step with Deepgram or AssemblyAI, then keep the same Google Drive + Google Sheets logging. Common tweaks include adding a short summary after transcription, tagging notes by project, or routing specific keywords to Slack instead of (or in addition to) Sheets.
Why is my Telegram connection failing?
Usually it’s a bot token issue or the wrong update type being listened for. Double-check the token in n8n Credentials, then confirm the Telegram Trigger is configured for messages that include voice notes. If it works for text but not audio, the message-type filter may be too strict, or the “Fetch Voice File” step is missing permission to download files from Telegram’s API.
How many voice notes can this handle?
Most small teams can run hundreds of voice notes a month without thinking about it.
Is n8n a better fit than Zapier or Make for this?
Often, yes, once you go beyond a toy setup. This workflow needs branching (voice vs non-voice), binary file handling for audio, and merging multiple outputs (transcript + Drive link) into a structured row. n8n is comfortable with that complexity, and self-hosting avoids per-task pricing when usage grows. Zapier or Make can still work if you’re doing low volume and want the quickest UI-only setup, but you may run into limits or higher costs with file-heavy steps. Talk to an automation expert if you want help choosing.
Once this is running, your voice notes stop being “messages” and start being an actual knowledge base you can search. Set it up once, then get back to work that needs your brain.
Need Help Setting This Up?
Our automation experts can build and customize this workflow for your specific needs. Free 15-minute consultation—no commitment required.