Telegram + Google Gemini: instant replies for DMs
Your Telegram DMs don’t slow down just because you’re busy. You answer the same questions, hunt for old context, and still manage to miss messages when things get hectic.
This problem hits support leads hardest, but marketers running campaigns and founders handling inbound feel it too. The goal of this Telegram and Gemini setup is simple: fast answers that stay consistent, even when the message is an image.
You’ll see exactly how this n8n workflow routes DMs, keeps short-term memory across a conversation, and replies instantly with Google Gemini.
How This Automation Works
Here’s the complete workflow you’ll be setting up:
n8n Workflow Template: Telegram + Google Gemini: instant replies for DMs
Workflow diagram (Telegram Flow):
- Telegram Trigger → Route Types
- Route Types → Map text prompt → Knowledge Base Agent
- Route Types → Get a file → Analyze image → Map image prompt → Knowledge Base Agent
- Simple Memory and Google Gemini Chat Model attach to Knowledge Base Agent (as its memory and language model)
- Knowledge Base Agent → Send a text message
Why This Matters: Slow, Inconsistent DM Replies
DMs look harmless until you’re drowning in them. One person asks for pricing, another sends a screenshot of an error, and someone else replies to a message you barely remember sending. If you answer manually, you either go too slow (and lose the lead) or you answer too fast (and get it wrong). The worst part is the context switching: you stop real work, dig for history, write a careful reply, then do it again 10 minutes later.
It adds up fast. Here’s where it breaks down in real teams.
- Reply quality varies by who’s online, which means customers get mixed signals about policies, pricing, or timelines.
- Image messages are a time sink because someone has to open them, interpret them, and type a response from scratch.
- You lose conversation context across back-and-forth messages, so people repeat themselves and get annoyed.
- Important DMs get buried during launches, support spikes, or travel days when you can’t sit in Telegram all day.
What You’ll Build: An AI DM Responder With Memory (Text + Images)
This workflow turns your Telegram bot into a fast first responder for direct messages. A user sends a DM to your bot, and n8n immediately checks what kind of message it is. If it’s text, the workflow builds a clean prompt and passes it to an AI agent that can keep short-term context across the conversation. If it’s an image, the workflow retrieves the Telegram file, asks Gemini to inspect the image content, then composes a prompt that includes what the image shows. In both cases, the AI agent generates a Telegram-friendly reply and sends it back instantly, without you opening the chat.
The workflow starts with a Telegram trigger and routes messages by type. Gemini handles both chat replies and image understanding, while a memory buffer keeps the last 20 messages available so responses don’t feel “stateless.” Finally, n8n sends the formatted reply right back to the same DM thread.
Expected Results
Say you get about 30 DMs a day, and around a third include an image (screenshots, receipts, error messages). Manually, even “quick” replies take maybe 5 minutes each once you read, check context, and respond, so that’s roughly 2 to 3 hours daily. With this workflow, you spend about 10 minutes setting guardrails and reviewing edge cases, while the bot handles the routine questions and image explanations automatically. Most teams get a couple hours back on day one.
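The arithmetic behind that estimate is easy to redo with your own numbers. The sketch below uses only the figures from the scenario above and assumes the 10 minutes of review is a daily cost; the variable names are illustrative.

```python
# Figures from the scenario above; replace them with your own inbox numbers.
dms_per_day = 30
minutes_per_manual_reply = 5
daily_review_minutes = 10  # assumed daily cost of guardrails and edge-case review

manual_minutes = dms_per_day * minutes_per_manual_reply
saved_minutes = manual_minutes - daily_review_minutes

print(f"Manual handling: {manual_minutes} min/day ({manual_minutes / 60:.1f} h)")
print(f"With the bot:    {daily_review_minutes} min/day")
print(f"Time recovered:  {saved_minutes} min/day ({saved_minutes / 60:.1f} h)")
```

At 30 DMs and 5 minutes each, manual handling comes to 150 minutes, which is where the "2 to 3 hours" range comes from.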
Before You Start
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- Telegram for receiving and sending DMs via a bot
- Google Gemini API to generate replies and analyze images
- Telegram Bot Token (get it from @BotFather)
Skill level: Beginner. You’ll connect accounts, paste API keys, and adjust a couple prompts.
Want someone to build this for you? Talk to an automation expert (free 15-minute consultation).
Step by Step
A Telegram DM triggers the workflow. The Telegram Trigger node fires as soon as your bot receives a new message, so replies can be near-instant.
The message gets routed by type. A Switch node checks if the incoming content is plain text or an image. That decision controls which prompt gets composed next, so Gemini receives the right context instead of a messy “one prompt fits all.”
Images are retrieved and inspected. If the DM includes an image, n8n fetches the Telegram file, then sends it to Gemini’s image understanding node to describe what’s in it (errors, UI elements, documents, whatever the user sent).
An AI agent writes the reply with memory. The workflow passes the final prompt to a context-aware agent backed by a memory buffer window, which keeps the last 20 messages available so the response stays coherent across a back-and-forth conversation.
The response is sent to the same chat. The Telegram send node dispatches a formatted reply that reads well in Telegram and doesn’t require you to copy-paste anything.
You can easily modify the prompts to match your tone, or route certain keywords to a human escalation path. See the full implementation guide below for customization options.
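To make the escalation idea concrete, here is a minimal sketch of a keyword check that decides whether a DM goes to the bot or to a person. It is not part of the template: the keyword list and the function names are hypothetical, and in n8n the same logic would more likely live in an extra routing rule placed before the agent.

```python
# Hypothetical keyword list; replace with terms that matter for your business.
ESCALATION_KEYWORDS = ("refund", "chargeback", "cancel", "legal")


def needs_human(text: str) -> bool:
    """Return True when a DM mentions any escalation keyword (case-insensitive)."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in ESCALATION_KEYWORDS)


def choose_path(text: str) -> str:
    """Name the branch a message would take: 'human' or 'bot'."""
    return "human" if needs_human(text) else "bot"
```

Substring matching is deliberately loose here (for example, "cancel" also catches "cancellation"), which errs on the side of handing a conversation to a person.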
Step-by-Step Implementation Guide
Step 1: Configure the Telegram Trigger
Set up the workflow entry point so your bot receives Telegram messages.
- Add and open Telegram Incoming Trigger.
- Set Updates to `message`.
- Confirm the node is connected to Route Message Type as the next step.
Step 2: Route Message Types and Retrieve Images
Split the flow so text messages and image messages are handled appropriately.
- Open Route Message Type and verify the first rule checks Left Value `={{ $json.message.text }}` with the string exists operator.
- Verify the second rule checks Left Value `={{ $json.message.photo[2] }}` with the object exists operator.
- Confirm the output labeled Text goes to Compose Text Prompt, and the output labeled Image goes to Retrieve Telegram File.
- In Retrieve Telegram File, set Resource to `file` and File ID to `={{ $json.message.photo[2].file_id }}`.
- Credential Required: Connect your `telegramApi` credentials in Retrieve Telegram File.
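The routing decision can be sketched outside n8n to show what the two rules are looking at. This is an illustration, not the template's implementation: it assumes the standard Telegram update shape, where `message.photo` is a list of sizes that each carry `file_id`, `width`, and `height`. Unlike the fixed index `[2]`, the hypothetical helper below picks whichever size is largest, so it also works when Telegram sends fewer than three sizes.

```python
from typing import Optional


def route_message(update: dict) -> str:
    """Mirror the two routing rules: text first, then photo."""
    message = update.get("message", {})
    if message.get("text"):
        return "text"
    if message.get("photo"):
        return "image"
    return "unsupported"  # stickers, voice notes, etc. match neither rule


def largest_photo_file_id(update: dict) -> Optional[str]:
    """Return the file_id of the largest photo size, or None if there is no photo."""
    photos = update.get("message", {}).get("photo") or []
    if not photos:
        return None
    largest = max(photos, key=lambda p: p.get("width", 0) * p.get("height", 0))
    return largest["file_id"]
```

The "unsupported" branch is worth noticing: any message that is neither text nor a photo falls through both rules, so it gets no reply unless you add a fallback output.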
Note: make sure `photo[2]` exists in the incoming Telegram payload; otherwise, adjust the index to match your bot’s media sizes.
Step 3: Analyze Images and Compose Prompts
Prepare clean text for the AI by building prompt strings from user text or image analysis.
- Open Inspect Image Content and set Resource to `image`, Input Type to `binary`, and Operation to `analyze`.
- Set Model to `models/gemini-2.5-flash`.
- Credential Required: Connect your `googlePalmApi` credentials in Inspect Image Content.
- In Compose Image Prompt, add a string assignment named text with value `=User image description: {{ $json.content.parts[0].text }} User image caption: {{ $('Telegram Incoming Trigger').item.json.message.caption }}`.
- In Compose Text Prompt, add a string assignment named text with value `={{ $json.message.text }}`, the same field the router checks in Step 2.
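As a rough picture of what the two compose steps produce, the sketch below builds the same `text` string in plain Python. It assumes the standard Telegram update shape, where text lives at `message.text` and an image caption at `message.caption`. The function names and the "(no caption)" fallback are assumptions added for illustration; the template itself does not define a fallback.

```python
def compose_text_prompt(update: dict) -> str:
    """For text DMs, the prompt is simply the message text."""
    return update["message"]["text"]


def compose_image_prompt(image_description: str, update: dict) -> str:
    """Combine the model's image description with the user's caption."""
    # Assumption: substitute a placeholder when the user sent no caption.
    caption = update["message"].get("caption") or "(no caption)"
    return (
        f"User image description: {image_description} "
        f"User image caption: {caption}"
    )
```

Both branches end up with a single `text` field, which is what lets one agent node serve either path.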
Step 4: Set Up the AI Agent and Memory
Configure the AI assistant to respond to either text or image prompts and maintain conversation context.
- Open Context Response Agent and set Text to `=Use these Descriptions to reply with a message to the user according to his question simply, shortly, and make sure he understand the thing he attaches: "" {{ $json.text }} "" I need your output message to be well spaced and formatted and look as attractive as possible for a telegram response!`.
- Ensure Compose Text Prompt and Compose Image Prompt both connect to Context Response Agent.
- Open Session Memory Buffer and set Session Key to `=memory_{{ $('Telegram Incoming Trigger').item.json.message.chat.id }}` with Context Window Length set to `20`. Key the session on the chat ID: a per-message value such as `message_id` changes with every DM, so the buffer would never carry context from one message to the next.
- Confirm Session Memory Buffer is linked to Context Response Agent via the ai_memory connection.
- Open Gemini Chat Engine and connect it to Context Response Agent as the ai_languageModel.
- Credential Required: Connect your `googlePalmApi` credentials in Gemini Chat Engine.
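The idea behind a windowed session memory fits in a few lines. This is a simplified sketch, not n8n's implementation: the `SessionMemory` class is hypothetical, it counts individual messages rather than whatever unit the memory node uses internally, and it keys each session on the chat ID so that every conversation keeps its own history. A key that changed with every message would give each DM an empty history.

```python
from collections import defaultdict, deque

WINDOW = 20  # matches the Context Window Length used in this guide


class SessionMemory:
    """Keep the most recent messages per chat, dropping the oldest first."""

    def __init__(self, window: int = WINDOW) -> None:
        self._window = window
        self._sessions = defaultdict(lambda: deque(maxlen=self._window))

    def add(self, chat_id: int, role: str, text: str) -> None:
        # One session per chat: the key stays stable across the conversation.
        self._sessions[f"memory_{chat_id}"].append((role, text))

    def history(self, chat_id: int) -> list:
        return list(self._sessions[f"memory_{chat_id}"])
```

A short usage example: after 25 messages in one chat, only the last 20 remain, and a different chat is unaffected.

```python
memory = SessionMemory()
for i in range(25):
    memory.add(1001, "user", f"message {i}")
print(len(memory.history(1001)))   # 20
print(memory.history(1001)[0])     # ('user', 'message 5')
print(memory.history(2002))        # []
```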
Step 5: Configure the Telegram Reply Output
Send the AI-generated response back to the originating Telegram chat.
- Open Dispatch Telegram Reply and set Text to `={{ $json.output }}`.
- Set Chat ID to `={{ $('Telegram Incoming Trigger').item.json.message.chat.id }}`.
- Credential Required: Connect your `telegramApi` credentials in Dispatch Telegram Reply.
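One thing to watch on the reply side is length: the Telegram Bot API limits the text of a single `sendMessage` call to 4096 characters. The template sends the agent output as one message, so the sketch below is an optional safeguard, not something the workflow already does. The helper names are hypothetical, and Python's `len` is only an approximation of how Telegram counts characters.

```python
TELEGRAM_TEXT_LIMIT = 4096  # per-message text limit for sendMessage


def split_reply(text: str, limit: int = TELEGRAM_TEXT_LIMIT) -> list:
    """Split a long reply into chunks, preferring to break at a newline."""
    chunks = []
    remaining = text
    while len(remaining) > limit:
        cut = remaining.rfind("\n", 0, limit)
        if cut <= 0:
            cut = limit  # no usable newline: hard cut at the limit
        chunks.append(remaining[:cut])
        remaining = remaining[cut:].lstrip("\n")
    if remaining:
        chunks.append(remaining)
    return chunks


def build_send_payloads(chat_id: int, reply: str) -> list:
    """Build one sendMessage payload (chat_id + text) per chunk."""
    return [{"chat_id": chat_id, "text": chunk} for chunk in split_reply(reply)]
```

If you keep the agent prompt asking for short answers, you may never hit the limit, but a guard like this avoids a failed send on the occasional long reply.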
Step 6: Test and Activate Your Workflow
Validate the full path for both text and image messages, then enable the automation.
- Click Execute Workflow and send a text message to your Telegram bot to test the Route Message Type → Compose Text Prompt → Context Response Agent path.
- Send an image with a caption to test the Retrieve Telegram File → Inspect Image Content → Compose Image Prompt → Context Response Agent path.
- Confirm a properly formatted response arrives via Dispatch Telegram Reply.
- Toggle the workflow to Active to run it continuously in production.
Troubleshooting Tips
- Telegram credentials can fail if the bot token was regenerated. If replies suddenly stop, verify the token in @BotFather and update it in n8n’s Telegram credentials.
- If image replies are blank, the file retrieval step is usually the culprit. Check the “Retrieve Telegram File” node output to confirm you’re actually downloading a file and not just receiving metadata.
- Gemini API calls can error due to quota limits or missing permissions on the key. Open your Google AI API console, confirm billing/quota, then re-test the Gemini chat and image inspection nodes.
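For transient quota errors, retrying with a growing delay is often enough. n8n nodes have their own retry settings, so you may not need code at all; the sketch below only illustrates the pattern. The `call_with_retry` helper and its parameters are hypothetical, and it catches every exception for brevity, where a real version would retry only on rate-limit or quota errors.

```python
import time


def call_with_retry(call, max_tries=3, base_delay=1.0, sleep=time.sleep):
    """Run call(); on failure wait base_delay, then double it, up to max_tries attempts."""
    for attempt in range(1, max_tries + 1):
        try:
            return call()
        except Exception:
            if attempt == max_tries:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` parameter is injectable so the behavior can be checked without actually waiting.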
Quick Answers
Setup takes about 30 minutes if your bot token and Gemini key are ready.
No coding is required. You’ll mostly connect accounts and tweak prompts to fit your use case.
n8n can be used for free: there is a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in Google Gemini API usage costs, which depend on how many DMs and image analyses you run.
There are two hosting options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.
You can customize this workflow, and you should. Most customizations happen in the “Compose Text Prompt” and “Compose Image Prompt” steps, plus the “Context Response Agent” instructions that tell the bot how to behave. Common tweaks include adding your FAQs, setting boundaries (refund policy, availability, pricing rules), and routing certain keywords to a human instead of the bot.
If the Telegram connection fails, the cause is usually an invalid or replaced bot token. Regenerate or re-copy the token from @BotFather, update the Telegram credentials in n8n, then re-test the Telegram Trigger and the reply node. If it works in one node but not the other, confirm both nodes are using the same credential entry.
As for volume, if you self-host, there’s no execution cap (it mainly depends on your server and Gemini quota). On n8n Cloud, the limit depends on your plan’s monthly executions, and you can upgrade if the bot gets busy.
n8n is often a better fit than Zapier or Make for this workflow, because this isn’t a simple “message in, message out” zap. You’re routing between text and image flows, retrieving files, and keeping conversation context with a memory window, which is harder (and sometimes pricey) to do cleanly in Zapier. n8n also gives you more control over how prompts are constructed, which matters when you care about consistency. If you self-host, you’re not paying per task in the same way, so high DM volume is less scary. That said, if you only need a basic auto-reply with no memory and no image handling, Zapier or Make can be quicker to click together. Talk to an automation expert if you want help choosing.
Once this is running, your DMs stop being a constant interruption and start acting like an organized intake channel. The workflow handles the repetitive questions and the messy image explanations so you can focus on the conversations that actually need you.