Telegram to Google Docs, Arabic PDF text ready

You get an Arabic PDF in Telegram, and you already know what’s coming. Someone has to “just pull the text,” fix weird line breaks, add page numbers, and then paste it into something the team can search.

The pain hits ops teams and admins first, but marketers dealing with Arabic press clips and reports feel it too. With this Telegram PDF OCR workflow, the payoff is simple: send a PDF, get a clean Google Docs link back, and stop burning about 1–2 hours a week on copy-paste cleanup.

Below, you’ll see how the automation runs, what you need to connect, and the real-world time savings you can expect once it’s live.

How This Automation Works

The full n8n workflow, from trigger to final output:

n8n Workflow Template: Telegram to Google Docs, Arabic PDF text ready

flowchart LR

    subgraph sg0["Telegram Bot Flow"]
        direction LR
        n0["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/telegram.svg' width='40' height='40' /></div><br/>Download Document from Teleg.."]
        n1["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/telegram.svg' width='40' height='40' /></div><br/>Telegram Bot Trigger"]
        n2@{ icon: "mdi:swap-horizontal", form: "rounded", label: "Check If Document Attached", pos: "b", h: 48 }
        n3["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Upload PDF to Mistral AI"]
        n4["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Get Mistral Signed URL"]
        n5["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Process OCR with Mistral"]
        n6["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/code.svg' width='40' height='40' /></div><br/>Parse OCR Results by Page"]
        n7@{ icon: "mdi:cog", form: "rounded", label: "Update Google Doc with Content", pos: "b", h: 48 }
        n8@{ icon: "mdi:cog", form: "rounded", label: "Create New Google Doc", pos: "b", h: 48 }
        n9@{ icon: "mdi:swap-vertical", form: "rounded", label: "Process Document Updates", pos: "b", h: 48 }
        n10@{ icon: "mdi:cog", form: "rounded", label: "Aggregate OCR Results", pos: "b", h: 48 }
        n11["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/telegram.svg' width='40' height='40' /></div><br/>Send Document Link to User"]
        n12["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/telegram.svg' width='40' height='40' /></div><br/>Request PDF File Format"]
        n13["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/telegram.svg' width='40' height='40' /></div><br/>Status: File Received (1/5)"]
        n14["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/telegram.svg' width='40' height='40' /></div><br/>Status: Sent to Processor (2.."]
        n15["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/telegram.svg' width='40' height='40' /></div><br/>Status: File Signed (3/5)"]
        n16["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/telegram.svg' width='40' height='40' /></div><br/>Status: Results Received (4/5)"]
        n17["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/telegram.svg' width='40' height='40' /></div><br/>Status: Creating Document (5.."]
        n1 --> n2
        n10 --> n8
        n10 --> n17
        n8 --> n9
        n4 --> n5
        n4 --> n15
        n9 --> n7
        n5 --> n6
        n5 --> n16
        n3 --> n4
        n3 --> n14
        n6 --> n10
        n2 --> n0
        n2 --> n12
        n7 --> n9
        n7 --> n11
        n0 --> n3
        n0 --> n13
    end

    %% Styling
    classDef trigger fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
    classDef ai fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    classDef aiModel fill:#e8eaf6,stroke:#3f51b5,stroke-width:2px
    classDef decision fill:#fff8e1,stroke:#f9a825,stroke-width:2px
    classDef database fill:#fce4ec,stroke:#c2185b,stroke-width:2px
    classDef api fill:#fff3e0,stroke:#e65100,stroke-width:2px
    classDef code fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    classDef disabled stroke-dasharray: 5 5,opacity: 0.5
    class n1 trigger
    class n2 decision
    class n3,n4,n5 api
    class n6 code
    classDef customIcon fill:none,stroke:none
    class n0,n1,n3,n4,n5,n6,n11,n12,n13,n14,n15,n16,n17 customIcon

The Problem: Arabic PDFs Are Hard to Reuse

Arabic PDFs are often “readable” to humans but useless to your tools. You can’t search them well, you can’t quote them quickly, and copying text usually comes out scrambled (wrong order, broken lines, missing punctuation). Then someone ends up retyping sections or stitching together chunks from different pages. It’s slow, and frankly it’s mentally exhausting work that nobody wants to own. Even worse, the final output varies depending on who did it, so your team spends extra time double-checking and reformatting.

The friction compounds. Here’s where it breaks down in day-to-day work:

People copy text from PDFs into WhatsApp or email, and it loses context and page references.
Arabic OCR is inconsistent across generic tools, so you waste time fixing obvious errors.
Files get saved in random places, which means the “final version” is always a mystery.
Someone has to manually tell the requester it’s done, usually after they follow up twice.

The Solution: Send a Telegram PDF, Receive a Google Doc

This n8n workflow turns Telegram into a simple intake box for Arabic PDFs. A user sends a PDF to your bot, the workflow validates it, downloads the file, and sends it to Mistral’s OCR service to extract Arabic text page by page. The output is then cleaned and organized, including page numbering so the text is easy to reference later. Next, n8n creates a Google Doc in your Google Drive and inserts the OCR text in batches (so long documents don’t time out). Finally, the bot replies in Telegram with a clickable Google Docs link, plus progress messages along the way so users don’t wonder if it’s stuck.

The workflow starts with a Telegram message containing a PDF attachment. From there, it pushes the document to Mistral for Arabic OCR, merges the results into one formatted text body, then creates and fills a Google Doc. The last step is the one you care about: a shareable link back in Telegram.

What You Get: Automation vs. Results

What This Workflow Automates

Checks that the Telegram message includes a valid PDF file.
Uploads the PDF to Mistral and runs Arabic OCR across pages.
Combines OCR output into a single, readable structure with page numbers.
Creates a Google Doc and inserts text in batches to handle longer PDFs.

Results You’ll Get

About 20 minutes of setup time, then it runs on autopilot.
A clean Google Docs link you can share immediately with your team.
Searchable Arabic text, so quoting and referencing stops being painful.
Fewer “is it done yet?” messages because progress updates are automatic.
One consistent output format across every file, even when different people submit.

Example: What This Looks Like

Say you handle 10 Arabic PDFs a week, and each one takes about 15 minutes to copy, clean, reformat, and upload somewhere shareable. That’s roughly 2.5 hours weekly, and it’s the kind of work that steals attention in the middle of your day. With this workflow, each request becomes: 1 minute to forward the PDF to Telegram, then you wait for OCR and Google Docs creation (often about 5–15 minutes depending on the PDF). You still review if the document is critical, but the busywork is gone.
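
The arithmetic behind that example can be sketched in a few lines. The figures are the ones quoted above (10 PDFs, 15 minutes each by hand, 1 minute each to forward); the function name is illustrative, and the OCR wait is left out because it is not hands-on time.

```javascript
// Weekly hands-on minutes for a given volume and per-PDF effort.
function weeklyMinutes(pdfsPerWeek, minutesPerPdf) {
  return pdfsPerWeek * minutesPerPdf;
}

const manualMinutes = weeklyMinutes(10, 15); // copy, clean, reformat, upload
const handsOnMinutes = weeklyMinutes(10, 1); // forwarding each PDF to the bot
const savedMinutes = manualMinutes - handsOnMinutes;

console.log(`Manual: ${manualMinutes / 60} h/week`);
console.log(`Hands-on with the workflow: ${handsOnMinutes} min/week`);
console.log(`Saved: ${savedMinutes} min/week`);
```

Swap in your own weekly volume and per-PDF time to get a figure for your team.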

What You’ll Need

n8n instance (try n8n Cloud free)
Self-hosting option if you prefer (Hostinger works well)
Telegram for receiving PDF files from users.
Mistral AI OCR to extract Arabic text from pages.
Google Docs + Google Drive to create and store the final document.
Telegram bot token (get it from @BotFather in Telegram).

Skill level: Intermediate. You’ll connect accounts, paste API keys, and confirm Google Drive permissions for the output folder.

Don’t want to set this up yourself? Talk to an automation expert (free 15-minute consultation).

How It Works

A Telegram message triggers everything. When someone sends your bot a document, n8n receives the update in real time through the Telegram trigger.

The file gets validated and downloaded. If there’s no attachment, or it isn’t a PDF, the workflow replies asking for the correct format. When it is a PDF, it fetches the actual document binary so it can be processed.

OCR runs and the text gets organized. n8n uploads the PDF to Mistral, runs Arabic OCR, splits the output by page, and then merges it back into one coherent result. Page numbering is included so references don’t get lost.

A Google Doc is created and filled. The workflow generates a new Google Doc in your Drive folder, then inserts the text in batches. That batching is what keeps large files from failing midway.

You can easily modify the destination folder and naming convention to match your client, project, or department. See the full implementation guide below for customization options.

Step-by-Step Implementation Guide

Step 1: Configure the Telegram Trigger

This workflow starts when a Telegram message arrives. Configure the trigger and validate that a PDF attachment is present.

Add and open Telegram Update Trigger and keep Updates set to message.
Credential Required: Connect your telegramApi credentials in Telegram Update Trigger.
Open Validate Attachment Presence and confirm the condition checks {{ $json.message.document.file_name }} with the exists operator.
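Verify the false branch connects to Ask for PDF Format so that messages without a document attachment get a reply. As configured, the condition only confirms that a document is attached; it does not inspect the file type.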

Tip: If you want to allow multiple file types, adjust the condition in Validate Attachment Presence and update the message in Ask for PDF Format.
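
If you want the check to confirm the file really is a PDF, the logic could look like the sketch below. It is a stricter variant, not the template's own condition: `isPdfDocument` is a hypothetical helper, and it assumes the update carries the `mime_type` and `file_name` fields that Telegram includes on document messages.

```javascript
// Stricter variant of the attachment check. The workflow's own condition only
// tests that message.document.file_name exists.
function isPdfDocument(update) {
  const doc = update?.message?.document;
  if (!doc) return false; // no attachment at all

  const name = (doc.file_name || "").toLowerCase();
  // Accept either a PDF MIME type or a .pdf extension.
  return doc.mime_type === "application/pdf" || name.endsWith(".pdf");
}

// Example updates (made-up values for illustration).
const pdfUpdate = {
  message: { document: { file_name: "report.PDF", mime_type: "application/pdf" } },
};
const textUpdate = { message: { text: "hello" } };

console.log(isPdfDocument(pdfUpdate));
console.log(isPdfDocument(textUpdate));
```

In n8n, the same idea could be expressed as an extra condition in Validate Attachment Presence rather than as separate code.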

Step 2: Connect Telegram File Retrieval and Status Updates

The workflow retrieves the uploaded file from Telegram and sends progress messages to the user.

In Fetch Telegram Document, set Resource to file and File ID to {{ $json.message.document.file_id }}.
Credential Required: Connect your telegramApi credentials in Fetch Telegram Document.
Open Ask for PDF Format and set Text to Please send the file in PDF format and Chat ID to {{ $('Telegram Update Trigger').item.json.message.chat.id }}.
Credential Required: Connect your telegramApi credentials in Ask for PDF Format.
Connect Telegram credentials to all status nodes: Progress: File Received, Progress: Sent to OCR, Progress: File Signed, Progress: Results Ready, and Progress: Creating Doc.

Tip: There are multiple Telegram nodes in this workflow (8 total). Ensure the same bot credentials are used consistently across all messaging nodes.
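
For context, fetching a file from Telegram is a two-step exchange in the Bot API: a `getFile` call returns a `file_path`, and the binary is then downloaded from a second URL. The n8n Telegram node handles this for you; the sketch below only builds the two URLs so the flow is visible. The function names and the token value are placeholders.

```javascript
// Step 1: ask Telegram where the file lives (the response contains file_path).
function getFileUrl(botToken, fileId) {
  return `https://api.telegram.org/bot${botToken}/getFile?file_id=${encodeURIComponent(fileId)}`;
}

// Step 2: download the binary using the file_path returned by step 1.
function downloadUrl(botToken, filePath) {
  return `https://api.telegram.org/file/bot${botToken}/${filePath}`;
}

// Placeholder values for illustration only.
const exampleToken = "123456:TEST-TOKEN";
console.log(getFileUrl(exampleToken, "FILE_ID_FROM_MESSAGE"));
console.log(downloadUrl(exampleToken, "documents/file_1.pdf"));
```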

Step 3: Set Up Mistral OCR Processing

These nodes upload the PDF to Mistral, get a signed link, and run OCR. Several steps run in parallel to send progress updates.

In Upload PDF to Mistral, set URL to https://api.mistral.ai/v1/files, Method to POST, and Content Type to multipart-form-data.
Configure Body Parameters in Upload PDF to Mistral with purpose = ocr and file as formBinaryData with input field data.
Credential Required: Connect your httpHeaderAuth credentials in Upload PDF to Mistral.
In Retrieve Mistral File Link, set URL to =https://api.mistral.ai/v1/files/{{ $json.id }}/url and add query parameter expiry = 24.
Credential Required: Connect your httpHeaderAuth credentials in Retrieve Mistral File Link and Run Mistral OCR.
In Run Mistral OCR, set JSON Body to { "model": "mistral-ocr-latest", "document": { "type": "document_url", "document_url": "{{ $json.url }}" }, "include_image_base64": true }.

Upload PDF to Mistral outputs to both Retrieve Mistral File Link and Progress: Sent to OCR in parallel. Retrieve Mistral File Link outputs to both Run Mistral OCR and Progress: File Signed in parallel. Run Mistral OCR outputs to both Split OCR Pages and Progress: Results Ready in parallel.
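
To make the three Mistral calls easier to picture outside n8n, here is a minimal sketch that builds each request as a plain object without sending anything. Two details are assumptions rather than something this guide states: the OCR endpoint path (`/v1/ocr`) and the `Authorization: Bearer` header format behind the httpHeaderAuth credential. The helper names are hypothetical.

```javascript
const MISTRAL_BASE = "https://api.mistral.ai/v1";

// Assumption: the header credential is a Bearer token.
function authHeaders(apiKey) {
  return { Authorization: `Bearer ${apiKey}` };
}

// 1. Upload: multipart form with purpose=ocr (the binary file part is omitted here).
function uploadRequest(apiKey) {
  return {
    method: "POST",
    url: `${MISTRAL_BASE}/files`,
    headers: authHeaders(apiKey),
    formFields: { purpose: "ocr" },
  };
}

// 2. Signed link for the uploaded file, with expiry = 24 as in the node settings.
function signedUrlRequest(apiKey, fileId) {
  return {
    method: "GET",
    url: `${MISTRAL_BASE}/files/${fileId}/url?expiry=24`,
    headers: authHeaders(apiKey),
  };
}

// 3. OCR call with the JSON body from Run Mistral OCR.
function ocrRequest(apiKey, signedUrl) {
  return {
    method: "POST",
    url: `${MISTRAL_BASE}/ocr`, // assumption: endpoint path is not given in this guide
    headers: { ...authHeaders(apiKey), "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "mistral-ocr-latest",
      document: { type: "document_url", document_url: signedUrl },
      include_image_base64: true,
    }),
  };
}

console.log(signedUrlRequest("YOUR_API_KEY", "file-id-from-upload").url);
```

Each object maps onto one HTTP Request node: the `id` returned by the upload feeds the signed-link call, and the `url` it returns feeds the OCR body.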

Step 4: Process OCR Pages and Aggregate Output

After OCR, the results are split into pages and aggregated to prepare the final document content.

In Split OCR Pages, keep the JavaScript code as provided to output pageNumber and content for each page.
In Combine OCR Output, confirm the aggregation fields include pageNumber and content.

Combine OCR Output outputs to both Generate Google Doc and Progress: Creating Doc in parallel.
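
The template ships its own code for Split OCR Pages, so treat the following as an illustration of the shape of the data rather than the actual node contents. It assumes the OCR response holds a `pages` array whose entries carry the page text in a `markdown` field, and it numbers pages by their position in that array. Both function names are hypothetical.

```javascript
// Turn an OCR response into one { pageNumber, content } item per page.
// Assumption: each page's text is in page.markdown.
function splitOcrPages(ocrResponse) {
  const pages = Array.isArray(ocrResponse?.pages) ? ocrResponse.pages : [];
  return pages.map((page, i) => ({
    pageNumber: i + 1, // numbered by position, starting at 1
    content: (page.markdown || "").trim(),
  }));
}

// Collect the per-page items into two parallel arrays, which is the shape
// the Google Docs insert expression reads (content[i] with pageNumber[i]).
function combineOcrOutput(items) {
  return {
    pageNumber: items.map((item) => item.pageNumber),
    content: items.map((item) => item.content),
  };
}

// Made-up response for illustration.
const sampleResponse = {
  pages: [{ markdown: "الصفحة الأولى\n" }, { markdown: "الصفحة الثانية" }],
};
console.log(combineOcrOutput(splitOcrPages(sampleResponse)));
```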

Step 5: Configure Google Docs Output and User Delivery

The workflow creates a Google Doc, inserts the OCR text, and sends the link back to the Telegram user.

In Generate Google Doc, set Title to =OCR Result from {{ $('Telegram Update Trigger').item.json.message.document.file_name }} and Folder ID to your target folder ID (replace [YOUR_ID]).
Credential Required: Connect your googleDocsOAuth2Api credentials in Generate Google Doc and Insert Text into Doc.
In Insert Text into Doc, set Operation to update and Document URL to {{ $json.id }}.
Ensure the Insert action text in Insert Text into Doc uses the full expression: {{ $('Combine OCR Output').item.json.content.map((c, i) => `${c}\n\n(Page Number: ${$('Combine OCR Output').item.json.pageNumber[i]})` ).join('\n\n--------\n\n') }}.
In Batch Doc Updates, keep Batch Size set to 1 for sequential updates.
In Send Doc Link to User, set Text to =<a href="https://docs.google.com/document/d/{{ $json.documentId }}">{{ $('Generate Google Doc').item.json.name }}</a> and Chat ID to {{ $('Telegram Update Trigger').item.json.message.chat.id }}.
Credential Required: Connect your telegramApi credentials in Send Doc Link to User.

⚠️ Common Pitfall: If you leave Folder ID as [YOUR_ID], Generate Google Doc will fail. Replace it with an actual Google Drive folder ID.

Insert Text into Doc outputs to both Batch Doc Updates and Send Doc Link to User in parallel.
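
The insert expression and the link text are easier to check as plain functions. The sketch below mirrors the two expressions from this step; the function names and sample values are illustrative, and in n8n the inputs come from Combine OCR Output and Generate Google Doc.

```javascript
// Same logic as the Insert Text into Doc expression: each page's text,
// followed by its page number, with a divider between pages.
function formatOcrText({ content, pageNumber }) {
  return content
    .map((c, i) => `${c}\n\n(Page Number: ${pageNumber[i]})`)
    .join("\n\n--------\n\n");
}

// Same shape as the Send Doc Link to User text: an HTML anchor to the doc.
function docLink(documentId, title) {
  return `<a href="https://docs.google.com/document/d/${documentId}">${title}</a>`;
}

// Made-up values for illustration.
const docText = formatOcrText({ content: ["First page", "Second page"], pageNumber: [1, 2] });
console.log(docText);
console.log(docLink("DOC_ID", "OCR Result from report.pdf"));
```

Changing the divider or the page label only requires editing the template string in the expression.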

Step 6: Test and Activate Your Workflow

Run a full test using a PDF in Telegram and verify each stage completes successfully.

Click Execute Workflow and send a PDF to your Telegram bot to trigger Telegram Update Trigger.
Confirm you receive progress messages from Progress: File Received through Progress: Creating Doc.
Verify that a Google Doc is created with the OCR text and that Send Doc Link to User posts the document link to Telegram.
When everything works, toggle the workflow to Active to enable production use.

Common Gotchas

Google Docs/Drive credentials can expire or need specific permissions. If things break, check the connected Google account in n8n credentials and confirm the Drive folder sharing settings first.
If you’re using Wait nodes or external OCR processing, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
Default prompts in AI nodes are generic. Add your brand voice early or you’ll be editing outputs forever.

Frequently Asked Questions

How long does it take to set up this Telegram PDF OCR automation?

About 20 minutes if your keys are ready.

Do I need coding skills to automate Telegram PDF OCR?

No. You’ll mostly paste credentials and choose the target Google Drive folder.

Is n8n free to use for this Telegram PDF OCR workflow?

Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in Mistral OCR API usage costs for each PDF processed.

Where can I host n8n to run this automation?

Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.

Can I customize this Telegram PDF OCR workflow for a different naming format?

Yes, and it’s one of the best tweaks to do early. Change the document name in the Google Docs creation step, and adjust the “set/edit fields” mapping that builds the title from Telegram metadata. Common customizations include adding the sender name, inserting the date, using a project code prefix, and saving into different Google Drive folders by chat or user.
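
As one possible naming scheme, the sketch below builds a title from the Telegram message with an optional project code, the sender's first name, and the message date. `buildDocTitle` and the project code are hypothetical; `from.first_name`, `date` (Unix seconds), and `document.file_name` are fields Telegram includes on document messages.

```javascript
// Build a document title such as
// "PRJ - OCR Result from report.pdf (Sara, 2024-01-15)".
function buildDocTitle(message, projectCode) {
  const fileName = message.document.file_name;
  const sender = message.from?.first_name || "unknown";
  // Telegram sends the date as Unix seconds; keep only YYYY-MM-DD (UTC).
  const date = new Date(message.date * 1000).toISOString().slice(0, 10);
  const prefix = projectCode ? `${projectCode} - ` : "";
  return `${prefix}OCR Result from ${fileName} (${sender}, ${date})`;
}

// Made-up message for illustration.
const exampleMessage = {
  date: 86400,
  from: { first_name: "Sara" },
  document: { file_name: "report.pdf" },
};
console.log(buildDocTitle(exampleMessage, "PRJ"));
```

In the workflow, the equivalent change is an edit to the Title expression in Generate Google Doc.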

Why is my Telegram connection failing in this workflow?

Usually it’s the bot token or webhook setup. Regenerate the token in @BotFather if needed, then update the Telegram credentials in n8n and re-check the webhook is active. Also confirm the bot is actually in the chat you’re testing (group permissions trip people up), and that the message contains a real document attachment rather than a forwarded “preview.”

How many PDFs can this Telegram PDF OCR automation handle?

Plenty for normal team usage. On n8n Cloud Starter, you’re limited by monthly executions, while self-hosting has no execution cap (it depends on your server). In practice, OCR time is the real bottleneck, so larger PDFs simply queue up and take longer.

Is this Telegram PDF OCR automation better than using Zapier or Make?

Often, yes, because this flow has branching, batching, and multi-step OCR processing that gets expensive or awkward elsewhere. n8n also gives you the self-host option, which matters if you process lots of documents and want predictable costs. Zapier or Make can still win for very small, simple flows with two steps. If you’re unsure, talk to an automation expert and you’ll get a straight recommendation based on volume and risk.

Once this is running, Arabic PDFs stop being a dead end. You’ll get searchable text, a shareable Google Doc, and fewer interruptions during the week.
