January 22, 2026

WhatsApp + OpenAI: faster replies with voice and PDFs

Lisa Granqvist Partner Workflow Automation Expert

Your WhatsApp inbox turns into a messy mix of “quick question” texts, voice notes you can’t search, and PDFs you have to open just to find one line. Then you reply. Then you reply again. And again. Same info, slightly different wording, every day.

This WhatsApp OpenAI automation is aimed at support teams first, but consultants answering client questions and small business owners handling sales DMs feel the same pain. The outcome is simple: consistent answers with context, delivered faster, even when the customer sends voice notes or documents.

This workflow turns WhatsApp messages into an AI-assisted conversation that can transcribe audio, read PDFs, remember prior messages, and pull answers from a knowledge base. You’ll see what it does, what you need, and where teams usually trip up.


The Challenge: Fast Replies When Messages Aren’t “Readable”

Most WhatsApp support and sales conversations don’t come in neatly. A customer sends a voice note while driving. Another drops a PDF invoice and says “can you check page 3.” Someone else asks a question you answered yesterday, but you can’t remember what you said, and scrolling back through chat history feels like a punishment. Even worse, you reply quickly but inconsistently, so your “official” answer changes depending on who’s holding the phone that day.

The friction compounds. Not because one message is hard, but because you repeat the same micro-tasks dozens of times.

  • Voice notes force you to stop, listen, replay, and guess what the person meant.
  • PDFs and documents slow everything down since the answer is inside a file, not the chat.
  • Without memory, you ask customers to repeat themselves, which makes you look disorganized.
  • Copy-paste “templates” drift over time, and your team ends up sending conflicting information.

The Fix: An AI WhatsApp Assistant That Reads, Listens, and Remembers

This workflow starts when a message hits your WhatsApp webhook (commonly through a provider like Evolution API). It checks authorization settings first, so you can limit responses to an admin number during testing or keep it open for customer support later. From there, it routes the message by type: text goes one way, voice notes get transcribed with OpenAI, and documents like PDFs get processed so the content becomes searchable text. Once the message is “understandable,” an AI agent (GPT-4o via OpenAI’s chat model) uses your saved prompt plus chat memory stored in Postgres to respond consistently and in context.

When a user sends a document, the workflow can also index it into a Supabase vector database (RAG) so future questions can be answered from that file without you re-opening it. Finally, the automation sends the response back through the same WhatsApp channel, with optional delays to bundle messages so replies feel natural instead of spammy.
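To make the routing step concrete, here is a minimal sketch in Python. The payload shape and field names (`audioMessage`, `documentMessage`, `conversation`) are assumptions modeled loosely on Evolution-API-style webhooks; check what your provider actually sends.

```python
# Sketch of the message-type routing described above. Field names are
# illustrative assumptions, not the exact Evolution API schema.

def route_message(payload: dict) -> str:
    """Classify an incoming WhatsApp payload as text, audio, or document."""
    msg = payload.get("message", {})
    if "audioMessage" in msg:
        return "audio"       # voice note -> transcription branch
    if "documentMessage" in msg:
        return "document"    # PDF etc. -> extraction + RAG branch
    if "conversation" in msg or "extendedTextMessage" in msg:
        return "text"        # plain text -> straight to the agent
    return "unsupported"
```

In n8n this logic lives in the message-type switch node; the sketch just shows the branching decision it makes.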


Real-World Impact

Say your inbox gets 30 WhatsApp messages a day. If 10 are voice notes (about 3 minutes each to listen and replay), and another 10 involve PDFs or attachments (maybe 5 minutes each to open, search, and respond), that’s roughly 80 minutes of pure “decode the message” work. With this workflow, you skim transcriptions and AI-drafted answers instead. You might spend 10 minutes reviewing and correcting, then hit send. That’s about an hour back on a normal day.
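The arithmetic behind that estimate, spelled out:

```python
# Sanity check of the time estimate above (all figures are the
# article's rough assumptions, not measurements).
voice_minutes = 10 * 3    # 10 voice notes, ~3 min each to listen and replay
pdf_minutes = 10 * 5      # 10 attachments, ~5 min each to open and search
manual = voice_minutes + pdf_minutes   # ~80 min of "decode the message" work
with_workflow = 10                     # ~10 min reviewing AI drafts
saved = manual - with_workflow         # ~70 min back per day
```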

Requirements

  • n8n instance (try n8n Cloud free)
  • Self-hosting option if you prefer (Hostinger works well)
  • WhatsApp provider (Evolution API) to receive and send WhatsApp messages
  • OpenAI for GPT-4o responses and Whisper transcription
  • Supabase to store embeddings for document Q&A (RAG)
  • PostgreSQL for chat memory and prompt storage
  • Redis to buffer and group incoming messages
  • OpenAI API key (get it from your OpenAI dashboard)

Skill level: Intermediate. You should be comfortable adding credentials, running SQL once, and testing webhooks end-to-end.

Need help implementing this? Talk to an automation expert (free 15-minute consultation).

The Workflow Flow

A WhatsApp message triggers the workflow. Your provider calls an n8n webhook, then the workflow loads configuration (like the admin number and API key settings) and checks if this sender is allowed.

The message gets normalized. A router detects if it’s text, audio, or a file. Audio becomes text via OpenAI transcription, and documents are converted into plain text so the assistant can actually “read” them.
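Under the hood, the audio branch is essentially one transcription call. A minimal sketch using the official OpenAI Python SDK (the n8n node does this for you; the import is kept inside the function so the sketch stays loadable without the SDK installed):

```python
def transcribe_voice_note(path: str) -> str:
    """Turn a downloaded WhatsApp voice note into searchable text."""
    from openai import OpenAI  # lazy import; requires the openai package

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(path, "rb") as audio:
        result = client.audio.transcriptions.create(
            model="whisper-1",  # OpenAI's Whisper transcription model
            file=audio,
        )
    return result.text
```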

AI responds with context. The workflow pulls your current system prompt from Postgres, loads conversation memory for that user session, and sends everything into an AI agent. If the question needs knowledge lookup, it uses the Supabase vector store to retrieve relevant snippets before generating a reply.
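One plausible shape for that assembly step, sketched as a pure function (the variable names and message layout are illustrative, not the workflow's exact internals):

```python
# Sketch: combine the stored system prompt, conversation memory, and
# retrieved knowledge snippets into a chat-completion message list.

def build_messages(system_prompt, memory, snippets, question):
    """memory: prior turns as {"role", "content"} dicts, oldest first."""
    context = "\n\n".join(snippets)
    messages = [{
        "role": "system",
        "content": f"{system_prompt}\n\nRelevant knowledge:\n{context}",
    }]
    messages += memory                   # prior conversation turns
    messages.append({"role": "user", "content": question})
    return messages
```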

Replies go back to WhatsApp (and your data stores stay updated). The assistant response is formatted for sending, optionally delayed to avoid rapid-fire messages, then delivered through your WhatsApp API. If a new document came in, the workflow also saves a summary and indexing metadata so it becomes usable later.

You can easily modify the authorization rules to handle all customers instead of admin-only, based on your needs. See the full implementation guide below for customization options.

Step-by-Step Implementation Guide

Step 1: Configure the Webhook Trigger

Set up the entry point for incoming requests so the workflow can receive and validate messages before any processing begins.

  1. Add the Incoming Webhook Trigger node and copy its production URL for external calls.
  2. Connect Incoming Webhook Trigger to Configuration Set to initialize base parameters.

Keep the Incoming Webhook Trigger URL consistent in your external app to avoid webhook mismatches during testing.
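For testing, you can post a sample payload to the webhook yourself. A hedged sketch using only the standard library; the payload fields mimic an Evolution-API-style event and should be adjusted to whatever your provider really sends:

```python
import json
import urllib.request

def sample_payload(sender: str, text: str) -> dict:
    """Minimal Evolution-API-style test payload (field names assumed)."""
    return {
        "event": "messages.upsert",
        "data": {
            "key": {"remoteJid": sender},
            "message": {"conversation": text},
        },
    }

def post_test_message(webhook_url: str, payload: dict) -> int:
    """POST the payload to your n8n webhook; returns the HTTP status."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:  # network call; run manually
        return resp.status
```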

Step 2: Connect Core Datastores and Caches

This workflow relies on Redis, Postgres, and Supabase for message history, configuration, and vector storage. Connect these services early to avoid downstream failures.

  1. Open each Redis node (Retrieve Messages, Redis Store, Message List Cache) and connect your Redis instance.
  2. Open each Postgres node used for data operations (Init Vector Tables, Create Knowledge DB, Fetch Prompt, Clear All Messages, Create RAG Control, Update File List, RAG Summaries, Delete File Record, Delete Knowledge Entry) and connect your Postgres database.
  3. Open Remove Old Files and connect your Supabase credentials for file cleanup.
  4. Open Supabase Vector Store and Supabase Vector Store B and connect your Supabase credentials for vector database operations.

Credential Required: Connect your Redis, Postgres, and Supabase credentials on the nodes listed above. These nodes do not have credentials configured yet.
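The Redis buffer exists so a burst of rapid-fire texts gets answered as one message instead of three. The grouping idea, sketched as a pure function (the 8-second gap is an arbitrary example, not the workflow's configured value):

```python
def group_burst(messages, gap_seconds=8):
    """Merge (timestamp, text) pairs that arrive within `gap_seconds`
    of the previous message, so a burst becomes a single prompt."""
    groups, current, last_ts = [], [], None
    for ts, text in messages:
        if last_ts is not None and ts - last_ts > gap_seconds:
            groups.append(" ".join(current))  # gap exceeded: close group
            current = []
        current.append(text)
        last_ts = ts
    if current:
        groups.append(" ".join(current))
    return groups
```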

Step 3: Configure Message Intake, Authorization, and Routing

This step ensures the workflow properly validates incoming payloads and routes messages to the correct processing branch.

  1. In Configuration Set, map incoming webhook fields into a standardized payload.
  2. Connect Configuration Set to Authorization Check to validate allowed senders or tokens.
  3. From Authorization Check, route approved messages into Message Type Switch for branching.
  4. Ensure Message Type Switch outputs to File Identifier, Generate File Output, Base Payload, and Map Fields B for the supported message types.
  5. Connect Retrieve Messages → Conditional Gate 1 → Adjust Fields B to control message list retrieval logic.

⚠️ Common Pitfall: If Authorization Check conditions are too strict, the workflow will never reach Message Type Switch. Test with a known-allowed payload first.
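The admin-only gate from this step boils down to a simple check. A sketch, where `admins` stands in for the numbers you store in Configuration Set:

```python
def is_authorized(sender: str, admins: set, admin_only: bool = True) -> bool:
    """Testing mode: only admin numbers pass. Open mode: everyone passes."""
    if not admin_only:
        return True          # customer-support mode: respond to anyone
    return sender in admins  # admin-only mode for safe testing
```

Flipping `admin_only` to `False` mirrors the customization mentioned later: opening the assistant up to all customers once testing is done.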

Step 4: Set Up File Handling and RAG Ingestion

These nodes handle file conversion, text extraction, and vector storage for retrieval-augmented generation (RAG).

  1. Connect Base64 Prep → Convert File A → File Type Router to normalize incoming file payloads.
  2. Route file text extraction through File Type Router → Extract PDF Text → Output Text.
  3. Connect Output Text to Supabase Vector Store to index the extracted text for RAG.
  4. Use File Identifier → Remove Old Files to clean up outdated Supabase files.
  5. Ensure the cleanup and summary flow is connected: Delete File Record → Generate Summary → Update File List.

Credential Required: Connect your Supabase credentials for Supabase Vector Store, Supabase Vector Store B, and Remove Old Files.

Step 5: Configure AI Models, Embeddings, and Memory

These AI components power classification, summarization, and assistant responses.

  1. Connect OpenAI credentials to ChatGPT Model Core, ChatGPT Model RAG, and ChatGPT Summary Model for language model access.
  2. Connect OpenAI credentials to OpenAI Embeddings A and OpenAI Embeddings B to generate vector embeddings.
  3. Verify Recursive Text Splitter → Default Document Loader → Supabase Vector Store for document chunking and ingestion.
  4. Ensure Postgres Chat Memory Main is connected to Assistant Agent and Postgres Chat Memory Aux is connected to Intent Classifier to preserve chat history.
  5. Confirm OpenAI Connector → Map Fields A and OpenAI Connector B → Adjust Fields A for AI-driven payload preparation.

Credential Required: Connect your OpenAI credentials on ChatGPT Model Core, ChatGPT Model RAG, ChatGPT Summary Model, OpenAI Embeddings A, OpenAI Embeddings B, OpenAI Connector, and OpenAI Connector B.
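To show what the splitter step is doing, here is a deliberately naive fixed-window chunker. The real Recursive Text Splitter is smarter (it prefers paragraph and sentence boundaries before cutting mid-text), and the sizes here are arbitrary examples:

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50):
    """Split text into overlapping windows for embedding and retrieval.
    Overlap keeps sentences that straddle a boundary findable from
    either neighboring chunk."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks
```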

Step 6: Configure Intent Classification, Parallel Prompting, and Merging

This stage merges prompt context with cached data and routes by intent.

  1. Ensure Initial Client Message writes to Message List Cache after processing Map Fields A, Map Fields B, and Adjust Fields A.
  2. Client Message outputs to both Fetch Prompt and RAG Summaries in parallel.
  3. Combine the parallel outputs using Aggregate DB Data B → Combine Streams.
  4. Route the merged output to Intent Classifier and then to Condition Router for downstream decision-making.

If Combine Streams receives only one branch, check the parallel execution from Client Message to ensure both branches are emitting data.
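The merge itself is simple, and the failure mode in the note above is the important part: if either branch is empty, the combined item is incomplete. A sketch that makes the check explicit (item shapes are illustrative):

```python
def combine_streams(prompt_items, summary_items):
    """Merge the first item from each parallel branch into one payload.
    Raises if either branch emitted nothing, instead of silently
    passing half a context downstream."""
    if not prompt_items or not summary_items:
        raise ValueError("Combine Streams needs both branches to emit data")
    return {**prompt_items[0], **summary_items[0]}
```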

Step 7: Configure Assistant Tools and Sub-Workflow Execution

These tools extend the assistant’s capabilities and trigger external workflows.

  1. Connect Condition Router to Process Handling, then to Run Sub-Workflow (Configure Required) to execute downstream actions.
  2. Attach tools to Assistant Agent: Scheduling Tool, Email Tool, Add Knowledge Tool, Search Knowledge Tool, Remove Knowledge Tool, and Remove RAG File.
  3. Attach RAG Tool Node to Assistant Agent for vector search capabilities.
  4. Ensure Postgres Chat Memory Main is connected to Assistant Agent and Postgres Chat Memory Aux is connected to Intent Classifier for memory support.

⚠️ Common Pitfall: Run Sub-Workflow (Configure Required) must be configured with a valid target workflow ID or it will fail silently.

For AI tool and memory sub-nodes (e.g., Scheduling Tool, Email Tool, Postgres Chat Memory Main, RAG Tool Node), add credentials on the parent node (Assistant Agent or Intent Classifier) rather than the sub-node itself.

Step 8: Test and Activate Your Workflow

Validate the workflow end-to-end before turning it on in production.

  1. Use Incoming Webhook Trigger to send a sample payload and click Execute Workflow.
  2. Confirm successful routing through Authorization Check, Message Type Switch, and the parallel branches from Client Message.
  3. Verify vector ingestion by checking outputs from Supabase Vector Store and Supabase Vector Store B.
  4. Check that Assistant Agent produces a response and that any tool outputs are triggered as expected.
  5. Once validated, toggle the workflow to Active for production use.

Watch Out For

  • Evolution API credentials can expire or need specific permissions. If things break, check your Evolution API key and webhook delivery logs first.
  • If you’re using Wait nodes or external processing (like file conversion), processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
  • Default prompts in AI nodes are generic. Add your brand voice early or you’ll be editing outputs forever.

Common Questions

How quickly can I implement this WhatsApp OpenAI automation?

Plan for an afternoon if you already have your WhatsApp provider, OpenAI key, and databases ready.

Can non-technical teams implement this faster WhatsApp replies solution?

Yes, but you will want one technical person for the first setup. The SQL initialization, credentials, and webhook testing are the parts that usually need help.

Is n8n free to use for this WhatsApp OpenAI automation workflow?

Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in OpenAI API costs (often a few dollars a month at low volume, more if you transcribe lots of audio or process many documents).

Where can I host n8n to run this automation?

Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.

How do I adapt this WhatsApp OpenAI automation solution to my specific challenges?

You can adjust the Configuration Set and Authorization Check logic to control who the assistant responds to, and tune the assistant’s behavior by editing the prompt stored in Postgres (the workflow fetches it before responding). Common customizations include changing your brand tone, adding a stricter “only answer from approved docs” rule, and swapping the retrieval source (Supabase RAG vs. a simpler FAQ table) depending on how sensitive your answers are.

Why is my Evolution API connection failing in this workflow?

Usually it’s an invalid key or the webhook isn’t reaching n8n. Confirm the provider is sending requests to the correct webhook URL, then update the Evolution API token in the workflow’s config and re-test with a fresh message. If it works for a few minutes and then stops, look for rate limits or blocked outbound requests from your hosting environment.

What’s the capacity of this WhatsApp OpenAI automation solution?

It depends more on your hosting and OpenAI rate limits than the workflow itself.

Is this WhatsApp OpenAI automation better than using Zapier or Make?

For this workflow, n8n has a few advantages: you can self-host for unlimited executions, handle branching logic without paying more for “paths,” and run database-backed memory plus RAG in one place. Zapier and Make can connect WhatsApp to AI, sure, but they get awkward fast when you add voice transcription, file handling, and a knowledge base that needs indexing. Also, running webhooks and long-running flows is generally smoother in n8n. If you only need “incoming message → AI → reply” and nothing else, simpler tools can be enough. Talk to an automation expert if you’re not sure which fits.

Once this is running, your inbox stops being a memory test. The workflow handles the repetitive decoding and drafting, so you can focus on the few messages that actually need a human.

Need Help Setting This Up?

Our automation experts can build and customize this workflow for your specific needs. Free 15-minute consultation—no commitment required.

Lisa Granqvist

Workflow Automation Expert

Expert in workflow automation and no-code tools.
