Google Drive to Supabase, searchable docs for AI
Your docs are “somewhere in Google Drive,” but finding the right answer still turns into Slack pings, repeated questions, and a lot of frantic copy-paste.
This hits marketing leads hard when they need approved messaging fast. Ops managers feel it when processes live in PDFs. And support teams end up rewriting the same replies. Indexing your Drive folder into Supabase fixes that by turning it into a searchable knowledge base your AI can actually use.
This workflow pulls docs from a Drive folder, turns them into clean text, creates embeddings, and stores everything in Supabase so your Q&A agent can retrieve the right passages in seconds.
The Problem: Google Drive Isn’t Searchable for AI
Google Drive search is fine for humans who know what they’re looking for. It’s terrible when the real need is “answer this question using whatever we’ve already written.” Teams end up asking the same questions in Slack or email because the doc exists, but nobody can find the exact section. PDFs make it worse. So do folders that evolved over years. The result is slow responses, inconsistent messaging, and a quiet tax on your best people’s attention.
The friction compounds.
- People waste about 10 minutes just locating the “right” version of a doc.
- Answers drift because everyone paraphrases from memory instead of quoting the source.
- PDFs and long docs get ignored, which means your most valuable knowledge is effectively offline.
- When someone updates a policy, nobody knows which chatbot or internal page needs updating.
How This Automation Works
The full n8n workflow, from trigger to final output:
n8n Workflow Template: Google Drive to Supabase, searchable docs for AI
Workflow diagram (flowchart, left to right): When Executed by Another Wor.. → Code in JavaScript → Execute a SQL query → Search files and folders → Loop Over Items → Download File → Insert into Supabase Vectors.. → back to Loop Over Items. Embeddings Google Gemini4 and Default Data Loader2 attach to Insert into Supabase Vectors.. as sub-nodes.
The Solution: Turn a Drive Folder Into a Supabase Vector Index
This workflow takes a single input (your Google Drive folder URL) and turns the contents into a semantic search index inside Supabase. It starts by parsing the folder identifier, then prepares your database by creating a vector-ready “documents” table using Postgres with pgvector enabled. After that, it pulls a list of files from the folder, processes them in manageable batches, and downloads each file’s content. For PDFs and other binary formats, it converts the file into plain text so it can be understood by an embedding model. Finally, it generates 768‑dimension embeddings using Google Gemini and stores both the text and embeddings in Supabase, ready for retrieval in a RAG app or AI agent.
The workflow starts when you provide a Drive folder URL (often as a sub-workflow trigger in a larger agent setup). From there, it initializes the Supabase table, loops through each file, extracts text, creates embeddings, and writes everything into a Supabase vector store so your app can do semantic search instead of keyword guessing.
What You Get: Automation vs. Results
| What This Workflow Automates | Results You’ll Get |
|---|---|
Example: What This Looks Like
Say your team has a Drive folder with about 50 PDFs and docs (sales one-pagers, SOPs, onboarding notes). Manually, indexing that means opening each file, copying text, cleaning it up, and pasting it into a database or “knowledge base” tool. Even at a modest 10 minutes per file, that’s about 8 hours of tedious work. With this workflow, you drop in the folder URL, wait for processing, and the Supabase index is ready for semantic search without the copy-paste marathon.
What You’ll Need
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- Google Drive to fetch files from a folder.
- Supabase to store documents and vectors in Postgres.
- Google Gemini API key (get it from Google AI Studio).
Skill level: Intermediate. You’ll paste credentials, add a folder URL, and confirm your Supabase table setup (no heavy coding, but you should be comfortable with API keys).
Don’t want to set this up yourself? Talk to an automation expert (free 15-minute consultation).
How It Works
You provide the Google Drive folder URL. The workflow is designed to start from a trigger (often as a sub-workflow), then immediately extracts the Drive identifier so later steps pull the right folder every time.
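The workflow's own parsing code isn't reproduced in this article, so here is a minimal sketch of what "extracting the Drive identifier" could look like. The function name parseDriveFolderId and the two URL shapes it handles are assumptions for illustration, not the template's actual logic.

```javascript
// Hypothetical helper: pulls a folder ID out of a Google Drive folder link.
// Assumes two common link shapes; the real Code node may handle others.
function parseDriveFolderId(link) {
  if (typeof link !== "string") return null;

  // Shape 1: https://drive.google.com/drive/folders/<id>?usp=sharing
  const pathMatch = link.match(/\/folders\/([A-Za-z0-9_-]+)/);
  if (pathMatch) return pathMatch[1];

  // Shape 2: https://drive.google.com/open?id=<id>
  const queryMatch = link.match(/[?&]id=([A-Za-z0-9_-]+)/);
  return queryMatch ? queryMatch[1] : null;
}

// Inside an n8n Code node, the result would typically be returned as an item:
// return [{ json: { folderId: parseDriveFolderId($json.Drive_Folder_link) } }];
```

Returning null for an unrecognized link (instead of throwing) lets a later step decide whether to stop the run or skip the input.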
Supabase gets prepared for vector search. A Postgres step initializes the table structure needed for storing documents and embeddings. Important detail: the included SQL can drop an existing “documents” table, so you want a clean environment or a backup.
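To make the drop-table risk concrete, here is a sketch of how the initialization SQL could be assembled with the destructive step behind a flag. The SQL is an assumption modeled on the common pgvector "documents plus match_documents" pattern, and buildInitSql is a hypothetical helper. The workflow's real query may differ, so compare it against what ships in the template before relying on it.

```javascript
// Hypothetical builder for the table-initialization SQL.
// Column names (content, metadata, embedding) are assumed, not confirmed by the template.
function buildInitSql({ dropExisting = false, dimensions = 768 } = {}) {
  if (!Number.isInteger(dimensions) || dimensions < 1) {
    throw new RangeError("dimensions must be a positive integer");
  }

  const statements = ["create extension if not exists vector;"];

  if (dropExisting) {
    // Destructive: wipes every previously indexed row and vector.
    statements.push("drop table if exists documents;");
  }

  statements.push(`create table if not exists documents (
  id bigserial primary key,
  content text,
  metadata jsonb,
  embedding vector(${dimensions})
);`);

  // Similarity search function; <=> is pgvector's cosine distance operator.
  statements.push(`create or replace function match_documents (
  query_embedding vector(${dimensions}),
  match_count int default null,
  filter jsonb default '{}'
) returns table (id bigint, content text, metadata jsonb, similarity float)
language plpgsql as $$
#variable_conflict use_column
begin
  return query
  select id, content, metadata,
         1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where metadata @> filter
  order by documents.embedding <=> query_embedding
  limit match_count;
end;
$$;`);

  return statements.join("\n\n");
}
```

Calling buildInitSql() with no arguments keeps existing rows, which is the safer default when you re-run the workflow against a table that already holds vectors.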
Files are retrieved and processed in batches. It lists items in the folder, then loops through them with Split in Batches so large folders don’t choke the run. Each file is downloaded from Drive and passed through a text loader that converts binary content into readable text.
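If the batching idea is new to you, this small sketch shows roughly what it does to a file list. n8n does this for you inside the loop node. The splitIntoBatches function below is a hypothetical stand-in that only illustrates the concept.

```javascript
// Hypothetical illustration of batching: split a list into fixed-size groups
// so each group can be processed before the next one starts.
function splitIntoBatches(items, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Example: five files processed two at a time gives three batches,
// the last one holding the single leftover file.
const exampleBatches = splitIntoBatches(["a.pdf", "b.pdf", "c.docx", "d.pdf", "e.txt"], 2);
```

Smaller batches keep memory use and API bursts low, at the cost of more loop iterations.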
Embeddings are created and stored in Supabase. Google Gemini generates a 768-dimension embedding for each document (or extracted chunk), then the Supabase vector store node writes the text plus vectors into your database so your AI can do semantic retrieval.
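"Semantic retrieval" boils down to comparing a question's embedding with each stored embedding and keeping the closest matches. The sketch below shows that comparison in plain JavaScript, assuming cosine similarity is the metric. In practice the database does this work, and cosineSimilarity and topMatches are hypothetical names used only for illustration.

```javascript
// Cosine similarity between two equal-length vectors (1 = same direction, 0 = unrelated).
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("embedding dimensions do not match");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored rows against a query embedding and keep the k best.
function topMatches(queryEmbedding, rows, k = 3) {
  return rows
    .map((row) => ({
      content: row.content,
      similarity: cosineSimilarity(queryEmbedding, row.embedding),
    }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, k);
}

// Toy 3-dimension vectors for readability; real Gemini embeddings here have 768 dimensions.
const toyRows = [
  { content: "Refund policy", embedding: [1, 0, 0] },
  { content: "Onboarding checklist", embedding: [0, 1, 0] },
];
const best = topMatches([0.9, 0.1, 0], toyRows, 1);
```

The dimension check matters in real use: a query embedded with a different model size than the stored vectors cannot be compared at all.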
You can easily modify which Drive folder is indexed, how text is chunked, or which table/namespace to store into based on your needs. See the full implementation guide below for customization options.
Step-by-Step Implementation Guide
Step 1: Configure the Sub-Workflow Trigger
Set up the entry point so the workflow can be invoked by another workflow with a Google Drive folder link.
- Add and open Sub-Workflow Trigger.
- Set Input Source to jsonExample.
- Set JSON Example to { "Drive_Folder_link": "https://drive.google.com/drive/folders/example" }.
- Confirm the execution path flows from Sub-Workflow Trigger to Parse Drive Identifier.
Step 2: Connect Google Drive
Authorize Google Drive and configure file discovery and download from the parsed folder ID.
- Open Retrieve Drive Items and set Resource to fileFolder.
- Set the Folder ID filter to {{ $('Parse Drive Identifier').item.json.folderId }}.
- Credential Required: Connect your googleDriveOAuth2Api credentials in Retrieve Drive Items.
- Open Fetch File Content and set Operation to download.
- Set File ID to {{ $json.id }} and enable conversion to text/plain via Google File Conversion.
- Credential Required: Connect your googleDriveOAuth2Api credentials in Fetch File Content.
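When a folder returns no files, it can help to know what a folder filter looks like at the API level. The sketch below builds the search string that the Google Drive API v3 files.list endpoint accepts in its q parameter. The n8n node assembles its own request, so treat buildFolderQuery as a hypothetical debugging aid, not the node's internals.

```javascript
// Hypothetical helper: builds a Drive API v3 "q" search string that lists
// non-trashed files directly inside one folder.
function buildFolderQuery(folderId) {
  // Drive query strings escape backslashes and single quotes with a backslash.
  const escaped = String(folderId).replace(/\\/g, "\\\\").replace(/'/g, "\\'");
  return `'${escaped}' in parents and trashed = false`;
}
```

Pasting the resulting string into the Drive API explorer with the same Google account is a quick way to tell a permissions problem from a workflow problem.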
Step 3: Set Up Processing and Embeddings
Prepare the SQL vector table, parse the Drive ID, and connect the binary-to-text loader and embedding generator.
- Open Parse Drive Identifier and keep the JavaScript logic as-is to extract folderId from Drive_Folder_link.
- Open Initialize Vector Table and set Operation to executeQuery.
- Set Query to the provided SQL block that creates documents and the match_documents function.
- Credential Required: Connect your postgres credentials in Initialize Vector Table.
- Ensure Binary Text Loader uses Data Type binary and is connected to Store in Supabase Index via the ai_document input.
- Confirm Gemini Embedding Generator is connected as the embedding model for Store in Supabase Index.
- Credential Required: Connect your googlePalmApi credentials in Gemini Embedding Generator (the embedding sub-node attached to Store in Supabase Index).
Warning: The provided SQL drops the documents table on every run. Remove the DROP TABLE line if you need to preserve existing vectors.
Step 4: Configure Output to Supabase Vector Store
Store the text and embeddings in Supabase and route outputs back to batch processing.
- Open Store in Supabase Index and set Mode to insert.
- Set Table Name to documents.
- Set Query Name to match_documents.
- Credential Required: Connect your supabaseApi credentials in Store in Supabase Index.
- Confirm the execution path Fetch File Content → Store in Supabase Index → Batch Through Files to continue iterating files.
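It can help to picture what one stored row looks like before you go checking Supabase. The sketch below assumes the common layout of content, metadata, and embedding columns, and validates the 768-dimension size mentioned earlier. The toDocumentRow function and the source_folder metadata key are hypothetical. The vector store node builds rows for you, so this is only a mental model.

```javascript
// Assumed embedding size for the Gemini model used in this workflow.
const EXPECTED_DIMENSIONS = 768;

// Hypothetical helper: shapes one row for the documents table and rejects
// embeddings that would not fit the table's vector column.
function toDocumentRow(content, embedding, metadata = {}) {
  if (typeof content !== "string" || content.trim() === "") {
    throw new Error("content must be non-empty text");
  }
  if (!Array.isArray(embedding) || embedding.length !== EXPECTED_DIMENSIONS) {
    throw new Error(`embedding must have ${EXPECTED_DIMENSIONS} dimensions`);
  }
  return { content, metadata, embedding };
}

// Example: tag the row with the folder it came from so results can be filtered later.
const exampleRow = toDocumentRow(
  "Refunds are processed within 5 business days.",
  new Array(EXPECTED_DIMENSIONS).fill(0),
  { source_folder: "policies" }
);
```

If you later switch embedding models, the size check is the first thing that will fail, which is a useful early signal that the table definition needs to change too.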
Step 5: Test and Activate Your Workflow
Validate the workflow with a test folder and then activate it for production.
- Click Execute Workflow and pass a valid Drive_Folder_link in Sub-Workflow Trigger.
- Verify that Retrieve Drive Items lists files, Fetch File Content downloads them, and Store in Supabase Index inserts vectors into documents.
- Check Supabase to confirm new rows and embeddings are stored.
- Click Activate to enable the workflow for production use.
Common Gotchas
- Google Drive credentials can expire or need specific permissions. If things break, check the Google Drive node’s OAuth status and scopes in n8n first.
- If you’re using Wait nodes or external rendering, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
- Default prompts in AI nodes are generic. Add your brand voice early or you’ll be editing outputs forever.
Frequently Asked Questions
About 10 minutes if you already have your credentials ready.
No. You’ll mostly paste credentials, set the folder URL, and test a run.
Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in Google Gemini API usage (the free tier is often enough to test, then you pay based on calls).
Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.
Yes, but you’ll want to be intentional about naming and separation. The simplest approach is to duplicate the workflow and change the “Drive folder URL” input for each folder. If you’d rather keep one workflow, pass in a folder URL each time via the trigger, then write to separate tables or add a “source_folder” field using the Edit Fields (Set) step. You can also swap the Gemini Embedding Generator for a different embedding model later, as long as Supabase stores vectors in a compatible size.
Usually it’s expired or mis-scoped OAuth credentials in the Google Drive nodes. Reconnect the account in n8n, then confirm the workflow has permission to list files and download content from that folder. Also check that the folder URL is correct and accessible to the connected Drive user. If it fails only on some files, it may be a permissions edge case on a shared drive item.
On n8n Cloud Starter, you’re limited by monthly executions, so very large folders may require batching across runs. If you self-host, there’s no execution cap (your server resources become the limit). Practically, most small teams index a few hundred docs comfortably, then schedule re-indexing for changes instead of reprocessing everything daily.
Often, yes. This workflow involves batching, file downloading, text extraction, embeddings, and a vector database write, which is where Zapier/Make can get awkward or expensive fast. n8n also lets you self-host, so you’re not paying per tiny step, and it handles more complex logic without fighting the tool. The tradeoff is setup: you’ll spend a little more time upfront getting credentials and Supabase ready. If you just need a simple “new file → send alert” automation, Zapier is fine. If you’re building a real RAG pipeline, n8n is the safer bet. Talk to an automation expert if you want a quick recommendation for your exact case.
Once your Drive folder is indexed in Supabase, the repeat questions stop being a daily interruption. The workflow handles the busywork, and your team gets answers they can trust.
Need Help Setting This Up?
Our automation experts can build and customize this workflow for your specific needs. Free 15-minute consultation—no commitment required.