Google Drive + Postgres: searchable docs, always ready
Your docs are “somewhere in Drive,” but finding the right paragraph at the right moment turns into a mini scavenger hunt. Someone downloads a file, searches manually, copies a quote into Slack, then does it again next week because nothing is indexed.
This is the kind of mess marketing leads feel when they need approved messaging fast. Ops managers run into it when SOPs are buried. And agency owners hate it because every client question becomes a time sink. With Drive Postgres search automation, new files become “askable” without you lifting a finger.
This workflow watches a Google Drive folder, turns new documents into OpenAI embeddings, stores them in Postgres (PGVector), then moves the file so you don’t process it twice. You’ll see what it fixes, what it produces, and what you need to run it reliably.
How This Automation Works
The full n8n workflow, from trigger to final output:
n8n Workflow Template: Google Drive + Postgres: searchable docs, always ready
flowchart LR
subgraph sg0["When clicking ‘Test workflow’ Flow"]
direction LR
n0@{ icon: "mdi:robot", form: "rounded", label: "Default Data Loader", pos: "b", h: 48 }
n1@{ icon: "mdi:robot", form: "rounded", label: "Recursive Character Text Spl..", pos: "b", h: 48 }
n2@{ icon: "mdi:cube-outline", form: "rounded", label: "Postgres PGVector Store", pos: "b", h: 48 }
n3@{ icon: "mdi:play-circle", form: "rounded", label: "When clicking ‘Test workflow’", pos: "b", h: 48 }
n4@{ icon: "mdi:swap-vertical", form: "rounded", label: "Loop Over Items", pos: "b", h: 48 }
n5@{ icon: "mdi:cog", form: "rounded", label: "Move File", pos: "b", h: 48 }
n6@{ icon: "mdi:cog", form: "rounded", label: "Download File", pos: "b", h: 48 }
n7@{ icon: "mdi:cog", form: "rounded", label: "Search Folder", pos: "b", h: 48 }
n8@{ icon: "mdi:play-circle", form: "rounded", label: "Schedule Trigger", pos: "b", h: 48 }
n9@{ icon: "mdi:swap-horizontal", form: "rounded", label: "Switch", pos: "b", h: 48 }
n10@{ icon: "mdi:cog", form: "rounded", label: "Extract from PDF", pos: "b", h: 48 }
n11@{ icon: "mdi:cog", form: "rounded", label: "Extract from Text", pos: "b", h: 48 }
n12@{ icon: "mdi:cog", form: "rounded", label: "Extract from JSON", pos: "b", h: 48 }
n13@{ icon: "mdi:vector-polygon", form: "rounded", label: "Embeddings OpenAI", pos: "b", h: 48 }
n9 --> n10
n9 --> n11
n9 --> n12
n5 --> n4
n6 --> n9
n7 --> n4
n4 --> n6
n10 --> n2
n8 --> n7
n13 -.-> n2
n12 --> n2
n11 --> n2
n0 -.-> n2
n2 --> n5
n1 -.-> n0
n3 --> n7
end
%% Styling
classDef trigger fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
classDef ai fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef aiModel fill:#e8eaf6,stroke:#3f51b5,stroke-width:2px
classDef decision fill:#fff8e1,stroke:#f9a825,stroke-width:2px
classDef database fill:#fce4ec,stroke:#c2185b,stroke-width:2px
classDef api fill:#fff3e0,stroke:#e65100,stroke-width:2px
classDef code fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef disabled stroke-dasharray: 5 5,opacity: 0.5
class n3,n8 trigger
class n0,n1 ai
class n2 database
class n13 aiModel
class n9 decision
The Problem: Drive Documents Aren’t Actually Searchable
Google Drive is great at storing files. It’s not great at answering questions. Sure, you can search titles and maybe a few keywords, but real work happens inside long PDFs, messy JSON exports, and “final_v7” text notes that never got cleaned up. So people fall back to the worst system: asking whoever “might remember.” That costs time, interrupts deep work, and quietly creates inconsistencies because everyone quotes a slightly different version of the truth.
It adds up fast. Here’s where it breaks down once your folder becomes a living library instead of a neat archive.
- Teams re-open the same PDF over and over just to find one sentence they used last month.
- Important context gets lost because Drive search can’t find “similar meaning,” only matching words.
- Manual indexing never happens, and when it does, it becomes stale within a week.
- Duplicates creep in because nobody can tell what’s already been processed and stored elsewhere.
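That "similar meaning" gap is exactly what embeddings close: documents are compared as vectors, so phrasing no longer has to match. A toy sketch of the idea, using made-up 3-dimensional vectors (real embeddings from a model like text-embedding-3-small have 1536 dimensions):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: 1.0 means same direction (same meaning), ~0 means unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Illustrative stand-ins for embedded sentences (values invented for the demo).
refund_policy = [0.9, 0.1, 0.0]   # "Our refund policy lasts 30 days"
money_back    = [0.8, 0.2, 0.1]   # "Customers can get their money back" -- zero shared keywords
pricing_page  = [0.1, 0.9, 0.3]   # "See our pricing tiers for details"

# Different wording but similar meaning ranks higher than a merely adjacent topic.
print(cosine_similarity(refund_policy, money_back) >
      cosine_similarity(refund_policy, pricing_page))   # True
```

Keyword search can only find `refund` where the word `refund` appears; vector search finds the money-back paragraph too.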
The Solution: Auto-Vectorize Drive Files Into Postgres
This n8n workflow turns your Google Drive folder into a steady ingestion pipeline for semantic search. It runs on a schedule (every 4 hours by default) or manually when you hit “Test workflow,” then looks inside a chosen Drive folder for new files. Each file is downloaded, routed by type (PDF, TXT, or JSON), and parsed into clean text. From there, the workflow splits the content into chunks, generates OpenAI embeddings using the text-embedding-3-small model, and inserts everything into Postgres with PGVector so it’s immediately query-ready for RAG, internal search, or an AI agent. After a successful insert, the file gets moved to a “vectorized” folder so the pipeline stays tidy and deduplicated.
The workflow starts with a Drive folder scan, then processes files in batches so it doesn’t choke on a big upload. It extracts text based on MIME type, generates embeddings, stores them in your PGVector table, and finally relocates the source file to mark it as done.
What You Get: Automation vs. Results
| What This Workflow Automates | Results You’ll Get |
|---|---|
| Scanning a Drive folder on a schedule, downloading new files, and extracting text from PDFs, TXT, and JSON | New documents become semantically searchable without anyone indexing them by hand |
| Chunking content, generating OpenAI embeddings, and inserting them into Postgres (PGVector) | A query-ready vector store for RAG, internal search, or an AI agent |
| Moving processed files to a “vectorized” folder | No duplicate processing, and a source folder that only shows what’s left to ingest |
Example: What This Looks Like
Say your team drops 20 new docs a week into a shared Drive folder (a mix of PDFs, meeting notes, and JSON exports). Manually, someone usually spends about 10 minutes per doc downloading, skimming, and pulling the right excerpt when questions pop up, which is roughly 3 hours weekly. With this workflow, it’s closer to 5 minutes to drop the doc in the right folder and forget it; the next scheduled run ingests everything automatically. After that, answers are a database query away.
What You’ll Need
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- Google Drive to store files and monitor folders.
- Postgres with PGVector to store embeddings for search.
- OpenAI API key (get it from the OpenAI API dashboard).
Skill level: Intermediate. You’ll connect credentials, paste folder IDs, and confirm your Postgres/PGVector table settings.
Don’t want to set this up yourself? Talk to an automation expert (free 15-minute consultation).
How It Works
A schedule (or manual run) kicks things off. You can let the built-in schedule run it automatically (the template is configured for every 4 hours) or click “Test workflow” when you want to ingest files right now.
The workflow finds files in your chosen Drive folder. It looks up the folder, then iterates through items in batches, which keeps runs stable even when someone dumps in a big backlog.
Each file is downloaded and converted to text. A file-type router sends PDFs to the PDF parser, plain text to a text parser, and JSON to a JSON parser so you get consistent text out the other end.
Embeddings are generated and stored in Postgres (PGVector). The content is chunked, vectors are created with OpenAI’s embedding model, and then inserted into your configured PGVector collection for semantic search.
You can easily modify supported file types to include things like DOCX or Markdown based on your needs. See the full implementation guide below for customization options.
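“Query-ready” means a nearest-neighbour SQL query is all it takes downstream. Here’s a minimal sketch of building such a query, assuming the `collection_vectors` table from the setup below, a pgvector `embedding` column, and a `document` text column (n8n’s default PGVector layout; adjust names if yours differ). You’d embed the question with the same OpenAI model, then pass that vector as the parameter:

```python
def build_similarity_query(table: str = "collection_vectors", top_k: int = 5) -> str:
    # pgvector's `<=>` operator computes cosine distance: lower = more similar.
    # The %s placeholder takes the query embedding (e.g. via psycopg).
    return (
        f"SELECT document, metadata, embedding <=> %s::vector AS distance "
        f"FROM {table} "
        f"ORDER BY distance "
        f"LIMIT {top_k}"
    )

sql = build_similarity_query()
print(sql)
```

Run it with any Postgres client; the five rows with the smallest distance are the passages closest in meaning to the question.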
Step-by-Step Implementation Guide
Step 1: Configure the Manual and Schedule Triggers
Set up the two triggers that can start the workflow manually or on a schedule.
- Open Manual Execution Start and keep it as the on-demand trigger for testing and manual runs.
- Open Scheduled Automation Trigger and set the schedule rule to run every 4 hours (Interval → hours → Hours Interval = `4`).
- Verify both triggers connect to Find Drive Folder so either start point flows into the same processing path.
Step 2: Connect Google Drive and Locate Files
Configure Google Drive access and define where the workflow will search for files to process.
- Open Find Drive Folder and set Resource to `fileFolder`, Return All to `true`, and Options → Fields to `name` and `id`.
- Set Filter → Folder ID to the source folder using `[YOUR_ID]`, and What to Search to `files`.
- In Iterate Item Batches, keep the default settings to process files one by one from the folder results.
- Open Retrieve Drive File and set Operation to `download` with File ID set to `{{ $json.id }}`.
- Credential Required: Connect your googleDriveOAuth2Api credentials in Find Drive Folder, Iterate Item Batches (if prompted), and Retrieve Drive File.
Replace `[YOUR_ID]` in Find Drive Folder with the actual Google Drive folder ID, or no files will be returned.

Step 3: Route Files and Extract Content
Use a file-type router to send each file to the correct parser before vectorization.
- Open Route by File Type and confirm the three rules match MIME types using `{{ $binary["data"].mimeType }}` against the values `application/pdf`, `text/plain`, and `application/json`.
- Ensure Route by File Type outputs to Parse PDF Content, Parse Text Content, and Parse JSON Content based on the matching MIME type.
- Set Parse PDF Content to Operation `pdf`, Parse Text Content to Operation `text`, and Parse JSON Content to Operation `fromJson`.
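The Switch node’s routing boils down to a MIME-type lookup. A Python sketch of the same decision (the mapping mirrors the three rules above; extend it with more entries, e.g. for DOCX, to support new types):

```python
# Mirrors the Route by File Type switch: MIME type -> parser operation.
PARSER_BY_MIME = {
    "application/pdf": "pdf",        # -> Parse PDF Content
    "text/plain": "text",            # -> Parse Text Content
    "application/json": "fromJson",  # -> Parse JSON Content
}

def route_file(mime_type: str) -> str:
    try:
        return PARSER_BY_MIME[mime_type]
    except KeyError:
        # Unmatched types fall through in the workflow; add entries to extend support.
        raise ValueError(f"No parser configured for MIME type: {mime_type}")

print(route_file("application/pdf"))  # pdf
```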
Step 4: Set Up AI Processing and Vector Storage
Chunk, enrich, embed, and store the extracted content in PGVector.
- Configure Recursive Text Chunker with Chunk Overlap set to `50`.
- Open Standard Data Loader and verify the metadata mapping values are set to `{{ $('Retrieve Drive File').item.json.name }}` for filename and `{{ $('Retrieve Drive File').item.json.id }}` for id.
- Open PGVector Storage Insert and set Mode to `insert`, Table Name to `collection_vectors`, and Collection Name to `workflow_generator` with Collection Table Name `embedding_collections`.
- Confirm OpenAI Embedding Generator is connected as the embedding model for PGVector Storage Insert.
- Credential Required: Connect your postgres credentials in PGVector Storage Insert.
- Credential Required: Connect your openAiApi credentials in OpenAI Embedding Generator (this node provides embeddings to PGVector Storage Insert).
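For intuition, here is a simplified sliding-window version of what the chunker does with a 50-character overlap. (The real Recursive Character Text Splitter also prefers to break on separators like paragraphs and sentences, which this sketch skips; the chunk size of 500 is an illustrative assumption, not the node’s setting.)

```python
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    # Each chunk starts `chunk_size - overlap` characters after the previous
    # one, so neighbouring chunks share `overlap` characters of context.
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("x" * 1200)
print(len(chunks))                        # 3 chunks for 1200 characters
print(chunks[0][-50:] == chunks[1][:50])  # True: 50-char overlap preserved
```

The overlap is what keeps a sentence that straddles a chunk boundary retrievable from either side.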
Step 5: Configure File Relocation After Processing
Move files to a destination folder once they are inserted into PGVector.
- Open Relocate Drive File and set Operation to `move`.
- Set File ID to `{{ $('Iterate Item Batches').item.json.id }}` and choose Drive ID `My Drive`.
- Set Folder ID to the target folder using `[YOUR_ID]` (cached name `vectorized`).
- Credential Required: Connect your googleDriveOAuth2Api credentials in Relocate Drive File.
If `[YOUR_ID]` in Relocate Drive File is not updated, processed files will fail to move and remain in the source folder.

Step 6: Test and Activate Your Workflow
Validate the full flow and then enable automation for production use.
- Click Manual Execution Start and run the workflow to test with a sample file from the source folder.
- Confirm a successful run shows Route by File Type selecting the correct parser and PGVector Storage Insert inserting into `collection_vectors`.
- Verify the file is moved by Relocate Drive File into the destination folder.
- When satisfied, activate the workflow so Scheduled Automation Trigger runs every 4 hours automatically.
Common Gotchas
- Google Drive credentials can expire or need specific permissions. If things break, check the Google Drive OAuth connection inside n8n’s Credentials page first.
- If you add Wait nodes or call external services, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
- OpenAI requests can fail on quota or billing issues, and it’s easy to miss. If embeddings suddenly stop, check your OpenAI API usage limits and make sure the key in the Embeddings node matches the active project.
Frequently Asked Questions
How long does this take to set up?

About an hour if your Drive and Postgres access is ready.
Do I need to know how to code?

No. You will mostly paste IDs, connect accounts, and test a run.
Is n8n free to use?

Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in OpenAI API costs, which are usually a few cents per document depending on length.
Where should I host n8n?

Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.
Can this handle DOCX files or scanned PDFs?

Yes, but you’ll add one or two extraction steps. You can route DOCX by extending the “Route by File Type” logic and adding another “Extract from File” parser for that MIME type. For scanned PDFs, you typically insert an OCR step before “Parse PDF Content,” then pass the OCR text into the same chunking and embedding path. Common tweaks also include changing chunk size in the text splitter and writing extra metadata fields (like department, client, or doc type) into the PGVector records.
What should I check if the Google Drive connection stops working?

Usually it’s expired OAuth consent or the wrong Google account connected. Reconnect the Google Drive credential in n8n, then confirm the account can access both the source folder and the “vectorized” folder. If it still fails, check that the folder IDs in the Find/Search/Move Drive nodes are correct and that the Drive API isn’t restricted by your workspace admin.
How many files can this workflow handle?

A lot, as long as your server and database can keep up. On n8n Cloud, execution volume depends on your plan, while self-hosting removes hard execution limits and shifts the bottleneck to CPU/RAM and Postgres performance. In practical terms, most teams start with a small batch size, ingest a backlog overnight, then let the regular schedule handle new files going forward. If you plan to ingest thousands of large PDFs, you’ll want to tune batching and make sure PGVector indexes are set up properly.
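As a sketch of that index tuning: pgvector supports approximate-nearest-neighbour indexes, and the DDL below shows the two common options. The table and column names assume the `collection_vectors` setup from Step 4, HNSW assumes pgvector ≥ 0.5, and `vector_cosine_ops` matches cosine-distance queries; run either statement with any Postgres client.

```python
# DDL sketches for speeding up similarity search at scale; names assume the
# table configured in Step 4 -- adjust them to match your schema.
CREATE_HNSW_INDEX = """
CREATE INDEX IF NOT EXISTS idx_collection_vectors_embedding
ON collection_vectors
USING hnsw (embedding vector_cosine_ops);
"""

# IVFFlat works on older pgvector versions; `lists` should scale with row
# count (a common starting point is roughly rows / 1000).
CREATE_IVFFLAT_INDEX = """
CREATE INDEX IF NOT EXISTS idx_collection_vectors_embedding_ivf
ON collection_vectors
USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
"""

print(CREATE_HNSW_INDEX.strip())
```

Without an index, every query scans every vector; with one, lookups stay fast as the backlog grows into the thousands.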
Why use n8n instead of Zapier or Make?

For this workflow, n8n has a few advantages: more complex logic with unlimited branching at no extra cost, a self-hosting option for unlimited executions, and native LangChain/PGVector-style building blocks that many no-code tools don’t handle cleanly. The batching is also a big deal when you ingest backlogs, because you can control throughput instead of timing out. Zapier or Make can still work if you only need “new file → send notification,” but embeddings plus a database insert is where they start to feel awkward. If you’re unsure, talk to an automation expert and sanity-check the best path. Choosing the wrong tool here gets expensive later.
Once this is running, new Drive files stop being “stuff you uploaded” and start being a searchable knowledge source your tools can actually use. Set it up once, then let the folder stay ready.
Need Help Setting This Up?
Our automation experts can build and customize this workflow for your specific needs. Free 15-minute consultation—no commitment required.