Google Gemini to Google Sheets, PDFs to rows fast
Your inbox (or fax queue) fills up with scanned PDFs, and the same thing happens every time. Someone opens the file, squints at low-quality text, then retypes it into a spreadsheet and hopes they didn’t swap a digit.
This is where Gemini Sheets automation pays off fast. Healthcare administrators feel it when referrals pile up, medical billing teams feel it when patient IDs get mistyped, and office managers end up playing traffic cop for “just one more document.”
This n8n workflow turns faxes and scanned PDFs into clean Google Sheets rows using Google Gemini, then structures the fields so your sheet stays consistent. You’ll see how it works, what you need, and where teams usually trip up.
How This Automation Works
The full n8n workflow, from trigger to final output:
n8n Workflow Template: Google Gemini to Google Sheets, PDFs to rows fast
flowchart LR
subgraph sg0["On form submission Flow"]
direction LR
n0@{ icon: "mdi:cog", form: "rounded", label: "Extract from File", pos: "b", h: 48 }
n1@{ icon: "mdi:cog", form: "rounded", label: "Google Drive", pos: "b", h: 48 }
n2["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Call Gemini 2.0 Flash with P.."]
n3@{ icon: "mdi:swap-vertical", form: "rounded", label: "Define Prompt", pos: "b", h: 48 }
n4@{ icon: "mdi:cog", form: "rounded", label: "Upload file", pos: "b", h: 48 }
n5@{ icon: "mdi:database", form: "rounded", label: "Append row in sheet", pos: "b", h: 48 }
n6@{ icon: "mdi:robot", form: "rounded", label: "Basic LLM Chain", pos: "b", h: 48 }
n7@{ icon: "mdi:brain", form: "rounded", label: "Google Gemini Chat Model", pos: "b", h: 48 }
n8@{ icon: "mdi:robot", form: "rounded", label: "Structured Output Parser", pos: "b", h: 48 }
n9["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/form.svg' width='40' height='40' /></div><br/>On form submission"]
n10["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/code.svg' width='40' height='40' /></div><br/>Code"]
n10 --> n6
n4 --> n3
n1 --> n0
n3 --> n1
n6 --> n5
n0 --> n2
n9 --> n4
n7 -.-> n6
n8 -.-> n6
n2 --> n10
end
%% Styling
classDef trigger fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
classDef ai fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef aiModel fill:#e8eaf6,stroke:#3f51b5,stroke-width:2px
classDef decision fill:#fff8e1,stroke:#f9a825,stroke-width:2px
classDef database fill:#fce4ec,stroke:#c2185b,stroke-width:2px
classDef api fill:#fff3e0,stroke:#e65100,stroke-width:2px
classDef code fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef disabled stroke-dasharray: 5 5,opacity: 0.5
class n9 trigger
class n6,n8 ai
class n7 aiModel
class n5 database
class n2 api
class n10 code
classDef customIcon fill:none,stroke:none
class n2,n9,n10 customIcon
The Problem: PDFs Turn Into Spreadsheet Chaos
“It’s just a quick copy-paste” is the biggest lie in document-heavy operations. A fax comes in as a PDF, you download it, you zoom in, you hunt for the patient name or matter number, then you paste it into a sheet that already has five different formatting styles. Next week, someone needs that data for reporting, and suddenly you realize half the rows don’t match the headers, dates are inconsistent, and a few IDs are simply wrong. Even when you catch errors, you usually catch them late, after follow-ups and rework.
The friction compounds. Not because one PDF is hard, but because you get dozens (sometimes hundreds) and each one steals focus.
- Manual extraction turns into about 10 minutes per document once you include downloading, reading, and double-checking.
- Scanned PDFs often have messy layouts, so the same field appears in a different place every time.
- One small typo can create a billing delay, a compliance headache, or a “who entered this?” investigation.
- Your spreadsheet becomes unreliable, which means people stop trusting it and start building side lists.
The Solution: Extract Fields With Gemini, Append to Sheets
This workflow starts with a simple upload form inside n8n. A teammate uploads the fax or scanned PDF, and the file is temporarily stored in Google Drive so it’s easy to retrieve and process consistently. From there, the workflow reads the PDF content, sends it to Google Gemini (through an HTTP request configured for Gemini’s PDF capabilities), and asks it to identify the specific fields you care about. Then a second AI step takes the raw extraction and formats it into a strict JSON structure, so you get predictable keys that match your Google Sheet headers. Finally, n8n appends a clean new row to Google Sheets, turning each PDF into usable data without retyping.
The workflow kicks off when someone submits the form with a PDF. Gemini reads the document, a formatting step cleans and structures the fields, and Google Sheets receives one tidy row per file. Done.
What You Get: Automation vs. Results
| What This Workflow Automates | Results You’ll Get |
|---|---|
|
|
Example: What This Looks Like
Say your office processes 30 fax PDFs a day, and manual entry takes about 10 minutes each once you open the file, find the right fields, and recheck the row. That’s roughly 5 hours of attention every day, and it’s the kind of work that drains people. With this workflow, a teammate spends maybe 1 minute uploading each file (about 30 minutes total), then n8n and Gemini do the reading and structuring in the background. You still spot-check a few rows, but you’re no longer stuck in copy-paste mode.
What You’ll Need
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- Google Drive for storing uploaded PDFs temporarily.
- Google Sheets to store extracted fields as rows.
- Google Gemini API key (get it from Google Cloud with Vertex AI enabled).
Skill level: Intermediate. You’ll connect Google credentials, add an API key, and align your sheet columns with the workflow’s JSON fields.
Don’t want to set this up yourself? Talk to an automation expert (free 15-minute consultation).
How It Works
A form submission starts everything. Someone uploads a fax or scanned PDF through an n8n form, which gives you one standard intake path instead of “email it to me” chaos.
The file is stored and reloaded reliably. n8n uploads the PDF to Google Drive, then downloads it again for processing, so the extraction step always works from the same kind of file input.
Gemini reads the PDF and pulls out the fields you care about. The workflow extracts file content, sends it to Google Gemini via an HTTP request designed for PDF analysis, then runs a cleanup step so the next AI stage gets cleaner text.
Your data becomes structured rows. A second Gemini chat model formats the extraction into strict JSON, a structured parser validates it, and Google Sheets gets a new appended row that matches your column headers.
You can easily modify the fields you extract to match your exact spreadsheet headers based on your needs. See the full implementation guide below for customization options.
Step-by-Step Implementation Guide
Step 1: Configure the Form Submission Trigger
Set up the form that accepts the fax PDF upload and starts the workflow.
- Add and open Form Submission Trigger.
- Set Form Title to
Fax File. - Set Form Description to
User can Upload the Fax PDF directly here!. - In Form Fields, ensure the single file field label is
Upload Fileand required.
Upload_File.Step 2: Connect Google Drive for Upload and Download
Upload the submitted file to Google Drive and retrieve it for processing.
- Open Drive File Uploader and set Name to
Fax test. - Select your Drive and target folder (e.g.,
FAX Test). - Set Input Data Field Name to
Upload_File. - Credential Required: Connect your
googleDriveOAuth2Apicredentials in Drive File Uploader. - Open Drive File Downloader and set Operation to
download. - Set File ID to
={{ $('Drive File Uploader').item.json.id }}. - Credential Required: Connect your
googleDriveOAuth2Apicredentials in Drive File Downloader.
Step 3: Define the Analysis Prompt and Extract PDF Content
Prepare the analysis instruction and convert the PDF file to binary data for Gemini.
- Open Prompt Definition and ensure the prompt assignment contains the full instruction text for extracting medical fields.
- Open File Content Extractor and set Operation to
binaryToPropery.
{{ $('Prompt Definition').item.json.prompt }}, so keep it detailed and precise.Step 4: Configure Gemini PDF Analysis and Text Cleanup
Send the PDF content to Gemini and normalize the response text for structured parsing.
- Open Gemini PDF Analysis Request and set URL to
https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-exp:generateContent. - Set Method to
POSTand Specify Body tojson. - Set JSON Body to
={ ... "data": "{{ $json.data }}", "text": "{{ $('Prompt Definition').item.json.prompt }}" ... }(use the full body from the node). - Credential Required: Connect your
googlePalmApicredentials in Gemini PDF Analysis Request. - Open Text Cleanup Script and keep the provided JavaScript to strip markdown and return
extractedText.
Step 5: Set Up LLM Formatting and Structured Parsing
Format cleaned text into a strict JSON schema using Gemini and a structured output parser.
- Open LLM Formatting Chain and keep Prompt Type set to
define. - Ensure the prompt includes the schema and the text placeholder
{{ $json.extractedText }}. - Confirm Has Output Parser is enabled.
- Open Gemini Chat Model and add Credential Required: Connect your
googlePalmApicredentials. - Open Structured JSON Parser and keep the sample schema JSON as the output format guide.
Step 6: Configure Google Sheets Output
Append the structured data to your spreadsheet with mapped columns.
- Open Sheet Row Appender and set Operation to
append. - Select your target spreadsheet in Document ID (e.g.,
Fax Test Extracted). - Select the target sheet in Sheet Name (e.g.,
Sheet1). - Map each column to its output expression, such as Patient ID →
={{ $json.output.PatientID }}and Mental Status →={{ $json.output['Mental Status'] }}. - Credential Required: Connect your
googleSheetsOAuth2Apicredentials in Sheet Row Appender.
Step 7: Test and Activate Your Workflow
Run a manual test with a sample fax PDF and verify rows are appended to your sheet.
- Click Execute Workflow and submit a test PDF through Form Submission Trigger.
- Verify that Drive File Uploader stores the file and Drive File Downloader retrieves it.
- Confirm Gemini PDF Analysis Request returns content and LLM Formatting Chain outputs valid JSON.
- Check your Google Sheet to ensure Sheet Row Appender added a new row with populated fields.
- When satisfied, toggle the workflow to Active for production use.
Common Gotchas
- Google Drive credentials can expire or lack folder permissions. If uploads or downloads fail, check the Google connection in n8n Credentials and confirm the target Folder ID is accessible.
- If you’re using Wait nodes or external rendering, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
- Default prompts in AI nodes are generic. Add your brand voice early or you’ll be editing outputs forever.
Frequently Asked Questions
About 45 minutes if your Google accounts and sheet are ready.
No. You’ll mostly connect accounts, paste an API key, and match your JSON fields to your sheet headers.
Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in Google Gemini API usage through Google Cloud, which can add a small cost per document depending on model and size.
Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.
Yes, and you should. Update the field list in the “Define Prompt” step so Gemini knows exactly what to look for, then adjust the JSON schema in the LLM formatting chain so the keys match your Google Sheets column headers. Common tweaks include adding a new identifier field, tightening date formats (DOB), and forcing a specific output for blank or unreadable fields like “UNKNOWN.” If you change the schema, also update the “Append row in sheet” mapping so the right values land in the right columns.
Usually it’s the API key or the wrong API being enabled in Google Cloud. Make sure Vertex AI is enabled for your project, confirm the credential type used in n8n matches your Gemini setup, and re-save the credentials in n8n after rotating keys. Another common issue is request size, since large PDFs can hit limits, so test with a smaller file to isolate the problem.
It depends on your n8n plan and your server, but most small teams run dozens to a few hundred PDFs per day without issues once credentials and prompts are stable.
Often, yes, because this is more than a simple “PDF in, row out” zap. You’ve got two AI stages, structured parsing, and room for branching when a field is missing or a document type changes. n8n also gives you a self-hosted path, which matters once you’re processing lots of documents and don’t want every run to feel expensive. Zapier or Make can still work if your logic stays simple and you value a guided UI over flexibility. If you’re unsure, Talk to an automation expert and describe your document volume and compliance needs.
Once this is running, PDFs stop being a bottleneck and start behaving like data. You’ll feel the difference the next time the fax queue spikes.
Need Help Setting This Up?
Our automation experts can build and customize this workflow for your specific needs. Free 15-minute consultation—no commitment required.