Firecrawl to Google Sheets, audit URLs neatly sorted
You finally sit down to start a site audit, and you lose the first hour just hunting for the right URLs. Someone sends a sitemap link, another person pastes a list into Slack, and now you’re deduping rows and guessing what’s missing. Again.
This problem hits SEO leads hardest, but content managers and agency operators feel it too. You need a clean, sortable URL list that’s ready for auditing, not another messy “starter sheet” you don’t quite trust.
This workflow pulls a sitemap-style URL source, verifies the crawl succeeded, duplicates your Google Sheets template, then writes neatly structured rows so you can start auditing immediately.
How This Automation Works
The full n8n workflow, from trigger to final output:
n8n Workflow Template: Firecrawl to Google Sheets, audit URLs neatly sorted
[Workflow diagram, left to right: When chat message received → Map a website and get urls → Firecrawl OK. From Firecrawl OK the flow branches either to Copy template → Sorting URL into table → Data mapping, or to Bad URL.]
The Problem: URL Lists Are Never Audit-Ready
URL collection sounds simple until you do it for real websites. Sitemaps contain URLs you don’t want, teams paste in partial lists, and “canonical” pages get mixed with parameter variants and staging leftovers. Then you spend your best focus time cleaning instead of finding issues. It’s not just time, either. A sloppy starting list means you miss orphaned sections, audit the wrong templates, and argue about what “the site” even includes.
The friction compounds. Here’s where it breaks down in day-to-day work.
- You end up copy-pasting hundreds (or thousands) of URLs into Sheets, then manually sorting and grouping them.
- Crawls fail silently, so you only notice something’s wrong when the sheet looks suspiciously empty.
- Everyone uses a different spreadsheet format, which makes audits hard to compare across clients or quarters.
- Duplicates and messy URL variants sneak in, which means your “audit findings” don’t map cleanly to real templates.
The Solution: Firecrawl → Sheets, Structured for Audits
This workflow starts with a simple chat-style trigger in n8n. You provide a website map or URL source, and Firecrawl requests the site map data so you’re not manually extracting links. Next, the workflow checks if the crawl actually succeeded. If it didn’t, you get an immediate response so you can fix the input instead of debugging a half-filled spreadsheet later. When it works, n8n duplicates a Google Sheets template from Google Drive, builds a clean URL hierarchy (so URLs are grouped in a consistent way), and appends the rows into the new sheet. The end result is a tidy, audit-ready spreadsheet you can filter and share in minutes.
The workflow kicks off from a chat message, pulls the map via Firecrawl, then routes based on success or failure. On success it creates a fresh client-ready sheet from your template, structures the URLs, and writes them into Google Sheets automatically.
What You Get: Automation vs. Results
| What This Workflow Automates | Results You’ll Get |
|---|---|
Example: What This Looks Like
Say you kick off 5 audits a month and each site has roughly 500 URLs you need to triage. Manually, it’s easy to spend about 10 minutes pulling a sitemap, 30 minutes cleaning it, and another 20 minutes rebuilding your “audit starter” sheet. Call it about 1 hour per site. With this workflow, you send the sitemap/source once, wait a minute or two for processing, and a fresh Google Sheet is created and filled. That’s roughly 4–5 hours back every month, without changing how you audit.
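The arithmetic behind that estimate can be checked with a few lines of JavaScript. This is only a sketch using the figures from the example above; `monthlyHoursSaved` is a hypothetical helper, and the 2-minute automated figure is an assumed upper bound for "a minute or two".

```javascript
// Figures taken from the example above; adjust them to your own process.
const auditsPerMonth = 5;
const manualMinutesPerSite = 10 + 30 + 20; // pull sitemap + clean list + rebuild starter sheet
const automatedMinutesPerSite = 2; // assumed upper bound for "a minute or two"

// Hypothetical helper: hours saved per month across all audits.
function monthlyHoursSaved(audits, manualMinutes, automatedMinutes) {
  return (audits * (manualMinutes - automatedMinutes)) / 60;
}

const saved = monthlyHoursSaved(
  auditsPerMonth,
  manualMinutesPerSite,
  automatedMinutesPerSite
);
console.log(saved.toFixed(1)); // 4.8
```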
What You’ll Need
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- Firecrawl for sitemap/map extraction and crawling.
- Google Drive to duplicate your Sheets template file.
- Google Sheets to store the audit-ready URL list.
Skill level: Beginner. You’ll connect accounts, pick a template file, and paste a URL source into the trigger.
Don’t want to set this up yourself? Talk to an automation expert (free 15-minute consultation).
How It Works
A chat message triggers the run. You submit the site map URL (or the URL source you use internally) through the workflow’s incoming chat trigger in n8n.
Firecrawl requests the website map data. The workflow calls Firecrawl to fetch the URLs so you don’t have to export, scrape, or copy-paste lists from different tools.
A success check decides what happens next. If the crawl fails, n8n returns a quick “invalid” response so you can correct the input. If it succeeds, the workflow continues automatically.
Your sheet is created, structured, and filled. n8n duplicates your Google Sheets template from Google Drive, builds a URL hierarchy in code, and appends the finished rows to Google Sheets for a clean audit starting point.
You can easily modify the hierarchy rules to match your audit style, so “/blog/” and “/product/” get grouped exactly how your team prefers. See the full implementation guide below for customization options.
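As an illustration of what a grouping rule could look like, here is a minimal sketch that buckets URLs by their first path segment. It is not the code shipped in the workflow; `topLevelGroup` and `groupUrls` are hypothetical helpers, and the example.com URLs are made up.

```javascript
// Hypothetical helper: use the first path segment as the group key.
function topLevelGroup(rawUrl) {
  const { pathname } = new URL(rawUrl);
  const segments = pathname.split("/").filter(Boolean);
  return segments.length > 0 ? `/${segments[0]}/` : "/";
}

// Hypothetical helper: collect URLs under their group key, in first-seen order.
function groupUrls(urls) {
  const groups = new Map();
  for (const url of urls) {
    const key = topLevelGroup(url);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(url);
  }
  return groups;
}

const grouped = groupUrls([
  "https://example.com/blog/seo-checklist",
  "https://example.com/product/widget",
  "https://example.com/blog/site-audit",
  "https://example.com/",
]);
console.log([...grouped.keys()]); // [ '/blog/', '/product/', '/' ]
```

Changing what `topLevelGroup` returns (for example, skipping a language prefix such as /fr/) is one way to adapt the grouping to a team's audit style.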
Step-by-Step Implementation Guide
Step 1: Configure the Chat Trigger
This workflow starts when a chat message is received and passes the submitted URL into the crawl request.
- Add and open Incoming Chat Trigger.
- Set Mode to `webhook`.
- Enable Public by setting it to `true`.
Step 2: Connect Firecrawl for Website Mapping
The chat input URL is sent to Firecrawl to generate a sitemap-only map.
- Add and open Website Map Request.
- Set URL to `={{ $json.chatInput }}`.
- Set Operation to `map`.
- Set Sitemap Only to `true` and Ignore Sitemap to `false`.
- Credential Required: Connect your firecrawlApi credentials.
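For orientation, the node settings above roughly correspond to an HTTP request like the one sketched below. This assumes Firecrawl's v1 `/map` REST endpoint with `sitemapOnly` and `ignoreSitemap` body fields; the n8n node may target a different API version with different field names, so check the Firecrawl API reference before relying on them. `buildMapRequest` and the `FIRECRAWL_API_KEY` placeholder are hypothetical, and the snippet only builds the request without sending it.

```javascript
// Hypothetical helper: builds (but does not send) a Firecrawl map request.
function buildMapRequest(chatInput, apiKey) {
  return {
    endpoint: "https://api.firecrawl.dev/v1/map", // assumed endpoint and version
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    // Mirrors the node settings: Sitemap Only = true, Ignore Sitemap = false.
    body: JSON.stringify({
      url: chatInput,
      sitemapOnly: true,
      ignoreSitemap: false,
    }),
  };
}

const request = buildMapRequest("https://example.com", "FIRECRAWL_API_KEY");
console.log(request.method, request.endpoint);
console.log(request.body);
```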
Step 3: Configure the Crawl Validation Logic
The workflow checks whether Firecrawl returned a successful crawl before continuing to build the sheet.
- Add and open Crawl Success Check.
- Configure the condition to evaluate Left Value as `={{ $json.success }}` with the boolean operator checking that the value is true.
- Ensure the true branch goes to Duplicate Sheet Template and the false branch goes to Invalid URL Reply.
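The branching rule is simple enough to express as a plain function. The sketch below is only an illustration of the condition, not code from the workflow: `nextNode` is a hypothetical helper, and it assumes a failed crawl yields `success: false` or no `success` field at all.

```javascript
// Hypothetical helper: returns the name of the node the item should go to.
function nextNode(mapResponse) {
  // Strict check: only a literal boolean true counts as success.
  const ok = Boolean(mapResponse) && mapResponse.success === true;
  return ok ? "Duplicate Sheet Template" : "Invalid URL Reply";
}

console.log(nextNode({ success: true, links: [] })); // Duplicate Sheet Template
console.log(nextNode({ success: false })); // Invalid URL Reply
console.log(nextNode({})); // Invalid URL Reply
```

The strict comparison means a string such as `"true"` would be routed to the failure branch, which is usually the safer default for a validation step.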
Step 4: Set Up the URL Processing Node
The code node builds a hierarchical structure from the mapped URLs and formats rows for Google Sheets.
- Add and open URL Hierarchy Builder.
- Paste the provided JavaScript into JavaScript Code exactly as in the workflow.
- Confirm the code references Website Map Request using `$('Website Map Request').item.json`.
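The workflow's own JavaScript is not reproduced in this guide. To give a sense of what a hierarchy builder can do, here is a minimal sketch under stated assumptions: the map response exposes a `links` array of URL strings, and the template has `URL`, `Depth`, and `Level 1` to `Level 3` columns. Both are assumptions, as are the `buildRows` helper and `MAX_LEVELS`; your template's columns and the real code may differ.

```javascript
const MAX_LEVELS = 3; // hypothetical: number of "Level N" columns in the template

// Hypothetical helper: turns a map response into one row per unique URL.
function buildRows(mapJson) {
  const links = Array.isArray(mapJson.links) ? mapJson.links : [];
  // Dedupe, then sort so URLs in the same folder end up next to each other.
  const unique = [...new Set(links)].sort();
  return unique.map((link) => {
    const segments = new URL(link).pathname.split("/").filter(Boolean);
    const row = { URL: link, Depth: segments.length };
    for (let i = 0; i < MAX_LEVELS; i++) {
      row[`Level ${i + 1}`] = segments[i] ?? "";
    }
    return { json: row }; // n8n items wrap their data in a `json` key
  });
}

// Inside the Code node, the last line would be something like:
//   return buildRows($('Website Map Request').item.json);
const rows = buildRows({
  success: true,
  links: [
    "https://example.com/blog/seo-checklist",
    "https://example.com/",
    "https://example.com/blog/seo-checklist",
    "https://example.com/product/widget/specs",
  ],
});
console.log(rows.map((r) => r.json));
```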
Step 5: Configure the Output Actions
On success, the workflow duplicates a template sheet and appends the hierarchy rows to the new sheet.
- Open Duplicate Sheet Template and set Operation to `copy`.
- Set Name to `={{ $('Incoming Chat Trigger').item.json.chatInput }} - n8n - Arborescence`.
- Set File ID to your template ID (replace `[YOUR_ID]`).
- Credential Required: Connect your googleDriveOAuth2Api credentials.
- Open Append Rows to Sheet and set Operation to `append`.
- Set Sheet Name to `FR`.
- Set Document ID to `={{ $('Duplicate Sheet Template').item.json.id }}`.
- Credential Required: Connect your googleSheetsOAuth2Api credentials.
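To make the expressions above concrete, the sketch below shows what they would resolve to for a sample run. `resolveOutputParams` is a hypothetical helper, and the `NEW_FILE_ID` value stands in for whatever ID Google Drive returns for the copied file.

```javascript
// Hypothetical helper: mimics how the n8n expressions resolve at run time.
function resolveOutputParams(chatInput, copiedFile) {
  return {
    // Name of the copied file ("Arborescence" is French for tree structure).
    name: `${chatInput} - n8n - Arborescence`,
    // Document ID comes from the output of the Drive copy step.
    documentId: copiedFile.id,
    // Tab name the rows are appended to; it must exist in the template.
    sheetName: "FR",
  };
}

const params = resolveOutputParams("https://example.com", { id: "NEW_FILE_ID" });
console.log(params.name); // https://example.com - n8n - Arborescence
console.log(params.documentId); // NEW_FILE_ID
```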
Step 6: Add Error Handling
If the crawl fails, the workflow returns a JSON response to the requester.
- Open Invalid URL Reply.
- Set Respond With to `json`.
- Set Response Body to `={ "text": "L'url {{ $('Chat input').item.json.chatInput }} n'est pas une url correcte ou elle n'est pas prise en compte par ce service" }`. The message is in French; in English it reads "The URL … is not a valid URL or is not supported by this service".
Note: This expression references a node named Chat input, but the trigger node is named Incoming Chat Trigger. If you encounter expression errors, update the expression to use `$('Incoming Chat Trigger').item.json.chatInput` so the response renders correctly.
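For reference, the JSON the requester should receive once the node reference is corrected can be sketched as follows. `buildInvalidUrlReply` is a hypothetical helper that reproduces the response body as a plain function; the French message text is taken from the workflow unchanged.

```javascript
// Hypothetical helper: builds the same body the Invalid URL Reply node returns.
function buildInvalidUrlReply(chatInput) {
  return {
    text: `L'url ${chatInput} n'est pas une url correcte ou elle n'est pas prise en compte par ce service`,
  };
}

const reply = buildInvalidUrlReply("not-a-url");
console.log(JSON.stringify(reply));
```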
Step 7: Test and Activate Your Workflow
Validate the end-to-end flow from chat input to sheet output before going live.
- Click Execute Workflow and send a test URL to Incoming Chat Trigger.
- Confirm Website Map Request returns `success: true` and flows into Crawl Success Check.
- Verify Duplicate Sheet Template creates a new sheet and Append Rows to Sheet appends rows in the `FR` tab.
- If a bad URL is sent, verify Invalid URL Reply returns a JSON response.
- Toggle the workflow to Active to enable production use.
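If you prefer to test from a script rather than the n8n editor, the sketch below prepares one valid and one invalid test message. It assumes the chat trigger's public webhook accepts a JSON body with `action`, `sessionId`, and `chatInput` fields, which is how n8n's chat widget is commonly described as calling it; verify this against your n8n version. `CHAT_URL` is a placeholder, `buildTestMessage` is hypothetical, and nothing is sent unless you uncomment the `fetch` call.

```javascript
// Placeholder: copy the real Chat URL from the Incoming Chat Trigger node.
const CHAT_URL = "https://YOUR-N8N-HOST/webhook/YOUR-WEBHOOK-ID/chat";

// Hypothetical helper: builds fetch options for a single chat message.
function buildTestMessage(chatInput, sessionId) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // Assumed payload shape for n8n's chat trigger webhook.
    body: JSON.stringify({ action: "sendMessage", sessionId, chatInput }),
  };
}

const goodMessage = buildTestMessage("https://example.com", "audit-test-1");
const badMessage = buildTestMessage("not-a-url", "audit-test-2");
console.log(goodMessage.body);
console.log(badMessage.body);

// To actually send a message (Node 18+ has a global fetch):
//   const res = await fetch(CHAT_URL, goodMessage);
//   console.log(res.status, await res.text());
```

The first message should end with a new sheet being created; the second should come back with the Invalid URL Reply JSON.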
Common Gotchas
- Google Drive permissions can block template duplication. If it fails, check the template file sharing settings and the Google connection used in n8n.
- If you’re using Wait nodes or external rendering, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
- This workflow has no AI nodes out of the box. If you add one later, keep in mind that default prompts are generic, so add your brand voice early or you’ll be editing outputs forever.
Frequently Asked Questions
How long does it take to set up this workflow?
About 20–40 minutes if your Google account is already connected.
Do I need coding skills?
No. You’ll connect Firecrawl and Google, then choose your template. The “URL hierarchy” part is already built, and you can tweak it later if you want.
Is n8n free to use for this workflow?
Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in Firecrawl usage costs, depending on how many pages you crawl.
Where can I host n8n?
Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.
Can I customize how the URLs are grouped?
Yes, and honestly it’s one of the best parts. You can adjust the grouping logic in the URL Hierarchy Builder node so folders like /blog/, /collections/, or language paths get their own categories. Common tweaks include stripping UTM parameters, forcing lowercase, or collapsing trailing slashes. If your sitemap includes subdomains you don’t want, you can filter them out before rows are appended.
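The tweaks mentioned above (stripping UTM parameters, forcing lowercase, collapsing trailing slashes, filtering unwanted subdomains) could look something like this. It is a minimal sketch, not the workflow's code: `normalizeUrl` and `cleanUrls` are hypothetical helpers, and lowercasing paths assumes your site treats paths case-insensitively, which is not true of every server.

```javascript
// Hypothetical helper: applies the normalization tweaks to a single URL.
function normalizeUrl(rawUrl) {
  const url = new URL(rawUrl);
  // Strip tracking parameters such as utm_source and utm_campaign.
  for (const key of [...url.searchParams.keys()]) {
    if (key.toLowerCase().startsWith("utm_")) url.searchParams.delete(key);
  }
  // Force lowercase and collapse trailing slashes, keeping "/" for the root.
  url.pathname = url.pathname.toLowerCase().replace(/\/+$/, "") || "/";
  return url.toString();
}

// Hypothetical helper: keeps one host only and dedupes after normalizing.
function cleanUrls(urls, allowedHost) {
  const seen = new Set();
  for (const raw of urls) {
    if (new URL(raw).hostname !== allowedHost) continue; // drop other subdomains
    seen.add(normalizeUrl(raw));
  }
  return [...seen];
}

const cleaned = cleanUrls(
  [
    "https://example.com/Blog/Post/?utm_source=newsletter",
    "https://example.com/blog/post",
    "https://staging.example.com/blog/post",
    "https://example.com/collections/?page=2",
  ],
  "example.com"
);
console.log(cleaned);
// [ 'https://example.com/blog/post', 'https://example.com/collections?page=2' ]
```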
Why is my Firecrawl connection or crawl failing?
Most of the time it’s an invalid URL source, a blocked site, or missing/expired Firecrawl credentials in n8n. Also check the crawl success output in the “Crawl Success Check” branch, because the workflow will intentionally stop and reply when Firecrawl returns a failure.
How many URLs can this workflow handle?
A few thousand URLs is normal, as long as your crawl source and account limits allow it.
Is n8n a better fit than Zapier or Make for this?
For sitemap-to-sheet workflows, n8n is usually a better fit because you can handle branching (success vs. failure), custom URL formatting, and template duplication without bolting on extra paid steps. Zapier or Make can work, but you’ll often feel boxed in once you want hierarchy rules, deduping, or smarter validation. n8n also gives you self-hosting, which matters if you run lots of audits. The real question is how often you’ll change the logic after month one. If you want help choosing, talk to an automation expert.
Once your URL list is clean and repeatable, audits get lighter. The workflow handles the setup work so you can spend your attention on findings that actually move rankings.