January 22, 2026

Firecrawl + Google Sheets: site maps you can use

Lisa Granqvist, Workflow Automation Expert

You grab a competitor’s URL, start clicking around, and 30 minutes later you’ve got 12 tabs open and no usable list of pages. Worse, you still don’t know what’s a product page, what’s a category, and what’s just fluff.

This Firecrawl Sheets mapping automation is aimed first at marketers doing competitor research, but sales teams doing lead enrichment and agency operators building audits feel the same pain. The outcome is simple: a clean Google Sheet with the site mapped and URLs sorted into product, category, and “other” tabs.

You’ll see exactly how the workflow pulls a full internal sitemap, extracts company insights, classifies URLs with AI, and writes everything into structured Sheets you can filter immediately.

How This Automation Works

Here’s the complete workflow you’ll be setting up:

n8n Workflow Template: Firecrawl + Google Sheets: site maps you can use

Why This Matters: Competitor Pages Are Hard to Catalog Fast

“Just map their site” sounds easy until you actually try it. You start from the homepage, click categories, open product pages, then hit a wall because navigation hides half the catalog behind filters and JavaScript. So you switch to a crawler, export a messy CSV, and still spend your afternoon sorting URLs by hand. It’s not only time. It’s context switching, second guessing, and redoing the same work next month because nothing is structured in a way your team can reuse.

The friction compounds. Here’s where it breaks down in real life.

  • People copy and paste URLs into a sheet, but the list is incomplete and the naming is inconsistent.
  • Exports from crawling tools often include tracking parameters and duplicates, so you waste time cleaning before you can even analyze.
  • Someone has to manually label pages as “product” or “category,” and mistakes quietly ruin the conclusions you draw later.
  • When you revisit the research, you can’t compare changes because last time’s data wasn’t mapped the same way.

What You’ll Build: An AI-Classified Site Map in Google Sheets

This workflow turns a single website URL into a structured, reusable research asset. It starts when you submit a domain through an n8n form. The workflow fetches the homepage, extracts the main text, and cleans it up so an AI agent can pull high-level company insights like industry, audience, and whether the business is B2B or B2C. Then Firecrawl maps the site and returns internal URLs with helpful metadata. Those URLs get processed in batches, classified by an AI evaluator into product pages, category pages, or non-commerce pages, and finally written into dedicated Google Sheets tabs. You end with a spreadsheet that feels like it was prepared by an analyst, not dumped out of a crawler.

The workflow starts with intake and homepage interpretation in n8n. After that, Firecrawl handles the heavy lifting of mapping internal links. Finally, AI classification and Google Sheets outputs give you clean tabs you can filter, share, and reuse for audits or competitor tracking.


Expected Results

Say you map 5 competitor sites in a week. Manually, you might spend about 2 hours per site between clicking, copying URLs, and labeling pages, so that’s roughly a full day lost. With this workflow, submitting each site takes a minute, then you wait while Firecrawl maps URLs and the AI classifier sorts them in batches (often 10–20 minutes per site depending on size). That’s still real time, but it’s not your time, and the output is already in clean Google Sheets tabs.

Before You Start

  • n8n instance (try n8n Cloud free)
  • Self-hosting option if you prefer (Hostinger works well)
  • Firecrawl for mapping internal URLs at scale
  • Google Sheets to store results in structured tabs
  • LLM credentials (Gemini or compatible) (get it from Google AI Studio or your provider dashboard)

Skill level: Beginner. You’ll connect accounts, paste API keys, and update a target Google Sheet.

Want someone to build this for you? Talk to an automation expert (free 15-minute consultation).

Step by Step

Form submission starts the run. You paste a website URL into the n8n form trigger, which becomes the single source of truth for the rest of the workflow.

The homepage gets cleaned for analysis. n8n pulls the homepage HTML via HTTP Request, extracts the readable text, and sanitizes it so the AI agent can summarize the business without getting distracted by navigation and boilerplate.

Firecrawl maps the whole site. Once the domain is recorded in Google Sheets, Firecrawl fetches internal URLs and the workflow interprets URL metadata, then unpacks the URL array into items that can be processed safely.
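In n8n terms, the unpack step can be sketched as a code node like the one below. It assumes Firecrawl's map response carries a `links` array of URL strings (check the response shape your plan actually returns) and emits n8n-style `{ json: ... }` items, deduplicating and stripping tracking parameters along the way:

```javascript
// Sketch of the "Unpack URL Array" code node. The `links` field name
// is an assumption about the Firecrawl map response — verify it against
// your own API output before relying on it.
function unpackUrls(mapResponse) {
  const links = mapResponse.links || [];
  const seen = new Set();
  const items = [];
  for (const raw of links) {
    const u = new URL(raw);
    u.search = ''; // drop ?utm_source=... and other tracking params
    const clean = u.toString();
    if (!seen.has(clean)) {
      seen.add(clean);
      items.push({ json: { url: clean } }); // n8n item shape
    }
  }
  return items;
}
```

Deduplicating here, before classification, also keeps AI token costs down, since every surviving URL becomes one classification input.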

URLs are classified and written into tabs. A split-in-batches loop sends URLs to the AI evaluator, a sorting step routes them into product/category/other branches, and Google Sheets nodes append rows into the right place.

You can easily modify the classification rules to exclude blogs or legal pages based on your needs. See the full implementation guide below for customization options.

Step-by-Step Implementation Guide

Step 1: Configure the Form Trigger

Set up the workflow entry point so submissions kick off the URL classification flow.

  1. Add and open Form Intake Trigger.
  2. Use the default webhook settings generated by the node (the webhook ID is created automatically).
  3. Ensure the form fields collect the website URL needed by Fetch Website Source.

Tip: Keep a dedicated field like website or url in the form response to simplify mapping in downstream code nodes.
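Inside a downstream code node, reading that field could look like this small sketch. The `website` and `url` field names are assumptions, so match them to however you labeled your form fields:

```javascript
// Hypothetical helper for pulling the submitted domain out of the
// form payload and normalizing it for the HTTP Request node.
function getSubmittedUrl(formJson) {
  const raw = formJson.website || formJson.url || '';
  // Prepend a scheme if the user typed a bare domain like "example.com".
  return /^https?:\/\//i.test(raw) ? raw : `https://${raw}`;
}
```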

Step 2: Connect Website Fetch and Extraction

Configure the initial crawl, HTML retrieval, and text cleanup chain.

  1. Open Fetch Website Source and set the request to the submitted URL (if required by your form structure).
  2. Verify the connection from Fetch Website Source → HTML Extraction → Sanitize HTML Text.
  3. Open HTML Extraction and configure selectors for the content you want to analyze (e.g., body text or specific elements).
  4. Review Sanitize HTML Text to ensure it outputs clean text for the AI model.

⚠️ Common Pitfall: If HTML Extraction returns empty data, confirm the target site is accessible and your selectors match the live HTML.
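If you'd rather sanitize in a code node than lean entirely on extraction settings, a minimal sketch looks like this. It's a rough regex-based cleanup, not a full HTML parser, so treat it as a starting point:

```javascript
// Sketch of a "Sanitize HTML Text" code node: drop boilerplate blocks
// (nav, footer, scripts, styles), strip remaining tags, and collapse
// whitespace so the AI agent sees only readable text.
function sanitizeHtml(html) {
  return html
    .replace(/<(script|style|nav|footer)\b[\s\S]*?<\/\1>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')   // strip any remaining tags
    .replace(/&nbsp;/g, ' ')
    .replace(/\s+/g, ' ')       // collapse runs of whitespace
    .trim();
}
```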

Step 3: Set Up AI Classification and Insights

Use Gemini to generate company insights and categorize URLs.

  1. Open Company Insight Agent and configure the prompt to summarize the company based on sanitized text.
  2. Credential Required: Connect your Google Gemini credentials in Company Insight Agent.
  3. Open Category AI Evaluator and configure the prompt to categorize URL batches.
  4. Credential Required: Connect your Google Gemini credentials in Category AI Evaluator.
  5. Confirm the flow: Sanitize HTML Text → Company Insight Agent → Decode JSON Payload → Update Domain Sheet → Website Map & URL Fetch → Interpret URL Metadata → Unpack URL Array → Batch Iterator → Category AI Evaluator.

Tip: Keep AI outputs in strict JSON format so Decode JSON Payload can parse them consistently.
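A defensive version of Decode JSON Payload might look like this sketch, which strips the markdown code fences models sometimes wrap around JSON before parsing:

```javascript
// Sketch of a "Decode JSON Payload" code node that tolerates the
// common failure modes of LLM JSON output.
function decodeJsonPayload(text) {
  // Models often wrap JSON in ```json fences; strip them first.
  const stripped = text
    .replace(/^```(?:json)?\s*/i, '')
    .replace(/```\s*$/, '')
    .trim();
  try {
    return JSON.parse(stripped);
  } catch (e) {
    // Fall back to the first {...} span if there's extra chatter.
    const m = stripped.match(/\{[\s\S]*\}/);
    if (m) return JSON.parse(m[0]);
    throw e;
  }
}
```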

Step 4: Configure URL Processing and Parallel Routing

Split and route categorized URLs into separate paths and prepare rows for Sheets.

  1. Open Sort URLs by Category and confirm it outputs structured buckets for category, product, and misc links.
  2. Verify parallel execution: Sort URLs by Category fans out to Process Category Links, Process Product Links, and Process Misc Links, so all three branches receive the same categorized output.
  3. Review the code nodes (Process Category Links, Process Product Links, Process Misc Links, plus other code nodes in the flow) to ensure they map the URL data into row-ready structures.
  4. Confirm Batch Iterator is connected after each append step to continue processing additional URL batches.

Tip: Because this workflow uses many code nodes, keep consistent field names (e.g., url, category, title) to avoid mismatches across branches.
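A shared row-mapping helper is one way to keep those names consistent across branches. This is a sketch; the `title` field is an assumption about what your evaluator returns, so trim it to the columns your Sheet actually has:

```javascript
// Sketch of the row-shaping logic shared by Process Category Links,
// Process Product Links, and Process Misc Links: every branch emits
// the same url / category / title fields so the append nodes line up.
function toRow(item, category) {
  return {
    json: {
      url: item.json.url,
      category,
      title: item.json.title || '', // assumed field; may be absent
    },
  };
}
```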

Step 5: Configure Google Sheets Outputs

Send the classified data to the appropriate Sheets tabs.

  1. Open Update Domain Sheet and set the target spreadsheet and sheet for domain-level data.
  2. Open Append Category Rows, Append Product Rows, and Append Misc Rows and set the spreadsheet and sheet names for each category.
  3. Credential Required: Connect your Google Sheets credentials in Update Domain Sheet, Append Category Rows, Append Product Rows, and Append Misc Rows.

⚠️ Common Pitfall: If rows are not appending, verify the sheet names match exactly and the columns align with your row structure.

Step 6: Test and Activate Your Workflow

Run a manual test to validate the full path from form intake to Sheets output.

  1. Click Execute Workflow and submit a sample form response to Form Intake Trigger.
  2. Confirm successful execution through Fetch Website Source, HTML Extraction, Sanitize HTML Text, and the AI nodes.
  3. Check that Update Domain Sheet and the append nodes write new rows to the correct Sheets tabs.
  4. When satisfied, toggle the workflow to Active to enable production processing.

Troubleshooting Tips

  • Google Sheets credentials can expire or need specific permissions. If things break, check the connected Google account in n8n’s Credentials and confirm the Sheet is shared with it.
  • If you’re using Wait nodes or external rendering, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
  • Default prompts in AI nodes are generic. Add your brand voice early or you’ll be editing outputs forever.

Quick Answers

What’s the setup time for this Firecrawl Sheets mapping automation?

About 30 minutes if your credentials are ready.

Is coding required for this site mapping outcome?

No. You will connect accounts, paste API keys, and edit a few fields like your target Sheet.

Is n8n free to use for this Firecrawl Sheets mapping workflow?

Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in Firecrawl and LLM API usage, which depends on how many pages you map.

Where can I host n8n to run this automation?

Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.

Can I modify this Firecrawl Sheets mapping workflow for different use cases?

Yes, and you probably should. You can adjust what counts as “product” or “category” by editing the prompt in the Category AI Evaluator, and you can add exclusions in the URL processing steps (like filtering out /blog, /careers, or /legal). Many teams also extend the Google Sheets output with extra columns like “priority,” “notes,” or “target keyword.” If you prefer OpenAI instead of Gemini, swap the chat model/agent node and keep the rest of the structure the same.
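As a concrete example, an exclusion filter dropped into one of the URL-processing code nodes can be as small as this (the prefix list is illustrative; extend it to whatever you want to skip):

```javascript
// Illustrative exclusion filter: skip URLs whose path starts with a
// non-commerce prefix before they ever reach the AI classifier.
const EXCLUDED_PREFIXES = ['/blog', '/careers', '/legal'];

function isCommerceCandidate(url) {
  const path = new URL(url).pathname;
  return !EXCLUDED_PREFIXES.some((prefix) => path.startsWith(prefix));
}
```

Filtering before classification also saves API spend, since excluded URLs never consume LLM tokens.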

Why is my Firecrawl connection failing in this workflow?

Usually it’s an API key problem or the wrong Firecrawl base settings in n8n. Regenerate the Firecrawl API key, update it in Credentials, then rerun with a small site first. If the mapping returns empty, the domain may block crawlers or require cookies, so you may need to adjust crawl settings or start from a different entry URL.

What volume can this Firecrawl Sheets mapping workflow process?

A lot, as long as you batch it. On n8n Cloud, your limit is mostly your monthly executions and how many URLs you choose to classify; on self-hosting, the practical limit is server resources and API rate limits. The workflow is already designed with Split in Batches, which keeps big sites from crashing the run. If you’re mapping thousands of URLs regularly, reduce what you classify, increase batch size carefully, and expect longer waits.

Is this Firecrawl Sheets mapping automation better than using Zapier or Make?

For this kind of multi-step crawling and classification, n8n is usually the smoother choice because batching, branching, and code-based cleanup don’t get awkward or expensive. Zapier and Make can do parts of it, but large URL lists often turn into lots of billable tasks. n8n also gives you the option to self-host, which matters once you’re running this every week. If you only want a simple “URL in, row out” flow for a handful of pages, the lighter tools can be fine. Talk to an automation expert if you’re unsure which route fits your volume.

Once this is in place, competitor research stops being a one-off scramble and becomes a dataset you can build on. The workflow handles the sorting, and you get your headspace back.

Need Help Setting This Up?

Our automation experts can build and customize this workflow for your specific needs. Free 15-minute consultation—no commitment required.

Lisa Granqvist

Workflow Automation Expert

Expert in workflow automation and no-code tools.
