Google Sheets + OpenAI: enriched leads, ready to use
Your “lead list” is probably a spreadsheet full of company names and half-working URLs. And every time you want to prospect, you end up doing the same loop: open tabs, skim homepages, guess what they sell, then write messy notes you’ll never trust later.
Marketing ops feels this on every campaign build. A founder feels it when pipeline is soft. And a sales lead feels it when SDRs keep asking, “What’s their ICP again?” This Sheets lead enrichment automation turns that raw list into consistent intel you can actually use.
You’ll see how the workflow pulls companies from Google Sheets, scrapes the site content, asks OpenAI to extract structured insights, and writes everything back into the same row so you can review fast.
How This Automation Works
The full n8n workflow, from trigger to final output:
n8n Workflow Template: Google Sheets + OpenAI: enriched leads, ready to use
Main flow ("Structured Output Pa Flow"):
- Webhook -> Get rows from Google Sheet -> Loop Over Items
- Loop Over Items -> AI Agent -> back to Loop Over Items
- Loop Over Items -> Update Company's Row on Goog..
- Attached to AI Agent: OpenAI Chat Model, Structured Output Parser, and the tool "Call n8n workflow : Scrape .."
Tool flow ("Tool called from Agent Flow"):
- Tool called from Agent -> Set company url -> ScrapingBee : Scrape company.. -> HTML to Markdown
The Problem: Lead lists don’t come with context
A list of companies without context is busywork disguised as “research.” Someone has to visit each site, figure out what the company actually does, guess the business model, and translate that into an ICP and an angle you can sell to. It’s slow, and it’s mentally tiring in a way that doesn’t show up on a timesheet. Worse, you do it repeatedly because notes are inconsistent, people use different labels, and half the team interprets the same homepage differently. The result is a lead sheet that looks full but still isn’t ready to use.
It adds up fast. Here’s where it breaks down in real teams.
- One person writes “B2B SaaS,” another writes “software,” and now filtering your outreach list is basically guessing.
- Homepage skimming turns into tab overload, and you lose 10 minutes per company just re-finding the basics.
- When the website content doesn’t match the company (wrong URL, holding page, directory listing), you don’t notice until outreach flops.
- Manual research creates silent errors, which means your “personalized” emails end up sounding generic or off-target.
The Solution: Google Sheets enrichment powered by scraping + OpenAI
This workflow starts with a simple Google Sheet: each row has a company name and a website. n8n pulls those rows, processes them one-by-one, and sends the website to a scraping step that fetches the homepage content (via ScrapingBee). That content is converted from HTML into Markdown so it’s cheaper and cleaner to analyze. Then an OpenAI-powered agent reads the text and produces a structured set of fields like Business Area, Offer, Value Proposition, Business Model, and ICP. Finally, n8n updates the original row in Google Sheets, adding the enriched intel plus a practical “Additional Information” section that tells you if the page content was sufficient, mismatched, or missing key details.
The workflow starts with an incoming webhook (so you can trigger it from anywhere). It then pulls sheet records, loops through each company, scrapes the site, and asks OpenAI for consistent outputs. At the end, your sheet becomes a review-ready lead table instead of a research to-do list.
What You Get: Automation vs. Results
| What This Workflow Automates | Results You’ll Get |
|---|---|
Example: What This Looks Like
Say you have 80 companies in a Google Sheet and you normally spend about 8 minutes per company to skim the homepage, write an “offer” note, and guess ICP. That’s 640 minutes, or a little under 11 hours of research, and it’s easy to lose momentum halfway through. With this workflow, triggering the webhook takes about a minute, then n8n runs the scrape + OpenAI enrichment in the background while you do other work. You still review the outputs, but you’re reviewing 80 structured rows instead of doing 80 little research projects.
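The arithmetic behind that estimate, as a quick sketch. The 80 companies and 8 minutes come from the example above; the function name is only illustrative.

```python
def manual_research_hours(companies, minutes_per_company):
    """Total manual research time, converted from minutes to hours."""
    return companies * minutes_per_company / 60


hours = manual_research_hours(80, 8)  # 80 * 8 = 640 minutes
print(f"{hours:.1f} hours")
```

Swap in your own list size and per-company time to see what the manual loop costs your team.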
What You’ll Need
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- Google Sheets to store the company list and results.
- OpenAI to generate structured lead intel fields.
- ScrapingBee API key (get it from your ScrapingBee dashboard).
Skill level: Intermediate. You’ll connect accounts, paste an API key, and map a few fields in Google Sheets.
Don’t want to set this up yourself? Talk to an automation expert (free 15-minute consultation).
How It Works
A webhook kicks everything off. You trigger the workflow with a webhook call, which means you can start it from a button, a form tool, Telegram, or even another automation when a new list is ready.
Your sheet rows are pulled and queued. n8n reads the Google Sheet and loads each company name + website URL. A split-in-batches loop processes rows one at a time so the enrichment matches the right row (and you don’t accidentally enrich the first company 80 times).
Scraping happens before the AI does any thinking. The agent calls a “scraper workflow” that sends the URL to ScrapingBee, gets the homepage HTML back, and converts it to Markdown. That Markdown is what the OpenAI chat model reads, because it’s cleaner and typically cheaper than raw HTML.
Structured enrichment is written back into the same row. The AI Agent outputs fields like Offer, Value Proposition, Business Model, ICP, plus an “Additional Information” block that warns you when the source content is insufficient or mismatched. Then the Google Sheets update node writes those values into the right columns.
You can change which pages get scraped (homepage vs. pricing, for example) to fit your process. See the full implementation guide below for customization options.
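If you do extend the scraper beyond the homepage, the first thing you need is the list of URLs to fetch per company. Here is a minimal sketch of that step in plain Python. The helper name, the default paths, and the choice to default bare domains to https are all assumptions for illustration, not part of the template.

```python
from urllib.parse import urljoin


def pages_to_scrape(website, extra_paths=("/pricing", "/about")):
    """Return the homepage URL followed by any extra pages to fetch."""
    # Assumption: a bare domain in the sheet should default to https.
    base = website if "://" in website else f"https://{website}"
    return [base] + [urljoin(base, path) for path in extra_paths]


print(pages_to_scrape("example.com"))
```

Each extra page is another scrape call and more text for the model to read, so add paths only when the homepage alone leaves gaps.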
Step-by-Step Implementation Guide
Step 1: Configure the Webhook Trigger
Set up the workflow entry point so external systems can trigger the enrichment run.
- Add the Incoming Webhook Trigger node as your trigger.
- Set the Path to 53166f88-c88a-4429-b6b5-498f458686b0.
- Connect Incoming Webhook Trigger to Retrieve Sheet Records.
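Once the path is set, any HTTP client can start a run. The sketch below builds the webhook URL and sends the request with the Python standard library. It assumes a hypothetical host (n8n.example.com) and that the Webhook node is left on the GET method; adjust both to match your instance. n8n serves active workflows under /webhook/ and editor test runs under /webhook-test/.

```python
import urllib.request


def build_webhook_url(base_url, path, test_mode=False):
    """Join an n8n base URL with a webhook path."""
    prefix = "webhook-test" if test_mode else "webhook"
    return f"{base_url.rstrip('/')}/{prefix}/{path.strip('/')}"


def trigger_enrichment(url, method="GET", timeout=30.0):
    """Call the webhook and return the HTTP status code."""
    # Assumption: the Webhook node accepts this method without auth.
    request = urllib.request.Request(url, method=method)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Hypothetical host; the path is the one set in this step.
url = build_webhook_url(
    "https://n8n.example.com",
    "53166f88-c88a-4429-b6b5-498f458686b0",
)
print(url)
```

Calling trigger_enrichment(url) against your real host starts the run; the example only prints the URL so it is safe to execute as-is.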
Step 2: Connect Google Sheets
Configure the data source that holds the companies to enrich and the destination for updates.
- Open Retrieve Sheet Records and set Document to [YOUR_ID] and Sheet to Sheet1 (gid=0).
- Set Authentication to serviceAccount.
- Credential Required: Connect your googleApi credentials in Retrieve Sheet Records.
- Open Modify Sheet Row and set Operation to update with Document [YOUR_ID] and Sheet Companies list.
- Credential Required: Connect your googleApi credentials in Modify Sheet Row.
- Verify column mappings use the expressions {{$json.output['Business Area']}}, {{$json.output['Offers or Product']}}, {{$json.output['Value Proposition']}}, {{$json.output['Business Model']}}, {{$json.output['Ideal Customer Profile']}}, {{$json.output['Additional Information']}}, and {{$('Retrieve Sheet Records').item.json.row_number}}.
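To make those mappings concrete, here is a rough Python equivalent of what the update step assembles: the six output fields from the agent plus the row number from the original sheet record. The function name and the sample values are hypothetical, and the sketch assumes the sheet columns carry the same names as the output keys.

```python
OUTPUT_FIELDS = [
    "Business Area",
    "Offers or Product",
    "Value Proposition",
    "Business Model",
    "Ideal Customer Profile",
    "Additional Information",
]


def build_row_update(agent_item, source_row):
    """Combine the agent's structured output with the matching row number."""
    output = agent_item["output"]
    update = {"row_number": source_row["row_number"]}
    for field in OUTPUT_FIELDS:
        # A missing field becomes an empty cell instead of raising KeyError.
        update[field] = output.get(field, "")
    return update


# Made-up sample data, for illustration only.
agent_item = {"output": {"Business Area": "Logistics software", "Business Model": "B2B SaaS"}}
source_row = {"row_number": 7, "Company": "Acme", "Website": "https://acme.example"}
print(build_row_update(agent_item, source_row))
```

The row number is what keeps each result attached to the company it was generated for.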
Step 3: Set Up AI Enrichment
Configure the agent, model, and structured parser that generate company insights.
- Open Company Insight Agent and keep Prompt Type set to define with the provided instructions in the Text field (uses {{$json.Company}} and {{$json.Domain}}).
- Connect OpenAI Chat Engine to Company Insight Agent as the language model.
- Credential Required: Connect your openAiApi credentials in OpenAI Chat Engine.
- Attach Structured Result Parser to Company Insight Agent as the output parser; it uses the defined schema for business fields.
- Attach Homepage Scrape Tool to Company Insight Agent as the tool workflow. This tool passes {{ $('Retrieve Sheet Records').item.json.Website }} as website.
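The point of the structured parser is that every company comes back with the same set of fields. As an illustration of that contract (not the parser's actual implementation), the sketch below checks a raw JSON reply for the six fields used in the column mappings. Treating every value as text is an assumption, and the function name is hypothetical.

```python
import json

REQUIRED_FIELDS = (
    "Business Area",
    "Offers or Product",
    "Value Proposition",
    "Business Model",
    "Ideal Customer Profile",
    "Additional Information",
)


def parse_agent_output(raw_json):
    """Parse the model's reply and ensure every expected field is present."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    # Assumption: all fields are written to the sheet as plain text.
    return {name: str(data[name]).strip() for name in REQUIRED_FIELDS}
```

A reply that skips a field fails loudly here instead of quietly leaving a blank cell in the sheet.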
Step 4: Configure Scraping and Content Processing
Prepare the URL assignment, homepage fetch, and HTML-to-Markdown conversion used by the agent tool chain.
- Open Agent Tool Trigger and connect it to Assign Website URL to start the tool workflow.
- In Assign Website URL, set the url value to {{$json.website}}.
- In ScrapingBee Homepage Fetch, set the URL to https://app.scrapingbee.com/api/v1 and enable Send Query.
- Add query parameters in ScrapingBee Homepage Fetch: api_key with your ScrapingBee key, and url set to {{$json.url}}.
- In HTML to Markdown Convert, set HTML to {{$json.data}} and Destination Key to response.
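For reference, the HTTP request node ends up calling a URL shaped like the one built below: the ScrapingBee endpoint from this step plus the api_key and url query parameters. This is a sketch for checking your configuration by hand; the function name and the placeholder key are hypothetical.

```python
from urllib.parse import urlencode

SCRAPINGBEE_ENDPOINT = "https://app.scrapingbee.com/api/v1"


def build_scrape_request(api_key, target_url):
    """Return the full request URL with both query parameters encoded."""
    # urlencode percent-encodes the target URL so it survives as one parameter.
    query = urlencode({"api_key": api_key, "url": target_url})
    return f"{SCRAPINGBEE_ENDPOINT}?{query}"


print(build_scrape_request("YOUR_API_KEY", "https://example.com"))
```

If a scrape fails, pasting the equivalent URL into a browser or curl is a quick way to tell an n8n mapping problem from a ScrapingBee one.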
Step 5: Configure Record Iteration and Updates
Process each row and write the AI output back to the sheet.
- Connect Retrieve Sheet Records to Iterate Records to process items in batches.
- Ensure Iterate Records routes to both Company Insight Agent and Modify Sheet Row in sequence as configured.
- Confirm Company Insight Agent outputs to Iterate Records to continue iterating through rows.
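Conceptually, the iteration works like the loop below: one row in, one enrichment, one write keyed by that row's number. This is a plain-Python analogy rather than n8n's execution model, and the enrich and update_row callables are stand-ins for the agent and the Google Sheets update.

```python
def enrich_rows(rows, enrich, update_row):
    """Process rows one at a time and write each result to its own row."""
    processed = 0
    for row in rows:
        insights = enrich(row["Company"], row["Website"])
        # row_number ties the result back to the row it came from.
        update_row(row["row_number"], insights)
        processed += 1
    return processed


# Stand-ins for the agent and the sheet update, for illustration only.
sheet = {}
rows = [
    {"row_number": 2, "Company": "Acme", "Website": "https://acme.example"},
    {"row_number": 3, "Company": "Globex", "Website": "https://globex.example"},
]
count = enrich_rows(
    rows,
    enrich=lambda company, website: {"Business Area": f"(insights for {company})"},
    update_row=lambda row_number, insights: sheet.update({row_number: insights}),
)
print(count, sorted(sheet))
```

Processing one row per pass is what prevents results from landing on the wrong company.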
Step 6: Test and Activate Your Workflow
Validate the flow end-to-end and then enable it for production.
- Use Incoming Webhook Trigger and click Execute Workflow to run a manual test.
- Confirm Retrieve Sheet Records loads rows and Modify Sheet Row updates the target columns with AI results.
- Verify ScrapingBee Homepage Fetch returns content and HTML to Markdown Convert produces a response field.
- When successful, toggle the workflow to Active to accept live webhook requests.
Common Gotchas
- Google Sheets credentials can expire or need specific permissions. If things break, check the n8n credential entry and the target spreadsheet sharing settings first.
- If you’re using Wait nodes or external rendering, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
- Default prompts in AI nodes are generic. Add your brand voice early or you’ll be editing outputs forever.
Frequently Asked Questions
How long does setup take?
About 30 minutes if you already have your API keys and the Google Sheet ready.
Do I need coding skills?
No. You’ll mostly connect accounts and map your Google Sheets columns to the workflow fields.
Is n8n free to use?
Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in OpenAI usage and your scraping provider costs (often a few dollars for small batches, then more as you scale).
Where can I host n8n?
Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.
Can I customize this workflow?
Yes, but be intentional. You can extend the scraper tool workflow (the ScrapingBee HTTP request and the HTML-to-Markdown step) to also fetch “/pricing” or “/about,” then pass that combined text into the AI Agent. Common customizations include changing the pages you scrape, tightening the OpenAI prompt to match your ICP definitions, and writing results to a CRM instead of Google Sheets.
Why is my Google Sheets connection failing?
Usually it’s permissions or an expired token in your Google Sheets credential inside n8n. Reconnect the Google account, then confirm the spreadsheet is shared with that same account and the workflow is pointing at the correct document and tab. If the sheet structure changed (renamed columns, moved tabs), the “update row” step can also fail because it can’t find the expected fields. Finally, watch out for rate limits if you’re running very large batches back-to-back.
How many companies can it handle?
Hundreds per run is normal, and more is possible if you pace it and watch scraping + OpenAI costs.
Is n8n a better fit for this than Zapier or Make?
Often, yes, if you care about control and repeatability. This workflow relies on looping, structured AI output parsing, and a “tool workflow” pattern for scraping, and n8n handles that kind of branching without getting awkward. You can also self-host, which is a big deal if you run enrichments frequently. Zapier or Make can still be fine for small lists, but complex scraping + enrichment flows tend to get brittle (and pricey) there. If you want a second opinion before you commit, talk to an automation expert.
Once this is running, your spreadsheet stops being a list of names and starts acting like a lightweight research database. You review, pick the best fits, and move on.