Bright Data to Google Sheets, clean leads ready
You grab a few leads from Yelp, then a few more, then you realize you’ve been copy-pasting for an hour and the sheet still looks messy. Names don’t match, phone numbers are missing, and half the “emails” are actually contact forms you can’t use.
This hits agency owners building local outreach lists first. But marketing managers and scrappy founders feel it too, especially when automating the path from Bright Data to Google Sheets is the difference between “we’ll start next week” and sending campaigns today.
This workflow turns local directory listings into a clean Google Sheet, plus a CSV in your inbox. You’ll see what it automates, what results to expect, and the few places people usually get stuck.
How This Automation Works
The full n8n workflow, from trigger to final output:
n8n Workflow Template: Bright Data to Google Sheets, clean leads ready
```mermaid
flowchart LR
subgraph sg0["🔘 Manual Launch Flow"]
direction LR
n0@{ icon: "mdi:play-circle", form: "rounded", label: "🔘 Manual Launch Trigger", pos: "b", h: 48 }
n1@{ icon: "mdi:swap-vertical", form: "rounded", label: "🔗 Define Yelp Business Link", pos: "b", h: 48 }
n2@{ icon: "mdi:robot", form: "rounded", label: "🤖 Agent: Extract Yelp Details", pos: "b", h: 48 }
n3@{ icon: "mdi:brain", form: "rounded", label: "💬 AI Model: Interpret Data", pos: "b", h: 48 }
n4@{ icon: "mdi:cog", form: "rounded", label: "🌐 Bright Data MCP Tool", pos: "b", h: 48 }
n5@{ icon: "mdi:message-outline", form: "rounded", label: "📧 Dispatch Partnership Email", pos: "b", h: 48 }
n6@{ icon: "mdi:robot", form: "rounded", label: "Auto-Repair Output Parser", pos: "b", h: 48 }
n7@{ icon: "mdi:brain", form: "rounded", label: "OpenAI Conversational Model", pos: "b", h: 48 }
n8@{ icon: "mdi:robot", form: "rounded", label: "📝 Convert Scraped Data JSON", pos: "b", h: 48 }
n7 -.-> n6
n6 -.-> n2
n4 -.-> n2
n3 -.-> n2
n0 --> n1
n1 --> n2
n8 -.-> n6
n2 --> n5
end
%% Styling
classDef trigger fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
classDef ai fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef aiModel fill:#e8eaf6,stroke:#3f51b5,stroke-width:2px
classDef decision fill:#fff8e1,stroke:#f9a825,stroke-width:2px
classDef database fill:#fce4ec,stroke:#c2185b,stroke-width:2px
classDef api fill:#fff3e0,stroke:#e65100,stroke-width:2px
classDef code fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef disabled stroke-dasharray: 5 5,opacity: 0.5
class n0 trigger
class n2,n6,n8 ai
class n3,n7 aiModel
```
The Problem: Local lead lists get messy fast
Building a local lead list sounds simple until you actually do it. Yelp listings look structured, but once you start collecting them, the cracks show. One business has a phone number, the next only has a website, and the third has a slightly different name across categories. Then the real time drain starts: cleaning fields, removing duplicates, and trying to make the list usable for outreach (or for your CRM). Meanwhile, the best leads go cold because you’re still “prepping the spreadsheet.”
It adds up fast. Here’s where it usually breaks down.
- You waste about 2 hours just collecting 50 listings if you do it by hand.
- Duplicates sneak in because the same business appears under different categories or locations.
- Outreach slows down when the sheet has inconsistent formats (ratings as text, phones with extra characters, blank categories).
- Teams lose confidence in the list, so they don’t use it, which honestly defeats the whole point.
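To make the format problems above concrete, here is a minimal Python sketch of the kind of cleanup involved. It is an illustration only, not part of the workflow: the function names are hypothetical, and the phone logic assumes 10-digit US numbers.

```python
import re
from typing import Optional


def normalize_phone(raw: Optional[str]) -> Optional[str]:
    """Return a US phone as (XXX) XXX-XXXX, or None if it can't be parsed."""
    if not raw:
        return None
    digits = re.sub(r"\D", "", raw)  # drop spaces, dashes, dots, parentheses
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # strip the US country code
    if len(digits) != 10:
        return None
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


def normalize_rating(raw: Optional[str]) -> Optional[float]:
    """Pull the first number out of text such as '4.5 stars'."""
    if raw is None:
        return None
    match = re.search(r"\d+(?:\.\d+)?", str(raw))
    return float(match.group()) if match else None
```

Returning None for unparseable values keeps blanks visible in the sheet instead of hiding them behind a guessed value.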
The Solution: Bright Data pulls listings, OpenAI cleans, Sheets stays tidy
This n8n workflow automates the messy middle between “I need local leads” and “I’m ready to send outreach.” You start it with a Yelp business search link (or a directory source you’ve defined), and Bright Data handles the heavy lifting of pulling listing details at scale. Those raw results then go through an AI cleanup layer that normalizes fields like business names, categories, ratings, and contact info. The workflow also checks for duplicates so your sheet doesn’t inflate every time you run a new scrape. Finally, you get the output in two practical places: a structured Google Sheet for filtering and a CSV delivered to your Gmail for quick sharing or importing elsewhere.
The workflow starts with a manual trigger in n8n, then uses a defined Yelp link to guide extraction. From there, a LangChain-style AI agent interprets the scraped content, cleans it, and shapes it into a predictable JSON format that’s ready for spreadsheets and outreach.
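The article does not spell out how duplicates are detected, so the following is only one plausible approach, sketched in Python: build a key from a normalized business name plus phone digits and keep the first lead per key. The `phone` field and both function names are assumptions made for illustration.

```python
import re
from typing import Dict, Iterable, List, Optional, Set, Tuple

Lead = Dict[str, Optional[str]]


def dedupe_key(lead: Lead) -> Tuple[str, str]:
    """Build a comparison key from a lowercased name and phone digits."""
    raw_name = (lead.get("business_name") or "").lower()
    name = re.sub(r"[^a-z0-9]+", " ", raw_name).strip()
    phone = re.sub(r"\D", "", lead.get("phone") or "")
    return name, phone


def dedupe(leads: Iterable[Lead]) -> List[Lead]:
    """Keep the first lead for each key, preserving input order."""
    seen: Set[Tuple[str, str]] = set()
    unique: List[Lead] = []
    for lead in leads:
        key = dedupe_key(lead)
        if key not in seen:
            seen.add(key)
            unique.append(lead)
    return unique
```

A key like this catches the same business listed under two categories, but it will not merge listings whose names genuinely differ, so a manual spot check is still worthwhile.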
What You Get: Automation vs. Results
| What This Workflow Automates | Results You’ll Get |
|---|---|
Example: What This Looks Like
Say you need 100 leads for “dentists in Austin.” Manually, you might spend about 2 minutes per listing to open pages, copy fields, and paste into Sheets, which is a little over 3 hours (about 200 minutes), and that’s before cleanup. With this workflow, you spend about 5 minutes setting the Yelp link and location parameters, then let Bright Data and the AI cleanup run. Even if processing takes 20–30 minutes in the background, you’re not doing repetitive work, and the output lands ready to filter and start outreach.
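The arithmetic behind that comparison, using the example's own figures (100 listings, 2 minutes each, 5 minutes of setup), can be checked with a few lines of Python. The variable names are just for illustration.

```python
listings = 100
minutes_per_listing = 2
setup_minutes = 5

manual_minutes = listings * minutes_per_listing  # total hands-on time by hand
manual_hours = manual_minutes / 60
hands_on_saved = manual_minutes - setup_minutes  # background run time not counted

print(f"Manual: {manual_minutes} min (~{manual_hours:.1f} h)")
print(f"Hands-on time saved: {hands_on_saved} min")
```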
What You’ll Need
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- Bright Data for directory scraping at scale
- Google Sheets to store and filter cleaned leads
- OpenAI API key (get it from your OpenAI dashboard)
- Gmail account (connected through an OAuth2 credential) to send the output email
Skill level: Intermediate. You’ll connect accounts, paste API keys, and adjust a couple of input fields like city/ZIP and category.
Don’t want to set this up yourself? Talk to an automation expert (free 15-minute consultation).
How It Works
You launch the workflow manually. This one starts with a manual trigger, which is perfect when you want to run it “on demand” for a new city, niche, or client request.
You define the Yelp business search link. A Set node stores the directory URL and the parameters you care about, so the scrape stays focused and repeatable.
Bright Data fetches the listing details and the AI agent interprets them. The workflow uses the Bright Data MCP tool to pull structured content, then an AI Agent plus OpenAI chat models clean up names, normalize categories, and shape the output into consistent fields.
You receive usable output. The workflow produces a structured dataset (ideal for Google Sheets) and sends a CSV via Gmail so you can forward it, import it, or archive it.
You can easily modify the location and category inputs to target different markets based on your needs. See the full implementation guide below for customization options.
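For readers who want to see what "structured dataset to CSV" means in practice, here is a hedged Python sketch. It is not how n8n builds the file; the column list is an assumption based on the field names mentioned in the implementation guide (business_name, category, location, rating, website).

```python
import csv
import io
from typing import Dict, List

# Assumed column order; adjust to match your own sheet.
FIELDS = ["business_name", "category", "location", "rating", "website"]


def leads_to_csv(leads: List[Dict[str, object]]) -> str:
    """Serialize leads to CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for lead in leads:
        # Missing fields become empty cells; unknown fields are dropped.
        writer.writerow({field: lead.get(field, "") for field in FIELDS})
    return buffer.getvalue()
```

Fixing the column order up front is what keeps repeated runs importable into the same sheet or CRM without remapping.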
Step-by-Step Implementation Guide
Step 1: Configure the Manual Trigger
This workflow starts manually, so you’ll trigger it from within n8n to test Yelp scraping and email delivery.
- Add or confirm the 🔘 Manual Launch Trigger node at the start of the workflow.
- Ensure 🔘 Manual Launch Trigger is connected to 🔗 Define Yelp Business Link.
Step 2: Connect the Yelp Source URL
Define the Yelp URL that will be scraped by the AI agent.
- Open 🔗 Define Yelp Business Link and set an assignment for URL to https://www.yelp.com/biz/william-kimbrough-md-washington.
- Confirm the node outputs the URL field for use in the agent prompt.
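If you plan to swap in your own links, a quick pre-check can catch obvious mistakes before you spend scraping credits. The sketch below is an assumption about what a "good" URL looks like (HTTPS, a yelp.com host, and a /biz/<slug> path, as in the example above); the function name is hypothetical and the check is deliberately loose.

```python
from urllib.parse import urlparse


def is_yelp_business_url(url: str) -> bool:
    """Rough check that a URL looks like https://www.yelp.com/biz/<slug>."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    path_parts = [part for part in parsed.path.split("/") if part]
    return (
        parsed.scheme == "https"
        and (host == "yelp.com" or host.endswith(".yelp.com"))
        and len(path_parts) == 2
        and path_parts[0] == "biz"
    )
```

This only validates the shape of the link. It does not confirm the page exists or that Bright Data can reach it.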
Step 3: Set Up the AI Extraction Pipeline
The AI agent orchestrates scraping, interpretation, and structured parsing using connected AI tools and models.
- Open 🤖 Agent: Extract Yelp Details and set Text to the full prompt including the expression {{ $json.URL }}.
- Ensure 💬 AI Model: Interpret Data is connected as the language model for 🤖 Agent: Extract Yelp Details.
  Credential Required: Connect your openAiApi credentials to 💬 AI Model: Interpret Data.
- Ensure 🌐 Bright Data MCP Tool is connected as the AI tool for 🤖 Agent: Extract Yelp Details and uses Tool Name scrape_as_markdown with Tool Parameters set to {{ /*n8n-auto-generated-fromAI-override*/ $fromAI('Tool_Parameters', ``, 'json') }}.
  Credential Required: Connect your mcpClientApi credentials to 🌐 Bright Data MCP Tool (added via the agent’s AI tool connection).
- Confirm Auto-Repair Output Parser and 📝 Convert Scraped Data JSON are connected as the output parser chain for 🤖 Agent: Extract Yelp Details.
- In 📝 Convert Scraped Data JSON, keep the JSON Schema Example that defines fields like business_name, location, rating, and website.
- Ensure OpenAI Conversational Model is connected as the language model for Auto-Repair Output Parser.
  Credential Required: Connect your openAiApi credentials to OpenAI Conversational Model.
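To get a feel for what the parser chain is meant to guarantee, the Python sketch below checks a sample result for the fields named above. The `output` list wrapper is inferred from the `$json.output[0]` expressions used in the email step; the sample values in the usage note and the function name are made up for illustration.

```python
import json
from typing import Any, Dict, List

# Field names taken from the JSON Schema Example described above.
REQUIRED_FIELDS = ("business_name", "location", "rating", "website")


def missing_fields(raw_json: str) -> List[str]:
    """List required fields that are absent or empty in the first output item."""
    payload: Dict[str, Any] = json.loads(raw_json)
    items = payload.get("output") or []
    if not items:
        return list(REQUIRED_FIELDS)
    first = items[0]
    return [name for name in REQUIRED_FIELDS if first.get(name) in (None, "")]
```

For example, passing `'{"output": [{"business_name": "Example Clinic", "location": "Washington, DC", "rating": 4.5, "website": ""}]}'` reports `website` as missing, which is the kind of gap worth catching before the email step runs.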
Step 4: Configure the Email Output
Send a personalized outreach email using the scraped business details.
- Open 📧 Dispatch Partnership Email and set Send To to [YOUR_EMAIL].
- Set Subject to =Potential Partnership with {{ $json.output[0].business_name }}.
- Set Message to the provided template using expressions like {{ $json.output[0].rating }}, {{ $json.output[0].category }}, and {{ $json.output[0].website }}.
- Credential Required: Connect your gmailOAuth2 credentials to 📧 Dispatch Partnership Email.
Replace [YOUR_EMAIL] and the signature placeholders before testing, otherwise emails may fail or include incorrect branding.
Step 5: Test and Activate Your Workflow
Run a manual test to validate the scrape and email content, then activate the workflow for ongoing use.
- Click Execute Workflow on 🔘 Manual Launch Trigger.
- Verify that 🤖 Agent: Extract Yelp Details returns structured output matching the schema in 📝 Convert Scraped Data JSON.
- Confirm 📧 Dispatch Partnership Email sends a message with the business name and rating filled in correctly.
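- When satisfied, toggle the workflow to Active to use it in production. Note that a manual trigger only fires when you click Execute Workflow, so add a schedule or webhook trigger first if you want unattended runs.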
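If the email arrives with a blank business name, the usual cause is an empty or differently shaped agent result. The Python sketch below mimics what the Subject expression from Step 4 does, assuming the result has the shape `{"output": [{...}]}`. It is a stand-in for reasoning about the data, not n8n's own expression engine, and the function name is hypothetical.

```python
from typing import Any, Dict

SUBJECT_TEMPLATE = "Potential Partnership with {business_name}"


def render_subject(agent_result: Dict[str, Any]) -> str:
    """Fill the subject line from the first item in the agent's output list."""
    items = agent_result.get("output") or []
    if not items or not items[0].get("business_name"):
        raise ValueError("agent output has no business_name to put in the subject")
    return SUBJECT_TEMPLATE.format(business_name=items[0]["business_name"])
```

Failing loudly on a missing name is a deliberate choice here: a half-filled outreach email is worse than no email.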
Common Gotchas
- Bright Data credentials can expire or need specific permissions. If things break, check your Bright Data workspace settings and access tokens first.
- If you’re using Wait nodes or external rendering, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
- Default prompts in AI nodes are generic. Add your brand voice early or you’ll be editing outputs forever.
Frequently Asked Questions
Setup takes about 30 minutes if you already have your accounts and API keys ready.
No coding is required. You will connect services and paste keys into n8n, then tweak a couple of input fields like location and category.
n8n can be used for free: it has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in OpenAI API costs (often just a few cents per run) and whatever Bright Data usage you consume.
There are two ways to host n8n: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.
You can customize this workflow for other niches, locations, or fields, and it’s one of the best reasons to use it. You’ll update the “Define Yelp Business Link” step to point at a different search URL, then adjust the AI agent instructions to keep the same output columns. Common tweaks include switching niches (restaurants to dentists), changing the geo scope (ZIP to city), and adding extra fields you care about (like price range or review count) so your Google Sheet stays consistent.
If the Bright Data connection fails, the cause is usually expired credentials or the wrong workspace/project permissions inside Bright Data. Regenerate the token, then update it in the Bright Data node in n8n. If it still fails, check usage limits and make sure the target URL is reachable from Bright Data’s network. One more thing: community nodes used here require self-hosted n8n, so n8n Cloud may not support every node in this exact template.
A few hundred leads per run is typical for most teams.
Compared with Zapier or Make, n8n has a few advantages for this workflow: more complex logic with unlimited branching at no extra cost, a self-hosting option for unlimited executions, and native support for the kind of “scrape → parse → clean → structure” flow that gets awkward in simpler tools. Zapier and Make are great when you’re moving clean data between apps, but scraping and deduping tends to get expensive or brittle. If you want full control over prompts, parsing, and output formatting, n8n is a better fit. If you only need a two-step “new row → send email,” keep it simple with Zapier. Talk to an automation expert if you’re not sure which fits.
Once your lead list is clean and consistent, outreach becomes the easy part. Set this up once, run it when you need fresh prospects, and get back to work that actually moves revenue.