Bright Data + Google Gemini for clean LinkedIn leads
Copying LinkedIn profile details into a spreadsheet sounds simple. Then you do it for 30 leads, lose half an afternoon, and still end up with messy titles, missing company fields, and notes that don’t match across your team.
This LinkedIn leads automation hits marketing ops first, honestly. But sales teams chasing outbound lists and recruiters building pipelines feel the same drag. You will go from “random profile scraps” to structured, usable lead rows you can actually score and report on.
Below, you’ll see how the workflow scrapes LinkedIn via Bright Data, has Google Gemini clean and standardize it, then saves and notifies you so the data is ready for outreach or analytics.
How This Automation Works
The full n8n workflow, from trigger to final output:
n8n Workflow Template: Bright Data + Google Gemini for clean LinkedIn leads
flowchart LR
subgraph sg0["When clicking ‘Test workflow’ Flow"]
direction LR
n0@{ icon: "mdi:play-circle", form: "rounded", label: "When clicking ‘Test workflow’", pos: "b", h: 48 }
n1@{ icon: "mdi:swap-vertical", form: "rounded", label: "Set the URLs", pos: "b", h: 48 }
n2@{ icon: "mdi:cog", form: "rounded", label: "Bright Data MCP Client For L..", pos: "b", h: 48 }
n3@{ icon: "mdi:cog", form: "rounded", label: "List all tools for Bright Data", pos: "b", h: 48 }
n4@{ icon: "mdi:cog", form: "rounded", label: "Bright Data MCP Client For L..", pos: "b", h: 48 }
n5@{ icon: "mdi:swap-vertical", form: "rounded", label: "Set the LinkedIn Company URL", pos: "b", h: 48 }
n6["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Webhook for LinkedIn Company.."]
n7@{ icon: "mdi:robot", form: "rounded", label: "LinkedIn Data Extractor", pos: "b", h: 48 }
n8@{ icon: "mdi:brain", form: "rounded", label: "Google Gemini Chat Model", pos: "b", h: 48 }
n9@{ icon: "mdi:cog", form: "rounded", label: "List all available tools for..", pos: "b", h: 48 }
n10["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/code.svg' width='40' height='40' /></div><br/>Code"]
n11["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/merge.svg' width='40' height='40' /></div><br/>Merge"]
n12@{ icon: "mdi:cog", form: "rounded", label: "Aggregate", pos: "b", h: 48 }
n13@{ icon: "mdi:code-braces", form: "rounded", label: "Create a binary data for Lin..", pos: "b", h: 48 }
n14@{ icon: "mdi:cog", form: "rounded", label: "Write the LinkedIn person in..", pos: "b", h: 48 }
n15@{ icon: "mdi:code-braces", form: "rounded", label: "Create a binary data for Lin..", pos: "b", h: 48 }
n16@{ icon: "mdi:cog", form: "rounded", label: "Write the LinkedIn company i..", pos: "b", h: 48 }
n17["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Webhook for LinkedIn Person .."]
n10 --> n11
n11 --> n12
n12 --> n6
n12 --> n15
n1 --> n2
n7 --> n11
n8 -.-> n7
n5 --> n4
n3 --> n5
n0 --> n9
n0 --> n3
n9 --> n1
n2 --> n17
n2 --> n13
n4 --> n10
n4 --> n7
n13 --> n14
n15 --> n16
end
%% Styling
classDef trigger fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
classDef ai fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef aiModel fill:#e8eaf6,stroke:#3f51b5,stroke-width:2px
classDef decision fill:#fff8e1,stroke:#f9a825,stroke-width:2px
classDef database fill:#fce4ec,stroke:#c2185b,stroke-width:2px
classDef api fill:#fff3e0,stroke:#e65100,stroke-width:2px
classDef code fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef disabled stroke-dasharray: 5 5,opacity: 0.5
class n0 trigger
class n7 ai
class n8 aiModel
class n6,n17 api
class n10,n13,n15 code
classDef customIcon fill:none,stroke:none
class n6,n10,n11,n17 customIcon
The Problem: LinkedIn lead data is messy and slow to collect
LinkedIn is full of useful signals, but getting them into a system you can use is the painful part. You open a profile, copy a job title, paste it somewhere, then realize the company page has the real headcount or industry detail you needed. Next thing you know, you’ve got five browser tabs, a half-filled sheet, and a list that’s inconsistent across rows because you were rushing. The worst part is the rework. You either clean it later (which never happens) or you clean it mid-stream and your “quick list build” turns into a project.
It adds up fast. Here’s where it breaks down in real life.
- Manual copy-paste forces you to make formatting decisions hundreds of times per week.
- Small inconsistencies (like “VP Sales” vs “Vice President of Sales”) quietly ruin scoring and segmentation.
- People pull different fields, so your “team list” becomes five versions of the truth.
- Even careful teams miss context from the company page, which means weaker targeting and worse outreach.
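To make the inconsistency problem concrete, here's a tiny JavaScript sketch of the kind of title normalization this workflow delegates to Gemini. The alias table and function name are illustrative only, not part of the template:

```javascript
// Illustrative only: a hand-maintained lookup that maps common title variants
// to one canonical form - the kind of cleanup Gemini handles at scale.
const TITLE_ALIASES = {
  "vp sales": "Vice President of Sales",
  "vp of sales": "Vice President of Sales",
  "vice president, sales": "Vice President of Sales",
};

function normalizeTitle(raw) {
  // Lowercase and strip periods so "V.P. Sales" and "VP Sales" hit the same key.
  const key = raw.trim().toLowerCase().replace(/\./g, "");
  return TITLE_ALIASES[key] ?? raw.trim();
}

console.log(normalizeTitle("VP Sales"));              // Vice President of Sales
console.log(normalizeTitle("Vice President, Sales")); // Vice President of Sales
```

A lookup like this breaks down as soon as titles drift from the list, which is exactly why the workflow uses an AI step for the cleanup instead.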
The Solution: Bright Data scrapes, Gemini cleans, n8n delivers
This workflow turns raw LinkedIn pages into clean lead records you can actually use. You start by providing the LinkedIn person URL (and the related company URL). n8n uses Bright Data’s MCP Server LinkedIn tools to scrape the underlying page data reliably, even when LinkedIn is picky. Then an AI layer (Google Gemini) transforms that scrape into structured fields and a readable narrative, so job titles, locations, company details, and summaries come out consistent. Finally, the workflow posts results to a webhook endpoint you choose and saves copies to disk, giving you both an immediate “push” and a stored record for later.
The flow kicks off from a manual run trigger (easy for one-off research). Bright Data pulls the person and company data in parallel, then Gemini standardizes and shapes it into a clean output. After that, n8n aggregates the final record and sends it out through a webhook while also saving files for traceability.
What You Get: Automation vs. Results
| What This Workflow Automates | Results You’ll Get |
|---|---|
| Scraping LinkedIn person and company pages via Bright Data MCP tools | Raw profile data collected without copy-paste or browser-tab juggling |
| Cleaning and standardizing fields with Google Gemini | Consistent titles, locations, and company details across every row |
| Posting results to your webhook and saving JSON files to disk | Leads pushed to your stack immediately, with a stored copy for auditing |
Example: What This Looks Like
Say you’re building a list of 25 LinkedIn leads for a campaign. Manually, you might spend about 8 minutes per person profile plus another 5 minutes grabbing the matching company details, which works out to roughly 5.5 hours for the batch. With this workflow, you paste the URLs once, run it, and wait for the scrape and Gemini cleanup to finish (often under 10 minutes per batch, depending on LinkedIn and your setup). You still review the final rows, but that’s a quick scan, not a rebuild.
What You’ll Need
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- Bright Data account to access Web Unlocker scraping
- Bright Data MCP Server to scrape via MCP tools
- Google Gemini API key (get it from Google AI Studio)
Skill level: Intermediate. You’ll paste URLs, add credentials, and be comfortable adjusting a webhook endpoint and local file paths.
Don’t want to set this up yourself? Talk to an automation expert (free 15-minute consultation).
How It Works
You provide LinkedIn URLs. The workflow starts from a manual run, then assigns the LinkedIn person URL and company URL into the flow so every downstream step uses the right targets.
Bright Data does the scraping. n8n connects to the Bright Data MCP Server and calls the LinkedIn scraping tools for both the person page and company page, pulling back the raw response content that would be painful to collect by hand.
Gemini cleans and structures it. The workflow parses the MCP response, extracts a company narrative, and runs the transformation through a Gemini chat step so the output becomes consistent fields instead of unpredictable blobs of text.
You get a saved record and a push notification. n8n aggregates the final result, posts the person and company payloads to your webhook endpoint, and also writes files to disk so you have a persistent copy for auditing or reuse.
You can easily modify the webhook destination to send results to Slack, Airtable, Notion, or your CRM based on your needs. See the full implementation guide below for customization options.
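If you swap the webhook destination for your own endpoint, the company payload arrives in the `{ about, story }` shape configured later in Step 6. Here's a hedged sketch of a receiver-side handler; the function name and routing are assumptions, not part of the template:

```javascript
// Validates the company payload this workflow POSTs and decides what to do
// with it. Written as a pure function so it drops into any HTTP server,
// serverless handler, or test harness you already run.
function handleCompanyWebhook(body) {
  const payload = typeof body === "string" ? JSON.parse(body) : body;
  if (!payload.about || !payload.story) {
    return { status: 400, error: "expected { about, story }" };
  }
  // Forward to Slack, Airtable, Notion, or your CRM here.
  return { status: 200, received: Object.keys(payload) };
}

console.log(handleCompanyWebhook({ about: "Acme Corp", story: "Acme was founded..." }));
// { status: 200, received: [ 'about', 'story' ] }
```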
Step-by-Step Implementation Guide
Step 1: Configure the Manual Trigger
Start the workflow with a manual trigger so you can test and iterate quickly.
- Add the Manual Run Trigger node as the workflow trigger.
- Leave all parameters empty in Manual Run Trigger (default manual execution).
- Connect Manual Run Trigger to both Fetch MCP Tool Catalog and Retrieve Bright Data Tools to match the parallel start.
Manual Run Trigger outputs to both Fetch MCP Tool Catalog and Retrieve Bright Data Tools in parallel.
Step 2: Connect MCP Tool Services
These nodes initialize the MCP tool catalog and Bright Data tools used for scraping.
- Open Fetch MCP Tool Catalog and ensure it has credentials.
- Credential Required: Connect your mcpClientApi credentials.
- Open Retrieve Bright Data Tools and ensure it has credentials.
- Credential Required: Connect your mcpClientApi credentials.
Fetch MCP Tool Catalog connects to Assign Profile URLs, while Retrieve Bright Data Tools connects to Set Company Profile URL.
Step 3: Set Up Profile URL Inputs
Provide the LinkedIn person and company URLs and webhook targets that downstream nodes will use.
- In Assign Profile URLs, set url to `https://www.linkedin.com/in/[YOUR_ID]/`.
- In Assign Profile URLs, set webhook_url to `https://webhook.site/[YOUR_ID]`.
- In Set Company Profile URL, set url to `https://www.linkedin.com/company/[YOUR_ID]/`.
- In Set Company Profile URL, set webhook_url to `https://webhook.site/[YOUR_ID]`.
Leaving any URL with the [YOUR_ID] placeholder will result in empty or invalid scrape responses.

Step 4: Configure the Scrape and Parsing Pipeline
Scrape person and company profiles, then parse the company data for downstream processing.
- Open MCP Person Scrape and set toolName to `web_data_linkedin_person_profile`.
- In MCP Person Scrape, set operation to `executeTool` and toolParameters to `={ "url": "{{ $json.url }}" }`.
- Credential Required: Connect your mcpClientApi credentials in MCP Person Scrape.
- Open MCP Company Scrape and set toolName to `web_data_linkedin_company_profile`.
- In MCP Company Scrape, set operation to `executeTool` and toolParameters to `={ "url": "{{ $json.url }}" }`.
- Credential Required: Connect your mcpClientApi credentials in MCP Company Scrape.
- In Parse MCP Response, keep jsCode set to `jsonContent = JSON.parse($input.first().json.result.content[0].text); return jsonContent;`.
MCP Person Scrape outputs to both Post Person Webhook and Build Person Binary in parallel, while MCP Company Scrape outputs to both Parse MCP Response and Extract Company Narrative in parallel.
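The one-line parser in Parse MCP Response works when the scrape succeeds, but fails with a cryptic error when Bright Data returns an empty or non-JSON body. A more defensive variant, written here as a plain function so it can be tested standalone (in the n8n Code node you'd pass `$input.first().json` and `return` the result):

```javascript
// Defensive version of: JSON.parse($input.first().json.result.content[0].text)
// Assumes the MCP result shape this template produces:
//   { result: { content: [ { text: "<json string>" } ] } }
function parseMcpResponse(json) {
  const raw = json?.result?.content?.[0]?.text;
  if (typeof raw !== "string" || raw.length === 0) {
    throw new Error("MCP response had no text content - check the scrape step");
  }
  try {
    return JSON.parse(raw);
  } catch (err) {
    throw new Error(`MCP response was not valid JSON: ${err.message}`);
  }
}

// Stubbed MCP response for illustration:
const sample = { result: { content: [{ text: '{"name":"Jane Doe","title":"VP Sales"}' }] } };
console.log(parseMcpResponse(sample).name); // Jane Doe
```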
Step 5: Set Up AI Narrative and Aggregation
Use Gemini to generate the company story, then merge and aggregate results.
- In Extract Company Narrative, set text to `=Write a complete story of the provided company information in JSON. Use the following Company info to produce a story or a blog post. Make sure to incorporate all the provided company context. Here's the Company Info in JSON - {{ $json.input }}`.
- Keep the attribute configuration in Extract Company Narrative with company_story required.
- Open Gemini Chat Engine and set modelName to `models/gemini-2.0-flash-exp`.
- Credential Required: Connect your googlePalmApi credentials in Gemini Chat Engine.
- Ensure Gemini Chat Engine is connected as the language model for Extract Company Narrative.
- In Combine Streams, keep defaults to merge Parse MCP Response and Extract Company Narrative output.
- In Aggregate Results, aggregate about and output.company_story as configured.
Step 6: Configure Output Destinations
Send webhook responses and save person/company data to disk.
- In Post Person Webhook, set url to `={{ $('Assign Profile URLs').item.json.webhook_url }}` and enable sendBody.
- In Post Person Webhook, set the body parameter response to `={{ $json.result.content[0].text }}`.
- In Build Person Binary, keep functionCode set to the base64 JSON conversion script.
- In Save Person File, set operation to `write` and fileName to `d:\LinkedIn-Person.json`.
- In Post Company Webhook, set url to `={{ $('Set Company Profile URL').item.json.webhook_url }}` and specifyBody to `json`.
- In Post Company Webhook, set jsonBody to `={ "about": {{ JSON.stringify($json.about[0]) }}, "story": {{ JSON.stringify($json.company_story[0]) }} }`.
- In Build Company Binary, keep functionCode set to the base64 JSON conversion script.
- In Save Company File, set operation to `write` and fileName to `d:\LinkedIn-Company.json`.
Aggregate Results outputs to both Post Company Webhook and Build Company Binary in parallel.
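The template's "base64 JSON conversion script" isn't shown above. Assuming it prepares data the way n8n's Write Binary File node expects, it looks roughly like this sketch; the field names follow n8n's binary item convention, but treat the details as an approximation, not the template's exact code:

```javascript
// Sketch: serialize an item's JSON and expose it as base64 binary data so a
// downstream Write Binary File node can save it to disk.
function toJsonBinary(json, fileName) {
  const body = JSON.stringify(json, null, 2);
  return {
    json,
    binary: {
      data: {
        data: Buffer.from(body, "utf-8").toString("base64"),
        mimeType: "application/json",
        fileName,
      },
    },
  };
}

const item = toJsonBinary({ name: "Jane Doe" }, "LinkedIn-Person.json");
// Decoding the base64 payload round-trips to the original JSON:
console.log(Buffer.from(item.binary.data.data, "base64").toString("utf-8"));
```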
The paths `d:\LinkedIn-Person.json` and `d:\LinkedIn-Company.json` must exist and be writable on the n8n host.

Step 7: Test and Activate Your Workflow
Run a manual execution to validate scraping, AI generation, and output delivery.
- Click Execute Workflow to run Manual Run Trigger.
- Confirm successful runs on Post Person Webhook and Post Company Webhook by checking the webhook endpoint logs.
- Verify files saved by Save Person File and Save Company File at `d:\LinkedIn-Person.json` and `d:\LinkedIn-Company.json`.
- Once verified, switch the workflow toggle to Active for production use.
Common Gotchas
- Bright Data credentials can expire or need specific permissions. If things break, check your Bright Data API token and the Web Unlocker zone setup in the Bright Data control panel first.
- If you’re using Wait nodes or external rendering, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
- Default prompts in AI nodes are generic. Add your brand voice early or you’ll be editing outputs forever.
Frequently Asked Questions
How long does setup take?
Plan on about an hour if your Bright Data and Gemini keys are ready.
Do I need coding skills to set this up?
No. You’ll mostly paste URLs, connect credentials, and choose where the webhook should send the output.
Can I run this for free?
Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in Bright Data usage and Gemini API usage, which depends on how many profiles you process.
Where should I host n8n?
Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.
Can I send the results to Google Sheets instead of a webhook?
Yes, and it’s a common tweak. You can add or extend the Google Sheets node to append a row after the “Aggregate Results” step, mapping Gemini’s structured fields into consistent columns. Many teams also adjust the “Extract Company Narrative” and Gemini prompt so the same column names are always produced, even when profiles look different.
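For the Google Sheets route, the key is flattening the aggregated record into a fixed column order before appending, so rows stay aligned even when a profile is missing fields. A hedged sketch; the column names are examples, not the template's exact output:

```javascript
// Fixed column order keeps every appended row aligned, run after run.
const COLUMNS = ["name", "title", "company", "location", "company_story"];

function toSheetRow(record) {
  // Missing fields become blanks instead of shifting later columns left.
  return COLUMNS.map((col) => record[col] ?? "");
}

console.log(toSheetRow({ name: "Jane Doe", title: "VP Sales", company: "Acme" }));
// [ 'Jane Doe', 'VP Sales', 'Acme', '', '' ]
```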
What should I check if the Bright Data scraping suddenly fails?
Usually it’s an API token issue or the MCP Client (STDIO) credentials aren’t pointing to the right local MCP Server. Double-check the Bright Data API_TOKEN environment value, then confirm your Web Unlocker zone exists and matches what you configured. If it worked yesterday and fails today, regenerate the token and update it in n8n. Rate limits can also show up as random failures when you try to scrape too many profiles back-to-back.
How many leads can I process at once?
It depends on your n8n plan and your Bright Data limits. On n8n Cloud, your monthly executions cap how many runs you can do, while self-hosting has no execution limit (your server becomes the constraint). Practically, most teams process leads in batches of 10 to 50 to keep scraping stable and make reviewing the output easy.
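If you script the batching yourself, a simple chunk helper keeps each run inside that 10-to-50 range:

```javascript
// Split a flat list of profile URLs into fixed-size batches.
function chunk(urls, size = 25) {
  const batches = [];
  for (let i = 0; i < urls.length; i += size) {
    batches.push(urls.slice(i, i + size));
  }
  return batches;
}

console.log(chunk(["a", "b", "c"], 2)); // [ [ 'a', 'b' ], [ 'c' ] ]
```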
Is n8n better than Zapier or Make for this?
For this specific job, yes, because you’re combining web scraping, parsing, AI transformation, and multiple outputs in one flow. Zapier and Make can work for simpler “send data from A to B” tasks, but they get awkward when you need branching logic, file writes, or custom handling of scrape responses. n8n also gives you the option to self-host, which is useful when runs get frequent. One more thing: this template relies on an MCP community node, so you’ll want the flexibility of n8n’s ecosystem. If you’re unsure, Talk to an automation expert and we’ll point you in the right direction.
Clean lead data is the difference between “spray and pray” outreach and a system you can scale. Set this up once, then let the workflow handle the tedious parts while you focus on messaging and follow-up.
Need Help Setting This Up?
Our automation experts can build and customize this workflow for your specific needs. Free 15-minute consultation—no commitment required.