Bright Data to Supabase, clean LinkedIn profiles
LinkedIn research looks simple until you do it all day. You copy a profile link, skim experience, guess skills, paste notes into a sheet, then repeat. And later, when someone asks “show me everyone with RevOps + HubSpot,” your notes are useless because nothing is standardized.
This Supabase LinkedIn automation is aimed at recruiters first. But growth teams building lead lists and analysts trying to spot patterns feel the same pain. The outcome is straightforward: you get clean, query-ready profile fields in Supabase without hand-parsing every line.
Below, you’ll see how the workflow scrapes a LinkedIn URL with Bright Data, uses Gemini to turn messy text into consistent fields, and then creates or updates the record in Supabase so your team can actually use the data.
How This Automation Works
The full n8n workflow, from trigger to final output:
n8n Workflow Template: Bright Data to Supabase, clean LinkedIn profiles
```mermaid
flowchart LR
subgraph sg0["Summarizer Flow"]
direction LR
n0@{ icon: "mdi:cog", form: "rounded", label: "Access and extract data from..", pos: "b", h: 48 }
n1@{ icon: "mdi:swap-vertical", form: "rounded", label: "Set the Input Fields", pos: "b", h: 48 }
n2@{ icon: "mdi:robot", form: "rounded", label: "Summarizer", pos: "b", h: 48 }
n3@{ icon: "mdi:robot", form: "rounded", label: "Skills Extractor", pos: "b", h: 48 }
n4@{ icon: "mdi:brain", form: "rounded", label: "Google Gemini Chat Model for..", pos: "b", h: 48 }
n5@{ icon: "mdi:brain", form: "rounded", label: "Google Gemini Chat Model for..", pos: "b", h: 48 }
n6@{ icon: "mdi:brain", form: "rounded", label: "Google Gemini Chat Model for..", pos: "b", h: 48 }
n7@{ icon: "mdi:robot", form: "rounded", label: "Markdown Content", pos: "b", h: 48 }
n8@{ icon: "mdi:brain", form: "rounded", label: "Google Gemini Chat Model for..", pos: "b", h: 48 }
n9@{ icon: "mdi:robot", form: "rounded", label: "Emerging Roles", pos: "b", h: 48 }
n10["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/supabase.svg' width='40' height='40' /></div><br/>Create a row"]
n11@{ icon: "mdi:brain", form: "rounded", label: "Google Gemini Chat Model for..", pos: "b", h: 48 }
n12@{ icon: "mdi:robot", form: "rounded", label: "Basic Profile Info", pos: "b", h: 48 }
n13["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/merge.svg' width='40' height='40' /></div><br/>Merge"]
n14@{ icon: "mdi:cog", form: "rounded", label: "Aggregate", pos: "b", h: 48 }
n15["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/webhook.dark.svg' width='40' height='40' /></div><br/>Webhook"]
n16@{ icon: "mdi:swap-vertical", form: "rounded", label: "Set the Scraped Response", pos: "b", h: 48 }
n17["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/supabase.svg' width='40' height='40' /></div><br/>Get a row"]
n18["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/supabase.svg' width='40' height='40' /></div><br/>Update Row"]
n19["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/webhook.dark.svg' width='40' height='40' /></div><br/>Respond to Webhook"]
n20@{ icon: "mdi:location-exit", form: "rounded", label: "Stop and Error", pos: "b", h: 48 }
n21["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/webhook.dark.svg' width='40' height='40' /></div><br/>Respond to Webhook Create"]
n22["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/webhook.dark.svg' width='40' height='40' /></div><br/>Respond to Webhook Update"]
n23["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/webhook.dark.svg' width='40' height='40' /></div><br/>Respond to Webhook Not Found"]
n24@{ icon: "mdi:swap-horizontal", form: "rounded", label: "If record exist", pos: "b", h: 48 }
n25@{ icon: "mdi:swap-horizontal", form: "rounded", label: "If force create?", pos: "b", h: 48 }
n26@{ icon: "mdi:swap-horizontal", form: "rounded", label: "If status code 200", pos: "b", h: 48 }
n13 --> n14
n15 --> n1
n14 --> n25
n17 --> n24
n2 --> n13
n18 --> n22
n10 --> n21
n9 --> n13
n24 --> n18
n24 --> n23
n25 --> n17
n25 --> n10
n7 --> n13
n3 --> n13
n12 --> n13
n19 --> n20
n1 --> n0
n26 --> n19
n26 --> n12
n26 --> n9
n26 --> n7
n26 --> n2
n26 --> n3
n16 --> n26
n6 -.-> n7
n5 -.-> n2
n0 --> n16
n8 -.-> n9
n4 -.-> n3
n11 -.-> n12
end
%% Styling
classDef trigger fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
classDef ai fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef aiModel fill:#e8eaf6,stroke:#3f51b5,stroke-width:2px
classDef decision fill:#fff8e1,stroke:#f9a825,stroke-width:2px
classDef database fill:#fce4ec,stroke:#c2185b,stroke-width:2px
classDef api fill:#fff3e0,stroke:#e65100,stroke-width:2px
classDef code fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef disabled stroke-dasharray: 5 5,opacity: 0.5
class n2,n3,n7,n9,n12 ai
class n4,n5,n6,n8,n11 aiModel
class n24,n25,n26 decision
class n15,n19,n21,n22,n23 api
classDef customIcon fill:none,stroke:none
class n10,n13,n15,n17,n18,n19,n21,n22,n23 customIcon
```
The Problem: LinkedIn research stays messy and unsearchable
LinkedIn profiles weren’t designed for structured analysis. A profile is a mix of headings, timeline entries, endorsements, and “about” sections that change depending on who wrote them. So when you scrape or copy it, you get raw text (or raw HTML) that still needs a human to interpret it. That’s where time disappears: reading between the lines, translating vague titles into seniority, and trying to capture skills consistently. Then the real damage shows up later, when you cannot query anything because everyone recorded the same thing differently.
The friction compounds. Here’s where it breaks down in the real world.
- One person writes “Head of Growth,” another writes “Growth Lead,” and your system treats them like two different roles.
- Manual profile review quietly eats about 10 minutes per profile, and it’s rarely just one profile.
- Raw scrapes give you a wall of text, so “skills” become guesses instead of fields you can filter and score.
- When updates happen, your notes go stale because re-checking profiles feels like starting over.
The Solution: Bright Data scraping + Gemini enrichment into Supabase
This workflow creates a repeatable pipeline for LinkedIn profile data. It starts with a webhook request that includes the LinkedIn URL and a few input fields (like who’s logging the request, and whether you want to force-create a record). n8n sends that URL to Bright Data to fetch the profile content reliably, then maps the response into a cleaner payload. From there, Gemini takes over to enrich what you scraped by extracting structured fields such as skills, a professional summary, experience highlights (including markdown-friendly formatting), and emerging role signals. Finally, the workflow writes those standardized fields into Supabase, creating a new row or updating an existing one so your database stays current.
The flow is simple: a webhook kicks it off, Bright Data returns the profile content, several Gemini-powered extractors turn it into consistent fields, and Supabase becomes the single source of truth. If a record already exists, it updates instead of duplicating. If the scrape fails, the workflow returns an error response quickly, so you’re not guessing what happened.
What You Get: Automation vs. Results
| What This Workflow Automates | Results You’ll Get |
|---|---|
| Scraping each LinkedIn profile URL through Bright Data | Profile content captured without copy-paste or manual review |
| Gemini extraction of skills, summaries, experience highlights, and emerging role signals | Consistent, query-ready fields instead of free-form notes |
| Create-or-update writes to Supabase | One deduplicated table you can filter, score, and refresh |
Example: What This Looks Like
Say you review 30 LinkedIn profiles for a role each week. Manually, at about 10 minutes per profile to read, interpret, and paste structured notes, that’s roughly 5 hours weekly (and the notes still won’t be consistent). With this workflow, you submit 30 URLs through a webhook or form in about 20 minutes total, then wait for enrichment to complete in the background. You end the week with a Supabase table you can query by skill, title, or summary fields instead of scanning notes.
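If you're batching those 30 URLs, a small script can push them to the workflow for you. Here's a minimal sketch, assuming a production webhook URL and payload keys (url, requested_by, force_create) that you'd match to your own Define Input Fields mapping:

```python
import time
import requests

# Hypothetical webhook URL and payload keys -- align these with your own
# n8n webhook path and the keys mapped in Define Input Fields.
WEBHOOK_URL = "https://your-n8n-instance.com/webhook/linkedin-profile"

profile_urls = [
    "https://www.linkedin.com/in/example-profile-1",
    "https://www.linkedin.com/in/example-profile-2",
    # ...the rest of this week's list
]

for url in profile_urls:
    payload = {
        "url": url,                       # required: the profile to scrape
        "requested_by": "sourcing-team",  # optional metadata
        "force_create": False,            # let the workflow update existing rows
    }
    resp = requests.post(WEBHOOK_URL, json=payload, timeout=120)
    print(url, resp.status_code)
    time.sleep(2)  # gentle pacing so you don't hammer Bright Data and Gemini
```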
What You’ll Need
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- Bright Data for scraping LinkedIn profile content.
- Google Gemini to extract structured fields from text.
- Supabase to store profiles as queryable records.
- Gemini API key (get it from Google AI Studio).
Skill level: Intermediate. You’ll connect credentials, edit a couple of fields, and test webhook requests.
Don’t want to set this up yourself? Talk to an automation expert (free 15-minute consultation).
How It Works
A webhook receives the profile request. You send a LinkedIn URL (and optional fields like “force create”) to the Incoming Webhook Trigger. This makes it easy to call from a form, a sheet, or an internal tool.
The profile content gets fetched and validated. Bright Data retrieves the page content, then the workflow maps the scrape response and checks the status code. If the scrape didn’t work, n8n returns an error response immediately so you can retry or investigate.
Gemini turns messy text into consistent fields. Several extractors focus on different outputs (skills, summary, experience highlights, emerging roles, and core profile details). The workflow then merges and aggregates everything into one clean payload.
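The exact fields depend on how you've written the extractor prompts, but the aggregated payload ends up as one flat record per profile. Here's a rough sketch of that shape with assumed field names (not the template's exact schema):

```python
# Illustrative shape of the merged record after all extractors have run.
# Field names are assumptions; your Supabase columns should mirror whatever
# your extractor prompts actually return.
aggregated_profile = {
    "linkedin_url": "https://www.linkedin.com/in/example-profile",
    "full_name": "Jane Doe",
    "headline": "Growth Lead",
    "summary": "Growth leader with 8 years across B2B SaaS...",
    "skills": ["RevOps", "HubSpot", "SQL"],
    "experience_markdown": "### Growth Lead, Acme (2021-present)\n- Built outbound engine...",
    "emerging_roles": ["Head of Growth"],
}
```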
Supabase becomes the system of record. The workflow checks if the profile already exists, then inserts a new row or updates the existing one. That means you can refresh data without duplicates, which is a bigger deal than it sounds.
You can easily modify the extracted attributes to match your scoring model, like adding seniority, “open to work” signals, or target technologies. See the full implementation guide below for customization options.
Step-by-Step Implementation Guide
Step 1: Configure the Webhook Trigger
Set up the entry point so external systems can send profile URLs and metadata into the workflow.
- Add and open Incoming Webhook Trigger.
- Copy the webhook URL generated by Incoming Webhook Trigger and share it with your calling system.
- Confirm the webhook receives the expected payload fields that will be normalized in Define Input Fields.
⚠️ Common Pitfall: Testing the webhook without the required URL field will cause downstream parsing to fail. Ensure your POST body includes the URL you want to scrape.
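To sanity-check the trigger, send a test request to the node's test URL while the workflow is listening. A minimal sketch; the payload keys are assumptions you'd align with Define Input Fields:

```python
import requests

# n8n shows separate test and production URLs on the webhook node;
# use the test URL while the editor is in "Listen for test event" mode.
TEST_WEBHOOK_URL = "https://your-n8n-instance.com/webhook-test/linkedin-profile"

payload = {
    "url": "https://www.linkedin.com/in/example-profile",  # required
    "requested_by": "manual-test",
    "force_create": True,  # hypothetical flag: skip the lookup and insert a fresh row
}

resp = requests.post(TEST_WEBHOOK_URL, json=payload, timeout=120)
print(resp.status_code, resp.text)
```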
Step 2: Connect Bright Data for URL Scraping
Normalize incoming data and fetch page content for extraction.
- In Define Input Fields, map incoming webhook fields into standardized keys used by the workflow.
- Open Fetch URL Content Data and connect your Bright Data account.
- Keep Map Scrape Response as the next node to structure the Bright Data output into usable fields for validation and extraction.
Credential Required: Connect your Bright Data credentials in Fetch URL Content Data.
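In the template this mapping lives in a Set node, but the logic is simple enough to sketch in code. Roughly what Define Input Fields does, with assumed key names:

```python
def normalize_input(body: dict) -> dict:
    """Map the raw webhook body onto the keys the rest of the workflow expects.
    Key names are assumptions; mirror whatever your Set node actually defines."""
    return {
        "profile_url": body.get("url", "").strip(),
        "requested_by": body.get("requested_by", "unknown"),
        # Accept "true"/"false" strings as well as booleans from form tools.
        "force_create": str(body.get("force_create", False)).lower() == "true",
    }

# Example webhook body
print(normalize_input({"url": " https://www.linkedin.com/in/example ", "force_create": "true"}))
```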
Step 3: Set Up Status Validation and Routing
Check the scrape response status and route errors before any AI extraction happens.
- In Map Scrape Response, ensure the status code and body fields are available for validation logic.
- Configure Validate Status Code to evaluate the scrape response and route failures to Return Error Response.
- Confirm Return Error Response connects to Halt With Error to stop the workflow on invalid responses.
⚠️ Common Pitfall: If Validate Status Code lets a failed scrape through (a false positive), the AI extractors will run on empty or malformed content.
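The Validate Status Code node is effectively an If check on the mapped scrape response. A sketch of the equivalent logic, assuming the response exposes status_code and body fields (match these to Map Scrape Response):

```python
def is_valid_scrape(response: dict) -> bool:
    """Return True only when Bright Data reports success AND there is content to
    extract from. Field names (status_code, body) are assumptions."""
    status_ok = response.get("status_code") == 200
    has_content = bool(response.get("body", "").strip())
    return status_ok and has_content

# A 200 with an empty body should still go down the error branch,
# otherwise the Gemini extractors run on nothing.
print(is_valid_scrape({"status_code": 200, "body": ""}))              # False
print(is_valid_scrape({"status_code": 403, "body": "blocked"}))       # False
print(is_valid_scrape({"status_code": 200, "body": "profile text"}))  # True
```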
Step 4: Set Up AI Extraction and Parsing
Extract structured profile data from the scraped content using Gemini models and information extractors.
- Ensure Validate Status Code routes to the AI extractors when the status is valid.
- Configure the information extractor nodes: Core Profile Details, Emerging Role Parser, Experience Markdown Parse, Profile Summary Extract, and Skill Details Extractor.
- Connect language models to each extractor: Gemini Profile Model → Core Profile Details, Gemini Emerging Role Model → Emerging Role Parser, Gemini Experience Model → Experience Markdown Parse, Gemini Summary Model → Profile Summary Extract, Gemini Skill Model → Skill Details Extractor.
- Confirm Validate Status Code outputs to all five extractors in parallel once content is validated.
Credential Required: Connect your Google Gemini credentials in each of the Gemini model nodes (Gemini Profile Model, Gemini Emerging Role Model, Gemini Experience Model, Gemini Summary Model, Gemini Skill Model). These credentials apply to the parent language model nodes, not the extractor nodes.
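Inside n8n, the information extractor nodes handle the prompting and schema for you, but it can help to see the equivalent call on its own. A minimal sketch using the google-generativeai Python client; the model name, prompt, and output shape are assumptions, not the template's exact configuration:

```python
import google.generativeai as genai

# Configure the client with the key from Google AI Studio.
genai.configure(api_key="YOUR_GEMINI_API_KEY")

# Assumed model; point this at whichever Gemini model your extractors use.
model = genai.GenerativeModel("gemini-1.5-flash")

scraped_text = "...raw profile text returned by Bright Data..."

prompt = (
    "Extract the candidate's skills from the LinkedIn profile text below. "
    'Respond with JSON only, shaped as {"skills": ["..."]}.\n\n' + scraped_text
)

response = model.generate_content(prompt)
print(response.text)  # in n8n, the extractor node parses this into structured fields
```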
Step 5: Configure Output and Database Actions
Merge AI results, decide whether to insert or update, and persist records in Supabase.
- Ensure all extractors feed into Combine Extracted Data, then into Aggregate Payload.
- Use Force Create Check to decide whether to query or insert directly.
- Configure Retrieve Database Row to look up existing records, then route to Check Record Presence.
- Set Check Record Presence to update via Modify Database Row or return Return Not Found Response if absent.
- Connect Insert Database Row for new entries and return success via Return Create Response; updates should return via Return Update Response.
Credential Required: Connect your Supabase credentials in Retrieve Database Row, Modify Database Row, and Insert Database Row.
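To see the branching outside n8n, the supabase-py client makes the lookup-then-insert-or-update pattern easy to sketch. Table and column names below are assumptions; match them to your own schema:

```python
from supabase import create_client

# Assumed project URL, service key, table, and column names.
supabase = create_client("https://your-project.supabase.co", "YOUR_SERVICE_ROLE_KEY")

def save_profile(record: dict, force_create: bool = False) -> str:
    """Insert a new row, or update the existing one keyed by linkedin_url."""
    if not force_create:
        existing = (
            supabase.table("linkedin_profiles")
            .select("id")
            .eq("linkedin_url", record["linkedin_url"])
            .execute()
        )
        if existing.data:
            supabase.table("linkedin_profiles").update(record).eq(
                "id", existing.data[0]["id"]
            ).execute()
            return "updated"
    supabase.table("linkedin_profiles").insert(record).execute()
    return "created"
```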
Step 6: Add Error Handling
Ensure the workflow returns clear errors and stops safely when scraping fails.
- Verify Validate Status Code routes invalid responses to Return Error Response.
- Confirm Return Error Response connects to Halt With Error to terminate execution on failure.
⚠️ Common Pitfall: If Return Error Response is not connected to Halt With Error, the workflow may continue into AI and database steps with invalid data.
Step 7: Test and Activate Your Workflow
Validate the end-to-end flow from webhook to database update.
- Use Incoming Webhook Trigger to test the workflow with a real URL payload.
- Confirm a successful run produces extracted fields merged in Combine Extracted Data and aggregated in Aggregate Payload.
- Verify the database action executes: a new row via Insert Database Row or an update via Modify Database Row.
- Check the response nodes (Return Create Response or Return Update Response) to confirm API clients receive the correct status.
- Activate the workflow by toggling it to Active once testing passes.
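After a successful test run, it's worth reading the row back to confirm the extracted fields actually landed as structured data. A quick check, again with assumed table and column names:

```python
from supabase import create_client

supabase = create_client("https://your-project.supabase.co", "YOUR_SERVICE_ROLE_KEY")

# Pull back the row for the profile you just submitted through the webhook.
row = (
    supabase.table("linkedin_profiles")
    .select("full_name, headline, skills, summary")
    .eq("linkedin_url", "https://www.linkedin.com/in/example-profile")
    .execute()
)
print(row.data)  # should show Gemini-extracted fields, not raw scraped text
```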
Common Gotchas
- Bright Data credentials can expire or need specific permissions. If things break, check the Bright Data API zone settings and your token in n8n credentials first.
- If you’re using Wait nodes or external rendering, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
- Gemini availability is geo-restricted, so “model not found” can be a location issue rather than a bad prompt. If outputs are also bland, rewrite the prompts early to match your hiring or lead criteria.
Frequently Asked Questions
How long does this take to set up?
About an hour if your Bright Data, Gemini, and Supabase accounts are ready.
Do I need to know how to code?
No. You’ll mainly connect credentials and test the webhook request. The only “code-like” part is copying the Supabase table script if you’re starting from scratch.
Is this free to run?
Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in Gemini API costs (about $0.002–$0.004 per enrichment request, depending on the model) plus Bright Data usage.
Should I use n8n Cloud or self-host?
Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.
Can I extract extra fields like seniority?
Yes, and it’s mostly prompt work. Update the Gemini enrichment prompts (the profile, summary, experience, and emerging role models) to extract seniority as a dedicated field, then include it in the aggregated payload before the Supabase insert/update step. Many teams also add “management scope,” “likely tools,” and “career transition signals” so filtering becomes more than just keyword search.
Why does the Bright Data scrape keep failing?
Usually it’s credentials or the Bright Data zone isn’t allowed to access the target properly. Regenerate your Bright Data token, confirm the zone settings, and then re-save the credential in n8n. Also check the workflow’s status-code validation path: if Bright Data returns a non-200 response, n8n will route to the error response on purpose. If it works sometimes and fails in bursts, you may be hitting Bright Data rate limits or LinkedIn anti-bot friction at higher volume.
How many profiles can I process per month?
On a typical n8n Cloud plan you can run thousands of executions per month, and self-hosting has no hard execution cap (it depends on your server). In practice, Bright Data and Gemini throughput will be your limiter, not Supabase inserts.
Is n8n better than Zapier or Make for this?
Often, yes. This workflow has multiple extraction passes (skills, summary, experience, emerging roles), merge/aggregate logic, and create-vs-update branching, which gets awkward and expensive in tools that price by task. n8n also lets you self-host, which matters when you’re processing a lot of profiles. Zapier or Make can still be fine if you only need a lightweight “URL in, row out” flow with minimal enrichment. If you want help picking the right approach, Talk to an automation expert.
Once your profiles land in Supabase as consistent fields, everything downstream gets easier: search, scoring, dashboards, and handoffs. Set it up once, then let the workflow do the parsing for you.
Need Help Setting This Up?
Our automation experts can build and customize this workflow for your specific needs. Free 15-minute consultation—no commitment required.