Bright Data + Google Gemini: cleaner trend reports
Trend research sounds simple until you’re juggling scraped pages, half-broken exports, and a “summary” that changes format every time you run it. Then the real work starts: cleaning, parsing, regrouping, and trying to remember where a number came from.
This hits SEO strategists hardest, but growth marketers and content analysts feel it too. If you’ve been looking for trend report automation that produces consistent notes you can actually reuse, this workflow is built for that.
You’ll see how it pulls any URL via Bright Data, has Google Gemini extract topics and regions into structured JSON, saves audit files, and pushes clean results into Google Sheets.
How This Automation Works
The full n8n workflow, from trigger to final output:
n8n Workflow Template: Bright Data + Google Gemini: cleaner trend reports
flowchart LR
subgraph sg0["When clicking ‘Test workflow’ Flow"]
direction LR
n0@{ icon: "mdi:play-circle", form: "rounded", label: "When clicking ‘Test workflow’", pos: "b", h: 48 }
n1@{ icon: "mdi:robot", form: "rounded", label: "Markdown to Textual Data Ext..", pos: "b", h: 48 }
n2@{ icon: "mdi:swap-vertical", form: "rounded", label: "Set URL and Bright Data Zone", pos: "b", h: 48 }
n3["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Initiate a Webhook Notificat.."]
n4["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Initiate a Webhook Notificat.."]
n5@{ icon: "mdi:brain", form: "rounded", label: "Google Gemini Chat Model for..", pos: "b", h: 48 }
n6@{ icon: "mdi:brain", form: "rounded", label: "Google Gemini Chat Model for..", pos: "b", h: 48 }
n7["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Perform Bright Data Web Requ.."]
n8@{ icon: "mdi:robot", form: "rounded", label: "Topic Extractor with the str..", pos: "b", h: 48 }
n9@{ icon: "mdi:robot", form: "rounded", label: "Trends by location and categ..", pos: "b", h: 48 }
n10@{ icon: "mdi:brain", form: "rounded", label: "Google Gemini Chat Model", pos: "b", h: 48 }
n11["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Initiate a Webhook Notificat.."]
n12@{ icon: "mdi:code-braces", form: "rounded", label: "Create a binary file for top..", pos: "b", h: 48 }
n13@{ icon: "mdi:cog", form: "rounded", label: "Write the topics file to disk", pos: "b", h: 48 }
n14@{ icon: "mdi:cog", form: "rounded", label: "Write the trends file to disk", pos: "b", h: 48 }
n15@{ icon: "mdi:code-braces", form: "rounded", label: "Create a binary file for trends", pos: "b", h: 48 }
n10 -.-> n9
n2 --> n7
n15 --> n14
n12 --> n13
n7 --> n1
n0 --> n2
n1 --> n8
n1 --> n3
n1 --> n9
n5 -.-> n1
n8 --> n4
n8 --> n12
n6 -.-> n8
n9 --> n11
n9 --> n15
end
%% Styling
classDef trigger fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
classDef ai fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef aiModel fill:#e8eaf6,stroke:#3f51b5,stroke-width:2px
classDef decision fill:#fff8e1,stroke:#f9a825,stroke-width:2px
classDef database fill:#fce4ec,stroke:#c2185b,stroke-width:2px
classDef api fill:#fff3e0,stroke:#e65100,stroke-width:2px
classDef code fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef disabled stroke-dasharray: 5 5,opacity: 0.5
class n0 trigger
class n1,n8,n9 ai
class n5,n6,n10 aiModel
class n3,n4,n7,n11 api
class n12,n15 code
classDef customIcon fill:none,stroke:none
class n3,n4,n7,n11 customIcon
The Problem: Trend research turns into a cleanup project
When you’re mining trends from web pages, you usually start with something messy: scraped HTML, markdown blobs, inconsistent headings, and a bunch of “almost useful” text. The painful part is what comes next. You copy chunks into docs, try to standardize categories, then re-check sources because someone asks, “Where did this insight come from?” One week later, you run the same research and the structure changes again, so your spreadsheet stops being comparable. That’s when trend tracking quietly becomes unscalable.
It adds up fast. Here’s where it breaks down in real teams.
- Scraped pages come back in different formats, so you spend about 1-2 hours just cleaning before analysis even starts.
- Topic summaries are inconsistent, which means your Google Sheets rows don’t line up week to week.
- Region or location trends often get handled manually, so the “by market” view is always late or skipped.
- No audit trail exists when insights live in chat messages, and it becomes stressful to defend decisions later.
The Solution: Bright Data + Gemini extraction into structured trend notes
This n8n workflow turns “a URL and a mess” into a structured trend output you can store, share, and reuse. It starts with a simple trigger (manual launch, or you can swap in a form, Telegram message, or a sheet-driven list). From there, Bright Data’s Web Unlocker fetches the page content reliably, even when sites are picky about bots. The workflow converts the returned markdown/HTML into clean plain text, then sends that text through Google Gemini prompts designed to pull out topics, themes, and region-based clusters. Finally, it posts structured payloads to your webhook endpoints and writes audit files to disk so you can trace exactly what was extracted.
The workflow begins when you set a source URL and Bright Data zone, then runs the fetch. Next, Gemini parses the text and extracts structured topics plus regional trend groupings. The output gets sent out (webhook notifications) and preserved as files, and you can push the same JSON into Google Sheets for reporting.
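For orientation, here is roughly what a structured trend payload can look like once Gemini has done its extraction. The field names below are illustrative only; the exact shape depends on the manual schema you define in the extractor nodes:

```python
import json

# Illustrative payload shape; actual fields depend on your manual schema in n8n.
sample_output = {
    "source_url": "https://www.bbc.com/news/world",
    "topics": [
        {"name": "Energy prices", "mentions": 4},
        {"name": "Elections", "mentions": 7},
    ],
    "regions": {
        "Europe": ["Energy prices"],
        "Asia": ["Elections"],
    },
}

# Serialising to JSON keeps audit files and Sheets imports consistent run to run.
print(json.dumps(sample_output, indent=2))
```

Keeping the same keys every run is what makes week-over-week comparison possible downstream.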
What You Get: Automation vs. Results
| What This Workflow Automates | Results You’ll Get |
|---|---|
| Fetching page content through Bright Data Web Unlocker | Reliable access to pages that block basic scrapers |
| Converting scraped markdown/HTML into clean text | Analysis-ready input with no manual cleanup |
| Gemini topic extraction and regional trend clustering | Consistent, structured JSON that stays comparable week to week |
| Webhook notifications plus audit files written to disk | A traceable record of where every insight came from |
Example: What This Looks Like
Say you review 10 competitor or marketplace pages each week. Manually, you might spend about 15 minutes copying text, 10 minutes cleaning it, and another 10 minutes organizing topics per page, which is roughly 6 hours weekly. With this workflow, you paste the URL once, let Bright Data fetch the content, and Gemini returns structured topics plus region clusters. Even if you allow about 10 minutes per URL for processing and spot-checking, that’s closer to 2 hours total. You get your week back, and your spreadsheet finally stays consistent.
What You’ll Need
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- Bright Data for fetching and unlocking target URLs.
- Google Gemini to extract topics, regions, and trend clusters.
- Bright Data Web Unlocker token (get it from Bright Data Web Unlocker zone settings).
Skill level: Intermediate. You’ll connect credentials, edit a URL/zone field, and test a few runs to tune prompts and outputs.
Don’t want to set this up yourself? Talk to an automation expert (free 15-minute consultation).
How It Works
A URL kicks things off. In the included version, you manually launch the workflow and it reads a “source URL + Bright Data zone” from a set node. You can swap this trigger for a Jotform submission, a Telegram message, or a Gmail trigger if you prefer.
Bright Data fetches the content. The workflow calls Bright Data Web Unlocker via HTTP request, which is useful for pages that block basic scrapers. You get back page content that’s workable, even if it starts as markdown or HTML.
Gemini turns text into structure. A “convert markdown to text” step cleans the content, then Gemini models and extraction nodes pull out topics, themes, and regional clusters. The goal is predictable JSON, so your downstream tools don’t have to guess.
Results get shared and saved. The workflow posts separate webhook notifications for text output, topic results, and trend results. It also builds binary files and writes them to disk, which gives you a tidy audit trail for later reviews.
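Inside n8n, the file step is a binary-builder node plus a write-to-disk node. Outside n8n, the equivalent audit-trail behavior is a few lines of Python (the `audit/` directory here is an arbitrary example path, not something the workflow prescribes):

```python
import json
from pathlib import Path

def save_audit_file(payload: dict, path: str) -> Path:
    """Write a structured result to disk so every run leaves a traceable artifact."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return out

# Example: persist a topic result the way the workflow persists topics.json.
saved = save_audit_file({"topics": ["example"]}, "audit/topics.json")
print(saved.read_text(encoding="utf-8"))
```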
You can easily modify the input source to read URLs from Google Sheets and write structured results back into new rows based on your needs. See the full implementation guide below for customization options.
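Conceptually, the Sheets-driven variant is just a loop over the fetch-and-extract pipeline, one row per URL. This sketch uses stand-in functions for the Bright Data and Gemini steps, so it shows the control flow rather than real API calls:

```python
def process_urls(urls, fetch, extract):
    """Run fetch -> extract per URL, mirroring a Sheets-driven loop.

    `fetch` and `extract` are stand-ins for the Bright Data and Gemini steps.
    """
    rows = []
    for url in urls:
        text = fetch(url)
        result = extract(text)
        rows.append({"url": url, **result})
    return rows

# Stub implementations so the sketch runs end to end without network calls.
demo = process_urls(
    ["https://example.com/a", "https://example.com/b"],
    fetch=lambda u: f"content of {u}",
    extract=lambda t: {"topics": [t.split()[-1]]},
)
print(demo)
```

In n8n this becomes a Google Sheets read node feeding a loop, with the result rows appended back to the sheet.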
Step-by-Step Implementation Guide
Step 1: Configure the Manual Trigger
Start the workflow with a manual trigger so you can test the data mining flow on demand.
- Add the Manual Launch Trigger node as the entry point.
- Connect Manual Launch Trigger to Configure Source URL & Zone.
Step 2: Connect Bright Data Fetch
Define the target URL and Bright Data zone, then request the markdown content from Bright Data.
- In Configure Source URL & Zone, set url to `https://www.bbc.com/news/world`.
- Set zone to `web_unlocker1`.
- Open Execute Bright Data Fetch and set URL to `https://api.brightdata.com/request`.
- Set Method to `POST`, then enable Send Body and Send Headers.
- Under body parameters, set zone to `{{ $json.zone }}` and url to `{{ $json.url }}?product=unlocker&method=api`, with format `raw` and data_format `markdown`.
- Credential Required: Connect your httpHeaderAuth credentials in Execute Bright Data Fetch.
Step 3: Set Up Markdown Conversion and AI Models
Use an LLM chain to convert the fetched markdown into clean text for downstream analysis.
- In Convert Markdown to Text, set Text to `=You need to analyze the below markdown and convert to textual data. Please do not output with your own thoughts. Make sure to output with textual data only with no links, scripts, css etc. {{ $json.data }}`.
- Ensure the message role includes `You are a markdown expert` in Convert Markdown to Text.
- Confirm Gemini Model for Text Parse is connected as the language model for Convert Markdown to Text with modelName set to `models/gemini-2.0-flash-exp`.
- Credential Required: Connect your googlePalmApi credentials in Gemini Model for Text Parse (credentials are added to the model node, not the chain node).
Step 4: Configure Parallel Topic and Trend Extraction
After text conversion, run multiple analyses in parallel and route results to webhooks and file builders.
- Confirm that Convert Markdown to Text outputs to Structured Topic Analyzer, Notify Webhook: Text Output, and Cluster Trends by Region in parallel.
- In Structured Topic Analyzer, set Text to `=Perform the topic analysis on the below content and output with the structured information. Here's the content: {{ $('Execute Bright Data Fetch').item.json.data }}` and keep Schema Type as `manual`.
- In Cluster Trends by Region, set Text to `=Perform the data analysis on the below content and output with the structured information by clustering the emerging trends by location and category Here's the content: {{ $('Execute Bright Data Fetch').item.json.data }}` and keep Schema Type as `manual`.
- Ensure Gemini Model for Topic AI is connected as the language model for Structured Topic Analyzer with modelName `models/gemini-2.0-flash-exp`.
- Ensure Gemini Model for Trend AI is connected as the language model for Cluster Trends by Region with modelName `models/gemini-2.0-flash-exp`.
- Credential Required: Connect your googlePalmApi credentials in both Gemini Model for Topic AI and Gemini Model for Trend AI (credentials are added to the model nodes, not the extractor nodes).
Structured Topic Analyzer outputs to both Notify Webhook: Topic Results and Build Topic File Binary in parallel, while Cluster Trends by Region outputs to both Notify Webhook: Trend Results and Build Trends File Binary in parallel.
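Because both extractors use Schema Type `manual`, you define the output shape yourself. Here is a hypothetical manual schema for the topic branch; the field names are examples, so adjust them to match the columns your Sheets or webhook consumers expect:

```python
import json

# Hypothetical manual schema for the Structured Topic Analyzer;
# field names are illustrative, not the workflow's exact schema.
topic_schema = {
    "type": "object",
    "properties": {
        "topics": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "topic": {"type": "string"},
                    "category": {"type": "string"},
                    "summary": {"type": "string"},
                },
                "required": ["topic", "summary"],
            },
        }
    },
    "required": ["topics"],
}

print(json.dumps(topic_schema, indent=2))
```

Pinning required fields like this is what keeps the downstream rows lining up week to week.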
Step 5: Configure Webhook Notifications and File Outputs
Send results to webhooks and save the structured JSON outputs to disk.
- In Notify Webhook: Text Output, set URL to `https://webhook.site/3c36d7d1-de1b-4171-9fd3-643ea2e4dd76` and map content to `{{ $json.text }}`.
- In Notify Webhook: Topic Results, set URL to the same webhook and map summary to `{{ $json.output }}`.
- In Notify Webhook: Trend Results, set URL to the same webhook and map summary to `{{ $json.output }}`.
- In Build Topic File Binary and Build Trends File Binary, keep the function code as-is to build the binary file payload.
- In Save Topics to Disk, set Operation to `write` and File Name to `d:\topics.json`.
- In Save Trends to Disk, set Operation to `write` and File Name to `d:\trends.json`.
⚠️ Common Pitfall: Ensure the workflow has permission to write to d:\ on your host. Update paths if you are running n8n on Linux or in a container.
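If you run the same workflow on mixed hosts, it is worth picking the output path per operating system rather than hardcoding `d:\`. A minimal sketch (the `/data` fallback is an example directory, assuming your n8n process can write there):

```python
import platform
from pathlib import Path

def default_output_path(filename: str) -> Path:
    """Pick a writable location per OS; d:\\ only exists on Windows hosts."""
    if platform.system() == "Windows":
        return Path("d:/") / filename
    # Linux / container hosts: assumption that /data is mounted and writable.
    return Path("/data") / filename

print(default_output_path("topics.json"))
```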
Step 6: Test and Activate Your Workflow
Run a manual test to verify the parallel branches and outputs, then activate for production use.
- Click Execute Workflow to run Manual Launch Trigger.
- Verify Execute Bright Data Fetch returns markdown data and Convert Markdown to Text emits plain text.
- Confirm parallel execution: Structured Topic Analyzer, Notify Webhook: Text Output, and Cluster Trends by Region should all run at the same time.
- Check webhook responses and ensure files are written to
d:\topics.jsonandd:\trends.json. - When results look correct, toggle the workflow to Active for production use.
Common Gotchas
- Bright Data credentials can expire or need specific permissions. If things break, check your Web Unlocker token and zone name in the Bright Data dashboard first.
- If you’re using Wait nodes or external processing, runtimes vary. Bump up the wait duration if downstream nodes fail because they received an empty or partial response.
- Default prompts in Gemini are generic. Add your categories, regions, and “output as JSON” rules early or you will be editing outputs forever.
Frequently Asked Questions
**How long does setup take?**
About 30-60 minutes once you have your Bright Data and Gemini keys.

**Do I need to know how to code?**
No coding required. You’ll mostly update credentials, the source URL/zone, and a couple of prompt fields.

**Is n8n free to use?**
Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in Bright Data usage and Gemini API costs depending on how many URLs you process.

**Should I use n8n Cloud or self-host?**
Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.

**Can I adapt this to process a list of URLs?**
Yes, but you’ll want to change the input. Replace the “Configure Source URL & Zone” step with a Google Sheets read (one URL per row), then loop through URLs and write the “Topic Results” and “Trend Results” JSON back to the sheet. Common tweaks include forcing a fixed category list, adding a “trend_score” field, and expanding the region grouping beyond country to city or state if your sources support it.

**Why is my Bright Data request failing?**
Usually it’s an invalid Web Unlocker token or the wrong zone name. Check the Bright Data dashboard, regenerate the token if needed, then update the Header Auth credential in n8n. Also watch for blocked target URLs that require different unlocker settings, and rate limits if you fire a big batch all at once.

**How many URLs can this handle at scale?**
On n8n Cloud Starter, you can handle a few thousand runs per month comfortably, and self-hosting removes execution caps (your server becomes the limit). Practically, most teams start with batches of 20-50 URLs per run, then scale once the prompts and outputs are stable.

**Is n8n a better fit than Zapier or Make for this?**
Often, yes. This workflow relies on multi-step parsing, structured extraction, and saving audit files, which is where n8n tends to feel more flexible and less expensive as logic grows. Zapier and Make can still work if you only need “URL in, summary out” and don’t care about stable JSON. Frankly, the moment you want topic clusters by region and repeatable formatting, visual branching becomes your friend. Talk to an automation expert if you want help picking the simplest option.
Once this is running, trend research stops being a weekly cleanup ritual. You’ll have structured notes, saved audit files, and a sheet you can actually build decisions on.
Need Help Setting This Up?
Our automation experts can build and customize this workflow for your specific needs. Free 15-minute consultation—no commitment required.