Decodo + Google Sheets: forum news tracked for you
Keeping up with forums and niche news sounds simple until you’re juggling 12 tabs, two Slack threads, and a “temporary” spreadsheet that never stays clean. The real problem isn’t finding posts. It’s turning messy pages into consistent, usable rows you can trust.
This Decodo + Google Sheets tracking automation is aimed at market researchers first, but content strategists and founders monitoring competitors will feel the relief too. You get a running Google Sheet of titles, links, authors, and engagement stats without doing the daily sweep.
Below, you’ll see how the workflow collects forum content, uses AI to structure it, and logs everything into Google Sheets so you can scan trends in minutes, not hours.
How This Automation Works
See how this solves the problem:
n8n Workflow Template: Decodo + Google Sheets: forum news tracked for you
```mermaid
flowchart LR
subgraph sg0["Schedule Flow"]
direction LR
n0@{ icon: "mdi:play-circle", form: "rounded", label: "Schedule Trigger", pos: "b", h: 48 }
n1@{ icon: "mdi:swap-vertical", form: "rounded", label: "Split Forums", pos: "b", h: 48 }
n2@{ icon: "mdi:swap-vertical", form: "rounded", label: "Iterate Forums", pos: "b", h: 48 }
n3@{ icon: "mdi:brain", form: "rounded", label: "Google Gemini Model", pos: "b", h: 48 }
n4@{ icon: "mdi:robot", form: "rounded", label: "Extract Structured News Data", pos: "b", h: 48 }
n5@{ icon: "mdi:robot", form: "rounded", label: "Parse JSON Output", pos: "b", h: 48 }
n6@{ icon: "mdi:swap-vertical", form: "rounded", label: "Split News Items", pos: "b", h: 48 }
n7@{ icon: "mdi:cog", form: "rounded", label: "Generate Unique Key", pos: "b", h: 48 }
n8@{ icon: "mdi:cog", form: "rounded", label: "Wait Between Scrapes", pos: "b", h: 48 }
n9@{ icon: "mdi:database", form: "rounded", label: "Update Google Sheet (News)", pos: "b", h: 48 }
n10@{ icon: "mdi:swap-vertical", form: "rounded", label: "Workflow Config", pos: "b", h: 48 }
n11@{ icon: "mdi:cog", form: "rounded", label: "Scrape Forum Data", pos: "b", h: 48 }
n12@{ icon: "mdi:database", form: "rounded", label: "Log Scrape Results", pos: "b", h: 48 }
n1 --> n2
n2 --> n11
n10 --> n1
n0 --> n10
n6 --> n7
n5 -.-> n4
n11 --> n4
n12 --> n8
n7 --> n9
n3 -.-> n4
n8 --> n2
n4 --> n6
n4 --> n12
end
%% Styling
classDef trigger fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
classDef ai fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef aiModel fill:#e8eaf6,stroke:#3f51b5,stroke-width:2px
classDef decision fill:#fff8e1,stroke:#f9a825,stroke-width:2px
classDef database fill:#fce4ec,stroke:#c2185b,stroke-width:2px
classDef api fill:#fff3e0,stroke:#e65100,stroke-width:2px
classDef code fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef disabled stroke-dasharray: 5 5,opacity: 0.5
class n0 trigger
class n4,n5 ai
class n3 aiModel
class n9,n12 database
```
The Challenge: Forum and news monitoring turns into tab chaos
If you monitor a few forums, creator communities, or industry news sources, you already know the grind. You open the same sites, hunt for what changed, copy links into a doc, then try to remember why a post mattered two days later. And when you finally want to compare sources (which forum is spiking, which author keeps showing up, which topic is gaining steam), you realize you didn’t capture the same fields each time. It’s not just tedious. It’s unreliable, which makes your insights shaky.
The friction compounds. Here’s where it breaks down.
- Manually checking even 10 sources can burn about 1–2 hours a day once you include clicking, filtering, and notes.
- Different sites show “engagement” differently, so your tracking ends up inconsistent and hard to compare later.
- Copy-paste introduces small errors (wrong URL, wrong title), and those errors spread when you share the sheet.
- When you skip a day, you lose context and miss the early signals that matter most.
The Fix: Decodo scrapes sources and Google Sheets becomes your feed
This workflow runs on a schedule and does the annoying part for you. First, it loads a curated list of forum or news URLs you care about (plus any settings like geolocation). Then Decodo scrapes the raw content from each source, even when the page formatting is messy or inconsistent. After that, an AI step (Google Gemini in the workflow) reads the scraped text and pulls out the fields you actually need, like title, URL, author, and engagement stats. Each item is then structured into clean JSON, assigned a unique identifier, and written into Google Sheets so you can sort, filter, and build reporting on top. It also logs what happened each run, so you can tell what succeeded and what didn’t without guesswork.
The workflow starts on a schedule, batches through your list of sources, and pauses between batches so you don’t overwhelm anything. Once AI has extracted clean fields, Google Sheets is updated with consistent rows and a separate log sheet captures the run details.
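Concretely, the goal is that every item lands in the Sheet with the same shape. Here is a sketch of one finished row built from the fields this workflow extracts; the values are illustrative, and the exact engagement format depends on the source and your prompt:

```javascript
// One structured row as it lands in Google Sheets (illustrative values)
const row = {
  title: "Example thread title",
  url: "https://news.ycombinator.com/item?id=0000000", // placeholder id
  author: "example_user",
  engagement: "142 points, 87 comments", // format varies by source
  key: "https://news.ycombinator.com/item?id=0000000+example_user", // url+author, see Step 5
  last_updated: new Date().toISOString(),
};
console.log(row);
```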
What Changes: Before vs. After
| What This Eliminates | Impact You’ll See |
|---|---|
| Daily manual sweeps across a dozen tabs | A scheduled run collects every source for you |
| Inconsistent fields copied from different sites | Every row has the same title, URL, author, and engagement columns |
| Copy-paste errors in URLs and titles | AI extraction writes clean, structured rows you can trust |
| Missed days and lost context | A log sheet records every run, so nothing slips silently |
Real-World Impact
Say you track 15 forums and niche news pages. If you spend only 6 minutes per source to scan, click into a thread, and copy the basics, that’s about 90 minutes every run. Run it five times a week and that’s 7.5 hours of repetitive checking. With this workflow, you spend maybe 10 minutes up front maintaining the URL list, then you just review the Sheet after it runs. The AI and scraping take their own time in the background, but your time isn’t tied up in tabs.
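If you want to plug in your own numbers, the back-of-envelope math is simple (the values below come from the example above, not measurements):

```javascript
// Rough weekly cost of manual checking (illustrative numbers)
const sources = 15;          // forums and news pages you track
const minutesPerSource = 6;  // scan, click into a thread, copy the basics
const runsPerWeek = 5;

const minutesPerRun = sources * minutesPerSource;        // 90 minutes per run
const hoursPerWeek = (minutesPerRun * runsPerWeek) / 60; // 7.5 hours per week
console.log({ minutesPerRun, hoursPerWeek });
```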
Requirements
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- Decodo for scraping forum/news pages via API
- Google Sheets to store structured rows and logs
- Google Gemini API key (get it from Google AI Studio/Cloud Console)
Skill level: Beginner. You’ll paste credentials, set a Sheet ID, and tweak a list of URLs.
Need help implementing this? Talk to an automation expert (free 15-minute consultation).
The Workflow Flow
A scheduled run kicks things off. The automation starts at whatever interval you choose, so your tracking stays fresh without someone remembering to “go check.”
Your configuration is loaded. A setup step defines the forum/news URLs and options like geolocation, then expands that list into individual items so each source can be handled consistently.
Decodo scrapes, then AI structures. The workflow batches through sources, uses Decodo to collect raw content, and passes that text to Gemini to extract the fields you care about (title, URL, author, engagement). A parser converts the AI output into clean JSON so downstream steps don’t have to “guess” what’s what.
Google Sheets is updated and a log is written. Each extracted item gets a unique identifier before it is appended or updated in your spreadsheet, and a separate Google Sheets log captures the scrape result. A short wait helps smooth out batching so the run stays stable.
You can easily modify the forum URL list to track different domains based on your needs. See the full implementation guide below for customization options.
Step-by-Step Implementation Guide
Step 1: Configure the Schedule Trigger
Set the workflow to run on a schedule and pass control into the configuration node.
- Add the Scheduled Run Start node and keep its default schedule rule unless you need a custom interval.
- Connect Scheduled Run Start to Configuration Setup as the first execution step.
Step 2: Connect Google Sheets
Prepare the spreadsheet connections for both the news output and the scrape logs.
- Open Update News Spreadsheet and set Operation to `appendOrUpdate`.
- Confirm Document ID uses `{{ $('Configuration Setup').item.json.sheet_id }}` and Sheet Name is the cached `News` sheet.
- Open Record Scrape Log and set Operation to `append`.
- Confirm Document ID uses `{{ $('Configuration Setup').item.json.sheet_id }}` and Sheet Name is the cached `Logs` sheet.
- Credential Required: Connect your Google Sheets credentials in both Update News Spreadsheet and Record Scrape Log (credentials are required but not configured).
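If you’re creating the spreadsheet from scratch, a header layout that matches the fields mapped later in this guide works well (the exact names are up to you): a `News` tab with `title`, `url`, `author`, `engagement`, `key`, and `last_updated` columns, and a `Logs` tab with at least `news_count` and `time`.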
Step 3: Set Up Forum Configuration and Batching
Define the forum sources and loop through each forum URL in batches.
- Open Configuration Setup and set the forums array to `{{ [ "https://news.ycombinator.com/from?site=openai.com", "https://news.ycombinator.com/from?site=anthropic.com" ] }}`.
- Set geo to `United States` and sheet_id to `{{ '[YOUR_ID]' }}` in Configuration Setup (a Code-node equivalent is sketched after this list).
- In Expand Forum List, set Field to Split Out to `forums` and Fields to Include to `geo`.
- In Expand Forum List, keep Include as `selectedOtherFields` and Destination Field Name as `url`.
- Ensure Expand Forum List outputs to Batch Through Forums for iteration.
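If you’d rather define the configuration in a Code node than a Set node, a minimal sketch of the same output follows. The two Hacker News URLs are the template defaults, and `[YOUR_ID]` stays a placeholder until you paste your spreadsheet ID:

```javascript
// Minimal Code-node equivalent of Configuration Setup (sketch)
return [
  {
    json: {
      forums: [
        "https://news.ycombinator.com/from?site=openai.com",
        "https://news.ycombinator.com/from?site=anthropic.com",
      ],
      geo: "United States",
      sheet_id: "[YOUR_ID]", // replace with your Google Sheet ID
    },
  },
];
```

Expand Forum List then turns the `forums` array into one item per URL, each carrying `url` and `geo`, so every source flows through the loop identically.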
Step 4: Collect and Parse Forum Content with AI
Scrape each forum page and use Gemini to extract structured news entries.
- In Collect Forum Content, set Geo to `{{ $json.geo }}` and URL to `{{ $json.url }}`.
- Credential Required: Connect your Decodo credentials in Collect Forum Content (credentials are required but not configured).
- In Structure News Extraction, keep Prompt Type as `define` and ensure Has Output Parser is enabled.
- Confirm the final prompt message uses `{{ $json.data.results.first().content }}` to pass scraped text into Structure News Extraction.
- Connect Gemini Chat Model as the language model for Structure News Extraction and add Gemini credentials to Gemini Chat Model (credentials are required but not configured).
- Decode JSON Result is the output parser for Structure News Extraction; add credentials to the parent AI node (Gemini Chat Model), not the parser sub-node.
Structure News Extraction outputs to both Separate News Entries and Record Scrape Log in parallel.
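The exact output depends on your prompt, but for the splitting and key steps in Step 5 to work, the parsed result should carry an array under an `output` field. A sketch of the shape Separate News Entries expects (field names match what this guide maps; values are illustrative):

```javascript
// Shape of one parsed Structure News Extraction result (illustrative)
const parsed = {
  output: [
    {
      title: "Example thread title",
      url: "https://news.ycombinator.com/item?id=0000000",
      author: "example_user",
      engagement: "142 points, 87 comments", // however your prompt phrases it
    },
    // ...one object per news item found on the page
  ],
};
console.log(parsed.output.length); // what Record Scrape Log reports as news_count
```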
Step 5: Create Unique Keys and Write Outputs
Split the AI output into individual items, generate a unique identifier, and update your spreadsheet.
- In Separate News Entries, set Field to Split Out to `output`.
- In Create Unique Identifier, set Value to `` {{ `${$json.url}+${$json.author}` }} `` and Data Property Name to `key` (a Code-node variation is sketched after this list).
- In Update News Spreadsheet, map columns to expressions like `{{ $json.title }}`, `{{ $json.url }}`, and `{{ $now }}` for last_updated.
- Ensure Record Scrape Log logs news_count using `{{ $json.output.length }}` and time using `{{ $now }}`.
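If you want more control over deduplication (the FAQ below mentions adding a date component), the same key can be built in a Code node. A sketch, assuming each item already carries `url` and `author`:

```javascript
// Build the dedupe key per item; url+author mirrors the template expression
for (const item of $input.all()) {
  const { url, author } = item.json;
  item.json.key = `${url}+${author}`;
  // Variation: append the day so the same post on a later date gets its own row
  // item.json.key = `${url}+${author}+${new Date().toISOString().slice(0, 10)}`;
}
return $input.all();
```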
Step 6: Configure Batch Pausing and Looping
Add a cooldown between batches to avoid throttling and ensure continuous processing.
- In Pause Between Batches, set Unit to `minutes` and Amount to `1`.
- Connect Record Scrape Log to Pause Between Batches, then back to Batch Through Forums to continue the loop.
Step 7: Test and Activate Your Workflow
Verify that scraping, AI parsing, and spreadsheet updates work end-to-end before enabling the schedule.
- Click Execute Workflow and confirm that Collect Forum Content returns scraped text for each URL.
- Check Structure News Extraction for a JSON array and verify Separate News Entries creates individual items.
- Confirm new rows appear in the News and Logs sheets from Update News Spreadsheet and Record Scrape Log.
- When successful, toggle the workflow to Active so Scheduled Run Start runs on schedule.
Watch Out For
- Decodo credentials can expire or need specific permissions. If things break, check your Decodo dashboard and the n8n credential test first.
- If you’re using Wait nodes or external rendering, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
- Default prompts in AI nodes are generic. Add your brand voice early or you’ll be editing outputs forever.
Common Questions
**How long does setup take?**
About 30 minutes if your accounts and APIs are ready.

**Can I set this up without coding?**
Yes. No coding required, but you do need to paste API keys and update a few settings like the forum URL list and Google Sheet ID.

**Is it free to run?**
Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in Gemini API usage and your Decodo plan for scraping.

**Where should I host n8n?**
Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.

**Can I customize what gets tracked?**
You can. The usual starting point is the Configuration Setup step where the forum URLs, geolocation, and Sheet ID are defined. If you want different fields (like tags, product names, or sentiment), adjust the prompt in Structure News Extraction and keep the output consistent for Decode JSON Result. Some teams also change how “uniqueness” works by incorporating author + date into Create Unique Identifier so updates don’t create duplicates.

**What usually causes the scraping step to fail?**
Most of the time it’s an invalid or expired API key in n8n credentials. If the key is fine, it can be blocked targets, a source URL that changed layout, or request limits when you run too many sources too quickly. Check the scrape log row written during the run, because it usually shows the error message you need to act on.

**How many sources can it handle?**
It scales to dozens of sources per run for most small teams, and batching plus the wait step helps keep it stable.

**Is n8n a better fit than Zapier or Make here?**
Often, yes, because this workflow leans on batching, structured parsing, and “loop until done” behavior. That kind of logic tends to get awkward (and expensive) in Zapier and Make once you go beyond a couple sources. n8n also gives you a self-hosting path, which means you can run frequent scheduled jobs without worrying about per-task pricing. On the flip side, if you only track one or two RSS feeds and don’t need scraping or AI structuring, Zapier or Make might be quicker to set up. Talk to an automation expert if you want help choosing.
Once this is running, your “research” becomes a quick review of a clean Sheet instead of a daily scavenger hunt. The workflow handles the repetitive stuff, so you can focus on what the patterns actually mean.
Need Help Setting This Up?
Our automation experts can build and customize this workflow for your specific needs. Free 15-minute consultation—no commitment required.