Hacker News to Airtable, job leads logged clean
Copying job leads out of “Ask HN: Who is hiring?” sounds simple until you’ve done it twice. The formatting is chaotic, details are buried in walls of text, and the minute you try to compare roles, you’re stuck in tabs-and-spreadsheets purgatory.
This HN Airtable automation hits recruiters and sourcers first, but founders building a lightweight hiring pipeline and agency operators tracking market demand feel the drag too. You end up spending about 2 hours just turning messy posts into something you can search, filter, and act on.
This workflow pulls the latest hiring thread, extracts individual listings, has Google Gemini structure them into clean fields, then saves everything into Airtable so you can review and reuse fast.
How This Automation Works
See how this solves the problem:
n8n Workflow Template: Hacker News to Airtable, job leads logged clean
flowchart LR
subgraph sg0["Schedule Flow"]
direction LR
n0@{ icon: "mdi:swap-vertical", form: "rounded", label: "Split Out", pos: "b", h: 48 }
n1@{ icon: "mdi:robot", form: "rounded", label: "Structured Output Parser", pos: "b", h: 48 }
n2["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>Search for Who is hiring posts"]
n3@{ icon: "mdi:swap-vertical", form: "rounded", label: "Get relevant data", pos: "b", h: 48 }
n4@{ icon: "mdi:swap-horizontal", form: "rounded", label: "Get latest post", pos: "b", h: 48 }
n5@{ icon: "mdi:swap-vertical", form: "rounded", label: "Split out children (jobs)", pos: "b", h: 48 }
n6@{ icon: "mdi:robot", form: "rounded", label: "Turn into structured data", pos: "b", h: 48 }
n7@{ icon: "mdi:swap-vertical", form: "rounded", label: "Extract text", pos: "b", h: 48 }
n8["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/code.svg' width='40' height='40' /></div><br/>Clean text"]
n9@{ icon: "mdi:cog", form: "rounded", label: "Limit for testing (optional)", pos: "b", h: 48 }
n10["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/airtable.svg' width='40' height='40' /></div><br/>Write results to airtable"]
n11["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>HN API: Get the individual j.."]
n12["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/httprequest.dark.svg' width='40' height='40' /></div><br/>HN API: Get Main Post"]
n13["<div style='background:#f5f5f5;padding:10px;border-radius:8px;display:inline-block;border:1px solid #e0e0e0'><img src='https://flowpast.com/wp-content/uploads/n8n-workflow-icons/code.svg' width='40' height='40' /></div><br/>Code"]
n14@{ icon: "mdi:brain", form: "rounded", label: "Google Gemini Chat Model", pos: "b", h: 48 }
n15@{ icon: "mdi:play-circle", form: "rounded", label: "Schedule Trigger", pos: "b", h: 48 }
n13 --> n10
n0 --> n3
n8 --> n9
n7 --> n8
n4 --> n12
n15 --> n2
n3 --> n4
n12 --> n5
n14 -.-> n6
n1 -.-> n6
n5 --> n11
n6 --> n13
n9 --> n6
n2 --> n0
n11 --> n7
end
%% Styling
classDef trigger fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
classDef ai fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef aiModel fill:#e8eaf6,stroke:#3f51b5,stroke-width:2px
classDef decision fill:#fff8e1,stroke:#f9a825,stroke-width:2px
classDef database fill:#fce4ec,stroke:#c2185b,stroke-width:2px
classDef api fill:#fff3e0,stroke:#e65100,stroke-width:2px
classDef code fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef disabled stroke-dasharray: 5 5,opacity: 0.5
class n15 trigger
class n1,n6 ai
class n14 aiModel
class n4 decision
class n10 database
class n2,n11,n12 api
class n8,n13 code
classDef customIcon fill:none,stroke:none
class n2,n8,n10,n11,n12,n13 customIcon
The Challenge: Turning “Who is hiring?” chaos into usable leads
HN hiring threads are valuable because they’re raw and immediate. They’re also a mess. A single post can include role, location, visa notes, salary hints, tech stack, and contact info, but not in the same order twice. If you’re tracking opportunities or building a pipeline, the manual process becomes the job: open thread, open comments, open each listing, copy text, clean it, paste it somewhere, then try to standardize fields after the fact. And honestly, you’ll miss things, because your brain starts skimming after the tenth wall of text.
It adds up fast. Here’s where it breaks down.
- You waste time reformatting posts just to answer basic questions like “remote?” or “US only?”
- Important details get lost in copy-paste, which leads to bad filtering and sloppy follow-up.
- Once listings are in a spreadsheet, everyone invents their own column names, so reporting turns into a cleanup project.
- You can’t reliably compare posts across weeks because each thread gets captured differently (or not at all).
The Fix: Auto-parse HN hiring posts into Airtable records
This workflow runs on a schedule and goes straight to the source. It queries Hacker News (via the Algolia API) to find the latest “Ask HN: Who is hiring?” thread, then pulls the main thread and expands it into individual job items. For each job post, it fetches the full text, sanitizes it (so you’re not feeding garbage formatting into your database), and then uses Google Gemini to convert that raw text into structured fields. Finally, it computes a few useful values (like a salary value when possible) and logs each job as a clean, searchable Airtable record. You end up with a table you can actually use, not a paste dump you dread touching.
The workflow starts with a scheduled trigger, then it collects the newest hiring thread and splits it into individual listings. AI does the heavy lifting of extracting structured JSON, and Airtable becomes your living job lead database that updates automatically.
What Changes: Before vs. After
| What This Eliminates | Impact You’ll See |
|---|---|
| Manual copy-paste and reformatting of each listing | Roughly 2 hours back per thread; review drops to about 10 minutes |
| Details lost while skimming walls of text | Reliable filtering on fields like location, salary, and remote policy |
| Everyone inventing their own spreadsheet columns | One consistent Airtable schema the whole team shares |
| Threads captured differently (or not at all) each month | Comparable records week over week in a single table |
Real-World Impact
Say you review about 40 job posts from each HN hiring thread. Manually, it’s maybe 3 minutes to open the post, copy the text, clean it, and paste it into a sheet, which is about 2 hours per thread. With this workflow, you spend roughly 10 minutes setting filters and reviewing the Airtable records after it runs, while the fetching and AI parsing happens in the background. That’s a big chunk of time back every week, and the data is far more consistent.
Requirements
- n8n instance (try n8n Cloud free)
- Self-hosting option if you prefer (Hostinger works well)
- Airtable to store structured job lead records.
- Google Gemini API for AI extraction into structured fields.
- Algolia API credentials (get them from Algolia dashboard for HN search access)
Skill level: Intermediate. You’ll connect a few accounts, add API keys, and be comfortable testing runs with sample items.
Need help implementing this? Talk to an automation expert (free 15-minute consultation).
The Workflow Flow
Scheduled run kicks it off. The workflow triggers automatically on a schedule, so you’re not relying on someone remembering to “go grab the latest thread.”
HN thread discovery and filtering. It queries Hacker News using the Algolia API, selects the key fields it needs, then filters for the most recent relevant “Who is hiring?” post before moving on.
Job items get expanded and cleaned. The main thread is fetched, split into individual job items, and each post is pulled in full. A code step sanitizes the body text so AI isn’t trying to interpret weird spacing, headers, or leftover markup.
Gemini structures the data, then Airtable stores it. The AI agent produces a structured JSON output (parsed with a structured output parser), salary value is computed when possible, and each job becomes a normalized Airtable record you can sort, tag, and share.
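To make the handoff concrete, here is a hypothetical example (the field names mirror the Airtable mapping used later in this guide; the values are invented) of the structured object the AI step emits for one job post:

```javascript
// Hypothetical example of the structured record the parser produces per post.
// Field names match the Airtable columns in this guide; values are made up.
const exampleJob = {
  type: "Full-time",
  title: "Senior Backend Engineer",
  salary: "$150k-$180k",
  company: "Acme Robotics",
  location: "Remote (US)",
  apply_url: "https://example.com/jobs/backend",
  description: "Build and scale our data ingestion pipeline.",
  company_url: "https://example.com",
  work_location: "remote",
};

console.log(Object.keys(exampleJob).length); // one key per Airtable column
```

Each of these nine keys lands in its own Airtable column, which is what makes the table sortable and filterable.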
You can easily modify the Airtable fields and the AI extraction prompt to match how you qualify leads. See the full implementation guide below for customization options.
Step-by-Step Implementation Guide
Step 1: Configure the Scheduled Automation Trigger
Set the workflow to run on a schedule so it can regularly poll Hacker News hiring threads.
- Add Scheduled Automation Trigger and set the interval rule to run every 2 minutes (field: `minutes`, value: `2`).
- Connect Scheduled Automation Trigger to Query Hiring Threads.
Step 2: Connect the Hacker News Data Source
Fetch the latest “Ask HN: Who is hiring” threads from Algolia and extract key fields for filtering.
- In Query Hiring Threads, set URL to `https://uj5wyc0l7x-dsn.algolia.net/1/indexes/Item_dev_sort_date/query` and Method to `POST`.
- Set JSON Body to the provided query payload (keep the “Ask HN: Who is hiring” search and `hitsPerPage` as `30`).
- Credential Required: Connect your httpHeaderAuth credentials in Query Hiring Threads.
- In Separate Results, set Field to Split Out to `hits`.
- In Select Key Fields, map fields using expressions: title = `{{ $json.title }}`, createdAt = `{{ $json.created_at }}`, updatedAt = `{{ $json.updated_at }}`, storyId = `{{ $json.story_id }}`.
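The selection logic in this step boils down to "newest hit whose title matches the hiring thread." A minimal plain-JavaScript sketch (assuming hits shaped like the Algolia response, with `title`, `created_at`, and `story_id` fields):

```javascript
// Pick the newest "Ask HN: Who is hiring?" post from a list of Algolia hits.
// Sketch only; hit shape is assumed to match Algolia's HN index fields.
function latestHiringPost(hits) {
  const matches = hits
    .filter((h) => /who is hiring/i.test(h.title || ""))
    .sort((a, b) => new Date(b.created_at) - new Date(a.created_at));
  return matches[0] || null; // null when no hiring thread is in the results
}
```

Note that the case-insensitive match deliberately excludes the sibling “Who wants to be hired?” threads, which appear in the same search results.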
Step 3: Set Up Filtering and Thread Expansion
Filter to recent posts, fetch the main thread, and expand job item IDs.
- In Filter Recent Post, add a date condition: left value `{{ $json.createdAt }}`, operator `after`, right value `{{ $now.minus({days: 30}) }}`.
- In Fetch Main Thread, set URL to `=https://hacker-news.firebaseio.com/v0/item/{{ $json.storyId }}.json?print=pretty`.
- In Expand Job Items, set Field to Split Out to `kids`.
- In Fetch Job Detail, set URL to `=https://hacker-news.firebaseio.com/v0/item/{{ $json.kids }}.json?print=pretty`.
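The workflow's date condition uses Luxon's `$now.minus({days: 30})`; for reference, the same 30-day recency check in plain JavaScript looks roughly like this (a sketch, not the node's actual code):

```javascript
// Plain-JS equivalent of the filter: keep items created within the last N days.
function isRecent(createdAt, now = new Date(), maxAgeDays = 30) {
  const cutoffMs = now.getTime() - maxAgeDays * 24 * 60 * 60 * 1000;
  return new Date(createdAt).getTime() > cutoffMs;
}
```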
Note: the date condition must reference `createdAt` (from Select Key Fields) rather than `created_at`, or the filter will drop all items.

Step 4: Clean and Prepare Job Text
Extract text content and sanitize it before sending to the LLM.
- In Pull Text Content, map text to `{{ $json.text }}`.
- In Sanitize Text Body, keep the JavaScript as-is to clean HTML, entities, and whitespace.
- In Limit Sample Items, set Max Items to `5` for controlled processing volume.
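The Sanitize Text Body node ships with its own script, but if you want to adjust the rules, this is the general shape of the cleanup: strip tags, decode a handful of common entities, and collapse whitespace. A minimal sketch (the actual node may handle more cases):

```javascript
// Sketch of the HTML/entity/whitespace cleanup applied before the LLM call.
function cleanText(html) {
  const entities = { "&amp;": "&", "&lt;": "<", "&gt;": ">", "&quot;": '"', "&#x27;": "'", "&#x2F;": "/" };
  return html
    .replace(/<[^>]+>/g, " ")                        // drop tags like <p>, <a href=...>
    .replace(/&[#\w]+;/g, (e) => entities[e] || " ") // decode known entities, blank the rest
    .replace(/\s+/g, " ")                            // collapse runs of whitespace
    .trim();
}
```

Feeding the model consistently cleaned text is what keeps the structured extraction reliable, so this is the first place to look if field quality drops.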
Step 5: Set Up AI Extraction and Salary Processing
Use Gemini to extract structured job data and compute a numeric salary field.
- In Generate Structured Output, set Text to `{{ $json.cleaned_text }}` and keep Prompt Type as `define`.
- Attach Gemini Chat Model as the language model for Generate Structured Output and set Model Name to `models/gemini-2.0-flash`.
- Credential Required: Connect your googlePalmApi credentials in Gemini Chat Model.
- Attach Structured Response Parser to Generate Structured Output and keep the manual schema JSON provided.
- In Compute Salary Value, keep the JavaScript that parses salary and writes `salary_numeric` to the item.
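To give a sense of what Compute Salary Value does, here is a simplified sketch of that kind of parser: pull the first number out of a free-text salary string and normalize a "k" suffix. The workflow's actual script may cover more formats (ranges, currencies, hourly rates):

```javascript
// Sketch: derive a numeric salary value from free-text salary strings.
function salaryNumeric(salary) {
  if (!salary) return null;
  const m = String(salary).replace(/,/g, "").match(/(\d+(?:\.\d+)?)\s*([kK])?/);
  if (!m) return null;                       // no digits at all, e.g. "competitive"
  const value = parseFloat(m[1]);
  return m[2] ? value * 1000 : value;        // "120k" -> 120000
}
```

A numeric column like this is what lets you sort and filter by salary in Airtable instead of eyeballing strings.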
Step 6: Configure the Airtable Output
Store the extracted job data in your Airtable base.
- In Record to Airtable, set Operation to `create`.
- Select your Airtable Base and Table (replace `[YOUR_ID]` with your real base and table IDs).
- Map fields to expressions: Type = `{{ $json.output.type }}`, Title = `{{ $json.output.title }}`, Salary = `{{ $json.output.salary }}`, Company = `{{ $json.output.company }}`, Location = `{{ $json.output.location }}`, Apply_url = `{{ $json.output.apply_url }}`, Description = `{{ $json.output.description }}`, company_url = `{{ $json.output.company_url }}`, work_location = `{{ $json.output.work_location }}`.
- Credential Required: Connect your airtableTokenApi credentials in Record to Airtable.
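If your base uses different column names, this mapping is the only thing you need to change. Expressed as plain JavaScript (a sketch of the same field mapping the Airtable node performs, using the column names from this guide):

```javascript
// Sketch: map the parsed output object onto the Airtable column names above.
// Rename keys on the left-hand side if your base's schema differs.
function toAirtableFields(item) {
  const o = item.output || {};
  return {
    Type: o.type,
    Title: o.title,
    Salary: o.salary,
    Company: o.company,
    Location: o.location,
    Apply_url: o.apply_url,
    Description: o.description,
    company_url: o.company_url,
    work_location: o.work_location,
  };
}
```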
Step 7: Test and Activate Your Workflow
Run a manual test to confirm data flows from Hacker News to Airtable, then activate the schedule.
- Click Execute Workflow and confirm items move from Query Hiring Threads through Record to Airtable.
- Verify that new rows appear in Airtable with populated fields like Title, Company, and Apply_url.
- Once confirmed, toggle the workflow to Active so Scheduled Automation Trigger runs in production.
Watch Out For
- Airtable tokens and base permissions matter more than people expect. If records aren’t appearing, check the Airtable Personal Access Token scopes and the target base/table IDs first.
- If you’re using Wait nodes or external rendering, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
- Gemini output quality depends on your prompt and your “sanitize text” step. Default prompts in AI nodes are generic, so add your fields and examples early or you will be editing outputs forever.
Common Questions
**How long does setup take?** About an hour if your API keys are ready.

**Can a non-developer set this up?** Yes, but someone should be comfortable pasting API keys and running a few tests. No coding is required unless you want to change the text-cleaning rules.

**Is it free to run?** Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in Google Gemini API usage and Airtable usage depending on your plan.

**Should I use n8n Cloud or self-host?** Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.

**Can I customize the fields and the Airtable schema?** You can adjust the “Generate Structured Output” prompt to match the exact fields you care about (remote policy, visa, seniority, tech stack, contact method). If your Airtable schema differs, update the mapping in the “Record to Airtable” node so the JSON fields land in the right columns. Common tweaks include adding a “source week” field, tagging keywords like “React” or “Rust,” and storing the original raw text for audit.

**Why did my Airtable writes stop working?** Usually it’s token scopes or the token was rotated. Update the Airtable Personal Access Token in n8n, then confirm it has access to the right base and the table still exists with the same name. If it fails only sometimes, you may be hitting Airtable rate limits when you process a big thread; batching or limiting items helps.

**How many job posts can this handle?** If you self-host, capacity mostly depends on your server and API limits, not n8n. On n8n Cloud, your monthly execution quota sets the ceiling, and each job post processed counts toward that. Practically, processing a thread with 30–80 posts is fine for most setups, but AI calls and Airtable writes are the bottlenecks. If you want to scale, run it weekly, keep the “Limit Sample Items” node during testing, and batch Airtable writes once you’re confident.

**Is n8n better than Zapier or Make for this?** Often, yes. This workflow relies on multi-step fetching, splitting items, cleaning text, structured AI parsing, and conditional logic, which tends to get expensive or awkward in Zapier and sometimes clunky in Make. n8n also gives you a self-host route, which is handy when you’re processing lots of job posts. The tradeoff is setup: you’ll spend a bit more time upfront connecting APIs and testing. If you’re unsure which tool fits your situation, Talk to an automation expert.
Once this is running, your weekly HN sourcing stops being a copy-paste task and turns into a quick review step. The workflow handles the repetitive stuff, so you can focus on deciding what’s worth pursuing.
Need Help Setting This Up?
Our automation experts can build and customize this workflow for your specific needs. Free 15-minute consultation—no commitment required.