Apify to Google Sheets, YC leads ready to use

You find a great Y Combinator search page, then the slow part starts. Tabs everywhere, copied links pasted into the wrong row, and “I’ll clean it later” turning into a messy sheet you don’t want to touch.

SDRs feel this first, honestly. But VC analysts building sourcing lists and founders doing partnership outreach feel the same drag. You want a prospect list you can trust, without spending your best hours doing admin work.

This workflow pulls YC company and founder data through Apify, then drops it into Google Sheets in a clean, usable format. You’ll see what it automates, what results to expect, and what you need to run it reliably.

How This Automation Works

The full n8n workflow, from trigger to final output:

n8n Workflow Template: Apify to Google Sheets, YC leads ready to use


flowchart LR

    subgraph sg0["Start Workflow Flow"]
        direction LR
        n0@{ icon: "mdi:swap-horizontal", form: "rounded", label: "Run an Actor", pos: "b", h: 48 }
        n1@{ icon: "mdi:swap-horizontal", form: "rounded", label: "Get dataset items", pos: "b", h: 48 }
        n2@{ icon: "mdi:play-circle", form: "rounded", label: "Start Workflow", pos: "b", h: 48 }
        n3@{ icon: "mdi:database", form: "rounded", label: "Add data to Google Sheet", pos: "b", h: 48 }
        n0 --> n1
        n2 --> n0
        n1 --> n3
    end

    %% Styling
    classDef trigger fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
    classDef ai fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    classDef aiModel fill:#e8eaf6,stroke:#3f51b5,stroke-width:2px
    classDef decision fill:#fff8e1,stroke:#f9a825,stroke-width:2px
    classDef database fill:#fce4ec,stroke:#c2185b,stroke-width:2px
    classDef api fill:#fff3e0,stroke:#e65100,stroke-width:2px
    classDef code fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    classDef disabled stroke-dasharray: 5 5,opacity: 0.5
    class n2 trigger
    class n0,n1 decision
    class n3 database

The Problem: YC lead research turns into spreadsheet busywork

YC is a goldmine, but turning it into a working lead list is where momentum dies. You open a filtered directory page, click into profiles, copy a website, grab LinkedIn, try to find founder names, then paste it all into a sheet that slowly becomes inconsistent. One row has “YC S21,” another has “Summer 2021,” and half the LinkedIn fields are blank because you got interrupted. After a couple runs, you don’t even trust your own list, so you re-check everything. That’s the worst part.

None of this feels hard in the moment. It just keeps happening, and the friction compounds.

You burn about 2 hours turning “interesting companies” into a real sheet.
Manual copy-paste creates silent errors, like the wrong founder attached to the wrong company.
Inconsistent formatting makes outreach personalization harder, because you’re always cleaning before you start.
When someone asks for “the same list, but for a different batch,” you start from scratch.

The Solution: Scrape YC with Apify and keep a live Google Sheet

This n8n workflow gives you a repeatable way to turn any YC directory search into structured data you can actually use. You trigger it manually when you want fresh leads. n8n tells Apify to run a Y Combinator Directory Scraper actor against your exact search URL (batch, industry, region, whatever filters you picked). When Apify finishes, the workflow pulls the dataset records back into n8n, maps the fields into the columns you care about, and updates your Google Sheet with new rows. The output is not a “data dump.” It’s a prospecting sheet that looks like you built it on purpose.

The workflow starts with a manual launch in n8n. Apify does the heavy lifting by scraping and structuring the YC results. Google Sheets becomes the final destination, so your list is easy to sort, dedupe, and hand off for outreach.

What You Get: Automation vs. Results

What This Workflow Automates

Runs an Apify actor against your chosen YC search URL.
Pulls back the structured dataset records after the scrape completes.
Maps company and founder fields into consistent spreadsheet columns.
Writes rows into Google Sheets so your list stays usable.

Results You’ll Get

Most teams get back about 2 hours per sourcing run.
A consistent sheet layout you can reuse across batches and industries.
Fewer “mystery rows” that need re-checking before you email someone.
Faster handoff to outreach, enrichment, or CRM import.
A prospect list you can refresh on demand instead of rebuilding.

Example: What This Looks Like

Say you want a list of 100 YC companies from a specific batch and industry. Manually, if you spend maybe 2 minutes per company to copy the basics (site, location, description) and another minute chasing founder details, that’s about 5 hours of tedious work. With this workflow, you paste your filtered YC search URL into the Apify actor input, click execute, and wait for the run to finish. The actual “human time” is closer to 10 minutes, and the sheet fills in automatically.
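The arithmetic behind that estimate, as a quick sketch. The per-company minutes come from the example above; the function name is only illustrative.

```python
def manual_hours(companies: int, minutes_basics: float, minutes_founders: float) -> float:
    """Total manual research time in hours for a list of companies."""
    total_minutes = companies * (minutes_basics + minutes_founders)
    return total_minutes / 60


# 100 companies, 2 minutes for the basics plus 1 minute for founder details
hours = manual_hours(100, 2, 1)
print(f"Manual: about {hours:g} hours")
```

Change the inputs to match your own pace and list size before quoting a savings number to your team.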

What You’ll Need

n8n instance (try n8n Cloud free)
Self-hosting option if you prefer (Hostinger works well)
Apify to run the YC directory scraper actor
Google Sheets to store and share the lead list
Apify API key (get it from Apify Console → Integrations)

Skill level: Intermediate. You’ll mainly connect accounts and match fields to the right sheet columns.

Don’t want to set this up yourself? Talk to an automation expert (free 15-minute consultation).

How It Works

Manual run from n8n. You click “Execute workflow” when you want a fresh pull of YC companies (weekly, daily, whenever your pipeline needs it).

Apify scrapes the YC directory page you chose. In the Apify “Run an Actor” step, you provide a YC search URL with your filters already applied. Apify visits each listing and returns structured fields instead of raw HTML.

n8n retrieves the dataset records. Once the actor run completes, the workflow fetches all dataset items, which typically include company info plus founder details when available.

Google Sheets gets updated. The final step writes the mapped values into your spreadsheet, so your columns stay consistent and your list is ready for enrichment, outreach, or import.

You can easily modify the YC search URL to target a new batch or industry based on your needs. See the full implementation guide below for customization options.
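The four steps above can be sketched as a small pipeline. This is a conceptual outline in Python, not how n8n runs the workflow internally: the three callables are hypothetical stand-ins for the Apify and Google Sheets nodes, so the sketch runs without any credentials.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def run_pipeline(
    search_url: str,
    run_actor: Callable[[str], str],
    fetch_items: Callable[[str], List[Row]],
    write_rows: Callable[[List[Row]], int],
) -> int:
    """Run actor -> fetch dataset -> write rows. Returns the number of rows written."""
    dataset_id = run_actor(search_url)  # start the scrape, get its dataset ID
    items = fetch_items(dataset_id)     # pull the structured records
    if not items:
        return 0                        # nothing scraped, nothing to write
    return write_rows(items)            # hand the records to the sheet


# Stand-in callables (hypothetical) that mimic each node's role.
sheet: List[Row] = []


def fake_run_actor(url: str) -> str:
    return "dataset-123"


def fake_fetch_items(dataset_id: str) -> List[Row]:
    return [{"company_name": "ExampleCo", "website": "https://example.com"}]


def fake_write_rows(rows: List[Row]) -> int:
    sheet.extend(rows)
    return len(rows)


written = run_pipeline(
    "https://www.ycombinator.com/companies?industry=Fintech",
    fake_run_actor,
    fake_fetch_items,
    fake_write_rows,
)
print(written, "row(s) written")
```

Each stage only needs the previous stage's output, which is why swapping the search URL is enough to retarget the whole run.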

Step-by-Step Implementation Guide

Step 1: Configure the Manual Trigger

Start the workflow manually so you can validate the Apify scrape and Google Sheets updates before automating further.

Add the Manual Launch Trigger node as the workflow trigger.
Keep default settings for Manual Launch Trigger since no parameters are required.

The Flowpast Branding sticky note is purely informational and does not affect execution.

Step 2: Connect Apify and Launch the Actor

Configure the Apify actor run that scrapes the Y Combinator directory based on your filters.

Add the Execute Apify Actor node and connect it to Manual Launch Trigger.
Credential Required: Connect your apifyApi credentials.
Select your actor in Actor (currently set to [YOUR_ID]).
Set Custom Body to { "maxCompanies": 5, "startUrls": "{https://www.ycombinator.com/companies?industry=Fintech&regions=America%20%2F%20Canada&team_size=%5B%221%22%2C%2225%22%5D}", "proxyConfiguration": { "useApifyProxy": true } }.

⚠️ Common Pitfall: Replace [YOUR_ID] with your actual Apify Actor ID, or the run will fail.
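Hand-editing the percent-encoded search URL is error-prone, so here is a minimal sketch that builds it from readable filter values. The keys maxCompanies, startUrls, and proxyConfiguration come from the Custom Body in this step. The helper names are hypothetical, and the shape of startUrls (a list of objects with a url key) is an assumption based on a common Apify convention, so check your actor's input schema before relying on it.

```python
import json
from urllib.parse import quote, urlencode

YC_COMPANIES_URL = "https://www.ycombinator.com/companies"


def build_search_url(industry: str, region: str, team_size: tuple) -> str:
    """Percent-encode the directory filters into a query string."""
    params = {
        "industry": industry,
        "regions": region,
        # The example URL in this step encodes team_size as a compact JSON array of strings.
        "team_size": json.dumps([str(n) for n in team_size], separators=(",", ":")),
    }
    return f"{YC_COMPANIES_URL}?{urlencode(params, quote_via=quote)}"


def build_actor_input(search_url: str, max_companies: int = 5) -> dict:
    """Actor input using the keys from Step 2.

    Assumption: startUrls is a list of {"url": ...} objects. The expected
    shape varies by actor, so confirm it against the actor's input schema.
    """
    return {
        "maxCompanies": max_companies,
        "startUrls": [{"url": search_url}],
        "proxyConfiguration": {"useApifyProxy": True},
    }


url = build_search_url("Fintech", "America / Canada", (1, 25))
print(json.dumps(build_actor_input(url), indent=2))
```

The printed JSON can be pasted into Custom Body once the startUrls shape matches what your actor expects.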

Step 3: Retrieve the Apify Dataset Records

Pull the dataset output from the actor run so it can be written to Google Sheets.

Add the Retrieve Dataset Records node and connect it to Execute Apify Actor.
Credential Required: Connect your apifyApi credentials.
Set Resource to Datasets.
Set Dataset ID to {{ $json.defaultDatasetId }} so it uses the dataset created by the actor run.
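For reference, this is roughly what the dataset lookup amounts to outside n8n: read defaultDatasetId from the finished run and request that dataset's items from the Apify API. The sketch only builds the request and does not send it. The run object and token below are placeholders, and the limit of 1000 is an arbitrary choice for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

APIFY_API = "https://api.apify.com/v2"


def dataset_items_request(run: dict, token: str, limit: int = 1000) -> Request:
    """Build (but do not send) the request for a finished run's dataset items."""
    dataset_id = run["defaultDatasetId"]  # same field the n8n expression reads
    query = urlencode({"format": "json", "clean": "true", "limit": limit})
    return Request(
        f"{APIFY_API}/datasets/{dataset_id}/items?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )


# Hypothetical run object and placeholder token, for illustration only.
req = dataset_items_request({"defaultDatasetId": "abc123"}, token="APIFY_TOKEN")
print(req.full_url)
```

Passing the request to urllib.request.urlopen with a real token would return the same records the Retrieve Dataset Records node outputs.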

Step 4: Configure the Google Sheets Output

Append or update startup records in your spreadsheet using the dataset fields.

Add the Update Spreadsheet Rows node and connect it to Retrieve Dataset Records.
Credential Required: Connect your googleSheetsOAuth2Api credentials.
Set Operation to appendOrUpdate.
Select your Document (currently [YOUR_ID]) and Sheet (currently gid=0 / Sheet1).
Map the column values as defined:
- Company → {{ $json.company_name }}
- Founded → {{ $json.year_founded }}
- Website → {{ $json.website }}
- LinkedIn → {{ $json.company_linkedin }}
- Location → {{ $json.company_location }}
- Description → {{ $json.long_description }}
- Industry Tags → {{ $json['tags/0'] }} {{ $json['tags/1'] }} {{ $json['tags/2'] }} {{ $json['tags/3'] }}
- Founder 1 Name → {{ $json['founders/0/name'] }}
- Founder 2 Name → {{ $json['founders/1/name'] }}
- Founder 1 LinkedIn → {{ $json['founders/0/linkedin'] }}
- Founder 2 LinkedIn → {{ $json['founders/1/linkedin'] }}
Ensure Matching Columns includes Company to update existing rows by company name.

⚠️ Common Pitfall: If your sheet headers don’t exactly match the column names (e.g., Company, Founded), the append/update may fail or insert blank fields.
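The mapping above, written out as a small Python sketch so you can see what each row ends up containing. The column names and dataset keys are taken from the list in this step. The helper names and the sample item are hypothetical, and unlike the n8n expression this version skips missing tags instead of leaving extra spaces.

```python
from typing import Dict, List

# Sheet column -> flattened dataset key(s), as listed in Step 4.
COLUMN_MAP: Dict[str, List[str]] = {
    "Company": ["company_name"],
    "Founded": ["year_founded"],
    "Website": ["website"],
    "LinkedIn": ["company_linkedin"],
    "Location": ["company_location"],
    "Description": ["long_description"],
    "Industry Tags": ["tags/0", "tags/1", "tags/2", "tags/3"],
    "Founder 1 Name": ["founders/0/name"],
    "Founder 2 Name": ["founders/1/name"],
    "Founder 1 LinkedIn": ["founders/0/linkedin"],
    "Founder 2 LinkedIn": ["founders/1/linkedin"],
}


def to_row(item: dict) -> Dict[str, str]:
    """Map one dataset item to sheet columns; missing keys become empty cells."""
    row = {}
    for column, keys in COLUMN_MAP.items():
        values = [str(item[k]) for k in keys if item.get(k) not in (None, "")]
        row[column] = " ".join(values)
    return row


def missing_headers(sheet_headers: List[str]) -> List[str]:
    """Columns the mapping needs that the sheet's header row lacks (exact match)."""
    return [c for c in COLUMN_MAP if c not in sheet_headers]


# Made-up sample item for illustration.
item = {"company_name": "ExampleCo", "year_founded": 2021,
        "tags/0": "Fintech", "tags/1": "B2B", "founders/0/name": "Ada Example"}
print(to_row(item))
print(missing_headers(["Company", "founded", "Website"]))
```

The second call shows the pitfall in action: a header typed as "founded" does not match "Founded", so that column is reported as missing.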

Step 5: Test and Activate Your Workflow

Run a manual execution to verify the scrape and spreadsheet update, then activate for production use.

Click Execute Workflow to run Manual Launch Trigger and start the flow.
Confirm Execute Apify Actor completes and that Retrieve Dataset Records outputs a list of startup objects.
Check your Google Sheet to verify new or updated rows in Update Spreadsheet Rows.
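When satisfied, toggle the workflow Active to enable production use. A manual trigger only fires when you click it, so for unattended runs swap it for a Schedule Trigger first.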


Common Gotchas

Apify credentials can expire or need specific permissions. If things break, check your Apify token in n8n’s Credentials panel first.
If you’re using Wait nodes or external rendering, processing times vary. Bump up the wait duration if downstream nodes fail on empty responses.
Default prompts in AI nodes are generic. Add your brand voice early or you’ll be editing outputs forever.

Frequently Asked Questions

How long does it take to set up this Apify Google Sheets automation?

About 30 minutes if your sheet columns are already created.

Do I need coding skills to automate YC lead scraping?

No coding required. You’ll connect Apify and Google Sheets, then paste your YC search URL and map fields once.

Is n8n free to use for this Apify Google Sheets workflow?

Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in Apify usage, since the YC scraper runs on Apify credits.

Where can I host n8n to run this automation?

Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.

Can I customize this Apify Google Sheets workflow for a different YC filter or batch?

Yes, and it’s the whole point. You change the YC directory search URL inside the Apify “Run an Actor” node, then adjust maxCompanies if you want a smaller or bigger pull. If your spreadsheet has extra columns, update the Google Sheets mapping to fill them. Some teams also add a “Batch” or “Source URL” column so they can trace exactly where each row came from later.

Why is my Apify connection failing in this workflow?

Usually it’s an expired or wrong Apify API token saved in n8n credentials. Regenerate the token in Apify, update the n8n credential, and try again. If the actor starts but returns empty data, double-check the YC search URL you pasted and confirm the actor still supports that page structure. Rate limits and low Apify credits can also cause runs to fail halfway through.
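Once the token is accepted, the remaining cases can be told apart from the run's status and the number of items it returned. A rough triage sketch, assuming you have both values at hand; the status strings are Apify's run statuses, while the function and its messages are hypothetical.

```python
def diagnose(run_status: str, item_count: int) -> str:
    """Suggest where to look, given an Apify run status and dataset size."""
    if run_status in ("FAILED", "ABORTED", "TIMED-OUT"):
        return "Run did not finish: check the run log, rate limits, and remaining Apify credits."
    if run_status != "SUCCEEDED":
        return "Run is still in progress: wait for it to finish before fetching items."
    if item_count == 0:
        return "Run succeeded but returned nothing: re-check the YC search URL and the actor's page support."
    return "Run looks healthy."


print(diagnose("SUCCEEDED", 0))
print(diagnose("FAILED", 12))
```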

How many companies can this Apify Google Sheets automation handle?

A few hundred per run is typical, and you can control it with the actor’s maxCompanies setting.

Is this Apify Google Sheets automation better than using Zapier or Make?

For scraping-driven workflows, n8n is usually a better fit because it handles multi-step logic cleanly and you can self-host to avoid per-task pricing. Zapier and Make can work, but scraping often needs “run job → wait → fetch dataset → loop items → write rows,” and those platforms can get expensive or fiddly with that pattern. Another practical difference is control. In n8n you can add checks (skip blank websites, tag rows by batch, stop if the dataset is empty) without fighting the tool. If you only need a simple two-step sync, Zapier might be faster. If you’re not sure, talk to an automation expert.
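To make those checks concrete, here is a minimal sketch of the logic in Python. In n8n you would more likely build it from filter and conditional nodes or a Code node. The function name, the exception, and the "Batch" column are hypothetical, and the website key follows the dataset field used in the Step 4 mapping.

```python
from typing import Dict, List


class EmptyDatasetError(Exception):
    """Raised so the run stops instead of writing nothing to the sheet."""


def prepare_rows(items: List[Dict[str, str]], batch: str) -> List[Dict[str, str]]:
    """Apply the three checks: stop on empty data, skip blank websites, tag by batch."""
    if not items:
        raise EmptyDatasetError("dataset is empty; check the search URL")
    rows = []
    for item in items:
        if not (item.get("website") or "").strip():
            continue  # skip rows with a blank website
        rows.append({**item, "Batch": batch})  # tag each row by batch
    return rows


# Made-up sample items for illustration.
sample = [
    {"company_name": "ExampleCo", "website": "https://example.com"},
    {"company_name": "NoSiteCo", "website": "  "},
]
print(prepare_rows(sample, batch="S21"))
```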

Once this is in place, YC sourcing stops being a “research day” and becomes a button you click. The workflow handles the repetitive stuff, and your sheet stays clean enough to actually use.
