January 22, 2026

Nvidia API + Google Sheets, compare models side by side

Lisa Granqvist, Workflow Automation Expert

Testing prompts across multiple models sounds simple until you do it all week. You end up with four browser tabs, mismatched settings, lost “best answer” examples, and a spreadsheet that’s never quite up to date.

This hits growth marketers running copy experiments hardest, but product teams and agency leads feel it too. Nvidia model comparison becomes a time sink when every run is manual, inconsistent, and hard to audit later.

This workflow sends one prompt to several Nvidia-hosted models in parallel, returns a clean JSON response, and (with a small extension) logs every run to Google Sheets so you can compare side by side and actually make a decision.

How This Automation Works

n8n Workflow Template: Nvidia API + Google Sheets, compare models side by side

The Challenge: Comparing AI model outputs without chaos

If you’re evaluating models for content, support replies, code review, or research, the hardest part is rarely “getting an answer.” It’s keeping the test fair and the evidence organized. One run uses a different temperature. Another run quietly hits a different model version. Then someone asks, “Which model did we pick for that onboarding flow?” and you have nothing but a Slack thread and a half-finished doc. The mental load is real, and it slows shipping because you can’t trust your own comparisons.

It adds up fast. Here’s where it usually breaks down.

  • You lose hours each week re-running “the same” prompt because the earlier results aren’t stored in a consistent format.
  • Side-by-side comparisons get biased because settings drift between models, especially when you’re testing quickly.
  • Copy-paste logging into Sheets introduces mistakes, and those mistakes quietly ruin the conclusion you thought you proved.
  • When a model times out, the whole experiment stalls, so you postpone decisions and keep debating in circles.

The Fix: Parallel model runs + clean outputs you can log

This n8n workflow gives you a repeatable way to compare several Nvidia-hosted models from one prompt. It starts with a webhook, so your team can trigger tests from a simple HTTP call (or a lightweight internal form). Based on your “model choice” rules, n8n routes the request into parallel branches and calls each model through Nvidia’s API using HTTP Request nodes. A Merge node then pulls the responses together, even if one model is slow or times out, and the workflow shapes the final payload into a predictable structure. The webhook responds with JSON containing each model’s output, ready for review or logging.

The flow begins when you send a prompt to the webhook. The Switch node then routes the request to the selected model branches — Qwen, Seed-OSS, DeepSeek, and Nemotron — which run in parallel. Finally, the workflow merges the results and returns one combined response that’s easy to store in Google Sheets.

Real-World Impact

Say you evaluate 20 prompts a week and compare 4 models each time. Manually, you might spend about 3 minutes per model (open, paste, run, copy results), which is roughly 4 hours weekly just on busywork. With this workflow, you submit one request (about 1 minute), wait 2–3 seconds for parallel responses, then log the combined output to Google Sheets in one shot. Most teams get about 3 hours back a week and end up with cleaner evidence for the final pick.

Requirements

  • n8n instance (try n8n Cloud free)
  • Self-hosting option if you prefer (Hostinger works well)
  • Nvidia API to query Qwen, DeepSeek, Nemotron, Seed-OSS
  • Google Sheets to store results and compare runs
  • Nvidia API key (get it from build.nvidia.com)

Skill level: Intermediate. You’ll mostly paste API credentials, adjust a few fields, and validate the response format.

Need help implementing this? Talk to an automation expert (free 15-minute consultation).

The Workflow Flow

A webhook receives your prompt. You send a simple request that includes the prompt text and any test metadata you care about (like use case, temperature, max tokens, or a “run name”).

The workflow routes the request to the right model branches. A Switch node decides which model calls to execute, so you can run all four for comparisons or only one when you just need a fast answer.

Nvidia API calls run in parallel. Four HTTP Request nodes hit Qwen, Seed-OSS, DeepSeek R1, and Nemotron Nano at the same time, which is why you see results in seconds instead of waiting sequentially.

Results are merged and shaped for storage. Merge collects whatever returns successfully (partial results can still continue), then Set formats a clean payload and Respond to Webhook returns it as JSON. If you add the Google Sheets node after formatting, each run becomes one row with four output columns.

You can easily modify which models run for a given request to match your testing style. See the full implementation guide below for customization options.

Step-by-Step Implementation Guide

Step 1: Configure the Webhook Trigger

Set up the inbound webhook that initiates the workflow and hands the incoming payload to the router.

  1. Add and open Incoming Webhook Start.
  2. Set HTTP Method to POST.
  3. Set Path to 6737b4b1-3c2f-47b9-89ff-a012c1fa4f29.
  4. Set Response Mode to responseNode to pass control to Return Combined Reply.
Use a POST client (curl/Postman) and include fields like AI Model and Insert your Query in the JSON body to match downstream expressions.
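
If you’d rather trigger tests from code than from Postman, a minimal Python sketch of the request might look like this. The host is a placeholder (substitute your n8n instance’s URL); the path and the field names come from the steps above.

```python
import json
import urllib.request

# Placeholder host -- substitute your n8n instance's base URL.
WEBHOOK_URL = ("https://your-n8n-host/webhook/"
               "6737b4b1-3c2f-47b9-89ff-a012c1fa4f29")

def build_request_body(model_choice, query):
    """Use the exact field names the downstream expressions expect."""
    return {"AI Model": model_choice, "Insert your Query": query}

def send_prompt(model_choice, query):
    body = json.dumps(build_request_body(model_choice, query)).encode("utf-8")
    req = urllib.request.Request(
        WEBHOOK_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # live call to your n8n instance
        return json.load(resp)
```

Keeping the payload builder separate from the HTTP call makes it easy to reuse the same field names in scripts and internal forms without drift.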

Step 2: Route Requests to the Correct Model

Use the switch node to choose which AI model request node should run based on the incoming AI Model value.

  1. Open Route Model Choice.
  2. Confirm the first rule compares ={{ $json['AI Model'] }} to 1.
  3. Confirm the second rule compares ={{ $json['AI Model'] }} to 2.
  4. Confirm the third rule compares ={{ $json['AI Model'] }} to 3.
  5. Confirm the fourth rule compares ={{ $json['AI Model'] }} to 4.
  6. Confirm the fifth rule compares ={{ $json['AI Model'] }} to 5.
⚠️ Common Pitfall: If your webhook payload uses a different field name (e.g., model instead of AI Model), the switch will not match any rule.
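
In plain code, the Switch behaves roughly like a dictionary lookup. The value-to-model mapping below is an assumption — the template doesn’t document which number triggers which branch, so match it to your own rules — and a lookup miss returns None, which is the same silent no-match failure the pitfall above describes.

```python
from typing import Optional

# Assumed value-to-branch mapping; confirm against your Switch rules.
MODEL_ROUTES = {
    1: "Request Qwen3 Thinking",
    2: "Request Seed-OSS Response",
    3: "Request DeepSeek R1",
    4: "Request Nemotron Nano",
}

def route(payload: dict) -> Optional[str]:
    """Mimic Route Model Choice: pick a branch from the 'AI Model' field."""
    return MODEL_ROUTES.get(payload.get("AI Model"))
```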

Step 3: Configure the Model Request Nodes

Each selected branch makes an HTTP request to NVIDIA’s chat completions API with a model-specific payload.

  1. Open Request Qwen3 Thinking and set URL to https://integrate.api.nvidia.com/v1/chat/completions and Method to POST.
  2. Set JSON Body in Request Qwen3 Thinking to ={ "model": "qwen/qwen3-next-80b-a3b-thinking", "messages": [ { "role": "user", "content": "{{ $('On form submission').item.json['Insert your Query'] }}" } ], "temperature": 0.7, "max_tokens": 1024 }.
  3. Credential Required: Connect your httpBearerAuth credentials in Request Qwen3 Thinking.
  4. Open Request Seed-OSS Response and set JSON Body to ={ "model": "bytedance/seed-oss-36b-instruct", "messages": [ { "role": "user", "content": "{{ $json['Insert your Query'] }}" } ], "temperature": 1.1, "top_p": 0.95, "max_tokens": 4096, "thinking_budget": -1, "frequency_penalty": 0, "presence_penalty": 0, "stream": false }.
  5. Credential Required: Connect your httpBearerAuth credentials in Request Seed-OSS Response.
  6. Open Request DeepSeek R1 and set JSON Body to ={ "model": "deepseek-ai/deepseek-r1", "messages": [ { "role": "user", "content": "{{ $('On form submission').item.json['Insert your Query'] }}" } ], "temperature": 0.6, "top_p": 0.7, "frequency_penalty": 0, "presence_penalty": 0, "max_tokens": 4096, "stream": true }.
  7. Credential Required: Connect your httpBearerAuth credentials in Request DeepSeek R1.
  8. Open Request Nemotron Nano and set JSON Body to { "model": "nvidia/nvidia-nemotron-nano-9b-v2", "messages": [ { "role": "system", "content": "/think" } ], "temperature": 0.6, "top_p": 0.95, "max_tokens": 2048, "min_thinking_tokens": 1024, "max_thinking_tokens": 2048, "frequency_penalty": 0, "presence_penalty": 0, "stream": true }.
  9. Credential Required: Connect your httpBearerAuth credentials in Request Nemotron Nano.
⚠️ Common Pitfall: The expressions reference $('On form submission') in two nodes. Ensure your input data exists at runtime or update the expression to match Incoming Webhook Start payload.
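
Outside n8n, each branch is just a bearer-authenticated POST to the chat completions endpoint. The sketch below assumes your key is exported as NVIDIA_API_KEY and that stream is false; two of the template’s nodes set stream to true, which would require parsing a streamed response instead of a single JSON body.

```python
import json
import os
import urllib.request

NVIDIA_URL = "https://integrate.api.nvidia.com/v1/chat/completions"

def build_chat_payload(model, query, **params):
    """Mirror the JSON Body fields set in the HTTP Request nodes."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": query}],
    }
    payload.update(params)  # temperature, top_p, max_tokens, ...
    return payload

def call_model(model, query, **params):
    body = json.dumps(build_chat_payload(model, query, **params)).encode("utf-8")
    req = urllib.request.Request(
        NVIDIA_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: API key exported as NVIDIA_API_KEY.
            "Authorization": "Bearer " + os.environ["NVIDIA_API_KEY"],
        },
    )
    with urllib.request.urlopen(req) as resp:  # live API call
        return json.load(resp)
```

Passing the sampling parameters as keyword arguments keeps them explicit per run, which is exactly the “lock temperature/max tokens early” discipline the comparison depends on.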

Step 4: Merge and Shape the Response

Combine the model outputs, standardize the structure, and return the response to the webhook caller.

  1. Open Combine Model Outputs and set Number of Inputs to 4.
  2. Open Shape Output Payload and add an assignment with Name choices[0].message.content.
  3. Set the Value to ={{ $json.choices[0].message.content }} in Shape Output Payload.
  4. Confirm Shape Output Payload outputs to Return Combined Reply.
  5. Open Return Combined Reply and leave default Options unless you want custom headers or status codes.
If you want to return all model outputs, expand Shape Output Payload to include additional fields from the merged data instead of only choices[0].message.content.
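
If you do extend the shaping step to return every model’s answer (for example in a Code node, or a small script post-processing the merged data), it amounts to flattening each response into one labeled column and flagging branches that never returned. A hypothetical sketch:

```python
def shape_row(outputs):
    """Flatten per-model responses into one row (e.g., for Google Sheets).

    `outputs` maps a model label to its raw chat-completions response,
    or to None when that branch failed or timed out.
    """
    row = {}
    for label, resp in outputs.items():
        if resp and resp.get("choices"):
            row[label] = resp["choices"][0]["message"]["content"]
        else:
            row[label] = "MISSING/TIMEOUT"  # make partial merges visible
    return row
```

The explicit MISSING/TIMEOUT marker keeps partial merges honest in your Sheet instead of leaving silently empty cells.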

Step 5: Test and Activate Your Workflow

Verify end-to-end execution from the webhook to the combined response and then enable it for production use.

  1. Click Execute Workflow and send a POST request to the Incoming Webhook Start URL with JSON containing AI Model and Insert your Query.
  2. Confirm the selected request node (Request Qwen3 Thinking, Request Seed-OSS Response, Request DeepSeek R1, or Request Nemotron Nano) runs and outputs data to Combine Model Outputs.
  3. Verify the response payload returned by Return Combined Reply contains choices[0].message.content.
  4. Toggle the workflow to Active to enable production webhook handling.
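
For step 3 of the checklist, a small shape check like this hypothetical helper can go in your test script. It assumes the reply still exposes a choices array; adjust the path if your Shape Output Payload node renames the field.

```python
def looks_valid(response):
    """Return True when choices[0].message.content is a string."""
    try:
        return isinstance(response["choices"][0]["message"]["content"], str)
    except (KeyError, IndexError, TypeError):
        return False
```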

Watch Out For

  • Nvidia API credentials can expire or lack model permissions. If things break, check your model access and API key status in the Nvidia dashboard first.
  • If you’re using parallel branches and one model is slower, Merge may continue with partial results. That’s useful, but it can confuse reporting unless your Google Sheets row includes a “missing/timeout” flag.
  • Default prompts and parameters are usually too generic for real evaluation. Lock temperature/max tokens early, or your comparisons will be apples-to-oranges and you’ll end up editing outputs forever.

Common Questions

How quickly can I implement this Nvidia model comparison automation?

About 30 minutes if you already have your Nvidia API key and a test Sheet ready.

Can non-technical teams implement this model comparison?

Yes, but someone should be comfortable pasting API keys and testing a webhook request. Once it’s set up, running comparisons is as easy as submitting a prompt.

Is n8n free to use for this Nvidia model comparison workflow?

Yes. n8n has a free self-hosted option and a free trial on n8n Cloud. Cloud plans start at $20/month for higher volume. You’ll also need to factor in Nvidia API usage (free tier is available, then pay-as-you-go).

Where can I host n8n to run this automation?

Two options: n8n Cloud (managed, easiest setup) or self-hosting on a VPS. For self-hosting, Hostinger VPS is affordable and handles n8n well. Self-hosting gives you unlimited executions but requires basic server management.

How do I adapt this Nvidia model comparison solution to my specific challenges?

You can tweak the Switch rules in “Route Model Choice” to run only the models you care about for a given prompt. Most teams also customize the HTTP Request nodes to lock temperature/max_tokens, then adjust “Shape Output Payload” to add columns like use case, grader notes, or a simple winner field for Google Sheets.

Why is my Nvidia API connection failing in this workflow?

Usually it’s an invalid or expired Bearer token, or your account doesn’t have access to one of the selected models. Check the Authorization header in each HTTP Request node, then confirm model access in the Nvidia dashboard. If it fails only under load, you may be hitting rate limits, so reduce concurrency or add retries.

What’s the capacity of this Nvidia model comparison solution?

On n8n Cloud, capacity depends on your plan’s executions per month, and each prompt run counts as one workflow execution even though it calls multiple models. If you self-host, there’s no execution cap, but your server and Nvidia rate limits become the real constraint. Practically, many teams run dozens or a few hundred comparisons a day without issues. If you want sustained high volume, add a queue and store results asynchronously so timeouts don’t pile up.

Is this Nvidia model comparison automation better than using Zapier or Make?

Often, yes. Parallel branching and merge behavior are much easier to control in n8n, which matters when you’re hitting four models at once and want partial results on timeout. You also get self-hosting for unlimited executions, which can be a big deal if you’re logging lots of runs to Google Sheets. Zapier or Make can work, but multi-branch LLM testing tends to get expensive and awkward as soon as you add retries, routing, and structured logging. Talk to an automation expert if you’re weighing the tradeoffs.

Once this is running, model testing stops being a debate and starts being a record. You’ll have cleaner comparisons, faster picks, and a Sheet you can trust next month too.

Need Help Setting This Up?

Our automation experts can build and customize this workflow for your specific needs. Free 15-minute consultation—no commitment required.

