May 24, 2026 · 5 min read

The Kaggle + Ollama + ngrok Setup Everyone Is Sharing Has One Problem

You've seen the tutorials. Spin up a Kaggle notebook. Install Ollama. Run a heavy open-source LLM for free. Expose it via ngrok. A bit janky, but it works — and it costs nothing.

Here's what nobody mentions upfront: your "free" inference isn't running on your hardware. It's running on Google's servers. Kaggle is owned by Google. Every query you send, every response your notebook generates, passes through Google infrastructure. That's not inherently malicious — but it means you've made a privacy tradeoff without choosing it.

What "Free" Actually Costs You

The real cost isn't what you pay at signup. It's what the platform operator can access. On Kaggle, that means your queries and responses are processed in plaintext on Google infrastructure, where they can be logged and are subject to Google's data practices.
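To make "plaintext" concrete, here is a minimal sketch of the request a client would send through the tunnel. It builds a call to Ollama's /api/generate endpoint without sending it. The tunnel URL and the model name are placeholders, not values from any real setup.

```python
import json
import urllib.request

# Hypothetical tunnel URL: ngrok assigns the real one when the tunnel starts.
TUNNEL_URL = "https://example.ngrok-free.app"


def build_generate_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build (but do not send) a POST to Ollama's /api/generate endpoint.

    The model name is an assumption; use whichever model the notebook pulled.
    """
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        f"{TUNNEL_URL}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_generate_request("Summarise this lab report for me")

# The prompt sits in the request body as readable JSON. TLS covers it in
# transit, but the process serving the model receives it as plain text.
print(req.full_url)
print(req.data.decode("utf-8"))
```

Nothing in this request is encrypted end to end: whoever operates the host that runs the notebook is in a position to read the body.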

Ollama running on your own machine gives you real privacy. Ollama running on Kaggle gives you Google's servers with a different interface.

There are no Trusted Execution Environment guarantees. No hardware-level isolation preventing Google from reading your queries. No attestation you can verify. No on-chain proof of anything.

If you're running sensitive queries — health data, legal documents, financial records, personal research — "it's technically free" is the wrong thing to be optimising for.

The Hidden Friction

Kaggle's free tier comes with constraints most tutorials don't mention up front: 30 hours per week of GPU compute, and individual sessions cap at 12 hours. If you hit that wall, you're managing restarts. ngrok adds its own overhead — token management, reconnection logic, occasional dropouts.
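As a rough illustration of what those limits mean in practice, the sketch below works out how many sessions and restarts a given number of GPU hours needs, and produces a simple exponential backoff schedule for reconnecting after a tunnel drop. The quota figures are the ones quoted above; the function names and the backoff parameters are illustrative choices, not part of Kaggle or ngrok.

```python
import math

WEEKLY_GPU_HOURS = 30   # weekly GPU quota quoted above
MAX_SESSION_HOURS = 12  # per-session cap quoted above


def plan_week(desired_hours: float) -> dict:
    """Return how much of the desired time fits in the quota and how many sessions it takes."""
    usable = min(desired_hours, WEEKLY_GPU_HOURS)
    sessions = math.ceil(usable / MAX_SESSION_HOURS)
    return {
        "usable_hours": usable,
        "sessions": sessions,
        "restarts": max(sessions - 1, 0),
        "unserved_hours": max(desired_hours - WEEKLY_GPU_HOURS, 0),
    }


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0) -> list:
    """Delays in seconds before each reconnection attempt, doubling up to a cap."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


print(plan_week(40))
print(backoff_delays(5))
```

Wanting 40 hours of uptime in a week leaves 10 hours unserved and needs at least three sessions, which means two manual restarts, each of which typically means reloading the model and re-establishing the tunnel.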

The setup works. But it's a workaround, not a platform. And you're building on borrowed infrastructure with no service guarantees. Google can change quotas, sunset the kernel environment, or restrict model availability. You have no recourse.

What Private AI Inference Actually Looks Like

Hardware-level privacy is a verifiable property, not a marketing claim. Intel TDX (Trust Domain Extensions) creates hardware-isolated virtual machines, called trust domains, that play the role of an enclave. The CPU encrypts the memory of a trust domain, so code and data inside it cannot be read or modified by the hypervisor, the host OS, or the cloud provider, and the design is meant to hold up even against someone with physical access to the machine.

Every query sent to a TEE-isolated inference node is processed inside that enclave. The operator cannot see your prompt. The response is encrypted before it leaves the enclave. Each inference generates a cryptographic attestation proof stored permanently on Arweave. If you need to verify that your query ran inside a TEE, you get a public transaction ID you can inspect.
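Inspecting that transaction ID can start with something as small as the sketch below. It checks that an ID has the shape of an Arweave transaction ID and builds the URL for looking it up on the public arweave.net gateway. The format of the attestation record itself is not described here, so this only locates the transaction; checking the TEE evidence inside it is a separate step. The function names are illustrative.

```python
import re

# Arweave transaction IDs are 32 bytes in unpadded base64url: 43 characters.
TX_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{43}")

# Public gateway; other Arweave gateways expose the same lookup path.
GATEWAY = "https://arweave.net"


def is_valid_tx_id(tx_id: str) -> bool:
    """Return True if the string has the shape of an Arweave transaction ID."""
    return TX_ID_PATTERN.fullmatch(tx_id) is not None


def transaction_url(tx_id: str) -> str:
    """Build the gateway URL that returns the transaction's JSON record."""
    if not is_valid_tx_id(tx_id):
        raise ValueError(f"not a valid Arweave transaction ID: {tx_id!r}")
    return f"{GATEWAY}/tx/{tx_id}"


# Placeholder ID for illustration only; substitute the ID returned with a response.
example_id = "A" * 43
print(transaction_url(example_id))
```

Opening the resulting URL in a browser, or fetching it with any HTTP client, returns the stored transaction, which anyone can read without an account.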

That's a different layer of the stack from anything Kaggle offers.

ZDrive: The Comparison

| | Kaggle + Ollama | ZDrive |
| --- | --- | --- |
| Cost | Free (Google-owned infra) | Free tier + ~$0.02/query paid |
| Privacy | Plaintext through Google | Hardware TEE: operator can't read queries |
| Setup | ~20 min (notebook, ngrok, Ollama config) | Zero: load the app, start typing |
| Session limits | 30h/week quota, 12h max per session | None |
| Persistence | Notebook-dependent, Google-controlled | Permanent encrypted vault on Arweave |
| Attestation | None | Verifiable on-chain per inference |
| Model variety | Many (via Ollama registry) | 6 curated TEE-verified models |

Kaggle is genuinely useful if you want to experiment with models and privacy isn't a concern. ZDrive is for when you need privacy you can verify and don't want to manage GPU quotas or ngrok tunnels.

Try It

zdrive.io. Ten free queries, no account needed. If you like it, connect a wallet for 25 queries per day. If you need more, credits cost less than a coffee.
