You've seen the tutorials. Spin up a Kaggle notebook. Install Ollama. Run a heavy open-source LLM for free. Expose it via ngrok. A bit janky, but it works — and it costs nothing.
Here's what nobody mentions upfront: your "free" inference isn't running on your hardware. It's running on Google's servers. Kaggle is owned by Google. Every query you send, every response your notebook generates, passes through Google infrastructure. That's not inherently malicious — but it means you've made a privacy tradeoff without choosing it.
What "Free" Actually Costs You
The real cost isn't what you pay at signup. It's what the platform operator can access. On Kaggle, your queries and responses are processed in plaintext on Google infrastructure, where they can be logged and are subject to Google's data practices.
Ollama running on your own machine gives you real privacy. Ollama running on Kaggle gives you Google's servers with a different interface.
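To make the difference concrete, here is a minimal sketch of the request a client would send to Ollama's `/api/generate` endpoint. The tunnel URL, the model name `llama3`, and the example prompt are placeholders invented for illustration. The request is only built, not sent.

```python
import json
import urllib.request

# Ollama's default local address, and a made-up tunnel address for comparison.
LOCAL_OLLAMA = "http://localhost:11434"
TUNNEL_OLLAMA = "https://example-tunnel.ngrok-free.app"  # hypothetical placeholder


def build_generate_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST to Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "llama3" and the prompt are example values only.
request = build_generate_request(TUNNEL_OLLAMA, "llama3", "Summarise my blood test results")

# The prompt is ordinary JSON in the request body. HTTPS protects it in transit,
# but whichever machine runs Ollama receives and processes it in the clear.
print(request.full_url)
print(request.data.decode("utf-8"))
```

Swapping `TUNNEL_OLLAMA` for `LOCAL_OLLAMA` leaves the payload unchanged. The only thing that changes is whose machine reads it.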
There are no Trusted Execution Environment guarantees. No hardware-level isolation preventing Google from reading your queries. No attestation you can verify. No on-chain proof of anything.
If you're running sensitive queries — health data, legal documents, financial records, personal research — "it's technically free" is the wrong thing to be optimising for.
The Hidden Friction
Kaggle's free tier comes with constraints most tutorials don't mention up front: 30 hours per week of GPU compute, and individual sessions cap at 12 hours. If you hit that wall, you're managing restarts. ngrok adds its own overhead — token management, reconnection logic, occasional dropouts.
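The arithmetic behind those limits is easy to sketch. The snippet below uses the 30-hour weekly quota and 12-hour session cap quoted above as defaults. The function name `plan_week` is illustrative, and the quota values should be treated as assumptions that may change.

```python
def plan_week(weekly_quota_hours: float = 30.0, session_cap_hours: float = 12.0) -> list:
    """Split a weekly GPU quota into sessions no longer than the session cap."""
    full_sessions, remainder = divmod(weekly_quota_hours, session_cap_hours)
    sessions = [session_cap_hours] * int(full_sessions)
    if remainder > 0:
        sessions.append(remainder)  # the final, shorter session
    return sessions


sessions = plan_week()
print(sessions)                      # [12.0, 12.0, 6.0]
print(len(sessions) - 1)             # restarts needed to use the full quota: 2
print(round(sum(sessions) / 7, 2))   # average GPU hours available per day: 4.29
```

Under these numbers, using the whole quota takes at least two restarts a week, and the quota averages out to a little over four hours of GPU time per day.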
The setup works. But it's a workaround, not a platform. And you're building on borrowed infrastructure with no service guarantees. Google can change quotas, sunset the kernel environment, or restrict model availability. You have no recourse.
What Private AI Inference Actually Looks Like
Hardware-level privacy is a verifiable property, not a marketing claim. Intel TDX (Trust Domain Extensions) creates hardware-isolated enclaves at the virtual-machine level, called trust domains, whose memory is encrypted by the CPU. Code and data inside a TDX enclave cannot be read or modified by the hypervisor, the host OS, or the cloud provider, and the memory encryption is designed to resist many attacks that rely on physical access to the machine.
Every query sent to a TEE-isolated inference node is processed inside that enclave. The operator cannot see your prompt. The response is encrypted before it leaves the enclave. Each inference generates a cryptographic attestation proof stored permanently on Arweave. If you need to verify that your query ran inside a TEE, you get a public transaction ID you can inspect.
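The attestation format is not described here, so the following is only a sketch of one piece of such a check: confirming that a response you received matches a digest stored in a public record. The record layout and the `response_sha256` field are hypothetical, and `arweave.net` is assumed as the gateway. A full verification would also validate the hardware-signed quote against Intel's certificate chain, which this sketch does not attempt.

```python
import hashlib
import hmac

ARWEAVE_GATEWAY = "https://arweave.net"  # assumed public gateway


def record_url(tx_id: str) -> str:
    """URL where a gateway serves the data stored under a transaction ID."""
    return f"{ARWEAVE_GATEWAY}/{tx_id}"


def response_matches_record(response_text: str, record: dict) -> bool:
    """Compare the SHA-256 of a response with the digest in a record.

    The 'response_sha256' field name is hypothetical, not a documented format.
    """
    digest = hashlib.sha256(response_text.encode("utf-8")).hexdigest()
    return hmac.compare_digest(digest, record["response_sha256"])


# Illustrative record built locally; a real one would be fetched from record_url(tx_id).
answer = "The capital of France is Paris."
record = {"response_sha256": hashlib.sha256(answer.encode("utf-8")).hexdigest()}

print(response_matches_record(answer, record))                # True
print(response_matches_record(answer + " (edited)", record))  # False
```

The point of a public record is that anyone holding the transaction ID can rerun this kind of comparison without trusting the operator's word.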
That's a different layer of the stack from anything Kaggle offers.
ZDrive: The Comparison
| | Kaggle + Ollama | ZDrive |
|---|---|---|
| Cost | Free (Google-owned infra) | Free tier + ~$0.02/query paid |
| Privacy | Plaintext through Google | Hardware TEE — operator can't read queries |
| Setup | ~20 min (notebook, ngrok, Ollama config) | Zero — load the app, start typing |
| Session limits | 30h/week quota, 12h max per session | None |
| Persistence | Notebook-dependent, Google-controlled | Permanent encrypted vault on Arweave |
| Attestation | None | Verifiable on-chain per inference |
| Model variety | Many (via Ollama registry) | 6 curated TEE-verified models |
Kaggle is genuinely useful if you want to experiment with models and privacy isn't a concern. ZDrive is for when you need privacy you can verify and don't want to manage GPU quotas or ngrok tunnels.
Try It
zdrive.io. Ten free queries, no account needed. If you like it, connect a wallet for 25 per day. If you need more, credits cost less than a coffee.