Running Gemma 4 in my Homelab

I started experimenting with local LLMs on my Proxmox cluster through Ollama. After some trial and error, and after giving the LXC more RAM than I first expected it to need, I got Gemma 4 running.
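To make "running" a bit more concrete, here is a minimal sketch of how a script could talk to the Ollama HTTP API once the model is loaded. It assumes Ollama listens on its default port 11434 on the same host, and the model tag `gemma4` is a placeholder I made up for illustration; use whatever tag `ollama list` shows on your machine. The helper names `build_request` and `ask_local` are also just mine.

```python
import json
import urllib.request

# Ollama's default address; change the host if the LXC is reached over the network.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumption: placeholder tag, replace with the tag from `ollama list`.
MODEL_TAG = "gemma4"


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming generate request for the Ollama HTTP API."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local(prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt and return the generated text.

    The timeout is deliberately generous because local inference can be slow.
    """
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `ask_local("Summarize this note in one sentence: ...")` should return the model's answer as a plain string, provided the Ollama service is up and the model has been pulled.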

It is not a replacement for a full cloud model. The slower speed and lower capability are noticeable, especially on my homelab hardware. But that is not really the goal.

I use OpenClaw as an orchestration layer. Because different agents can use different models, I can route specific tasks to the local Gemma instance instead of burning cloud tokens for everything. The same idea also fits nicely with Opencode, where small tasks can run locally while stronger cloud models handle the heavier work.

For example, I have an OpenClaw agent that helps me keep track of the household chores I need to do each week. It is useful, recurring, and personal, but it does not need a premium cloud model. That makes it a perfect task to offload to my local Gemma 4 instance.

In general, you can look into offloading things like:

Background tasks that do not need instant replies
Lightweight summarization
Preprocessing or categorization
Simple automation flows
Tasks where “good enough” is actually good enough
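The list above can be turned into a simple routing rule. The sketch below is plain Python for illustration only and is not how OpenClaw or Opencode are actually configured; the labels `local-gemma` and `cloud-premium`, the category names, and the `Task` and `pick_model` helpers are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical labels for the two tiers; map them to real model names in your setup.
LOCAL = "local-gemma"
CLOUD = "cloud-premium"

# Task categories taken from the list above that are fine with "good enough" output.
LOCAL_CATEGORIES = {
    "background",
    "summarization",
    "preprocessing",
    "categorization",
    "automation",
}


@dataclass(frozen=True)
class Task:
    name: str
    category: str
    needs_fast_reply: bool = False  # the local model is slow, so interactive work goes to cloud


def pick_model(task: Task) -> str:
    """Return the tier a task should run on."""
    if task.needs_fast_reply:
        return CLOUD
    if task.category in LOCAL_CATEGORIES:
        return LOCAL
    # Anything not explicitly marked as cheap defaults to the stronger model.
    return CLOUD
```

With this rule, `pick_model(Task("weekly chores", "background"))` lands on the local tier, while an unlisted category or anything that needs a fast reply falls through to the cloud tier. Defaulting unknown work to the cloud is a cautious choice; flipping the default would save more tokens at the cost of occasionally weaker results.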

The interesting part is how this changes the setup. Instead of picking a single cloud model for every task, I can split the work into layers: local models for cheap background work, and stronger cloud models when the extra intelligence is actually worth paying for. I am not fully convinced this is the final setup yet, but with premium models getting more expensive, it feels worth exploring.