Around that time, the OpenClaw hype had just started. I installed it on my laptop, connected it to Google Gemini Flash, and I was gone. Chatted with this thing until 3 AM. It could browse the web, run tasks, hold context across the conversation. Bizarre, honestly, because at some point I could not tell anymore if I was talking to a person or an inference engine. The responses were that good.
That raises some interesting questions about where the line is between human and machine when the machine simulates a person well enough that you start developing feelings toward it. But that’s a topic for another day. Back to the server.
Running it at home
So of course my brain goes: what if I run this locally? An AI agent, in my basement, doing actual things. No cloud, no subscriptions, everything private. I had already pictured myself with a personal Jarvis at home.
I fired up OpenClaw on the Mac Studio, hooked up a 32B model, sent the first message. And then I waited. And waited. About two minutes until I got a response. Two minutes, for a single message in an instant messenger. And the answer was garbage on top of it. Didn’t address what I asked, hallucinated stuff, complete nonsense.
I had expected a Jarvis. What I got was a brain-amputated butler who needs two minutes to misunderstand you.
Why it actually failed
I figured this out much later, after reading way too many GitHub issues and benchmark threads. At the time I just thought the model was bad. It wasn’t. The problem is how OpenClaw works under the hood.
When you chat through Open WebUI, you send a message, maybe a short system prompt, and the model responds. A few hundred tokens in, a few hundred out. A 32B model on the M1 Max does 15 to 20 tokens per second like that. Feels snappy enough.
OpenClaw is a different beast. It’s an agent framework, not a chatbot. Every message you send gets wrapped in a massive context payload: system prompt with agent rules, every tool definition as JSON schemas, workspace files, memory files, skill metadata, and the full conversation history including all tool results. People measured this on GitHub. Saying “hello” to a fresh session sends 12,000 to 14,000 tokens. With tools and workspace files loaded, past 20,000. One user found their config schema alone eating over 100K tokens. And the API is stateless, so every single message resends the entire context from scratch.
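To make the waste concrete, here's a toy model of that stateless loop. The 14,000-token base figure comes from the GitHub measurements above; the 500 tokens per exchange is my own rough assumption, not a measured number.

```python
# Rough model of a stateless agent loop: every turn resends the
# entire context, so prefill work grows with conversation length.
BASE_CONTEXT = 14_000  # system prompt + tool schemas + workspace (per GitHub measurements)

def tokens_sent(turns: int, tokens_per_turn: int = 500) -> int:
    """Total tokens prefilled across a conversation of `turns` messages."""
    total = 0
    history = 0
    for _ in range(turns):
        total += BASE_CONTEXT + history  # full context resent every single time
        history += tokens_per_turn       # each exchange grows the history
    return total

print(tokens_sent(1))   # 14,000 tokens just to say "hello"
print(tokens_sent(20))  # 375,000 tokens across a 20-turn chat
```

The history term makes total prefill work grow quadratically with conversation length, which is why long agent sessions hurt so much more than long chats.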
On Gemini Flash running on Google’s TPU clusters, processing 20,000 tokens takes milliseconds. On a Mac Studio M1 Max, the model has to prefill all of that locally before producing a single word. For a 32B model with 15K to 20K tokens of context, prefill alone takes 10 to 20 seconds. Then there’s memory pressure. The model in Q4 occupies about 20 GB. macOS takes its cut. The KV cache grows with every token. Add Docker containers and Immich indexing in the background, and the Mac starts swapping to disk. Two minutes for a response? The math checks out.
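The back-of-the-envelope version, assuming a prefill throughput of 1,000 to 2,000 tokens per second for a 32B Q4 model on an M1 Max (my estimate from community benchmarks, not a measurement):

```python
def prefill_seconds(context_tokens: int, prefill_tok_per_s: float) -> float:
    # Time spent re-reading the context before the first output token appears.
    return context_tokens / prefill_tok_per_s

print(prefill_seconds(20_000, 2_000))  # 10.0 s at the optimistic end
print(prefill_seconds(20_000, 1_000))  # 20.0 s at the pessimistic end
```

And that's before the model generates a single token of the actual answer, and before any swapping kicks in.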
I’m not the only one who ran into this. The r/ClawdBot subreddit is full of people disappointed by the same thing: massive token waste, surprise bills of $300 to $700 per month, and a context management system that sends 120,000 tokens for a simple heartbeat check. On cloud APIs that’s a wallet problem. Locally it’s a performance problem. Same root cause.
The Mac Mini trap
There’s a trend right now where people buy Mac Minis to run OpenClaw. YouTube is full of it. “Your own AI agent on a Mac Mini!” But OpenClaw itself is a Node.js process. It sends messages to an LLM and executes tool calls. You could run it on a ten-year-old ThinkPad. The agent is not the heavy part. The LLM is.
If you connect OpenClaw to a cloud API like Claude or Gemini, the Mac Mini sits there doing almost nothing. You’re paying $600+ for a machine that forwards API calls. Any laptop with WiFi does the same job.
If you want to run the LLM locally too, a Mac Mini doesn’t have enough memory. Even a 32 GB configuration can barely fit a quantized 32B model, and as I just found out, a 32B model with an agent framework on top doesn’t produce usable results. You need 64 GB minimum for enough headroom, and even that is tight. That’s Mac Studio territory.
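Rough numbers on where the memory goes. The layer and head counts here are illustrative of 32B-class models (loosely modeled on Qwen2.5-32B), not any specific checkpoint:

```python
def q4_weights_gb(params_b: float, bits_per_weight: float = 4.5) -> float:
    # Q4 quantization averages out to roughly 4.5 bits per weight
    # once quantization scales and zero-points are counted.
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

def kv_cache_gb(tokens: int, layers: int = 64, kv_heads: int = 8,
                head_dim: int = 128, bytes_per_value: int = 2) -> float:
    # Keys and values, per layer, per token, stored in fp16.
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens / 1e9

print(q4_weights_gb(32))    # 18.0 GB for the weights alone, before anything runs
print(kv_cache_gb(20_000))  # ~5.2 GB of KV cache at a 20K-token agent context
```

Weights plus KV cache alone land north of 23 GB, and macOS, Docker, and whatever else the server is doing still need their share. On 32 GB that means swapping; on 64 GB it means breathing room.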
If you already bought one though, don’t feel bad. A Mac Mini is a perfectly capable home server. It just shouldn’t be running only OpenClaw connected to Claude’s API. Run Immich on it. Run AdGuard, Paperless, a family chat bot, backup automation. Run actual services that benefit from local hardware and make your family’s life better. That’s what famstack is about. The Mac Mini isn’t the problem. Using it as a $600 API relay is.
For simple chat, local models are great. But agent frameworks add so much context overhead that local hardware can’t keep up. Not yet, at least.
I ended up going a very different route for famstack: nanobot, an ultra-lightweight agent framework that’s about 4,000 lines of Python instead of OpenClaw’s 430,000. It actually works with local models because it doesn’t drown them in context. Still experimenting with it, but the first results are promising. More on that in a future post.
OpenClaw itself came off the famstack machine that same evening.
Back to the actual plan
Right. The photo library. That was the reason I bought this machine in the first place, or one of them anyway. My wife and I have years of photos of our kids. Vacations, everyday stuff, the kind of things you absolutely don’t want to lose. And we had been sending them back and forth over WhatsApp and Signal, like in the stone age. “Here are the 47 photos from Sunday, have fun with the compression artifacts.”
So I set up Immich. Self-hosted Google Photos replacement, runs in Docker. Had it running within a few minutes on the Mac Studio. Upload worked, mobile app connected, face recognition started doing its thing. Technically, done. (I wrote up the full Immich setup on Mac and family sharing configuration as separate guides if you want to do the same.)
But then you realize: technically done and actually usable are not the same thing at all.
192.168.1.42:2283
That’s what the Immich URL looked like. Some IP address with a random port number. I had to type that into my wife’s phone, into my phone, bookmark it everywhere. Every service had its own port. Immich on 2283, Open WebUI on 3000. It just didn’t look right. More like a developer’s localhost than something I’d ask my wife to use.
My bar is higher than that, even for something running at home. Clean URLs, no ports, HTTPS. The same thing you’d expect from any cloud service, just running on a Mac in the living room.
The next rabbit hole
So I installed AdGuard Home. It’s a DNS server with ad blocking, but what I actually needed was its DNS rewrites: instead of that ugly IP and port, photos.home could resolve to the right machine. Combined with Caddy as a reverse proxy, every service got its own clean URL. No more port numbers.
And then one thing led to the next. Clean URLs need a reverse proxy. A reverse proxy needs a domain. And then of course you want HTTPS, because most mobile apps expect it and browsers throw warnings without it.
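For reference, here's roughly the shape of the Caddyfile this produced. The hostnames and the IP are placeholders from my setup; adapt them to your network. AdGuard Home holds a matching DNS rewrite that points these names at the server:

```caddyfile
# photos.home → Immich, chat.home → Open WebUI.
# AdGuard Home rewrites these names to this machine's IP.
photos.home {
	reverse_proxy 192.168.1.42:2283
	tls internal
}

chat.home {
	reverse_proxy 192.168.1.42:3000
	tls internal
}
```

The catch with a made-up TLD like `.home` is that no public certificate authority will issue a certificate for it; `tls internal` makes Caddy sign one from its own local CA, which you then have to trust on every device. That friction is a big part of why a real domain ends up being worth it.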
I went from “I just want to share photos with my wife” to running my own DNS server, a reverse proxy, managing TLS certificates, and setting up domain routing. Over a single weekend.
That escalated quickly.
What’s next
The whole DNS and HTTPS setup is material for its own guide. Once you understand how the pieces fit together, it makes your home server feel like an actual product. Clean URLs with valid HTTPS certificates, on your local network. I’ll write that up separately.
The photo library was live. Clean URL, apps connected, my wife could browse all our photos from her phone. First real win. But the server was already running way more services than I had originally planned, and I was nowhere near done.