Step-by-step. Tested on real hardware. Written after we hit every wall so you don't have to.
Base Setup: Everything you need before installing your first service.
→ Your Mac Mini can run local AI, auto-file documents, back up family photos, and talk to you. For $3/month in electricity.
→ 16 days of wattmeter data from a Mac Studio M1 Max running 25 containers and local AI. 12W average, 50W peak. Under €40/year in Germany.
→ Turn your Mac Mini or Mac Studio into a 24/7 home server. macOS settings, Docker setup, static IP, backups, and what to buy used.
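The 24/7 server guide covers this in full; as a taste of what "macOS settings" means in practice, here is a minimal sketch of the power-management side. The specific pmset values are illustrative assumptions, not the guide's exact configuration.

    # Keep a headless Mac awake and self-healing. Values are examples, adjust to taste.
    sudo pmset -c sleep 0          # never put the system to sleep while on mains power
    sudo pmset -c displaysleep 10  # still let the display sleep after 10 minutes
    sudo pmset -a autorestart 1    # power back on automatically after an outage
    pmset -g                       # review the settings that are now active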
Local AI: Run language models on your own hardware. No cloud, no API keys.
→ Set up Open WebUI with Ollama on your Mac and give every device a private ChatGPT-like interface. Docker setup, model picks for Apple Silicon, and day-to-day commands.
→ Ollama explained from scratch for Apple Silicon Macs: what it is, how to install it, and how to run your first local AI model. No cloud, no account, no data leaving your machine.
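For a flavour of what these two guides build toward, a minimal first-run sketch. The model tag is an example, and the Open WebUI image and port mapping follow that project's published quick start; picking models that fit your RAM, and wiring the UI to Ollama, is what the articles cover.

    # Install Ollama and pull a first model -- nothing leaves your machine.
    brew install ollama                      # or download the app from ollama.com
    ollama pull qwen3                        # example tag; pick a model that fits your RAM
    ollama run qwen3 "Why is the sky blue?"

    # Optional: Open WebUI gives every device a ChatGPT-like web interface.
    docker run -d --name open-webui -p 3000:8080 \
      -v open-webui:/app/backend/data \
      ghcr.io/open-webui/open-webui:main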
Media: Photos, videos, and getting your family to actually use it.
→ Set up Immich for your whole family. Partner sharing that merges two photo timelines, shared albums, account setup, and the one trick that actually gets your family using it.
→ Replace Google Photos with Immich running on your Mac. Docker Compose setup, automatic iPhone backup, face recognition, ML tuning for Apple Silicon, and backup strategy.
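As a taste of the setup those guides start from, a minimal sketch of bringing Immich up with its stock Compose stack. The download URLs mirror the project's published quick start at the time of writing and may change; Apple Silicon ML tuning, family accounts, and backups are what the articles are for.

    # Fetch Immich's stock Docker Compose stack and start it.
    mkdir -p ~/immich-app && cd ~/immich-app
    curl -fsSLO https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
    curl -fsSL https://github.com/immich-app/immich/releases/latest/download/example.env -o .env
    # Edit .env first: upload location, database password, timezone.
    docker compose up -d
    # The web UI comes up on http://localhost:2283 by default.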
Lab: Benchmarks, experiments, and the numbers behind our decisions.
→ Gemma 12B and Qwen3 across five runtimes on M1 Max. After the bf16 fix, speed is close. GGUF wins on quantization quality and ecosystem.
→ MLX reports nearly 2x the generation speed of GGUF on Apple Silicon. The truth is more nuanced. I benchmarked both across three real workloads.
→ You pulled a model, typed a question, got an answer. But what actually happened? Tokens, quantization, inference phases, and why Apple Silicon is so good at this.
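If you want to sanity-check numbers like these on your own machine before digging into the methodology, Ollama's verbose mode prints per-run throughput. The model tag below is only an example.

    # --verbose makes Ollama print load time, prompt eval rate and eval rate (tokens/s).
    ollama run --verbose gemma3:12b "Summarise the plot of Hamlet in three sentences."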
Want the story behind these guides? Read the Build Log →
Get notified when the repo goes online. One mail. Promise.