
Open WebUI + Ollama on Mac: Local AI for Your Home Network

Set up Open WebUI with Ollama on your Mac and give every device a private ChatGPT-like interface. Docker setup, model picks for Apple Silicon, and day-to-day commands.

What you’ll build: Open WebUI running in Docker on your Mac, accessible from any phone, tablet, or laptop on your home network.

End state: A browser-based chat interface your whole family can use. Pick a model, start a conversation, no accounts or internet required. A private ChatGPT alternative running entirely on your hardware.

What you’ll understand: How to pick the right model for your hardware, what local AI is actually good at in 2026, and how to set up Open WebUI so your family can use it from any device.

Prerequisites: Ollama installed and working on your Mac. If you haven’t done that yet, start with What is Ollama and How Do You Run It on a Mac?

Picking the right model

Not all models are created equal, and what works depends on your hardware. Here’s what I’ve tested on a Mac Studio M1 Max with 64GB unified memory.

| Model | Size on disk | RAM needed | Good for | Speed (M1 Max 64GB) |
|---|---|---|---|---|
| Llama 3.2 3B | ~2 GB | ~4 GB | Quick tasks, testing, low overhead | Very fast, near-instant |
| Llama 3.1 8B | ~4.7 GB | ~8 GB | General purpose, daily driver | Fast, comfortable for chat |
| Qwen 2.5 32B | ~20 GB | ~24 GB | Best quality at reasonable speed | Medium, ~15 tok/s |
| Qwen 2.5 Coder 7B | ~4.7 GB | ~8 GB | Code generation, review | Fast |
| Qwen 2.5 Coder 32B | ~20 GB | ~24 GB | Complex coding tasks | Medium, ~15 tok/s |
| Mistral 7B | ~4.1 GB | ~8 GB | Compact, good European languages | Fast |
| DeepSeek Coder V2 | ~8.9 GB | ~12 GB | Code specialist, fill-in-the-middle | Moderate |

To pull any of these:

ollama pull llama3.2:3b
ollama pull llama3.1:8b
ollama pull qwen2.5:32b
ollama pull qwen2.5-coder:7b
ollama pull qwen2.5-coder:32b
ollama pull mistral:7b
ollama pull deepseek-coder-v2:latest

A few notes on picking models:

Start with Llama 3.1 8B. It’s the safest default: fast, capable enough for most things, and it leaves plenty of RAM for your other services.

Qwen 2.5 32B is the sweet spot for 64GB machines. Noticeably better answers than the 8B models. Uses a big chunk of memory but leaves enough room for Docker services running alongside it. This is what I reach for when quality matters.

3B models are useful, not just toys. Llama 3.2 3B handles summarization, simple Q&A, and text reformatting well enough. If you’re building automations that make many small requests, the speed advantage matters more than the quality gap.

Ollama loads models into memory on first request and unloads them after 5 minutes of inactivity (configurable). You don’t need to worry about memory management. Pull several models and switch between them as needed.
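
If you want to see this in action or tune the timeout, Ollama exposes both. A sketch, assuming the macOS Ollama app (the 30m value is just an example; pick what suits your workflow):

```shell
# Show which models are loaded right now and how much memory they use
ollama ps

# Keep models in memory for 30 minutes instead of the default 5.
# On macOS, set the variable for launchd, then restart the Ollama app.
launchctl setenv OLLAMA_KEEP_ALIVE 30m
```

Setting OLLAMA_KEEP_ALIVE to a longer value trades idle memory for instant responses on the next request.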

For more on model names, quantization, and browsing the full model ecosystem, see What is Ollama → What else can you run?

What local models are good at (and not)

Be realistic about what these can do. Local models in the 7-32B parameter range are not ChatGPT or Claude replacements. What works depends heavily on which model and how many parameters you throw at it.

Works well, even with smaller models (3-8B):

- Summarizing letters, newsletters, and other everyday documents
- Simple Q&A and text reformatting
- Drafting short replies and emails

Works with larger models (32B+):

- Reading contracts and official letters for specific details, like notice periods
- Code generation and review
- Anything where answer quality matters more than speed

Not there yet:

- Complex research and the really hard questions, where cloud models still lead
- Matching ChatGPT or Claude on depth and reliability

The quality gap compared to cloud models is real. What makes up for it is privacy, which we’ll get to once Open WebUI is running.

Install Open WebUI

Open WebUI gives you a browser-based chat interface that talks to Ollama. Think of it as a self-hosted ChatGPT alternative you can run on your Mac.

Create a directory for the stack and a docker-compose.yml inside it:

services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    ports:
      - "3002:8080"
    extra_hosts:
      - "host.docker.internal:host-gateway"
    volumes:
      - open-webui-data:/app/backend/data
    environment:
      - OLLAMA_BASE_URL=http://host.docker.internal:11434
      - WEBUI_AUTH=false

volumes:
  open-webui-data:

A quick breakdown of what each piece does:

- ports: maps the container’s internal port 8080 to 3002 on your Mac, so the UI lives at http://localhost:3002
- extra_hosts: lets the container reach the host machine by the name host.docker.internal
- volumes: keeps chat history and settings in a named volume so they survive container restarts
- environment: points Open WebUI at Ollama on the host and turns off the login screen

Start it:

docker compose up -d

You should see Docker pull the image (first time only, about 2GB) and start the container:

[+] Running 1/1
 ✔ Container open-webui  Started

Open http://localhost:3002 in your browser. On first visit, create an admin account (this is local, the account only exists on your machine). Pick a model from the dropdown at the top, and start chatting. If the dropdown is empty, Open WebUI can’t reach Ollama. Check that Ollama is running (curl http://localhost:11434/api/version) and that the OLLAMA_BASE_URL in your compose file is correct.
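
If the dropdown stays empty after that, these checks usually find the problem (assuming the container name and port from the compose file above):

```shell
# Is Ollama answering on the host?
curl http://localhost:11434/api/version

# Is the container up, and what does it say?
docker ps --filter name=open-webui
docker logs --tail 50 open-webui

# After editing OLLAMA_BASE_URL, recreate the container so it picks up the change
docker compose up -d --force-recreate
```

The logs are the fastest way to spot a connection refused error, which almost always means the OLLAMA_BASE_URL is wrong or Ollama isn’t running.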

Once Open WebUI is running, you can also pull models directly from its built-in model browser without touching the terminal. Convenient when you want to try something mid-conversation.

Making it available on your network

Because we mapped port 3002, Open WebUI is accessible from any device on your local network at http://<your-mac-ip>:3002. Your family can bookmark it on their phones and laptops. It looks and feels like ChatGPT, so there’s no learning curve.
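
To find the address to hand out, ask macOS directly. A sketch; en0 is usually Wi-Fi on modern Macs, but your interface name may differ:

```shell
# Print the IP of the Wi-Fi interface (commonly en0)
ipconfig getifaddr en0

# Or list every interface and its address if en0 comes back empty
ifconfig | grep "inet "
```

Consider giving your Mac a static IP or DHCP reservation in your router, so the bookmarked address doesn’t change.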

With WEBUI_AUTH=false, there’s no login screen. Anyone on your network can use it. For a home network behind a router, that’s fine. If you want per-user chat history or access control, remove that line and let each family member create an account on first visit.
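
If you go the accounts route, the change is just the environment section of the compose file. A sketch of what it becomes:

```yaml
    environment:
      # WEBUI_AUTH removed: the first visitor creates the admin account,
      # everyone else registers themselves on first visit
      - OLLAMA_BASE_URL=http://host.docker.internal:11434
```

Apply it with docker compose up -d --force-recreate, and each family member gets their own chat history.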

Things to try

Once Open WebUI is running, paste in a school newsletter and ask for a summary. Paste a recipe and ask it to scale from 2 to 6 portions. Ask it to draft a reply to your landlord about the utility bill.

The interesting part is what happens when the whole family starts using it. My wife pastes in letters from the school or the insurance company and asks what they actually mean. I’ve checked employment contracts for notice periods. Asked about my kid’s rash at 10 PM. The kind of questions you wouldn’t type into Google or ChatGPT because they’re too personal, too specific to your family. With a local model, there’s nobody on the other end. No account, no history, no profile being built. It’s just your Mac.

One more thing worth knowing: the model you downloaded today will behave the same way in six months. No silent updates that change how it responds. If you’ve used ChatGPT long enough to notice a model getting worse after an update, you know how annoying that is. Local models are frozen. You update when you choose to.

What about LM Studio?

LM Studio is a popular alternative. It has a GUI, a CLI, can run as a headless daemon, and supports Apple’s MLX framework, which runs 20-30% faster than Ollama’s GGUF backend on Apple Silicon. Worth looking at, especially if inference speed matters to you. We’ll cover LM Studio in a separate guide and compare the two in detail.

This guide focuses on Ollama because it has the broader ecosystem today. Open WebUI, n8n, Continue (VS Code), LangChain, and most other tools that integrate with local LLMs expect an Ollama endpoint.

Checklist

- Ollama installed and at least one model pulled
- docker-compose.yml created and the container started with docker compose up -d
- Open WebUI loads at http://localhost:3002 and shows your models in the dropdown
- Bookmarked on family devices at http://<your-mac-ip>:3002

Frequently asked questions

Can I run ChatGPT locally without an account? Not ChatGPT itself, that’s OpenAI’s service and it stays on their servers. But Open WebUI with Ollama gives you the same chat interface running on your Mac. No account, no subscription, no data leaving your network. Your family won’t know the difference until they ask it something really hard.

How much RAM does Open WebUI need? The Open WebUI container uses about 500MB. The model is what eats memory. An 8B model needs ~8GB, a 32B model ~24GB. On a 64GB Mac you can run both alongside Immich and a dozen other containers without thinking about it.
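
You can verify those numbers on your own machine; both Docker and Ollama report live memory use:

```shell
# Memory used by the UI container itself (roughly 500MB)
docker stats --no-stream open-webui

# Memory used by loaded models, the real consumer
ollama ps
```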

Can my family use this from their phones? Open WebUI is a web app. Bookmark http://<your-mac-ip>:3002 on their phone, done. Looks and feels like ChatGPT. My wife started using it the same day I set it up, no tutorial needed.

Is this a good ChatGPT alternative for private questions? For the kind of questions you wouldn’t type into Google because they’re too personal, too specific, too embarrassing? A 32B local model handles those fine. Rashes at 10 PM, insurance letters you don’t understand, employment contract fine print. Cloud models are still ahead for complex research. But nothing here is logged, stored, or profiled.


Next steps: Want to manage your family’s photos too? Set up Immich on your Mac →

From the Build Log: I Bought a Used Mac Studio to Run Local LLMs and Local LLMs on a Mac: From Magic to Disappointment.

We invested the time to perfect the setup. So you don't have to.

Check out famstack.dev →

Try it with your local LLM

Copy this guide and paste it into Open WebUI or any local chat interface as a new conversation. Your local model becomes a setup assistant that walks you through each step, explains commands, and helps troubleshoot errors.
