Date Created: 2025-05-07
By: 16BitMiker
I picked up a refurbished Mac with Apple Silicon—specifically an M2 model with 96GB of RAM—after spotting it at a surprisingly reasonable price. It wasn’t the latest generation, but the specs were exactly what I was looking for: powerful enough to experiment with large language models locally, without relying on cloud infrastructure or remote GPUs.
It’s been an incredible experience. With this setup, I’m able to run LLMs right on the machine, talk to them through a clean browser interface, and keep everything containerized and isolated using Docker. In this guide, I’ll walk you through everything I did to get Ollama and Open WebUI working together on macOS, including a networking trick with socat that bridges Docker to your local system.
Let’s get started. 🚀
Here’s what you’ll need:

- A Mac with Apple Silicon (M2, M3, M4, etc.)
- Ollama installed locally via Homebrew
- Docker Desktop installed and running
- Homebrew (for installing socat)
- Familiarity with the terminal and the zsh shell
First, make sure you have the tools we’ll need:
```bash
brew install socat
brew install --cask docker
brew install ollama
```
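If you want to confirm everything landed where expected, a quick version check like the one below works (exact output varies by release):

```bash
# Confirm each tool is on your PATH and reports a version
socat -V | head -n 1
docker --version
ollama --version
```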
Now start the Ollama service:
```bash
ollama serve
```
This runs a local HTTP server on port 11434 and makes Ollama ready to receive requests.
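Before adding any proxying, it’s worth confirming the server answers on its default port. A quick check against Ollama’s /api/tags endpoint (which lists locally installed models) might look like this:

```bash
# Should return JSON describing locally installed models
# (an empty list is fine on a fresh install)
curl http://localhost:11434/api/tags
```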
On macOS, Docker containers can’t directly reach services running on the host’s localhost. To work around this, we use socat to proxy traffic from an external port (11435) to Ollama’s internal service on localhost:11434.
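If you want to see the bridge in action before wiring up anything permanent, you can run socat in the foreground once and hit the forwarded port from another terminal. This is just a throwaway test; the launch agent below does the same thing persistently.

```bash
# Terminal 1: forward 0.0.0.0:11435 to Ollama on localhost:11434 (Ctrl+C to stop)
socat TCP-LISTEN:11435,fork,reuseaddr,bind=0.0.0.0 TCP:localhost:11434

# Terminal 2: the same API, now reachable on the proxied port
curl http://localhost:11435/api/tags
```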
This method ensures the proxy starts automatically every time you log in.
🔹 Create a new .plist file:
```bash
mkdir -p ~/Library/LaunchAgents
nano ~/Library/LaunchAgents/com.ollama.proxy.plist
```
Paste the following content:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
  <dict>
    <key>Label</key>
    <string>com.ollama.proxy</string>
    <key>ProgramArguments</key>
    <array>
      <string>/opt/homebrew/bin/socat</string>
      <string>TCP-LISTEN:11435,fork,reuseaddr,bind=0.0.0.0</string>
      <string>TCP:localhost:11434</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <true/>
    <key>StandardOutPath</key>
    <string>/tmp/ollama-proxy.log</string>
    <key>StandardErrorPath</key>
    <string>/tmp/ollama-proxy.err</string>
  </dict>
</plist>
```
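Before loading it, you can optionally validate the plist syntax with plutil, which ships with macOS:

```bash
# Prints "OK" if the file parses as a valid property list
plutil -lint ~/Library/LaunchAgents/com.ollama.proxy.plist
```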
🔹 Load the launch agent:
```bash
launchctl load ~/Library/LaunchAgents/com.ollama.proxy.plist
```
To unload it:
```bash
launchctl unload ~/Library/LaunchAgents/com.ollama.proxy.plist
```
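To confirm the agent actually came up, you can check launchd’s view of it and make sure something is listening on 11435; the log paths from the plist are handy if it didn’t:

```bash
# The agent should appear in launchd's job list
launchctl list | grep com.ollama.proxy

# socat should be listening on the proxy port
lsof -i :11435

# Check the logs defined in the plist if anything looks off
tail /tmp/ollama-proxy.log /tmp/ollama-proxy.err
```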
If you’d rather not use the macOS launch agent system, you can start socat from your ~/.zprofile instead, so the proxy comes up on shell login:
```bash
echo 'nohup /opt/homebrew/bin/socat TCP-LISTEN:11435,fork,reuseaddr,bind=0.0.0.0 TCP:localhost:11434 > /tmp/ollama-proxy.log 2> /tmp/ollama-proxy.err &' >> ~/.zprofile
```
This adds a background job to your login shell that starts socat and logs output to /tmp.
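You can confirm the line landed in ~/.zprofile, then either open a new login shell or source the file to start the proxy right away:

```bash
# The nohup/socat line should now be the last entry
tail -n 1 ~/.zprofile

# Start it now instead of waiting for the next login
source ~/.zprofile
```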
To simplify your workflow, add a few helpful aliases to your ~/.zshrc:
```bash
# Check if the proxy is listening
alias check-ollama-proxy="lsof -i :11435"

# Start the proxy manually
alias start-ollama-proxy="nohup /opt/homebrew/bin/socat TCP-LISTEN:11435,fork,reuseaddr,bind=0.0.0.0 TCP:localhost:11434 > /tmp/ollama-proxy.log 2> /tmp/ollama-proxy.err &"

# Stop all socat processes
alias stop-ollama-proxy="killall socat"
```
Apply your changes:
```bash
source ~/.zshrc
```
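With the aliases loaded, a typical check looks like this; if the proxy is healthy, a request to port 11435 should behave exactly like one to 11434:

```bash
# Start the proxy (skip if the launch agent or ~/.zprofile already did)
start-ollama-proxy

# Verify something is listening on 11435...
check-ollama-proxy

# ...and that requests are actually forwarded to Ollama
curl http://localhost:11435/api/tags
```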
Next, let’s launch Open WebUI using Docker Compose. This gives you a clean browser-based frontend for interacting with your local Ollama models.
Create a folder (e.g., ollama-webui) and save the following as docker-compose.yml:
```yaml
version: '3'

services:
  openwebui:
    image: ghcr.io/open-webui/open-webui:main
    platform: linux/arm64
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://host.docker.internal:11435
      - WEBUI_SECRET_KEY=b2be59d1fe71abd3c5828b01f5c0f5dac939cb1439303166560925b786096c9b
      - AIOHTTP_CLIENT_TIMEOUT=600
    volumes:
      - openwebui_data:/app/backend/data
    restart: unless-stopped

volumes:
  openwebui_data:
```
🔹 Start the container:
```bash
docker compose up -d
```
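To watch the container come up, or to debug a blank page on first load, compose’s status and log commands are usually enough:

```bash
# Show container state and the 3000 -> 8080 port mapping
docker compose ps

# Follow Open WebUI's startup logs (Ctrl+C to stop following)
docker compose logs -f openwebui
```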
🔹 Stop it when done:
```bash
docker compose down
```
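Note that a plain down keeps the openwebui_data volume, so your accounts and chat history survive. If you ever want a truly clean slate, remove the volume as well:

```bash
# Stops the container AND deletes the openwebui_data volume (chats, settings, users)
docker compose down -v
```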
Let’s test the full stack:
Start Ollama:
```bash
ollama serve
```
Confirm the proxy is running:
```bash
check-ollama-proxy
```
Open your browser to:
```
http://localhost:3000
```
You should now see Open WebUI connected to your local Ollama instance.
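If the model picker in Open WebUI is empty, that usually just means Ollama has no models pulled yet (or the proxy isn’t reachable). Grab one from the CLI and refresh the page; the model name below is only an example, so pick something sized for your RAM:

```bash
# Pull an example model and confirm it's registered with Ollama
ollama pull llama3.1
ollama list
```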
This setup runs beautifully on Apple Silicon. Thanks to the M2’s performance and unified memory, even larger models respond quickly and don’t bog down the system.
Although my machine isn’t the newest on the block, its 96GB RAM and efficient chip architecture make it well-suited for serving and interacting with local AI models. The combination of Docker isolation, Ollama’s CLI server, and Open WebUI’s frontend makes for a clean and reliable environment.
If you're exploring local LLM workflows on Apple Silicon, this is a setup worth trying. It’s lightweight, private, and fast. With Docker for isolation, a socat bridge for networking, and Open WebUI for interaction, it’s easy to spin up and maintain.
You don’t need the absolute latest hardware to get started. Just a well-equipped Mac and a little configuration, and you’ll be running your own local AI stack with no cloud dependencies.
Happy hacking! 🧪✨