Getting Started
Get Sabre running in under a minute. One install, one command, real answers.
1. Install Sabre
$ curl -fsSL https://getsabre.io/install.sh | bash
The installer sets up Python and Ollama, then downloads the default model. No API keys or accounts are required.
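To confirm the install worked, you can check that the tools are reachable from your shell. The sketch below assumes the installer places `sabre` and `ollama` on your PATH, which this guide does not state explicitly; `check_tool` is an illustrative helper, not part of Sabre.

```shell
# Report whether a command is available on PATH.
# Returns 0 if found, 1 otherwise.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    printf 'ok: %s\n' "$1"
  else
    printf 'missing: %s\n' "$1"
    return 1
  fi
}

# Assumption: the installer exposes these two commands on PATH.
for tool in sabre ollama; do
  check_tool "$tool" || true
done
```

If either tool is reported as missing, opening a new terminal so that PATH changes take effect is a reasonable first thing to try.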
2. Run Sabre
Launch Sabre with the default model:
$ sabre
Use the --model flag to pick a specific model. See benchmarks for model recommendations.
$ sabre --model qwen3.6:35b-a3b
You can also pass a query directly:
$ sabre --model qwen3.6:35b-a3b "Scale my deployment to 5 replicas"
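If you tend to use the same model, a small shell function can save some typing. This is only a sketch: `ask` and `SABRE_MODEL` are hypothetical names chosen for illustration and are not part of Sabre. It relies solely on the `sabre --model <model> "<query>"` form shown above.

```shell
# Hypothetical helper: `ask` and SABRE_MODEL are illustrative names,
# not part of Sabre. Only the `sabre --model <model> "<query>"` form
# from this guide is used.
ask() {
  if [ "$#" -eq 0 ]; then
    echo 'usage: ask "<query>"' >&2
    return 2
  fi
  # Fall back to the model used in this guide when SABRE_MODEL is unset.
  # "$*" joins all arguments into a single query string.
  sabre --model "${SABRE_MODEL:-qwen3.6:35b-a3b}" "$*"
}
```

After adding the function to your shell profile, `ask "Scale my deployment to 5 replicas"` would run the same command as above.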
3. Your First Query
Ask Sabre about a real issue in your cluster:
$ sabre --model qwen3.6:35b-a3b "Why is my pod crashlooping?"
Here are more things you can try:
- "Fix the image pull error on my deployment"
- "Why are my pods stuck in pending?"
- "Set up autoscaling for my-app to scale between 2-10 pods based on CPU"
- "Create a role that allows reading pods in the dev namespace"
- "What's consuming the most resources in my cluster?"
Data Storage
XDG Base Directories
Sabre follows the XDG Base Directory specification. All data stays on your machine:
- ~/.local/share/sabre/: application data (conversation history, models)
- ~/.local/state/sabre/logs/: runtime logs
- ~/.config/sabre/: configuration files
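The paths above are the XDG defaults. The following sketch resolves the same locations while allowing for the standard XDG override variables. It assumes Sabre honors `XDG_DATA_HOME`, `XDG_STATE_HOME`, and `XDG_CONFIG_HOME` when they are set, as the specification describes; the `sabre_*_dir` function names are illustrative only.

```shell
# Resolve Sabre's directories using XDG defaults.
# Assumption: Sabre honors XDG_DATA_HOME, XDG_STATE_HOME, and
# XDG_CONFIG_HOME when they are set, as the XDG specification describes.
sabre_data_dir()   { printf '%s\n' "${XDG_DATA_HOME:-$HOME/.local/share}/sabre"; }
sabre_log_dir()    { printf '%s\n' "${XDG_STATE_HOME:-$HOME/.local/state}/sabre/logs"; }
sabre_config_dir() { printf '%s\n' "${XDG_CONFIG_HOME:-$HOME/.config}/sabre"; }

# Print each location and whether it exists yet.
for dir in "$(sabre_data_dir)" "$(sabre_log_dir)" "$(sabre_config_dir)"; do
  if [ -d "$dir" ]; then
    printf 'present  %s\n' "$dir"
  else
    printf 'absent   %s\n' "$dir"
  fi
done
```

A directory reported as absent may simply mean Sabre has not written to it yet.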
5. Performance Tuning
Squeeze more performance out of Ollama with these environment variables:
| Variable | Value | Description |
|---|---|---|
| OLLAMA_MULTIUSER_CACHE | 1 | Enables prompt caching across requests. Reduces latency for repeated prefixes. |
| OLLAMA_MLX | 1 | Uses the MLX runner on macOS for faster Apple Silicon inference. |
Set these before starting Ollama:
$ OLLAMA_MULTIUSER_CACHE=1 OLLAMA_MLX=1 ollama serve
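Because the table describes `OLLAMA_MLX` as macOS-specific, a launcher script might set it only there. The sketch below is one way to do that; `ollama_tuning_env` is a hypothetical helper. Gating on the `Darwin` kernel name is a design choice made here, not something the guide prescribes, and it does not check for Apple Silicon.

```shell
# Print the tuning variables from the table as VAR=VALUE lines.
# $1 is the kernel name; it defaults to the output of `uname -s`.
ollama_tuning_env() {
  os="${1:-$(uname -s)}"
  echo "OLLAMA_MULTIUSER_CACHE=1"
  # The table describes OLLAMA_MLX as macOS-only, so skip it elsewhere.
  if [ "$os" = "Darwin" ]; then
    echo "OLLAMA_MLX=1"
  fi
}

# Show what would be set on this machine.
ollama_tuning_env
```

To start the server with these settings, you could run `env $(ollama_tuning_env) ollama serve`.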