I’m excited to share the first open-source release of my local-first AI automation engine: Guardian OSS.
This is the same minimal system that powers my pipelines for:
- 🧠 memory ingestion and reflection
- ✍️ blog post generation
- 🔁 autonomous codegen loops
- 🤖 LLM orchestration
Now it’s available for anyone to use, build on, or extend.
🧱 What’s Included in Guardian OSS
Guardian OSS is a small but powerful local-first backend designed to help you:
- Run your own AI agents locally
- Automate pipelines and workflows
- Store, retrieve, and reason over long-term memory
- Use your own models (via Ollama) for private inference
It’s made up of four services:
| Service | Description |
| --- | --- |
| 🧠 `guardian-open` | The main backend server that connects Supabase, Qdrant, Ollama, and all pipeline logic |
| ✍️ `codegen-worker` | A microservice that generates code/tests/docs from LLM prompts |
| 🔁 `ollama-proxy` | A local router for LLM requests that injects memory or JSON schemas into prompts |
| 🛠️ `guardian-cli` | A command-line interface to run workflows like blog generation or memory ingestion |
Each one runs independently, but they’re designed to work together.
🧠 guardian-open: The Core AI Backend
Why it exists:
I needed a central intelligence layer that could unify the systems I was already building:
- Supabase for relational data + auth
- Qdrant for vector search and memory retrieval
- Ollama for local inference
- Automations (n8n), Notion, GitHub, and more
But instead of creating a bloated monolith, I built a modular, lightweight API layer using Bun + Hono, with built-in support for:
- Memory creation + retrieval
- LLM routing via `runLLM`
- Blog + codegen pipeline triggers
- OpenAPI docs for dev clarity
This is the Sovereign MCP — it exposes a structured, typed, extendable REST API that agents, UIs, or CLI tools can call.
It’s the orchestration brain. Everything else plugs into this.
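To give a feel for what calling it looks like, here's a minimal sketch of a client hitting a locally running guardian-open instance. The `/memory` route, port, and payload fields are assumptions for illustration only; check the OpenAPI docs the server exposes for the real routes.

```ts
// Hypothetical client call against a locally running guardian-open instance.
// The /memory route, port, and payload fields are illustrative, not the documented API.
const GUARDIAN_URL = "http://localhost:3000"; // assumed default port

async function createMemory(content: string, tags: string[]) {
  const res = await fetch(`${GUARDIAN_URL}/memory`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ content, tags }),
  });
  if (!res.ok) throw new Error(`guardian-open returned ${res.status}`);
  return res.json(); // e.g. the stored memory record with its id
}

console.log(await createMemory("Shipped Guardian OSS v0.1", ["release", "oss"]));
```

The same pattern works from an agent, a UI, or a cron job: everything speaks plain JSON over HTTP to the one backend.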
✍️ codegen-worker: Decoupled, TDD-Oriented Code Generation
Why it exists:
I needed a decoupled testing and generation environment that could:
- Run code generation completely separate from your orchestrator
- Accept structured prompts and iterate on tests first, then code
- Prevent the main server from becoming a bottleneck during large tasks
- Operate as a clean, scriptable backend for TDD pipelines
This service makes it possible to:
- Export files or task descriptions to it
- Generate tests, then code, then docs
- Return the structured result back to the original location (via CLI or agent)
It effectively unlocks the automation of the full TDD cycle:
Prompt → Test → Code → Evaluate → Repeat — without freezing up your main server.
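Here's a rough sketch of what a driver for that loop could look like against the worker. The `/generate` endpoint, the `phase` field, and the response shape are assumptions for illustration, not the worker's documented API.

```ts
// Illustrative TDD driver: ask the worker for tests first, then code, and retry until green.
// The /generate endpoint, request fields, and response shape are hypothetical.
const WORKER_URL = "http://localhost:4000"; // assumed codegen-worker port

type GenRequest = { task: string; phase: "tests" | "code" | "docs"; context?: string };

async function generate(req: GenRequest): Promise<string> {
  const res = await fetch(`${WORKER_URL}/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req),
  });
  return (await res.json()).output;
}

async function passes(code: string, tests: string): Promise<boolean> {
  // Placeholder: in practice you'd write both files to a temp dir and shell out to `bun test`.
  return false;
}

const task = "Implement a slugify(title: string) helper";
const tests = await generate({ task, phase: "tests" }); // tests first

let code = "";
for (let attempt = 0; attempt < 3 && !(await passes(code, tests)); attempt++) {
  code = await generate({ task, phase: "code", context: tests }); // then code, until it passes
}

const docs = await generate({ task, phase: "docs", context: code }); // then docs
```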
🔁 ollama-proxy: Throttled Local Inference Gateway
Why it exists:
As I began chaining multiple LLM tasks (blog generation, memory classification, codegen), the system started slamming Ollama with concurrent requests, and my CPU was getting wrecked.
The `ollama-proxy` was created to:
- Queue and throttle inference requests
- Smooth out performance under load
- Prevent the rest of your system from becoming unresponsive
While `runLLM` handles routing, memory injection, and format schemas, this proxy adds resource protection, so you can scale up workflows without crashing the local model runtime.
It’s the circuit breaker between high-volume agent calls and your CPU.
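The core idea is easy to sketch. The snippet below is not the proxy's actual implementation, just a minimal illustration of queueing and throttling concurrent calls to Ollama's standard `/api/generate` endpoint; the concurrency limit is an arbitrary example value.

```ts
// Minimal concurrency limiter: at most MAX_CONCURRENT requests hit Ollama at once;
// the rest wait in a FIFO queue. Illustrative only, not the actual ollama-proxy source.
const MAX_CONCURRENT = 2; // example value; tune to your CPU
let active = 0;
const waiters: Array<() => void> = [];

async function throttled<T>(fn: () => Promise<T>): Promise<T> {
  while (active >= MAX_CONCURRENT) {
    await new Promise<void>((resolve) => waiters.push(resolve)); // wait for a free slot
  }
  active++;
  try {
    return await fn();
  } finally {
    active--;
    waiters.shift()?.(); // wake the next waiter, if any
  }
}

// Ollama's /api/generate endpoint and payload are real; the model name is just an example.
function generate(prompt: string): Promise<string> {
  return throttled(async () => {
    const res = await fetch("http://localhost:11434/api/generate", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model: "llama3", prompt, stream: false }),
    });
    return (await res.json()).response;
  });
}
```

However many agents call `generate()` at once, only a couple of requests ever reach the model runtime at the same time; the rest wait their turn instead of pinning every core.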
🛠️ guardian-cli: Local Workflow Teleporter
Why it exists:
I needed a way to operate on files and folders from anywhere, so I could:
- Select code or content locally
- Send it to the `codegen-worker` for processing
- Place the output back where it belongs
This isn’t just a dev tool — it’s a terminal-based transport layer for your agent workflows.
It allows you to:
- Run pipelines (`bloggen`, `reflect`, `ingest`) quickly
- Connect local dev files to remote agents
- Stay entirely local and scriptable without a frontend
Guardian CLI is the glue between your machine, your automations, and your ideas.
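In spirit, that round trip looks something like the sketch below. It's not the actual guardian-cli source; the worker endpoint, payload, and output path are hypothetical, and Bun's file APIs are used for brevity.

```ts
// Illustrative "teleport" round trip: read a local file, send it to codegen-worker,
// and write the processed result back beside the original. Endpoint and payload are hypothetical.
const WORKER_URL = "http://localhost:4000"; // assumed codegen-worker port

async function teleport(path: string, task: string) {
  const source = await Bun.file(path).text(); // read the locally selected file

  const res = await fetch(`${WORKER_URL}/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ task, phase: "docs", context: source }),
  });
  const { output } = await res.json();

  await Bun.write(path.replace(/\.ts$/, ".generated.ts"), output); // place the output back locally
}

await teleport("./src/utils/slugify.ts", "Write JSDoc comments for this module");
```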
Each of these tools solves a real bottleneck I hit while building autonomous systems. Now they're available for others to build on too.
⚡ Why I Built This
I needed a reliable, local-first automation layer to build on top of:
- Supabase (for structured storage)
- Qdrant (for memory vector search)
- Ollama (for private inference)
- Bun + Hono (for speed and dev DX)
I couldn’t find something lightweight, customizable, and agent-friendly… so I built one.
Now, I want to share the core with others building their own intelligent workflows.
📦 Repo + Install
GitHub: https://github.com/loveliiivelaugh/guardian-oss
```bash
git clone https://github.com/loveliiivelaugh/guardian-oss.git
cd guardian-oss
bun install
docker-compose up
```
Note: You’ll also want to follow my Getting Set Up Guide for Supabase and Qdrant dependencies.
📘 How It Works
- LLM requests go through `ollama-proxy`, where you can inject context, memory, or schemas.
- Tasks like `bloggen` or `codegen` get sent to dedicated workers.
- All memory is persisted and retrievable with metadata.
- Every component can run locally without needing cloud APIs (unless you choose to).
You can trigger everything from the terminal using `guardian-cli`, or wire it into an n8n flow or frontend UI.
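As a concrete example of the schema idea, here's what a structured request could look like when routed through the proxy. The proxy port and pass-through path are assumptions; the payload itself is the standard Ollama `/api/chat` format with `format: "json"`.

```ts
// Example structured request. Pointing at ollama-proxy instead of Ollama directly lets the
// proxy queue the call and (per the design above) inject memory before forwarding.
// The proxy port and pass-through path are assumptions; the payload is standard Ollama /api/chat.
const PROXY_URL = "http://localhost:8085"; // assumed proxy port; Ollama itself defaults to 11434

const res = await fetch(`${PROXY_URL}/api/chat`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama3", // example model
    format: "json",  // ask Ollama to return valid JSON
    stream: false,
    messages: [
      { role: "system", content: 'Classify the memory. Reply as {"topic": string, "importance": 1-5}.' },
      { role: "user", content: "Shipped the first OSS release of Guardian today." },
    ],
  }),
});

const { message } = await res.json(); // non-streaming chat responses carry a `message` object
console.log(JSON.parse(message.content)); // e.g. { "topic": "release", "importance": 4 }
```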
🧠 What’s Next
This is just the foundation. I’ll be publishing step-by-step tutorials soon for:
- Setting up your own local agent stack
- Building an AI-generated blog pipeline
- Automating test-driven development using LLMs
- Wiring up Notion, GitHub, Slack, and WordPress
- Creating memory-based decision-making agents
All future guides will use Guardian OSS as the base.
✨ Final Thought
Guardian OSS isn’t just a codebase. It’s a system for building systems — agentically, locally, and modularly.
If you’ve been wanting to explore LLM-powered automation, but don’t want to rely on closed APIs or bloated frameworks, this is for you.
🛠️ Set it up once.
💭 Grow it as you go.