I’m excited to share the first open-source release of my local-first AI automation engine: Guardian OSS.
This is the same minimal system that powers my pipelines for:
- 🧠 memory ingestion and reflection
- ✍️ blog post generation
- 🔁 autonomous codegen loops
- 🤖 LLM orchestration
Now it’s available for anyone to use, build on, or extend.
🧱 What’s Included in Guardian OSS
Guardian OSS is a small but powerful local-first backend designed to help you:
- Run your own AI agents locally
- Automate pipelines and workflows
- Store, retrieve, and reason over long-term memory
- Use your own models (via Ollama) for private inference
It’s made up of four services:
| Service | Description |
| --- | --- |
| 🧠 `guardian-open` | The main backend server that connects Supabase, Qdrant, Ollama, and all pipeline logic |
| ✍️ `codegen-worker` | A microservice that generates code/tests/docs from LLM prompts |
| 🔁 `ollama-proxy` | A local router for LLM requests that injects memory or JSON schemas into prompts |
| 🛠️ `guardian-cli` | A command-line interface to run workflows like blog generation or memory ingestion |
Each one runs independently, but they’re designed to work together.
🧠 guardian-open: The Core AI Backend
Why it exists:
I needed a central intelligence layer that could unify the systems I was already building:
- Supabase for relational data + auth
- Qdrant for vector search and memory retrieval
- Ollama for local inference
- Automations (n8n), Notion, GitHub, and more
But instead of creating a bloated monolith, I built a modular, lightweight API layer using Bun + Hono, with built-in support for:
- Memory creation + retrieval
- LLM routing via `runLLM`
- Blog + codegen pipeline triggers
- OpenAPI docs for dev clarity
This is the Sovereign MCP — it exposes a structured, typed, extendable REST API that agents, UIs, or CLI tools can call.
It’s the orchestration brain. Everything else plugs into this.
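To give a feel for what calling it looks like, here's a minimal sketch of a client hitting a locally running guardian-open instance. The `/memory` route, port, and payload fields are assumptions for illustration only; check the OpenAPI docs the server exposes for the real routes.

```ts
// Hypothetical client call against a locally running guardian-open instance.
// The /memory route, port, and payload fields are illustrative, not the documented API.
const GUARDIAN_URL = "http://localhost:3000"; // assumed default port

async function createMemory(content: string, tags: string[]) {
  const res = await fetch(`${GUARDIAN_URL}/memory`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ content, tags }),
  });
  if (!res.ok) throw new Error(`guardian-open returned ${res.status}`);
  return res.json(); // e.g. the stored memory record with its id
}

console.log(await createMemory("Shipped Guardian OSS v0.1", ["release", "oss"]));
```

The same pattern works from an agent, a UI, or a cron job: everything speaks plain JSON over HTTP to the one backend.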
✍️ codegen-worker: Decoupled, TDD-Oriented Code Generation
Why it exists:
I needed a decoupled testing and generation environment that could:
- Run code generation completely separate from your orchestrator
- Accept structured prompts and iterate on tests first, then code
- Prevent the main server from becoming a bottleneck during large tasks
- Operate as a clean, scriptable backend for TDD pipelines
This service makes it possible to:
- Export files or task descriptions to it
- Generate tests, then code, then docs
- Return the structured result back to the original location (via CLI or agent)
It effectively unlocks the automation of the full TDD cycle:
Prompt → Test → Code → Evaluate → Repeat — without freezing up your main server.
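Here's a rough sketch of what a driver for that loop could look like against the worker. The `/generate` endpoint, the `phase` field, and the response shape are assumptions for illustration, not the worker's documented API.

```ts
// Illustrative TDD driver: ask the worker for tests first, then code, and retry until green.
// The /generate endpoint, request fields, and response shape are hypothetical.
const WORKER_URL = "http://localhost:4000"; // assumed codegen-worker port

type GenRequest = { task: string; phase: "tests" | "code" | "docs"; context?: string };

async function generate(req: GenRequest): Promise<string> {
  const res = await fetch(`${WORKER_URL}/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req),
  });
  return (await res.json()).output;
}

async function passes(code: string, tests: string): Promise<boolean> {
  // Placeholder: in practice you'd write both files to a temp dir and shell out to `bun test`.
  return false;
}

const task = "Implement a slugify(title: string) helper";
const tests = await generate({ task, phase: "tests" }); // tests first

let code = "";
for (let attempt = 0; attempt < 3 && !(await passes(code, tests)); attempt++) {
  code = await generate({ task, phase: "code", context: tests }); // then code, until it passes
}

const docs = await generate({ task, phase: "docs", context: code }); // then docs
```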
🔁 ollama-proxy: Throttled Local Inference Gateway
Why it exists:
As I began chaining multiple LLM tasks (blog generation, memory classification, codegen), the system started slamming Ollama with concurrent requests, and my CPU was getting wrecked.
The `ollama-proxy` was created to:
- Queue and throttle inference requests
- Smooth out performance under load
- Prevent the rest of your system from becoming unresponsive
While `runLLM` handles routing, memory injection, and format schemas, this proxy adds resource protection, so you can scale up workflows without crashing the local model runtime.
It’s the circuit breaker between high-volume agent calls and your CPU.
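The core idea is easy to sketch. The snippet below is not the proxy's actual implementation, just a minimal illustration of queueing and throttling concurrent calls to Ollama's standard `/api/generate` endpoint; the concurrency limit is an arbitrary example value.

```ts
// Minimal concurrency limiter: at most MAX_CONCURRENT requests hit Ollama at once;
// the rest wait in a FIFO queue. Illustrative only, not the actual ollama-proxy source.
const MAX_CONCURRENT = 2; // example value; tune to your CPU
let active = 0;
const waiters: Array<() => void> = [];

async function throttled<T>(fn: () => Promise<T>): Promise<T> {
  while (active >= MAX_CONCURRENT) {
    await new Promise<void>((resolve) => waiters.push(resolve)); // wait for a free slot
  }
  active++;
  try {
    return await fn();
  } finally {
    active--;
    waiters.shift()?.(); // wake the next waiter, if any
  }
}

// Ollama's /api/generate endpoint and payload are real; the model name is just an example.
function generate(prompt: string): Promise<string> {
  return throttled(async () => {
    const res = await fetch("http://localhost:11434/api/generate", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model: "llama3", prompt, stream: false }),
    });
    return (await res.json()).response;
  });
}
```

However many agents call `generate()` at once, only a couple of requests ever reach the model runtime at the same time; the rest wait their turn instead of pinning every core.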
🛠️ guardian-cli: Local Workflow Teleporter
Why it exists:
I needed a way to operate on files and folders from anywhere, so I could:
- Select code or content locally
- Send it to the `codegen-worker` for processing
- Place the output back where it belongs
This isn’t just a dev tool — it’s a terminal-based transport layer for your agent workflows.
It allows you to:
- Run pipelines (`bloggen`, `reflect`, `ingest`) quickly
- Connect local dev files to remote agents
- Stay entirely local and scriptable without a frontend
Guardian CLI is the glue between your machine, your automations, and your ideas.
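In spirit, that round trip looks something like the sketch below. It's not the actual guardian-cli source; the worker endpoint, payload, and output path are hypothetical, and Bun's file APIs are used for brevity.

```ts
// Illustrative "teleport" round trip: read a local file, send it to codegen-worker,
// and write the processed result back beside the original. Endpoint and payload are hypothetical.
const WORKER_URL = "http://localhost:4000"; // assumed codegen-worker port

async function teleport(path: string, task: string) {
  const source = await Bun.file(path).text(); // read the locally selected file

  const res = await fetch(`${WORKER_URL}/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ task, phase: "docs", context: source }),
  });
  const { output } = await res.json();

  await Bun.write(path.replace(/\.ts$/, ".generated.ts"), output); // place the output back locally
}

await teleport("./src/utils/slugify.ts", "Write JSDoc comments for this module");
```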
Each of these tools solves a real bottleneck I hit while building autonomous systems. Now they're available for others to build on too.
⚡ Why I Built This
I needed a reliable, local-first automation layer to build on top of:
- Supabase (for structured storage)
- Qdrant (for memory vector search)
- Ollama (for private inference)
- Bun + Hono (for speed and dev DX)
I couldn’t find something lightweight, customizable, and agent-friendly… so I built one.
Now, I want to share the core with others building their own intelligent workflows.
📦 Repo + Install
GitHub: https://github.com/loveliiivelaugh/guardian-oss
```bash
git clone https://github.com/loveliiivelaugh/guardian-oss.git
cd guardian-oss
bun install
docker-compose up
```
Note: You’ll also want to follow my Getting Set Up Guide for Supabase and Qdrant dependencies.
📘 How It Works
- LLM requests go through `ollama-proxy`, where you can inject context, memory, or schemas.
- Tasks like `bloggen` or `codegen` get sent to dedicated workers.
- All memory is persisted and retrievable with metadata.
- Every component can run locally without needing cloud APIs (unless you choose to).
You can trigger everything from the terminal using `guardian-cli`, or wire it into an n8n flow or frontend UI.
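As a concrete example of the schema idea, here's what a structured request could look like when routed through the proxy. The proxy port and pass-through path are assumptions; the payload itself is the standard Ollama `/api/chat` format with `format: "json"`.

```ts
// Example structured request. Pointing at ollama-proxy instead of Ollama directly lets the
// proxy queue the call and (per the design above) inject memory before forwarding.
// The proxy port and pass-through path are assumptions; the payload is standard Ollama /api/chat.
const PROXY_URL = "http://localhost:8085"; // assumed proxy port; Ollama itself defaults to 11434

const res = await fetch(`${PROXY_URL}/api/chat`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama3", // example model
    format: "json",  // ask Ollama to return valid JSON
    stream: false,
    messages: [
      { role: "system", content: 'Classify the memory. Reply as {"topic": string, "importance": 1-5}.' },
      { role: "user", content: "Shipped the first OSS release of Guardian today." },
    ],
  }),
});

const { message } = await res.json(); // non-streaming chat responses carry a `message` object
console.log(JSON.parse(message.content)); // e.g. { "topic": "release", "importance": 4 }
```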
🧠 What’s Next
This is just the foundation. I’ll be publishing step-by-step tutorials soon for:
- Setting up your own local agent stack
- Building an AI-generated blog pipeline
- Automating test-driven development using LLMs
- Wiring up Notion, GitHub, Slack, and WordPress
- Creating memory-based decision-making agents
All future guides will use Guardian OSS as the base.
✨ Final Thought
Guardian OSS isn’t just a codebase. It’s a system for building systems — agentically, locally, and modularly.
If you’ve been wanting to explore LLM-powered automation, but don’t want to rely on closed APIs or bloated frameworks, this is for you.
🛠️ Set it up once.
💭 Grow it as you go.