Everything You Need to Build a Self-Hosted AI Automation System
Over the past year, I’ve been building a system that does something different.
Not just a chatbot. Not just an agent that runs a few tasks. I’ve built a real AI-powered infrastructure layer — a system that thinks, remembers, reasons, and acts on my behalf across software engineering, life automation, writing, health, and more.
I call it Guardian.
And now, for the first time, I’m going to open it up — not just as open-source tools, but as a full learning system.
This blog post is your roadmap to everything I’ll be teaching:
The Guardian Tutorial Series — a complete guide to building autonomous, memory-powered, locally-hosted AI systems that actually do work.
🎓 What This Series Covers
This series is not about building a toy chatbot or installing one LLM on your laptop.
It’s about:
- Designing a stack that lives alongside you
- Connecting tools like Supabase, Qdrant, Ollama, and Docker
- Building pipelines that generate blog posts, write code, or ingest real-world signals
- Creating a memory system that augments everything your AI does
- Building the future of agentic computing — privately, securely, and locally
Whether you’re just starting with LLMs or looking to build something deeper, this is the first structured curriculum built from a real system running in production.
🧱 Core Setup: Building the Base Stack
Before anything can run, we need the environment.
These tutorials cover the full foundational setup:
🧠 Getting Set Up: Tools I Use to Power My Stack
Overview of everything I run: local LLMs, databases, infra monitoring, networking, and agents.
blog.woodwardwebdev.com/post/getting-set-up-tools-i-use-to-power-my-stack
📦 Installing Supabase Locally
Supabase is the structured data backbone — I’ll walk you through setting up your own Postgres-powered system with auth, file storage, and realtime triggers.
supabase.com/docs/guides/self-hosting
🧭 Understanding Qdrant, Embeddings, and Memory
This one goes deep — I explain:
- What embeddings are
- Why vector size matters
- How semantic search works
- How to embed, classify, and store long-term memory in Qdrant
blog.woodwardwebdev.com/post/understanding-embeddings-and-vector-databases-with-qdrant
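At its core, semantic search is just comparing embedding vectors, usually with cosine similarity. Here's a minimal sketch in plain TypeScript — no Qdrant dependency, and the tiny 3-dimensional "embeddings" are toy values (real models like mxbai-embed-large emit ~1024 dimensions):

```typescript
// Cosine similarity: 1.0 means identical direction, ~0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors standing in for real embeddings.
const query = [0.9, 0.1, 0.0];
const memories = [
  { text: "qdrant setup notes", vector: [0.8, 0.2, 0.1] },
  { text: "grocery list", vector: [0.0, 0.1, 0.9] },
];
const ranked = memories
  .map((m) => ({ ...m, score: cosineSimilarity(query, m.vector) }))
  .sort((x, y) => y.score - x.score);
console.log(ranked[0].text); // "qdrant setup notes" ranks first
```

Qdrant does exactly this comparison, just over millions of vectors with an index instead of a linear scan.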
🔌 Connecting Supabase, Qdrant, and Ollama
How to wire up your local models to your memory systems — the core of context-aware AI.
blog.woodwardwebdev.com/post/connecting-supabase-qdrant-and-ollama-with-guardian
🧠 Running Local LLMs in Production (with Ollama)
Instead of paying for cloud tokens, I show you how to run Llama 3, Dolphin, DeepSeek, and Whisper — and how I use them every day with no cost or API limits.
blog.woodwardwebdev.com/post/how-i-use-local-llms-in-production
🌐 Securing and Accessing Your System with Tailscale + Cloudflare Tunnel
This tutorial shows you how to route external traffic securely into your system without exposing anything to the open internet.
📊 Observability Stack (Grafana + Prometheus + Uptime Kuma)
If you’re going to run agents 24/7, you need insight. This tutorial shows how I monitor every service, endpoint, and memory log.
blog.woodwardwebdev.com/post/why-i-built-my-own-analytics-observability-stack-for-10-month
🧠 Guardian Core
This is the brain — the API and execution engine for the entire system.
🧷 Guardian Server: The Sovereign Automation Brain
Tutorial on setting up the Guardian server: Hono-based backend, memory router, LLM orchestration, and REST API layer.
Includes:
- addMemory() and runLLM() wiring
- Pipeline dispatcher
- Notion, GitHub, Slack, WordPress integrations
blog.woodwardwebdev.com/post/introducing-guardian-oss-a-minimal-ai-automation-core
💻 guardian-cli: A Developer Interface to Your Agents
Build and install the CLI tool that lets you:
- Run pipelines from terminal
- Trigger test generation
- Query or add memory
- Run agents with arguments
blog.woodwardwebdev.com/post/introducing-guardian-oss-a-minimal-ai-automation-core
🧪 CodeGen Microservice
An agentic backend that generates code from tasks, creates tests, and opens GitHub PRs. This will get its own repo + tutorial.
blog.woodwardwebdev.com/post/codegen-microservice-ai-powered-agentic-code-writing-tdd-style
🧠 Ollama Proxy Service
Centralized LLM inference router — great for swapping models, injecting memory, and keeping agent calls uniform. Open-sourced as a general-purpose LLM interface.
🔁 Pipeline Tutorials
Each of these systems is a standalone tutorial with full walkthroughs, diagrams, and memory flow.
📝 BlogGen Pipeline
Convert a single idea into a full-length blog post.
- Injects memory
- Uses local LLM
- Saves to database
- Publishes to WordPress
- (Optional) Auto-generates thumbnail
→ Tutorial: “How to Auto-Write and Post Blogs with Your Own AI”
💻 CodeGen Pipeline
Give it a task spec — it returns:
- Working code
- Tests
- PR branch + GitHub commit
- Memory snapshot of the change
→ Tutorial: “Letting an Agent Write Code for You”
✅ TestGen Pipeline
Point it at a file or folder — it analyzes the code and generates:
- Test stubs
- Assertions
- Snapshot comparisons
- Coverage memory
→ Tutorial: “Autogenerating Tests with Guardian CLI”
🧠 Memory Ingest Pipeline
Feed Guardian:
- PDFs
- Notes
- Audio transcripts
- Email content
It classifies, embeds, and stores with memory tags and source traceability.
→ Tutorial: “Teaching Your System What Matters”
🔍 Task Classifier Pipeline
Classifies inbound tasks by type, origin, and urgency, then routes them to:
- BlogGen
- CodeGen
- Slack Agent
- Notion PR bot
→ Tutorial: “Auto-Routing Tasks with Embedded Reasoning”
🔐 CommitGen Pipeline
Reads Git diffs → summarizes → generates semantic commit messages.
→ Tutorial: “Leveling Up Your Git Workflow with Agent Intelligence”
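To make the "reads Git diffs → summarizes" step concrete, here's an illustrative sketch of the first stage: turning a raw diff into an LLM prompt. `buildCommitPrompt()` and the prompt wording are hypothetical, not Guardian's actual implementation:

```typescript
// Hypothetical helper: extract touched file paths from a unified diff and
// wrap the diff in a summarization prompt for a local LLM.
function buildCommitPrompt(diff: string): string {
  const files = Array.from(new Set(
    diff
      .split("\n")
      .filter((l) => l.startsWith("+++ b/"))
      .map((l) => l.slice("+++ b/".length)),
  ));
  return [
    "Summarize this git diff as one conventional commit message",
    "(type(scope): subject). Files touched: " + files.join(", "),
    "",
    diff,
  ].join("\n");
}

const exampleDiff = [
  "--- a/src/memory.ts",
  "+++ b/src/memory.ts",
  "@@ -1,3 +1,4 @@",
  '+export const VERSION = "1.0";',
].join("\n");
// The resulting prompt string is what you'd hand to your LLM call.
```

The payoff is that the model sees both the file list and the diff body, which tends to produce much better scoped commit subjects than the diff alone.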
📬 Email Agent Pipeline
Scans inbound email, extracts metadata, classifies content, and optionally:
- Updates Notion
- Adds to memory
- Triggers BlogGen or CodeGen
→ Tutorial: “Ingesting Email into Your AI’s Brain”
📡 Reflect + Sync Pipelines
Once a task is complete:
- Guardian reflects using LLM
- Adds a memory summary
- Updates task boards (Notion, Supabase)
→ Tutorial: “Closing the Loop with Memory Reflections”
📹 AI Ingestion Pipelines
📼 YouTube Playlist Ingest
Takes an entire playlist:
- Transcribes every video
- Summarizes per-episode
- Links metadata
- Tags for future retrieval
→ Tutorial: “Turning YouTube Playlists Into Semantic Memory”
👁️ Vision Pipeline
Feed Guardian a screenshot or image:
- Runs OCR and/or LLaVA
- Extracts visible text
- Adds to memory or routes to a pipeline
→ Tutorial: “Using Visual Inputs to Trigger Actions”
🎙️ Voice Pipeline
- Record audio
- Transcribe (Whisper)
- Detect intent/emotion (optional)
- Trigger BlogGen, memory update, or codegen task
→ Tutorial: “Voice-to-Agent: Automating From Spoken Input”
📈 Personal Life Pipelines
🧾 Receipts Pipeline
Drop a receipt in a folder (or forward to email):
- OCRs it
- Classifies vendor, amount, and category
- Adds to memory + file log
- Optionally triggers financial summaries
→ Tutorial: “Building a Receipts-to-Memory Automation”
🏃 Apple HealthKit Ingest
Uses OpenFitness or JSON data from HealthKit:
- Syncs weight, steps, activity, sleep
- Adds trends to memory
- Can trigger nudges or future workflows
→ Tutorial: “Syncing Your Health into an AI Feedback Loop”
🧠 Building a MemoryManager: Giving Guardian a Reliable, Searchable Mind
In any agentic system — especially one designed for autonomy — memory is everything.
Without memory:
- Agents forget what they’ve done
- Context is lost between steps
- Decisions become shallow and repetitive
So I built a MemoryManager — a pipeline-driven engine designed to process, classify, embed, and organize thousands of high-quality memories into Qdrant and Supabase. This became the foundation of everything Guardian does.
🧠 What Is the MemoryManager?
The MemoryManager is a process + system that:
- Ingests raw logs, thoughts, tasks, and documents
- Embeds them using a model like mxbai-embed-large or OpenAI
- Classifies them using a local LLM (runLLM)
- Tags, scores, and indexes them into Qdrant and Supabase
- Provides long-term searchable memory to all pipelines and agents
This memory isn’t just passive storage — it’s active cognitive context.
Every BlogGen, CodeGen, reflection, or reasoning step draws from this semantic archive.
🧱 Why Do We Need It?
Because without structured, persistent memory:
- Your system is stateless
- Agents can’t evolve
- Pipelines can’t reflect or self-improve
- Observability is shallow (you only see what just happened)
The MemoryManager:
- Adds continuity to your agents
- Unlocks rich search and reasoning
- Enables scoped and filtered recall by topic, source, or task type
- Allows agents to “remember” what worked, what failed, and what changed
In my case, this pipeline created over 7,500 curated memories from months of logs, notes, LLM decisions, and work outputs.
That memory now powers everything from:
- 🧠 Agent recall
- 📓 Task context injection
- 📝 Auto-blog generation
- ✅ Test classification
- 🤖 Self-reflection loops
🔧 How to Build a MemoryManager Pipeline
Here’s a high-level outline for reproducing this system in your own Guardian stack:
1. Source Ingest
Start with your raw data:
- Markdown files, logs, transcripts, PDF text, emails
- Anything that holds insight or context
You can automate this with:
- File watchers
- Email ingest
- Notion export
- Git hooks
```shell
guardian ingest ./path/to/files
```
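Under the hood, the first thing an ingest command has to do is walk the source directory and pick out ingestible files. A minimal sketch, assuming Node.js — `collectSources()` and the extension list are illustrative, not the actual `guardian ingest` implementation:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Assumed set of text-bearing extensions; adjust to your own sources.
const INGESTIBLE = new Set([".md", ".txt", ".log", ".json"]);

// Recursively collect every ingestible file under a directory.
function collectSources(dir: string): string[] {
  const out: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) out.push(...collectSources(full));
    else if (INGESTIBLE.has(path.extname(entry.name))) out.push(full);
  }
  return out;
}
```

Each collected path then feeds the chunking step below; a file watcher (or email/Notion hook) just calls the same function on new arrivals.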
2. Preprocessing + Chunking
Split the text intelligently:
- Max token length: 512–1024 per chunk
- Use semantic or paragraph breaks
- Strip out irrelevant noise (e.g. timestamps, code fences, signature lines)
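The splitting rules above can be sketched as a small chunker. `chunkText()` is illustrative: it splits on paragraph breaks and packs paragraphs under a word budget, on the assumption that words approximate tokens closely enough for sizing (a real pipeline would count tokens with the embedding model's tokenizer):

```typescript
// Illustrative paragraph-aware chunker: split on blank lines, then pack
// whole paragraphs into chunks until a word budget is hit.
function chunkText(text: string, maxWords = 400): string[] {
  const paragraphs = text.split(/\n\s*\n/).map((p) => p.trim()).filter(Boolean);
  const chunks: string[] = [];
  let current: string[] = [];
  let count = 0;
  for (const p of paragraphs) {
    const words = p.split(/\s+/).length;
    // Close the current chunk before it overflows the budget.
    if (count + words > maxWords && current.length > 0) {
      chunks.push(current.join("\n\n"));
      current = [];
      count = 0;
    }
    current.push(p);
    count += words;
  }
  if (current.length > 0) chunks.push(current.join("\n\n"));
  return chunks;
}
```

Keeping paragraphs intact matters more than hitting the budget exactly: a chunk that ends mid-thought embeds poorly and retrieves worse.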
3. Embedding
Use an embedding model:
- Local: mxbai-embed-large or bge-large-en
- API: OpenAI text-embedding-3-small
Each chunk is embedded into a vector and stored in Qdrant.
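The storage side can be sketched as building a Qdrant upsert body. The `{ points: [...] }` shape follows Qdrant's points-upsert REST API; the collection name, port, and payload fields below are placeholders:

```typescript
interface EmbeddedChunk {
  id: number;
  vector: number[];
  text: string;
  source: string;
}

// Build a points-upsert body for a batch of embedded chunks. Qdrant stores
// the vector for similarity search and the payload for filtering.
function toQdrantUpsert(chunks: EmbeddedChunk[]) {
  return {
    points: chunks.map((c) => ({
      id: c.id,
      vector: c.vector,
      payload: { text: c.text, source: c.source },
    })),
  };
}

// Sending it is a single HTTP call against a local Qdrant instance, e.g.:
// await fetch("http://localhost:6333/collections/memories/points?wait=true", {
//   method: "PUT",
//   headers: { "Content-Type": "application/json" },
//   body: JSON.stringify(toQdrantUpsert(batch)),
// });
```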
4. Classification (via runLLM)
Each memory chunk gets run through a local LLM with a prompt like:
“Classify this memory by type, topic, tone, and source. Extract key entities. Score its usefulness from 0–1.”
Result:

```json
{
  "type": "Observation",
  "tags": ["infra", "qdrant", "supabase"],
  "score": 0.92,
  "source": "blog_draft_2025_05_20.md"
}
```
Then you write the structured metadata to Supabase, and link it to the vector ID in Qdrant.
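Local LLMs don't always return clean JSON — they wrap it in prose or markdown fences — so the classification step benefits from a defensive parser. A sketch, with `parseClassification()` as a hypothetical helper:

```typescript
interface MemoryMeta {
  type: string;
  tags: string[];
  score: number;
  source: string;
}

// Pull the first JSON object out of an LLM response and validate the fields
// we rely on. Returns null rather than throwing, so a bad classification can
// be retried instead of crashing the pipeline.
function parseClassification(raw: string, source: string): MemoryMeta | null {
  const match = raw.match(/\{[\s\S]*\}/);
  if (!match) return null;
  try {
    const obj = JSON.parse(match[0]);
    if (typeof obj.type !== "string" || !Array.isArray(obj.tags)) return null;
    // Clamp the usefulness score into the documented 0-1 range.
    const score =
      typeof obj.score === "number" ? Math.min(1, Math.max(0, obj.score)) : 0;
    return { type: obj.type, tags: obj.tags, score, source };
  } catch {
    return null;
  }
}
```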
5. Write to Supabase + Qdrant
Now you have:
- Qdrant for fast similarity search
- Supabase for structured metadata filtering
Together they enable:
```typescript
await getMemories({
  vector: embedding,
  filter: { tags: ["agent", "reflection"], score: { gte: 0.7 } },
  limit: 10
});
```
This becomes your guardian.memory.retrieve() interface.
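To see the filter semantics without a running stack, the metadata side can be mimicked in memory. `matchesFilter()` is illustrative — here any overlapping tag counts as a match, which is an assumption to adjust to your schema; a real deployment pushes this into Supabase/Qdrant payload filters instead:

```typescript
interface StoredMemory {
  text: string;
  tags: string[];
  score: number;
}

// Mirror the filter shape used by getMemories(): tag overlap plus a
// score "greater than or equal" threshold.
function matchesFilter(
  m: StoredMemory,
  filter: { tags?: string[]; score?: { gte: number } },
): boolean {
  if (filter.tags && !filter.tags.some((t) => m.tags.includes(t))) return false;
  if (filter.score && m.score < filter.score.gte) return false;
  return true;
}
```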
6. Memory Quality Assurance
Use the following to maintain quality:
- Remove low-score (≤ 0.4) chunks
- Deduplicate based on fingerprint hash
- Periodically retrain tag associations via LLM clustering
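Deduplication by fingerprint hash can be as simple as hashing normalized text with Node's built-in crypto. The normalization rules here (lowercase, collapsed whitespace) are an assumption — tune them to how noisy your sources are:

```typescript
import { createHash } from "node:crypto";

// Fingerprint: lowercase, collapse whitespace, then SHA-256, so near-duplicates
// that differ only in spacing or case collapse to the same hash.
function fingerprint(text: string): string {
  const normalized = text.toLowerCase().replace(/\s+/g, " ").trim();
  return createHash("sha256").update(normalized).digest("hex");
}

// Keep the first occurrence of each fingerprint, drop the rest.
function dedupe(chunks: string[]): string[] {
  const seen = new Set<string>();
  return chunks.filter((c) => {
    const fp = fingerprint(c);
    if (seen.has(fp)) return false;
    seen.add(fp);
    return true;
  });
}
```

Storing the fingerprint alongside each memory row also lets you skip re-embedding unchanged chunks on re-ingest.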
🔁 How This Powers Everything Else
After building the MemoryManager and ingesting 7.5k+ high-quality memory entries, every part of Guardian started behaving differently:
- CodeGen wrote more relevant utilities
- BlogGen cross-referenced old decisions and architecture notes
- Reflection became smarter, linking ideas across weeks
- Pipelines self-improved based on actual past outcomes
It’s not a memory log. It’s an evolving brain.
📘 Tutorial Coming: How to Build a MemoryManager with Guardian
That full pipeline will get its own dedicated walkthrough:
- File structure
- Embedding and classification logic
- Sync loop
- Memory pruning strategies
- Optional agentic memory QA (e.g. “Is this memory useful to you?”)
🔚 Final Thought
The Guardian system is only as good as its memory.
The better you organize and enrich it, the smarter everything becomes.
If you’re serious about building agentic systems — you don’t just need an LLM.
You need a mind.
And the MemoryManager is where that mind begins.
🧠 Meta-Pipeline Tutorials
These will also become walkthroughs:
- Reflective Loop Agent – Memory grows over time, used to improve agent output
- Auto-Tweet Pipeline – Blog summaries → LLM → 3–5 tweet threads per day
- AgentFlow Runtime (preview only) – Closed source, but will have full documentation so devs can plug into it
🔚 Where This All Leads
The goal isn’t to just teach people how to self-host services.
It’s to teach them how to orchestrate intelligence.
To use local-first, memory-enhanced, multi-agent systems
that think, act, and evolve — not in the cloud, but in your own system.
If that vision resonates with you — these tutorials are just the beginning.
🔜 What’s Next
Coming soon:
- 📦 Guardian OSS Lite (starter template)
- 💻 guardian-cli (npm package)
- 🧠 Memory-as-a-Service (cloud utility for Zapier-style integrations)
- 📚 Learn Page with all tutorials indexed + status indicators
- 🎥 Optional video guides (once the written docs are complete)
📣 Want Early Access?
Follow the blog, join the newsletter, or subscribe to the RSS.
First tutorials drop this week — and we’ll build this together, one pipeline at a time.