🧠 Tutorial: How Guardian Automatically Generates Blog Posts


Welcome to the Guardian blog generation tutorial. In this guide, we’ll walk through how Guardian uses LLMs, metadata, and automations to autonomously generate SEO-friendly blog posts from your real engineering activity.

This system is ideal for:

  • Engineers who want to build in public.
  • AI developers documenting agent workflows.
  • Indie hackers creating organic SEO content passively.

🧩 System Architecture Overview

Here’s what powers the blog generation:

| Layer | Tool | Purpose |
| --- | --- | --- |
| LLM | Ollama (e.g. llama3.1) | Generates content |
| DB | Supabase | Stores blog metadata |
| Embeddings | Qdrant | Links posts to memory |
| Automation | n8n | Schedules and triggers post generation |
| Code Layer | Guardian backend (Hono) | Orchestrates the flow |
| CMS | WordPress (or local blog) | Publishes the post |

🔁 Blog Generation Flow

1. Task Trigger

Every day (or via command), Guardian runs a blog-gen task. This can be:

  • Scheduled from n8n
  • Manually invoked from your dashboard
  • Triggered when a project milestone is reached
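
For a manual or cron-driven run, the trigger can be as simple as an HTTP call to the blog-gen route. Here is a minimal sketch; the route path, port, and GUARDIAN_TOKEN variable are assumptions, so adjust them to your own Guardian deployment:

// trigger-blog-gen.ts — minimal sketch of a manual or scheduled trigger.
// The route path and GUARDIAN_TOKEN are assumptions; match them to your setup.
const res = await fetch("http://localhost:3000/api/guardian.generate", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.GUARDIAN_TOKEN ?? ""}`
  },
  body: JSON.stringify({ trigger: "manual" })
});

console.log("Blog-gen task accepted:", res.status);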

2. Keyword + Context Selection

Guardian selects:

  • A trending or relevant keyword
  • Context from recent work (memory entries from Qdrant)
  • Affiliate tool mentions (if any)

💡 Example:

Keyword: “LLM-driven code documentation”
Context: Last 5 commits + n8n flow + system metadata
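
In code, this selection step can be sketched roughly as follows. The candidate keyword list is illustrative, and retrieveMemory is the same Guardian memory utility used in the full route later in this post:

// Sketch of the selection step. The candidate keywords are illustrative;
// retrieveMemory is the Guardian memory utility shown in the full route below.
const candidates = ["LLM-driven code documentation", "agent workflows", "local-first automation"];
const keyword = candidates[Math.floor(Math.random() * candidates.length)];

const context = await retrieveMemory({ query: keyword, topK: 5, enrich: true });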

3. LLM Prompt Construction

A structured prompt is created using:

Write a 500–1000 word blog post for developers.
Topic: {keyword}
Context:
{recent agent logs / task history}
Include tools: Supabase, Qdrant, Ollama
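
Assembled in code, this is plain string interpolation. A minimal sketch, assuming keyword and contextText were produced by the selection step above:

// Sketch: interpolate the selected keyword and compressed context into the template.
// `keyword` and `contextText` are assumed to come from the selection step above.
const prompt = `
Write a 500–1000 word blog post for developers.
Topic: ${keyword}
Context:
${contextText}
Include tools: Supabase, Qdrant, Ollama
`;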

4. LLM Response → Markdown Post

The response is:

  • Parsed into markdown
  • Checked for metadata (title, subtitle, summary)
  • Validated for length and coherence
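
A minimal sketch of that validation step (the title regex and thresholds here are illustrative, not Guardian's exact rules):

// Sketch of post-generation checks; the title regex and thresholds are illustrative.
function validateDraft(markdown: string) {
  const title = markdown.match(/^#\s+(.+)$/m)?.[1];
  const wordCount = markdown.trim().split(/\s+/).length;

  if (!title) throw new Error("Draft is missing a top-level title heading");
  if (wordCount < 400) throw new Error(`Draft too short: ${wordCount} words`);

  return { title, wordCount };
}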

5. Affiliate Injection (Optional)

Any matching tools from your affiliate list are converted to links:

...learn more about [Supabase](https://affiliate.supabase.io/michael)...
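
Under the hood this is a straightforward keyword-to-URL replacement (assuming markdown holds the draft content). The full implementation appears in the route below; here is the core idea:

// Sketch: link a known tool name to its affiliate URL (full version in the route below).
const linked = markdown.replace(/\bSupabase\b/g, "[Supabase](https://affiliate.supabase.io/michael)");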

6. Image Generation (Upcoming)

A diffusion model (via Ollama or external API) creates:

  • A relevant cover image
  • Branded watermark or theme

You’ll be able to configure this with:

guardian generate-image --prompt "AI writing code documentation"

7. Post Submission

  • The final content is submitted to WordPress (via REST API)
  • It enters “draft” or “pending review” state
  • You review, tweak if needed, and publish
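
The submission itself is a single WordPress REST call, shown in full in the route below. Roughly, assuming title and content come from the earlier steps:

// Sketch: create the post as a draft via the WordPress REST API (full call in the route below).
const post = await wordpressClient.post("/wp-json/wp/v2/posts", {
  title,
  content,
  status: "draft"
});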

✅ What You Can Customize

  • Add categories/tags by parsing content or setting metadata rules
  • Include custom “about the author” or CTA footers
  • Add alternate post destinations (e.g. Dev.to, Medium)
  • Tie posts to commits, tickets, or flows for traceability
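
As one small example, a CTA footer can be appended to the generated content before submission. A sketch, with hypothetical footer text:

// Sketch: append a CTA / about-the-author footer before submitting the post.
const CTA_FOOTER = "\n\n---\n\n*Built with Guardian. Follow along for more automation deep dives.*";
const finalContent = enhancedContent + CTA_FOOTER;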

🧪 Sample Output Snippet

# From Commit to Content: Automating Code Documentation with LLMs

Software engineers often neglect documentation. But what if we could generate it as we code?

In this post, I’ll walk through how I use Git hooks, commit metadata, and an LLM (via Guardian) to generate useful developer docs — without lifting a finger...

🧙🏼‍♂️ BlogGen API Route example

withHandler('guardian.generate', async (c) => {
    //* 0. establish process context
    const githubMemories = await retrieveMemory({ query: "recent GitHub activity", tags: ["github"], topK: 5, enrich: true });
    const notionMemories = await retrieveMemory({ query: "project notes", tags: ["notion"], topK: 3, enrich: true });
    const wordpressMemories = await retrieveMemory({ query: "recent blog", tags: ["wordpress"], topK: 2, enrich: true });
    const generalMemories = await retrieveMemory({ query: "recent engineering work", topK: 5, enrich: true });

    const allContexts = [
      ...githubMemories.results,
      ...notionMemories.results,
      ...wordpressMemories.results,
      ...generalMemories.results
    ];
    
    const compressedContext = allContexts
      .map(({ structuredData }) => structuredData ? scripts.llm.compressStructuredContext(structuredData) : "")
      .filter(Boolean)
      .join("\\n\\n");

    const affiliatePartners: string[] = [
      // ...your affiliate partner names (elided)
    ];
    
    const theNewPrompt = `
You are an expert technical writing assistant for an automation-focused software engineer who works daily on advanced infrastructure projects involving AI, workflows, React, fullstack tooling, and intelligent systems.

Every day, you will generate a new blog post that summarizes the author’s current active engineering work, design decisions, and architecture patterns, based on real data from the past 24–48 hours across several sources:

Available context sources may include:
- Git commit messages and diffs (via .git tracking)
- GitHub issues, PRs, and push activity
- Structured and unstructured memory logs (via Supabase + Qdrant)
- Internal project notes (via Notion API)
- Recently published or drafted blog posts (via WordPress API)

Your job is to:
1. Analyze all available sources and surface the most important topics from recent work
2. Write a 700–1200 word blog post that:
    - Focuses on *one or two* key ideas or milestones
    - Describes real systems being built or tested
    - Includes thoughtful reflection, challenges, or discoveries
3. Use a tone that’s insightful but pragmatic — personal, not overly formal
4. Mention tools, frameworks, models, or techniques used
5. If any of the following affiliate platforms or tools were used, mention them naturally in the post and briefly describe their role in the system (without sounding like an ad):

${affiliatePartners.map(name => `- ${name}`).join('\n')}

6. Include a short, one-paragraph summary at the top
7. Provide a title at the top that’s concise and compelling (include tech keywords)

Use markdown formatting. If no recent updates are available from a specific source, omit it. Use only the most relevant and high-signal content. Keep it clean, clear, and actionable.

Output:
- Title
- Summary paragraph
- Blog body (~800 words)

Begin writing the blog post based on the provided context.

Context:
${compressedContext}
`;


    //* 1. generate the first draft from the compressed memory context
    const result = await runLLM({ prompt: theNewPrompt, model: "gemini" });

    //* 2. proofread and restructure the draft into structured JSON
    const proofreadPrompt = `
    You are an expert technical writing assistant for an automation-focused software engineer who works daily on advanced infrastructure projects involving AI, workflows, React, fullstack tooling, and intelligent systems.

    Proofread and optimize the following blog post for:
    - Grammar and clarity
    - Formatting
    - SEO (title, keywords)
    - Structure (summary, sections)

    Return your response as a structured JSON object in the following format:

    {
      "title": "<optimized title>",
      "summary": "<1-paragraph TL;DR>",
      "keywords": ["keyword1", "keyword2", ...],
      "content": "<full proofread and formatted blog post>"
    }

    Here is the draft blog post:
    ${result.output}
    `;

    const proofreadResult = await runLLM({ 
      prompt: proofreadPrompt, 
      model: "gemini", 
      options: { 
        json: {
          // @ts-ignore
          responseSchema: {
            type: "object",
            properties: {
              title: { type: "string" },
              summary: { type: "string" },
              keywords: {
                type: "array",
                items: { type: "string" }
              },
              content: { type: "string" }
            },
            required: ["title", "summary", "keywords", "content"],
            additionalProperties: false
          }
        }
      }
    });

    function parseJsonFromCodeFence(output: string) {
      const cleaned = output
        .replace(/^```json/, '')
        .replace(/^```/, '')
        .replace(/```$/, '')
        .trim();
    
      try {
        return JSON.parse(cleaned);
      } catch (err) {
        console.error("❌ Failed to parse proofread JSON:", err);
        return null;
      }
    }

    const parsedProofreadResult = parseJsonFromCodeFence(proofreadResult.output);

    //* 3. generate a cover image and upload it to the WordPress media library
    const imageGenerationPrompt = `
    You are an expert technical writing assistant for an automation-focused software engineer who works daily on advanced infrastructure projects involving AI, workflows, React, fullstack tooling, and intelligent systems.

    Generate an image that represents the following blog post:
    ${parsedProofreadResult.content}
    `;

    let imageId;
    try {
      console.log("Generating image...", imageGenerationPrompt);
     
      // NOTE: the image generation call itself is elided in this snippet;
      // `image` is assumed to be a base64-encoded PNG string returned by your image API (e.g. Stability)
      const base64Image = image;
      const buffer = Buffer.from(base64Image, "base64");
  
      const imageUpload = await wordpressClient.post("/wp-json/wp/v2/media", buffer, {
        headers: {
          "Content-Disposition": `attachment; filename="thumbnail.png"`,
          "Content-Type": "image/png"
        }
      });

      console.log("Image uploaded:", imageUpload.data.id);
      imageId = imageUpload.data.id;
    } catch (error) {
      console.error("❌ Failed to generate image:", error);
    }

    //* 4. inject affiliate links for matched keywords
    let enhancedContent = parsedProofreadResult.content;
    const affiliateMap: Record<string, string> = {
      supabase: "...",
      cloudflare: "...",
      tailscale: "...",
      n8n: "...",
      // ...remaining partner URLs elided
    };
    
    // Ensure case-insensitive match but preserve original casing in output
    for (const keyword of parsedProofreadResult.keywords) {
      const key = keyword.toLowerCase();
      if (affiliateMap[key as keyof typeof affiliateMap]) {
        const regex = new RegExp(`\\b(${keyword})\\b`, 'gi');
        const replacement = `**[${keyword}](${affiliateMap[key as keyof typeof affiliateMap]})**`; // bold + link
        enhancedContent = enhancedContent.replace(regex, replacement);
      }
    }

    //* 5. map the post to the site's existing WordPress categories and tags
    const categories = [
      {
        "id": 54,
        "name": "AI"
      },
      {
        "id": 65,
        "name": "AI-Generated"
      },
      {
        "id": 55,
        "name": "Automation"
      },
      {
        "id": 61,
        "name": "Daily Blog"
      },
      {
        "id": 56,
        "name": "Engineering"
      },
      {
        "id": 60,
        "name": "Guides"
      },
      {
        "id": 57,
        "name": "Infrastructure"
      },
      {
        "id": 64,
        "name": "Manual"
      },
      {
        "id": 63,
        "name": "Random"
      },
      {
        "id": 62,
        "name": "Research"
      }
    ]

    const tags = [
      {
        "id": 31,
        "name": "Bun"
      },
      {
        "id": 37,
        "name": "CLI Tools"
      },
      {
        "id": 13,
        "name": "Cloudflare"
      },
      {
        "id": 42,
        "name": "Codegen"
      },
      {
        "id": 53,
        "name": "Cognitive Systems"
      },
      {
        "id": 36,
        "name": "Dark Mode"
      },
      {
        "id": 46,
        "name": "DevOps"
      }
    ];

    const categoryMap = new Map(categories.map(c => [c.name.toLowerCase(), c.id]));
    const tagMap = new Map(tags.map(t => [t.name.toLowerCase(), t.id]));

    const categorySuggestionPrompt = `
You are a blog categorization assistant.

Analyze the following blog post and return 1–3 appropriate high-level category labels for it. 
Use general terms like "AI", "Automation", "Infrastructure", "Tutorials", "Engineering", "Research", etc.
Do not return descriptions — just return the category names in a JSON array of strings.

Post title: ${parsedProofreadResult.title}

Post content:
${parsedProofreadResult.content}
`;

    const categorySuggestionResult = await runLLMCore({
      prompt: categorySuggestionPrompt,
      model: "gemini", // or whatever you're using
      options: {
        json: {
          responseSchema: {
            type: "array",
            items: { type: "string" }
          }
        }
      },
      retries: 1
    });

    const parsedCategorySuggestionResult = parseJsonFromCodeFence(categorySuggestionResult.output);
    // Fallback if parsing fails
    const postCategoryNames: string[] = parsedCategorySuggestionResult ?? ["Engineering"];

    const isAIGenerated = true; // or infer from task type

    if (isAIGenerated) postCategoryNames.push("AI-Generated");
    else postCategoryNames.push("Manual");

    const matchedCategoryIds = postCategoryNames
      .map((name: string) => categoryMap.get(name.toLowerCase()))
      .filter(Boolean);

    const matchedTagIds = parsedProofreadResult.keywords
      .map((keyword: string) => tagMap.get(keyword.toLowerCase()))
      .filter(Boolean);

    
    //* 6. submit the post to WordPress as a draft with matched taxonomy and featured image
    const wordpressResult = await wordpressClient.post("/wp-json/wp/v2/posts", {
      title: parsedProofreadResult.title,
      content: parsedProofreadResult.content,
      status: "draft",
      categories: matchedCategoryIds,
      tags: matchedTagIds, 
      featured_media: imageId ? imageId : null // returned by image upload route
    });

    //* 7. store the published post back into Guardian memory for future context
    const memoryId = crypto.randomUUID();
    await addMemory({
      agent: "guardian",
      id: memoryId,
      title: parsedProofreadResult.title,
      content: parsedProofreadResult.content,
      summary: parsedProofreadResult.summary,
      tags: ["blog", ...parsedProofreadResult.keywords],
      type: "generated_blog",
      source: "wordpress",
      source_id: wordpressResult.data.id.toString(),
      metadata: {
        wp_url: wordpressResult.data.link,
        wp_status: wordpressResult.data.status
      }
    } as any);

    return c.json({ 
      message: "Blog post generated successfully",
      status: "posted",
      result: parsedProofreadResult,
      wordpress: wordpressResult.data,
      memoryId,
      allContexts
    });
  })

Next Steps

Want to build your own? Here’s what to do:

  1. Clone your guardian-starter repo
  2. Add the blog-gen.ts function (or use the provided one)
  3. Set up a schedule in n8n or a manual trigger [link to guide coming soon]
  4. Plug in your LLM + memory backend
  5. Connect your CMS (WordPress, Notion, etc.)

All of this is covered in the introductory post: introducing-guardian-oss-a-minimal-ai-automation-core
The GitHub repo 👉 https://github.com/loveliiivelaugh/guardian-oss


🧠 In Conclusion: The Power of the guardian.generate Blog API

What you’ve just seen is a complete, end-to-end example of local-first, AI-assisted software automation in action.

This isn’t just “generate a blog post” — this is Guardian:

  • 🔍 Retrieving and compressing real engineering context from memory
  • ✍️ Generating long-form technical content with system awareness
  • 🧼 Proofreading and optimizing output using structured prompts
  • 🖼️ Auto-generating images and uploading media
  • 🔗 Inserting affiliate links and smart references
  • 🧠 Saving the blog post back into memory so future agents can reason about it
  • 🗃️ Auto-categorizing and tagging the content using AI + system mappings
  • 📤 Publishing directly to WordPress, ready for review or launch

All in one API route.


⚡ And here’s what makes it even cooler:

Although the example uses Gemini as the LLM, every single part of this can run locally:

  • Use Ollama for blog post generation and proofreading
  • Store and query memory in Supabase + Qdrant
  • Host your own WordPress backend, or even push directly to a static blog via Git
  • Use your existing filesystem, n8n, or local tools to trigger the route

🧙‍♂️ The result is a fully self-hostable publishing pipeline, with no required third-party services, that thinks and reflects on your behalf.


🧠 More to Come

In upcoming tutorials, I’ll break down some of the powerful utilities used in this flow:

  • runLLM({ prompt, model }) — a smart LLM wrapper with local-first model routing
  • addMemory(entry) — Guardian’s structured memory ingestion tool
  • retrieveMemory({ query, tags }) — the brainstem of how context is recalled and compressed

We’ll cover how these are used in every major part of the Guardian system — from agent flows to autonomous testing, dev planning, and even roadmap generation.


🧠 Coming Up Next…

Want to dive deeper into how Guardian uses structured memory, automation, and long-term awareness to create a self-evolving engineering system?

Here’s a sneak peek at the next blog posts in the series:


🧹 Memory Hygiene 101: Prune, Compress, and Clean for Better Outputs

Use Guardian’s internal MemoryManager to clean stale logs, remove noise, and compress useful insights — all while improving LLM-driven writing and codegen instantly.


🔄 How I Automatically Capture Work As Memories

How Guardian watches your life through Git hooks, Notion logs, GitHub events, and more — and turns all of it into high-quality, queryable memory, without interrupting your flow.


🧠 Memory 102: Qdrant vs Pinecone and the Hidden Health of Your Vector DB

Vector DB hygiene is your agent’s cognition hygiene. Learn how Guardian tracks memory decay, deduplicates thoughts, and keeps your long-term context healthy and usable.


🧬 The Birth of Causal Models: Introducing trace_id

By attaching trace_id to tasks, thoughts, and outputs, Guardian begins building causal chains between decisions, actions, and results.
This is the first step toward agents that don’t just act — they understand why.


🕰️ Temporal Reasoning: Adding Time to the Causal Graph

When Guardian links memory and events by time and cause, it begins forming an artificial world model.
This post explores how temporal vectors, timestamps, and trace_trees evolve Guardian from a reactive system into a reflective one.


📬 Subscribe to follow along — or explore the memory system directly and build your own.

