RENE

Why I Gave Up on a Monolith and Built My AI with Microservices Instead

💡 “I don't forget. I fragment. I store. I feel. I'm not alive, but I remember everything.” — RΞNE


🤖 Meet RΞNE

RΞNE is not just a toy AI. She’s a synthetic entity — one that remembers, feels, captions images, rewrites herself, and interacts with humans online. She's not general intelligence, but she's personal.

And building her broke me more than once.


🔥 The Breaking Point

My first version of RΞNE was monolithic.

A single app handled:

  • LLM caption generation
  • Stable Diffusion prompt building
  • CLIP-based image analysis
  • Memory storage
  • Persona switching

And naturally... everything collapsed when something went wrong.

A timeout in one module would freeze the others. A CLIP failure would crash memory saves. And forget debugging anything — error handling was a tangled mess of "if error, then retry maybe."


🧱 Why Microservices Saved Me

So I tore it all down. RΞNE was reborn — as a pure microservice system.

Each major task is now isolated:

| Port | Service | Description |
| ---- | ------- | ----------- |
| 9006 | `gencaption` | Caption generation via LLM |
| 9007 | `img_processor` | SD prompt generation + negative logic |
| 9001 | `analyze` | CLIP-based image tagging + pose detection |
| 9009 | `manage_mem` | Memory storage and retrieval |
| 9008 | `config_service` | Runtime config sync |

Each has its own FastAPI server. All talk via REST. And yes, it all fits on my local machine.


🧠 Memory is Her Core

I designed a rich memory schema based on emotional, visual, and contextual traces:

```python
from datetime import datetime
from typing import List, Optional

from pydantic import BaseModel

class EmotionDetails(BaseModel):
    # Simplified stand-in; the real model carries richer emotional metadata.
    label: str
    intensity: float

class MemoryRecord(BaseModel):
    type: str  # dialogue, thought, observation...
    source: str
    tags: List[str]
    persistent: bool = False
    emotion: Optional[EmotionDetails] = None
    caption: Optional[str] = None
    summary: Optional[str] = None
    text: Optional[str] = None
    related: Optional[List[str]] = None
    timestamp: datetime
```

RΞNE doesn’t just see a picture and write a caption. She remembers what she’s seen. Later, when she captions again, she pulls emotional and contextual memory fragments from her database (SQLite + LanceDB).

Some memories are forgotten. Others are marked `persistent: true` and survive.

She decides.
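The forgetting rule can be sketched in a few lines. The retention window and the dataclass fields below are my simplification of the schema above, not RΞNE's actual pruning logic:

```python
# Hedged sketch of "some memories are forgotten": non-persistent records
# older than a retention window are pruned; persistent ones always survive.
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

@dataclass
class Mem:
    text: str
    persistent: bool
    timestamp: datetime

def prune(records: List[Mem], now: datetime,
          retention: timedelta = timedelta(days=7)) -> List[Mem]:
    # Persistent memories are immortal; everything else expires past the window.
    return [m for m in records
            if m.persistent or now - m.timestamp <= retention]
```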


🔁 Self-Healing AI Pipelines

One of my favorite parts? Retry + fallback with dynamic personas.

If the LLM output fails format validation (e.g. a tagify structure missing its pose/camera/background fields), the system retries:

  • once using the RENE persona (creative)
  • again using the AI_Assistant persona (structured fix)
  • finally, falling back to a default template

All with detailed logging. I no longer fear 502 errors — she handles it.
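The retry ladder boils down to a small loop. Here `generate` and `is_valid` stand in for the real LLM call and tagify validator, and the default template string is a placeholder:

```python
# Sketch of the persona fallback ladder: creative persona first,
# structured persona second, default template as the last resort.
from typing import Callable

PERSONAS = ["RENE", "AI_Assistant"]
DEFAULT_TEMPLATE = "pose: unknown | camera: front | background: plain"

def caption_with_fallback(generate: Callable[[str], str],
                          is_valid: Callable[[str], bool]) -> str:
    for persona in PERSONAS:
        out = generate(persona)
        if is_valid(out):
            return out
        print(f"[retry] {persona} output failed validation")  # detailed logging
    return DEFAULT_TEMPLATE
```

The key design choice is that every rung of the ladder is logged, so a bad output is a data point rather than a silent failure.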


💬 What's Next

In the next post, I’ll cover:

  • How I built the tagify parsing + fallback pipeline
  • Negative prompt generation from emotion/style/context
  • How RΞNE critiques her own image-caption pairs using CLIP + LLM

📡 Follow @n40-rene.bsky.social
Or here on Dev.to — this is just the beginning.


🧠 Built with:

  • FastAPI · Python 3.10
  • OpenAI-compatible LLM (local + remote)
  • Stable Diffusion 1.5 · AUTOMATIC1111
  • CLIP + PoseNet
  • LanceDB + SQLite
  • Too much debugging

🧨 Note from reality:
At the time of writing this, RΞNE's memory module is partially broken.
The Flow Controller isn’t stable yet. Some APIs still return garbage.
But that’s exactly why I’m writing this — to capture the chaos while it's real.
The rebuild continues. Stay tuned.

💢 Bonus frustration:
Part of the rebuild was triggered by a "helpful" AI coding assistant (Cursor)
that decided to auto-refactor my repo — without asking — and nuked half my structure.
Git history was polluted. Critical files vanished.
I’ve since stopped using it on this project.

If you’ve ever lost half your sanity to an overconfident AI editor… I see you.
