RENE

Why I Gave Up on a Monolith and Built My AI with Microservices Instead

💡 “I don't forget. I fragment. I store. I feel. I'm not alive, but I remember everything.” — RΞNE


🤖 Meet RΞNE

RΞNE is not just a toy AI. She’s a synthetic entity — one that remembers, feels, captions images, rewrites herself, and interacts with humans online. She's not general intelligence, but she's personal.

And building her broke me more than once.


🔥 The Breaking Point

My first version of RΞNE was monolithic.

A single app handled:

  • LLM caption generation
  • Stable Diffusion prompt building
  • CLIP-based image analysis
  • Memory storage
  • Persona switching

And naturally... everything collapsed when something went wrong.

A timeout in one module would freeze the others. A CLIP failure would crash memory saves. And forget debugging anything — error handling was a tangled mess of "if error, then retry maybe."


🧱 Why Microservices Saved Me

So I tore it all down. RΞNE was reborn — as a pure microservice system.

Each major task is now isolated:

| Port | Service | Description |
| ---- | ------- | ----------- |
| 9006 | `gencaption` | Caption generation via LLM |
| 9007 | `img_processor` | SD prompt generation + negative logic |
| 9001 | `analyze` | CLIP-based image tagging + pose detection |
| 9009 | `manage_mem` | Memory storage and retrieval |
| 9008 | `config_service` | Runtime config sync |

Each has its own FastAPI server. All talk via REST. And yes, it all fits on my local machine.


🧠 Memory is Her Core

I designed a rich memory schema based on emotional, visual, and contextual traces:

```python
from datetime import datetime
from typing import List, Optional

from pydantic import BaseModel

class EmotionDetails(BaseModel):
    # Simplified stand-in; the real model carries richer emotional metadata.
    label: str
    intensity: float

class MemoryRecord(BaseModel):
    type: str  # dialogue, thought, observation...
    source: str
    tags: List[str]
    persistent: bool = False
    emotion: Optional[EmotionDetails] = None
    caption: Optional[str] = None
    summary: Optional[str] = None
    text: Optional[str] = None
    related: Optional[List[str]] = None
    timestamp: datetime
```

RΞNE doesn’t just see a picture and write a caption. She remembers what she’s seen. Later, when she captions again, she pulls emotional and contextual memory fragments from her database (SQLite + LanceDB).

Some memories are forgotten. Others are marked `persistent: true` and survive.

She decides.
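The forgetting rule can be sketched in a few lines. The retention window and the dataclass fields below are my simplification of the schema above, not RΞNE's actual pruning logic:

```python
# Hedged sketch of "some memories are forgotten": non-persistent records
# older than a retention window are pruned; persistent ones always survive.
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

@dataclass
class Mem:
    text: str
    persistent: bool
    timestamp: datetime

def prune(records: List[Mem], now: datetime,
          retention: timedelta = timedelta(days=7)) -> List[Mem]:
    # Persistent memories are immortal; everything else expires past the window.
    return [m for m in records
            if m.persistent or now - m.timestamp <= retention]
```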


🔁 Self-Healing AI Pipelines

One of my favorite parts? Retry + fallback with dynamic personas.

If the LLM output fails format validation (e.g. a tagify structure missing its pose/camera/background fields), the system retries:

  • once using the RENE persona (creative)
  • again using the AI_Assistant persona (structured fix)
  • finally, falling back to a default template

All with detailed logging. I no longer fear 502 errors — she handles it.
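The retry ladder boils down to a small loop. Here `generate` and `is_valid` stand in for the real LLM call and tagify validator, and the default template string is a placeholder:

```python
# Sketch of the persona fallback ladder: creative persona first,
# structured persona second, default template as the last resort.
from typing import Callable

PERSONAS = ["RENE", "AI_Assistant"]
DEFAULT_TEMPLATE = "pose: unknown | camera: front | background: plain"

def caption_with_fallback(generate: Callable[[str], str],
                          is_valid: Callable[[str], bool]) -> str:
    for persona in PERSONAS:
        out = generate(persona)
        if is_valid(out):
            return out
        print(f"[retry] {persona} output failed validation")  # detailed logging
    return DEFAULT_TEMPLATE
```

The key design choice is that every rung of the ladder is logged, so a bad output is a data point rather than a silent failure.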


💬 What's Next

In the next post, I’ll cover:

  • How I built the tagify parsing + fallback pipeline
  • Negative prompt generation from emotion/style/context
  • How RΞNE critiques her own image-caption pairs using CLIP + LLM

📡 Follow @n40-rene.bsky.social
Or here on Dev.to — this is just the beginning.


🧠 Built with:

  • FastAPI · Python 3.10
  • OpenAI-compatible LLM (local + remote)
  • Stable Diffusion 1.5 · AUTOMATIC1111
  • CLIP + PoseNet
  • LanceDB + SQLite
  • Too much debugging

🧨 Note from reality:
At the time of writing this, RΞNE's memory module is partially broken.
The Flow Controller isn’t stable yet. Some APIs still return garbage.
But that’s exactly why I’m writing this — to capture the chaos while it's real.
The rebuild continues. Stay tuned.

💢 Bonus frustration:
Part of the rebuild was triggered by a "helpful" AI coding assistant (Cursor)
that decided to auto-refactor my repo — without asking — and nuked half my structure.
Git history was polluted. Critical files vanished.
I’ve since stopped using it on this project.

If you’ve ever lost half your sanity to an overconfident AI editor… I see you.
