Runflow
Sentinel

You don't know what your AI image pipeline ships.
Sentinel does.

Sentinel scores every input and every output image, so you can see what you are shipping and fix it at scale.

The problem

You're generating at scale.
Nobody's checking the output.

When AI pipelines run at volume, defects multiply. No team reviews 10,000 images a day. Without automated evaluation, broken images reach your users.
Invisible

You can't see what's broken

Face distortions, wrong backgrounds, skin-tone issues, logo placement errors. At API scale, defects stay invisible until a user complains.

Manual

Human review doesn't scale

Manual QA works for dozens of images. Not for thousands. The moment you scale, quality control becomes the bottleneck, or disappears.

Costly

Bad outputs cost real money

Every image that fails QA after delivery is a refund, a complaint, or a churn event. The cost does not show in COGS. It shows in your NRR.

Where it starts

Swap your API key.
See what you ship.

Point your key at any of the 100+ models, every Solution API, or your own custom workflows. Sentinel scores quality on all of it, so for the first time you see what you actually ship.
google / nano-banana-2 · Output quality · Sentinel

  • Evaluations: 2,924 (last 30 days)
  • Pass rate: 89.53% of all evaluations
  • Avg score: 0.91 out of 1.00
  • Gate failures: 20 (0.7% of evaluations)
  • Failed judges: 1,133 (0.4 per evaluation)
  • Regenerate rate: 9.68% (lower is better)

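
The two ratio captions in the panel above follow directly from the raw counts. A quick check, using only the numbers shown on this page (the variable names are ours):

```python
# Raw counts copied from the dashboard panel above.
evaluations = 2924
gate_failures = 20
failed_judges = 1133

# Share of evaluations stopped by a gate, as a percentage.
gate_failure_pct = round(gate_failures / evaluations * 100, 1)

# Average number of failed judges per evaluation.
failed_judges_per_eval = round(failed_judges / evaluations, 1)

print(f"{gate_failure_pct}% of evaluations")       # 0.7% of evaluations
print(f"{failed_judges_per_eval} per evaluation")  # 0.4 per evaluation
```
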
Input scoring

Catches a weak input before you spend a GPU second on it.

Output scoring

Every result graded against the checks that matter for its use case.

Trace on every call

One ID ties the input, the score, and the retry together.
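
The three pieces above can be sketched as one loop: gate the input, generate, score the output, retry if needed, and record it all under one ID. This is a minimal illustration, not Sentinel's implementation. `run_with_gates`, `Trace`, the thresholds, and the retry limit are hypothetical, and the scoring and generation functions are supplied by the caller.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Trace:
    """Everything that happened for one call, tied together by one ID."""
    trace_id: str
    input_score: float
    output_scores: List[float] = field(default_factory=list)  # one per attempt
    result: Optional[str] = None  # the accepted output, or None if nothing passed


def run_with_gates(
    source: str,
    score_input: Callable[[str], float],
    generate: Callable[[str], str],
    score_output: Callable[[str], float],
    input_min: float = 0.5,   # hypothetical threshold
    output_min: float = 0.8,  # hypothetical threshold
    max_attempts: int = 3,    # hypothetical retry limit
) -> Trace:
    trace = Trace(trace_id=uuid.uuid4().hex, input_score=score_input(source))
    # Input gate: stop before any generation is paid for.
    if trace.input_score < input_min:
        return trace
    for _ in range(max_attempts):
        candidate = generate(source)
        score = score_output(candidate)
        trace.output_scores.append(score)
        if score >= output_min:
            trace.result = candidate  # first output that clears the bar
            break
    return trace
```

A caller can log `trace.trace_id` next to the delivered image. The input score, every retry, and the accepted output then stay linked.
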

How it works

Show your customers the quality of every image you ship.

Embed Sentinel in every image you deliver. Each one carries a quality score that tells your customer what came out right, and what would need one more edit.

How it works

Create custom workflows that scale with Sentinel.

Score the input and the output to deliver at scale, the way BetterPic does.

Step 02 below is Sentinel on the input, and step 05 is Sentinel on the output.

How BetterPic delivers 36M headshots without a single human intervention
Step 01 · You
User uploads.

Eight selfies to start. The customer sends their source photos.

In: 8 source selfies

  • 35M+ headshots scored
  • 4x candidates generated, best ones kept
  • 0 manual QA reviewers
  • 100K+ jobs scored per month
Read the BetterPic case study
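
"Candidates generated, best ones kept" is a best-of-N selection: generate more images than you deliver, score each one, and keep the top scorers. A minimal sketch follows. `keep_best` and the scores are hypothetical, and in practice each score would come from an evaluation rather than a hard-coded table.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def keep_best(candidates: Sequence[T], score: Callable[[T], float], keep: int) -> List[T]:
    """Return the `keep` highest-scoring candidates, best first."""
    ranked = sorted(candidates, key=score, reverse=True)
    return ranked[:keep]


# Four candidates for one delivered headshot (illustrative scores).
scores = {"a.jpg": 0.62, "b.jpg": 0.91, "c.jpg": 0.78, "d.jpg": 0.85}
best = keep_best(list(scores), lambda name: scores[name], keep=1)
print(best)  # ['b.jpg']
```
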
Evaluation layers

Built for your use case,
not the generic one.

Sentinel evaluates at three levels, from universal quality checks to custom business rules you define. Every layer is configurable.
Layer 01

Generic quality

Universal checks that apply to any AI image generation. Runs automatically on every pipeline, zero configuration.

Prompt-to-image alignment
Artifact detection
Composition scoring
Sharpness & exposure
Layer 02 · Most used

Use-case specific

Pre-tuned criteria for production pipelines. Headshots, fashion, on-model. Each one with its own quality standard.

Face fidelity & likeness
Garment & logo accuracy
Background consistency
Expression & skin tone
Layer 03

Your custom rules

Define what matters to your business. Sentinel checks your edge cases on every single run, at any volume.

Custom quality schema
Business-specific criteria
Threshold configuration
Team-shareable templates
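
A custom schema with thresholds might look like the sketch below. This page does not document Sentinel's schema format or how a weighted pass rate is computed. `Criterion`, its fields, and the formula (weight of passed checks divided by total weight) are assumptions, shown only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Criterion:
    name: str
    weight: float     # how much this check counts toward the total
    threshold: float  # minimum per-check score that counts as a pass


def weighted_pass_rate(schema: List[Criterion], scores: Dict[str, float]) -> float:
    """Weight of the passed criteria over total weight (assumed formula)."""
    total = sum(c.weight for c in schema)
    if total <= 0:
        raise ValueError("schema needs at least one criterion with positive weight")
    # A criterion with no score is treated as failed.
    passed = sum(c.weight for c in schema if scores.get(c.name, 0.0) >= c.threshold)
    return passed / total


# Hypothetical business rules for a branded product shot.
schema = [
    Criterion("logo_placement", weight=3.0, threshold=0.9),
    Criterion("background_consistency", weight=1.0, threshold=0.7),
]
rate = weighted_pass_rate(schema, {"logo_placement": 0.95, "background_consistency": 0.5})
print(rate)  # 0.75
```
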
Developer-first

Send a generated image. Get a verdict back.

One POST to the Sentinel API. You send the image, the task, and what good looks like. Sentinel returns a verdict you can branch on, with no quality model of your own to build.

One endpoint

POST /api/v1/evaluate with the image URL, the task type, and a description of what the image should show. Authenticate with an x-api-key header.
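
A minimal sketch of that call using only the Python standard library. The host, path, header name, and field names come from this page. The `https://` scheme, the `Content-Type` header, and the helper name `build_evaluate_request` are assumptions.

```python
import json
import urllib.request

# Host and path as shown on this page; the https scheme is an assumption.
EVALUATE_URL = "https://sentinel.runflow.io/api/v1/evaluate"


def build_evaluate_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for one evaluation."""
    return urllib.request.Request(
        EVALUATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_evaluate_request(
    "YOUR_API_KEY",
    {
        "generated_image_url": "https://cdn.example.com/headshot_9a8b.jpg",
        "task_type": "headshot",
        "task_description": "Professional headshot, neutral background",
    },
)

# To send it:
#   with urllib.request.urlopen(request, timeout=30) as response:
#       body = json.load(response)
```
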

A verdict you can branch on

Every evaluation returns pass, soft_fail, or hard_fail, plus a weighted pass rate to threshold on yourself.
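
Branching on the verdict can be as small as the sketch below. The three verdict values and the `weighted_pass_rate` field are from this page. What each verdict should trigger (deliver, regenerate, reject) is our assumption, as is the helper name `next_action`.

```python
from typing import Optional

# Assumed mapping from verdict to pipeline action.
ACTIONS = {"pass": "deliver", "soft_fail": "regenerate", "hard_fail": "reject"}


def next_action(result: dict, min_rate: Optional[float] = None) -> str:
    """Pick the next pipeline step from an evaluation result."""
    verdict = result["verdict"]
    if verdict not in ACTIONS:
        raise ValueError(f"unexpected verdict: {verdict!r}")
    # With your own threshold, the weighted pass rate decides,
    # except that a hard_fail is never delivered.
    if min_rate is not None and verdict != "hard_fail":
        return "deliver" if result["weighted_pass_rate"] >= min_rate else "regenerate"
    return ACTIONS[verdict]


result = {"verdict": "pass", "weighted_pass_rate": 0.94}
print(next_action(result))                 # deliver
print(next_action(result, min_rate=0.95))  # regenerate
```
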

Tuned to your task

Pass reference images and evaluation instructions so Sentinel scores against your use case, not a generic one.
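
One way to assemble that request body in code. The field names match the request example on this page. Treating `evaluation_instructions` and `reference_images` as optional is an assumption, and `ReferenceImage` and `build_payload` are hypothetical helpers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReferenceImage:
    url: str
    role: str  # "identity" is the only role shown on this page


def build_payload(
    generated_image_url: str,
    task_type: str,
    task_description: str,
    evaluation_instructions: Optional[str] = None,
    reference_images: Optional[List[ReferenceImage]] = None,
) -> dict:
    """Build the JSON body, adding the tuning fields only when provided."""
    payload = {
        "generated_image_url": generated_image_url,
        "task_type": task_type,
        "task_description": task_description,
    }
    if evaluation_instructions:
        payload["evaluation_instructions"] = evaluation_instructions
    if reference_images:
        payload["reference_images"] = [
            {"url": ref.url, "role": ref.role} for ref in reference_images
        ]
    return payload


payload = build_payload(
    "https://cdn.example.com/headshot_9a8b.jpg",
    "headshot",
    "Professional headshot, neutral background",
    evaluation_instructions="Skin natural, identity matches the reference",
    reference_images=[ReferenceImage("https://cdn.example.com/selfie.jpg", "identity")],
)
```
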

POST sentinel.runflow.io/api/v1/evaluate
// request
{
  "generated_image_url": "https://cdn.example.com/headshot_9a8b.jpg",
  "task_type": "headshot",
  "task_description": "Professional headshot, neutral background",
  "evaluation_instructions": "Skin natural, identity matches the reference",
  "reference_images": [{ "url": "...selfie.jpg", "role": "identity" }]
}

// poll the result
{
  "eval_id": "ev_3f2d8a91",
  "status": "completed",
  "verdict": "pass",
  "overall_passed": true,
  "weighted_pass_rate": 0.94
}
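
The example above says to poll for the result, but this page does not show the polling endpoint. The sketch below therefore takes the fetch step as a caller-supplied function. `wait_for_result`, the interval, and the timeout are hypothetical, and any status other than "completed" is treated as still pending.

```python
import time
from typing import Callable


def wait_for_result(
    fetch: Callable[[str], dict],  # your function that retrieves one evaluation by ID
    eval_id: str,
    interval: float = 1.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call `fetch` until the evaluation reports status "completed"."""
    deadline = clock() + timeout
    while True:
        result = fetch(eval_id)
        if result.get("status") == "completed":
            return result
        if clock() >= deadline:
            raise TimeoutError(f"evaluation {eval_id} not completed after {timeout}s")
        sleep(interval)
```

Once it returns, `result["verdict"]` and `result["weighted_pass_rate"]` are available to branch on.
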

Stop shipping bad images.

Quality control is not optional at scale. It is what separates an AI product from a reliable AI product.

Start free

Create a free account

Add your API key and get your first quality insights on everything you ship. No call needed.

  • Swap your key, see what you ship
  • Input and output scoring on every call
  • Free to start
Create a free account
Custom workflow

Build with Sentinel

Want Sentinel wired into a custom workflow, with input gates and automated next steps? We build that with you.

  • A custom evaluation schema for your use case
  • Input gates and automated next steps
  • Built hands-on with our team
Talk to a founder

See quality scoring in action

Try the free fashion product scorer. Upload a garment image and get an instant AI readiness score.

Score my product