AI Moderation and Sensitive Data Protection
If it’s forever, it better be safe.
In Part 1, I introduced Immutable Notes, a decentralized archive where people can mint messages or personal notes as on-chain SVG NFTs — no IPFS, no servers, no silent failures. The entire visual output is generated in SVG and stored fully on-chain.
But blockchain permanence comes with responsibility.
If you let people write anything to the chain, you risk storing:
- NSFW or hateful content
- Personally identifiable information (PII) like emails, phone numbers, and passwords
- Sensitive data they might later regret putting on a permanent ledger
So in Part 2, I focused on making minting safe by design, using a combination of AI moderation and custom validation logic.
The Moderation Stack
I added a two-layer protection system:
1. OpenAI Moderation API
Before minting, the user’s message is sent to the OpenAI Moderation endpoint, which checks the text for violations like:
- Hate or harassment
- Sexual or violent content
- Self-harm
- Criminal activity
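Each result in the response carries a simple flagged: true/false field alongside per-category details, so the check is fast and easy to act on. OpenAI's documentation lists the Moderation endpoint as free to use, so it adds no per-message cost.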
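As a rough illustration of the call described above, here is a minimal sketch of a server-side helper. The endpoint URL and the results[0].flagged and categories fields follow OpenAI's public API; the names moderateText, flaggedCategories, and FetchLike are hypothetical, and the project's actual /api/moderate handler may look quite different.

```typescript
// Minimal shape of the Moderation response; only the fields used here.
interface ModerationResponse {
  results: { flagged: boolean; categories: Record<string, boolean> }[];
}

// Narrow fetch-like type (an assumption of this sketch) so it can be stubbed in tests.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Pure helper: names of the categories that triggered the flag, or [] if clean.
function flaggedCategories(response: ModerationResponse): string[] {
  const result = response.results[0];
  if (!result || !result.flagged) return [];
  return Object.keys(result.categories).filter((name) => result.categories[name]);
}

// Hypothetical server-side helper; in a route handler, pass the global fetch.
async function moderateText(
  text: string,
  apiKey: string,
  fetchImpl: FetchLike
): Promise<string[]> {
  const res = await fetchImpl("https://api.openai.com/v1/moderations", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ input: text }),
  });
  if (!res.ok) {
    // Fail closed: if moderation is unreachable, the caller should block minting.
    throw new Error(`Moderation request failed with status ${res.status}`);
  }
  return flaggedCategories((await res.json()) as ModerationResponse);
}
```

Keeping the API key on the server and throwing on a failed request means a moderation outage blocks minting rather than silently letting content through, which seems the safer default for a permanent ledger.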
2. PII Detection with Regex
OpenAI’s moderation doesn’t catch everything, so I wrote a set of custom filters to block:
- Emails: name@example.com
- Phone numbers: (555) 123-4567
- Card numbers: 4242 4242 4242 4242
- Passwords: password=secret123
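function containsSensitiveInfo(text: string): string | null {
  // Returns a label for the first kind of sensitive data found, or null if none.
  const emailRegex = /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b/i;
  const phoneRegex = /\b(\+?\d{1,3}[-.\s]?)?\(?\d{2,4}\)?[-.\s]?\d{3,5}[-.\s]?\d{4}\b/;
  const cardRegex = /\b(?:\d[ -]*?){13,16}\b/;
  // Leading \b keeps words like "compass: north" from matching on "pass".
  const passwordRegex = /\b(?:password|passwd|pass)\s*[:=]\s*\S+/i;

  if (emailRegex.test(text)) return "email address";
  // Check cards before phones: the phone pattern also matches part of a
  // spaced card number such as 4242 4242 4242 4242.
  if (cardRegex.test(text)) return "credit card number";
  if (phoneRegex.test(text)) return "phone number";
  if (passwordRegex.test(text)) return "password";
  return null;
}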
If any of those patterns are detected, the user gets an alert and can’t proceed to minting.
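A digit-count pattern alone will also flag long order numbers or IDs as card numbers. One possible refinement, not part of the original filters, is to confirm a candidate with the Luhn checksum that payment card numbers satisfy. The helper name passesLuhn is hypothetical, and the 13 to 16 digit range simply mirrors the card pattern above.

```typescript
// Luhn checksum: true when the digits form a plausible payment card number.
function passesLuhn(candidate: string): boolean {
  // Strip the separators the card pattern allows (spaces and hyphens).
  const digits = candidate.replace(/[ -]/g, "");
  if (!/^\d{13,16}$/.test(digits)) return false;

  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    // Walk from the rightmost digit; double every second one.
    let value = digits.charCodeAt(digits.length - 1 - i) - 48;
    if (i % 2 === 1) {
      value *= 2;
      if (value > 9) value -= 9;
    }
    sum += value;
  }
  return sum % 10 === 0;
}
```

For example, passesLuhn("4242 4242 4242 4242") is true, while changing the last digit to 3 makes it false. Whether to block on the regex alone or only on Luhn-valid matches is a trade-off between false positives and missed numbers.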
Sample Workflow
The user types their message into the NftSvgEditor.
On “Submit”, the app:
- Runs PII checks
- Sends the message to /api/moderate
- Receives moderation response from OpenAI
If the message is flagged or contains PII, the app shows a helpful alert and blocks minting.
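The workflow above can be sketched as a small pure decision function plus an orchestrator. All names here (MintDecision, decideMint, validateBeforeMint) are hypothetical, and skipping the network call when the local PII check fails is an assumption of this sketch rather than confirmed behavior of the app.

```typescript
type MintDecision =
  | { allowed: true }
  | { allowed: false; reason: string };

// Pure decision step: combines the two checks into one outcome.
function decideMint(piiKind: string | null, flagged: boolean): MintDecision {
  if (piiKind !== null) {
    return { allowed: false, reason: `Possible ${piiKind} detected in your note.` };
  }
  if (flagged) {
    return { allowed: false, reason: "Your note was flagged by content moderation." };
  }
  return { allowed: true };
}

// Orchestration: run the local PII check first and skip the moderation
// request when it already fails, so sensitive text never leaves the browser.
async function validateBeforeMint(
  text: string,
  checkPii: (text: string) => string | null,
  isFlagged: (text: string) => Promise<boolean>
): Promise<MintDecision> {
  const piiKind = checkPii(text);
  if (piiKind !== null) return decideMint(piiKind, false);
  return decideMint(null, await isFlagged(text));
}
```

In the app, checkPii would be containsSensitiveInfo and isFlagged would wrap the request to /api/moderate. The reason string is what the alert would show to the user.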
Built With
- Next.js 15.3.5 App Router (Edge-ready)
- React 19 with server/client components
- OpenAI Moderation API
- Regex-based validation for common PII patterns
- Custom Alert component for UX feedback
Why It Matters
I believe blockchain permanence requires thoughtful constraints. Immutable Notes isn't just a place to write — it’s a place to preserve something that should be preserved.
Letting harmful, illegal, or personal data into the system would contradict that mission, so these safeguards are essential.
Coming in Part 3…
Next up, I’ll show how I’m:
- Storing the SHA-256 hash of the message for on-chain verification
- Using that to ensure messages aren’t altered after moderation
- Making the smart contract enforce this safety — not just the frontend
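Part 3 will cover the actual design. As a preview of the hashing step alone, here is a minimal sketch that assumes a Node.js runtime and its built-in crypto module (on the Edge runtime, the Web Crypto API would be the likely substitute). The function names hashMessage and matchesModeratedHash are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 digest of the exact UTF-8 bytes of the message.
function hashMessage(message: string): string {
  return createHash("sha256").update(message, "utf8").digest("hex");
}

// True when the message still matches the hash recorded at moderation time.
function matchesModeratedHash(message: string, moderatedHash: string): boolean {
  return hashMessage(message) === moderatedHash.toLowerCase();
}
```

Because any change to the text, even a single character, produces a different digest, comparing hashes is enough to detect edits made after the moderation check.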
Want to Collaborate?
GitHub: github.com/lucas-costa/immutable-notes
Feedback and pull requests are welcome.
If you’ve built something similar or have ideas on improving this moderation pipeline, I’d love to hear from you.
Let’s build something worth remembering — and safe enough to last forever.