swapnil shingare

🧠 Building a Smart Regulatory Chatbot for IRDAI Using LangChain, Angular, FastAPI & OpenAI

Have you ever struggled to find the latest circular or regulation from the IRDAI (Insurance Regulatory and Development Authority of India) website?

So did I, and that’s why I decided to build a full-stack AI-powered chatbot that can:

  • Scrape circulars and press notes from the IRDAI website
  • Extract and embed content from PDFs/HTML
  • Answer user questions with accurate regulatory information
  • Suggest follow-up questions like a real assistant
  • Provide a beautiful and modern UI using Angular + Bootstrap

In this blog, I’ll walk through the architecture, tech stack, and some cool agentic automation behind the scenes.


🚀 Overview: What I Built

The IRDAI Chatbot is a fully automated system that:

  1. Scrapes and downloads IRDAI circulars across all paginated pages
  2. Parses and embeds PDFs using LangChain + OpenAI embeddings
  3. Stores the embeddings in a local Chroma vectorstore
  4. Answers questions through a smart LangChain QA agent using RAG (retrieval-augmented generation)
  5. Offers an interactive, smooth Angular frontend with live chat, typing effects, suggestion bubbles, and animated scroll
  6. Suggests relevant follow-up questions based on each answer!

🧰 Tech Stack

| Layer | Tech |
|---|---|
| 🧠 LLM | OpenAI GPT-4 via LangChain |
| 🧱 Vectorstore | ChromaDB |
| 📚 Embedding | OpenAIEmbeddings |
| 🔍 RAG | LangChain QA chains |
| 🧩 Agent Framework | LangGraph |
| 🔧 Backend | FastAPI |
| 🧼 PDF Parsing | PyMuPDF |
| 🌐 Frontend | Angular 17 + Bootstrap 5 |
| 🤖 Scraping | Selenium + BeautifulSoup |
| 🔁 Async | Python asyncio + batching |

🧠 Backend: AI Agent Architecture

I built a smart multi-step LangGraph agent with these nodes:

  1. Scrape Node: uses Selenium to crawl all IRDAI circular pages, follows paginated "Next" links, and downloads PDFs
  2. Parse Node: uses PyMuPDF to read PDF content
  3. Embed Node: splits content into chunks and stores embeddings in Chroma
  4. QA Node: answers questions by retrieving relevant docs using vector similarity
  5. Suggestion Node: uses another agent to suggest follow-up questions based on the bot's answer
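
To make the "follow the Next links" part of the Scrape Node concrete, here is a minimal sketch of the crawl loop. It is not the project's Selenium code: it uses only the Python standard library, takes the page-fetching function as a parameter, and runs against a made-up two-page site (`FAKE_SITE` and the `example.test` URLs are hypothetical).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def crawl_pdf_links(start_url, fetch, max_pages=100):
    """Follow "Next" links from start_url and return every PDF URL seen."""
    pdf_urls, seen, url = [], set(), start_url
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        parser = LinkCollector()
        parser.feed(fetch(url))  # fetch(url) must return the page HTML
        next_url = None
        for href, text in parser.links:
            absolute = urljoin(url, href)
            if absolute.lower().endswith(".pdf"):
                if absolute not in pdf_urls:
                    pdf_urls.append(absolute)
            elif text.lower() == "next":
                next_url = absolute
        url = next_url  # None ends the loop on the last page
    return pdf_urls


# Hypothetical two-page listing used in place of the live site.
FAKE_SITE = {
    "https://example.test/circulars?page=1":
        '<a href="/docs/a.pdf">A</a> <a href="?page=2">Next</a>',
    "https://example.test/circulars?page=2":
        '<a href="/docs/b.pdf">B</a>',
}

pdfs = crawl_pdf_links("https://example.test/circulars?page=1", FAKE_SITE.__getitem__)
```

The `seen` set and `max_pages` cap are there so a pagination loop on the site cannot keep the crawler running forever.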

All nodes are reusable and callable as standalone FastAPI routes too.
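
One way to picture nodes that work both inside a graph and on their own: each node is a plain function that takes a state dict and returns an updated one. The sketch below is a stand-in for that idea, not the LangGraph API, and the node bodies are placeholders with hypothetical state keys (`pdf_paths`, `texts`, `indexed_chunks`).

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Node = Callable[[State], State]


def scrape_node(state: State) -> State:
    # Placeholder: a real node would crawl the site and download PDFs.
    return {**state, "pdf_paths": ["circular_1.pdf", "circular_2.pdf"]}


def parse_node(state: State) -> State:
    # Placeholder: a real node would extract text from each PDF.
    texts = [f"text of {path}" for path in state["pdf_paths"]]
    return {**state, "texts": texts}


def embed_node(state: State) -> State:
    # Placeholder: a real node would chunk, embed, and persist vectors.
    return {**state, "indexed_chunks": len(state["texts"])}


def run_pipeline(nodes: List[Node], state: State) -> State:
    """Run nodes in order, passing each node's output state to the next."""
    for node in nodes:
        state = node(state)
    return state


INGEST_PIPELINE = [scrape_node, parse_node, embed_node]
result = run_pipeline(INGEST_PIPELINE, {})
```

Because every node has the same signature, a route handler can call `parse_node` alone with a hand-built state, while the agent runs the full list.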


🧠 Chatbot Flow: How Everything Connects

Here’s a visual flow of how the chatbot works, from user input to AI agents performing RAG-based document search and response formatting:

IRDAI Chatbot Flow Diagram

  • Text Input: User submits a question.
  • ChatGPT Core: Formats the query, routes it to agents.
  • AI Agents:
    • Scraper: Collects PDFs & press notes
    • ETL: Parses and embeds documents
    • QA: Handles similarity search + answer generation
  • Database: Stores and retrieves document embeddings
  • Chat Interface: Formats HTML responses and suggestions

This modular design ensures scalability and clarity.
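
The similarity-search step in the QA agent can be sketched in a few lines. This is a toy version under loud assumptions: `embed` counts words instead of calling an embedding model, the chunks live in a Python list instead of Chroma, and the sample chunk texts are invented placeholders rather than real IRDAI content.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    """Toy embedding: lowercase word counts (a real system calls an embedding model)."""
    return dict(Counter(text.lower().split()))


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question: str, chunks: List[str], k: int = 2) -> str:
    """Assemble the retrieved context and the question into one LLM prompt."""
    context = "\n".join(chunk for _, chunk in top_k(question, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Invented placeholder chunks, not real circular text.
CHUNKS = [
    "circular about health insurance claims settlement",
    "press note on motor insurance premium rates",
    "guidelines for term life insurance products",
]

best = top_k("term life insurance guidelines", CHUNKS, k=1)
```

The retrieve-then-prompt shape is the RAG part: the model only sees the chunks that scored highest for the question.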


βš™οΈ Smart Features

  • ✅ Async batching for large document QA (splits input across token-safe chunks)
  • ✅ Automatic spell correction + similar question detection
  • ✅ Answer caching to improve performance with a time-aware LRU-like strategy
  • ✅ Suggestions engine that generates related follow-up questions using a second LLM chain
  • ✅ Common llm_provider.py to centralize LLM configuration across the app
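
Here is one possible reading of "time-aware LRU-like" caching combined with similar-question detection, as a small standard-library sketch. The class name `AnswerCache`, the 0.85 similarity cutoff, and the use of `difflib` for fuzzy matching are all assumptions for illustration, not the project's actual implementation.

```python
import difflib
import time
from collections import OrderedDict
from typing import Optional


def normalize(question: str) -> str:
    """Lowercase, collapse whitespace, and drop trailing punctuation."""
    return " ".join(question.lower().split()).rstrip("?!. ")


class AnswerCache:
    """LRU cache whose entries also expire after ttl_seconds."""

    def __init__(self, max_size=128, ttl_seconds=3600.0, clock=time.monotonic):
        self.max_size = max_size
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = OrderedDict()  # key -> (stored_at, answer)

    def _purge_expired(self) -> None:
        now = self.clock()
        expired = [k for k, (t, _) in self._entries.items() if now - t > self.ttl]
        for key in expired:
            del self._entries[key]

    def get(self, question: str, cutoff: float = 0.85) -> Optional[str]:
        self._purge_expired()
        key = normalize(question)
        if key not in self._entries:
            # Fall back to the closest stored question; this also absorbs small typos.
            close = difflib.get_close_matches(key, list(self._entries), n=1, cutoff=cutoff)
            if not close:
                return None
            key = close[0]
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key][1]

    def put(self, question: str, answer: str) -> None:
        key = normalize(question)
        self._entries[key] = (self.clock(), answer)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict least recently used


cache = AnswerCache(max_size=2, ttl_seconds=600)
cache.put("What is Saral Jeevan Bima?", "cached answer")
hit = cache.get("what is saral jeevan bmia")  # typo still finds the entry
```

The fuzzy fallback trades a little precision for fewer repeated LLM calls, so the cutoff is worth tuning against real user questions.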

💬 Frontend: Angular Chat UI

The frontend is built with Angular 17 standalone components, styled with Bootstrap 5, and includes:

  • 💡 Suggested questions before and after answers
  • 🤖 Typing animation (blinking dots)
  • 🎯 Smart session tracking using UUIDs
  • 🔄 Smooth scroll-to-bottom on every update
  • ❌ Graceful error handling
  • 🔥 Responsive, mobile-friendly layout

🌐 FastAPI Backend

The backend exposes:

  • /ask: main QA endpoint
  • /suggest: generate follow-up questions
  • /scrape: run scraper
  • /embed: re-embed new content

You can trigger scraping + embedding via the LangGraph agent, CLI, or API, so it's fully flexible.
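
To give a feel for the request and response shapes behind /ask and /suggest, here is a framework-agnostic sketch. The payload field names (`question`, `session_id`, `answer`, `suggestions`), the status codes, and the stub backends are assumptions. In a FastAPI app each handler would instead be registered with a route decorator such as `@app.post("/ask")`, with validation typically done through a Pydantic model.

```python
from typing import Callable, Dict, List


def answer_question(question: str) -> str:
    # Stub: a real backend would run the retrieval + LLM chain here.
    return f"(stub answer for: {question})"


def suggest_followups(answer: str) -> List[str]:
    # Stub: a real backend would ask a second LLM chain for follow-ups.
    return ["(stub follow-up 1)", "(stub follow-up 2)"]


def ask(payload: dict) -> dict:
    """Handler logic for a route like /ask."""
    question = str(payload.get("question", "")).strip()
    if not question:
        return {"status": 422, "error": "question is required"}
    return {
        "status": 200,
        "session_id": payload.get("session_id"),
        "answer": answer_question(question),
    }


def suggest(payload: dict) -> dict:
    """Handler logic for a route like /suggest."""
    answer = str(payload.get("answer", "")).strip()
    if not answer:
        return {"status": 422, "error": "answer is required"}
    return {"status": 200, "suggestions": suggest_followups(answer)}


ROUTES: Dict[str, Callable[[dict], dict]] = {"/ask": ask, "/suggest": suggest}


def dispatch(path: str, payload: dict) -> dict:
    """Look up the handler for a path and call it with the JSON payload."""
    handler = ROUTES.get(path)
    if handler is None:
        return {"status": 404, "error": f"unknown route {path}"}
    return handler(payload)


response = dispatch("/ask", {"question": " What is Saral Jeevan Bima? ", "session_id": "abc"})
```

Keeping the handler logic in plain functions like this is also what makes it easy to call the same code from a CLI or from the agent.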


🧠 Example Q&A

Q: What is Saral Jeevan Bima?

A: Saral Jeevan Bima is a standard term life insurance policy mandated by IRDAI...

Suggested follow-ups:

  • "Who is eligible for Saral Jeevan?"
  • "Is it mandatory for insurers?"
  • "What are the premium limits?"

πŸ’‘ Lessons Learned

  • πŸ” LangGraph is amazing for building modular multi-step agent flows.
  • ⚠️ Be cautious of OpenAI token limits β€” I had to chunk documents smartly.
  • πŸ”„ Building a good frontend experience is just as important as the backend logic.
  • ⚑ Don’t forget caching when dealing with repeated queries or expensive operations.
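
For the token-limit lesson, a rough sketch of chunking plus bounded async batching looks like this. Word counts stand in for token counts here (a real tokenizer would be needed for exact limits), `fake_llm_call` is a hypothetical stand-in for a model call, and the window sizes are arbitrary.

```python
import asyncio
from typing import Awaitable, Callable, List


def chunk_words(text: str, max_words: int = 200, overlap: int = 20) -> List[str]:
    """Split text into word windows of at most max_words, overlapping by `overlap` words."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


async def process_in_batches(
    chunks: List[str],
    worker: Callable[[str], Awaitable[int]],
    max_concurrent: int = 4,
) -> List[int]:
    """Run worker on every chunk with at most max_concurrent calls in flight.

    Results come back in the same order as the input chunks.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(chunk: str) -> int:
        async with semaphore:
            return await worker(chunk)

    return await asyncio.gather(*(guarded(chunk) for chunk in chunks))


async def fake_llm_call(chunk: str) -> int:
    # Stand-in for a model call; returns the chunk's word count.
    await asyncio.sleep(0)
    return len(chunk.split())


sample = " ".join(f"w{i}" for i in range(10))
sample_chunks = chunk_words(sample, max_words=4, overlap=1)
sizes = asyncio.run(process_in_batches(sample_chunks, fake_llm_call, max_concurrent=2))
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, and the semaphore keeps concurrent requests under whatever rate limit applies.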

🎁 What's Next

  • Add user authentication for session history
  • Push updates to a Firebase or Netlify-hosted frontend
  • Enable upload of user PDFs for comparison
  • Train a custom model on domain-specific terms

📦 Repo Coming Soon

Planning to open-source this soon. Let me know if you’d like early access!


🙌 Let’s Connect!

If you found this useful or have feedback:

💬 Comment below

🧠 Follow me on LinkedIn

💡 Have a chatbot idea? Let’s collaborate!
