DEV Community

Erry Kostala

Building a Smart Job Search API with FastAPI, LangChain, PostgreSQL, and LLMs

In this post, we’ll walk through a modern job search API built with FastAPI and PostgreSQL, featuring advanced AI-powered search using large language models (LLMs). We’ll cover the project structure, how to run and migrate the service, and take a deep dive into the /jobs-ai-search endpoint, which lets users search for jobs using natural language and smart filters.


Project Overview

This service is designed to help users search for jobs using flexible filters and natural language input. It uses:

  • FastAPI for the web API
  • PostgreSQL as the database (via Docker Compose)
  • SQLAlchemy for ORM models and migrations (with Alembic)
  • LangChain + OpenAI for LLM-powered filter generation
  • Modular code structure for maintainability

Key Features

  • AI-powered search via /jobs-ai-search:

    • Users can enter natural language queries (e.g., "remote senior developer jobs at a medium company in London with at least $100k salary").
    • The backend uses a large language model (LLM) to interpret the query and generate structured filters, making the search experience intuitive and powerful.
    • The LLM understands synonyms, context, and intent, so users don't need to know the exact filter fields or values.
  • Company size logic:

    • Users can filter jobs by company size using business terms like "small", "medium", "large", or "enterprise".
    • The service maps these terms to employee count ranges using a dedicated company_sizes table (e.g., "medium" = 51–250 employees).
    • This mapping is automatic and transparent to the user, enabling more human-friendly search.
  • Flexible, composable filters:

    • The API supports a wide range of filters, including job title, company, location, employment type, experience level, remote/hybrid, industry, and salary range.
    • Filters use a double-underscore syntax (e.g., salary__gte, title__ilike) for advanced queries, inspired by Django ORM.
    • You can combine multiple filters in a single query for highly targeted results.
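To make the company-size mapping concrete, here's a minimal sketch of the kind of lookup the `company_sizes` table provides. Only the "medium" = 51–250 range comes from this post; the other ranges are illustrative assumptions, not the service's actual values.

```python
# Illustrative term -> (min_employees, max_employees) mapping.
# Only "medium" (51-250) is confirmed by the post; the rest are assumptions.
COMPANY_SIZES = {
    "small": (1, 50),
    "medium": (51, 250),
    "large": (251, 1000),
    "enterprise": (1001, None),  # None = no upper bound
}


def employee_range(term: str) -> tuple:
    """Look up the employee-count range for a business-friendly size term."""
    try:
        return COMPANY_SIZES[term.lower()]
    except KeyError:
        raise ValueError(f"Unknown company size: {term}")
```

In the real service this lives in a database table rather than a dict, so ranges can be tuned without a deploy, but the lookup logic is the same idea.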

The /jobs-ai-search Endpoint: AI-Powered Job Search

The /jobs-ai-search endpoint is the star of this service. It allows users to search for jobs using natural language queries, which are interpreted and converted into structured filters by an LLM (like GPT-4).


How It Works (with Code Examples)

1. User Query

The user sends a GET request to /jobs-ai-search with a filter parameter containing a natural language query. For example:

GET /jobs-ai-search?filter=at least $80000 medium size company remote in Matthewshaven

2. LLM Filter Generation

The backend uses LangChain and OpenAI to process the query. The LLM is prompted to generate a JSON filter object that matches the user’s intent. For example, the above query might be converted to:

{
  "salary__gte": 80000,
  "company_size": "medium",
  "remote": true,
  "location": "Matthewshaven"
}

To do this, the query is passed to ChatGPT:

from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import JsonOutputParser
from langchain_openai import ChatOpenAI

def get_llm_job_filters(query_text: str) -> dict:
    llm = ChatOpenAI(
        model="gpt-4.1", temperature=0.0, max_tokens=1000, timeout=60, max_retries=2
    ).with_structured_output(method="json_mode")

    template = """
    You are a job filter generator.
    Given a query, generate a JSON object that can be used to filter jobs in a database.
    Each value should be used on only a single field, whichever is most suitable.
    If a value matches one of a field's examples, that field should take priority.
    {format_instructions}
    {query}
    """

By default, ChatGPT returns human-readable text. So how do we get something our app can actually use? That’s where LangChain’s support for structured outputs comes in.

We define the expected structure using Pydantic. This ensures the LLM returns valid, predictable JSON that our backend understands.

Here’s the Pydantic class:

from typing import Optional

from pydantic import BaseModel, Field


class JobFilter(BaseModel):
    """
    Filter for job listings.
    """

    title__ilike: Optional[str] = Field(
        default=None,
        description="Filter jobs by title using case-insensitive partial match.",
    )
    company__ilike: Optional[str] = Field(
        default=None,
        description="Filter jobs by company using case-insensitive partial match.",
    )
    location__ilike: Optional[str] = Field(
        default=None,
        description="Filter jobs by location using case-insensitive partial match.",
    )
    employment_type: Optional[str] = Field(
        default=None,
        description="Filter jobs by employment type (e.g., full_time, part_time, contract).",
        examples=[
            "full_time",
            "part_time",
            "contract",
        ],
    )
    experience_level: Optional[str] = Field(
        default=None,
        description="Filter jobs by experience level (e.g., entry_level, mid_level, senior_level).",
        examples=[
            "entry_level",
            "mid_level",
            "senior_level",
            "director",
            "executive",
            "internship",
            "other",
        ],
    )
    remote: Optional[bool] = Field(
        default=None,
        description="Filter jobs by remote work availability.",
    )
    hybrid: Optional[bool] = Field(
        default=None,
        description="Filter jobs by hybrid work availability.",
    )
    industry__ilike: Optional[str] = Field(
        default=None,
        description="Filter jobs by industry using case-insensitive partial match.",
    )
    salary__gte: Optional[int] = Field(
        default=None,
        description="Filter jobs by minimum salary.",
    )
    company_size: Optional[str] = Field(
        default=None,
        description="Filter jobs by company size.",
        examples=[
            "small",
            "medium",
            "large",
            "enterprise",
        ],
    )

Back in get_llm_job_filters, we now generate the actual structured filters:

    parser = JsonOutputParser(pydantic_object=JobFilter)
    prompt = PromptTemplate(
        template=template,
        input_variables=["query"],
        partial_variables={"format_instructions": parser.get_format_instructions()},
    )

    # Because the LLM was configured with json_mode, it already returns a
    # dict, so we pipe the prompt straight into it and use the parser only
    # for its format instructions.
    chain = prompt | llm
    response = chain.invoke({"query": query_text})

    # Drop fields the model left empty so only real filters reach the query.
    smart_filter = {k: v for k, v in response.items() if v not in (None, 0, "", [], {})}

    return smart_filter
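The final dict comprehension is worth a closer look: it strips out fields the model left empty, so only meaningful filters reach the query layer. A quick standalone illustration:

```python
# A possible raw LLM response, including fields the model left empty:
response = {
    "salary__gte": 80000,
    "company_size": "medium",
    "remote": True,
    "title__ilike": "",  # empty string: not a real filter
    "hybrid": None,      # field not mentioned in the query
}

# The same comprehension used in get_llm_job_filters:
smart_filter = {k: v for k, v in response.items() if v not in (None, 0, "", [], {})}
print(smart_filter)  # {'salary__gte': 80000, 'company_size': 'medium', 'remote': True}
```

One caveat: since `False == 0` in Python, a genuine `remote: False` filter (or `salary__gte: 0`) would also be stripped by this check, which may or may not be what you want.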

3. Converting the Filter to a SQLAlchemy Query

Once we have the filters, we need to apply them to the SQLAlchemy query. Here's how we convert Django-style syntax like employees__gte=100 into SQLAlchemy filter expressions:

def query_dict_to_sqlalchemy(model, query, filter_dict):
    for key, value in filter_dict.items():
        if "__" in key:
            # Django-style "field__op" keys, e.g. salary__gte or title__ilike.
            field, op = key.split("__", 1)
            col = getattr(model, field, None)
            # Compare against None explicitly: SQLAlchemy column attributes
            # should not be used in a plain truth test.
            if col is None:
                continue
            if op == "ilike":
                query = query.filter(col.ilike(f"%{value}%"))
            elif op == "gte":
                query = query.filter(col >= value)
            elif op == "lte":
                query = query.filter(col <= value)
        else:
            # Plain keys become simple equality filters.
            col = getattr(model, key, None)
            if col is not None:
                query = query.filter(col == value)
    return query
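To sanity-check the converter, here's a self-contained run against an in-memory SQLite database. The `Job` model below is a minimal stand-in (not the project's real model), and a version of the converter is repeated so the snippet runs on its own:

```python
from sqlalchemy import Boolean, Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


# Minimal stand-in for the post's Job model, just to exercise the converter.
class Job(Base):
    __tablename__ = "jobs"
    id = Column(Integer, primary_key=True)
    title = Column(String)
    salary = Column(Integer)
    remote = Column(Boolean)


def query_dict_to_sqlalchemy(model, query, filter_dict):
    for key, value in filter_dict.items():
        if "__" in key:
            field, op = key.split("__", 1)
            col = getattr(model, field, None)
            if col is None:
                continue
            if op == "ilike":
                query = query.filter(col.ilike(f"%{value}%"))
            elif op == "gte":
                query = query.filter(col >= value)
            elif op == "lte":
                query = query.filter(col <= value)
        else:
            col = getattr(model, key, None)
            if col is not None:
                query = query.filter(col == value)
    return query


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as session:
    session.add_all(
        [
            Job(title="Senior Engineer", salary=120000, remote=True),
            Job(title="Junior Engineer", salary=60000, remote=True),
            Job(title="Senior Designer", salary=95000, remote=False),
        ]
    )
    session.commit()
    query = query_dict_to_sqlalchemy(
        Job,
        session.query(Job),
        {"salary__gte": 90000, "title__ilike": "engineer", "remote": True},
    )
    titles = [job.title for job in query.all()]
    print(titles)  # ['Senior Engineer']
```

Only the senior engineer matches all three filters: the junior role fails `salary__gte`, and the designer fails both `title__ilike` and `remote`.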

4. Final FastAPI Route

@router.get("/jobs-ai-search")
def read_jobs_ai_search(filter: str, skip: int = 0, limit: int = 10):
    query = session.query(Job)
    filter_dict = get_llm_job_filters(filter)
    query = query_dict_to_sqlalchemy(Job, query, filter_dict)

    total = query.count()
    jobs = query.offset(skip).limit(limit).all()
    next_page, prev_page = get_pagination_urls(...)
    results = format_jobs(jobs)

    return {
        "total": total,
        "next_page": next_page,
        "prev_page": prev_page,
        "results": results,
    }
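The post doesn't show `get_pagination_urls`, so here's one plausible, purely hypothetical sketch of what such a helper could do: build `next_page`/`prev_page` links from `skip`, `limit`, the total count, and the original filter text, returning `None` at either edge.

```python
from urllib.parse import urlencode


def get_pagination_urls(path: str, skip: int, limit: int, total: int, filter_text: str):
    """Hypothetical pagination helper: build next/prev URLs or None at the edges."""

    def url(new_skip: int) -> str:
        # urlencode handles spaces and other special characters in the filter.
        qs = urlencode({"skip": new_skip, "limit": limit, "filter": filter_text})
        return f"{path}?{qs}"

    next_page = url(skip + limit) if skip + limit < total else None
    prev_page = url(max(skip - limit, 0)) if skip > 0 else None
    return next_page, prev_page


next_page, prev_page = get_pagination_urls("/jobs-ai-search", 0, 10, 12, "remote")
print(next_page)  # /jobs-ai-search?skip=10&limit=10&filter=remote
print(prev_page)  # None
```

This matches the shape of the example response below: 12 total results with `limit=10` yields a `next_page` and a `null` `prev_page` on the first page.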

Example Response

{
  "total": 12,
  "next_page": "/jobs-ai-search?skip=10&limit=10&filter=...",
  "prev_page": null,
  "results": [
    {
      "id": 1,
      "title": "Software Engineer",
      "company": "Acme Corp",
      "salary": 90000,
      "remote": true,
      "location": "Matthewshaven"
    }
  ]
}

Why This Is Powerful

  • Natural Language Search: Users don’t need to know the exact filter fields; they can just describe what they want.

  • LLM-Driven: The LLM interprets intent and generates structured filters, making the search much smarter than keyword matching.

  • Company Size Intelligence: Understands business terms like "medium company" and translates them into actionable filters.

  • Flexible Filtering: The double-underscore filter syntax allows for a wide range of queries (e.g., salary__gte, title__ilike) without changing the backend code.

  • Extensible: You can add more fields, operators, or even new entity types (like company size) without changing the core logic.

Conclusion

This project demonstrates how to combine FastAPI, SQLAlchemy, and LLMs to build a modern, flexible, and intelligent job search API. The /jobs-ai-search endpoint is a great example of how AI can bridge the gap between user intent and structured data, making search more accessible and powerful.

Want to try it out? Clone the repo, run the stack, and start searching for your dream job—using plain English!

The repo is here: https://github.com/errietta/ai-job-search-poc
