DEV Community

Erry Kostala

Building a Smart Job Search API with FastAPI, LangChain, PostgreSQL, and LLMs

In this post, we’ll walk through a modern job search API built with FastAPI and PostgreSQL, featuring advanced AI-powered search using large language models (LLMs). We’ll cover the project structure, how to run and migrate the service, and take a deep dive into the /jobs-ai-search endpoint, which lets users search for jobs using natural language and smart filters.


Project Overview

This service is designed to help users search for jobs using flexible filters and natural language input. It uses:

  • FastAPI for the web API
  • PostgreSQL as the database (via Docker Compose)
  • SQLAlchemy for ORM models and migrations (with Alembic)
  • LangChain + OpenAI for LLM-powered filter generation
  • Modular code structure for maintainability

Key Features

  • AI-powered search via /jobs-ai-search:

    • Users can enter natural language queries (e.g., "remote senior developer jobs at a medium company in London with at least $100k salary").
    • The backend uses a large language model (LLM) to interpret the query and generate structured filters, making the search experience intuitive and powerful.
    • The LLM understands synonyms, context, and intent, so users don't need to know the exact filter fields or values.
  • Company size logic:

    • Users can filter jobs by company size using business terms like "small", "medium", "large", or "enterprise".
    • The service maps these terms to employee count ranges using a dedicated company_sizes table (e.g., "medium" = 51–250 employees).
    • This mapping is automatic and transparent to the user, enabling more human-friendly search.
  • Flexible, composable filters:

    • The API supports a wide range of filters, including job title, company, location, employment type, experience level, remote/hybrid, industry, and salary range.
    • Filters use a double-underscore syntax (e.g., salary__gte, title__ilike) for advanced queries, inspired by Django ORM.
    • You can combine multiple filters in a single query for highly targeted results.
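To make the company-size mapping concrete, here's a minimal sketch of the kind of lookup the `company_sizes` table provides. Only the "medium" = 51–250 range comes from this post; the other ranges are illustrative assumptions, not the service's actual values.

```python
# Illustrative term -> (min_employees, max_employees) mapping.
# Only "medium" (51-250) is confirmed by the post; the rest are assumptions.
COMPANY_SIZES = {
    "small": (1, 50),
    "medium": (51, 250),
    "large": (251, 1000),
    "enterprise": (1001, None),  # None = no upper bound
}


def employee_range(term: str) -> tuple:
    """Look up the employee-count range for a business-friendly size term."""
    try:
        return COMPANY_SIZES[term.lower()]
    except KeyError:
        raise ValueError(f"Unknown company size: {term}")
```

In the real service this lives in a database table rather than a dict, so ranges can be tuned without a deploy, but the lookup logic is the same idea.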

The /jobs-ai-search Endpoint: AI-Powered Job Search

The /jobs-ai-search endpoint is the star of this service. It allows users to search for jobs using natural language queries, which are interpreted and converted into structured filters by an LLM (like GPT-4).


How It Works (with Code Examples)

1. User Query

The user sends a GET request to /jobs-ai-search with a filter parameter containing a natural language query. For example:

GET /jobs-ai-search?filter=at least $80000 medium size company remote in Matthewshaven

2. LLM Filter Generation

The backend uses LangChain and OpenAI to process the query. The LLM is prompted to generate a JSON filter object that matches the user’s intent. For example, the above query might be converted to:

{
  "salary__gte": 80000,
  "company_size": "medium",
  "remote": true,
  "location": "Matthewshaven"
}

To do this, the query is passed to ChatGPT:

from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import JsonOutputParser
from langchain_openai import ChatOpenAI

def get_llm_job_filters(query_text: str) -> dict:
    llm = ChatOpenAI(
        model="gpt-4.1", temperature=0.0, max_tokens=1000, timeout=60, max_retries=2
    ).with_structured_output(method="json_mode")

    template = """
    You are a job filter generator.
    Given a query, generate a JSON object that can be used to filter jobs in a database.
    Each value should be used on only a single field, whichever is most suitable.
    If a value matches one of a field's examples, that field should take priority.
    {format_instructions}
    {query}
    """

By default, ChatGPT returns human-readable text. So how do we get something our app can actually use? That’s where LangChain’s support for structured outputs comes in.

We define the expected structure using Pydantic. This ensures the LLM returns valid, predictable JSON that our backend understands.

Here’s the Pydantic class:

from typing import Optional

from pydantic import BaseModel, Field


class JobFilter(BaseModel):
    """
    Filter for job listings.
    """

    title__ilike: Optional[str] = Field(
        default=None,
        description="Filter jobs by title using case-insensitive partial match.",
    )
    company__ilike: Optional[str] = Field(
        default=None,
        description="Filter jobs by company using case-insensitive partial match.",
    )
    location__ilike: Optional[str] = Field(
        default=None,
        description="Filter jobs by location using case-insensitive partial match.",
    )
    employment_type: Optional[str] = Field(
        default=None,
        description="Filter jobs by employment type (e.g., full_time, part_time, contract).",
        examples=[
            "full_time",
            "part_time",
            "contract",
        ],
    )
    experience_level: Optional[str] = Field(
        default=None,
        description="Filter jobs by experience level (e.g., entry_level, mid_level, senior_level).",
        examples=[
            "entry_level",
            "mid_level",
            "senior_level",
            "director",
            "executive",
            "internship",
            "other",
        ],
    )
    remote: Optional[bool] = Field(
        default=None,
        description="Filter jobs by remote work availability.",
    )
    hybrid: Optional[bool] = Field(
        default=None,
        description="Filter jobs by hybrid work availability.",
    )
    industry__ilike: Optional[str] = Field(
        default=None,
        description="Filter jobs by industry using case-insensitive partial match.",
    )
    salary__gte: Optional[int] = Field(
        default=None,
        description="Filter jobs by minimum salary.",
    )
    company_size: Optional[str] = Field(
        default=None,
        description="Filter jobs by company size.",
        examples=[
            "small",
            "medium",
            "large",
            "enterprise",
        ],
    )

Back in get_llm_job_filters, we now generate the actual structured filters:

    parser = JsonOutputParser(pydantic_object=JobFilter)
    prompt = PromptTemplate(
        template=template,
        input_variables=["query"],
        partial_variables={"format_instructions": parser.get_format_instructions()},
    )

    # Because the LLM was configured with json_mode, it already returns a
    # dict, so we pipe the prompt straight into it and use the parser only
    # for its format instructions.
    chain = prompt | llm
    response = chain.invoke({"query": query_text})

    # Drop fields the model left empty so only real filters reach the query.
    smart_filter = {k: v for k, v in response.items() if v not in (None, 0, "", [], {})}

    return smart_filter
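The final dict comprehension is worth a closer look: it strips out fields the model left empty, so only meaningful filters reach the query layer. A quick standalone illustration:

```python
# A possible raw LLM response, including fields the model left empty:
response = {
    "salary__gte": 80000,
    "company_size": "medium",
    "remote": True,
    "title__ilike": "",  # empty string: not a real filter
    "hybrid": None,      # field not mentioned in the query
}

# The same comprehension used in get_llm_job_filters:
smart_filter = {k: v for k, v in response.items() if v not in (None, 0, "", [], {})}
print(smart_filter)  # {'salary__gte': 80000, 'company_size': 'medium', 'remote': True}
```

One caveat: since `False == 0` in Python, a genuine `remote: False` filter (or `salary__gte: 0`) would also be stripped by this check, which may or may not be what you want.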

3. Converting the Filter to a SQLAlchemy Query

Once we have the filters, we need to apply them to the SQLAlchemy query. Here's how we convert Django-style syntax like employees__gte=100 into SQLAlchemy filter expressions:

def query_dict_to_sqlalchemy(model, query, filter_dict):
    for key, value in filter_dict.items():
        if "__" in key:
            # Django-style "field__op" keys, e.g. salary__gte or title__ilike.
            field, op = key.split("__", 1)
            col = getattr(model, field, None)
            # Compare against None explicitly: SQLAlchemy column attributes
            # should not be used in a plain truth test.
            if col is None:
                continue
            if op == "ilike":
                query = query.filter(col.ilike(f"%{value}%"))
            elif op == "gte":
                query = query.filter(col >= value)
            elif op == "lte":
                query = query.filter(col <= value)
        else:
            # Plain keys become simple equality filters.
            col = getattr(model, key, None)
            if col is not None:
                query = query.filter(col == value)
    return query
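To sanity-check the converter, here's a self-contained run against an in-memory SQLite database. The `Job` model below is a minimal stand-in (not the project's real model), and a version of the converter is repeated so the snippet runs on its own:

```python
from sqlalchemy import Boolean, Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


# Minimal stand-in for the post's Job model, just to exercise the converter.
class Job(Base):
    __tablename__ = "jobs"
    id = Column(Integer, primary_key=True)
    title = Column(String)
    salary = Column(Integer)
    remote = Column(Boolean)


def query_dict_to_sqlalchemy(model, query, filter_dict):
    for key, value in filter_dict.items():
        if "__" in key:
            field, op = key.split("__", 1)
            col = getattr(model, field, None)
            if col is None:
                continue
            if op == "ilike":
                query = query.filter(col.ilike(f"%{value}%"))
            elif op == "gte":
                query = query.filter(col >= value)
            elif op == "lte":
                query = query.filter(col <= value)
        else:
            col = getattr(model, key, None)
            if col is not None:
                query = query.filter(col == value)
    return query


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as session:
    session.add_all(
        [
            Job(title="Senior Engineer", salary=120000, remote=True),
            Job(title="Junior Engineer", salary=60000, remote=True),
            Job(title="Senior Designer", salary=95000, remote=False),
        ]
    )
    session.commit()
    query = query_dict_to_sqlalchemy(
        Job,
        session.query(Job),
        {"salary__gte": 90000, "title__ilike": "engineer", "remote": True},
    )
    titles = [job.title for job in query.all()]
    print(titles)  # ['Senior Engineer']
```

Only the senior engineer matches all three filters: the junior role fails `salary__gte`, and the designer fails both `title__ilike` and `remote`.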

4. Final FastAPI Route

@router.get("/jobs-ai-search")
def read_jobs_ai_search(filter: str, skip: int = 0, limit: int = 10):
    query = session.query(Job)
    filter_dict = get_llm_job_filters(filter)
    query = query_dict_to_sqlalchemy(Job, query, filter_dict)

    total = query.count()
    jobs = query.offset(skip).limit(limit).all()
    next_page, prev_page = get_pagination_urls(...)
    results = format_jobs(jobs)

    return {
        "total": total,
        "next_page": next_page,
        "prev_page": prev_page,
        "results": results,
    }
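The post doesn't show `get_pagination_urls`, so here's one plausible, purely hypothetical sketch of what such a helper could do: build `next_page`/`prev_page` links from `skip`, `limit`, the total count, and the original filter text, returning `None` at either edge.

```python
from urllib.parse import urlencode


def get_pagination_urls(path: str, skip: int, limit: int, total: int, filter_text: str):
    """Hypothetical pagination helper: build next/prev URLs or None at the edges."""

    def url(new_skip: int) -> str:
        # urlencode handles spaces and other special characters in the filter.
        qs = urlencode({"skip": new_skip, "limit": limit, "filter": filter_text})
        return f"{path}?{qs}"

    next_page = url(skip + limit) if skip + limit < total else None
    prev_page = url(max(skip - limit, 0)) if skip > 0 else None
    return next_page, prev_page


next_page, prev_page = get_pagination_urls("/jobs-ai-search", 0, 10, 12, "remote")
print(next_page)  # /jobs-ai-search?skip=10&limit=10&filter=remote
print(prev_page)  # None
```

This matches the shape of the example response below: 12 total results with `limit=10` yields a `next_page` and a `null` `prev_page` on the first page.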

Example Response

{
  "total": 12,
  "next_page": "/jobs-ai-search?skip=10&limit=10&filter=...",
  "prev_page": null,
  "results": [
    {
      "id": 1,
      "title": "Software Engineer",
      "company": "Acme Corp",
      "salary": 90000,
      "remote": true,
      "location": "Matthewshaven"
    }
  ]
}

Why This Is Powerful

  • Natural Language Search: Users don’t need to know the exact filter fields; they can just describe what they want.

  • LLM-Driven: The LLM interprets intent and generates structured filters, making the search much smarter than keyword matching.

  • Company Size Intelligence: Understands business terms like "medium company" and translates them into actionable filters.

  • Flexible Filtering: The double-underscore filter syntax allows for a wide range of queries (e.g., salary__gte, title__ilike) without changing the backend code.

  • Extensible: You can add more fields, operators, or even new entity types (like company size) without changing the core logic.

Conclusion

This project demonstrates how to combine FastAPI, SQLAlchemy, and LLMs to build a modern, flexible, and intelligent job search API. The /jobs-ai-search endpoint is a great example of how AI can bridge the gap between user intent and structured data, making search more accessible and powerful.

Want to try it out? Clone the repo, run the stack, and start searching for your dream job—using plain English!

The repo is here: https://github.com/errietta/ai-job-search-poc
