
Amverum Cloud

Pydantic 2: The Complete Guide for Python Developers - From Basics to Advanced Techniques

Recently, I noticed that there is very little accessible, understandable information about the Pydantic library on the Russian-language Internet, especially about its updated version 2. This seems strange, because Pydantic is an easy library to pick up: a few hours are enough to learn the basics. Once you master it, you will have a powerful tool at your disposal that can be used effectively in most Python projects.

Interesting fact: more than 30% of all Python projects use Pydantic, even if it is not always noticeable at first glance. And frameworks such as FastAPI generally build their logic on top of Pydantic, making it an integral part of their solutions.

This is confirmed by the statistics of projects deployed in our Amverum Cloud: the technology really is widely used by our users.

About Amverum Cloud

Amverum is a cloud for easy deployment of applications via git push. Built-in CI/CD, backups, and monitoring let you deploy a project with three commands in the IDE without thinking about infrastructure setup. Amverum is simpler than a VPS or a Kubernetes cluster: run git push amverum master, and your bot or site is launched in the cloud.

As you may have guessed from the title, today we will take a detailed look at how to use Pydantic 2 in your projects, and most importantly — why you need it. We will cover the key concepts, features, and changes that have appeared in the new version of the library.

What you will learn by the end of this article:

  • What Pydantic is and its main purpose.

  • The concept of a model in Pydantic.

  • What fields are and how Pydantic's built-in mechanisms help with data validation.

  • Custom field validation (via field_validator) and global validation at the model level (via model_validator).

  • Auto-generated (computed) fields in Pydantic.

  • Model settings via ConfigDict: why they are needed and how to use them effectively.

  • The model inheritance mechanism, which can significantly simplify your code and reduce duplication.

  • How to integrate Pydantic with ORM models (using SQLAlchemy as an example, although this applies to other ORMs as well).

  • How to transform data into convenient formats: dictionaries and JSON strings.

As a result: you will get acquainted with Pydantic version 2 and master its main methods and approaches.

Brief theoretical block

Before moving on to practice, let's get acquainted with the basic concepts and capabilities of Pydantic 2. This block will be theoretical, but extremely important — we will then consolidate all its aspects in practice.

I will try to explain everything in as much detail and "meticulously" as possible, so that there are no questions left.

What is Pydantic?

Pydantic 2 is a Python library for data validation and transformation. It helps developers ensure that input data conforms to established rules and types, and automatically transforms it into the required formats.

Key features of Pydantic:

  • Data validation: checks input data against expected types and constraints.

  • Data transformation: automatically transforms data into the required types and formats.
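To make these two features concrete, here is a minimal sketch (the Point model is purely illustrative): a value that can be safely coerced is converted, while an incompatible one raises a ValidationError.

```python
from pydantic import BaseModel, ValidationError

class Point(BaseModel):
    x: int
    y: int

# Transformation: the numeric string "1" is coerced to the int 1
p = Point(x="1", y=2)
print(p.x, type(p.x))

# Validation: "abc" cannot be turned into an int, so Pydantic raises an error
try:
    Point(x="abc", y=2)
except ValidationError as e:
    print("validation failed:", len(e.errors()), "error(s)")
```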

Models in Pydantic

Models in Pydantic inherit from the BaseModel class. Each model describes a set of fields that represent the structure of the data and the conditions for its validation.

Field descriptions:

  • Typing: Fields in a model are described with types, for example, name: str. This provides basic type validation.

  • Usage Field(): Allows you to annotate fields with additional parameters such as default values, constraints, and other settings.

Basic Model Example:

from pydantic import BaseModel, Field

class User(BaseModel):
    name: str
    email: str = Field(..., alias='email_address')

Field Validation

  1. Minimal Type Validation: Using Python's built-in types (e.g. str, int), you can perform basic field validation.

  2. Using Validators: Pydantic provides validators such as EmailStr for validating email addresses. Using advanced validators requires installing additional dependencies: pydantic[email] or pydantic[all].

Example with validator:

from pydantic import BaseModel, EmailStr

class User(BaseModel):
    name: str
    email: EmailStr

Decorators in Pydantic

Pydantic 2 adds new capabilities for validating and evaluating fields using decorators.

  • @field_validator — replaces the old @validator and allows you to add custom field validation logic. Called when a model is created or modified.

Usage example @field_validator:

from pydantic import BaseModel, field_validator

class User(BaseModel):
    age: int

    @field_validator('age')
    def check_age(cls, value):
        if value < 18:
            raise ValueError('Age must be over 18 years')
        return value
  • @computed_field - a field that is calculated based on other data in the model. It can be used to automatically generate values, as well as for validation.

Example of usage @computed_field:

from pydantic import BaseModel, computed_field

class User(BaseModel):
    name: str
    surname: str

    @computed_field
    def full_name(self) -> str:
        return f"{self.name} {self.surname}"

Working with ORM

Pydantic supports integration with ORM (e.g. SQLAlchemy) for validation and transformation of data retrieved from the database.

  • To configure the model to work with an ORM, set model_config = ConfigDict(from_attributes=True).

Example:

from datetime import date
from pydantic import BaseModel, ConfigDict

class User(BaseModel):
    id: int
    name: str = 'John Doe'
    birthday_date: date

    model_config = ConfigDict(from_attributes=True)
  • To create a Pydantic model from an ORM object, use the model_validate method (the v1 from_orm method is deprecated).

Example:

user = User.model_validate(orm_instance)
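from_attributes works with any object that exposes the right attributes, not just ORM instances, so the mechanism can be sketched without a database. The UserORM class below is a hypothetical stand-in for a real SQLAlchemy model:

```python
from datetime import date
from pydantic import BaseModel, ConfigDict

class User(BaseModel):
    model_config = ConfigDict(from_attributes=True)

    id: int
    name: str
    birthday_date: date

# A plain object standing in for an ORM instance (illustrative only)
class UserORM:
    def __init__(self):
        self.id = 1
        self.name = 'Oleg'
        self.birthday_date = date(1993, 2, 19)

# Fields are read from the object's attributes
user = User.model_validate(UserORM())
print(user)
```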

Methods for working with data

  • dict() / model_dump() — convert the model into a Python dictionary. In version 2, model_dump() replaces the deprecated dict().

Example:

data = user.model_dump()
  • json() / model_dump_json() — convert the model to a JSON string. In the new version, the model_dump_json() method replaces the old json().

Example:

json_data = user.model_dump_json()

Passing data to the model

  1. Named arguments: Model fields can be set directly when creating an instance.

Example:

user = User(name="Oleg", age=30)
  2. Unpacked dictionaries: You can pass field values using dictionary unpacking (**).

Example:

user_data = {"name": "Oleg", "age": 30}
user = User(**user_data)

By now you should have learned that Pydantic 2 is a powerful data tool that supports validation and transformation of various data types, as well as ORM integration and custom validators.

Now let's put it all into practice.

Getting Started with Pydantic 2

To get started, create a new project in your favorite IDE, such as PyCharm. If you don't have Pydantic installed yet, install it using the following command:

pip install -U "pydantic[all]"

What does this command do:

  • -U — updates Pydantic to the latest version if it is already installed, or just installs the latest available version if Pydantic is missing.

  • [all] — this flag adds all sorts of additional modules and validators that might be useful in the project, such as an email address validator and other advanced features.

Now that the installation is complete, we are ready to start practicing Pydantic 2.

Let's describe the first Pydantic model

from datetime import date
from pydantic import BaseModel


class User(BaseModel):
    id: int
    name: str
    birthday_date: date

In this Pydantic model, we have defined three required fields: id, name and birthday_date. These fields will be validated automatically.

Let's now create an object of the User class, passing the required parameters.

oleg = User(id=1, 
            name='Oleg', 
            birthday_date=date(year=1993, month=2, day=19))

Everything is clear here and we don't see any differences from working with a regular class yet. Now we'll get data in the form of a dictionary and a JSON string, and then I'll show you a few tricks.

To transform a model into a Python dictionary (dict), you can use the dict() method or its full analogue model_dump().

To transform into a JSON string, you can use the json() or model_dump_json() methods.

We'll perform the transformation using each of the methods.

to_dict = oleg.model_dump()
to_json = oleg.model_dump_json()

print(to_dict, type(to_dict))
print(to_json, type(to_json))

Output the object type to the console so that the differences are visible.
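If you run the snippet above, the output should look roughly like this (a sketch: model_dump() returns a regular dict, while model_dump_json() returns a compact JSON string):

```python
from datetime import date
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str
    birthday_date: date

oleg = User(id=1, name='Oleg', birthday_date=date(1993, 2, 19))

to_dict = oleg.model_dump()
to_json = oleg.model_dump_json()

print(to_dict, type(to_dict))
# {'id': 1, 'name': 'Oleg', 'birthday_date': datetime.date(1993, 2, 19)} <class 'dict'>
print(to_json, type(to_json))
# {"id":1,"name":"Oleg","birthday_date":"1993-02-19"} <class 'str'>
```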

What happens if we create another user's object and pass the data as follows?

alex = User(id="2",
            name='Alexey',
            birthday_date="1990-11-22")

The first thing that comes to mind: "Of course it's an error!" That seems logical, because we passed the user ID as a string when an integer is expected, and in the birthday_date field we passed a string instead of a date object. But let's check.

alex = User(id="2",
            name='Alexey',
            birthday_date="1990-11-22")

to_dict = alex.model_dump()
to_json = alex.model_dump_json()

print(to_dict, type(to_dict))
print(to_json, type(to_json))

We didn't get an error!

With this simple example, I showed how Pydantic automatically converts data into the format we need. This wonderful feature of the library can be incredibly useful in a variety of situations, and I think you can guess which ones.

Now let's create another user and describe it as follows:

dima = User(id="3",
            name=156,
            birthday_date="1990-11-22")

to_dict = dima.model_dump()
to_json = dima.model_dump_json()

print(to_dict, type(to_dict))
print(to_json, type(to_json))

Logically, everything should work correctly now, right? We've already checked that the string is converted to an integer, and the date passed as a string is converted to a date format. Let's make sure of that.

Error!

Unexpectedly, we got an error. Let's figure out why.

Why did the error occur?

Pydantic performs data validation and can also automatically convert types in some cases. However, such conversions only occur for data that can be safely and unambiguously cast to the expected type. For example:

  • If the model expects an int, and we pass the string "123", Pydantic can convert it to an int.

  • If the model expects a str, but you pass an int, as in our case, Pydantic does not try to perform automatic typecasting, since it is not always obvious that an integer can be correctly interpreted as a string.

There are two ways out of this situation:

  1. Try to pass correct data for ambiguous cases

  2. Use field validators that will perform data transformation inside the model before the main type check and convert it to a string.

Field validator (field_validator) in Pydantic

In previous versions of Pydantic, this decorator was called validator. In version 2 its name changed to field_validator, which better reflects its purpose.

Purpose of the decorator

The field_validator decorator is used to check the correctness of the fields of the Pydantic model. In addition to validation, it can be used to transform data before saving it to the model.

Importing the decorator

First, we import the necessary elements from Pydantic:

from pydantic import BaseModel, field_validator
from datetime import date

Usage example

Let's create a User model, adding a validator for the name field:

class User(BaseModel):
    id: int
    name: str
    birthday_date: date

    @field_validator('name', mode='before')
    def validate_name(cls, v):
        return str(v)

The field_validator decorator always takes one mandatory argument - the name of the field to validate. The second argument, which is preferable to specify, is mode.

It is important to note the use of the mode='before' parameter. This tells Pydantic to perform validation and data transformation before the model instance is created, not after. Another option is mode='after'.

The method itself receives the field value (v) and then validates it. The above is a simple example, but we will complicate it very soon.

Now our validator for the name field will automatically convert any values ​​passed to a string.

Important. Most likely, your IDE will complain about cls in the validator method signature. This is not an error, but to avoid the annoying warning, you can describe this type of decorator as follows:

@field_validator('name', mode='before')
@classmethod
def validate_name(cls, v):
    return str(v)

Let's check the model

Let's try to create an instance of the model with the following data:

user_data = {'id': 3, 'name': '156', 'birthday_date': '1990-11-22'}
user = User(**user_data)
print(user.model_dump())

Result:

{'id': 3, 'name': '156', 'birthday_date': datetime.date(1990, 11, 22)}

There are no errors: the data was successfully converted. However, there is one problem here - our validator is used only to convert the data, not to validate it.

Unexpected result

Let's say we pass the following data:

dima = User(
    id="3",
    name=("Kolya", True, False, 0, 19933),
    birthday_date="1990-11-22"
)

Result:

{'id': 3, 'name': "('Kolya', True, False, 0, 19933)", 'birthday_date': datetime.date(1990, 11, 22)}

There is no error, but something is clearly wrong - instead of a string in the name field, we got a tuple that was simply converted to a string. Obviously, this situation is unacceptable.

Validation Fix

To avoid this situation, we can add strict data type checking to the validator:

@field_validator('name', mode='before')
def validate_name(cls, v):
    if isinstance(v, int):
        return str(v)
    elif isinstance(v, str):
        return v
    else:
        raise ValueError("The name must be a string or a number")

Now our validator checks whether the passed value is a string or a number. If the value is not suitable, a ValueError exception is raised.

Checking the fix

Let's try passing the incorrect data again:

dima = User(
    id="3",
    name=("Kolya", True, False, 0, 19933),
    birthday_date="1990-11-22"
)

Result:

An error occurred: 1 validation error for User
name
  Value error, The name must be a string or a number [type=value_error, input_value=('Kolya', True, False, 0, 19933), input_type=tuple]
    For further information visit https://errors.pydantic.dev/2.9/v/value_error

Now the validator worked correctly - it stopped the object creation and returned an error indicating that the passed value was incorrect.

Thus, field_validator can be used not only for data transformation, but also for strict validation of its correctness. In this example, we implemented a check that avoids invalid data in the model, and returned an appropriate error if the conditions are not met.

Model validator (model_validator) in Pydantic

In older versions of Pydantic, this decorator was called root_validator. Its main purpose is to validate the model as a whole, after all fields have already been individually validated. This allows complex checks to be performed that depend on several model fields at once.

Key features of the @model_validator decorator:

  • Executed after individual fields have been validated.

  • Has access to all model fields at once.

  • Can change field values or the entire model.

  • Used for complex validations involving multiple fields.

Example of using model_validator

Let's extend our User class by adding a model validator to check the user's age and set a default name:

from pydantic import BaseModel, field_validator, model_validator
from datetime import date

class User(BaseModel):
    id: int
    name: str
    birthday_date: date

    @field_validator('name', mode='before')
    def validate_name(cls, v):
        if isinstance(v, int):
            return str(v)
        elif isinstance(v, str):
            return v
        else:
            raise ValueError("The name must be a string or a number")

    @model_validator(mode='after')
    def check_age(self):
        today = date.today()
        age = today.year - self.birthday_date.year - (
            (today.month, today.day) < (self.birthday_date.month, self.birthday_date.day))

        if age < 18:
            raise ValueError("User must be over 18 years old")
        if age > 120:
            raise ValueError("Age cannot exceed 120 years")
        return self

    @model_validator(mode='after')
    def set_default_name(self):
        if self.name.strip() == '':
            self.name = f"User_{self.id}"
        return self


In this example:

  • The check_age method checks that the user's age is greater than 18 but less than 120. This check requires access to the birthday_date field and the current date, so it is implemented as a model validator.

  • The set_default_name method sets the default name if the name field is empty. This validator uses multiple fields (name and id), so it is also implemented at the model level.

Both validators use the after mode, which means that they are executed after the individual fields have been validated.

Example usage:

try:
    user = User(id=1, name="John", birthday_date=date(2000, 1, 1))
    print(user)
except ValueError as e:
    print(f"Error: {e}")

try:
    user = User(id=2, name="", birthday_date=date(2010, 1, 1))
    print(user)
except ValueError as e:
    print(f"Error: {e}")

try:
    user = User(id=3, name="Alice", birthday_date=date(1900, 1, 1))
    print(user)
except ValueError as e:
    print(f"Error: {e}")



This example demonstrates how @model_validator helps to perform complex checks and modify the model after validating individual fields.

Computed fields (computed_field)

The @computed_field decorator allows you to create fields that are calculated "on the fly" when accessed. This is useful when you need to automatically derive values based on other model fields.

An example with computed fields:

Let's add the full_name and age computed fields to our User class:

from pydantic import BaseModel, computed_field
from datetime import date
from dateutil.relativedelta import relativedelta

class User(BaseModel):
    id: int
    name: str
    surname: str
    birthday_date: date

    @computed_field
    def full_name(self) -> str:
        return f"{self.name} {self.surname}"

    @computed_field
    def age(self) -> str:
        today = date.today()
        delta = relativedelta(today, self.birthday_date)
        return f"{delta.years} years, {delta.months} months and {delta.days} days"

  • The full_name field is calculated by concatenating the first and last names.

  • The age field calculates the user's age in years, months, and days using the relativedelta class from the dateutil library.

An example of using calculated fields:

alex = User(id=1, name="Alexey", surname="Yakovenko", birthday_date="1993-02-19")
print(alex.model_dump())

Result:

{'id': 1, 'name': 'Alexey', 'surname': 'Yakovenko', 'birthday_date': datetime.date(1993, 2, 19), 'full_name': 'Alexey Yakovenko', 'age': '31 years, 7 months and 28 days'}

As you can see, computed fields work automatically, providing convenient access to derived values.

Decorators @model_validator, @field_validator and @computed_field allow you to flexibly and efficiently manage data validation in Pydantic models, as well as add computed fields. Model-level validation is useful for complex checks, while computed fields make it easier to work with derived values ​​without requiring additional logic in the code.

Computed fields can be combined with model validators: we create a computed field, and then a @model_validator can check that everything is correct. Alternatively, the check can be done inside the computed field itself.

To consolidate this block, practice using the decorators yourself: @model_validator, @field_validator and @computed_field.
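As a sketch of combining the two (the model and field names are illustrative): a computed age field plus a model validator that checks it after the model is built.

```python
from datetime import date
from pydantic import BaseModel, computed_field, model_validator

class User(BaseModel):
    name: str
    birthday_date: date

    @computed_field
    def age(self) -> int:
        today = date.today()
        return today.year - self.birthday_date.year - (
            (today.month, today.day) < (self.birthday_date.month, self.birthday_date.day))

    @model_validator(mode='after')
    def check_age(self):
        # The computed field is already available here, so it can be validated too
        if self.age < 0:
            raise ValueError('birthday_date cannot be in the future')
        return self

user = User(name='Oleg', birthday_date=date(1993, 2, 19))
print(user.age)
```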

Configuring fields in Pydantic 2: using the Field function

Pydantic 2 offers developers a convenient and powerful way to work with data model fields, and one of the key tools for this is the Field function. It allows you to detail the behavior of fields: set default values, add metadata, configure validation, and even document the model. In this block, I will talk about how to use Field to configure models in Pydantic 2.

Import the necessary modules:

from pydantic import BaseModel, Field

What is a Field in Pydantic 2?

The Field function allows you to add metadata and settings to model fields that Pydantic uses for validation, serialization, and documentation. Here are the main parameters that can be passed to Field:

  • default: sets the default value for the field.

  • default_factory: a function that returns the default value.

  • alias: an alternative name for the field for serialization and deserialization.

  • title: the title of the field for documentation.

  • description: the description of the field for documentation.

  • exclude: excludes the field from serialization.

  • repr: determines whether the field will be included in the string representation of the model.

A simple example of using Field

Let's say we have a User model where we want to set default values and add some metadata:

class User(BaseModel):
    id: int = Field(default=1, description="Unique user identifier")
    name: str = Field(default="John Doe", title="Username", description="Full name")
    role: str = Field(default="user", alias="user_role", description="User role in the system")


In this example:

  • The id field has a default value of 1 and a description for documentation.

  • The name field describes the user name with a title for documentation.

  • The role field has an alias user_role that will be used when serializing and deserializing data.

Field Validation

The Field function also allows you to set various restrictions on values. For example, you can define minimum and maximum values for numeric fields, or limit the length of strings.

Validation example via Field

class Product(BaseModel):
    price: float = Field(gt=0, description="The price must be greater than zero")
    name: str = Field(min_length=2, max_length=50, description="Product name must be between 2 and 50 characters")


Here:

  • The price field must be greater than zero.

  • The name field has string length restrictions: minimum 2 characters, maximum 50.

Here is a detailed description of commonly used built-in validators:

1. gt, ge, lt, le — for numeric restrictions

These validators are used to restrict numeric values (integers, floats, and other types that support arithmetic).

  • gt (greater than): Checks that the value is greater than the specified number.

Example: Field(gt=0) — the value must be greater than zero.

  • ge (greater than or equal): Checks that the value is greater than or equal to the specified number.

Example: Field(ge=1) — the value must be at least one.

  • lt (less than): Checks that the value is less than the specified number.
    Example: Field(lt=100) - the value must be less than one hundred.

  • le (less than or equal): Checks that the value is less than or equal to the specified number.

Example: Field(le=10) - the value must be no greater than ten.

These validators are useful for specifying ranges of values, such as checking age, price, rating, and any other numeric data that has upper and lower bounds.

Example with numeric constraints:

class Product(BaseModel):
    price: float = Field(gt=0, le=10000, description="The price must be positive and not exceed 10,000")
    rating: int = Field(ge=1, le=5, description="Rating must be from 1 to 5")

Where:

  • The price field must be greater than 0 and less than 10,000.

  • The rating field must be between 1 and 5.

2. max_length, min_length — for string fields

These validators are used to limit the length of strings, which is important for validating text data such as usernames, descriptions, and other fields.

  • min_length: Specifies the minimum number of characters that a string must contain.

Example: Field(min_length=3) — the string must be at least 3 characters long.

  • max_length: Specifies the maximum number of characters a string can contain.

Example: Field(max_length=100) — the string must contain no more than 100 characters.

An example of using string length validators:

class User(BaseModel):
    username: str = Field(min_length=3, max_length=20, description="Username must be between 3 and 20 characters long")
    bio: str = Field(max_length=300, description="Profile description must not exceed 300 characters")

Here:

  • The username field must be between 3 and 20 characters long.

  • The bio field is limited to 300 characters.

3. pattern — for regular expression validation

This validator allows you to check string values for compliance with a regular expression. Regular expressions allow you to flexibly describe acceptable string formats: for example, to validate email addresses, phone numbers, date formats, etc. Note that in Pydantic 2 this parameter is called pattern (the old v1 name regex was removed).

  • pattern: Specifies a regular expression that the string must match. Example: Field(pattern=r"[^@]+@[^@]+\.[^@]+") — checks that the string looks like a valid email address.

Example of using pattern:

class User(BaseModel):
    email: str = Field(pattern=r"[^@]+@[^@]+\.[^@]+", description="The email must be in the correct format")
    phone_number: str = Field(pattern=r"^\+\d{1,3}\s?\d{4,14}$", description="The phone number must be in the format +123456789")

Here:

  • The email field must match the email address pattern (characters before @, domain name, dot and domain).

  • The phone_number field must match the international phone number format.

Dynamic default values

Sometimes you need to generate values for fields dynamically. This is done with the default_factory parameter, which takes a function that produces the default value.

Example with default_factory

from uuid import uuid4


class Item(BaseModel):
    id: str = Field(default_factory=lambda: uuid4().hex)

Here, each id field will automatically receive a unique identifier when a new Item model object is created.
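A quick check (a sketch) confirms that the factory runs once per instance, so every object gets its own identifier:

```python
from uuid import uuid4
from pydantic import BaseModel, Field

class Item(BaseModel):
    id: str = Field(default_factory=lambda: uuid4().hex)

a = Item()
b = Item()
print(a.id != b.id)  # True: each instance gets a fresh uuid
```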

Using aliases

For compatibility with external APIs or for code readability, you can define aliases for fields. An alias is an alternative name for a field that will be used during serialization or deserialization.

Example with aliases

class User(BaseModel):
    username: str = Field(alias="user_name")

When you specify an alias for a field, the field will have one name in your Python code, but when you send or read data as JSON, it will use the name specified as the alias.

In this example, the username field will be serialized and deserialized as user_name.
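A sketch of how the alias behaves on input and output: by default the alias is used when populating the model and, with by_alias=True, when dumping it.

```python
from pydantic import BaseModel, Field

class User(BaseModel):
    username: str = Field(alias="user_name")

# Input uses the alias; attribute access uses the Python name
user = User(user_name="oleg")
print(user.username)                    # oleg
print(user.model_dump())                # {'username': 'oleg'}
print(user.model_dump(by_alias=True))   # {'user_name': 'oleg'}
```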

Excluding fields from serialization

Sometimes you want to hide certain fields when serializing data — for example, to avoid sending sensitive information like passwords.

Example of excluding fields

class User(BaseModel):
    password: str = Field(exclude=True)

In this example, the password field will be excluded from the serialized representation of the model object.

This means that when serializing the model object (for example, when converting it to JSON or a dictionary), the password field will not be included in the result. This way, you can hide sensitive information such as passwords from external systems or clients.
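A sketch of what this looks like in practice: the field is still present on the object, but both dump methods omit it.

```python
from pydantic import BaseModel, Field

class User(BaseModel):
    username: str
    password: str = Field(exclude=True)

user = User(username="oleg", password="secret")
# password is accessible on the object but absent from serialized output
print(user.model_dump())       # {'username': 'oleg'}
print(user.model_dump_json())  # {"username":"oleg"}
```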

Customizing the string representation of the model

You can control which fields will be displayed in the string representation of the model using the repr parameter.

Example of customizing the representation

class Config(BaseModel):
    debug_mode: bool = Field(repr=False)

The string representation of a model object is what is returned when calling repr() or str(). For example, when you want to see the state of an object or debug code, you can print its string representation.

The repr=False parameter allows you to exclude certain fields from this representation, so that they are not displayed when printing the object, even if they are present in the model.

In our example, the debug_mode field will not be displayed in the string representation of the model, which can be useful for hiding technical information.
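A small sketch of this behavior (the app_name field is added here purely for illustration):

```python
from pydantic import BaseModel, Field

class Config(BaseModel):
    app_name: str
    debug_mode: bool = Field(repr=False)

cfg = Config(app_name="demo", debug_mode=True)
print(repr(cfg))  # debug_mode does not appear in the output
```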

Advanced capabilities via Annotated

Now that we have covered the basic capabilities of Field, we can move on to using annotations via Annotated. This method allows you to add metadata and validation more flexibly and precisely.

Annotated example

from typing_extensions import Annotated

class User(BaseModel):
    id: Annotated[int, Field(gt=0)]
    name: Annotated[str, Field(min_length=2, max_length=50)]
    email: Annotated[str, Field(pattern=r"[^@]+@[^@]+\.[^@]+")]
    role: Annotated[str, Field(default="user")]

Where:

  • The id field must be greater than zero.

  • The name field is limited to 2-50 characters.

  • The email field must match a regular expression that validates the email format.

  • The role field has a default value of "user".

The currently preferred approach is to declare field metadata via Annotated.

The Field feature in Pydantic 2 provides developers with flexible tools for customizing model fields. It allows fine-tuning validation, setting default values, using aliases, and adding metadata for documentation. Using Annotated makes this process even more powerful and convenient.

These features help create structured and secure data models, providing easy integration with external APIs and clear validation of incoming data.

Model Configuration in Pydantic 2

In Pydantic 2, model configuration is specified via ConfigDict assigned to the model_config attribute, rather than the old nested Config class. This is a major change that makes configuration simpler and more flexible.

What it looks like now:

Instead of writing:

class MyModel(BaseModel):
    class Config:
        from_attributes = True

now we use ConfigDict:

from pydantic import BaseModel, ConfigDict

class MyModel(BaseModel):
    model_config = ConfigDict(from_attributes=True)

Main ConfigDict options

  • from_attributes=True - allows you to create a model object directly from Python object attributes (e.g. when model fields match attributes of another object). Most often, this option is used to convert ORM models to Pydantic models.

  • str_to_lower, str_to_upper - convert all string fields to lower or upper case

  • str_strip_whitespace - whether to remove leading and trailing whitespace from str fields (similar to str.strip)

  • str_min_length, str_max_length - set the minimum and maximum length for all string fields

  • use_enum_values - whether to populate the model with the value of each enum member instead of the enum member itself. This is often needed when working with ORM models where columns are defined as enums (ENUM).
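A few of these options can be seen in one short sketch (the `Account` model and `Role` enum are invented for the example):

```python
from enum import Enum

from pydantic import BaseModel, ConfigDict

class Role(str, Enum):
    ADMIN = "admin"
    USER = "user"

class Account(BaseModel):
    model_config = ConfigDict(
        str_strip_whitespace=True,  # "  ALEX  " -> "ALEX"
        str_to_lower=True,          # "ALEX" -> "alex"
        use_enum_values=True,       # Role.ADMIN is stored as "admin"
    )

    name: str
    role: Role

account = Account(name="  ALEX  ", role=Role.ADMIN)
print(account.name)  # alex
print(account.role)  # admin (a plain string, not the Role member)
```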

Why is this important?

Because it makes model configuration more explicit, convenient, and flexible. Using ConfigDict allows you to define model parameters directly through a dictionary, avoiding the need to create nested classes, which simplifies the configuration process and makes it more intuitive. This is especially useful when working with large and complex data models that require a high degree of customization.

Now you can easily define and change model parameters without diving into cumbersome hierarchical constructs. This contributes to better readability and maintainability of the code, speeding up the development process.

For a more detailed study of all ConfigDict parameters and methods, I recommend referring to the official documentation where you will find a full description of the capabilities and examples of use.

Inheritance in Pydantic 2

Inheritance in Pydantic allows you to create models that can override or extend attributes and methods of their parent models. This makes your code more flexible and helps avoid duplication.

An example of how inheritance works in Pydantic:

from pydantic import BaseModel


class ParentModel(BaseModel):
    name: str
    age: int


class ChildModel(ParentModel):
    school: str


parent = ParentModel(name="Alex", age=40)
child = ChildModel(name="Bob", age=12, school="Greenwood High")

In this example, ChildModel inherits the name and age fields from ParentModel and adds its own school field.

Advantages:

  • Code reuse: Common fields and methods can be defined in the base model.

  • Extensibility: Child models can add new fields or methods.

In Pydantic, when you create a base class with settings, those settings and attributes can be inherited by child classes. This avoids code duplication and makes it easier to maintain consistency.

For example, you can define common parameters in the base model and then use inheritance to create specialized models that automatically receive all the properties and settings of the base class, but can add their own unique attributes. This approach makes the code cleaner and more manageable.
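A small sketch of this pattern (the `AppBaseModel` name and fields are invented for the example), where shared configuration is declared once on a base model and inherited everywhere:

```python
from pydantic import BaseModel, ConfigDict

# Shared settings live on the base model once...
class AppBaseModel(BaseModel):
    model_config = ConfigDict(from_attributes=True, str_strip_whitespace=True)

class UserSchema(AppBaseModel):
    name: str

class ProductSchema(AppBaseModel):
    title: str
    price: float

# ...and every child model inherits them automatically
user = UserSchema(name="  Alex  ")
product = ProductSchema(title=" Laptop ", price=999.0)
print(user.name, "|", product.title)  # Alex | Laptop
```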

Pydantic 2 with ORM using SQLAlchemy as an example

If you follow my publications on Habr, you know about the series of articles dedicated to the SQLAlchemy library. In the previous article, "Asynchronous SQLAlchemy 2: Step-by-Step Guide to Session Management, Adding and Retrieving Data with Pydantic", we looked at the integration of SQLAlchemy with Pydantic in detail.

Today I will briefly talk about the key concepts that explain why SQLAlchemy needs Pydantic and give a practical example.

Why does SQLAlchemy need Pydantic?

When working with SQLAlchemy in ORM style, we encounter one significant inconvenience: data is returned as table model objects, which is not always convenient for further processing. Developers usually prefer to work with data in JSON or Python dictionaries. This is where Pydantic comes to the rescue.

The principle of SQLAlchemy and Pydantic integration

The process of SQLAlchemy and Pydantic integration can be described in the following steps:

  1. Describing the table model in SQLAlchemy

  2. Creating a Pydantic model to work with the received data

  3. Querying data from the table via SQLAlchemy

  4. Converting a SQLAlchemy object to a Pydantic object

  5. Using the model_dump or model_dump_json methods to obtain data in the required format

Practical example

Let's look at an example of SQLAlchemy and Pydantic integration:

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker
from pydantic import BaseModel, ConfigDict


# Step 1: Describing the SQLAlchemy Model
Base = declarative_base()


class UserORM(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    email = Column(String)


# Step 2: Create a Pydantic Model
class UserPydantic(BaseModel):
    id: int
    name: str
    email: str

    model_config = ConfigDict(from_attributes=True)


# Setting up a database and session
engine = create_engine("sqlite:///example.db")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)


# Step 3: Request data
def get_user(user_id: int):
    with Session() as session:
        # session.get() fetches a row by its primary key
        user = session.get(UserORM, user_id)

        # Step 4: Convert SQLAlchemy Object to Pydantic
        if user:
            user_pydantic = UserPydantic.from_orm(user)

            # Step 5: Getting the data in the right format
            return user_pydantic.dict()
        return None


# Example of use
user_data = get_user(1)
print(user_data)

Here we use the from_orm() method to create a Pydantic model from an ORM object and dict() to convert the Pydantic model to a dictionary.

Modern approach

With new versions of Pydantic (especially 2.0+), the following style is recommended:

def get_user(user_id: int):
    with Session() as session:
        # session.get() fetches a row by its primary key
        user = session.get(UserORM, user_id)

        # Step 4: Convert SQLAlchemy Object to Pydantic
        if user:
            user_pydantic = UserPydantic.model_validate(user)

            # Step 5: Getting the data in the right format
            return user_pydantic.model_dump()
        return None

Key changes and rationale

  1. from_orm() → model_validate()
  • from_orm() is now a deprecated alias for model_validate() and emits a deprecation warning.

  • model_validate() is more generic and can be used with more than just ORM objects.

  • It is recommended to use model_validate() for greater flexibility and compliance with modern Pydantic standards.

  2. dict() → model_dump()
  • dict() has become a deprecated alias for model_dump().

  • model_dump() provides more customizable output.

  • Using model_dump() makes code more explicit and compliant with the new Pydantic API.

While both code options are functional, switching to the new methods (model_validate() and model_dump()) is recommended for the following reasons:

  1. Compliance with modern Pydantic standards and recommendations.

  2. Improved code readability and explicit intent.

  3. Ability to use additional options available in the new methods.

  4. Preparing code for future library updates.
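To illustrate the extra control model_dump() offers, here is a small sketch (the `User` model and its fields are invented for the example):

```python
from pydantic import BaseModel, Field

class User(BaseModel):
    id: int
    name: str = Field(alias="userName")
    password: str

user = User(id=1, userName="Alex", password="secret")

print(user.model_dump(exclude={"password"}))      # {'id': 1, 'name': 'Alex'}
print(user.model_dump(by_alias=True))             # keys use the alias: 'userName'
print(user.model_dump_json(exclude={"password"})) # straight to a JSON string
```

Options like `exclude`, `include`, `by_alias`, and `mode` make it easy to shape the output per call site instead of per model.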

Using from_attributes in model_validate

In Pydantic 2.0+, the model_validate method has become more flexible and convenient. You can directly specify from_attributes=True when calling the method:

user_pydantic = UserPydantic.model_validate(user, from_attributes=True)

This allows you to dynamically control whether object attributes will be used to create a model, without having to change the configuration of the model itself.

Other useful attributes of model_validate

In addition to from_attributes, the model_validate method has several other useful attributes:

  1. strict: bool | None
  • When set to True, enforces strict type validation: no type coercion is performed, so, for example, the string "25" is rejected for an int field.

  • Example: model_validate(data, strict=True)

  2. context: Any | None
  • Allows you to pass additional context to validators.

  • Example: model_validate(data, context={'user_id': 123})

  3. from_attributes: bool | None
  • As we have already discussed, allows you to extract data from object attributes.

Usage example

from pydantic import BaseModel


class User:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age


class UserModel(BaseModel):
    name: str
    age: int


user = User("Alice", 30)


# Using from_attributes
user_model = UserModel.model_validate(user, from_attributes=True)


# Using strict and context.
# Note: in strict mode the string "25" would be rejected for an int
# field, so age must be passed as a real int here.
data = {"name": "Bob", "age": 25}
user_model = UserModel.model_validate(
    data,
    strict=True,
    context={"source": "external_api"},
    from_attributes=False
)

These attributes provide flexibility in data validation, allowing you to tailor the process to the specific requirements of your application.

Using from_attributes directly in model_validate is especially convenient when you need to dynamically switch between different data sources without changing the model configuration.
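To show how a validator can actually read the passed context, here is a hedged sketch (the `Comment` model and the `banned_words` context key are invented for illustration):

```python
from pydantic import BaseModel, ValidationInfo, field_validator

class Comment(BaseModel):
    text: str

    @field_validator("text")
    @classmethod
    def check_banned_words(cls, value: str, info: ValidationInfo) -> str:
        # info.context is whatever was passed to model_validate(..., context=...)
        banned = (info.context or {}).get("banned_words", [])
        for word in banned:
            if word in value:
                raise ValueError(f"contains a banned word: {word}")
        return value

ok = Comment.model_validate({"text": "hello"}, context={"banned_words": ["spam"]})
print(ok.text)  # hello
```

If no context is passed, `info.context` is None, which the validator above handles gracefully.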

The concept of reverse validation

The idea is to use Pydantic models not only for output validation, but also for structuring the input parameters of database queries. This ensures type safety and ease of use when forming filters for queries.

Example implementation

1. Defining a Pydantic model for filters

from pydantic import BaseModel, ConfigDict


class TelegramIDModel(BaseModel):
    telegram_id: int

    model_config = ConfigDict(from_attributes=True)

2. Method for searching in the database

from pydantic import BaseModel
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select
from sqlalchemy.exc import SQLAlchemyError


class BaseRepository:
    model = None  # set on concrete subclasses, e.g. model = UserORM

    @classmethod
    async def find_one_or_none(cls, session: AsyncSession, filters: BaseModel):
        # Only fields explicitly set on the filter model become query filters
        filter_dict = filters.model_dump(exclude_unset=True)
        try:
            query = select(cls.model).filter_by(**filter_dict)
            result = await session.execute(query)
            return result.scalar_one_or_none()
        except SQLAlchemyError:
            # Log/handle the error as appropriate, then re-raise
            raise

3. Usage

async def get_user_by_telegram_id(session: AsyncSession, telegram_id: int):
    filters = TelegramIDModel(telegram_id=telegram_id)
    user = await UserRepository.find_one_or_none(session, filters)
    return user

By the way, in my universal template for creating Telegram bots based on Aiogram 3 and SQLAlchemy, I used a more complex example. If you are interested in the source code of this template and other exclusive content that I do not publish on Habr, you can get it for free in my Telegram channel "Easy way to Python".

Advantages of this approach

  1. Type safety: Pydantic provides input data type validation.

  2. Flexibility: It is easy to create different filter models for different queries.

  3. Readability: The code becomes more understandable and structured.

  4. Reusability: Filter models can be used in different parts of the application.

This approach demonstrates how Pydantic can be effectively used not only for output data validation, but also for structuring query input parameters. This creates an additional level of abstraction between the business logic and the data access layer, which improves the overall architecture of the application.

This method is especially useful in large projects where consistency and type safety when working with a database must be ensured.

Practical application of Pydantic in various areas

Pydantic is a powerful tool that finds wide application in various areas of development. Here are some key areas where Pydantic can be especially useful:

1. Web development

Pydantic is often used in web frameworks such as FastAPI and Flask to validate incoming data from users. This allows developers to easily process JSON requests and ensure that the data matches the expected types.

On Habr, I have published more than ten long articles about developing your own API using FastAPI. If this topic interests you, take a look at my profile, where you will find many examples of using FastAPI together with Pydantic.

2. API and microservices

When creating RESTful APIs, Pydantic helps define data schemas and automatically generate documentation. This simplifies integration between different services and ensures data consistency.

3. Configuration processing

Pydantic can be used to work with configuration files (for example, JSON or YAML). This allows you to easily load and validate application settings, minimizing the likelihood of errors.
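As a sketch of this idea (the `AppConfig` model and its fields are invented for the example), Pydantic can parse and validate a JSON config in one step via model_validate_json; for full-blown settings management the separate pydantic-settings package is commonly used:

```python
from pydantic import BaseModel

class AppConfig(BaseModel):
    host: str = "127.0.0.1"
    port: int = 8000
    debug: bool = False

raw = '{"host": "0.0.0.0", "port": "8080"}'  # note: port arrives as a string

config = AppConfig.model_validate_json(raw)
print(config.port)  # coerced to int; debug keeps its default
```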

4. Data processing

In projects related to data analysis or machine learning, Pydantic helps validate input data and ensure that it meets the necessary requirements before processing it.

5. Testing

Pydantic can be useful when writing tests, allowing you to create dummy data with a guarantee of its correctness. This simplifies the testing process and increases its reliability.

Conclusion

Friends, today we have taken an exciting journey into the world of Pydantic 2, and I hope that this article has become a useful and informative guide for you. Let's briefly summarize:

  1. We have covered the basics of Pydantic and its importance in the Python ecosystem.

  2. We have studied the key concepts of Pydantic 2: models, fields, validation, ConfigDict and inheritance.

  3. We have looked at the integration of Pydantic with ORM, in particular with SQLAlchemy.

  4. We have discussed the practical application of Pydantic in various areas of development.

  5. Noted the performance and functionality improvements of Pydantic 2.

Remember that to fully master Pydantic 2, you need to practice. Experiment with the code and integrate Pydantic into your projects.

If you liked the article, don't forget to like and leave a comment. Your opinion is very important to me.

For even more exclusive content, join my Telegram channel "Easy Path to Python". And if you need to deploy a project on a server and deliver updates to it with three commands in the IDE, register in Amverum Cloud, and get $1 for testing the functionality.

Thank you for your time! Good luck with your projects, and may Pydantic 2 become your reliable assistant in the world of Python development!
