How to create an index for a MongoDB collection using Python?
Creating indexes in MongoDB improves query performance, especially on large datasets. PyMongo, the official MongoDB driver for Python, provides the create_index() method to define indexes on specific fields. MongoDB automatically indexes the _id field of every collection, but queries that filter or sort on other fields need custom indexes to avoid scanning every document.
Syntax
collection.create_index([(field, direction)], **options)
Parameters:
- [(field, direction)]: A list of tuples, each containing a field name and a direction (1 for ascending, -1 for descending). For a single ascending index, a plain field name string is also accepted.
- **options: (Optional) Additional keyword options, such as unique=True to make the index unique or name="custom_index_name" to give the index a custom name.
Returns: This method returns the name of the created index as a string.
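The default names shown in the example outputs in this article (such as name_1) appear to follow a simple pattern: each field and its direction, joined with underscores. The sketch below reproduces that pattern in plain Python so the return value of create_index() can be predicted without a database. default_index_name is a hypothetical helper written for this illustration, not a PyMongo function.

```python
def default_index_name(keys):
    """Build the name an index receives when no name= option is passed.

    keys is a list of (field, direction) tuples, as passed to create_index().
    Hypothetical helper for illustration; it only mirrors the naming pattern.
    """
    return "_".join(f"{field}_{direction}" for field, direction in keys)


single = default_index_name([("name", 1)])
compound = default_index_name([("name", 1), ("city", -1)])
print(single)    # name_1
print(compound)  # name_1_city_-1
```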
Examples
Example 1: Create a single field index
from pymongo import MongoClient
c = MongoClient("mongodb://localhost:27017/")
db = c["mydb"]
col = db["students"]
col.drop()  # start from an empty collection (deletes any existing data in it)
col.insert_many([
{"name": "Ava", "age": 22},
{"name": "Liam", "age": 24},
{"name": "Emma", "age": 20}
])
idx_name = col.create_index([("name", 1)])
print(idx_name)
Output
name_1
Explanation: create_index([("name", 1)]) creates an ascending index on the name field to speed up queries and sorting and returns "name_1", the default name based on the field and sort order.
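To confirm that an index exists, the collection can be asked for its index metadata. The sketch below assumes only that the collection object offers PyMongo's index_information() method, which returns a dictionary keyed by index name. list_index_names is a hypothetical helper, and it takes the collection as an argument so it can be applied to any collection object.

```python
def list_index_names(collection):
    """Return the names of all indexes on a collection, sorted alphabetically.

    Relies on collection.index_information(), which maps each index name
    to a dict describing it (for example its "key" list).
    """
    return sorted(collection.index_information().keys())
```

Called with a collection that has an index on name, the result would be expected to include both _id_ (the automatic index) and name_1.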
Example 2: Create a compound index (multiple fields)
from pymongo import MongoClient
c = MongoClient("mongodb://localhost:27017/")
db = c["mydb"]
col = db["students"]
col.drop()  # start from an empty collection (deletes any existing data in it)
col.insert_many([
{"name": "Ava", "city": "New York"},
{"name": "Liam", "city": "Chicago"},
{"name": "Emma", "city": "Boston"}
])
idx_name = col.create_index([("name", 1), ("city", -1)])
print(idx_name)
Output
name_1_city_-1
Explanation: This creates a compound index on name (ascending) and city (descending). It can serve queries that filter or sort on name alone or on name and city together, and create_index() returns the default index name "name_1_city_-1".
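Because the two fields are indexed in opposite directions, the order requested in a sort matters. MongoDB's documentation describes a compound index as usable for a sort when the sort keys match the index key pattern or its exact inverse, starting from the first indexed field. The plain-Python sketch below encodes that rule in simplified form. index_supports_sort is a hypothetical helper, and it ignores cases the real query planner handles, such as equality filters on leading fields.

```python
def index_supports_sort(index_keys, sort_keys):
    """Simplified check: can a sort be served by walking this index?

    Both arguments are lists of (field, direction) tuples. The sort must be
    a prefix of the index key pattern, or of that pattern with every
    direction flipped (the index walked backwards).
    """
    if not sort_keys or len(sort_keys) > len(index_keys):
        return False
    prefix = index_keys[:len(sort_keys)]
    inverse = [(field, -direction) for field, direction in prefix]
    return sort_keys == prefix or sort_keys == inverse


index = [("name", 1), ("city", -1)]
print(index_supports_sort(index, [("name", 1), ("city", -1)]))  # True
print(index_supports_sort(index, [("name", -1), ("city", 1)]))  # True
print(index_supports_sort(index, [("name", 1), ("city", 1)]))   # False
print(index_supports_sort(index, [("name", 1)]))                # True
```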
Example 3: Create a unique index
from pymongo import MongoClient
c = MongoClient("mongodb://localhost:27017/")
db = c["mydb"]
col = db["students"]
col.drop()  # start from an empty collection (deletes any existing data in it)
# Placeholder addresses; the values must be distinct or the unique index cannot be built
col.insert_many([
{"email": "ava@example.com"},
{"email": "liam@example.com"},
{"email": "emma@example.com"}
])
idx_name = col.create_index([("email", 1)], unique=True)
print(idx_name)
Output
email_1
Explanation: This creates an ascending unique index on the email field. Once the index exists, inserting a document whose email matches an existing value raises pymongo.errors.DuplicateKeyError.