Shrijith Venkatramana

Posted on Jun 24

MeiliSearch Indexing Best Practices

#programming #beginners #productivity #devops

Hi there! I'm Shrijith Venkatramana, founder of Hexmos. Right now, I’m building LiveAPI, a first-of-its-kind tool that automatically indexes API endpoints across all your repositories. LiveAPI helps you discover, understand, and use APIs in large tech infrastructures with ease.

MeiliSearch is a fast, open-source search engine that’s easy to set up and delivers instant results. But to make it shine, you need to nail indexing. Indexing is how MeiliSearch organizes your data to make searches lightning-fast. Get it wrong, and your search feels sluggish or returns irrelevant results. This guide dives into practical, developer-friendly best practices for MeiliSearch indexing, complete with examples and tips to make your search experience awesome.

Let’s break it down into actionable steps.

Understand Your Data Before Indexing

Before you start throwing data at MeiliSearch, take a moment to analyze it. Know your dataset’s structure, size, and search requirements. Are you indexing a blog with titles and content? An e-commerce catalog with products and prices? Each use case needs a tailored approach.

Identify searchable fields: Only index fields users will search. For a blog, that’s likely title and content, not created_at.
Check data volume: MeiliSearch handles millions of documents, but large datasets need careful chunking.
Plan for updates: If data changes often, like product stock, optimize for frequent updates.

Example: For an e-commerce site, here’s a sample product document:

{
  "id": 1,
  "name": "Wireless Headphones",
  "description": "High-quality noise-canceling headphones",
  "price": 99.99,
  "category": "Electronics",
  "stock": 50
}

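Make only name, description, and category searchable if those are the fields users query. Keep stock out of the searchable fields unless users actually search by it; it can still live in the document for display.

Tip: MeiliSearch is schemaless, so there is no schema to declare up front. Index a small sample first and check the index stats, which include a per-field document count, to see how your data landed.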
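To make the "analyze first" step concrete, here is a minimal Python sketch that profiles a sample of documents before anything is sent to MeiliSearch. The helper name profile_fields and the sample data are made up for illustration; the code doesn't talk to a server at all.

```python
from collections import Counter, defaultdict


def profile_fields(documents):
    """Summarize each field: how many documents carry it, which value
    types appear, and the average length of its string values."""
    counts = Counter()
    types = defaultdict(set)
    text_chars = Counter()
    text_values = Counter()
    for doc in documents:
        for field, value in doc.items():
            counts[field] += 1
            types[field].add(type(value).__name__)
            if isinstance(value, str):
                text_chars[field] += len(value)
                text_values[field] += 1
    return {
        field: {
            "present_in": counts[field],
            "types": sorted(types[field]),
            "avg_text_len": (
                text_chars[field] / text_values[field] if text_values[field] else 0.0
            ),
        }
        for field in counts
    }


# Hypothetical sample, modeled on the product document above
sample = [
    {"id": 1, "name": "Wireless Headphones",
     "description": "High-quality noise-canceling headphones",
     "price": 99.99, "category": "Electronics", "stock": 50},
    {"id": 2, "name": "Bluetooth Speaker", "description": "Portable",
     "price": 49.5, "category": "Electronics", "stock": 0},
]

profile = profile_fields(sample)
for field, info in profile.items():
    print(field, info)
```

Fields that hold free text (name, description) are the natural candidates for searching, while purely numeric ones (price, stock) usually are not.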

Choose the Right Primary Key

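Every MeiliSearch document needs a unique primary key. This is how MeiliSearch identifies and updates documents. Pick a field that’s unique and stable. If you don’t specify one, MeiliSearch tries to infer it from the first batch by looking for an attribute whose name ends in id; if it can’t settle on one, the indexing task fails and nothing is added.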

Good choices: Database IDs, UUIDs, or SKU codes.
Bad choices: Names, emails, or anything non-unique or changeable.

Example: Indexing products with Node.js:

const { MeiliSearch } = require('meilisearch');

const client = new MeiliSearch({
  host: 'http://localhost:7700',
  apiKey: 'your-api-key',
});

async function indexProducts() {
  const products = [
    { id: 1, name: 'Wireless Headphones', description: 'Noise-canceling' },
    { id: 2, name: 'Bluetooth Speaker', description: 'Portable' },
  ];

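  // addDocuments only enqueues a task; indexing itself runs asynchronously
  const task = await client.index('products').addDocuments(products, { primaryKey: 'id' });
  console.log('Indexing task enqueued, taskUid:', task.taskUid);
}

indexProducts();
// Output (example): Indexing task enqueued, taskUid: 0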

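Here, id is the primary key. Keep in mind that addDocuments resolves as soon as the task is enqueued, so problems show up in the task status rather than as a thrown error. A document that lacks the primary key makes the task fail. Two documents that share a key don’t raise an error at all: the later one silently replaces the earlier one.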

Tip: Always set the primary key explicitly to avoid surprises.
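Since duplicate or malformed keys are easy to miss, a quick pre-flight check can help. Below is a small Python sketch, not part of any MeiliSearch client: primary_key_problems is a hypothetical helper. It relies on the documented rule that document ids are integers or strings made of letters, digits, hyphens, and underscores.

```python
import re
from collections import Counter

# Allowed characters for string ids: a-z, A-Z, 0-9, hyphen, underscore
VALID_ID = re.compile(r"[A-Za-z0-9_-]+")


def primary_key_problems(documents, key="id"):
    """Return human-readable problems found in the primary key field."""
    problems = []
    seen = Counter()
    for position, doc in enumerate(documents):
        if key not in doc:
            problems.append(f"document {position}: missing '{key}'")
            continue
        value = doc[key]
        is_int = isinstance(value, int) and not isinstance(value, bool)
        is_valid_str = isinstance(value, str) and VALID_ID.fullmatch(value) is not None
        if not (is_int or is_valid_str):
            problems.append(f"document {position}: invalid id {value!r}")
        # Compare as strings so 1 and "1" get flagged together (a cautious choice)
        seen[str(value)] += 1
    for value, count in seen.items():
        if count > 1:
            problems.append(f"id {value} appears {count} times")
    return problems


# Hypothetical batch with one of each problem
docs = [
    {"id": 1, "name": "Wireless Headphones"},
    {"id": "sku-42", "name": "Bluetooth Speaker"},
    {"id": "a@b.com", "name": "Email used as key"},
    {"name": "No key at all"},
    {"id": 1, "name": "Duplicate of the first"},
]

problems = primary_key_problems(docs)
for problem in problems:
    print(problem)
```

Running a check like this before each batch catches the silent-overwrite case, which MeiliSearch itself won't report.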

Optimize Field Selection with Searchable Attributes

MeiliSearch indexes every field by default, but that’s not always efficient. Use searchable attributes to control which fields are searchable. This reduces index size and speeds up queries.

Limit to relevant fields: For a blog, index title and content, not author_id.
Exclude metadata: Fields like timestamps or internal IDs rarely need searching.

Example: Configure searchable attributes for a blog index:

const { MeiliSearch } = require('meilisearch');

const client = new MeiliSearch({
  host: 'http://localhost:7700',
  apiKey: 'your-api-key',
});

async function setSearchableAttributes() {
  await client.index('posts').updateSearchableAttributes(['title', 'content']);
  console.log('Searchable attributes updated!');
}

setSearchableAttributes();
// Output: Searchable attributes updated!

Table: Searchable vs. Non-Searchable Fields

Field	Searchable?	Reason
title	Yes	Users search by post titles
content	Yes	Main searchable text
created_at	No	Not relevant for search queries
author_id	No	Internal metadata, not searchable

Link: Check MeiliSearch’s searchable attributes docs for more.
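The table above can be turned into a settings payload mechanically. Here is a rough Python sketch; FIELDS and build_searchable_settings are illustrative names, and the only MeiliSearch-specific part is the searchableAttributes setting key. Order matters: attributes listed first weigh more in the attribute ranking rule.

```python
# Each entry mirrors a row of the table: (field name, searchable?)
# The list order doubles as the priority order for searching.
FIELDS = [
    ("title", True),
    ("content", True),
    ("created_at", False),
    ("author_id", False),
]


def build_searchable_settings(fields):
    """Build a settings payload that lists only the searchable fields, in order."""
    searchable = [name for name, is_searchable in fields if is_searchable]
    if not searchable:
        raise ValueError("at least one field should be searchable")
    if len(set(searchable)) != len(searchable):
        raise ValueError("searchable fields must not repeat")
    return {"searchableAttributes": searchable}


settings = build_searchable_settings(FIELDS)
print(settings)

# With the official Python client, this payload could be sent with something like:
#   client.index('posts').update_settings(settings)
```

Keeping the field list in one place makes it easy to review what is searchable when the data model changes.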

Fine-Tune Relevance with Ranking Rules

MeiliSearch’s ranking rules decide how results are ordered. The default rules work well, but tweaking them can make results more relevant. Customize ranking rules based on user expectations.

Default rules (in order):

Words
Typo
Proximity
Attribute
Sort
Exactness

For an e-commerce site, you might want to prioritize products in stock or sort by price. Here’s how:

Example: Add a custom ranking rule for products:

const { MeiliSearch } = require('meilisearch');

const client = new MeiliSearch({
  host: 'http://localhost:7700',
  apiKey: 'your-api-key',
});

async function setRankingRules() {
  await client.index('products').updateRankingRules([
    'words',
    'typo',
    'proximity',
    'attribute',
    'sort',
    'exactness',
    'stock:desc', // Custom rule: higher stock ranks first, applied as the final tie-breaker
  ]);
  console.log('Ranking rules updated!');
}

setRankingRules();
// Output: Ranking rules updated!

Tip: Test ranking rules with sample queries to ensure results match user intent.
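Where you place a custom rule matters, because rules earlier in the list have more influence on the final order. The Python sketch below builds a ranking rules list with a custom rule at a chosen position. The helper with_custom_rule and its regex check are my own sanity checks, not MeiliSearch's validation logic.

```python
import re

DEFAULT_RULES = ["words", "typo", "proximity", "attribute", "sort", "exactness"]

# Custom rules take the form "<attribute>:asc" or "<attribute>:desc"
CUSTOM_RULE = re.compile(r"[A-Za-z0-9_.-]+:(asc|desc)")


def with_custom_rule(custom_rule, before=None, base=DEFAULT_RULES):
    """Return a copy of `base` with `custom_rule` inserted before the rule
    named `before`, or appended at the end when `before` is None."""
    if not CUSTOM_RULE.fullmatch(custom_rule):
        raise ValueError(f"expected 'attribute:asc' or 'attribute:desc', got {custom_rule!r}")
    rules = list(base)
    position = rules.index(before) if before is not None else len(rules)
    rules.insert(position, custom_rule)
    return rules


tie_breaker = with_custom_rule("stock:desc")
dominant = with_custom_rule("stock:desc", before="words")

print(tie_breaker)
print(dominant)
```

The first list only uses stock to break ties between otherwise equal results. The second sorts by stock before text relevance is even considered, which is rarely what users expect, so test both against real queries.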

Handle Large Datasets with Chunked Indexing

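Sending millions of documents in a single request can overwhelm MeiliSearch, and each request also has to fit within the server’s HTTP payload size limit (100 MB by default). Break large datasets into smaller chunks to keep things smooth. As a starting point, aim for 10,000–50,000 documents per batch, then adjust for document size and your server’s resources.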

Example: Chunked indexing in Python:

from meilisearch import Client

client = Client('http://localhost:7700', 'your-api-key')
index = client.index('products')

# Sample large dataset
products = [{'id': i, 'name': f'Product {i}', 'description': 'Sample'} for i in range(100000)]
chunk_size = 10000

# Send documents in chunks; each call only enqueues an indexing task
for i in range(0, len(products), chunk_size):
    chunk = products[i:i + chunk_size]
    index.add_documents(chunk, primary_key='id')
    print(f'Enqueued {i + len(chunk)} products')

# Output:
# Enqueued 10000 products
# Enqueued 20000 products
# ...
# Enqueued 100000 products

Table: Chunk Size Recommendations

Dataset Size	Chunk Size	Notes
< 100,000	10,000	Good for most servers
100,000–1,000,000	50,000	Monitor server memory
> 1,000,000	50,000–100,000	Test for optimal performance

Tip: Indexing is asynchronous. Each call returns a task, so use MeiliSearch’s task API to track progress and catch failed batches.
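Document count alone can mislead when documents vary a lot in size. As a rough sketch, the Python generator below caps each batch by both document count and approximate JSON size. The name chunk_documents and the 50 MB default are assumptions chosen for illustration; the size estimate ignores the brackets and commas of the final JSON array, so leave some headroom.

```python
import json


def chunk_documents(documents, max_docs=10_000, max_bytes=50 * 1024 * 1024):
    """Yield lists of documents, each holding at most `max_docs` items and
    roughly `max_bytes` of JSON. A single oversized document still gets
    its own batch rather than being dropped."""
    batch, batch_bytes = [], 0
    for doc in documents:
        doc_bytes = len(json.dumps(doc).encode("utf-8"))
        if batch and (len(batch) >= max_docs or batch_bytes + doc_bytes > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += doc_bytes
    if batch:
        yield batch


products = [{"id": i, "name": f"Product {i}", "description": "Sample"} for i in range(25_000)]
sizes = [len(batch) for batch in chunk_documents(products, max_docs=10_000)]
print(sizes)  # [10000, 10000, 5000]

# With the official Python client, each batch could then be sent with something like:
#   index.add_documents(batch, primary_key='id')
```

Because it is a generator, this also works with documents streamed from a database cursor instead of a list held in memory.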

Keep Indexes Fresh with Efficient Updates

If your data changes often, like product prices or stock, you need to update the index efficiently. Use partial updates or replace only changed documents to avoid reindexing everything.

Example: Update a product’s price in Node.js:

const { MeiliSearch } = require('meilisearch');

const client = new MeiliSearch({
  host: 'http://localhost:7700',
  apiKey: 'your-api-key',
});

async function updateProductPrice() {
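  // updateDocuments merges these fields into the existing document;
  // addDocuments would replace the whole document instead
  const updatedProduct = { id: 1, price: 89.99 };
  const task = await client.index('products').updateDocuments([updatedProduct]);
  console.log('Price update enqueued, taskUid:', task.taskUid);
}

updateProductPrice();
// Output (example): Price update enqueued, taskUid: 1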

Tip: For frequent updates, batch them to reduce API calls. MeiliSearch’s update documents guide has more details.
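One simple way to batch frequent updates is to collapse everything pending for the same document into a single partial update before sending. Here is a small Python sketch of that idea; coalesce_updates and the pending list are hypothetical and not part of any MeiliSearch client.

```python
def coalesce_updates(updates, key="id"):
    """Merge partial updates that target the same document.
    Later values win, and first-seen order of documents is kept."""
    merged = {}
    for update in updates:
        merged.setdefault(update[key], {}).update(update)
    return list(merged.values())


# Hypothetical queue of changes collected over a short window
pending = [
    {"id": 1, "price": 89.99},
    {"id": 2, "stock": 12},
    {"id": 1, "stock": 48},
    {"id": 1, "price": 84.99},
]

batch = coalesce_updates(pending)
print(batch)
# [{'id': 1, 'price': 84.99, 'stock': 48}, {'id': 2, 'stock': 12}]
```

Four queued changes become two documents in one request, which can then go through the update documents route shown above.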

Monitor and Optimize Index Size

Large indexes slow down searches and eat up storage. Regularly check index size and optimize it. MeiliSearch provides stats to monitor this.

Remove unused fields: Drop fields you never search, filter, or display from the documents you send.
Compress data: Shorten field names (e.g., desc instead of description).
Delete old data: Remove outdated documents.

Example: Check index stats in Python:

from meilisearch import Client

client = Client('http://localhost:7700', 'your-api-key')
index = client.index('products')

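stats = index.get_stats()
# Recent versions of the Python client return an object with attributes;
# older versions return a dict with camelCase keys such as 'numberOfDocuments'
print(f"Number of documents: {stats.number_of_documents}")

# Disk usage is reported for the whole instance by the global stats endpoint
all_stats = client.get_all_stats()
print(f"Database size: {all_stats['databaseSize']} bytes")

# Output (example):
# Number of documents: 1000
# Database size: 524288 bytes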

Tip: If the index grows too large, consider splitting it into multiple smaller indexes by category or region.
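If you do split by category, you need a predictable way to map each document to an index. The Python sketch below shows one possible scheme. The products_ prefix, the slug rules, and the uncategorized fallback are all my own assumptions; the only MeiliSearch constraint used is that index uids may contain letters, digits, hyphens, and underscores.

```python
import re
from collections import defaultdict


def index_uid_for(prefix, value):
    """Turn a category label into a uid made of letters, digits, '-' and '_'."""
    slug = re.sub(r"[^A-Za-z0-9_-]+", "-", str(value).strip().lower()).strip("-")
    return f"{prefix}_{slug}" if slug else f"{prefix}_uncategorized"


def group_by_index(documents, field="category", prefix="products"):
    """Group documents by the index uid derived from `field`."""
    groups = defaultdict(list)
    for doc in documents:
        groups[index_uid_for(prefix, doc.get(field, ""))].append(doc)
    return dict(groups)


# Hypothetical catalog
docs = [
    {"id": 1, "name": "Wireless Headphones", "category": "Electronics"},
    {"id": 2, "name": "Desk Lamp", "category": "Home & Office"},
    {"id": 3, "name": "Bluetooth Speaker", "category": "Electronics"},
    {"id": 4, "name": "Mystery Box"},
]

groups = group_by_index(docs)
for uid, group in groups.items():
    print(uid, [d["id"] for d in group])
# products_electronics [1, 3]
# products_home-office [2]
# products_uncategorized [4]
```

The trade-off is that a query spanning several categories now has to hit several indexes, so this fits best when most searches stay inside one category or region.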

Test and Iterate for Better Search Quality

Indexing isn’t a one-and-done task. Test your search results regularly to ensure they’re relevant and fast. Use real user queries to spot issues.

Run sample searches: Check if results match expectations.
Tweak settings: Adjust searchable attributes or ranking rules as needed.
Gather feedback: Ask users what works or doesn’t.

Example: Search test in Node.js:

const { MeiliSearch } = require('meilisearch');

const client = new MeiliSearch({
  host: 'http://localhost:7700',
  apiKey: 'your-api-key',
});

async function testSearch() {
  const results = await client.index('products').search('headphones');
  console.log('Search results:', results.hits);
}

testSearch();
// Output (example):
// Search results: [{ id: 1, name: 'Wireless Headphones', description: 'Noise-canceling' }]

Tip: Use MeiliSearch’s query parameters like limit or filter to refine results during testing. Note that filter only works on attributes you have listed in filterableAttributes.
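Manual spot checks get tedious, so it can help to keep a small list of queries with the document you expect near the top and re-run it after every settings change. Here is a minimal Python sketch. check_relevance, CATALOG, and fake_search are made up for illustration, and fake_search is a stand-in so the example runs without a server.

```python
def check_relevance(search, cases, top_n=3, key="id"):
    """Run each (query, expected_id) case through `search` and return the
    cases whose expected document is missing from the top `top_n` hits."""
    failures = []
    for query, expected_id in cases:
        top_ids = [hit[key] for hit in search(query)[:top_n]]
        if expected_id not in top_ids:
            failures.append((query, expected_id, top_ids))
    return failures


# Stand-in search: naive substring match over a tiny in-memory catalog
CATALOG = [
    {"id": 1, "name": "Wireless Headphones"},
    {"id": 2, "name": "Bluetooth Speaker"},
]


def fake_search(query):
    needle = query.lower()
    return [doc for doc in CATALOG if needle in doc["name"].lower()]


cases = [("headphones", 1), ("speaker", 2), ("lamp", 3)]
failures = check_relevance(fake_search, cases)
print(failures)
# [('lamp', 3, [])]
```

With the official Python client, the search argument could be something like lambda q: client.index('products').search(q, {'limit': 3})['hits'], assuming the index from the earlier examples.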

Keep Your Search Snappy and Relevant

MeiliSearch’s power lies in its simplicity and speed, but indexing is where the magic happens. By understanding your data, choosing the right primary key, optimizing fields, and fine-tuning relevance, you can build a search experience that users love. Chunk large datasets, keep updates efficient, and monitor index size to maintain performance. Most importantly, test and iterate to ensure your search stays sharp.

Experiment with these practices in your project. Start small, tweak as you go, and watch your search transform into something fast, accurate, and delightful.