Roberto B.

Building a RAG (Retrieval-Augmented Generation) system in PHP with Neuron AI

In this article, we explore how to build a simple but powerful Retrieval-Augmented Generation (RAG) system using Neuron AI and Ollama.
You'll learn what RAG is, why it's useful, and how to implement it in PHP by leveraging Neuron AI's RAG capabilities with a local model via Ollama.


What is Retrieval-Augmented Generation (RAG)?

RAG is an AI technique that combines the strengths of information retrieval with natural language generation. Instead of relying purely on what a model has memorized during training, a RAG system actively fetches relevant data from an external source and uses it to craft accurate, contextually relevant responses.


Why use RAG?

  • Accuracy: ground responses in actual documents rather than model guesses.
  • Custom knowledge: use your own content as the source of truth.
  • Flexibility: update knowledge without retraining the model.

Real-world use cases

  • Internal knowledge assistants: help teams query internal wikis or documentation.
  • Customer support bots: pull answers from support articles.
  • Compliance tools: reference legal documents and policies.
  • Academic helpers: answer questions based on research papers.

How RAG uses your documentation to generate smarter, up-to-date answers

In many projects, technical documentation is stored across a collection of files, often in Markdown format. These documents contain valuable knowledge, but they're typically accessed through a simple keyword search, which doesn't always yield the most relevant or helpful results.

This is where Retrieval-Augmented Generation (RAG) shines.

The concept

RAG enhances a general-purpose AI model by combining it with your own documentation. Here's how it works:

  1. Ingest documentation: you feed your local documentation files (e.g., Markdown) into the system.
  2. Create embeddings: each document is transformed into a mathematical representation (an embedding) helpful for searching.
  3. Semantic search: when a user asks a question, the system doesn’t just look for matching words; it uses embeddings to find the documents that are most contextually relevant to the query (see the toy sketch after this list).
  4. Enrich the prompt: the selected documents are automatically added as context to the user’s prompt.
  5. Generate the answer: the AI model receives the enriched prompt and produces a response that reflects both its general knowledge and your specific, up-to-date content.
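
To make the semantic search step concrete: embeddings are compared numerically, not by keywords. Below is a toy PHP sketch of cosine similarity, one common way to score how close two embedding vectors are. This helper is purely illustrative and is not part of Neuron AI's API; the library handles retrieval internally.

<?php

// Toy illustration: semantic search compares embedding vectors.
// Cosine similarity scores two vectors between -1 (opposite)
// and 1 (pointing in the same direction).
function cosineSimilarity(array $a, array $b): float
{
    $dot = $normA = $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot   += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

// Vectors pointing in a similar direction score close to 1.0.
echo cosineSimilarity([0.1, 0.9, 0.2], [0.15, 0.85, 0.25]); // ~0.99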

Why this matters

  • Dynamic knowledge: unlike general-purpose models that may be outdated, your AI agent uses current, project-specific information, directly from your documents.
  • Domain expertise: you can ground the agent in internal documentation, user guides, or proprietary workflows, enabling it to respond with expert-level insight.
  • Custom and trustworthy: responses are grounded in the exact information you've curated, reducing hallucination and increasing trust.

Getting started with Neuron AI

First, install Neuron AI via Composer:

composer require inspector-apm/neuron-ai

Make sure you also have Ollama running locally with models like llama3.2:latest for generation and nomic-embed-text for embedding.
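
If you haven't downloaded these models yet, you can fetch them with Ollama's pull command:

ollama pull llama3.2:latest
ollama pull nomic-embed-text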

New to Neuron AI?

Start with my article Building My First AI Agent with Neuron AI and Ollama, a practical guide for PHP developers.
And if you prefer video, check out the YouTube version for a gentle intro to Neuron AI and PHP.


Implementing a RAG agent in PHP

Here’s an example of a RAG agent called RobertitoBot, which can answer questions based on the contents of a local folder:

<?php

require './vendor/autoload.php';

use NeuronAI\Chat\Messages\UserMessage;
use NeuronAI\Providers\AIProviderInterface;
use NeuronAI\Providers\Ollama\Ollama;
use NeuronAI\RAG\DataLoader\FileDataLoader;
use NeuronAI\RAG\Embeddings\EmbeddingsProviderInterface;
use NeuronAI\RAG\Embeddings\OllamaEmbeddingsProvider;
use NeuronAI\RAG\RAG;
use NeuronAI\RAG\VectorStore\FileVectorStore;
use NeuronAI\RAG\VectorStore\VectorStoreInterface;

class RobertitoBot extends RAG
{
    // The local LLM used to generate the final answer.
    protected function provider(): AIProviderInterface
    {
        return new Ollama(
            url: 'http://localhost:11434/api',
            model: 'llama3.2:latest',
        );
    }

    // The system prompt that guides the agent's behavior.
    public function instructions(): string
    {
        return "You are an AI Agent specialized in Laravel and providing answers using the information I provided you as extra content.";
    }

    // The embedding model used to vectorize documents and queries.
    protected function embeddings(): EmbeddingsProviderInterface
    {
        return new OllamaEmbeddingsProvider(
            url: 'http://localhost:11434/api',
            model: 'nomic-embed-text',
        );
    }

    // Where embeddings are persisted, and how many of the most
    // relevant documents (topK) are retrieved per query.
    protected function vectorStore(): VectorStoreInterface
    {
        return new FileVectorStore(
            directory: __DIR__,
            name: "laravel-doc",
            ext: ".store",
            topK: 4
        );
    }
}

// Load and parse the knowledge base files.
$documents = FileDataLoader::for(
    '/path/to/your/knowledgebase'
)->getDocuments();

// Create the agent and index the documents into the vector store.
$robertito = RobertitoBot::make();
$robertito->addDocuments($documents);

// Ask a question; relevant documents are retrieved and injected as context.
$response = $robertito->answer(
    new UserMessage(
        'How does Laravel handle route model binding?'
    )
);
echo $response->getContent();

Summary of key steps

Let’s break down the core steps required to build a Retrieval-Augmented Generation (RAG) system in PHP using Neuron AI and Ollama:

1. Create a custom RAG class

We start by extending Neuron AI’s RAG base class. This allows us to define how the AI model, embedding model, and vector store are configured.


class RobertitoBot extends RAG
{
    protected function provider(): AIProviderInterface
    {
        return new Ollama(
            url: 'http://localhost:11434/api',
            model: 'llama3.2:latest',
        );
    }

    public function instructions(): string
    {
        return "You are an AI Agent specialized in Laravel and providing answers using the information I provided you as extra content.";
    }

    protected function embeddings(): EmbeddingsProviderInterface
    {
        return new OllamaEmbeddingsProvider(
            url: 'http://localhost:11434/api',
            model: 'nomic-embed-text',
        );
    }

    protected function vectorStore(): VectorStoreInterface
    {
        return new FileVectorStore(
            directory: __DIR__,
            name: "laravel-doc",
            ext: ".store",
            topK: 4
        );
    }
}
  • provider() tells Neuron AI which local LLM to use for generating responses.
  • instructions() defines the system prompt sent to the LLM, guiding how it should answer using the retrieved context.
  • embeddings() specifies how document vectors are created for similarity search.
  • vectorStore() defines how and where the vectors are stored. In this case, we are using a file named laravel-doc.store to store the embedding vectors. The topK parameter sets how many of the most relevant documents are retrieved and used as context for the model.

2. Load the knowledge base

Next, we ingest documents that will be used as reference context. In this case, we load a local directory of Markdown files:

$documents = FileDataLoader::for(
    '/Users/roberto/Projects/Storyblok/KnowledgeBase',
)->getDocuments();

This step uses FileDataLoader, which scans the directory and parses text content into a format Neuron AI can work with.

By default, this process creates a file named neuron.store, which contains the generated embeddings of your documentation. You can customize the file name via the name and ext parameters when you create the vector store instance (laravel-doc.store in our example).

In the provided example, we’re using FileVectorStore to store embeddings in a single file. While this is perfectly suitable for local testing or simple use cases, Neuron AI also supports more scalable and optimized vector store implementations, ideal for production environments or large documentation sets. More info about the available vector stores: https://docs.neuron-ai.dev/components/vector-store

3. Instantiate and populate the RAG agent

Now we create an instance of our RobertitoBot and feed it the documents for vector indexing:

$robertito = RobertitoBot::make();
$robertito->addDocuments($documents);

This step:

  • Initializes the custom RAG class
  • Converts document content into embeddings
  • Stores the vector representations for later retrieval

4. Ask a question and get a context-aware answer

Finally, we prompt the agent with a user message. Neuron AI retrieves the most relevant documents, injects them as context, and sends the full prompt to the LLM:

$response = $robertito->answer(
    new UserMessage(
        'Tell me about "The Default Route Files"',
    )
);
echo $response->getContent();

This is the core of RAG: it grounds the LLM's response in your actual data, reducing hallucinations and increasing relevance.


Practical example: building a RAG system with Laravel documentation

To wrap up, let’s apply what we’ve learned in a real-world scenario: using Laravel’s official documentation as the source for a RAG-powered assistant.

Laravel’s documentation is open source and written in Markdown, making it a perfect candidate for this kind of integration.

Step 1: clone the Laravel docs repository

Start by cloning the Laravel documentation repository from GitHub:

git clone https://github.com/laravel/docs.git laravel-docs

This gives you the documentation files in Markdown format. Note that each Laravel version lives on its own branch of the repository.
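
If you need the docs for a specific Laravel version, you can clone a single branch instead (the version branches follow the framework's release names, e.g. 11.x):

git clone --branch 11.x https://github.com/laravel/docs.git laravel-docs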

Step 2: load the documentation

Using FileDataLoader, you can point to the relevant directory in the cloned repository:

$documents = FileDataLoader::for(
    __DIR__ . '/laravel-docs/'
)->getDocuments();

This will read and parse all .md files into Neuron AI-compatible documents.
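
As a quick sanity check (assuming getDocuments() returns an array of document objects, as its usage in this article suggests), you can count what was loaded:

// Hypothetical sanity check: verify the loader found your Markdown files.
echo count($documents) . " documents loaded\n";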

Step 3: create your Laravel documentation RAG agent

You can reuse the RAG subclass from earlier in the article, adapting it to work with the Laravel docs. Here's a minimal version:


class LaravelDocBot extends RAG
{
    protected function provider(): AIProviderInterface
    {
        return new Ollama(
            url: 'http://localhost:11434/api',
            model: 'llama3.2:latest',
        );
    }

    protected function embeddings(): EmbeddingsProviderInterface
    {
        return new OllamaEmbeddingsProvider(
            url: 'http://localhost:11434/api',
            model: 'nomic-embed-text',
        );
    }

    protected function vectorStore(): VectorStoreInterface
    {
        return new FileVectorStore(directory: __DIR__ . '/laravel-docs');
    }
}

$laravelBot = LaravelDocBot::make();
$laravelBot->addDocuments($documents);

Step 4: Ask questions about Laravel

Now you're ready to interact with the documentation through your AI assistant:


$response = $laravelBot->answer(
    new UserMessage(
        'How does Laravel handle route model binding?'
    )
);

echo $response->getContent();

The system will retrieve the most relevant document sections and use them as context to generate an accurate, up-to-date response.

Why this is powerful

By applying the RAG pattern to Laravel’s docs:

  • You're making official documentation searchable with semantic understanding, not just keywords.
  • You enable natural language queries for internal teams or tools.
  • You can keep the system updated simply by pulling the latest docs from the repo.

This same approach works with any Markdown-based documentation, including private repos or internal wikis.


An important note about loading documents

When using Neuron AI's RAG system, it’s important to understand how document loading and vector storage work under the hood.

Each time you call $rag->addDocuments($documents), Neuron AI generates embeddings for those documents and updates the vector store defined in your implementation (e.g., FileVectorStore). This process can be time-consuming depending on the size and number of documents.

Optimize by reusing the vector store

If your documentation doesn’t change frequently, you should only run addDocuments() when the documentation is updated. Once the embeddings are stored, they persist in the vector store (like the neuron.store file), and there’s no need to reprocess them for every query.

For subsequent questions or user interactions based on the same documentation, you can skip the document-loading step entirely and go straight to querying:

$laravelBot = LaravelDocBot::make();
$response = $laravelBot->answer(new UserMessage('How does Laravel handle queues?'));
echo $response->getContent();

This significantly reduces response time and avoids unnecessary computation.
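
One simple way to implement this is to guard the indexing step. Here is a minimal sketch, assuming the store file uses the default neuron.store name inside the directory configured for FileVectorStore (adapt the path to your setup):

$laravelBot = LaravelDocBot::make();

// Reindex only when the vector store does not exist yet
// (or when you know the documentation has changed).
if (!file_exists(__DIR__ . '/laravel-docs/neuron.store')) {
    $documents = FileDataLoader::for(
        __DIR__ . '/laravel-docs/'
    )->getDocuments();
    $laravelBot->addDocuments($documents);
}

$response = $laravelBot->answer(new UserMessage('How does Laravel handle queues?'));
echo $response->getContent();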

To recap:

  • Run addDocuments() only when your source documentation changes.
  • After that, Neuron AI can reuse the saved vector store for all future queries.
  • This makes your RAG system more efficient, especially in production use.

Final thoughts

Neuron AI makes it surprisingly simple to build advanced AI features like RAG in PHP. Combined with Ollama, it empowers you to run everything locally and securely. Whether you're working on internal tools, chatbots, or smart document assistants, this approach gives you a solid foundation for reliable, explainable AI.

If you're a PHP developer curious about applied AI, give RAG with Neuron AI a try. It might change how you think about building apps!
