Building a Local AI Chatbot using WebUI with Ollama
In this blog, I will walk you through setting up a local AI chatbot using Ollama and Open WebUI.
Open WebUI is a self-hosted, open-source web interface designed for interacting with Large Language Models (LLMs). It offers a clean, ChatGPT-style UI that allows users to work with various LLMs — whether hosted locally or through OpenAI-compatible APIs. The goal is to make LLM usage more accessible and privacy-focused, especially for users who prefer to keep their data and models local.
Ollama is an open-source framework that enables running LLMs directly on your local machine. It ensures full data privacy by handling all model execution and storage locally, eliminating reliance on cloud infrastructure. With a simple CLI and developer-friendly API, Ollama streamlines the deployment and management of LLMs on personal systems.
I am using a Raspberry Pi 5 (8GB variant) for this project. The entire setup will be containerised using Docker for better portability and isolation.
Before proceeding, ensure the following prerequisites are met based on your environment:
- For Raspberry Pi or other remote servers: Confirm that SSH access is properly configured and operational.
- For cloud instances (e.g., AWS EC2 or an Azure virtual machine): Make sure the instance is accessible over the network. You will need:
- SSH access to manage and deploy containers remotely.
- Port 4444 open in the security group/firewall, as Open WebUI will run on this port.
Proper connectivity and access are essential for a smooth setup and for interacting with the WebUI once deployed.
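As a quick sanity check before starting, confirm that SSH works and that the WebUI port will be reachable. The commands below are only a sketch: the user and IP address are placeholders for your own setup, and the ufw rule applies only if you use ufw as a host firewall (cloud security-group rules are managed in your provider’s console instead).
# Confirm SSH access to the target machine (replace user and IP with your own)
ssh pi@192.168.1.200 'echo SSH OK'
# If ufw is used on the host, allow the Open WebUI port
sudo ufw allow 4444/tcp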
1. Docker Installation
If you’re unfamiliar with installing or configuring Docker, please refer to the official Docker documentation for your operating system:
👉 https://docs.docker.com/engine/install/
or
👉 https://medium.com/@mahinshanazeer/raspberry-pi-k8s-cluster-setup-for-home-lab-with-cilium-c861f7815511 (section 3 walks through configuring Docker on your machine).
This guide provides step-by-step instructions for all major platforms and ensures your Docker environment is set up correctly before moving forward.
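Once Docker is installed, a quick verification (a minimal, generic check) helps rule out environment issues later:
# Confirm the Docker engine and Compose plugin are installed
docker --version
docker compose version
# Run a throwaway test container to confirm the daemon is working
docker run --rm hello-world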
2. Installing Ollama
Ollama is an open-source tool designed to simplify the deployment and management of large language models (LLMs) directly on your local machine, offering a modern, local-first approach to natural language processing that makes advanced AI technology more accessible.
It provides a user-friendly interface that enables customisation and usage of LLMs without requiring deep technical expertise or reliance on cloud services. Capable of generating human-like text, Ollama is ideal for both individuals and teams, unlocking the power of open-source models for a wide range of applications.
To get started with Ollama, run the following command in your terminal:
curl -fsSL https://ollama.com/install.sh | sh
This one-liner fetches and installs the latest version of Ollama, automatically detecting your operating system and handling the setup process. It’s quick, hassle-free, and requires no manual intervention.
If you need the source code or have specific installation requirements, you can visit the official repository at:
👉 https://github.com/ollama/ollama
Once the installation is complete, enable and start the Ollama service:
sudo systemctl enable --now ollama
The --now flag starts the service immediately after enabling it, so a separate systemctl start command is not needed.
With Ollama installed and the service running, you’re ready to move on to the next steps.
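Before moving on, it is worth confirming that the service is healthy. The checks below assume Ollama’s default API port of 11434:
# Check the installed version and the service status
ollama --version
systemctl status ollama --no-pager
# The Ollama API should reply with "Ollama is running"
curl http://localhost:11434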
3. Web-UI Setup
Let’s begin by creating a project directory and setting up Docker Compose.
First, create a new folder named open-web-ui, and inside it, create a compose.yaml file with the following content:
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    volumes:
      - ./data:/app/backend/data
    ports:
      - 4444:8080
    extra_hosts:
      - "host.docker.internal:172.17.0.1"
    restart: unless-stopped
Breakdown of the Configuration:
Image: ghcr.io/open-webui/open-webui:main
Pulls the open-webui container image from GitHub Container Registry using the main tag.
Container Name: open-webui
The running container will be named open-webui for easy identification.
Volumes: ./data:/app/backend/data
Mounts a local directory (./data) to the container’s internal path. Ensures persistent storage of backend data.
Ports: 4444:8080
Maps port 8080 inside the container to port 4444 on the host. You access the service at http://localhost:4444.
Extra Hosts: "host.docker.internal:172.17.0.1"
Adds a custom hostname resolution inside the container. It maps host.docker.internal to the Docker host IP, enabling container-to-host communication.
Restart Policy: unless-stopped
Automatically restarts the container unless it is explicitly stopped by the user.
You can also create a directory data to ensure persistent storage for your application. This directory is already referenced in the Docker Compose file under the volumes section, where it maps to the container's internal data path. This setup ensures that your configurations and model data are retained even if the container is stopped or rebuilt.
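For reference, assuming you are inside the open-web-ui project directory, the data directory can be created ahead of time like this:
# Create the persistent data directory referenced in compose.yaml
# (Docker would also create it on first run, but creating it yourself keeps ownership with your user)
mkdir -p ./data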
Once your compose.yaml file is ready, navigate to the project directory and run the following command:
docker compose -f compose.yaml up -d
When you run the docker compose up -d command, Docker Compose not only pulls and starts the container(s) defined in your compose.yaml file, but also creates a dedicated Docker network by default. This network is typically a bridge network, designed to facilitate isolated and secure communication between the containers defined within the same Compose project.
When you run a container using docker run (typically built from a Dockerfile), Docker does not create a new network. Instead, it attaches the container to the default bridge network unless a specific network is defined using the --network flag. In contrast, when you use docker compose up, Docker Compose automatically creates a dedicated bridge network (named <project>_default) for the Compose project. This is because Docker Compose is designed to manage multi-container applications where services often need to communicate with each other.
To verify that the container is running, use:
docker ps
This will list all active containers. You should see a container named open-webui running and mapped to port 4444 on your host.
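If the container is missing from the list or keeps restarting, the container logs are the quickest way to see why (using the container name set in the Compose file):
# Follow the Open WebUI logs; press Ctrl+C to stop
docker logs -f open-webui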
If you want to view the Docker networks automatically created by the docker compose command, use the following:
docker network ls
This command displays a list of all Docker networks, including those automatically created by Docker Compose.
To view detailed information about a specific network, such as connected containers, IP address assignments, and internal settings, use the inspect command:
# To inspect any Docker network: docker network inspect <network_name or network_id>
docker network inspect 699722204cd3
Creating a project-scoped network enables built-in service discovery via container names, ensures isolation between different projects, and simplifies orchestration. This automatic networking behaviour is one of the key differences between docker run and Docker Compose.
4. Accessing the Web Interface
Once the container is up and running, open your web browser and navigate to:
http://192.168.1.200:4444
Here, 192.168.1.200 is the local IP address of my Raspberry Pi. If you are running this setup on your local machine, you can simply use:
http://localhost:4444
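Optionally, you can confirm the service is responding before opening the browser (a quick check from the host; adjust the address if you are connecting to a remote machine):
# Expect an HTTP response header once Open WebUI has finished starting
curl -I http://localhost:4444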
This will load the Open WebUI interface, where you can begin interacting with your local LLM once it’s configured. Now, click on Get Started.
On your first visit to the WebUI, you will be prompted to create an admin account. Fill in your desired credentials to proceed — this account will be used to manage access and interact with the interface securely.
Now, return to the Open WebUI interface. On the top left, you will see an option called “Choose Your Model”. When you click on it, you’ll notice that the list is currently empty:
After successfully registering your admin account, return to the terminal to configure the models.
At this stage, no models will be visible in the UI because none have been configured yet. Model setup is essential for enabling the core functionality of the application.
5. Configuring AI Models in Ollama
In this setup, Ollama acts as the core LLM manager, handling the downloading, running, and management of AI models locally on your machine. Open WebUI, on the other hand, serves as a user-friendly graphical interface that allows you to interact with Ollama and its models more conveniently. While Ollama provides the engine and CLI for model operations, Open WebUI offers an accessible way to chat with and utilise those models without touching the terminal.
So to begin with, take a look at the available models in the Ollama Model Library at the link below, and choose the ones best suited for your system and use case:
👉 https://ollama.com/library
Since we’re using a Raspberry Pi 5, it’s recommended to use a lightweight model like TinyLlama, which is well-suited for devices with limited resources. This ensures optimal performance without overloading the system.
Once you’ve selected a model from the Ollama library, return to your terminal and run the following command:
ollama list
At this point, the output will likely be empty since no models have been pulled yet. This command lists all AI models currently available on your local system. Although we haven’t downloaded any models yet, running this command first helps you get familiar with the basic usage before proceeding to pull a model.
Now, use the following command to pull the TinyLlama model:
ollama pull tinyllama
This will download the model and make it available for use within your local environment.
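You can verify the download and give the model a quick test directly from the terminal before returning to the browser:
# Confirm the model now appears locally
ollama list
# Send a one-off prompt to TinyLlama from the CLI
ollama run tinyllama "Explain what a Raspberry Pi is in one sentence."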
Now, return to the WebUI and reload the page. This refresh ensures that any newly pulled models are detected and listed in the “Select a model” dropdown. If everything is configured correctly, you should now see the available models ready for interaction.
Now, try starting a conversation with the model through the WebUI. Simply type your prompt into the chat box and observe the response. This is where you can test the model’s capabilities, ask questions, or explore use cases — all directly from your browser.
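Under the hood, the WebUI talks to Ollama’s HTTP API, so the same interaction can be reproduced with a direct call to the /api/generate endpoint (a minimal example, assuming Ollama’s default port 11434):
# Ask Ollama for a single, non-streamed JSON response
curl http://localhost:11434/api/generate -d '{
  "model": "tinyllama",
  "prompt": "Why is the sky blue?",
  "stream": false
}'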
If the models still don’t appear in the dropdown list, you’ll need to ensure that Ollama is properly integrated with the WebUI.
- Make sure the Docker network is correctly configured as specified in the Docker Compose file.
- Verify the WebUI settings to ensure it is correctly mapped to the Ollama API server.
Proper network alignment and API mapping are essential for the WebUI to detect and communicate with the available models. I’ve encountered issues where models didn’t appear simply due to misconfigured Docker networking. Making sure the WebUI container can actually reach the Ollama API (whether Ollama runs on the host, as in this setup, or in its own container on a shared Docker network) is critical for seamless communication.
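One explicit way to wire this up is to point Open WebUI at the Ollama API through the OLLAMA_BASE_URL environment variable in the Compose file. The snippet below shows only the additional lines (a sketch, not the full file), reusing the host.docker.internal mapping already defined, since Ollama in this setup runs on the host rather than in a container:
services:
  open-webui:
    environment:
      # Point the WebUI at the Ollama API running on the Docker host
      - OLLAMA_BASE_URL=http://host.docker.internal:11434
Note that Ollama listens on 127.0.0.1:11434 by default; if the container still cannot reach it, you may also need to set OLLAMA_HOST=0.0.0.0 for the Ollama service (for example via a systemd override) so the API accepts connections from inside the container.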
6. Managing Users in WebUI
In the Settings window of the WebUI, click on “Admin Settings”. This section gives you access to advanced configuration options, including user management, system preferences, and model-related settings available only to the administrator.
To add a new user, click on the “+” icon located at the top-right corner of the Users section. Fill in the required details such as username, email, and password, then click Create to add the user.
Now, log out of the current session and log in using the newly created user account. This will help you verify that the setup is functioning correctly and that the model integration is working as expected.
7. Model Configurations
Once users are created, you can manage their access permissions from the Models section to ensure they have visibility and access to the necessary models.
Now, navigate to Settings → Models in the WebUI. This section allows you to view all available models, configure access permissions, and manage model visibility for different users. From here, you can control who has access to specific models and ensure proper access is granted across your user base.
Next, click on the Edit button next to the model you want to configure. This will open the model’s settings panel, where you can update its details, manage access permissions, and assign it to specific users or roles as needed.
Make sure the Visibility setting is set to Public. This ensures that the model is accessible to all users within the WebUI without needing individual assignment. If visibility is set to Private or Restricted, only explicitly assigned users will be able to access the model.
The Model Settings section allows fine-tuning of how the AI model behaves during interactions. Here’s a breakdown of the available options:
1. Model Parameters
Controls the model behaviour at runtime.
Key parameters include (see the Modelfile sketch after this list for setting similar values on the Ollama side):
temperature: Controls randomness. Lower values make responses more deterministic.
top_p: Nucleus sampling to balance creativity.
max_tokens: Limits response length.
2. System Prompt
A predefined instruction that sets the behaviour or personality of the AI.
Useful for tailoring the assistant’s tone, domain knowledge, or role (e.g., “You are a programmer.”)
3. Advanced Parameters
Enables deeper customisation such as:
stop sequences: Tells the model where to stop generating.
presence_penalty / frequency_penalty: Discourage repetitive outputs.
and other advanced sampling controls.
4. Prompt Suggestions
Predefined prompts are shown to users as quick actions.
Helps guide users on how to start conversations effectively.
5. Knowledge
Refers to the external documents or datasets the model can access during chat.
Enables context-aware responses using domain-specific information.
6. Select Knowledge
Allows you to choose which uploaded knowledge bases to link with the current session or model.
Supports fine-grained control over what the model references.
7. Tools
Integrations such as code interpreters, web access, file readers, or custom plugins.
Enhances the model’s capabilities beyond static text responses.
8. Filters
Allows content moderation or prompt sanitisation.
Can restrict topics, block certain terms, or adjust tone.
9. Actions
Predefined commands or scripts that the model can execute in response to prompts.
Useful for automation, such as restarting a server or fetching logs.
10. Capabilities
Toggles access to enhanced features (e.g., vision, voice, or plugin support).
Helps tailor the model to specific use cases and environments.
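As noted under Model Parameters, many of these settings can also be baked into the model itself on the Ollama side using a Modelfile. Below is a minimal, illustrative sketch based on the TinyLlama model pulled earlier; the parameter values, stop sequence, and system prompt are examples to adapt to your own use case:
# Modelfile (save in your project directory)
FROM tinyllama
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER stop "</s>"
SYSTEM "You are a concise assistant running on a Raspberry Pi."
Build and run the customised variant with:
ollama create tinyllama-custom -f Modelfile
ollama run tinyllama-custom
After reloading the page, the new tinyllama-custom model should also appear in the WebUI’s model dropdown.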
Conclusion
So far, we’ve covered the complete setup of Ollama and Open WebUI, including model configuration, user management, and ensuring smooth integration between the two. If you’re facing issues accessing models from the WebUI, double-check that the Docker network matches the configuration shown in the Compose file.
In the upcoming blog, we’ll dive into training your own AI agent for a specific requirement, including how to fine-tune models using custom data for targeted use cases. Stay tuned!