
A simple yet powerful Python client for interacting with Model Context Protocol (MCP) servers using Ollama, allowing local LLMs to use tools.


MCP Client for Ollama (ollmcp)

Python 3.10+ · PyPI - Python Version · Build, Publish and Release CI

MCP Client for Ollama Demo

Overview

This project provides a robust Python-based client that connects to one or more Model Context Protocol (MCP) servers and uses Ollama to process queries with tool use capabilities. The client establishes connections to MCP servers, sends queries to Ollama models, and handles the tool calls the model makes.

This implementation was adapted from the Model Context Protocol quickstart guide and customized to work with Ollama, providing a user-friendly interface for interacting with LLMs that support function calling.
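
To make the flow concrete, here is a minimal sketch of the pattern the client builds on, using the official mcp Python SDK. It is illustrative only: the server script path is a placeholder, and the real client layers streaming, tool management, and the interactive UI on top.

# Minimal sketch (not the project's actual code): connect to one STDIO MCP server
# and list the tools it exposes, ready to be advertised to an Ollama model.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Placeholder server script; any MCP-compliant .py or .js server works.
    params = StdioServerParameters(command="python", args=["weather.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("Available tools:", [tool.name for tool in tools.tools])

asyncio.run(main())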

Features

  • 🌐 Multi-Server Support: Connect to multiple MCP servers simultaneously
  • 🚀 Multiple Transport Types: Supports STDIO, SSE, and Streamable HTTP server connections
  • 🎨 Rich Terminal Interface: Interactive console UI
  • 🖥️ Streaming Responses: View model outputs in real-time as they're generated
  • 🛠️ Tool Management: Enable/disable specific tools or entire servers during chat sessions
  • 🤖 Human-in-the-Loop (HIL): Review and approve tool executions before they run for enhanced control and safety
  • 🎨 Enhanced Tool Display: Beautiful, structured visualization of tool executions with JSON syntax highlighting
  • 🧠 Context Management: Control conversation memory with configurable retention settings
  • 🤔 Thinking Mode: Advanced reasoning capabilities with visible thought processes for supported models (deepseek-r1, qwen3)
  • 🔄 Cross-Language Support: Seamlessly work with both Python and JavaScript MCP servers
  • 🔍 Auto-Discovery: Automatically find and use Claude's existing MCP server configurations
  • 🎛️ Dynamic Model Switching: Switch between any installed Ollama model without restarting
  • 💾 Configuration Persistence: Save and load tool preferences between sessions
  • 🔄 Server Reloading: Hot-reload MCP servers during development without restarting the client
  • 📊 Usage Analytics: Track token consumption and conversation history metrics
  • 🔌 Plug-and-Play: Works immediately with standard MCP-compliant tool servers
  • 🔔 Update Notifications: Automatically detects when a new version is available

Requirements

  • Python 3.10 or newer
  • Ollama installed and running locally (or reachable at a custom host)
  • uv (optional, used for the uvx and from-source install options)

Quick Start

Option 1: Install with pip and run

pip install ollmcp
ollmcp

Option 2: One-step install and run

uvx ollmcp

Option 3: Install from source and run using a virtual environment

git clone https://github.com/jonigl/mcp-client-for-ollama.git
cd mcp-client-for-ollama
uv venv && source .venv/bin/activate
uv pip install .
uv run -m mcp_client_for_ollama

Usage

Run with default settings:

ollmcp

If you don't provide any options, the client will use auto-discovery mode to find MCP servers from Claude's configuration.

Command-line Arguments

Server Options:

  • --mcp-server: Path to one or more MCP server scripts (.py or .js). Can be specified multiple times.
  • --servers-json: Path to a JSON file with server configurations.
  • --auto-discovery: Auto-discover servers from Claude's default config file (default behavior if no other options provided).

Tip

On macOS, Claude's configuration file is typically located at: ~/Library/Application Support/Claude/claude_desktop_config.json

Model Options:

  • --model MODEL: Ollama model to use. Default: qwen2.5:7b
  • --host HOST: Ollama host URL. Default: http://localhost:11434

Usage Examples

Connect to a single server:

ollmcp --mcp-server /path/to/weather.py --model llama3.2:3b

Connect to multiple servers:

ollmcp --mcp-server /path/to/weather.py --mcp-server /path/to/filesystem.js --model qwen2.5:latest

Use a JSON configuration file:

ollmcp --servers-json /path/to/servers.json --model llama3.2:1b

Use a custom Ollama host:

ollmcp --host http://localhost:22545 --servers-json /path/to/servers.json --model qwen3:latest

Interactive Commands

During chat, use these commands:

ollmcp main interface

Command               Shortcut       Description
help                  h              Display help and available commands
tools                 t              Open the tool selection interface
model                 m              List and select a different Ollama model
context               c              Toggle context retention
thinking-mode         tm             Toggle thinking mode (deepseek-r1, qwen3 only)
show-thinking         st             Toggle thinking text visibility
show-tool-execution   ste            Toggle tool execution display visibility
human-in-loop         hil            Toggle Human-in-the-Loop confirmations for tool execution
clear                 cc             Clear conversation history and context
context-info          ci             Display context statistics
cls                   clear-screen   Clear the terminal screen
save-config           sc             Save current tool and model configuration to a file
load-config           lc             Load tool and model configuration from a file
reset-config          rc             Reset configuration to defaults (all tools enabled)
reload-servers        rs             Reload all MCP servers with current configuration
quit, exit            q or Ctrl+D    Exit the client

Tool and Server Selection

The tool and server selection interface allows you to enable or disable specific tools:

ollmcp tool and server selection interface

  • Enter numbers separated by commas (e.g. 1,3,5) to toggle specific tools
  • Enter ranges of numbers (e.g. 5-8) to toggle multiple consecutive tools
  • Enter S + number (e.g. S1) to toggle all tools in a specific server
  • a or all - Enable all tools
  • n or none - Disable all tools
  • d or desc - Show/hide tool descriptions
  • s or save - Save changes and return to chat
  • q or quit - Cancel changes and return to chat

Model Selection

The model selection interface shows all available models in your Ollama installation:

ollmcp model selection interface

  • Enter the number of the model you want to use
  • s or save - Save the model selection and return to chat
  • q or quit - Cancel the model selection and return to chat

Server Reloading for Development

The reload-servers command (rs) is particularly useful during MCP server development. It allows you to reload all connected servers without restarting the entire client application.

Key Benefits:

  • 🔄 Hot Reload: Instantly apply changes to your MCP server code
  • 🛠️ Development Workflow: Perfect for iterative development and testing
  • 📝 Configuration Updates: Automatically picks up changes in server JSON configs or Claude configs
  • 🎯 State Preservation: Maintains your tool enabled/disabled preferences across reloads
  • ⚡ Time Saving: No need to restart the client and reconfigure everything

When to Use:

  • After modifying your MCP server implementation
  • When you've updated server configurations in JSON files
  • After changing Claude's MCP configuration
  • During debugging to ensure you're testing the latest server version

Simply type reload-servers or rs in the chat interface, and the client will:

  1. Disconnect from all current MCP servers
  2. Reconnect using the same parameters (server paths, config files, auto-discovery)
  3. Restore your previous tool enabled/disabled settings
  4. Display the updated server and tool status

This feature dramatically improves the development experience when building and testing MCP servers.

Human-in-the-Loop (HIL) Tool Execution

The Human-in-the-Loop feature provides an additional safety layer by allowing you to review and approve tool executions before they run. This is particularly useful for:

  • πŸ›‘οΈ Safety: Review potentially destructive operations before execution
  • πŸ” Learning: Understand what tools the model wants to use and why
  • 🎯 Control: Selective execution of only the tools you approve
  • 🚫 Prevention: Stop unwanted tool calls from executing

HIL Confirmation Display

When HIL is enabled, you'll see a confirmation prompt before each tool execution:

Example:

πŸ§‘β€πŸ’» Human-in-the-Loop Confirmation
Tool to execute: weather.get_weather
Arguments:
  β€’ city: Miami

Options:
  y/yes - Execute the tool call
  n/no - Skip this tool call
  disable - Disable HIL confirmations permanently

What would you like to do? (y):

Human-in-the-Loop (HIL) Configuration

  • Default State: HIL confirmations are enabled by default for safety
  • Toggle Command: Use human-in-loop or hil to toggle on/off
  • Persistent Settings: HIL preference is saved with your configuration
  • Quick Disable: Choose "disable" during any confirmation to turn off permanently
  • Re-enable: Use the hil command anytime to turn confirmations back on

Benefits:

  • Enhanced Safety: Prevent accidental or unwanted tool executions
  • Awareness: Understand what actions the model is attempting to perform
  • Selective Control: Choose which operations to allow on a case-by-case basis
  • Peace of Mind: Full visibility and control over automated actions

Configuration Management

Tip

The client automatically loads the default configuration from ~/.config/ollmcp/config.json if it exists.

The client supports saving and loading tool configurations between sessions:

  • When using save-config, you can provide a name for the configuration or use the default
  • Configurations are stored in ~/.config/ollmcp/ directory
  • The default configuration is saved as ~/.config/ollmcp/config.json
  • Named configurations are saved as ~/.config/ollmcp/{name}.json

The configuration saves:

  • Current model selection
  • Enabled/disabled status of all tools
  • Context retention settings
  • Thinking mode settings
  • Tool execution display preferences
  • Human-in-the-Loop confirmation settings
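
As an illustration of the layout described above, the sketch below shows where the files live and how a saved configuration could be read. The configuration name and key used here are assumptions for illustration, not the client's documented schema.

# Sketch of the on-disk layout described above (illustrative only; the "model" key
# and the "my-setup" name are assumed examples, not the client's documented schema).
import json
from pathlib import Path

config_dir = Path.home() / ".config" / "ollmcp"
default_config = config_dir / "config.json"       # loaded automatically on startup
named_config = config_dir / "my-setup.json"       # e.g. created via `save-config my-setup`

if default_config.exists():
    settings = json.loads(default_config.read_text())
    print(settings.get("model"))                  # assumed key for the saved model selection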

Server Configuration Format

The JSON configuration file supports STDIO, SSE, and Streamable HTTP server types:

{
  "mcpServers": {
    "stdio-server": {
      "command": "command-to-run",
      "args": ["arg1", "arg2", "..."],
      "env": {
        "ENV_VAR1": "value1",
        "ENV_VAR2": "value2"
      },
      "disabled": false
    },
    "sse-server": {
      "type": "sse",
      "url": "http://localhost:8000/sse",
      "headers": {
        "Authorization": "Bearer your-token-here"
      },
      "disabled": false
    },
    "http-server": {
      "type": "streamable_http",
      "url": "http://localhost:8000/mcp",
      "headers": {
        "X-API-Key": "your-api-key-here"
      },
      "disabled": false
    }
  }
}

Note

If you specify a URL without a type, the client will default to using Streamable HTTP transport.

Compatible Models

The following Ollama models work well with tool use:

  • qwen2.5
  • qwen3
  • llama3.1
  • llama3.2
  • mistral

For a complete list of Ollama models with tool use capabilities, visit the official Ollama models page.

How Tool Calls Work

  1. The client sends your query to Ollama with a list of available tools
  2. If Ollama decides to use a tool, the client:
    • Displays the tool execution with formatted arguments and syntax highlighting
    • NEW: Shows a Human-in-the-Loop confirmation prompt (if enabled) allowing you to review and approve the tool call
    • Extracts the tool name and arguments from the model response
    • Calls the appropriate MCP server with these arguments (only if approved or HIL is disabled)
    • Shows the tool response in a structured, easy-to-read format
    • Sends the tool result back to Ollama for final processing
    • Displays the model's final response incorporating the tool results
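
For illustration, a condensed sketch of this loop is shown below, assuming a recent ollama Python package with typed responses. call_mcp_tool is a hypothetical helper standing in for the MCP call, the tool-message shape is simplified, and Human-in-the-Loop approval and streaming are omitted.

# Condensed sketch of the query → tool call → final answer loop (illustrative only).
# Assumes a recent `ollama` Python package; `call_mcp_tool` is a hypothetical helper
# that forwards the call to the right MCP server session.
import ollama

def run_query(query, mcp_tools, call_mcp_tool, model="qwen2.5:7b"):
    # Advertise the MCP tools to Ollama using its function-calling schema.
    tool_specs = [{
        "type": "function",
        "function": {
            "name": tool.name,
            "description": tool.description,
            "parameters": tool.inputSchema,
        },
    } for tool in mcp_tools]

    messages = [{"role": "user", "content": query}]
    response = ollama.chat(model=model, messages=messages, tools=tool_specs)
    messages.append(response.message)

    # Execute each requested tool (after HIL approval in the real client) and feed
    # the results back so the model can produce its final answer.
    for call in response.message.tool_calls or []:
        result = call_mcp_tool(call.function.name, call.function.arguments)
        messages.append({"role": "tool", "content": str(result)})

    final = ollama.chat(model=model, messages=messages)
    return final.message.content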

Where Can I Find More MCP Servers?

You can explore a collection of MCP servers in the official MCP Servers repository.

This repository contains reference implementations for the Model Context Protocol, community-built servers, and additional resources to enhance your LLM tool capabilities.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments


Made with ❤️ by jonigl
