Chris Burns for Stacklok

ToolHive: A Kubernetes Operator for Deploying MCP Servers

Introduction

Building on our earlier discussion of why enterprises need dedicated hosting for MCP servers and how ToolHive solves this with Kubernetes, we're excited to announce our new Kubernetes Operator for ToolHive. This specialised tool streamlines the secure deployment of MCP servers to Kubernetes environments for enterprises and engineers.

In this article, we'll explore practical ways to leverage this new operator's capabilities.

Let's jump right in! 🚀

Deploying the Operator

For the installation of the ToolHive Operator, we assume a Kubernetes cluster with an Ingress controller is already available. We have used Kind for this post as it is free and simple to set up and use.

For a simplified local ingress setup with Kind, we utilise a basic IP with the Kind Load Balancer - feel free to follow our guide for easy steps on how to do this. To keep things straightforward, we won't use a local hostname in this setup.
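For reference, creating a basic Kind cluster is a single command (the toolhive cluster name here is arbitrary):

$ kind create cluster --name toolhive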

Now, with a running cluster, execute the following Helm commands (remember to adjust the --kubeconfig and --kube-context flags as needed).

  1. Install the ToolHive Operator Custom Resource Definitions (CRDs):

    $ helm upgrade -i toolhive-operator-crds oci://ghcr.io/stacklok/toolhive/toolhive-operator-crds
    
  2. Deploy the Operator:

    $ helm upgrade -i toolhive-operator oci://ghcr.io/stacklok/toolhive/toolhive-operator -n toolhive-system --create-namespace
    

At this point, the ToolHive Kubernetes Operator should be installed and running.

To verify this, run the following:

$ kubectl get pods -n toolhive-system

NAME                                READY   STATUS    RESTARTS   AGE
toolhive-operator-7f946d9c5-9s8dk   1/1     Running   0          59s

Deploy an MCP Server

Now, to install a sample fetch MCP server, run the following:

$ kubectl apply -f https://raw.githubusercontent.com/stacklok/toolhive/main/examples/operator/mcp-servers/mcpserver_fetch.yaml

To verify this has been installed, run the following:

$ kubectl get pods -n toolhive-system -l toolhive=true

NAME                     READY   STATUS    RESTARTS   AGE
fetch-0                  1/1     Running   0          115s
fetch-649c5b958c-nhjbq   1/1     Running   0          2m1s

As shown above, two pods are running. The fetch MCP server (fetch-0) is the pod managed by the MCP server's StatefulSet. The other - fetch-xxxxxxxxxx-xxxxx - is the proxy server that handles all communication between the fetch MCP server and external callers.
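The operator also fronts the proxy with a Kubernetes Service. We can list the Services in the namespace to find it (the mcp-fetch-proxy name is the one referenced by the Ingress later in this post):

$ kubectl get svc -n toolhive-system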

Stepping back, let's review how the MCP server was created. Here is the fetch MCP server resource that we applied to the cluster:

apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPServer
metadata:
  name: fetch
  namespace: toolhive-system
spec:
  image: docker.io/mcp/fetch
  transport: stdio
  port: 8080
  permissionProfile:
    type: builtin
    name: network
  podTemplateSpec:
    spec:
      containers:
        - name: mcp
          securityContext:
            allowPrivilegeEscalation: false
            runAsNonRoot: false
            runAsUser: 0
            runAsGroup: 0
            capabilities:
              drop:
              - ALL
          resources:
            limits:
              cpu: "500m"
              memory: "512Mi"
            requests:
              cpu: "100m"
              memory: "128Mi"
      securityContext:
        runAsNonRoot: false
        runAsUser: 0
        runAsGroup: 0
        seccompProfile:
          type: RuntimeDefault
  resources:
    limits:
      cpu: "100m"
      memory: "128Mi"
    requests:
      cpu: "50m"
      memory: "64Mi"

The ToolHive Operator introduces a new Custom Resource called MCPServer. Here's a breakdown of the MCPServer configuration:

  1. transport: stdio - Creates the MCP server so that it accepts only stdin/stdout traffic. In Kubernetes, this results in the proxy server attaching to the container via the Kubernetes API; no other access is given to the caller.
  2. permissionProfile.type: builtin - References the permission profiles built into ToolHive.
  3. permissionProfile.name: network - Permits outbound network connections to any host on any port (not recommended for production use).
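Because MCPServer is a standard custom resource, we can also inspect it like any other Kubernetes object (the mcpservers plural resource name is assumed from the CRD):

$ kubectl get mcpservers -n toolhive-system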

Now, to connect an example client such as Cursor to our MCP server, we can simply create an Ingress record that is exposed via the Load Balancer mentioned earlier.

We can apply the following Ingress entry, ensuring that the ingressClassName matches what we have in our cluster.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mcp-fetch-ingress
  namespace: toolhive-system
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: mcp-fetch-proxy
            port:
              number: 8080
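If you're unsure which ingressClassName to use, you can list the classes available in your cluster:

$ kubectl get ingressclass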

At this point we should be able to connect to the running fetch MCP server using the external IP address of our Load Balancer.

Note: If you have not chosen Kind for the cluster, or your Load Balancer setup differs from the one followed in this post, you will need to adjust your configuration accordingly so that ingress traffic reaches the fetch server's proxy service.
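Before wiring up a client, we can sanity-check the SSE endpoint with curl (this assumes the Ingress is reachable at localhost:8080, matching the client configuration below; substitute your Load Balancer address if it differs):

$ curl -N http://localhost:8080/sse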

Because we did not use the CLI to create the MCP server, the server configuration was not automatically applied to our local client configurations, so we have to add it manually.

For Cursor, we edit ~/.cursor/mcp.json (for example, /Users/$USERNAME/.cursor/mcp.json on macOS, replacing $USERNAME with our username) and add the following:

{
    "mcpServers": {
        "fetch": {"url": "http://localhost:8080/sse#fetch"}
    }
}

Now, if we go into the Cursor chat and ask it to fetch the contents of a web page, it should ask us for approval to use the fetch MCP server and then return the content.

[Image: Cursor Fetch MCP]

Now let's look at the logs of the fetch MCP server.

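We can tail the MCP server pod directly (assuming the fetch-0 pod name from earlier):

$ kubectl logs -f fetch-0 -n toolhive-system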
$ {"jsonrpc":"2.0","id":2,"result":{"content":[{"type":"text","text":"Contents of https://chrisjburns.com/:\n\n\nchrisjburns\n\n# Chris Burns\n\n## Software engineer\n\n"}],"isError":false}}
$ {"jsonrpc":"2.0","id":2,"result":{"content":[{"type":"text","text":"Content type text/html; charset=utf-8 cannot be simplified to markdown, but here is the raw content:\nContents of https://chrisjburns.com/:\n<!doctype html><html lang=en><head><meta charset=utf-8><meta name=viewport content=\"width=device-width,initial-scale=1\"><meta name=author content=\"Chris Burns\"><meta name=keywords content=\"blog,developer,personal\"><meta name=twitter:card content=\"summary\"><meta name=twitter:title content=\"chrisjburns\"><meta name=twitter:description content><meta property=\"og:title\" content=\"chrisjburns\"><meta property=\"og:description\" content><meta property=\"og:type\" content=\"website\"><meta property=\"og:url\" content=\"https://chrisjburns.com/\"><meta property=\"og:updated_time\" content=\"2020-05-20T00:18:23+01:00\"><base href=https://chrisjburns.com/><title>chrisjburns</title><link rel=canonical href=https://chrisjburns.com/><link href=\"https://fonts.googleapis.com/css?family=Lato:400,700%7CMerriweather:300,700%7CSource+Code+Pro:400,700\" rel=stylesheet><link href=\"https://fonts.googleapis.com/css?family=Montserrat:400,700|Open+Sans:400,600,300,800,700\" rel=stylesheet type=text/css><link rel=stylesheet href=https://use.fontawesome.com/releases/v5.11.2/css/all.css integrity=sha384-KA6wR/X5RY4zFAHpv/CnoG2UW1uogYfdnP67Uv7eULvTveboZJg0qUpmJZb5VqzN crossorigin=anonymous><link rel=stylesheet href=https://cdnjs.cloudflare.com/ajax/libs/normalize/8.0.1/normalize.min.css integrity=\"sha256-l85OmPOjvil/SOvVt3HnSSjzF1TUMyT9eV0c2BzEGzU=\" crossorigin=anonymous><link rel=stylesheet href=https://chrisjburns.com/css/coder.min.9f38ad26345e306650770a3b91475e09efa3026c59673a09eff165cfa8f1a30e.css integrity=\"sha256-nzitJjReMGZQdwo7kUdeCe+jAmxZZzoJ7/Flz6jxow4=\" crossorigin=anonymous media=screen><link rel=icon type=image/png href=https://chrisjburns.com/images/favicon-32x32.png sizes=32x32><link rel=icon type=image/png href=https://chrisjburns.com/images/favicon-16x16.png sizes=16x16><link rel=alternate type=application/rss+xml href=https://chrisjburns.com/index.xml title=chrisjburns><meta name=generator content=\"Hugo 0.63.2\"></head><body class=colorscheme-light><main class=wrapper><nav class=navigation><section class=container><a class=navigation-title href=https://chrisjburns.com/>chrisjburns</a>\n<input type=checkbox id=menu-toggle>\n<label class=\"menu-button float-right\" for=menu-toggle><i class=\"fas fa-bars\"></i></label><ul class=navigation-list><li class=navigation-item><a class=navigation-link href=https://chrisjburns.com/posts/>BLOG</a></li></ul></section></nav><div class=content><section class=\"container centered\"><div class=about><div class=avatar><img src=https://chrisjburns.com/images/avatar.jpg alt=avatar></div><h1>Chris Burns</h1><h2>Software engineer</h2><ul><li><a href=https://github.com/ChrisJBurns/ aria-label=Github><i class=\"fab fa-github\" aria-hidden=true></i></a></li><li><a href=https://www.linkedin.com/in/chris-j-burns/ aria-label=LinkedIn><i class=\"fab fa-linkedin\" aria-hidden=true></i></a></li></ul><img src=https://ghchart.rshah.org/ChrisJBurns alt=\"Chris Burns's Github chart\"></div></section></div><footer class=footer><section class=container></section></footer></main></body></html>"}],"isError":false}}

There we have it: an MCP server created in Kubernetes using the new ToolHive Operator.

Summary

At this point, we hope the power this gives engineers and enterprises that want to run MCP servers within Kubernetes is clear. Those who have already worked with Operators know how capable they are when it comes to creating and managing workloads inside Kubernetes. At Stacklok, we know that behind the Operator we can hide away much of the complexity that is normally burdened onto the engineer. We really are excited to release this, and we are even more excited to see where it goes.

Give it a try, and let us know what you think!
