From a model in a Jupyter notebook to a production API service in 5 minutes
BentoML
Installation | Getting Started | Documentation | Examples | Contributing | License
BentoML is a Python framework for building, shipping, and running machine learning services. It provides high-level APIs for defining an ML service and packaging its artifacts, source code, dependencies, and configurations into a production-friendly format that is ready for deployment.
Use BentoML if you need to:

- Turn your ML model into a REST API server, serverless endpoint, PyPI package, or CLI tool
- Manage the workflow of creating and deploying an ML service
Installation
    pip install bentoml

Getting Started
Defining a machine learning service with BentoML is as simple as a few lines of code:
    from bentoml import BentoService, api, env, artifacts
    from bentoml.artifact import PickleArtifact
    from bentoml.handlers import DataframeHandler

    @artifacts([PickleArtifact('model')])
    @env(conda_pip_dependencies=["scikit-learn"])
    class IrisClassifier(BentoService):

        @api(DataframeHandler)
        def predict(self, df):
            return self.artifacts.model.predict(df)

Try out our 5-minute getting started guide, using BentoML to productionize a scikit-learn model and deploy it to AWS Lambda.
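Here `DataframeHandler` passes the request payload to `predict` as a pandas `DataFrame`, and `self.artifacts.model` is simply the pickled estimator. As a stand-alone sketch of what that call does (scikit-learn and pandas only, no BentoML required):

```python
import pandas as pd
from sklearn import datasets, svm

# Train the kind of iris model the service above would wrap
iris = datasets.load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
clf = svm.SVC()
clf.fit(X, iris.target)

# Inside the service, predict(df) delegates to the model exactly like this
sample = X.iloc[[0]]  # one Iris setosa measurement
print(clf.predict(sample))
```

The service class merely wraps this call behind an API, so anything exposing a `predict` method over a `DataFrame` fits the same pattern.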
Feature Highlights
- Multiple Distribution Formats - Easily package your machine learning models and preprocessing code into a format that works best with your inference scenario:
  - Docker Image - deploy as containers running a REST API server
  - PyPI Package - integrate into your Python applications seamlessly
  - CLI tool - put your model into an Airflow DAG or CI/CD pipeline
  - Spark UDF - run batch serving on large datasets with Spark
  - Serverless Function - host your model on serverless platforms such as AWS Lambda
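Whatever the distribution format, the bundle carries the same serialized artifacts. The mechanism behind `PickleArtifact`, for example, is an ordinary pickle round-trip; a minimal stand-alone illustration (`TinyModel` is a hypothetical stand-in, not a BentoML API):

```python
import pickle

class TinyModel:
    """Hypothetical model exposing the predict() interface BentoML expects."""
    def predict(self, xs):
        return [x * 2 for x in xs]

# A PickleArtifact-style round-trip: serialize at save time, restore at load time
blob = pickle.dumps(TinyModel())
restored = pickle.loads(blob)
print(restored.predict([1, 2, 3]))
```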
- Multiple Framework Support - BentoML supports a wide range of ML frameworks out of the box, including TensorFlow, PyTorch, Scikit-Learn, XGBoost, H2O, and FastAI, and can be easily extended to work with new or custom frameworks.
- Deploy Anywhere - A BentoML-bundled ML service can be easily deployed with platforms such as Docker, Kubernetes, Serverless, Airflow, and Clipper, on cloud platforms including AWS, Google Cloud, and Azure.
- Custom Runtime Backend - Easily integrate your Python preprocessing code with high-performance deep learning runtime backends, such as TensorFlow Serving.
Documentation
Full documentation and API references can be found at bentoml.readthedocs.io
Examples
All examples can be found under the BentoML/examples directory. More tutorials and examples coming soon!
- Quick Start Guide - Google Colab | nbviewer | source
- Scikit-learn Sentiment Analysis - Google Colab | nbviewer | source
- Keras Text Classification - Google Colab | nbviewer | source
- Keras Fashion MNIST classification - Google Colab | nbviewer | source
- FastAI Pet Classification - Google Colab | nbviewer | source
- FastAI Tabular CSV - Google Colab | nbviewer | source
- PyTorch Fashion MNIST classification - Google Colab | nbviewer | source
- XGBoost Titanic Survival Prediction - Google Colab | nbviewer | source
- H2O Classification - Google Colab | nbviewer | source
Deployment guides:
- Serverless deployment with AWS Lambda
- API server deployment with AWS SageMaker
- API server deployment with Clipper
- (WIP) API server deployment on Kubernetes
We collect example notebook page views to help us improve this project.
To opt out of tracking, delete the [Impression] line in the first markdown cell of any example notebook.
