Summary

Configuration

API documentation for vLLM's configuration classes.

Offline Inference

The LLM class.

Input types accepted by the LLM class.

vLLM Engines

Engine classes for offline and online inference.

Inference Parameters

Inference parameters for vLLM APIs.

Multi-Modality

vLLM provides experimental support for multi-modal models through the vllm.multimodal package.

Multi-modal inputs can be passed alongside text and token prompts to supported models via the multi_modal_data field in vllm.inputs.PromptType.
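As a minimal sketch of the input shape described above: a multi-modal prompt is a dictionary carrying the text prompt alongside a `multi_modal_data` mapping from modality name to data. The helper name and model choice below are illustrative assumptions, not part of the vLLM API; actually running generation requires vLLM and a supported multi-modal model.

```python
# Sketch of a multi-modal prompt, following the structure described for
# vllm.inputs.PromptType: a dict with the text prompt plus a
# "multi_modal_data" mapping from modality (e.g. "image") to the data.

def build_image_prompt(text: str, image) -> dict:
    """Hypothetical helper: pack text and an image into the prompt
    structure vLLM expects. `image` is typically a PIL.Image.Image,
    but any object the target model's processor accepts can be used."""
    return {
        "prompt": text,
        "multi_modal_data": {"image": image},
    }

# With vLLM installed, such a prompt could then be passed to a
# multi-modal model via LLM.generate (model name is an assumption):
#   llm = LLM(model="llava-hf/llava-1.5-7b-hf")
#   outputs = llm.generate(build_image_prompt("Describe the image.", img))
prompt = build_image_prompt("USER: <image>\nWhat is shown here? ASSISTANT:", None)
```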

Looking to add your own multi-modal model? Please follow the instructions in the Model Development section.

Inputs

User-facing inputs.

Internal data structures.

Data Parsing

Data Processing

Memory Profiling

Registry

Model Development