Run Llama 2 locally with a Gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Supports Llama-2-7B/13B/70B with 8-bit and 4-bit quantization, plus GPU inference (6 GB VRAM) and CPU inference.
Updated Jul 30, 2023 - Python
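The "6 GB VRAM" figure for 4-bit inference follows from simple weight-size arithmetic. A minimal sketch (not taken from any of the listed repos) estimating weight memory per quantization level:

```python
def weight_memory_gib(n_params_billion: float, bits_per_weight: int) -> float:
    """Estimate the memory needed just to hold model weights, in GiB.

    Activations, KV cache, and framework overhead add more on top,
    which is why a 7B model at 4 bits (~3.3 GiB of weights) is quoted
    as needing ~6 GB of VRAM rather than ~3.5 GB.
    """
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 2**30

# Llama-2-7B, weights only:
# 16-bit ~13.0 GiB, 8-bit ~6.5 GiB, 4-bit ~3.3 GiB
fp16 = weight_memory_gib(7, 16)
int8 = weight_memory_gib(7, 8)
int4 = weight_memory_gib(7, 4)
```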
Ray Aviary - evaluate multiple LLMs easily
This project aims to share technical principles and practical experience related to large language models.
LMDeploy is a toolkit for compressing, deploying, and serving LLMs
Extends the Hugging Face Transformers APIs for Transformer-based models and improves the productivity of inference deployment. With extremely compressed models, the toolkit can greatly improve inference efficiency on Intel platforms.
A suite of hands-on training materials showing how to scale CV, NLP, and time-series forecasting workloads with Ray.
LLMFlows - Simple, Explicit and Transparent LLM Apps
A comprehensive sample that deploys a multi-LLM-powered chatbot on AWS using the AWS CDK
Run any Large Language Model behind a unified API
Aoororachain is a Ruby chain tool for working with LLMs
A tool for generating function arguments and choosing what function to call with local LLMs
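The function-calling idea above (the model chooses a function and fills in its arguments) can be sketched without any particular library. The registry and dispatch helper below are hypothetical illustrations of the pattern, not that tool's actual API:

```python
import json

# Hypothetical tool registry, keyed by function name.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

def add(a: int, b: int) -> int:
    return a + b

FUNCTIONS = {"get_weather": get_weather, "add": add}

def dispatch(model_output: str):
    """Parse the model's JSON tool call and invoke the chosen function.

    A local LLM is prompted to emit JSON such as
    {"function": "add", "arguments": {"a": 2, "b": 3}};
    this helper validates the function name and applies the arguments.
    """
    call = json.loads(model_output)
    fn = FUNCTIONS.get(call["function"])
    if fn is None:
        raise ValueError(f"unknown function: {call['function']}")
    return fn(**call["arguments"])

# Example: a model response that chose the add tool.
result = dispatch('{"function": "add", "arguments": {"a": 2, "b": 3}}')
# → 5
```

In practice the hard part is constraining the model's output to valid JSON (e.g. via grammar-based sampling); the dispatch step itself stays this simple.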
Deploy and Scale LLM-based applications
Your cross-cloud AI substrate
Ray and Anyscale for UC Berkeley AI Hackathon!
Google Colabs to run many Falcon based LLMs
🦙 Free and open-source Large Language Model (LLM) chatbot web UI and API. Self-hosted, offline-capable, and easy to set up. Based on langchain and llama2
A collection of all available inference solutions for LLMs
LLM theoretical performance analysis tools, supporting parameter-count, FLOPs, memory, and latency analysis.
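As a rough illustration of the kind of analysis such tools perform (a sketch, not this project's code), the standard decoder-transformer approximations for parameter count and per-token FLOPs can be computed directly. The layer shape assumed below is the vanilla 4x-MLP transformer, ignoring biases, norms, and Llama-style SwiGLU/GQA variations:

```python
def transformer_params(n_layers: int, d_model: int, vocab: int) -> int:
    """Approximate parameter count of a decoder-only transformer.

    Per layer: ~4*d^2 for attention (Q, K, V, O projections) plus
    ~8*d^2 for a 4x-expanded MLP, i.e. ~12*d^2 per layer; plus the
    token embedding matrix (vocab * d).
    """
    return 12 * n_layers * d_model**2 + vocab * d_model

def flops_per_token(n_params: int) -> int:
    """Forward-pass FLOPs per generated token: ~2 per parameter
    (one multiply and one add per weight)."""
    return 2 * n_params

# GPT-2-small-like shape: 12 layers, d_model=768, ~50k vocab.
p = transformer_params(12, 768, 50257)  # ~1.24e8, close to the 124M figure
```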
LLM-PowerHouse: Unleash LLMs' potential through curated tutorials, best practices, and ready-to-use code for custom training and inferencing.