Run Llama 2 locally with a Gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Supports Llama-2-7B/13B/70B with 8-bit and 4-bit quantization, plus GPU inference (6 GB VRAM) and CPU inference.
Updated Jul 30, 2023 - Python
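The "6 GB VRAM" figure for 4-bit inference follows from simple weight-size arithmetic. A minimal sketch (not taken from any of the listed repos) estimating weight memory per quantization level:

```python
def weight_memory_gib(n_params_billion: float, bits_per_weight: int) -> float:
    """Estimate the memory needed just to hold model weights, in GiB.

    Activations, KV cache, and framework overhead add more on top,
    which is why a 7B model at 4 bits (~3.3 GiB of weights) is quoted
    as needing ~6 GB of VRAM rather than ~3.5 GB.
    """
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 2**30

# Llama-2-7B, weights only:
# 16-bit ~13.0 GiB, 8-bit ~6.5 GiB, 4-bit ~3.3 GiB
fp16 = weight_memory_gib(7, 16)
int8 = weight_memory_gib(7, 8)
int4 = weight_memory_gib(7, 4)
```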
Ray Aviary - evaluate multiple LLMs easily
This project aims to share technical principles and practical experience related to large language models.
LMDeploy is a toolkit for compressing, deploying, and serving LLMs
Extends the Hugging Face Transformers APIs for Transformer-based models and improves the productivity of inference deployment. With extremely compressed models, the toolkit can greatly improve inference efficiency on Intel platforms.
A suite of hands-on training materials showing how to scale CV, NLP, and time-series forecasting workloads with Ray.
LLMFlows - Simple, Explicit and Transparent LLM Apps
A comprehensive sample that deploys a multi-LLM-powered chatbot on AWS using the AWS CDK
Run any Large Language Model behind a unified API
Aoororachain is a Ruby chain tool for working with LLMs
A tool for generating function arguments and choosing what function to call with local LLMs
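The function-calling idea above (the model chooses a function and fills in its arguments) can be sketched without any particular library. The registry and dispatch helper below are hypothetical illustrations of the pattern, not that tool's actual API:

```python
import json

# Hypothetical tool registry, keyed by function name.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

def add(a: int, b: int) -> int:
    return a + b

FUNCTIONS = {"get_weather": get_weather, "add": add}

def dispatch(model_output: str):
    """Parse the model's JSON tool call and invoke the chosen function.

    A local LLM is prompted to emit JSON such as
    {"function": "add", "arguments": {"a": 2, "b": 3}};
    this helper validates the function name and applies the arguments.
    """
    call = json.loads(model_output)
    fn = FUNCTIONS.get(call["function"])
    if fn is None:
        raise ValueError(f"unknown function: {call['function']}")
    return fn(**call["arguments"])

# Example: a model response that chose the add tool.
result = dispatch('{"function": "add", "arguments": {"a": 2, "b": 3}}')
# → 5
```

In practice the hard part is constraining the model's output to valid JSON (e.g. via grammar-based sampling); the dispatch step itself stays this simple.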
Deploy and Scale LLM-based applications
Your cross-cloud AI substrate
Ray and Anyscale for UC Berkeley AI Hackathon!
Google Colabs to run many Falcon based LLMs
🦙 Free and open-source Large Language Model (LLM) chatbot web UI and API. Self-hosted, offline-capable, and easy to set up. Based on langchain and llama2
A collection of all available inference solutions for LLMs
LLM theoretical performance analysis tools, supporting parameter-count, FLOPs, memory, and latency analysis.
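As a rough illustration of the kind of analysis such tools perform (a sketch, not this project's code), the standard decoder-transformer approximations for parameter count and per-token FLOPs can be computed directly. The layer shape assumed below is the vanilla 4x-MLP transformer, ignoring biases, norms, and Llama-style SwiGLU/GQA variations:

```python
def transformer_params(n_layers: int, d_model: int, vocab: int) -> int:
    """Approximate parameter count of a decoder-only transformer.

    Per layer: ~4*d^2 for attention (Q, K, V, O projections) plus
    ~8*d^2 for a 4x-expanded MLP, i.e. ~12*d^2 per layer; plus the
    token embedding matrix (vocab * d).
    """
    return 12 * n_layers * d_model**2 + vocab * d_model

def flops_per_token(n_params: int) -> int:
    """Forward-pass FLOPs per generated token: ~2 per parameter
    (one multiply and one add per weight)."""
    return 2 * n_params

# GPT-2-small-like shape: 12 layers, d_model=768, ~50k vocab.
p = transformer_params(12, 768, 50257)  # ~1.24e8, close to the 124M figure
```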
LLM-PowerHouse: Unleash LLMs' potential through curated tutorials, best practices, and ready-to-use code for custom training and inferencing.