The Wayback Machine - https://web.archive.org/web/20240921134722/https://github.com/DefTruth
🎯
#pragma unroll

DefTruth/README.md

llm-inference

Pinned

  1. lite.ai.toolkit Public

    🛠 A lite C++ toolkit of awesome AI models, with support for ONNXRuntime, MNN, TNN, NCNN and TensorRT.

    C++ 3.6k 687

  2. vllm-project/vllm Public

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python 27.1k 4k

  3. Awesome-LLM-Inference Public

    📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

    2.5k 170

  4. PaddlePaddle/FastDeploy Public

    ⚡️An Easy-to-use and Fast Deep Learning Model Deployment Toolkit for ☁️Cloud 📱Mobile and 📹Edge. Including Image, Video, Text and Audio 20+ main stream scenarios and 150+ SOTA models with end-to-end…

    C++ 2.9k 455

  5. CUDA-Learn-Notes Public

    🎉 CUDA Learn Notes with PyTorch: fp32, fp16/bf16, fp8/int8, flash_attn, sgemm, sgemv, warp/block reduce, dot prod, elementwise, softmax, layernorm, rmsnorm, hist, etc.

    Cuda 1.2k 121

  6. statistic-learning-R-note Public

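    📒 A 200-page PDF that walks through hand-derived formulas from Li Hang's "Statistical Learning Methods" (《统计学习方法》) in detail, with a detailed table of contents and R implementations. It can be read alongside the book to study more efficiently and is suitable for beginners in machine learning and deep learning.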

    403 55