leideng/README.md

Hi there 👋

  • 🚀 I train LLMs and run inference on Ascend NPUs.
  • 🧠 My current research interest is efficient AI, with a particular focus on sparse attention.
  • 🎓 I am both a researcher and an engineer 🛠️.
  • 🔧 I design algorithms and build systems that work.
  • ⚡ I believe that in the LLM era, while an idea is important, the ability to quickly and efficiently implement that idea is even more vital.

Pinned

  1. AI-primer Public

    Jupyter Notebook 2 1

  2. vllm Public

    Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python 1

  3. vllm-ascend Public

    Forked from vllm-project/vllm-ascend

    Community maintained hardware plugin for vLLM on Ascend

    Python

  4. unified-cache-management Public

    Forked from ModelEngine-Group/unified-cache-management

    Persist and reuse KV Cache to speed up your LLM.

    Python

  5. nanochat-ascend Public

    Training Karpathy's nanochat on Ascend NPUs

    Python