🎯
Learning

Organizations

@mosecorg

Pinned

  1. sail-sg/oat Public

    🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

    Python · 656 stars · 62 forks

  2. axon-rl/gem Public

    A Gym for Agentic LLMs

    Python · 488 stars · 33 forks

  3. sail-sg/understand-r1-zero Public

    Understanding R1-Zero-Like Training: A Critical Perspective

    Python · 1.3k stars · 59 forks

  4. mosecorg/mosec Public

    A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine

    Python · 900 stars · 72 forks

  5. spiral-rl/spiral Public

    SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

    Python · 189 stars · 22 forks

  6. sail-sg/Precision-RL Public

    Defeating the Training-Inference Mismatch via FP16

    Python · 193 stars · 17 forks