hijkzzz/README.md

🔭 I'm an RLer + NLPer/2 + MLSyser/2.

Jian Hu's GitHub stats

Pinned

  1. OpenRLHF/OpenRLHF Public

    An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)

    Python 9.5k 944

  2. Awesome-LLM-Strawberry Public

    A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

    6.9k 367

  3. pymarl2 Public

    Fine-tuned MARL algorithms on SMAC (100% win rates on most scenarios)

    Python 710 136

  4. alpha-zero-gomoku Public

    A Multi-threaded Implementation of AlphaZero (C++)

    Python 388 49

  5. NVIDIA-NeMo/RL Public

    Scalable toolkit for efficient model reinforcement

    Python 1.6k 387

  6. vllm-project/vllm Public

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python 80.5k 17k