Pulse · modelscope/ms-swift

June 25, 2025 – June 26, 2025

Overview

5 Active pull requests

98 Active issues

4 Pull requests merged by 3 people

compat transformers==4.52 (vlm)
#4738 merged Jun 26, 2025
[grpo] check liger & sp
#4734 merged Jun 26, 2025
[grpo] fix max_step for dataloader when applying sequence parallel
#4731 merged Jun 26, 2025
[quant] Support fp8
#4729 merged Jun 26, 2025

1 Pull request opened by 1 person

[megatron] support fp8
#4730 opened Jun 26, 2025

91 Issues closed by 6 people

After full-parameter DPO fine-tuning, the Qwen3-4B model no longer outputs think
#4701 closed Jun 27, 2025
How to customize the format reward in GRPO
#4667 closed Jun 26, 2025
[grpo] loading BERT model in reward
#4580 closed Jun 26, 2025
Loss and grad_norm stay at 0 during GRPO training
#4570 closed Jun 26, 2025
When will GRPO support multi-node megatron training
#4558 closed Jun 26, 2025
Reward std is always 0 during GRPO training
#4512 closed Jun 26, 2025
Multi-node training with --vllm_mode server hangs and cannot run
#4532 closed Jun 26, 2025
GRPO Qwen3 32B training torch issue
#4491 closed Jun 26, 2025
qwen3 reinforcement training: communication error after grpo training finishes
#4170 closed Jun 26, 2025
The expanded size of the tensor (8) must match the existing size (5) at non-singleton dimension 0.
#4056 closed Jun 26, 2025
Error at the end of training: /data/chatglm/retrieval_agent_new/ms_swift_train/ms-swift/swift/cli/rlhf.py FAILED
#4302 closed Jun 26, 2025
During dapo, training hangs at "UserWarning: None of the inputs have requires_grad=True. Gradients will be None" until timeout
#4050 closed Jun 26, 2025
Training qwen2.5-7b-instruct with grpo produces !!!!
#4060 closed Jun 26, 2025
Training runs normally, but eval raises an assert error
#4081 closed Jun 26, 2025
Batch size in GRPO.
#4341 closed Jun 26, 2025
Reward function registration fails in grpo training
#4351 closed Jun 26, 2025
GRPO data passing fails
#4362 closed Jun 26, 2025
Qwen-Omni full-parameter grpo fine-tuning raises ValueError: `max_new_tokens` must be greater than 0, but is -16384
#4392 closed Jun 26, 2025
Error in multimodal GRPO fine-tuning
#4470 closed Jun 26, 2025
Will GRPO fine-tuning of Qwen2.5-VL-3B on two A6000 GPUs OOM?
#4477 closed Jun 26, 2025
Running sft-rlhf-grpo fine-tuning on RTX3090 raises: torch.distributed.DistBackendError: [3] is setting up NCCL communicator and retrieving ncclUniqueId from [0] via c10d key-value store by key '0', but store->get('0') got error: wait timeout after 1800000ms,
#3612 closed Jun 26, 2025
Any plans to support megatron for GRPO training?
#3760 closed Jun 26, 2025
LLava cannot run GRPO successfully
#3928 closed Jun 26, 2025
QWQ: GRPO training fails with "RuntimeError: ACL stream synchronize failed, error code:107020"
#3932 closed Jun 26, 2025
While training GRPO, I noticed that my model crashes. Its loss is 0, its grad_norm and kl are both Nan, and it completes as “!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!””
#3930 closed Jun 26, 2025
GRPO training errors out partway through
#3771 closed Jun 26, 2025
grpo training hangs and keeps showing the following problem.
#3794 closed Jun 26, 2025
GRPO training error
#3769 closed Jun 26, 2025
Various traceback error during GRPO training
#3836 closed Jun 26, 2025
Contributing a dockerfile: tested with multimodal grpo training, it can largely reproduce the results in the examples
#3812 closed Jun 26, 2025
In GRPO, if reward_model is set instead of --reward_funcs, the reward model and the model are both loaded onto a single GPU
#3843 closed Jun 26, 2025
Meet GPU OutOfMemory in GRPO training
#3848 closed Jun 26, 2025
OOM when training a 32b model with grpo
#3871 closed Jun 26, 2025
Performance drops sharply after 100 steps of GRPO training; what is the cause?
#3876 closed Jun 26, 2025
Bug! Checkpoint resume failure - deepspeed different DP size. Is there a quick checkpoint converter anywere?
#3989 closed Jun 26, 2025
Bug! Help! MS-SWIFT GRPO + LoRA training hung/stuck after training 1 step from full merged model merged from lora adapter
#3990 closed Jun 26, 2025
if sleep_level > 0, gradient_accumulation_steps will be forced to 1
#3943 closed Jun 26, 2025
The GRPO training process hangs for multi-node training.
#3934 closed Jun 26, 2025
Training speed issue in the NPU environment
#3331 closed Jun 26, 2025
Looking for a script that can run GRPO on the Qwen2.5 72B model with 8 A100 GPUs
#3416 closed Jun 26, 2025
GRPO training raises an error when using 2 nodes with --num_infer_workers 2
#3393 closed Jun 26, 2025
grpo training based on qwenvl-7b-instruct OOMs during eval
#3541 closed Jun 26, 2025
Running the lora_vllm script on 4*v100 raises: Assertion `!(srcMmaLayout && dstMmaLayout && !srcMmaLayout.isAmpere()) && "mma -> mma layout conversion is only supported on Ampere"' failed.
#3549 closed Jun 26, 2025
Single-node multi-GPU grpo errors out after several steps
#3576 closed Jun 26, 2025
Loss goes to 0, Gibberish Outputs
#3582 closed Jun 26, 2025
How to add fields from the training data to the logs
#3591 closed Jun 26, 2025
Multi-node multi-GPU GRPO: assert self.cpu_group is not None
#3583 closed Jun 26, 2025
Setting NPROC_PER_NODE immediately raises: failed (exitcode: -11) local_rank: 1
#3611 closed Jun 26, 2025
GPU memory surges in the later stages of GRPO training
#3600 closed Jun 26, 2025
grpo with a fixed seed: results are still not reproducible
#3607 closed Jun 26, 2025
Bug when using grpo with vllm on gemma3
#3660 closed Jun 26, 2025
[bug] Failed to open local file in cache
#3667 closed Jun 26, 2025
[Bug]: RuntimeError: setup failed!
#3662 closed Jun 26, 2025
Error during eval when training llava-1.5 and qwen2-vl with GRPO using vllm inference
#3666 closed Jun 26, 2025
Is there a GRPO training script and environment setup that runs on 4*V100?
#3671 closed Jun 26, 2025
ValueError: RLHF do not support sequence parallel
#3673 closed Jun 26, 2025
Hanging after tqdm starts [COLOCATE MODE]
#3702 closed Jun 26, 2025
GRPO max_grad_norm seems don't work
#3713 closed Jun 26, 2025
It is recommended to use a dedicated device for vLLM
#3719 closed Jun 26, 2025
GRPO training on NPU: with vllm, the official script fails to start, while other scripts work
#3726 closed Jun 26, 2025
GRPO training: bug in data format parsing
#3728 closed Jun 26, 2025
TypeError: embedding(): argument 'indices' (position 2) must be Tensor, not NoneType
#3730 closed Jun 26, 2025
Support Ulysses in Swift
#3731 closed Jun 26, 2025
GRPO tutorial bug: world_size (8) is not equal to tensor_model_parallel_size (4) x pipeline_model_parallel_size (1)
#3739 closed Jun 26, 2025
Multimodal qwen2.5-vl-3B: error in grpo experiment
#3398 closed Jun 26, 2025
grpo fine-tuning of deepseek v2 hangs on reaching the eval stage, then training stops
#3528 closed Jun 26, 2025
How to configure a custom dataset path in grpo and convert the data format?
#3525 closed Jun 26, 2025
2workers_async_iterations2_vllm help
#3522 closed Jun 26, 2025
Bug in GRPO best practices document!
#3501 closed Jun 26, 2025
unhashable type: 'list'
#3490 closed Jun 26, 2025
🙏 Request for support for using multiple cards in the vLLM inference backend during GRPO training
#3477 closed Jun 26, 2025
Training Qwen2.5-vl-7B-Instruct with GRPO fails: multi-GPU training does not work, only 1 GPU is loaded and it OOMs
#3404 closed Jun 26, 2025
Feature suggestions for GRPO training
#3415 closed Jun 26, 2025
Abnormal loss and reward in GRPO training
#3372 closed Jun 26, 2025
grpo multi-node multi-GPU training timeout
#3343 closed Jun 26, 2025
CUDA Error when training LLAVA with GRPO
#3264 closed Jun 26, 2025
GRPO LLava training error: multi-GPU training fails, 1 GPU works
#3228 closed Jun 26, 2025
Bug in GRPO training on 4 A100 GPUs
#3223 closed Jun 26, 2025
How to do sft and grpo fine-tuning on deepseek r1
#3211 closed Jun 26, 2025
Problem loading my already-trained LLava model when using GRPO
#3195 closed Jun 26, 2025
Error training InternVL2d5 with GRPO deepspeed lmdeploy
#3151 closed Jun 26, 2025
Using Unsloth in conjunction with GRPO to train a model for OOM
#3183 closed Jun 26, 2025
How to set vllm_device to use multiple GPUs in grpo training
#3098 closed Jun 26, 2025
Does ms-swift support tensor(model)-parallel GRPO training?
#3068 closed Jun 26, 2025
ValueError: Image features and image tokens do not match: tokens: 5589, features 5805
#2460 closed Jun 26, 2025
grad_norm nan
#2280 closed Jun 26, 2025
Request for RLHF to support sequence parallelism (sequence_parallel)
#1958 closed Jun 26, 2025
Is there a bug in the old_per_token_logps computation in GRPO training?
#4727 closed Jun 26, 2025
rerank data loading error
#4728 closed Jun 26, 2025
Issue with Multi-GPU Training
#4718 closed Jun 26, 2025
Qwen3 Full Sft raises keyerror "messages" with predict_with_generate=true; training finishes normally when it is false
#4695 closed Jun 26, 2025

7 Issues opened by 7 people

[WARNING:swift] Please install the package: pip install "decord" -U
#4740 opened Jun 27, 2025
Memory OOM during Qwen2.5-omni GRPO training
#4739 opened Jun 27, 2025
Fine-tuning a DeepSeek model raises: AssertionError: noaux_tc not supported for training
#4737 opened Jun 26, 2025
Does the packing feature block attention score between different samples?
#4736 opened Jun 26, 2025
a question for rl
#4735 opened Jun 26, 2025
Please open Security Advisories for vulnerability reporting
#4733 opened Jun 26, 2025
In SFT training that learns all turns, the end token of intermediate turns is not learned, so the trained model cannot stop generating
#4732 opened Jun 26, 2025

10 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

[grpo]Tool rl: add reward func for ToolRL
#4694 commented on Jun 27, 2025 • 2 new comments
Precision discrepancy in swift inference
#4726 commented on Jun 26, 2025 • 0 new comments
Continuing sft from a lora checkpoint with the Swift codebase: trainable parameters are 0% after loading the model and checkpoint
#4723 commented on Jun 26, 2025 • 0 new comments
About freeze_vit in qwen2.5vl lora sft
#4722 commented on Jun 26, 2025 • 0 new comments
After training qwen2.5vl3b with lora (lora not merged) and serving with deploy, results with pt are inconsistent with vllm
#4725 commented on Jun 26, 2025 • 0 new comments
qwen3 embedding fine-tuning raises an error at the evaluation stage: 'NoneType' object has no attribute 'get'
#4720 commented on Jun 26, 2025 • 0 new comments
GKD code hangs while loading the model
#4724 commented on Jun 26, 2025 • 0 new comments
[rank4]: AssertionError: Expected multimodal embeddings to be a list/tuple of 2D tensors, or a single 3D tensor, but got <class 'NoneType'> instead.
#4721 commented on Jun 26, 2025 • 0 new comments
swift infer with temperature and top_p set still generates the same result every time
#4627 commented on Jun 26, 2025 • 0 new comments
Support deploying trained RM models with the sglang/vllm inference engines
#3610 commented on Jun 26, 2025 • 0 new comments
