Insights: modelscope/ms-swift
Overview
4 Pull requests merged by 3 people
-
compat transformers==4.52 (vlm)
#4738 merged
Jun 26, 2025 -
[grpo] check liger & sp
#4734 merged
Jun 26, 2025 -
[grpo] fix max_step for dataloader when applying sequence parallel
#4731 merged
Jun 26, 2025 -
[quant] Support fp8
#4729 merged
Jun 26, 2025
1 Pull request opened by 1 person
-
[megatron] support fp8
#4730 opened
Jun 26, 2025
91 Issues closed by 6 people
-
After full-parameter DPO fine-tuning, the Qwen3-4B model no longer outputs think
#4701 closed
Jun 27, 2025 -
How to customize the format reward in GRPO
#4667 closed
Jun 26, 2025 -
[grpo] loading BERT model in reward
#4580 closed
Jun 26, 2025 -
Loss and grad_norm stay at 0 during GRPO training
#4570 closed
Jun 26, 2025 -
When will GRPO support multi-node megatron training
#4558 closed
Jun 26, 2025 -
Reward std is always 0 during GRPO training
#4512 closed
Jun 26, 2025 -
Multi-node training with --vllm_mode server hangs and cannot run
#4532 closed
Jun 26, 2025 -
GRPO Qwen3 32B training torch issue
#4491 closed
Jun 26, 2025 -
qwen3 reinforcement training: communication error after grpo training finishes
#4170 closed
Jun 26, 2025 -
The expanded size of the tensor (8) must match the existing size (5) at non-singleton dimension 0.
#4056 closed
Jun 26, 2025 -
Error at end of training: /data/chatglm/retrieval_agent_new/ms_swift_train/ms-swift/swift/cli/rlhf.py FAILED
#4302 closed
Jun 26, 2025 -
dapo hangs at "UserWarning: None of the inputs have requires_grad=True. Gradients will be None" until timeout
#4050 closed
Jun 26, 2025 -
Training qwen2.5-7b-instruct with grpo produces !!!!
#4060 closed
Jun 26, 2025 -
Training works, but eval raises an assert error
#4081 closed
Jun 26, 2025 -
Batch size in GRPO.
#4341 closed
Jun 26, 2025 -
Reward function registration fails in grpo training
#4351 closed
Jun 26, 2025 -
GRPO data passing fails
#4362 closed
Jun 26, 2025 -
Qwen-Omni full-parameter grpo fine-tuning raises ValueError: `max_new_tokens` must be greater than 0, but is -16384
#4392 closed
Jun 26, 2025 -
Error in GRPO multimodal fine-tuning
#4470 closed
Jun 26, 2025 -
Will GRPO fine-tuning of Qwen2.5-VL-3B on two A6000 GPUs OOM?
#4477 closed
Jun 26, 2025 -
Any plans to support megatron for GRPO training?
#3760 closed
Jun 26, 2025 -
Cannot get GRPO running with LLava
#3928 closed
Jun 26, 2025 -
QWQ: GRPO training fails with "RuntimeError: ACL stream synchronize failed, error code:107020"
#3932 closed
Jun 26, 2025 -
GRPO training errors out partway through
#3771 closed
Jun 26, 2025 -
grpo training hangs and keeps showing the following problem.
#3794 closed
Jun 26, 2025 -
Error in GRPO training
#3769 closed
Jun 26, 2025 -
Various traceback error during GRPO training
#3836 closed
Jun 26, 2025 -
Contributing a dockerfile: tested with multimodal grpo training, it can basically reproduce the results in the examples
#3812 closed
Jun 26, 2025 -
In GRPO, setting reward_model instead of --reward_funcs loads both the reward model and the model onto the same GPU
#3843 closed
Jun 26, 2025 -
Meet GPU OutOfMemory in GRPO training
#3848 closed
Jun 26, 2025 -
OOM when training a 32b model with grpo
#3871 closed
Jun 26, 2025 -
Performance drops sharply after 100 steps of GRPO training; what is the cause?
#3876 closed
Jun 26, 2025 -
if sleep_level > 0, gradient_accumulation_steps will be forced to 1
#3943 closed
Jun 26, 2025 -
The GRPO training process hangs for multi-node training.
#3934 closed
Jun 26, 2025 -
Training speed issue in NPU environment
#3331 closed
Jun 26, 2025 -
Requesting a script that can run GRPO on the Qwen2.5 72B model with 8 A100 GPUs
#3416 closed
Jun 26, 2025 -
GRPO training errors when using 2 nodes with --num_infer_workers 2
#3393 closed
Jun 26, 2025 -
grpo training based on qwenvl-7b-instruct OOMs during eval
#3541 closed
Jun 26, 2025 -
Single-node multi-GPU grpo errors after several steps
#3576 closed
Jun 26, 2025 -
Loss goes to 0, Gibberish Outputs
#3582 closed
Jun 26, 2025 -
How to add fields from the training data to the logs
#3591 closed
Jun 26, 2025 -
Multi-node multi-GPU GRPO: assert self.cpu_group is not None
#3583 closed
Jun 26, 2025 -
Setting NPROC_PER_NODE immediately errors: failed (exitcode: -11) local_rank: 1
#3611 closed
Jun 26, 2025 -
GPU memory spikes in the later stages of GRPO training
#3600 closed
Jun 26, 2025 -
grpo results are still not reproducible with a fixed seed
#3607 closed
Jun 26, 2025 -
Bug when using grpo with vllm on gemma3
#3660 closed
Jun 26, 2025 -
[bug] Failed to open local file in cache
#3667 closed
Jun 26, 2025 -
[Bug]: RuntimeError: setup failed!
#3662 closed
Jun 26, 2025 -
Error during eval when training llava-1.5 and qwen2-vl with GRPO using vllm inference
#3666 closed
Jun 26, 2025 -
Is there a GRPO training script and environment configuration that runs on 4*V100?
#3671 closed
Jun 26, 2025 -
ValueError: RLHF do not support sequence parallel
#3673 closed
Jun 26, 2025 -
Hanging after tqdm starts [COLOCATE MODE]
#3702 closed
Jun 26, 2025 -
GRPO max_grad_norm seems don't work
#3713 closed
Jun 26, 2025 -
It is recommended to use a dedicated device for vLLM
#3719 closed
Jun 26, 2025 -
GRPO training in npu environment: with vllm, the official script fails to start while other scripts work
#3726 closed
Jun 26, 2025 -
Bug in data format parsing during GRPO training
#3728 closed
Jun 26, 2025 -
TypeError: embedding(): argument 'indices' (position 2) must be Tensor, not NoneType
#3730 closed
Jun 26, 2025 -
Support Ulysses in Swift
#3731 closed
Jun 26, 2025 -
Error in grpo experiment with multimodal qwen2.5-vl-3B
#3398 closed
Jun 26, 2025 -
grpo fine-tuning of deepseek v2 hangs at the eval stage, and then training stops
#3528 closed
Jun 26, 2025 -
How to configure a custom dataset path in grpo and convert the data format?
#3525 closed
Jun 26, 2025 -
2workers_async_iterations2_vllm help
#3522 closed
Jun 26, 2025 -
Bug in GRPO best practices document!
#3501 closed
Jun 26, 2025 -
unhashable type: 'list'
#3490 closed
Jun 26, 2025 -
Training Qwen2.5-vl-7B-Instruct with GRPO fails: multi-GPU training does not work, only 1 GPU is loaded and it OOMs
#3404 closed
Jun 26, 2025 -
Feature suggestions for GRPO training
#3415 closed
Jun 26, 2025 -
Abnormal loss and reward in GRPO training
#3372 closed
Jun 26, 2025 -
Timeout in grpo multi-node multi-GPU training
#3343 closed
Jun 26, 2025 -
CUDA Error when training LLAVA with GRPO
#3264 closed
Jun 26, 2025 -
GRPO LLava training error: multi-GPU training fails, single GPU works
#3228 closed
Jun 26, 2025 -
BUG in GRPO training on 4 A100 GPUs
#3223 closed
Jun 26, 2025 -
How to do sft and grpo fine-tuning on deepseek r1
#3211 closed
Jun 26, 2025 -
Problem loading my already-trained LLava model with GRPO
#3195 closed
Jun 26, 2025 -
Error training InternVL2d5 with GRPO, deepspeed, and lmdeploy
#3151 closed
Jun 26, 2025 -
Using Unsloth in conjunction with GRPO to train a model for OOM
#3183 closed
Jun 26, 2025 -
How to set vllm_device to use multiple GPUs in grpo training
#3098 closed
Jun 26, 2025 -
Does ms-swift support tensor(model)-parallel GRPO training?
#3068 closed
Jun 26, 2025 -
ValueError: Image features and image tokens do not match: tokens: 5589, features 5805
#2460 closed
Jun 26, 2025 -
grad_norm nan
#2280 closed
Jun 26, 2025 -
Request: RLHF support for sequence parallelism (sequence_parallel)
#1958 closed
Jun 26, 2025 -
Is there a bug in the old_per_token_logps computation in GRPO training?
#4727 closed
Jun 26, 2025 -
rerank data loading error
#4728 closed
Jun 26, 2025 -
Issue with Multi-GPU Training
#4718 closed
Jun 26, 2025 -
Qwen3 Full Sft: predict_with_generate=true raises keyerror "messages"; with false, training finishes normally
#4695 closed
Jun 26, 2025
7 Issues opened by 7 people
-
[WARNING:swift] Please install the package: pip install "decord" -U
#4740 opened
Jun 27, 2025 -
Memory OOM in Qwen2.5-omni GRPO training
#4739 opened
Jun 27, 2025 -
Fine-tuning a DeepSeek model raises: AssertionError: noaux_tc not supported for training
#4737 opened
Jun 26, 2025 -
Does the packing feature block attention score between different samples?
#4736 opened
Jun 26, 2025 -
a question for rl
#4735 opened
Jun 26, 2025 -
Please open Security Advisories for vulnerability reporting
#4733 opened
Jun 26, 2025 -
In SFT training that learns all turns, the end token of intermediate turns cannot be learned, so the trained model cannot stop generating
#4732 opened
Jun 26, 2025
10 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
[grpo]Tool rl: add reward func for ToolRL
#4694 commented on
Jun 27, 2025 • 2 new comments -
Precision differences in swift inference
#4726 commented on
Jun 26, 2025 • 0 new comments -
Continuing sft from a lora checkpoint with the Swift codebase: trainable parameters are 0% after loading the model and checkpoint
#4723 commented on
Jun 26, 2025 • 0 new comments -
Question about freeze_vit in qwen2.5vl lora sft
#4722 commented on
Jun 26, 2025 • 0 new comments -
After training qwen2.5vl3b with lora (lora not merged) and serving it with deploy, pt results are inconsistent with vllm
#4725 commented on
Jun 26, 2025 • 0 new comments -
qwen3 embedding fine-tuning errors during evaluation: 'NoneType' object has no attribute 'get'
#4720 commented on
Jun 26, 2025 • 0 new comments -
GKD code hangs when loading the model
#4724 commented on
Jun 26, 2025 • 0 new comments -
[rank4]: AssertionError: Expected multimodal embeddings to be a list/tuple of 2D tensors, or a single 3D tensor, but got <class 'NoneType'> instead.
#4721 commented on
Jun 26, 2025 • 0 new comments -
swift infer: temperature and top_p are set, but every generation gives the same result
#4627 commented on
Jun 26, 2025 • 0 new comments -
Support deploying the trained RM model with the inference engines sglang/vllm
#3610 commented on
Jun 26, 2025 • 0 new comments