Insights: modelscope/ms-swift
Overview
2 Releases published by 1 person
-
v3.5.2 Patch release v3.5.2
published
Jun 20, 2025 -
v3.5.3 Patch release v3.5.3
published
Jun 27, 2025
24 Pull requests merged by 7 people
-
[grpo]Tool rl: add reward func for ToolRL
#4694 merged
Jun 27, 2025 -
compat transformers==4.52 (vlm)
#4738 merged
Jun 26, 2025 -
[grpo] check liger & sp
#4734 merged
Jun 26, 2025 -
[grpo] fix max_step for dataloader when applying sequence parallel
#4731 merged
Jun 26, 2025 -
[quant] Support fp8
#4729 merged
Jun 26, 2025 -
support Kimi-VL-A3B-Thinking-2506 & Kimi-Dev-72B
#4719 merged
Jun 25, 2025 -
[doc] simplify environment variables & update best practices documentation
#4715 merged
Jun 25, 2025 -
[grpo] fix colocate seed
#4712 merged
Jun 25, 2025 -
[megatron] support rednote-hilab/dots.llm1.inst
#4707 merged
Jun 25, 2025 -
[megatron] support DeepseekV2ForCausalLM and DeepseekV3ForCausalLM
#4659 merged
Jun 25, 2025 -
fix links
#4690 merged
Jun 24, 2025 -
[feat] support fine-tuning of reranker models
#4671 merged
Jun 24, 2025 -
[grpo] fix grpo pt
#4683 merged
Jun 24, 2025 -
[rollout] fix dp args
#4678 merged
Jun 23, 2025 -
[doc] fix doc
#4675 merged
Jun 23, 2025 -
[doc] fix image link
#4674 merged
Jun 23, 2025 -
docs: correct typo "resonse" to "response"
#4672 merged
Jun 23, 2025 -
[channel loss]support packing & padding free
#4666 merged
Jun 23, 2025 -
[docs] update docs
#4665 merged
Jun 23, 2025 -
[dataset] fix grounding_dataset
#4664 merged
Jun 23, 2025 -
[grpo] refactor multi turn & support async engine & refactor grpo docs
#4380 merged
Jun 23, 2025 -
[template] optimize remove_unused_columns
#4661 merged
Jun 22, 2025 -
[gkd] support use_logits_to_keep/padding_free/packing & update gkd shell
#4658 merged
Jun 21, 2025 -
[docs] update gkd
#4657 merged
Jun 20, 2025
4 Pull requests opened by 3 people
-
solve the default 'template_backend' bug in llm.template.base.Template._encode
#4669 opened
Jun 23, 2025 -
Refactor Web-UI
#4687 opened
Jun 24, 2025 -
[megatron] support fp8
#4730 opened
Jun 26, 2025 -
support Tencent-Hunyuan/Hunyuan-A13B-Instruct
#4745 opened
Jun 27, 2025
123 Issues closed by 12 people
-
Megatron does not support GRPO training
#4744 closed
Jun 27, 2025 -
Qwen3-4B model no longer outputs think after full DPO fine-tuning
#4701 closed
Jun 27, 2025 -
How to customize the format reward in GRPO
#4667 closed
Jun 26, 2025 -
[grpo] loading BERT model in reward
#4580 closed
Jun 26, 2025 -
Loss and grad_norm stay at 0 during GRPO training
#4570 closed
Jun 26, 2025 -
When will GRPO support multi-node Megatron training
#4558 closed
Jun 26, 2025 -
Reward std is always 0 in GRPO training
#4512 closed
Jun 26, 2025 -
Multi-node training with --vllm_mode server hangs and cannot run
#4532 closed
Jun 26, 2025 -
GRPO Qwen3 32B training torch issue
#4491 closed
Jun 26, 2025 -
qwen3 reinforcement training: communication error after GRPO training finishes
#4170 closed
Jun 26, 2025 -
The expanded size of the tensor (8) must match the existing size (5) at non-singleton dimension 0.
#4056 closed
Jun 26, 2025 -
Error at the end of training: /data/chatglm/retrieval_agent_new/ms_swift_train/ms-swift/swift/cli/rlhf.py FAILED
#4302 closed
Jun 26, 2025 -
DAPO hangs at "UserWarning: None of the inputs have requires_grad=True. Gradients will be None" until timeout
#4050 closed
Jun 26, 2025 -
Training qwen2.5-7b-instruct with GRPO produces !!!!
#4060 closed
Jun 26, 2025 -
Training runs normally, but eval raises an assert error
#4081 closed
Jun 26, 2025 -
Batch size in GRPO.
#4341 closed
Jun 26, 2025 -
Reward function registration fails in GRPO training
#4351 closed
Jun 26, 2025 -
GRPO data passing fails
#4362 closed
Jun 26, 2025 -
Qwen-Omni full-parameter GRPO fine-tuning raises ValueError: `max_new_tokens` must be greater than 0, but is -16384
#4392 closed
Jun 26, 2025 -
Error in multimodal GRPO fine-tuning
#4470 closed
Jun 26, 2025 -
Will GRPO fine-tuning of Qwen2.5-VL-3B on two A6000 GPUs run out of memory?
#4477 closed
Jun 26, 2025 -
Any plans to support megatron for GRPO training?
#3760 closed
Jun 26, 2025 -
GRPO cannot run with LLava
#3928 closed
Jun 26, 2025 -
QWQ: GRPO training cannot run, error "RuntimeError: ACL stream synchronize failed, error code:107020"
#3932 closed
Jun 26, 2025 -
GRPO training raises an error partway through
#3771 closed
Jun 26, 2025 -
GRPO training hangs and keeps showing the following problem.
#3794 closed
Jun 26, 2025 -
GRPO training error
#3769 closed
Jun 26, 2025 -
Various traceback error during GRPO training
#3836 closed
Jun 26, 2025 -
Contributing a Dockerfile: tested with multimodal GRPO training, it can basically reproduce the results in the examples
#3812 closed
Jun 26, 2025 -
With GRPO, if reward_model is set instead of --reward_funcs, the reward model and the model are both loaded onto the same GPU
#3843 closed
Jun 26, 2025 -
Meet GPU OutOfMemory in GRPO training
#3848 closed
Jun 26, 2025 -
OOM when training a 32B model with GRPO
#3871 closed
Jun 26, 2025 -
Performance drops sharply after 100 steps of GRPO training; what is the cause?
#3876 closed
Jun 26, 2025 -
if sleep_level > 0, gradient_accumulation_steps will be forced to 1
#3943 closed
Jun 26, 2025 -
The GRPO training process hangs for multi-node training.
#3934 closed
Jun 26, 2025 -
Training speed issue in NPU environment
#3331 closed
Jun 26, 2025 -
Request for a script that can run GRPO on the Qwen2.5 72B model with 8 A100 GPUs
#3416 closed
Jun 26, 2025 -
GRPO training raises an error when using 2 nodes with --num_infer_workers 2
#3393 closed
Jun 26, 2025 -
GRPO training based on qwenvl-7b-instruct runs out of memory during eval
#3541 closed
Jun 26, 2025 -
Single-node multi-GPU GRPO raises an error after several steps
#3576 closed
Jun 26, 2025 -
Loss goes to 0, Gibberish Outputs
#3582 closed
Jun 26, 2025 -
How to add fields from the training data to the logs
#3591 closed
Jun 26, 2025 -
Multi-node multi-GPU GRPO: assert self.cpu_group is not None
#3583 closed
Jun 26, 2025 -
Setting NPROC_PER_NODE immediately raises: failed (exitcode: -11) local_rank: 1
#3611 closed
Jun 26, 2025 -
GRPO training: GPU memory surges in the later stages of training
#3600 closed
Jun 26, 2025 -
GRPO with a fixed seed: results are still not reproducible
#3607 closed
Jun 26, 2025 -
Bug when using GRPO with vLLM on gemma3
#3660 closed
Jun 26, 2025 -
[bug] Failed to open local file in cache
#3667 closed
Jun 26, 2025 -
[Bug]: RuntimeError: setup failed!
#3662 closed
Jun 26, 2025 -
Error during eval when training llava-1.5 and qwen2-vl with GRPO using vLLM inference
#3666 closed
Jun 26, 2025 -
Is there a GRPO training script and environment configuration that runs on 4*V100?
#3671 closed
Jun 26, 2025 -
ValueError: RLHF do not support sequence parallel
#3673 closed
Jun 26, 2025 -
Hanging after tqdm starts [COLOCATE MODE]
#3702 closed
Jun 26, 2025 -
GRPO max_grad_norm seems don't work
#3713 closed
Jun 26, 2025 -
It is recommended to use a dedicated device for vLLM
#3719 closed
Jun 26, 2025 -
GRPO training in NPU environment: with vLLM, the official script fails to start, while other scripts work
#3726 closed
Jun 26, 2025 -
GRPO training: bug in data format parsing
#3728 closed
Jun 26, 2025 -
TypeError: embedding(): argument 'indices' (position 2) must be Tensor, not NoneType
#3730 closed
Jun 26, 2025 -
Support Ulysses in Swift
#3731 closed
Jun 26, 2025 -
Multimodal qwen2.5-vl-3B: error in GRPO experiment
#3398 closed
Jun 26, 2025 -
GRPO fine-tuning of deepseek v2 hangs when training reaches the eval stage, then training stops
#3528 closed
Jun 26, 2025 -
How to configure a custom dataset path in GRPO and convert the data format?
#3525 closed
Jun 26, 2025 -
2workers_async_iterations2_vllm help
#3522 closed
Jun 26, 2025 -
Bug in GRPO best practices document!
#3501 closed
Jun 26, 2025 -
unhashable type: 'list'
#3490 closed
Jun 26, 2025 -
Training Qwen2.5-vl-7B-Instruct with GRPO fails: multi-GPU training does not work, it loads onto only 1 GPU and runs out of memory
#3404 closed
Jun 26, 2025 -
Feature suggestions for GRPO training
#3415 closed
Jun 26, 2025 -
Abnormal loss and reward in GRPO training
#3372 closed
Jun 26, 2025 -
Timeout in multi-node multi-GPU GRPO training
#3343 closed
Jun 26, 2025 -
CUDA Error when training LLAVA with GRPO
#3264 closed
Jun 26, 2025 -
GRPO LLava training error: multi-GPU training fails, single GPU works
#3228 closed
Jun 26, 2025 -
Bug in GRPO training on 4 A100 GPUs
#3223 closed
Jun 26, 2025 -
How to do SFT and GRPO fine-tuning on deepseek r1
#3211 closed
Jun 26, 2025 -
Loading problem when using my already-trained LLava model with GRPO
#3195 closed
Jun 26, 2025 -
Error training InternVL2d5 with GRPO, deepspeed, and lmdeploy
#3151 closed
Jun 26, 2025 -
Using Unsloth in conjunction with GRPO to train a model for OOM
#3183 closed
Jun 26, 2025 -
How to set vllm_device to use multiple GPUs in GRPO training
#3098 closed
Jun 26, 2025 -
Does ms-swift support tensor(model)-parallel GRPO training?
#3068 closed
Jun 26, 2025 -
ValueError: Image features and image tokens do not match: tokens: 5589, features 5805
#2460 closed
Jun 26, 2025 -
grad_norm nan
#2280 closed
Jun 26, 2025 -
Request for RLHF to support sequence parallelism (sequence_parallel)
#1958 closed
Jun 26, 2025 -
Is there a bug in the old_per_token_logps computation in GRPO training?
#4727 closed
Jun 26, 2025 -
rerank data loading error
#4728 closed
Jun 26, 2025 -
Issue with Multi-GPU Training
#4718 closed
Jun 26, 2025 -
Qwen3 full SFT with predict_with_generate=true raises KeyError "messages"; training finishes normally when it is false
#4695 closed
Jun 26, 2025 -
Support moonshotai/Kimi-VL-A3B-Thinking-2506
#4708 closed
Jun 25, 2025 -
Error training qwen2.5-vl with GRPO
#4364 closed
Jun 25, 2025 -
Full-parameter GRPO fine-tuning: with the same number of samples, ms-swift results are much worse than unsloth
#4393 closed
Jun 25, 2025 -
GRPO OOM USE resume_from_checkpoint
#4406 closed
Jun 25, 2025 -
Does the supported DeepSeek-R1 training refer to the 671B model or the distilled models?
#3132 closed
Jun 25, 2025 -
seq_cls training metrics are much lower with flash_attn enabled than without it
#4384 closed
Jun 25, 2025 -
Multi-regression task: inference problem
#4705 closed
Jun 25, 2025 -
Why does using zero2/zero3 cause max_steps to differ by a factor of eight?
#4616 closed
Jun 23, 2025 -
Request to add a notebook for self-cognition training of Qwen3-8B
#4034 closed
Jun 23, 2025 -
Slow dataset preprocessing for InternVL3-9B LoRA fine-tuning (about 7h)
#4076 closed
Jun 23, 2025 -
Locating an object's position with a single coordinate point
#4292 closed
Jun 23, 2025 -
data_load
#4288 closed
Jun 23, 2025 -
Question about Seq CLS Infer
#4325 closed
Jun 23, 2025 -
UI-TARS inference with frozen parameters cannot distribute GPU memory evenly, causing OOM
#4359 closed
Jun 23, 2025 -
Fine-tuning Qwen3 runs out of memory when zero2/3 is added to the default script
#4371 closed
Jun 23, 2025 -
Question about VLLM Engine batch inference
#4386 closed
Jun 23, 2025 -
How are commands such as swift infer converted into Python calls, and what is the internal mechanism?
#4555 closed
Jun 23, 2025 -
Failing to preprocess hf dataset
#4564 closed
Jun 23, 2025 -
How-to use on Apple Mac?
#4572 closed
Jun 23, 2025 -
Multimodal finetune llava1.6-mistral bug: RuntimeError: Tensors must have same number of dimensions
#4578 closed
Jun 23, 2025 -
On the differences between the templates in ms-swift 3.x and 2.x
#4602 closed
Jun 23, 2025 -
ovis2 fine-tuning fails: loss computation raises ValueError: Expected input batch_size (1384) to match target batch_size (16384)
#4611 closed
Jun 23, 2025 -
Question about installing with pip install -e '.[all]' and installing evalscope
#4605 closed
Jun 23, 2025 -
loss_scale hermes not work
#4607 closed
Jun 23, 2025 -
Huawei 910B LoRA qwen2.5vl error: AssertionError: Torch not compiled with CUDA enabled
#4619 closed
Jun 23, 2025 -
OOM when converting full-size R1/Qwen3-235B-30A HF weights to Megatron
#4648 closed
Jun 23, 2025 -
The "change an LLM's self-cognition in 10 minutes" tutorial raises 'Qwen2_5VLTemplate' object has no attribute 'model'
#4662 closed
Jun 23, 2025 -
DPO training hits "StopIteration ERROR during training" at step 100
#4644 closed
Jun 23, 2025 -
Multi-GPU multi-process ORPO hangs, triggering watchdog caught collective operation timeout.
#3564 closed
Jun 20, 2025
52 Issues opened by 43 people
-
Multi-GPU parallel training with a locally loaded dataset stops at Init COMPLETE... and cannot enter the train stage
#4743 opened
Jun 27, 2025 -
Numbering of multiple input images
#4742 opened
Jun 27, 2025 -
How to add early stopping in ms swift
#4741 opened
Jun 27, 2025 -
[WARNING:swift] Please install the package: pip install "decord" -U
#4740 opened
Jun 27, 2025 -
Qwen2.5-omni GRPO training runs out of memory (OOM)
#4739 opened
Jun 27, 2025 -
Error fine-tuning DeepSeek model: AssertionError: noaux_tc not supported for training
#4737 opened
Jun 26, 2025 -
Does the packing feature block attention score between different samples?
#4736 opened
Jun 26, 2025 -
a question for rl
#4735 opened
Jun 26, 2025 -
Please open Security Advisories for vulnerability reporting
#4733 opened
Jun 26, 2025 -
In SFT training that learns all turns, the end token of intermediate turns is not learned, so the trained model cannot stop
#4732 opened
Jun 26, 2025 -
Precision differences in swift inference
#4726 opened
Jun 26, 2025 -
After LoRA training of qwen2.5vl3b, with LoRA unmerged and served via deploy, pt results are inconsistent with vllm results
#4725 opened
Jun 26, 2025 -
GKD code hangs when loading the model
#4724 opened
Jun 26, 2025 -
Continuing SFT from a LoRA checkpoint with the Swift codebase: trainable parameters are 0% after loading the model and checkpoint
#4723 opened
Jun 26, 2025 -
qwen2.5vl LoRA SFT: about freeze_vit
#4722 opened
Jun 26, 2025 -
qwen3 embedding fine-tuning raises an error during evaluation: 'NoneType' object has no attribute 'get'
#4720 opened
Jun 25, 2025 -
Add Python example code
#4717 opened
Jun 25, 2025 -
How to pass in a custom causal_attention_mask
#4716 opened
Jun 25, 2025 -
How to add the local_repo_path parameter in a Python script
#4714 opened
Jun 25, 2025 -
Error converting HF-format model files to Megatron: CUDA error: operation not supported
#4713 opened
Jun 25, 2025 -
LoRA fine-tuning Ovis2-34B: loss=0.0
#4711 opened
Jun 25, 2025 -
Cannot set batch_size to 1 in Deepspeed zero3 multi-GPU training
#4710 opened
Jun 25, 2025 -
[Bug]: [WARNING:swift] Please install the package: pip install "decord" -U
#4709 opened
Jun 25, 2025 -
Multi-regression task: output problem
#4706 opened
Jun 25, 2025 -
Can sequence classification tasks be trained on multiple GPUs?
#4704 opened
Jun 25, 2025 -
Swift rollout hangs
#4703 opened
Jun 25, 2025 -
GRPO training on the NuminaMath-TIR dataset does not work
#4702 opened
Jun 25, 2025 -
Fine-tuning QwQ-32B on a custom function-call dataset with the msswift framework gives poor results for unknown reasons
#4700 opened
Jun 25, 2025 -
LoRA fine-tuning of the qwen3 embedding model shows a find_unused_parameters warning
#4698 opened
Jun 24, 2025 -
Error merging LoRA for Qwen2-VL
#4697 opened
Jun 24, 2025 -
Question: training cost of Qwen 235B A22 compared with Qwen 32B dense
#4696 opened
Jun 24, 2025 -
When running inference via the CLI, is there a way to keep extra fields from the input dataset in the inference results?
#4693 opened
Jun 24, 2025 -
Question about VLLM Engine
#4692 opened
Jun 24, 2025 -
When loading a large dataset on multiple nodes, the machines load it serially one after another
#4691 opened
Jun 24, 2025 -
UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
#4689 opened
Jun 24, 2025 -
I want to set up two reward models and two value models for PPO and compute the loss from a weighted combination of their values and rewards; how should I do this?
#4688 opened
Jun 24, 2025 -
Is continued pre-training supported for multimodal models such as QWenVL?
#4686 opened
Jun 24, 2025 -
Does changing IMAGE_FACTOR mean the vision part needs to be retrained?
#4685 opened
Jun 24, 2025 -
How to disable automatic model parallelism?
#4684 opened
Jun 24, 2025 -
Help: Multi Turn SFT
#4681 opened
Jun 24, 2025 -
MLLM training: the job hangs after one epoch with GPU utilization at 100% and cannot save the checkpoint
#4680 opened
Jun 23, 2025 -
GRPO training fails; the model seems to have difficulty learning
#4679 opened
Jun 23, 2025 -
The reward keeps oscillating without rising; the model seems to learn nothing
#4677 opened
Jun 23, 2025 -
After SFT training of a regression task, loading the model for vLLM-accelerated inference raises an error; is there a fix?
#4676 opened
Jun 23, 2025 -
Continuously prints "Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead." but raises no error; what is the cause?
#4673 opened
Jun 23, 2025 -
About preprocessing of RLHF data
#4670 opened
Jun 23, 2025 -
Fine-tuning MiniCPM-o-2_6 raises assert media_type in {'image', 'video'}
#4668 opened
Jun 23, 2025 -
Any way to run evaluation before training starts?
#4660 opened
Jun 22, 2025 -
'weight' must be 2-D
#4656 opened
Jun 20, 2025
23 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Pretrain error and abnormal progress
#2692 commented on
Jun 20, 2025 • 0 new comments -
How should I get the history after the update?
#4645 commented on
Jun 23, 2025 • 0 new comments -
Loss suddenly spikes during Qwen2.5-vl pre-training
#4634 commented on
Jun 23, 2025 • 0 new comments -
Fatal Python error: none_dealloc: deallocating None
#4353 commented on
Jun 23, 2025 • 0 new comments -
After LoRA fine-tuning and merging, lmdeploy inference takes twice as long as Qwen2.5-VL-7B-Instruct; why?
#4609 commented on
Jun 23, 2025 • 0 new comments -
Can MiniCPM-o 2.6 audio modality training be supported?
#2961 commented on
Jun 23, 2025 • 0 new comments -
Results of testing the qwen2.5-omni model with swift infer are inconsistent with the official test method
#4595 commented on
Jun 23, 2025 • 0 new comments -
qwen3-32B full-parameter PPO training raises an error after one step
#4599 commented on
Jun 23, 2025 • 0 new comments -
Training Omni hangs
#4651 commented on
Jun 23, 2025 • 0 new comments -
qwen2.5-7B GRPO training hangs without showing any error
#4603 commented on
Jun 23, 2025 • 0 new comments -
Why is there no loss?
#4652 commented on
Jun 24, 2025 • 0 new comments -
Can an expert parallelism parameter be added for MoE model training?
#1631 commented on
Jun 24, 2025 • 0 new comments -
Any example on training llama on function calling dataset?
#4604 commented on
Jun 24, 2025 • 0 new comments -
LoRA fine-tuning ovis2-34B: loss=0.0 grad_norm=nan
#3494 commented on
Jun 25, 2025 • 0 new comments -
Error occurred when saving checkpoints during Qwen3 multi-GPU SFT
#4411 commented on
Jun 25, 2025 • 0 new comments -
How to save the last-step checkpoints during GRPO
#4574 commented on
Jun 25, 2025 • 0 new comments -
🍭[Roadmap] ms-swift3.6
#4561 commented on
Jun 25, 2025 • 0 new comments -
Error when saving a checkpoint during training, even though the corresponding file exists locally.
#3420 commented on
Jun 25, 2025 • 0 new comments -
🚀 Best Practices for Training Qwen3/Qwen3-MoE
#4030 commented on
Jun 25, 2025 • 0 new comments -
Is GME fine-tuning supported?
#3019 commented on
Jun 25, 2025 • 0 new comments -
How to add system or instructions in embedding training?
#4638 commented on
Jun 26, 2025 • 0 new comments -
swift infer with temperature and top_p set still generates the same result every time
#4627 commented on
Jun 26, 2025 • 0 new comments -
Support deploying trained RM models with the sglang/vllm inference engines
#3610 commented on
Jun 26, 2025 • 0 new comments