
model-parallelism

Here are 19 public repositories matching this topic...

rsn870 commented Aug 21, 2020

Hi,

I have tried out both loss.backward() and model_engine.backward(loss) in my code. There are several subtle differences I have observed; for one, retain_graph=True does not work with model_engine.backward(loss). This is creating a problem, since for some reason buffers are not being retained each time I run the code.

Please look into this if you could.

enhancement good first issue
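For context, here is a minimal sketch of the two call patterns the comment compares. The model_engine.backward(loss) form matches DeepSpeed's engine API; the model, input, and initialization shown here are illustrative assumptions, not code from the issue.

```python
# Minimal sketch (assumed setup): plain PyTorch backward vs. an engine-style
# backward. In plain PyTorch, retain_graph=True keeps the autograd graph
# alive so backward() can be called over it a second time.
import torch
import torch.nn as nn

model = nn.Linear(8, 1)
x = torch.randn(4, 8)
loss = model(x).sum()

# Plain PyTorch: the graph survives the first backward pass.
loss.backward(retain_graph=True)
loss.backward()  # works because the graph was retained

# DeepSpeed-style engine (hypothetical config, shown as comments so the
# sketch runs without DeepSpeed installed):
#   model_engine, optimizer, _, _ = deepspeed.initialize(
#       model=model, model_parameters=model.parameters(), config=ds_config)
#   model_engine.backward(loss)  # per the report, no retain_graph pass-through,
#                                # so a second backward over the same graph fails
```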
ColossalAI
SMesForoush commented Mar 12, 2022

Dear Colossal-AI team,
There are a few features I have in mind that I think would be helpful to the project, and I wanted to ask which of them might be the most useful, so I could start implementing it.
Loki with Promtail is a tool for monitoring distributed logs in Grafana. Connecting the Distributed Logger to it and extracting labels from the log structure would be a user-friendly sys…

good first issue
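As a rough illustration of the idea (not ColossalAI code), a distributed logger can emit one JSON object per line; Promtail's JSON pipeline stage can then lift fields such as the rank into Loki labels for filtering in Grafana. The RANK environment variable and logger name below are assumptions.

```python
# Hedged sketch: structured JSON logging that a Promtail/Loki pipeline
# could parse. Each log record becomes a single JSON line with a "rank"
# field taken from an assumed RANK environment variable.
import json
import logging
import os

class JsonLineFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "ts": self.formatTime(record),
            "level": record.levelname,
            "rank": int(os.environ.get("RANK", 0)),  # assumed env var
            "msg": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonLineFormatter())
logger = logging.getLogger("distributed")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("epoch 3 finished")  # -> {"ts": "...", "level": "INFO", "rank": 0, ...}
```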

Paddle distributed training examples (飞桨 = PaddlePaddle): ResNet, BERT, GPT, MoE, DataParallel, ModelParallel, PipelineParallel, HybridParallel, AutoParallel, ZeRO sharding, Recompute, GradientMerge, Offload, AMP, DGC, LocalSGD, Wide&Deep

  • Updated Mar 14, 2022
  • Shell
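As a minimal illustration of the topic itself (not code from the Paddle repo), model parallelism places different layers of one network on different devices and moves activations between them during the forward pass. The layer sizes and device names below are assumptions, and the sketch needs two visible GPUs.

```python
# Hedged sketch of layer-wise model parallelism in PyTorch: stage1 lives on
# GPU 0, stage2 on GPU 1, and activations are transferred between devices.
import torch
import torch.nn as nn

class TwoStageModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Linear(1024, 4096).to("cuda:0")  # first half on GPU 0
        self.stage2 = nn.Linear(4096, 10).to("cuda:1")    # second half on GPU 1

    def forward(self, x):
        x = torch.relu(self.stage1(x.to("cuda:0")))
        return self.stage2(x.to("cuda:1"))  # move activations to GPU 1

model = TwoStageModel()
out = model(torch.randn(32, 1024))  # output tensor lives on cuda:1
```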

Improve this page

Add a description, image, and links to the model-parallelism topic page so that developers can more easily learn about it.


Add this topic to your repo

To associate your repository with the model-parallelism topic, visit your repo's landing page and select "manage topics."
