model-parallelism
Here are 19 public repositories matching this topic...
Dear Colossal-AI team,
I have a few features in mind that I think would help the project, and I would like to know which of them would be most useful so that I can start implementing it.
Loki-Promtail is a tool for monitoring distributed logs with Grafana. Connecting the Distributed Logger to it and extracting labels from the log structure would be a user-friendly sys


Hi,
I have tried both loss.backward() and model_engine.backward(loss) in my code and noticed several subtle differences. For one, retain_graph=True does not work with model_engine.backward(loss). This causes a problem because, for some reason, the buffers are not retained each time I run the code.
Please look into this if you could.
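
For reference, here is a minimal sketch of the behavior I expect from retain_graph=True, using plain PyTorch only. The toy Linear model and random input are placeholders I made up for illustration, not my actual code. The model_engine.backward(loss) call is deliberately left out, since whether it accepts retain_graph may depend on the installed version.

```python
import torch

# Hypothetical toy model and data, standing in for the real network.
model = torch.nn.Linear(4, 1)
x = torch.randn(8, 4)

loss = model(x).pow(2).mean()

# retain_graph=True keeps the saved intermediate buffers after this pass.
loss.backward(retain_graph=True)
first_grad = model.weight.grad.clone()

# A second pass over the same graph works because the buffers were retained.
# Gradients accumulate in .grad, so the stored value doubles.
loss.backward()
doubled = torch.allclose(model.weight.grad, 2 * first_grad)

# Without retain_graph, the buffers are freed after the first pass,
# so a second backward() on the same graph raises RuntimeError.
loss2 = model(x).pow(2).mean()
loss2.backward()
try:
    loss2.backward()
    second_pass_failed = False
except RuntimeError:
    second_pass_failed = True

print(doubled, second_pass_failed)
```

In my case I need the first behavior (buffers kept across backward calls) when going through the engine.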