Making large AI models cheaper, faster and more accessible
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
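For orientation, a minimal training-step sketch using DeepSpeed's `deepspeed.initialize` entry point with a ZeRO stage-2 config; the model, data, and config values below are illustrative placeholders rather than anything taken from this listing.

```python
# Minimal DeepSpeed training step (illustrative sketch; assumes `pip install deepspeed`
# and launching the script with the `deepspeed` launcher on a CUDA machine).
import torch
import torch.nn as nn
import deepspeed

model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10))

ds_config = {                                   # toy config, values are arbitrary
    "train_batch_size": 32,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},          # shard optimizer state + gradients
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# The returned engine wraps the model and handles sharding, mixed precision,
# and distributed communication.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

x = torch.randn(32, 1024, device=engine.device, dtype=torch.half)
y = torch.randint(0, 10, (32,), device=engine.device)

loss = nn.functional.cross_entropy(engine(x), y)
engine.backward(loss)   # engine owns loss scaling and gradient reduction
engine.step()           # engine owns the optimizer step
```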
A GPipe implementation in PyTorch
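As a plain-PyTorch sketch of the GPipe idea (not any particular library's API): split a `Sequential` into stages on different devices and push micro-batches through them. Real GPipe implementations additionally overlap micro-batches across stages and re-materialize activations; the stage sizes and devices below are arbitrary.

```python
# Toy illustration of GPipe-style pipeline parallelism: stage split + micro-batching.
# Falls back to CPU for both stages if fewer than two GPUs are available.
import torch
import torch.nn as nn

if torch.cuda.device_count() >= 2:
    dev0, dev1 = torch.device("cuda:0"), torch.device("cuda:1")
else:
    dev0 = dev1 = torch.device("cpu")

stage0 = nn.Sequential(nn.Linear(512, 512), nn.ReLU()).to(dev0)
stage1 = nn.Sequential(nn.Linear(512, 10)).to(dev1)

def pipeline_forward(x, chunks=4):
    # Split the mini-batch into micro-batches; a real implementation runs the
    # stages on different micro-batches concurrently instead of sequentially.
    outputs = []
    for micro in x.chunk(chunks):
        h = stage0(micro.to(dev0))
        outputs.append(stage1(h.to(dev1)))
    return torch.cat(outputs)

out = pipeline_forward(torch.randn(32, 512))
print(out.shape)  # torch.Size([32, 10])
```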
Paddle distributed training examples (飞桨分布式训练示例): ResNet, BERT, GPT, MoE, DataParallel, ModelParallel, PipelineParallel, HybridParallel, AutoParallel, ZeRO sharding, Recompute, GradientMerge, Offload, AMP, DGC, LocalSGD, Wide&Deep.
LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training
Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.
A curated list of awesome projects and papers for distributed training or inference
WIP. Veloce is a low-code, Ray-based parallelization library for efficient machine learning computation on heterogeneous hardware.
PyTorch implementation of a 3D U-Net with model parallelism across two GPUs for large models.
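A minimal sketch of that kind of two-GPU split, with a toy encoder/decoder standing in for the actual 3D U-Net: each half lives on its own device, and the activation crosses the boundary via `.to()` inside `forward`.

```python
# Toy two-GPU model-parallel split: encoder on one device, decoder on the other,
# with the activation moved across the device boundary inside forward().
import torch
import torch.nn as nn

class TwoDeviceNet(nn.Module):
    def __init__(self, dev0, dev1):
        super().__init__()
        self.dev0, self.dev1 = dev0, dev1
        self.encoder = nn.Sequential(nn.Conv3d(1, 8, 3, padding=1), nn.ReLU()).to(dev0)
        self.decoder = nn.Sequential(nn.Conv3d(8, 1, 3, padding=1)).to(dev1)

    def forward(self, x):
        h = self.encoder(x.to(self.dev0))
        return self.decoder(h.to(self.dev1))   # cross-device copy of the activation

if torch.cuda.device_count() >= 2:
    net = TwoDeviceNet(torch.device("cuda:0"), torch.device("cuda:1"))
else:
    net = TwoDeviceNet(torch.device("cpu"), torch.device("cpu"))

out = net(torch.randn(1, 1, 16, 16, 16))
print(out.shape)  # torch.Size([1, 1, 16, 16, 16])
```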
Adaptive Tensor Parallelism for Foundation Models
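For background on what tensor parallelism refers to (this is not the project's API), a small NumPy sketch of a column-parallel linear layer: the weight matrix is split column-wise across two shards, each shard computes a slice of the output, and the slices are concatenated.

```python
# Column-parallel linear layer, the basic building block of tensor parallelism:
# each shard holds a slice of the weight columns and produces a slice of the output.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))          # batch of activations
W = rng.standard_normal((8, 6))          # full weight matrix

W0, W1 = np.split(W, 2, axis=1)          # column shards, one per "device"
y0, y1 = x @ W0, x @ W1                  # each device computes its output slice
y = np.concatenate([y0, y1], axis=1)     # gather the output shards

assert np.allclose(y, x @ W)             # matches the unsharded computation
```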
Development of Project HPGO | Hybrid Parallelism Global Orchestration
Description of Framework for Efficient Fused-layer Cost Estimation, Legion (2021)
Model parallelism for NN architectures with skip connections (e.g., ResNets, U-Nets).
A decentralized and distributed framework for training DNNs
Torch Automatic Distributed Neural Network (TorchAD-NN) training library. Built on top of TorchMPI, this module automatically parallelizes neural network training.
Distributed TensorFlow (model parallelism) example repository.
This project focuses on parallelizing pre-processing, measurement, and machine learning in the cloud, as well as evaluating and analyzing cloud performance.
A simple graph partitioning algorithm written in Go, designed for partitioning neural networks across multiple devices where crossing a device boundary incurs an added cost.
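To illustrate the idea (the repository itself is written in Go), here is a small Python sketch of a greedy partitioner that balances per-device load and charges a penalty for edges that cross device boundaries; the graph, weights, and penalty are made up.

```python
# Greedy graph partitioning sketch: assign each node to the device that currently
# minimizes (load after placement + penalty for edges crossing device boundaries).
nodes = {"conv1": 4, "conv2": 4, "fc1": 2, "fc2": 1}        # node -> compute weight
edges = [("conv1", "conv2"), ("conv2", "fc1"), ("fc1", "fc2")]
num_devices, cross_cost = 2, 3.0

assignment, load = {}, [0.0] * num_devices

def cost(node, dev):
    # Penalize edges whose other endpoint is already assigned to a different device.
    crossing = sum(
        1 for a, b in edges
        if node in (a, b) and assignment.get(b if a == node else a, dev) != dev
    )
    return load[dev] + nodes[node] + cross_cost * crossing

for node in nodes:                         # visit nodes in insertion order
    best = min(range(num_devices), key=lambda d: cost(node, d))
    assignment[node] = best
    load[best] += nodes[node]

print(assignment)  # {'conv1': 0, 'conv2': 1, 'fc1': 1, 'fc2': 1} with this toy cost model
```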
An MPI-based distributed model-parallelism technique for MLPs.
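A hedged sketch of that pattern using mpi4py and NumPy (illustrative only, not the repository's code): each MPI rank owns one layer of a small MLP and forwards its activations to the next rank.

```python
# Layer-wise model parallelism over MPI: rank i owns layer i of a tiny MLP and
# sends its activations to rank i+1. Run with e.g. `mpiexec -n 2 python mlp_mpi.py`
# (the filename is hypothetical).
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

rng = np.random.default_rng(rank)
dims = [16] * (size + 1)                                  # layer widths
W = rng.standard_normal((dims[rank], dims[rank + 1]))     # this rank's layer weights

if rank == 0:
    x = rng.standard_normal((4, dims[0]))        # the input batch lives on rank 0
else:
    x = comm.recv(source=rank - 1, tag=0)        # activations from the previous rank

h = np.maximum(x @ W, 0.0)                       # this rank's layer: linear + ReLU

if rank < size - 1:
    comm.send(h, dest=rank + 1, tag=0)           # pass activations downstream
else:
    print("output shape on last rank:", h.shape)
```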
Contains materials from an internship at ALCF during the summer of 2019.