AliyunContainerService/mpi-operator

MPI Operator

The MPI Operator makes it easy to run allreduce-style distributed training.

Deploy

kubectl create -f deploy/

Test

Launch a multi-node TensorFlow benchmark training job:

kubectl create -f examples/tensorflow-benchmarks.yaml

Once all pods are running, the training logs are available from the launcher pod.
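As a rough sketch of how to read those logs, the helper below looks up the launcher pod and follows its output. It assumes the launcher pod's name contains the string "launcher" and that the job runs in your current namespace; neither is stated above, so check the output of `kubectl get pods` first. The function names `pick_launcher` and `follow_launcher_logs` are hypothetical and are not part of this repository.

```shell
# Print the first pod name read from stdin that contains "launcher".
# Assumption: the launcher pod has "launcher" in its name.
pick_launcher() {
  grep -m 1 'launcher'
}

# Follow the logs of the launcher pod in the current namespace.
follow_launcher_logs() {
  # `kubectl get pods -o name` prints one "pod/<name>" entry per line.
  pod="$(kubectl get pods -o name | pick_launcher || true)"
  if [ -z "$pod" ]; then
    echo "no launcher pod found" >&2
    return 1
  fi
  kubectl logs -f "$pod"
}
```

After sourcing the snippet, running `follow_launcher_logs` streams the logs until interrupted. If your pods are named differently, pass the pod name to `kubectl logs -f` directly.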
