Email notifications for jobs added to pipeline
faressc committed Dec 22, 2024
1 parent 0e0b984 commit 49499d6
Showing 2 changed files with 14 additions and 1 deletion.
6 changes: 6 additions & 0 deletions docs/SETUP.md
@@ -458,6 +458,12 @@ Repeat Steps 1-4 of the Section [Connect SSH Host for Tensorboard (Optional)](#c

## 8 - Test and Debug on the HPC Cluster

First set your email address for SLURM notifications in the [slurm_job.sh](../slurm_job.sh) script:

```sh
#SBATCH --mail-user=<your-email-address>
```

You can run the DVC experiment pipeline on the HPC Cluster by submitting a single SLURM job:

```sh
9 changes: 8 additions & 1 deletion slurm_job.sh
@@ -4,7 +4,11 @@
# This file is licensed under the Apache License, Version 2.0.
# See the LICENSE file in the root of this project for details.

# Job name and logs
#SBATCH -J tustu
#SBATCH --output=./logs/slurm/slurm-%j.out

# Resources needed
#SBATCH --ntasks=1
#SBATCH --nodes=1
#SBATCH --ntasks-per-core=1
@@ -13,7 +17,10 @@
#SBATCH --mem=100GB
#SBATCH --time=10:00:00
#SBATCH --partition=gpu
#SBATCH --output=./logs/slurm/slurm-%j.out

# Get email notifications for job status
#SBATCH --mail-type=ALL
#SBATCH --mail-user=<your-email-address>

# Default variable values
rebuild_container=false
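
The `--mail-user` directive added in this commit ships with a `<your-email-address>` placeholder that each user has to replace by hand. As a minimal sketch of how that step could be scripted, the helper below rewrites only that directive in a given file. The function name `set_mail_user`, the address `user@example.com`, and the throwaway demo file are hypothetical and not part of the commit; the sketch also assumes the address contains no `|` or `&` characters, which `sed` would treat specially.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): replace the value of the
# "#SBATCH --mail-user=" directive in a SLURM script with a real address.
set_mail_user() {
    email="$1"
    script="$2"
    tmp="$(mktemp)"
    # Rewrite only the --mail-user line; all other #SBATCH lines are untouched.
    # A temp file is used instead of `sed -i` for portability across sed variants,
    # and `cat >` keeps the original file's permissions.
    sed "s|^#SBATCH --mail-user=.*|#SBATCH --mail-user=${email}|" "$script" > "$tmp" \
        && cat "$tmp" > "$script"
    rm -f "$tmp"
}

# Demo on a throwaway file containing the two directives from this commit.
demo="$(mktemp)"
printf '%s\n' \
    '#SBATCH --mail-type=ALL' \
    '#SBATCH --mail-user=<your-email-address>' > "$demo"
set_mail_user "user@example.com" "$demo"
cat "$demo"
```

In a checkout of the repository, the same function would be called as `set_mail_user "<address>" slurm_job.sh`. Alternatively, `sbatch` accepts `--mail-user` on the command line, which avoids editing the script at all.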