Fixed permission issues when running the exp_workflow in a docker container
Showing 4 changed files with 68 additions and 10 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -28,6 +28,6 @@ NOTES.md | |
# Ignore singularity image | ||
*.sif | ||
|
||
|
||
|
||
# Ignore local environment vars | ||
local.env | ||
/exp_logs |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -292,17 +292,55 @@ This shell script runs the experiment pipeline (`dvc exp run`) and performs some | |
|
||
### Run the DVC Experiment Pipeline in a Docker Container | ||
|
||
To run the Docker container with repository, SSH, and Git config bindings, use the following command with the appropriate image name substituted for the placeholder `<your_image_name>`: | ||
To run the DVC experiment pipeline in a Docker container, we first need to set up a Docker volume containing our local SSH setup. Since the local SSH setup is not visible to the Docker container, we then mount the volume into the container. A simple bind mount will not always work, because the ownership of the .ssh folder is not changed. | ||
|
||
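
The ownership mismatch described above can be checked directly. The following is a minimal sketch, assuming a hypothetical helper `list_foreign_owned` (not part of the repository) that lists every file below a directory that is not owned by the user running the check. With a bind mount of `$HOME/.ssh`, running it as root inside the container would typically list the mounted files, which OpenSSH may then refuse to use.

```sh
#!/bin/sh
# Hypothetical helper, not part of the repository.
# Prints every file below the given directory that is NOT owned by the
# user running the check. Empty output means the ownership matches.
list_foreign_owned() {
    dir="${1:-$HOME/.ssh}"
    if [ ! -d "$dir" ]; then
        echo "[ERROR] $dir is not a directory" >&2
        return 1
    fi
    # find accepts a numeric user ID for -user, so no passwd entry is needed
    find "$dir" ! -user "$(id -u)" -print
}
```

Inside a container, `list_foreign_owned /root/.ssh` should print nothing once the files are owned by root.
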
To create the Docker volume, use the following command: | ||
|
||
```sh | ||
docker volume create --name ssh-config | ||
``` | ||
|
||
To copy the local SSH setup to the Docker volume, we need to create a temporary container that mounts the volume. | ||
|
||
```sh | ||
docker run -it --rm -v ssh-config:/root/.ssh -v $HOME/.ssh:/local-ssh alpine:latest | ||
# Inside the container | ||
cp -r /local-ssh/* /root/.ssh/ | ||
# Copying the files will change the ownership to root | ||
# Check the copied files | ||
ls -la /root/.ssh/ | ||
``` | ||
|
||
> **Info**: This will not change the ownership of the files on your local machine. | ||
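
The interactive copy step can also be done in a single non-interactive command. The sketch below is an assumption, not part of the repository: `populate_ssh_volume` and the `DRY_RUN` switch are hypothetical names, and the local SSH folder is mounted read-only (`:ro`) as an extra precaution. The volume name defaults to `ssh-config`, as created above.

```sh
#!/bin/sh
# Hypothetical helper, not part of the repository.
# Populates a Docker volume with the local SSH setup in one step.
# With DRY_RUN=1 the command is only printed, not executed.
populate_ssh_volume() {
    volume="${1:-ssh-config}"
    # Build the command in the function's positional parameters so that
    # arguments containing spaces stay intact when it is executed.
    set -- docker run --rm \
        -v "$volume:/root/.ssh" \
        -v "$HOME/.ssh:/local-ssh:ro" \
        alpine:latest \
        sh -c 'cp -r /local-ssh/* /root/.ssh/ && ls -la /root/.ssh/'
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Printed for inspection only; the original quoting is not shown
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

Calling `populate_ssh_volume` with no arguments runs the copy against the `ssh-config` volume. Setting `DRY_RUN=1` beforehand prints the `docker run` command without starting a container.
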
Next, since DVC needs the Git username and email to be set, we create a `local.env` file in the repository root directory with the following content: | ||
|
||
```env | ||
TUSTU_GIT_USERNAME="Your Name" | ||
TUSTU_GIT_EMAIL="[email protected]" | ||
``` | ||
|
||
> **Info**: This file is git-ignored and is read by the [exp_workflow.sh](./../exp_workflow.sh) script. It will then configure git with the provided username and email every time the script is run. Your local git configuration will not be changed, as this happens only if the [exp_workflow.sh](./../exp_workflow.sh) script is run from within a Docker container. | ||
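
Before starting the container, it can be useful to confirm that `local.env` parses and defines both variables. The following is a minimal sketch, assuming a hypothetical helper `check_local_env` (not part of the repository). It sources the file in a subshell so the calling shell stays untouched, and it clears any inherited values first so that only the file's content is checked.

```sh
#!/bin/sh
# Hypothetical helper, not part of the repository.
# Sources an env file in a subshell and reports whether both Git
# identity variables are defined in it.
check_local_env() {
    env_file="${1:-local.env}"
    # A bare file name may be looked up in PATH by some shells, so anchor it
    case "$env_file" in
        */*) ;;
        *) env_file="./$env_file" ;;
    esac
    (
        if [ ! -f "$env_file" ]; then
            echo "[ERROR] $env_file not found" >&2
            exit 1
        fi
        # Ignore values inherited from the environment
        unset TUSTU_GIT_USERNAME TUSTU_GIT_EMAIL
        . "$env_file"
        if [ -z "${TUSTU_GIT_USERNAME:-}" ] || [ -z "${TUSTU_GIT_EMAIL:-}" ]; then
            echo "[ERROR] TUSTU_GIT_USERNAME and TUSTU_GIT_EMAIL must both be set" >&2
            exit 1
        fi
        echo "OK: $TUSTU_GIT_USERNAME <$TUSTU_GIT_EMAIL>"
    )
}
```

Run from the repository root, `check_local_env` prints the detected identity, or an error and a non-zero exit status if the file is missing or incomplete.
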
We can now run the experiment within the Docker container with the repository and the SSH volume mounted: | ||
|
||
```sh | ||
docker run --rm \ | ||
--mount type=bind,source="$(pwd)",target=/home/app \ | ||
--mount type=bind,source="$HOME/.ssh",target=/root/.ssh \ | ||
--mount type=bind,source="$HOME/.gitconfig",target=/root/.gitconfig \ | ||
--mount type=volume,source=ssh-config,target=/root/.ssh \ | ||
<your_image_name> \ | ||
/home/app/exp_workflow.sh | ||
``` | ||
|
||
In case you want to interact with the container, you can run it in interactive mode. `docker run --help` shows you all available options. | ||
|
||
```sh | ||
docker run -it --rm \ | ||
--mount type=bind,source="$(pwd)",target=/home/app \ | ||
--mount type=volume,source=ssh-config,target=/root/.ssh \ | ||
<your_image_name> | ||
``` | ||
|
||
## 6 - SLURM Job Configuration | ||
|
||
This section covers setting up SLURM jobs for the HPC cluster. SLURM manages resource allocation for your task, which we will specify in a batch job script. Our goal is to run the DVC experiment pipeline on the nodes inside a Singularity container that has been pulled and converted from your DockerHub image. The batch job script template [slurm_job.sh](../slurm_job.sh) handles these processes and requires minimal configuration. | ||
|
@@ -405,3 +443,5 @@ python multi_submission.py | |
``` | ||
|
||
For more information on running and monitoring jobs, refer to the [User Guide](./USAGE.md). | ||
|
||
> **Info**: Singularity is used for containerization on the cluster. In [slurm_job.sh](./../slurm_job.sh), the image is pulled from DockerHub and converted to a Singularity image. Unlike Docker, Singularity by default binds the complete home directory of the executing user to the container. Also, the user inside a Singularity container is the same as the user on the host system. Therefore, we do not get the same permission issues as with Docker. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,13 +7,31 @@ | |
# Description: This script runs an experiment with DVC within a temporary directory copy and pushes the results to the DVC and Git remote. | ||
|
||
# Set environment variables defined in global.env | ||
export $(grep -v '^#' global.env | xargs) | ||
set -o allexport | ||
source global.env | ||
set +o allexport | ||
|
||
# Define DEFAULT_DIR in the host environment | ||
export DEFAULT_DIR="$PWD" | ||
|
||
TUSTU_TMP_DIR=tmp | ||
|
||
# Set up a global git configuration when running inside a Docker container | ||
# Docker containers create a /.dockerenv file in the root directory | ||
if [ -f /.dockerenv ]; then | ||
if [ -f local.env ]; then | ||
source local.env; | ||
fi | ||
if [ -z "$TUSTU_GIT_USERNAME" ] || [ -z "$TUSTU_GIT_EMAIL" ]; then | ||
echo "[ERROR] Please create a local.env with the vars:"; | ||
echo "TUSTU_GIT_USERNAME=MY NAME"; | ||
echo "TUSTU_GIT_EMAIL=[email protected]"; | ||
exit 1; | ||
fi | ||
git config --global user.name "$TUSTU_GIT_USERNAME" | ||
git config --global user.email "$TUSTU_GIT_EMAIL" | ||
git config --global safe.directory "$PWD" | ||
fi | ||
|
||
# Create a new sub-directory in the temporary directory for the experiment | ||
echo "Creating temporary sub-directory..." && | ||
# Generate a unique ID with the current timestamp, process ID, and hostname for the sub-directory | ||
|
@@ -31,7 +49,8 @@ if [ -f ".dvc/config.local" ]; then | |
fi; | ||
echo ".git"; | ||
} | while read file; do | ||
rsync -aR "$file" $TUSTU_EXP_TMP_DIR; | ||
# --chown flag is needed for docker to avoid permission issues | ||
rsync -aR --chown $(id -u):$(id -g) "$file" $TUSTU_EXP_TMP_DIR; | ||
done && | ||
|
||
# Change the working directory to the temporary sub-directory | ||
|