Update guidelines to reflect the new Graphistry version (#44)

* update guidelines to reflect the new Graphistry version
* update docker image
* add NVIDIA GDS info about Azure images
* docs(docker compose): update from docker-compose

---------

Co-authored-by: Leo Meyerovich <[email protected]>
graphistry · Oct 28, 2024 · 4fcc6a0
1 parent 450fd09

Showing 24 changed files with 93 additions and 75 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -15,6 +15,9 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.htm
 ### Changed
 
 * Reorganized documentation into thematic areas
+* Update RAPIDS base image
+* Update RAPIDS Python env name to `base` from `rapids`
+* Updated most docker compose command references to `docker compose` from `docker-compose`
 
 ### Infra
 

diff --git a/README.md b/README.md
@@ -29,7 +29,7 @@ Graphistry is the most scalable graph-based visual analysis and investigation au
 You can test your GPU environment via Graphistry's [base RAPIDS Docker image on DockerHub](https://hub.docker.com/r/graphistry/graphistry-forge-base):
 
 ```bash
-docker run --rm -it --entrypoint=/bin/bash graphistry/graphistry-forge-base:latest -c "source activate rapids && python3 -c \"import cudf; print(cudf.DataFrame({'x': [0,1,2]})['x'].sum())\""
+docker run --rm -it --entrypoint=/bin/bash graphistry/graphistry-forge-base:latest -c "source activate base && python3 -c \"import cudf; print(cudf.DataFrame({'x': [0,1,2]})['x'].sum())\""
 ```
 
 =>

diff --git a/docs/app-config/configure-investigation.md b/docs/app-config/configure-investigation.md
@@ -35,8 +35,8 @@ Via `data/pivot-db/config/config.json`:
 
 After setting these, restart your server:
 
-* Full: `[email protected] : /var/graphistry $ docker-compose stop && docker-compose up -d`
-* Pivot: `[email protected] : /var/graphistry $ docker-compose stop nginx pivot && docker-compose up -d`
+* Full: `[email protected] : /var/graphistry $ docker compose stop && docker compose up -d`
+* Pivot: `[email protected] : /var/graphistry $ docker compose stop nginx pivot && docker compose up -d`
 
 
 # Schema

diff --git a/docs/app-config/configure-ontology.md b/docs/app-config/configure-ontology.md
@@ -15,7 +15,7 @@ See below for the list of built-in types they map to.
 ## Define custom ontologies
 
 1. Edit `data/investigations/config/config.json` as per below
-2. Restart docker service `pivot`: `docker-compose restart pivot`
+2. Restart docker service `pivot`: `docker compose restart pivot`
 
 Generally, you can limit the amount of work by mapping custom column names to built-in types, and thereby reuse their preconfigured settings.
 
@@ -73,7 +73,7 @@ For example, to create a new node type `ip`,
 
 2. Restart the pivot service:
 
-```[email protected]:/var/graphistry $ docker-compose stop pivot nginx && docker-compose up -d```
+```[email protected]:/var/graphistry $ docker compose stop pivot nginx && docker compose up -d```
 
 ### Override default node/edge titles
 
@@ -128,7 +128,7 @@ For example, to recognize `src_ip` and `dest_ip` columns as both generating `ip`
 
 2. Restart the pivot service:
 ```
-[email protected]:/var/graphistry $ docker-compose stop pivot nginx && docker-compose up -d
+[email protected]:/var/graphistry $ docker compose stop pivot nginx && docker compose up -d
 ```
 
 ## Built-in types
@@ -190,7 +190,7 @@ You can put any regular expression here:
 Graphistry tries to detect syntax error, and upon one, logs the error and stops. To see what is going on:
 
 `docker ps` <- see if `pivot` is unhealthy or in a restart loop
-`docker-compose logs pivot` <- see the precise error message
+`docker compose logs pivot` <- see the precise error message
 
 2. Satisfactory configuration
 

diff --git a/docs/app-config/configure.md b/docs/app-config/configure.md
@@ -99,7 +99,7 @@ For visualizations to be embeddable in different origin sites, enable `COOKIE_SE
 COOKIE_SAMESITE=None
 ```
 
-... then restart: `docker-compose up -d --force-recreate --no-deps nexus`
+... then restart: `docker compose up -d --force-recreate --no-deps nexus`
 
 
 ### Setup free Automatic TLS
@@ -182,8 +182,8 @@ Custom TLS setups often fail due to the certificate, OS, network, Caddy config,
 * Test the certificate
 * Test a [standalone Caddy static file server](https://www.baty.net/2018/using-caddy-for-serving-static-content/)
 * ... Including on another box, if OS/network issues are suspected
-* Check the logs of `docker-compose logs -f -t caddy nginx`
-* Test whether the containers are up and ports match via `docker-compose ps`, `curl`, and `curl` from within a docker container (so within the docker network namespace)
+* Check the logs of `docker compose logs -f -t caddy nginx`
+* Test whether the containers are up and ports match via `docker compose ps`, `curl`, and `curl` from within a docker container (so within the docker network namespace)
 
 If problems persist, please reach out to your Graphistry counterparts. Additional workarounds are possible.
 
@@ -281,7 +281,7 @@ SPLUNK_HOST=...
 
 2. Restart `graphistry`, or at least the `pivot` service:
 
-`docker-compose stop && docker-compose up -d` or `docker-compose stop nginx pivot && docker-compose up -d`
+`docker compose stop && docker compose up -d` or `docker compose stop nginx pivot && docker compose up -d`
 
 3. Test
 

diff --git a/docs/commands.md b/docs/commands.md
@@ -1,6 +1,6 @@
 # Top commands
 
-Graphistry supports advanced command-line administration via standard `docker-compose`, `.yml` / `.env` files, and `caddy` reverse-proxy configuration.
+Graphistry supports advanced command-line administration via standard `docker compose`, `.yml` / `.env` files, and `caddy` reverse-proxy configuration.
 
 ## Login to server
 
@@ -18,17 +18,17 @@ All likely require `sudo`. Run from where your `docker-compose.yml` file is loca
 |  TASK	| COMMAND 	| NOTES 	|
 |--: |:---	|:---	|
 | **Install** 	| `docker load -i containers.tar.gz` 	| Install the `containers.tar.gz` Graphistry release from the current folder. You may need to first run `tar -xvvf my-graphistry-release.tar.gz`.	|
-| **Start <br>interactive** 	| `docker-compose up` 	| Starts Graphistry, close with ctrl-c 	|
-| **Start <br>daemon** 	| `docker-compose up -d` 	| Starts Graphistry as background process 	|
-| **Start <br>namespaced (concurrent)** 	| `docker-compose -p my_unique_namespace up` 	| Starts Graphistry in a specific namespace. Enables running multiple independent instances of Graphistry. NOTE: Must modify Caddy service in `docker-compose.yml` to use non-conflicting public ports, and likewise change global volumes to be independent. 	|
-| **Stop** 	| `docker-compose stop` 	| Stops Graphistry 	|
+| **Start <br>interactive** 	| `docker compose up` 	| Starts Graphistry, close with ctrl-c 	|
+| **Start <br>daemon** 	| `docker compose up -d` 	| Starts Graphistry as background process 	|
+| **Start <br>namespaced (concurrent)** 	| `docker compose -p my_unique_namespace up` 	| Starts Graphistry in a specific namespace. Enables running multiple independent instances of Graphistry. NOTE: Must modify Caddy service in `docker-compose.yml` to use non-conflicting public ports, and likewise change global volumes to be independent. 	|
+| **Stop** 	| `docker compose stop` 	| Stops Graphistry 	|
 | **Restart (soft)** 	| `docker restart <CONTAINER>` 	| Soft restart. May also need to restart service `nginx`. 	|
 | **Restart (hard)** 	| `docker up -d --force-recreate --no-deps <CONTAINER>` 	|  Restart with fresh state. May also need to restart service `nginx`.	|
-| **Reset**     | `docker-compose down -v && docker-compose up -d` | Stop Graphistry, remove all internal state (including the user account database!), and start fresh .  |
-| **Status** 	 | `docker-compose ps`, `docker ps`, and `docker status` 	|  Status: Uptime, healthchecks, ...	|
+| **Reset**     | `docker compose down -v && docker compose up -d` | Stop Graphistry, remove all internal state (including the user account database!), and start fresh .  |
+| **Status** 	 | `docker compose ps`, `docker ps`, and `docker status` 	|  Status: Uptime, healthchecks, ...	|
 | **GPU Status** | `nvidia-smi` | See GPU processes, compute/memory consumption, and driver.  Ex: `watch -n 1.5 nvidia-smi`. Also, `docker run --rm -it nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi` for in-container test. |
-| **1.0 API Key** | docker-compose exec streamgl-vgraph-etl curl "http://0.0.0.0:8080/api/internal/provision?text=MYUSERNAME" 	|  Generates API key for a developer or notebook user	(1.0 API is deprecated)|
-| **Logs** 	|  `docker-compose logs <CONTAINER>` 	|  Ex: Watch all logs, starting with the 20 most recent lines:  `docker-compose logs -f -t --tail=20 forge-etl-python`	. You likely need to switch Docker to use the local json logging driver by  deleting the two default managed Splunk log driver options in `/etc/docker/daemon.json` and then restarting the `docker` daemon (see below). |
+| **1.0 API Key** | docker compose exec streamgl-vgraph-etl curl "http://0.0.0.0:8080/api/internal/provision?text=MYUSERNAME" 	|  Generates API key for a developer or notebook user	(1.0 API is deprecated)|
+| **Logs** 	|  `docker compose logs <CONTAINER>` 	|  Ex: Watch all logs, starting with the 20 most recent lines:  `docker compose logs -f -t --tail=20 forge-etl-python`	. You likely need to switch Docker to use the local json logging driver by  deleting the two default managed Splunk log driver options in `/etc/docker/daemon.json` and then restarting the `docker` daemon (see below). |
 | **Create Users** | Use Admin Panel (see [Create Users](tools/user-creation.md)) or `etc/scripts/rest` |
 | **Restart Docker Daemon** | `sudo service docker restart` | Use when changing `/etc/docker/daemon.json`, ... |
 | **Jupyter shell**| `docker exec -it -u root graphistry_notebook_1 bash` then `source activate rapids` | Use for admin tasks like global package installs |
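
Note on the `docker-compose` to `docker compose` rename above: hosts that were set up before this change may still have only the legacy standalone binary. The following is a minimal sketch, assuming a POSIX-style shell, of how an admin script could pick whichever Compose CLI is present. The helper name `compose_cmd` is hypothetical and is not part of Graphistry or Docker.

```shell
# Sketch: print the Compose command available on this host.
# `compose_cmd` is a hypothetical helper name used only for illustration.
compose_cmd() {
  if docker compose version >/dev/null 2>&1; then
    echo "docker compose"     # Compose v2, shipped as a docker CLI plugin
  elif command -v docker-compose >/dev/null 2>&1; then
    echo "docker-compose"     # legacy standalone binary
  else
    echo "no Docker Compose CLI found" >&2
    return 1
  fi
}
```

In `bash` or `sh`, a call such as `$(compose_cmd) -p my_unique_namespace up -d` relies on word splitting of the unquoted substitution, so the namespace flag ends up before the `up` subcommand, as in the table above.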
diff --git a/docs/debugging/debug-faq.md b/docs/debugging/debug-faq.md
@@ -19,7 +19,7 @@ Visualization page never returns or Nginx "504 Gateway Time-out" due to services
 * Often with first-ever container launch
 * Likely within 60s of launch
 * Can happen even after static homepage loads
-* In `docker-compose up` logs (or `docker logs ubuntu_central_1`):
+* In `docker compose up` logs (or `docker logs ubuntu_central_1`):
   * "Error: Server at maximum capacity...
   * "Error: Too many users...
   * "Error while assigning...
@@ -54,9 +54,9 @@ Visualization page never returns or Nginx "504 Gateway Time-out" due to services
     * ./graphistry-cli/graphistry/bootstrap/ubuntu-cuda9.2/test-20-docker.sh 
     * ./graphistry-cli/graphistry/bootstrap/ubuntu-cuda9.2/test-30-CUDA.sh 
     * ./graphistry-cli/graphistry/bootstrap/ubuntu-cuda9.2/test-40-nvidia-docker.sh
-    * nvidia-docker run --rm nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi
+    * nvidia-docker run --rm docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi
     * nvidia-docker exec -it ubuntu_viz_1 nvidia-smi
-      * If `run --rm nvidia/cuda:11.5.0-base-ubuntu20.04` succeeds but `exec` fails, you likely need to update `/etc/docker/daemon.json` to add `nvidia-container-runtime`, and `sudo service docker restart`, and potentially clean stale images to make sure they use the right runtime
+      * If `run --rm docker.io/rapidsai/base:24.04-cuda11.8-py3.10` succeeds but `exec` fails, you likely need to update `/etc/docker/daemon.json` to add `nvidia-container-runtime`, and `sudo service docker restart`, and potentially clean stale images to make sure they use the right runtime
     * See https://www.npmjs.com/package/@graphistry/cljs
     * In container `ubuntu_viz_1`, create & run `/opt/graphistry/apps/lib/cljs/test/cl node test-nvidia.js`:
 ```

diff --git a/docs/debugging/performance-tuning.md b/docs/debugging/performance-tuning.md
@@ -11,7 +11,7 @@ See also [deployment planning](../planning/deployment-planning.md) and [hw/sw pl
   * Check for both memory compute, and network consumption, and by which process 
 * Check logs for potential errors
   * System: Standard OS logs
-  * App: `docker-compose logs`
+  * App: `docker compose logs`
 * Log level impacts performance
   * TRACE: Slow due to heavy CPU <> GPU traffic
   * DEBUG: Will cause large log volumes that require rotation

diff --git a/docs/install/cloud/aws_marketplace.md b/docs/install/cloud/aws_marketplace.md
@@ -74,9 +74,9 @@ Many `ssh` clients may require you to first run `chmod 400 my_key.pem` or `chmod
 
 Graphistry leverages `docker-compose` and the AWS Marketplace AMI preconfigures the `nvidia` runtime for `docker`.
 
-```
+```bash
 cd ~/graphistry
-sudo docker-compose ps
+sudo docker compose ps
 ```
 
 =>
@@ -119,7 +119,7 @@ Note that `sudo` is unnecessary within the container:
 ubuntu@ip-172-31-0-38:~/graphistry$ docker exec -it -u root graphistry_notebook_1 bash
 root@d4afa8b7ced5:/home/graphistry# apt update 
 root@d4afa8b7ced5:/home/graphistry# apt install golang
-root@d4afa8b7ced5:/home/graphistry# source activate rapids && conda install pyarrow
+root@d4afa8b7ced5:/home/graphistry# source activate base && conda install pyarrow
 ```
 
 **User:**

diff --git a/docs/install/cloud/azure.md b/docs/install/cloud/azure.md
@@ -91,7 +91,7 @@ For steps involving an IP address, see needed IP value at Azure console in `Over
 
 Install docker-compose:
 
-```
+```bash
 sudo curl -L "https://github.com/docker/compose/releases/download/1.23.1/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
 sudo chmod +x /usr/local/bin/docker-compose
 ```

diff --git a/docs/install/cloud/azure_marketplace.md b/docs/install/cloud/azure_marketplace.md
@@ -138,8 +138,23 @@ graphistry@d4afa8b7ced5:~$ go version
 go version go1.10.4 linux/amd64
 ```
 
+### 8. GPUDirect Storage (GDS)
+**Issue Overview**
+A specific issue has been identified with NVIDIA GPUDirect Storage (GDS) on Azure `Ubuntu 22.04` images using the official NVIDIA CUDA Drivers (version `550`).  While the same operating system and driver versions work correctly on other cloud providers such as AWS AMI and Google Kubernetes Engine (GKE), the Azure platform presents a unique challenge.
 
-### 8. Marketplace FAQ
+**Temporary Workaround**
+To ensure stability and performance of Graphistry on Azure, GDS support has been disabled by default. This adjustment has been applied to all Azure Marketplace images by setting the environment variable `LIBCUDF_CUFILE_POLICY` to `OFF`.  See the official Magnum IO GPUDirect Storage (GDS) documentation here for more information:
+https://docs.rapids.ai/api/cudf/nightly/user_guide/io/io/#magnum-io-gpudirect-storage-integration
+
+**Future Considerations**
+Monitoring of this issue is ongoing to identify a permanent solution.  Once Azure resolves this issue, GDS support will be re-enabled to take full advantage of its performance benefits.  Customers will be informed of any updates or changes regarding this matter.
+
+**Recommendations**
+Users may override the default setting and enable GDS support manually by setting the environment variable in the `data/config/custom.env` file (a Docker Compose environment file). For example: `LIBCUDF_CUFILE_POLICY=ALWAYS`.
+
+Understanding and cooperation are appreciated as work towards a resolution continues. For further assistance, please reach out to the support team.
+
+### 9. Marketplace FAQ
 
 #### No site loads or there is an Nginx 404 error
 

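Note on the GDS override described in the added section: the variable lives in `data/config/custom.env`, and a repeated manual edit can leave duplicate assignments behind. The following is a minimal sketch, assuming a POSIX-style shell, that sets a key in an env file and replaces any earlier assignment of the same key. The helper name `set_env_var` is hypothetical; only the variable name and the `OFF` / `ALWAYS` values come from the text above.

```shell
# Sketch: set KEY=VALUE in an env file, replacing any existing assignment.
# `set_env_var` is a hypothetical helper name used only for illustration.
set_env_var() {
  file=$1
  key=$2
  value=$3
  tmp=$(mktemp) || return 1
  # Copy every line except earlier assignments of the same key.
  if [ -f "$file" ]; then
    grep -v "^${key}=" "$file" > "$tmp" || true
  fi
  # Append the new assignment as the last line.
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back in place so the file keeps its original permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

For example, `set_env_var data/config/custom.env LIBCUDF_CUFILE_POLICY ALWAYS` would opt back in to GDS. The containers then need a restart to pick up the new value.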
diff --git a/docs/install/on-prem/index.rst b/docs/install/on-prem/index.rst
@@ -28,7 +28,7 @@ Note: In previous versions (< `v2.35`), the file was `containers.tar`
 **2. Launch** from the folder with `docker-compose.yml` if not already up, and likely using `sudo`:
 
 ```bash
-docker-compose up -d
+docker compose up -d
 ```
 
 Note: Takes 1-3 min, and around 5 min, `docker ps` should report all services as `healthy`

diff --git a/docs/install/on-prem/manual.md b/docs/install/on-prem/manual.md
@@ -53,7 +53,7 @@ Skip almost all of these steps by instead running through [AWS Marketplace](../c
 
 * **Start from an Nvidia instace**
 <br>You can skip most of the steps by starting with an Nvidia NGC or Tensorflow instance. 
-  * These still typically require installing `docker-compose` (and testing it), setting `/etc/docker/daemon.json` to default to the `nvidia-docker` runtime, and restarting `docker` (and testing it). See end of [RHEL 7.6](rhel_7_6_setup.md) and [Ubuntu 18.04 LTS](ubuntu_18_04_lts_setup.md) sample scripts for install and test instruction.
+  * These still typically require installing `docker compose` (and testing it), setting `/etc/docker/daemon.json` to default to the `nvidia-docker` runtime, and restarting `docker` (and testing it). See end of [RHEL 7.6](rhel_7_6_setup.md) and [Ubuntu 18.04 LTS](ubuntu_18_04_lts_setup.md) sample scripts for install and test instruction.
 * **Start from raw Ubuntu/RHEL**
 <br>You can build from scratch by picking a fully unconfigured starting point and following the [RHEL 7.6](rhel_7_6_setup.md) and [Ubuntu 18.04 LTS](ubuntu_18_04_lts_setup.md) On-Prem Sample instructions. Contact Graphistry staff for automation script assistance if also applicable.
 
@@ -84,6 +84,6 @@ docker load -i containers.tar
 
 ## 5. Start
 
-Launch with `docker-compose up`, and stop with `ctrl-c`. To start as a background daemon, use `docker-compose up -d`.
+Launch with `docker compose up`, and stop with `ctrl-c`. To start as a background daemon, use `docker compose up -d`.
 
 Congratulations, you have installed Graphistry!
diff --git a/docs/install/on-prem/rhel8_prereqs_install.sh b/docs/install/on-prem/rhel8_prereqs_install.sh
@@ -167,7 +167,7 @@ docker compose version \
 
 BOOTSTRAP_DIR="${GRAPHISTRY_HOME}/etc/scripts/bootstrap"
 CUDA_SHORT_VERSION=${CUDA_SHORT_VERSION:-`cat ${GRAPHISTRY_HOME}/CUDA_SHORT_VERSION`}
-NVIDIA_CONTAINER="nvidia/cuda:11.5.2-base-ubuntu20.04"
+NVIDIA_CONTAINER="docker.io/rapidsai/base:24.04-cuda11.8-py3.10"
 
 
 sudo docker run --rm --gpus all ${NVIDIA_CONTAINER} nvidia-smi \

diff --git a/docs/install/on-prem/rhel_7_6_setup.md b/docs/install/on-prem/rhel_7_6_setup.md
@@ -67,7 +67,7 @@ curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidi
 sudo yum install -y nvidia-container-runtime
 sudo systemctl enable --now docker
 
-sudo docker run --gpus all nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi
+sudo docker run --gpus all docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi
 
 # Nvidia docker as default runtime (needed for docker-compose)
 sudo yum install -y vim
@@ -83,6 +83,6 @@ sudo vim /etc/docker/daemon.json
 }
 sudo systemctl restart docker
 
-sudo docker run --runtime=nvidia --rm nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi
-sudo docker run --rm nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi
+sudo docker run --runtime=nvidia --rm docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi
+sudo docker run --rm docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi
 ```
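
Note on the two final smoke tests in this hunk: the first passes `--runtime=nvidia` explicitly, while the second succeeds only if `/etc/docker/daemon.json` makes `nvidia` the default runtime. The following is a minimal sketch, assuming a POSIX-style shell, that runs both checks and reports which one failed. The helper name `check_gpu_runtime` and its status strings are hypothetical; the image tag is the one used in this commit.

```shell
# Sketch: run nvidia-smi in a container with and without an explicit runtime.
# `check_gpu_runtime` and its output strings are hypothetical.
check_gpu_runtime() {
  image=${1:-docker.io/rapidsai/base:24.04-cuda11.8-py3.10}
  # Step 1: explicit runtime flag; failure points at the driver or toolkit install.
  if ! docker run --rm --runtime=nvidia "$image" nvidia-smi >/dev/null 2>&1; then
    echo "explicit-runtime-failed"
    return 1
  fi
  # Step 2: no flag; failure suggests nvidia is not the default runtime.
  if ! docker run --rm "$image" nvidia-smi >/dev/null 2>&1; then
    echo "default-runtime-not-nvidia"
    return 2
  fi
  echo "ok"
}
```

A result of `default-runtime-not-nvidia` corresponds to the case the setup script addresses by editing `/etc/docker/daemon.json` and restarting `docker`.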
diff --git a/docs/install/on-prem/ubuntu_18_04_lts_setup.md b/docs/install/on-prem/ubuntu_18_04_lts_setup.md
@@ -109,7 +109,7 @@ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
 sudo systemctl restart docker
 
 #_not_ default runtime
-sudo docker run --gpus all nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi
+sudo docker run --gpus all docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi
 
 ####################
 #                  #
@@ -134,6 +134,6 @@ EOF
 
 sudo systemctl restart docker
 
-sudo docker run --runtime=nvidia --rm nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi
-sudo docker run --rm nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi
+sudo docker run --runtime=nvidia --rm docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi
+sudo docker run --rm docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi
 ```
diff --git a/docs/install/on-prem/ubuntu_20_04_setup.sh b/docs/install/on-prem/ubuntu_20_04_setup.sh
@@ -296,7 +296,7 @@ sudo docker compose version
 
 # not used:
 # CUDA_SHORT_VERSION=${CUDA_SHORT_VERSION:-`cat ${GRAPHISTRY_HOME}/CUDA_SHORT_VERSION`}
-NVIDIA_CONTAINER="nvidia/cuda:11.0.3-base-ubuntu18.04"
+NVIDIA_CONTAINER="docker.io/rapidsai/base:24.04-cuda11.8-py3.10"
 
 
 sudo docker run --rm --gpus all ${NVIDIA_CONTAINER} nvidia-smi \

diff --git a/docs/install/on-prem/vGPU.md b/docs/install/on-prem/vGPU.md
@@ -17,7 +17,7 @@ A *baremetal OS* (no hypervisor) or *passthrough driver* (hypervisor with non-vG
   * Graphistry already automatically uses all GPUs exposed to it, primarily for scaling to more user sessions
   * New APIs are starting to use multi-GPUs for acceleration as well
 * Multiple Graphistry installs
-  * You can launch concurrent instances of Graphistry using docker: `docker-compose up -p my_unique_namespace_123`
+  * You can launch concurrent instances of Graphistry using docker: `docker compose up -p my_unique_namespace_123`
   * You can configure docker to use different GPUs or share the same ones
 * Isolate Graphistry from other GPU software
   * Docker allows picking which GPUs + CPUs are used