diff --git a/CHANGELOG.md b/CHANGELOG.md index 27ca0ab..cf95884 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -15,6 +15,9 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.htm ### Changed * Reorganized documentation into thematic areas +* Update RAPIDS base image +* Update RAPIDS Python env name to `base` from `rapids` +* Updated most docker compose command references to `docker compose` from `docker-compose` ### Infra diff --git a/README.md b/README.md index d1e221e..de48f3c 100644 --- a/README.md +++ b/README.md @@ -29,7 +29,7 @@ Graphistry is the most scalable graph-based visual analysis and investigation au You can test your GPU environment via Graphistry's [base RAPIDS Docker image on DockerHub](https://hub.docker.com/r/graphistry/graphistry-forge-base): ```bash -docker run --rm -it --entrypoint=/bin/bash graphistry/graphistry-forge-base:latest -c "source activate rapids && python3 -c \"import cudf; print(cudf.DataFrame({'x': [0,1,2]})['x'].sum())\"" +docker run --rm -it --entrypoint=/bin/bash graphistry/graphistry-forge-base:latest -c "source activate base && python3 -c \"import cudf; print(cudf.DataFrame({'x': [0,1,2]})['x'].sum())\"" ``` => diff --git a/docs/app-config/configure-investigation.md b/docs/app-config/configure-investigation.md index 3da9468..e515631 100644 --- a/docs/app-config/configure-investigation.md +++ b/docs/app-config/configure-investigation.md @@ -35,8 +35,8 @@ Via `data/pivot-db/config/config.json`: After setting these, restart your server: -* Full: `user@server.com : /var/graphistry $ docker-compose stop && docker-compose up -d` -* Pivot: `user@server.com : /var/graphistry $ docker-compose stop nginx pivot && docker-compose up -d` +* Full: `user@server.com : /var/graphistry $ docker compose stop && docker compose up -d` +* Pivot: `user@server.com : /var/graphistry $ docker compose stop nginx pivot && docker compose up -d` # Schema diff --git a/docs/app-config/configure-ontology.md 
b/docs/app-config/configure-ontology.md index c171d20..28c9f49 100644
--- a/docs/app-config/configure-ontology.md
+++ b/docs/app-config/configure-ontology.md
@@ -15,7 +15,7 @@ See below for the list of built-in types they map to.

## Define custom ontologies

1. Edit `data/investigations/config/config.json` as per below
-2. Restart docker service `pivot`: `docker-compose restart pivot`
+2. Restart docker service `pivot`: `docker compose restart pivot`

Generally, you can limit the amount of work by mapping custom column names to built-in types, and thereby reuse their preconfigured settings.

@@ -73,7 +73,7 @@ For example, to create a new node type `ip`,

2. Restart the pivot service:

-```user@server.com:/var/graphistry $ docker-compose stop pivot nginx && docker-compose up -d```
+```user@server.com:/var/graphistry $ docker compose stop pivot nginx && docker compose up -d```

### Override default node/edge titles

@@ -128,7 +128,7 @@ For example, to recognize `src_ip` and `dest_ip` columns as both generating `ip`

2. Restart the pivot service:

```
-user@server.com:/var/graphistry $ docker-compose stop pivot nginx && docker-compose up -d
+user@server.com:/var/graphistry $ docker compose stop pivot nginx && docker compose up -d
```

## Built-in types

@@ -190,7 +190,7 @@ You can put any regular expression here:

Graphistry tries to detect syntax errors and, upon finding one, logs the error and stops. To see what is going on:

`docker ps` <- see if `pivot` is unhealthy or in a restart loop
-`docker-compose logs pivot` <- see the precise error message
+`docker compose logs pivot` <- see the precise error message

2. Satisfactory configuration

diff --git a/docs/app-config/configure.md index 0ae0b6c..b40f769 100644
--- a/docs/app-config/configure.md
+++ b/docs/app-config/configure.md
@@ -99,7 +99,7 @@ For visualizations to be embeddable in different origin sites, enable `COOKIE_SE

COOKIE_SAMESITE=None
```

-...
then restart: `docker-compose up -d --force-recreate --no-deps nexus` +... then restart: `docker compose up -d --force-recreate --no-deps nexus` ### Setup free Automatic TLS @@ -182,8 +182,8 @@ Custom TLS setups often fail due to the certificate, OS, network, Caddy config, * Test the certificate * Test a [standalone Caddy static file server](https://www.baty.net/2018/using-caddy-for-serving-static-content/) * ... Including on another box, if OS/network issues are suspected -* Check the logs of `docker-compose logs -f -t caddy nginx` -* Test whether the containers are up and ports match via `docker-compose ps`, `curl`, and `curl` from within a docker container (so within the docker network namespace) +* Check the logs of `docker compose logs -f -t caddy nginx` +* Test whether the containers are up and ports match via `docker compose ps`, `curl`, and `curl` from within a docker container (so within the docker network namespace) If problems persist, please reach out to your Graphistry counterparts. Additional workarounds are possible. @@ -281,7 +281,7 @@ SPLUNK_HOST=... 2. Restart `graphistry`, or at least the `pivot` service: -`docker-compose stop && docker-compose up -d` or `docker-compose stop nginx pivot && docker-compose up -d` +`docker compose stop && docker compose up -d` or `docker compose stop nginx pivot && docker compose up -d` 3. Test diff --git a/docs/commands.md b/docs/commands.md index 5be3509..40bc2a2 100644 --- a/docs/commands.md +++ b/docs/commands.md @@ -1,6 +1,6 @@ # Top commands -Graphistry supports advanced command-line administration via standard `docker-compose`, `.yml` / `.env` files, and `caddy` reverse-proxy configuration. +Graphistry supports advanced command-line administration via standard `docker compose`, `.yml` / `.env` files, and `caddy` reverse-proxy configuration. ## Login to server @@ -18,17 +18,17 @@ All likely require `sudo`. 
Run from where your `docker-compose.yml` file is located.

| TASK | COMMAND | NOTES |
|--: |:--- |:--- |
| **Install** | `docker load -i containers.tar.gz` | Install the `containers.tar.gz` Graphistry release from the current folder. You may need to first run `tar -xvvf my-graphistry-release.tar.gz`. |
-| **Start
interactive** | `docker-compose up` | Starts Graphistry, close with ctrl-c | -| **Start
daemon** | `docker-compose up -d` | Starts Graphistry as background process | -| **Start
namespaced (concurrent)** | `docker-compose -p my_unique_namespace up` | Starts Graphistry in a specific namespace. Enables running multiple independent instances of Graphistry. NOTE: Must modify Caddy service in `docker-compose.yml` to use non-conflicting public ports, and likewise change global volumes to be independent. | -| **Stop** | `docker-compose stop` | Stops Graphistry | +| **Start
interactive** | `docker compose up` | Starts Graphistry, close with ctrl-c | +| **Start
daemon** | `docker compose up -d` | Starts Graphistry as background process | +| **Start
namespaced (concurrent)** | `docker compose -p my_unique_namespace up` | Starts Graphistry in a specific namespace. Enables running multiple independent instances of Graphistry. NOTE: Must modify Caddy service in `docker-compose.yml` to use non-conflicting public ports, and likewise change global volumes to be independent. |
+| **Stop** | `docker compose stop` | Stops Graphistry |
| **Restart (soft)** | `docker restart ` | Soft restart. May also need to restart service `nginx`. |
| **Restart (hard)** | `docker compose up -d --force-recreate --no-deps ` | Restart with fresh state. May also need to restart service `nginx`. |
-| **Reset** | `docker-compose down -v && docker-compose up -d` | Stop Graphistry, remove all internal state (including the user account database!), and start fresh . |
-| **Status** | `docker-compose ps`, `docker ps`, and `docker status` | Status: Uptime, healthchecks, ... |
+| **Reset** | `docker compose down -v && docker compose up -d` | Stop Graphistry, remove all internal state (including the user account database!), and start fresh. |
+| **Status** | `docker compose ps`, `docker ps`, and `docker stats` | Status: Uptime, healthchecks, ... |
| **GPU Status** | `nvidia-smi` | See GPU processes, compute/memory consumption, and driver. Ex: `watch -n 1.5 nvidia-smi`. Also, `docker run --rm -it nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi` for in-container test. |
-| **1.0 API Key** | docker-compose exec streamgl-vgraph-etl curl "http://0.0.0.0:8080/api/internal/provision?text=MYUSERNAME" | Generates API key for a developer or notebook user (1.0 API is deprecated)|
-| **Logs** | `docker-compose logs ` | Ex: Watch all logs, starting with the 20 most recent lines: `docker-compose logs -f -t --tail=20 forge-etl-python` . You likely need to switch Docker to use the local json logging driver by deleting the two default managed Splunk log driver options in `/etc/docker/daemon.json` and then restarting the `docker` daemon (see below).
|
+| **1.0 API Key** | `docker compose exec streamgl-vgraph-etl curl "http://0.0.0.0:8080/api/internal/provision?text=MYUSERNAME"` | Generates API key for a developer or notebook user (1.0 API is deprecated) |
+| **Logs** | `docker compose logs ` | Ex: Watch all logs, starting with the 20 most recent lines: `docker compose logs -f -t --tail=20 forge-etl-python`. You likely need to switch Docker to use the local `json-file` logging driver by deleting the two default managed Splunk log driver options in `/etc/docker/daemon.json` and then restarting the `docker` daemon (see below). |
| **Create Users** | Use Admin Panel (see [Create Users](tools/user-creation.md)) or `etc/scripts/rest` |
| **Restart Docker Daemon** | `sudo service docker restart` | Use when changing `/etc/docker/daemon.json`, ... |
| **Jupyter shell**| `docker exec -it -u root graphistry_notebook_1 bash` then `source activate base` | Use for admin tasks like global package installs |
\ No newline at end of file
diff --git a/docs/debugging/debug-faq.md index 7782075..a0ea8e9 100644
--- a/docs/debugging/debug-faq.md
+++ b/docs/debugging/debug-faq.md
@@ -19,7 +19,7 @@ Visualization page never returns or Nginx "504 Gateway Time-out" due to services

* Often with first-ever container launch
* Likely within 60s of launch
* Can happen even after static homepage loads
-* In `docker-compose up` logs (or `docker logs ubuntu_central_1`):
+* In `docker compose up` logs (or `docker logs ubuntu_central_1`):
* "Error: Server at maximum capacity...
* "Error: Too many users...
* "Error while assigning...
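The Logs row above tells admins to switch Docker to the local `json-file` logging driver by editing `/etc/docker/daemon.json`. A minimal sketch of that edit, assuming `python3` is available for JSON-safe rewriting; the `DAEMON_JSON` variable and its scratch-file default are illustrative, not part of the product (point it at `/etc/docker/daemon.json`, as root, on a real server):

```shell
#!/bin/sh
# Sketch: switch Docker to the local json-file log driver.
# DAEMON_JSON defaults to a scratch file for a dry run; on a server,
# run as root with DAEMON_JSON=/etc/docker/daemon.json.
DAEMON_JSON="${DAEMON_JSON:-daemon.json}"

python3 - "$DAEMON_JSON" <<'PYEOF'
import json, sys

path = sys.argv[1]
try:
    with open(path) as f:
        cfg = json.load(f)
except FileNotFoundError:
    cfg = {}  # no existing config: start empty

# Use the local json-file driver and drop any managed Splunk driver options
cfg["log-driver"] = "json-file"
cfg.pop("log-opts", None)

with open(path, "w") as f:
    json.dump(cfg, f, indent=2)
PYEOF

cat "$DAEMON_JSON"
# Then apply the change: sudo service docker restart
```

After the daemon restart, `docker compose logs` reads from the local driver instead of the managed Splunk one.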
@@ -54,9 +54,9 @@ Visualization page never returns or Nginx "504 Gateway Time-out" due to services * ./graphistry-cli/graphistry/bootstrap/ubuntu-cuda9.2/test-20-docker.sh * ./graphistry-cli/graphistry/bootstrap/ubuntu-cuda9.2/test-30-CUDA.sh * ./graphistry-cli/graphistry/bootstrap/ubuntu-cuda9.2/test-40-nvidia-docker.sh - * nvidia-docker run --rm nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi + * nvidia-docker run --rm docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi * nvidia-docker exec -it ubuntu_viz_1 nvidia-smi - * If `run --rm nvidia/cuda:11.5.0-base-ubuntu20.04` succeeds but `exec` fails, you likely need to update `/etc/docker/daemon.json` to add `nvidia-container-runtime`, and `sudo service docker restart`, and potentially clean stale images to make sure they use the right runtime + * If `run --rm docker.io/rapidsai/base:24.04-cuda11.8-py3.10` succeeds but `exec` fails, you likely need to update `/etc/docker/daemon.json` to add `nvidia-container-runtime`, and `sudo service docker restart`, and potentially clean stale images to make sure they use the right runtime * See https://www.npmjs.com/package/@graphistry/cljs * In container `ubuntu_viz_1`, create & run `/opt/graphistry/apps/lib/cljs/test/cl node test-nvidia.js`: ``` diff --git a/docs/debugging/performance-tuning.md b/docs/debugging/performance-tuning.md index 660dfa4..1309cf9 100644 --- a/docs/debugging/performance-tuning.md +++ b/docs/debugging/performance-tuning.md @@ -11,7 +11,7 @@ See also [deployment planning](../planning/deployment-planning.md) and [hw/sw pl * Check for both memory compute, and network consumption, and by which process * Check logs for potential errors * System: Standard OS logs - * App: `docker-compose logs` + * App: `docker compose logs` * Log level impacts performance * TRACE: Slow due to heavy CPU <> GPU traffic * DEBUG: Will cause large log volumes that require rotation diff --git a/docs/install/cloud/aws_marketplace.md b/docs/install/cloud/aws_marketplace.md index 
3223b7f..59ad56f 100644 --- a/docs/install/cloud/aws_marketplace.md +++ b/docs/install/cloud/aws_marketplace.md @@ -74,9 +74,9 @@ Many `ssh` clients may require you to first run `chmod 400 my_key.pem` or `chmod Graphistry leverages `docker-compose` and the AWS Marketplace AMI preconfigures the `nvidia` runtime for `docker`. -``` +```bash cd ~/graphistry -sudo docker-compose ps +sudo docker compose ps ``` => @@ -119,7 +119,7 @@ Note that `sudo` is unnecessary within the container: ubuntu@ip-172-31-0-38:~/graphistry$ docker exec -it -u root graphistry_notebook_1 bash root@d4afa8b7ced5:/home/graphistry# apt update root@d4afa8b7ced5:/home/graphistry# apt install golang -root@d4afa8b7ced5:/home/graphistry# source activate rapids && conda install pyarrow +root@d4afa8b7ced5:/home/graphistry# source activate base && conda install pyarrow ``` **User:** diff --git a/docs/install/cloud/azure.md b/docs/install/cloud/azure.md index 4d1d7fa..44ae13e 100644 --- a/docs/install/cloud/azure.md +++ b/docs/install/cloud/azure.md @@ -91,7 +91,7 @@ For steps involving an IP address, see needed IP value at Azure console in `Over Install docker-compose: -``` +```bash sudo curl -L "https://github.com/docker/compose/releases/download/1.23.1/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose sudo chmod +x /usr/local/bin/docker-compose ``` diff --git a/docs/install/cloud/azure_marketplace.md b/docs/install/cloud/azure_marketplace.md index e39b88b..654805c 100644 --- a/docs/install/cloud/azure_marketplace.md +++ b/docs/install/cloud/azure_marketplace.md @@ -138,8 +138,23 @@ graphistry@d4afa8b7ced5:~$ go version go version go1.10.4 linux/amd64 ``` +### 8. GPUDirect Storage (GDS) +**Issue Overview** +A specific issue has been identified with NVIDIA GPUDirect Storage (GDS) on Azure `Ubuntu 22.04` images using the official NVIDIA CUDA Drivers (version `550`). 
While the same operating system and driver versions work correctly on other cloud platforms such as AWS AMIs and Google Kubernetes Engine (GKE), the Azure platform presents a unique challenge.

-### 8. Marketplace FAQ
+**Temporary Workaround**
+To keep Graphistry stable and performant on Azure, GDS support is disabled by default: all Azure Marketplace images set the environment variable `LIBCUDF_CUFILE_POLICY` to `OFF`. See the official Magnum IO GPUDirect Storage (GDS) documentation for more information:
+https://docs.rapids.ai/api/cudf/nightly/user_guide/io/io/#magnum-io-gpudirect-storage-integration
+
+**Future Considerations**
+Monitoring continues toward a permanent solution. Once Azure resolves the issue, GDS support will be re-enabled to regain its performance benefits, and customers will be notified of any updates.
+
+**Recommendations**
+To override the default and enable GDS manually, set the environment variable in the `data/config/custom.env` file (a Docker Compose environment file). For example: `LIBCUDF_CUFILE_POLICY=ALWAYS`.
+
+For further assistance, please reach out to the support team.
+
+### 9. Marketplace FAQ

#### No site loads or there is an Nginx 404 error

diff --git a/docs/install/on-prem/index.rst index 02c830f..9b926d0 100644
--- a/docs/install/on-prem/index.rst
+++ b/docs/install/on-prem/index.rst
@@ -28,7 +28,7 @@ Note: In previous versions (< `v2.35`), the file was `containers.tar`

**2.
Launch** from the folder with `docker-compose.yml` if not already up, and likely using `sudo`:

```bash
-docker-compose up -d
+docker compose up -d
```

Note: Takes 1-3 min; within around 5 min, `docker ps` should report all services as `healthy`

diff --git a/docs/install/on-prem/manual.md index 9eee42b..f028362 100644
--- a/docs/install/on-prem/manual.md
+++ b/docs/install/on-prem/manual.md
@@ -53,7 +53,7 @@ Skip almost all of these steps by instead running through [AWS Marketplace](../c

* **Start from an Nvidia instance**
You can skip most of the steps by starting with an Nvidia NGC or Tensorflow instance.
- * These still typically require installing `docker-compose` (and testing it), setting `/etc/docker/daemon.json` to default to the `nvidia-docker` runtime, and restarting `docker` (and testing it). See end of [RHEL 7.6](rhel_7_6_setup.md) and [Ubuntu 18.04 LTS](ubuntu_18_04_lts_setup.md) sample scripts for install and test instruction.
+ * These still typically require installing the `docker compose` plugin (and testing it), setting `/etc/docker/daemon.json` to default to the `nvidia-docker` runtime, and restarting `docker` (and testing it). See end of the [RHEL 7.6](rhel_7_6_setup.md) and [Ubuntu 18.04 LTS](ubuntu_18_04_lts_setup.md) sample scripts for install and test instructions.
* **Start from raw Ubuntu/RHEL**
You can build from scratch by picking a fully unconfigured starting point and following the [RHEL 7.6](rhel_7_6_setup.md) and [Ubuntu 18.04 LTS](ubuntu_18_04_lts_setup.md) On-Prem Sample instructions. Contact Graphistry staff for automation script assistance if also applicable. @@ -84,6 +84,6 @@ docker load -i containers.tar ## 5. Start -Launch with `docker-compose up`, and stop with `ctrl-c`. To start as a background daemon, use `docker-compose up -d`. +Launch with `docker compose up`, and stop with `ctrl-c`. To start as a background daemon, use `docker compose up -d`. Congratulations, you have installed Graphistry! diff --git a/docs/install/on-prem/rhel8_prereqs_install.sh b/docs/install/on-prem/rhel8_prereqs_install.sh index 7e18378..c252ba7 100644 --- a/docs/install/on-prem/rhel8_prereqs_install.sh +++ b/docs/install/on-prem/rhel8_prereqs_install.sh @@ -167,7 +167,7 @@ docker compose version \ BOOTSTRAP_DIR="${GRAPHISTRY_HOME}/etc/scripts/bootstrap" CUDA_SHORT_VERSION=${CUDA_SHORT_VERSION:-`cat ${GRAPHISTRY_HOME}/CUDA_SHORT_VERSION`} -NVIDIA_CONTAINER="nvidia/cuda:11.5.2-base-ubuntu20.04" +NVIDIA_CONTAINER="docker.io/rapidsai/base:24.04-cuda11.8-py3.10" sudo docker run --rm --gpus all ${NVIDIA_CONTAINER} nvidia-smi \ diff --git a/docs/install/on-prem/rhel_7_6_setup.md b/docs/install/on-prem/rhel_7_6_setup.md index 567604a..6001876 100644 --- a/docs/install/on-prem/rhel_7_6_setup.md +++ b/docs/install/on-prem/rhel_7_6_setup.md @@ -67,7 +67,7 @@ curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidi sudo yum install -y nvidia-container-runtime sudo systemctl enable --now docker -sudo docker run --gpus all nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi +sudo docker run --gpus all docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi # Nvidia docker as default runtime (needed for docker-compose) sudo yum install -y vim @@ -83,6 +83,6 @@ sudo vim /etc/docker/daemon.json } sudo systemctl restart docker -sudo docker run --runtime=nvidia 
--rm nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi -sudo docker run --rm nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi +sudo docker run --runtime=nvidia --rm docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi +sudo docker run --rm docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi ``` diff --git a/docs/install/on-prem/ubuntu_18_04_lts_setup.md b/docs/install/on-prem/ubuntu_18_04_lts_setup.md index 6e40fa5..0d644a5 100644 --- a/docs/install/on-prem/ubuntu_18_04_lts_setup.md +++ b/docs/install/on-prem/ubuntu_18_04_lts_setup.md @@ -109,7 +109,7 @@ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit sudo systemctl restart docker #_not_ default runtime -sudo docker run --gpus all nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi +sudo docker run --gpus all docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi #################### # # @@ -134,6 +134,6 @@ EOF sudo systemctl restart docker -sudo docker run --runtime=nvidia --rm nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi -sudo docker run --rm nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi +sudo docker run --runtime=nvidia --rm docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi +sudo docker run --rm docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi ``` diff --git a/docs/install/on-prem/ubuntu_20_04_setup.sh b/docs/install/on-prem/ubuntu_20_04_setup.sh index 774b90e..8dd6cb7 100755 --- a/docs/install/on-prem/ubuntu_20_04_setup.sh +++ b/docs/install/on-prem/ubuntu_20_04_setup.sh @@ -296,7 +296,7 @@ sudo docker compose version # not used: # CUDA_SHORT_VERSION=${CUDA_SHORT_VERSION:-`cat ${GRAPHISTRY_HOME}/CUDA_SHORT_VERSION`} -NVIDIA_CONTAINER="nvidia/cuda:11.0.3-base-ubuntu18.04" +NVIDIA_CONTAINER="docker.io/rapidsai/base:24.04-cuda11.8-py3.10" sudo docker run --rm --gpus all ${NVIDIA_CONTAINER} nvidia-smi \ diff --git a/docs/install/on-prem/vGPU.md b/docs/install/on-prem/vGPU.md index c544053..1b84d9c 100644 --- a/docs/install/on-prem/vGPU.md +++ b/docs/install/on-prem/vGPU.md 
@@ -17,7 +17,7 @@ A *baremetal OS* (no hypervisor) or *passthrough driver* (hypervisor with non-vG

* Graphistry already automatically uses all GPUs exposed to it, primarily for scaling to more user sessions
* New APIs are starting to use multi-GPUs for acceleration as well
* Multiple Graphistry installs
- * You can launch concurrent instances of Graphistry using docker: `docker-compose up -p my_unique_namespace_123`
+ * You can launch concurrent instances of Graphistry using docker: `docker compose -p my_unique_namespace_123 up`
* You can configure docker to use different GPUs or share the same ones
* Isolate Graphistry from other GPU software
* Docker allows picking which GPUs + CPUs are used

diff --git a/docs/install/testing-an-install.md index 934439e..c9d8645 100644
--- a/docs/install/testing-an-install.md
+++ b/docs/install/testing-an-install.md
@@ -6,7 +6,7 @@ Most of the testing and inspection is standard for Docker-based web apps: `docke

* To test your base Docker environment for GPU RAPIDS, see the in-depth GPU testing section below.

-* For logs throughout your session, you can run `docker-compose logs -f -t --tail=1` and `docker-compose logs -f -t --tail=1 SOME_SERVICE_NAME` to see the effects of your activities. Modify `custom.env` to increase `GRAPHISTRY_LOG_LEVEL` and `LOG_LEVEL` to `DEBUG` for increased logging, and `/etc/docker/daemon.json` to use log driver `json-file` for local logs.
+* For logs throughout your session, you can run `docker compose logs -f -t --tail=1` and `docker compose logs -f -t --tail=1 SOME_SERVICE_NAME` to see the effects of your activities. Modify `custom.env` to increase `GRAPHISTRY_LOG_LEVEL` and `LOG_LEVEL` to `DEBUG` for increased logging, and `/etc/docker/daemon.json` to use log driver `json-file` for local logs.

NOTE: Below tests use the deprecated 1.0 REST upload API.
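The logs bullet above raises `GRAPHISTRY_LOG_LEVEL` and `LOG_LEVEL` to `DEBUG` in `custom.env`. A sketch of doing that idempotently from a shell; it assumes GNU `sed`, and the `ENV_FILE` scratch-file default is illustrative (on a server it would be `data/config/custom.env`):

```shell
#!/bin/sh
# Sketch: set Graphistry debug log levels in the compose env file, idempotently.
# ENV_FILE defaults to a scratch file; on a server use data/config/custom.env.
ENV_FILE="${ENV_FILE:-custom.env}"
touch "$ENV_FILE"

for kv in GRAPHISTRY_LOG_LEVEL=DEBUG LOG_LEVEL=DEBUG; do
  key="${kv%%=*}"
  if grep -q "^${key}=" "$ENV_FILE"; then
    sed -i "s|^${key}=.*|${kv}|" "$ENV_FILE"   # rewrite existing assignment
  else
    echo "$kv" >> "$ENV_FILE"                  # append a new assignment
  fi
done

cat "$ENV_FILE"
```

Then recreate the affected services so they pick up the fresh environment, e.g. `docker compose up -d --force-recreate --no-deps forge-etl-python`, and remember to lower the levels again afterward, since DEBUG logging is slow and voluminous.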
@@ -54,9 +54,9 @@ f0bc21b5bda2 compose_redis_1 0.05% 6.781MiB / 31.27GiB | Jupyter notebooks | `notebook` (heavy) | | Dashboards | `graph-app-kit-public`, `graph-app-kit-private` | -* It is safe to reset any individual container **except** `postgres`, which is stateful: `docker-compose up -d --force-recreate --no-deps ` +* It is safe to reset any individual container **except** `postgres`, which is stateful: `docker compose up -d --force-recreate --no-deps ` -* For any unhealthy container, such as stuck in a restart loop, check `docker-compose logs -f -t --tail=1000 that_service`. To further diagnose, potentially increase the system log level (edit `data/config/custom.env` to have `LOG_LEVEL=DEBUG`, `GRAPHISTRY_LOG_LEVEL=DEBUG`) and recreate + restart the unhealthy container +* For any unhealthy container, such as stuck in a restart loop, check `docker compose logs -f -t --tail=1000 that_service`. To further diagnose, potentially increase the system log level (edit `data/config/custom.env` to have `LOG_LEVEL=DEBUG`, `GRAPHISTRY_LOG_LEVEL=DEBUG`) and recreate + restart the unhealthy container * Check `data/config/custom.env` has system-local keys (ex: `STREAMGL_SECRET_KEY`) with fallback to `.env` @@ -77,7 +77,7 @@ f0bc21b5bda2 compose_redis_1 0.05% 6.781MiB / 31.27GiB * If points still do not load, or appear and freeze, likely issues with GPU init (driver) or websocket (firewall) * Can also be because preloaded datasets are unavailable: not provided, or externally mounted data sources * In this case, use ETL test, and ensure clustering runs for a few seconds (vs. just initial pageload) -* Check `docker-compose logs -f -t --tail=1` and `docker ps` in case config or GPU driver issues, especially for GPU services listed above +* Check `docker compose logs -f -t --tail=1` and `docker ps` in case config or GPU driver issues, especially for GPU services listed above * Upon failures, see below section on GPU testing ## 4a. 
Test 1.0 API uploads, Jupyter, and the PyGraphistry client API @@ -87,7 +87,7 @@ Do via notebook if possible, else `curl` * Get a 1.0 API key by logging into your user's dashboard, or generating a new one using host access: ``` -docker-compose exec central curl -s http://localhost:10000/api/internal/provision?text=MYUSERNAME +docker compose exec central curl -s http://localhost:10000/api/internal/provision?text=MYUSERNAME ``` * Install PyGraphistry and check recent version number (Latest: https://pypi.org/project/graphistry/), or use the provided `/notebook` install: @@ -152,7 +152,7 @@ If you cannot do **3a**, test from the host via `curl` or `wget`: Login and get the API key from your dashboard homepage, or run the following: ``` -docker-compose exec central curl -s http://localhost:10000/api/internal/provision?text=MYUSERNAME +docker compose exec central curl -s http://localhost:10000/api/internal/provision?text=MYUSERNAME ``` * Upload your 1.0 API data using the key @@ -190,8 +190,8 @@ Nodes: x y * Set each config in one go so you can test more quickly, vs start/stop. 
* Run ``` -docker-compose stop -docker-compose up +docker compose stop +docker compose up ``` @@ -203,7 +203,7 @@ docker-compose up #### 5c.ii Connector - Splunk * Edit `data/custom/custom.env` for `SPLUNK_HOST`, `SPLUNK_PORT`, `SPLUNK_USER`, `SPLUNK_KEY` -* Restart the `/pivot` service: `docker-compose restart pivot` +* Restart the `/pivot` service: `docker compose restart pivot` * In `/pivot/connectors`, the `Live Connectors` should list `Splunk`, and clicking `Status` will test logging in * In `Investigations and Templates`, create a new investigation: * Create and run one pivot: @@ -244,18 +244,18 @@ Cloud: * In Route53/DNS: Assign a domain to your IP, ex: `mytest.graphistry.com` * Modify `data/config/Caddyfile` to use your domain name * Unlikely: If needed, run `DOMAIN=my.site.com ./scripts/letsencrypt.sh` and `./gen_dhparam.sh` - * Restart `docker-compose restart caddy`, check pages load + * Restart `docker compose restart caddy`, check pages load * Try a notebook upload with `graphistry.register(...., protocol='https')` ## 7. 
Quick Testing and Test GPU

Most of the below tests can be automatically run by `cd etc/scripts && ./test-gpu.sh`:

* Checks `nvidia-smi` works in your OS
- * Checks `nvidia-smi` works in Docker, including runtime defaults used by `docker-compose`
+ * Checks `nvidia-smi` works in Docker, including runtime defaults used by `docker compose`
* Checks Nvidia RAPIDS can successfully create CUDA contexts and run a simple on-GPU compute and I/O task of `1 + 1 == 2`

`docker ps` reports no "unhealthy", "restarting", or prolonged "starting" services:
- * check `docker-compose logs`, `docker-compose logs `, `docker-compose logs -f -t --tail=100 `
+ * check `docker compose logs`, `docker compose logs `, `docker compose logs -f -t --tail=100 `
* unhealthy `streamgl-gpu`, `forge-etl-python` on start: likely GPU driver issue
* GPU is not the default runtime in `/etc/docker/daemon.json` (`docker info | grep Default`)
* `OpenCL` Initialization error: GPU drivers insufficiently set up
@@ -265,21 +265,21 @@ Most of the below tests can be automatically run by `cd etc/scripts && ./test-gp
* unhealthy `nginx`, `nexus`, `caddy`:
* likely config file issue, unable to start due to other upstream services, or public ports are already taken
-* If a GPU service is unhealthy, the typical cause is an unhealthy Nvidia host or Nvidia container environment setup. Pinpoint the misconfiguration through the following progression, or run as part of `etc/scripts/test-gpu.sh` (Graphistry 2.33+). For on-prem users, your `container.tar` load will import Nvidia's official `nvidia/cuda:11.5.0-base-ubuntu20.04` container used by Graphistry your version, which can aid pinpointing ecosystem issues outside of Graphistry (v2.33.20+).
+* If a GPU service is unhealthy, the typical cause is an unhealthy Nvidia host or Nvidia container environment setup. Pinpoint the misconfiguration through the following progression, or run as part of `etc/scripts/test-gpu.sh` (Graphistry 2.33+).
For on-prem users, your `container.tar` load will import the official RAPIDS `docker.io/rapidsai/base:24.04-cuda11.8-py3.10` base image used by your Graphistry version, which can aid pinpointing ecosystem issues outside of Graphistry (v2.33.20+).
* `docker run hello-world` reports a message <-- tests CPU Docker installation
* `nvidia-smi` reports available GPUs <-- tests host has a GPU configured with expected GPU driver version number
- * `docker run --gpus=all nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi` reports available GPUs <-- tests nvidia-docker installation
- * `docker run --runtime=nvidia nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi` reports available GPUs <-- tests nvidia-docker installation
- * `docker run --rm nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi` reports available GPUs <-- tests Docker GPU defaults (used by docker-compose via `/etc/docker/daemon.json`)
- * ``docker run --rm graphistry/graphistry-forge-base:`cat VERSION`-11.5 nvidia-smi``
+ * `docker run --gpus=all docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi` reports available GPUs <-- tests nvidia-docker installation
+ * `docker run --runtime=nvidia docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi` reports available GPUs <-- tests nvidia-docker installation
+ * `docker run --rm docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi` reports available GPUs <-- tests Docker GPU defaults (used by `docker compose` via `/etc/docker/daemon.json`)
+ * ``docker run --rm graphistry/graphistry-forge-base:`cat VERSION`-11.8 nvidia-smi``
Reports available GPUs (public base image) <- tests Graphistry container CUDA versions are compatible with host versions
- * ``docker run --rm graphistry/etl-server-python:`cat VERSION`-11.5 nvidia-smi``
+ * ``docker run --rm graphistry/etl-server-python:`cat VERSION`-11.8 nvidia-smi``
Reports available GPUs (application image)
* Repeat the docker tests, but with `cudf` execution.
  Ex:
- ``docker run --rm -it --entrypoint=/bin/bash graphistry/etl-server-python:`cat VERSION`-11.5 -c "source activate rapids && python3 -c \"import cudf; print(cudf.DataFrame({'x': [0,1,2]})['x'].sum())\""``
+ ``docker run --rm -it --entrypoint=/bin/bash graphistry/etl-server-python:`cat VERSION`-11.8 -c "source activate base && python3 -c \"import cudf; print(cudf.DataFrame({'x': [0,1,2]})['x'].sum())\""``
  Tests Nvidia RAPIDS (VERSION is your Graphistry version)
* `docker run graphistry/cljs:1.1 npm test` reports success <-- tests driver versioning, may be a faulty test however
- * If running in a hypervisor, ensure `RMM_ALLOCATOR=default` in `data/config/custom.env`, and check the startup logs of `docker-compose logs -f -t --tail=1000 forge-etl-python` that `cudf` / `cupy` are respecting that setting (`LOG_LEVEL=INFO`)
+ * If running in a hypervisor, ensure `RMM_ALLOCATOR=default` in `data/config/custom.env`, and check the startup logs via `docker compose logs -f -t --tail=1000 forge-etl-python` to confirm that `cudf` / `cupy` respect that setting (`LOG_LEVEL=INFO`)
* Health checks
  * CLI: Check `docker ps` for per-service status, may take 1-2min for services to connect and warm up
  * Per-service checks run every ~30s after a ~1min initialization delay, with several retries before capped restart
diff --git a/docs/security/configure-security.md b/docs/security/configure-security.md
index fc0b6fa..a1e9389 100644
--- a/docs/security/configure-security.md
+++ b/docs/security/configure-security.md
@@ -59,7 +59,7 @@ JWT_EXPIRATION_DELTA=3600
SESSION_COOKIE_AGE=1209600
```

-Upon changing, restart the web server with the fresh environment: `docker-compose up -d --force-recreate --no-deps nexus`
+Upon changing, restart the web server with the fresh environment: `docker compose up -d --force-recreate --no-deps nexus`

## Recommended network config: TLS, IPs, Ports

diff --git a/docs/security/configure-tls-caddy-manual-lets-encrypt-handshake.md b/docs/security/configure-tls-caddy-manual-lets-encrypt-handshake.md
index 6c75a5a..63c5f4f 100644
--- a/docs/security/configure-tls-caddy-manual-lets-encrypt-handshake.md
+++ b/docs/security/configure-tls-caddy-manual-lets-encrypt-handshake.md
@@ -57,10 +57,10 @@ Optionally, there are additional [Caddyfile http/https header settings](https://

From `${GRAPHISTRY_HOME}`, run:

-* `docker-compose up -d --force-recreate --no-deps caddy`
-* Watch logs with `docker-compose logs -f -t --tail=1 caddy`
+* `docker compose up -d --force-recreate --no-deps caddy`
+* Watch logs with `docker compose logs -f -t --tail=1 caddy`

### 5. Renewing certs

* follow steps 1-4 above
-* `docker-compose stop caddy` & remove caddy generate files in data/caddy/{data,config}/*
-* `docker-compose up -d --force-recreate --no-deps caddy`
+* `docker compose stop caddy` & remove the generated Caddy files in data/caddy/{data,config}/*
+* `docker compose up -d --force-recreate --no-deps caddy`
diff --git a/docs/tools/bridge.md b/docs/tools/bridge.md
index 951885b..425bca3 100644
--- a/docs/tools/bridge.md
+++ b/docs/tools/bridge.md
@@ -8,7 +8,7 @@ Graphistry supports bridged connectors, which eases tasks like crossing from a c

* Data bridge docker container. You can find `bridge.tar.gz` in your distribution's [release bundle](https://graphistry.zendesk.com/hc/en-us/articles/360033184174-Enterprise-Releases) and, for managed Graphistry users, by logging into the instance and scp'ing `/home/ubuntu/graphistry/bridge.tar.gz`.
* Server to use as a bridge (typically on-prem), with admin access - CPU-only OK
-- Linux:`docker` and `docker-compose`
+- Linux: `docker` and `docker compose`
* Firewall permissions between DB <> bridge and bridge <> Graphistry

## Updates

@@ -37,7 +37,7 @@ Starting with `2.23.0`, you can use old bridge server versions with new Graphist

### Generate a key

* Can be any string
-* Ex: Unguessable strings via `docker-compose exec pivot /bin/bash -c "../../../node_modules/uuid/bin/uuid v4"` => ``
+* Ex: Unguessable strings via `docker compose exec pivot /bin/bash -c "../../../node_modules/uuid/bin/uuid v4"` => ``

### Graphistry GPU application server

@@ -56,7 +56,7 @@ SPLUNK_SERVER_KEY=my_key_1
SPLUNK_PROXY_KEY=my_key_2
```

-* Launch: Restart Graphistry via `docker-compose stop pivot && docker-compose up -d pivot`.
+* Launch: Restart Graphistry via `docker compose stop pivot && docker compose up -d pivot`.

The connector will start looking for the data bridge.

@@ -94,7 +94,7 @@ Edit `.env`. See above example.

3. Launch

```
-docker-compose up -d
+docker compose up -d
```

The data bridge will autoconnect to your Graphistry application server.
@@ -119,7 +119,7 @@ The bridge is a standard minimal Docker container (alpine):

* Login as root (user 0): `docker exec -it -u 0 `
* Install packages as root: `apk add curl`
-* Watch logs via `docker-compose logs -f -t --tail=1`
+* Watch logs via `docker compose logs -f -t --tail=1`

## Debugging

@@ -132,7 +132,7 @@ DEBUG=*
GRAPHISTRY_LOG_LEVEL=TRACE
```

-Watch your bridge's logs and your app server's logs: `docker-compose logs -f -t --tail=1`
+Watch your bridge's logs and your app server's logs: `docker compose logs -f -t --tail=1`

### Diagnose

diff --git a/docs/tools/nvidia-docker-in-docker.md b/docs/tools/nvidia-docker-in-docker.md
index 3bc0202..068feac 100644
--- a/docs/tools/nvidia-docker-in-docker.md
+++ b/docs/tools/nvidia-docker-in-docker.md
@@ -9,11 +9,11 @@ nvidia-docker run --net=host -it -v /var/run/docker.sock:/var/run/docker.sock gc

# https://github.com/NVIDIA/nvidia-docker/issues/380
# curl the docker cli REST api before you name the image and somehow docker will launch nvidia docker containers just fine
-docker run -ti --rm `curl -s http://localhost:3476/docker/cli` nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi
+docker run -ti --rm `curl -s http://localhost:3476/docker/cli` docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi

# https://stackoverflow.com/questions/22944631/how-to-get-the-ip-address-of-the-docker-host-from-inside-a-docker-container
export HOST_MACHINE_ADDRESS=$(/sbin/ip route|awk '/default/ { print $3 }')
-docker run -ti --rm `curl -s http://$HOST_MACHINE_ADDRESS:3476/docker/cli` nvidia/cuda:11.5.0-base-ubuntu20.04 nvidia-smi
+docker run -ti --rm `curl -s http://$HOST_MACHINE_ADDRESS:3476/docker/cli` docker.io/rapidsai/base:24.04-cuda11.8-py3.10 nvidia-smi
```
\ No newline at end of file
diff --git a/docs/tools/update-backup-migrate.md b/docs/tools/update-backup-migrate.md
index dea59c7..93340f3 100644
--- a/docs/tools/update-backup-migrate.md
+++ b/docs/tools/update-backup-migrate.md
@@ -36,7 +36,7 @@ volumes:

Launch under a unique name using `-p`:

```
-docker-compose -p my_unique_name up -d
+docker compose -p my_unique_name up -d
```

## The config and data files
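The bulk of this diff mechanically rewrites legacy `docker-compose` (Compose v1) invocations to the Docker CLI plugin form `docker compose`, while leaving file names like `docker-compose.yml` untouched. A minimal sketch of that rewrite rule as a hypothetical helper (the function name and regex are this sketch's own, not part of the patch):

```python
import re

def modernize(cmd: str) -> str:
    """Rewrite legacy `docker-compose` CLI calls to the v2 plugin form.

    The negative lookahead skips occurrences followed by a word char, dot,
    or dash, so file references such as `docker-compose.yml` are preserved.
    """
    return re.sub(r'docker-compose(?![\w.-])', 'docker compose', cmd)
```

For example, `modernize('docker-compose -f docker-compose.yml stop pivot')` rewrites only the leading command, yielding `docker compose -f docker-compose.yml stop pivot`; this matches the changelog's "most references" caveat, since config-file names must stay as-is.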