#Helm Charts for ingest service
This chart can be run with datawave-helm-charts or standalone against an existing Hadoop, ZooKeeper, and Accumulo configuration.
# see https://minikube.sigs.k8s.io/docs/handbook/registry/ for more details
# enable the minikube registry at localhost:5000
minikube addons enable registry
# create a pipe to the registry running inside of minikube
docker run --rm -it --network=host alpine ash -c "apk add socat && socat TCP-LISTEN:5000,reuseaddr,fork TCP:$(minikube ip):5000"
# build the images
mvn clean package -Pdocker
# create the minikube registry tags and push to the registry
mvn clean install -Pdocker -Ddocker.registry="localhost:5000/" -DpushImage -DskipDockerBuild
# use localhost:5000/ as the registry for all ingest services
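The registry value is a prefix placed in front of every image name, so a missing trailing slash produces a broken reference. The sketch below shows how such a reference is assembled and how the registry contents could be listed. The function names `image_ref` and `list_registry_images` and the `latest` tag in the usage example are hypothetical; the catalog endpoint is the standard Docker Registry HTTP API v2 path.

```shell
# Compose a full image reference from a registry prefix, an image name, and a tag.
# The registry prefix normally ends with "/" (for example "localhost:5000/"),
# matching the -Ddocker.registry value used above.
image_ref() (
  registry="$1"; name="$2"; tag="$3"
  # add the trailing slash if the caller left it off; an empty prefix stays empty
  case "$registry" in
    ""|*/) ;;
    *) registry="$registry/" ;;
  esac
  printf '%s%s:%s\n' "$registry" "$name" "$tag"
)

# List the repositories the registry currently holds.
# Only meaningful while the socat pipe to the minikube registry is running.
list_registry_images() {
  curl -s "http://${1:-localhost:5000}/v2/_catalog"
}
```

For example, `image_ref "localhost:5000/" "datawave/config-service" "latest"` prints `localhost:5000/datawave/config-service:latest`.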
- Running k8s cluster
- helm 3+
- configuration service address
- zookeeper address
- rabbitmq
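The tool requirements in this list can be checked before installing anything. This is a minimal sketch, assuming `helm version --short` prints a string such as `v3.12.3+g3a31588`; the function names `helm_major_ok` and `preflight` are hypothetical. It does not check the configuration service, zookeeper, or rabbitmq addresses.

```shell
# Succeed when a "helm version --short" style string reports major version 3 or newer.
helm_major_ok() (
  version="${1#v}"        # drop the leading "v"
  major="${version%%.*}"  # keep everything before the first dot
  case "$major" in
    ''|*[!0-9]*) exit 1 ;;  # empty or not a number
  esac
  [ "$major" -ge 3 ]
)

# Check that kubectl and helm are on the PATH and that helm is recent enough.
# Note: a kubectl shell alias is only visible in shells that define it.
preflight() {
  for tool in kubectl helm; do
    command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool" >&2; return 1; }
  done
  helm_major_ok "$(helm version --short)" || { echo "helm 3+ required" >&2; return 1; }
}
```

Running `preflight` prints nothing and returns success when both tools are present and helm is version 3 or newer.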
##Setup: k8s cluster minikube for single instance
# install minikube
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube
# start minikube
minikube delete --all --purge && \
minikube start --cpus 8 --memory 30960 --disk-size 20480 --insecure-registry="containeryard.evoforge.org" && \
minikube image load busybox:1.28
# update .bashrc for kubectl
alias kubectl="minikube kubectl --"
# test kubectl command
kubectl get pods
Minikube maintains its own internal registry. To push images to it, the minikube docker-env must be exposed in the current shell.
# source the minikube docker-env (this will only work for this terminal)
# all docker commands will then be forwarded to the minikube internals
eval $(minikube -p minikube docker-env)
# build services as usual
mvn clean install -Pdocker
# if using the configuration service also build the datawave/config-service image
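Because docker-env only applies to the current terminal, a build can easily run in a shell that is still talking to the host Docker daemon. The guard below is a sketch that assumes `minikube docker-env` exports `MINIKUBE_ACTIVE_DOCKERD`, which should be verified against the installed minikube version; the function name `using_minikube_docker` is hypothetical.

```shell
# Succeed when this shell has the minikube docker-env applied.
# Assumption: "minikube docker-env" exports MINIKUBE_ACTIVE_DOCKERD.
using_minikube_docker() {
  [ -n "${MINIKUBE_ACTIVE_DOCKERD:-}" ]
}
```

A build script could then start with `using_minikube_docker || eval $(minikube -p minikube docker-env)` before calling `mvn clean install -Pdocker`.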
values.yaml must provide the following configuration. See values.example.yaml for an example.
- default registry name to append to all images
- configuration service address
- zookeeper service address
- accumulo user account secret information
- keystore name, alias, path, and password
- truststore name, path, and password
- pki files (secret recommended)
- hadoop config files
- log directory
- datawave ingest configuration files
- datawave ingest output location
- bundler input location (suggested to match output location)
- configuration service config volume
- messaging service config volume
Each volume must have the following configured:
- the name of the resource
- the internal destination for the resource
- one of secret, configmap, hostPath
- a name or .source.path: when using secret or configmap, a name must be specified that references a valid secret or configmap. When the type is hostPath, a source.path must be specified.
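The volume rules above can be expressed as a small check. This is only a sketch of the stated rule, not the chart's own validation; the function name `validate_volume` and its positional arguments are hypothetical.

```shell
# validate_volume KIND NAME SRC_PATH
# Succeeds when the combination follows the volume rules:
#   secret / configmap -> NAME must be set
#   hostPath           -> SRC_PATH (source.path) must be set
validate_volume() (
  kind="$1"; name="$2"; src_path="$3"
  case "$kind" in
    secret|configmap)
      [ -n "$name" ] || { echo "$kind volume needs a name" >&2; exit 1; } ;;
    hostPath)
      [ -n "$src_path" ] || { echo "hostPath volume needs source.path" >&2; exit 1; } ;;
    *)
      echo "unknown volume type: $kind" >&2; exit 1 ;;
  esac
)
```

For example, `validate_volume hostPath "" /srv/logs` succeeds, while `validate_volume secret "" ""` fails with a message on stderr. The path `/srv/logs` is a placeholder.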
###Ingest has three parts:
- A list which defines the feeds to process. Each feed must have:
  - the name of the feed
  - the config service profiles to apply to this feed, comma delimited
  - the rabbit queue to push files found to
- A list of ingest processing pools. Each pool must have:
  - a name to identify the pool
  - a queue to read feeder messages from (should match a queue specified in at least one feeder)
  - the number of replicas to run for this pool
  - the config service profiles to apply, comma delimited
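A pool whose queue is not written to by any feeder will sit idle, so it can be worth checking the two lists against each other before installing. The sketch below takes the queue names as plain space-separated strings rather than reading values.yaml; the function name `unmatched_pool_queues` and the queue names in the example are hypothetical.

```shell
# unmatched_pool_queues "FEEDER_QUEUES" "POOL_QUEUES"
# Both arguments are space-separated queue names. Prints each pool queue that
# no feeder pushes to, one per line, and fails if any were found.
unmatched_pool_queues() (
  feeders=" $1 "
  status=0
  for q in $2; do
    case "$feeders" in
      *" $q "*) ;;                 # at least one feeder writes to this queue
      *) echo "$q"; status=1 ;;    # no feeder writes to this queue
    esac
  done
  exit "$status"
)
```

For example, `unmatched_pool_queues "live" "live bulk"` prints `bulk` and returns a failure status.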
Ingest must have a secret defined for its accumulo username and password. This may be controlled via
- defines what bundlers are in use, if any
- when enabled, bundlers will be deployed as a daemonset
- A list of bundlers to run in the daemonsets. Each bundler contains:
  - a name
  - the config server profiles to apply to the bundler
See values.example.yaml for a sample configuration. Any configuration service must have the additional configs defined in example-config.
Configuration and messaging services may be deployed as part of this chart, or an external configuration and messaging service may be used.
- is the service enabled
- the service name, should be updated in global.services if being used
- the number of replicas to run the service
- the image repository
- the image pull policy
- the image tag
- the directory to serve config from
- the port to expose for the configuration service
Requires global.volumes.rabbitmq
- the amqp port to expose
- the management port to expose
- section for image references
- registry override for image
- comes after registry in image definition
- the image tag, pulled from the Chart if not specified
- sets the pull policy
###Suggested testing environment minikube_helm
helm install zk zookeeper/
helm install hadoop hadoop/
helm install master ingest/ &
helm install accumulo accumulo/
# update config service to pull in example-config in values.yaml, then
helm install web web/
###Install this helm chart
cd chart
helm dependency update
helm install ingest .