INSTANCES: versions, REST hosts, clusters, DataBases
- CRAB is a DataBase-centric service: its various processes communicate via an Oracle DB and a File Cache service.
- We call SERVICE_INSTANCE one "CRAB service", i.e. a set of REST, DB and FileCache.
- The REST server offers access to the Oracle DB and points to the FileCache server in use for this SERVICE_INSTANCE.
- Multiple REST servers run behind a CMSWEB FrontEnd to provide load balancing and high availability behind a common DNS alias, which indicates one specific CMSWEB cluster.
- CRABClient, TaskWorker and Publisher are clients of this SERVICE_INSTANCE. They must all use the same instance in order to have TaskWorker and Publisher process the tasks submitted by that client. The relevant configuration parameters are:
Service | configuration parameter |
---|---|
CRABClient | config.General.instance |
TaskWorker | config.TaskWorker.instance |
Publisher | config.General.instance |
- CRABClient, TaskWorker and Publisher let you select the REST endpoint used to access the CRAB DataBase by choosing one of a small set of predefined SERVICE_INSTANCES, defined in CRABServer/src/python/ServerUtilities.py:
instance | REST host | DB instance |
---|---|---|
prod | cmsweb.cern.ch | prod |
preprod | cmsweb-testbed.cern.ch | preprod |
dev | cmsweb-test2.cern.ch | dev |
other | none | none |
When SERVICE_INSTANCE is set to other, the configuration must provide a pair of strings indicating the REST host FQDN and the DB instance [prod|preprod|dev]. This allows full flexibility in connecting pieces and moving server instances around. The parameters to use when picking instance='other' are:
Service | REST host name | database name |
---|---|---|
CRABClient | config.General.restHost | config.General.dbInstance |
TaskWorker | config.TaskWorker.restHost | config.TaskWorker.dbInstance |
Publisher | config.General.restHost | config.General.dbInstance |
NOTE: CRABClient users do not need to specify the CRAB service instance; if omitted, it defaults to "prod".
- instance is instead a mandatory parameter for TaskWorker and Publisher.
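The lookup described above can be sketched in a few lines of Python. This is only an illustration: the dictionary mirrors the table of predefined instances, while the function name `resolve_service_instance` and its argument names are hypothetical and are not taken from ServerUtilities.py.

```python
# Illustrative sketch only: names here are hypothetical, not the actual CRAB code.
# The values mirror the table of predefined SERVICE_INSTANCES.
SERVICE_INSTANCES = {
    'prod': {'restHost': 'cmsweb.cern.ch', 'dbInstance': 'prod'},
    'preprod': {'restHost': 'cmsweb-testbed.cern.ch', 'dbInstance': 'preprod'},
    'dev': {'restHost': 'cmsweb-test2.cern.ch', 'dbInstance': 'dev'},
    'other': {'restHost': None, 'dbInstance': None},
}
VALID_DB_INSTANCES = ('prod', 'preprod', 'dev')


def resolve_service_instance(instance='prod', rest_host=None, db_instance=None):
    """Return the (restHost, dbInstance) pair for an instance setting.

    The default of 'prod' reflects the CRABClient behaviour; TaskWorker and
    Publisher would always pass an explicit value.
    """
    if instance not in SERVICE_INSTANCES:
        raise ValueError("unknown instance %r" % instance)
    if instance != 'other':
        entry = SERVICE_INSTANCES[instance]
        return entry['restHost'], entry['dbInstance']
    # instance == 'other': both values must come from the configuration
    if not rest_host:
        raise ValueError("instance='other' requires a REST host name")
    if db_instance not in VALID_DB_INSTANCES:
        raise ValueError("instance='other' requires dbInstance in %s"
                         % (VALID_DB_INSTANCES,))
    return rest_host, db_instance
```

For example, `resolve_service_instance('other', 'stefanovm.cern.ch', 'dev')` returns the pair given in the configuration unchanged, while `resolve_service_instance('preprod')` ignores any extra arguments and uses the predefined pair.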
- CRABServer service is a REST interface to CRAB Oracle Data Base
- A given REST service is usually a DNS alias for a set of actual hosts which implement load balancing and high availability. External users only see the DNS alias, which will be called restHost.
- CRABServer runs inside CMSWEB framework, so it is part of a given CMSWEB cluster
- Numerous CMSWEB clusters exist
- cmsweb.cern.ch aka main production one
- cmsweb-testbed.cern.ch aka testbed
- cmsweb-k8s-testbed.cern.ch supposedly identical to cmsweb-testbed
- cmsweb-test.cern.ch K8s development (Valentin's playground)
- cmsweb-test[1-6].cern.ch test (developers') clusters for application developers
- cmsweb-test2.cern.ch is reserved for CRAB usage
- private VM's like stefanovm.cern.ch or stefanovm2
- One CMSWEB cluster is the interface to a particular service instance of CRAB, i.e. a full set of services which make it possible to submit, track, execute and bookkeep one CRAB task.
- The REST server contains and serves information about itself, the CRAB File Cache server, the HTCondor pool to use for submissions, who is allowed to use the credentials uploaded by users to myproxy, etc. Much of this information can change frequently and is therefore stored in a remote, web-accessible file divided into sections, one for each cluster.
- Oracle Data Base has several instances, meaning "different data bases"
- Production on the CMS Production Oracle cluster cmsr
- Preprod on devdb11, username: cmsweb_analysis_preprod
- Dev on devdb11, username: cmsweb_analysis_dev
- private, like Stefano's or Diego's private DB's on devdb11
- note that Oracle DBA's usually refer to cmsr or devdb11 as instances (again)
- CRABClient allows submitting to a given 'CRAB instance', which means a given Data Base instance: global (i.e. production), preprod, dev, etc.
- CRAB was developed at a time when it was easy to get multiple DB instances, but almost inconceivable to have more than two cmsweb clusters (cmsweb.cern.ch and cmsweb-testbed.cern.ch), therefore
- one CRABServer REST instance is capable of connecting to multiple DataBases, i.e. it supports multiple DB instances
- So the DataBase instance (prod/preprod/dev) could not be part of the CRABServer REST configuration. Instead it is something that the client (clients of the CRABServer REST are CRABClient and CRABTaskWorker) indicates in the URL (API) used, which is constructed as hostname/crabserver/dbinstance/API
- e.g. both these URL's work: https://cmsweb.cern.ch/crabserver/prod/info and https://cmsweb.cern.ch/crabserver/preprod/info
- in the initial design the view was: the CRABClient (i.e. whoever submits) is only interested in deciding whether to submit to the production or preproduction DataBase (or some private test instance), so the CRABClient configuration file accepts the parameter config.General.instance and "CRAB" would figure out everything
- in the migration to K8s we have several cmsweb clusters, i.e. multiple REST instances which may e.g. all connect to the same DB instance, and we want to be able to connect explicitly to one or another of these clusters in order to test specific REST instances.
CRABClient and TaskWorker communicate via the Oracle CRAB DataBase (so they need a REST hostname and a DB instance name), but the CRABClient also needs to upload a sandbox to be used in job submission. For this it uses a dedicated CRABCache service, which also has a REST interface. The CRABCache URL to be used is obtained by querying the CRABServer REST. In other words, each CRABServer REST instance knows which CRABCache service should be used and communicates this to both CRABClient and TaskWorker via the query
https://<restHost>/crabserver/<dbInstance>/info?subresource=backendurls
e.g.
https://cmsweb.cern.ch/crabserver/prod/info?subresource=backendurls
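As a minimal sketch of the URL scheme described above (hostname/crabserver/dbinstance/API plus an optional query string), the following helper only builds the URL string and performs no HTTP request. The function name `crab_rest_url` is hypothetical and does not come from the CRAB code base.

```python
from urllib.parse import urlencode


def crab_rest_url(rest_host, db_instance, api, **query):
    """Build https://<restHost>/crabserver/<dbInstance>/<API>[?query]."""
    url = "https://%s/crabserver/%s/%s" % (rest_host, db_instance, api)
    if query:
        # keyword arguments become the query string, e.g. subresource=backendurls
        url += "?" + urlencode(query)
    return url


print(crab_rest_url('cmsweb.cern.ch', 'prod', 'info', subresource='backendurls'))
# prints https://cmsweb.cern.ch/crabserver/prod/info?subresource=backendurls
```

Keeping the REST host and the DB instance as two separate arguments matches the point made in this page: the same host can serve several DB instances.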
- each CRABServer host has a configuration file /data/srv/current/config/crabserver/config.py which, among other things, indicates which "service cluster" this crabserver process will be part of (remember, there are multiple processes running on multiple hosts in the same service cluster). This is done via the parameter data.mode, which points to one particular section of the data.extconfigurl file, where information is kept about the CRABCache service to be used, the ASO config, and the HTCondor resources to be used.
- the possible DataBase instances it can connect to are specified via the file /data/srv/current/auth/crabserver/CRABServerAuth.py, which is not part of the CRAB source code in this github repository but in principle is written ad-hoc for every machine where CRABServer is installed (see https://twiki.cern.ch/twiki/bin/view/CMSPublic/CMSCrabRESTInterface#Authentication_with_CERN_Oracle ). E.g. the CRAB REST production instance in cmsweb.cern.ch uses this (passwords have been removed):
import cx_Oracle as DB
import socket
fqdn = socket.getfqdn().lower()
dbconfig = {
    'preprod': {
        '.title': 'Pre-production',
        '.order': 1,
        '*': {
            'clientid': 'cmsweb-preprod@%s' % (fqdn),
            'dsn': 'devdb11',
            'liveness': 'select sysdate from dual',
            'password': '*****',
            'schema': 'cmsweb_analysis_preprod',
            'timeout': 300,
            'trace': True,
            'type': DB,
            'user': 'cmsweb_analysis_preprod',
        },
    },
    'prod': {
        '.title': 'Production',
        '.order': 0,
        'GET': {
            'clientid': 'cmsweb-prod-r@%s' % (fqdn),
            'dsn': 'cmsr',
            'liveness': 'select sysdate from dual',
            'password': '*****',
            'schema': 'cms_analysis_reqmgr_r',
            'timeout': 300,
            'trace': True,
            'type': DB,
            'user': 'cms_analysis_reqmgr_r',
        },
        '*': {
            'clientid': 'cmsweb-prod-w@%s' % (fqdn),
            'dsn': 'cmsr',
            'liveness': 'select sysdate from dual',
            'password': '******',
            'schema': 'cms_analysis_reqmgr_w',
            'timeout': 300,
            'trace': True,
            'type': DB,
            'user': 'cms_analysis_reqmgr_w',
        },
    },
}
Since this CRABServerAuth.py file contains passwords, it is not kept in publicly available repositories.
- the CRABServer REST API machinery detects the Data Base instance from the URL in the HTTP request and selects the appropriate Oracle connection instance.
- the CRABClient configuration file accepts the parameter config.General.instance, which can also be passed as an option on the command line; e.g. crab submit --help lists this option:
--instance=INSTANCE Running instance of CRAB service. Valid values are
['test1', 'test3', 'test2', 'prod', 'preprod', 'test',
'k8s'].
where it is apparent how in January we added some K8s clusters, overloading the parameter "instance" to indicate a particular REST instance instead of the DB instance.
- this was justified since config.General.instance was already used to indicate a particular REST host, in order to support submission to private developer VM's via things like General.instance = 'stefanovm2.cern.ch'
- this requires that there is always a 1:1 mapping between Data Base instance and REST host instance, so that the CRABClient can figure out the two (needed to build the HTTP queries) from a single parameter.
- the code which maps the General.instance parameter into a REST hostname and a DataBase instance is in https://github.com/dmwm/CRABClient/blob/301de634b1fe16bf11696d975133487cd0094d37/src/python/CRABClient/ClientUtilities.py#L195
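On the server side, the way the REST machinery picks an Oracle connection from the request URL, as described above, can be sketched as follows. This is not the actual CRABServer code: the function name `select_db_connection` is hypothetical, the `dbconfig` here is a trimmed copy of the CRABServerAuth.py example, and the rule "use the entry for the HTTP method if present, otherwise fall back to '*'" is an assumption inferred from the layout of that dictionary.

```python
# Trimmed copy of the dbconfig example (passwords and most keys omitted).
dbconfig = {
    'preprod': {
        '*': {'dsn': 'devdb11', 'user': 'cmsweb_analysis_preprod'},
    },
    'prod': {
        'GET': {'dsn': 'cmsr', 'user': 'cms_analysis_reqmgr_r'},
        '*': {'dsn': 'cmsr', 'user': 'cms_analysis_reqmgr_w'},
    },
}


def select_db_connection(config, url_path, http_method):
    """Pick connection parameters from a path like /crabserver/prod/info."""
    parts = [p for p in url_path.split('/') if p]
    if len(parts) < 2 or parts[0] != 'crabserver':
        raise ValueError("not a crabserver URL path: %r" % url_path)
    db_instance = parts[1]
    if db_instance not in config:
        raise KeyError("this REST host has no DB instance %r" % db_instance)
    per_method = config[db_instance]
    # Assumption: a method-specific entry wins, '*' covers everything else.
    return per_method.get(http_method, per_method.get('*'))
```

With this sketch, a GET on /crabserver/prod/info would use the `cms_analysis_reqmgr_r` account, while a PUT on the same path would use `cms_analysis_reqmgr_w`.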
As a user of the CRAB DataBase, each TaskWorker instance needs to identify one REST host to talk to and the DB instance to use.
- there is a set of pre-defined host/instance pairs in the code; each TW instance can pick one of those via the configuration parameter config.TaskWorker.mode in the TaskWorkerConfig.py file. The relevant code is in MasterWorker.py, where the mapping for this configuration parameter is called MODEURL:
MODEURL = {'cmsweb-dev': {'host': 'cmsweb-dev.cern.ch', 'instance': 'dev'},
           'cmsweb-test': {'host': 'cmsweb-test.cern.ch', 'instance': 'preprod'},
           'cmsweb-preprod': {'host': 'cmsweb-testbed.cern.ch', 'instance': 'preprod'},
           'cmsweb-prod': {'host': 'cmsweb.cern.ch', 'instance': 'prod'},
           'test': {'host': None, 'instance': 'preprod'},
           'private': {'host': None, 'instance': 'dev'},
          }
- if mode is set to 'test' or 'private', then the host name for the REST needs to be specified in the TaskWorkerConfig.py configuration file via the (badly named) parameter config.TaskWorker.resturl, e.g.:
config.TaskWorker.resturl = 'stefanovm.cern.ch'
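The TaskWorker behaviour just described can be sketched as below. The MODEURL dictionary is copied from the snippet above; the helper `resolve_taskworker_rest` is a hypothetical name used for illustration and is not the code in MasterWorker.py.

```python
MODEURL = {'cmsweb-dev': {'host': 'cmsweb-dev.cern.ch', 'instance': 'dev'},
           'cmsweb-test': {'host': 'cmsweb-test.cern.ch', 'instance': 'preprod'},
           'cmsweb-preprod': {'host': 'cmsweb-testbed.cern.ch', 'instance': 'preprod'},
           'cmsweb-prod': {'host': 'cmsweb.cern.ch', 'instance': 'prod'},
           'test': {'host': None, 'instance': 'preprod'},
           'private': {'host': None, 'instance': 'dev'},
          }


def resolve_taskworker_rest(mode, resturl=None):
    """Return (REST host, DB instance) for a config.TaskWorker.mode value."""
    if mode not in MODEURL:
        raise ValueError("unknown mode %r" % mode)
    entry = MODEURL[mode]
    host = entry['host']
    if host is None:
        # 'test' and 'private' carry no host: it must come from resturl
        if not resturl:
            raise ValueError("mode %r requires config.TaskWorker.resturl" % mode)
        host = resturl
    return host, entry['instance']
```

Note how, in this scheme, `resolve_taskworker_rest('private', 'stefanovm.cern.ch')` always yields the `dev` DB instance: the host is free but the DB instance is tied to the mode name.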
Modify CRAB Client so that the submitter can select REST host and Data Base instance independently
- be backward compatible with pre-2020 use (it is OK to break compatibility for K8s clusters)
- introduce instance='other' as a switch to allow specifying restHost and dbInstance
- get rid of the old instance='private', which in the end was a confusing way to allow indicating restHost while forcing the data base instance to dev
Should do like for CRABClient, while taking advantage of the fact that here we have freedom with the configuration file.
Keep a smaller set of nicknames (MODEURLs) where both REST host and DB instance are hardcoded.
Also support MODEURL='other', which incorporates the old test/private, in which case both instance and url must be specified:
- take this opportunity to rename config.TaskWorker.resturl to config.TaskWorker.resthost
- introduce config.TaskWorker.dbinstance
This should require changes to: