Recommend `DiracFile` over `MassStorageFile` #162

betatim · 2016-01-05T12:46:20Z

Related to a discussion on failed downloads of large files from the grid when using the MassStorageFile setup from 11-eos-storage.md.

For a MassStorageFile, Ganga downloads the file to the local machine and then copies it to EOS. This is quite fragile, especially as the error messages you get don't really help in diagnosing what went wrong.

We should switch to recommending DiracFiles instead. These can be created directly by the worker node, without a round trip through the machine where Ganga is running.


alexpearce · 2016-01-05T12:56:44Z

I think Chris Jones's response was pretty good:

People need to get over this idea... As far as the CERN site is concerned the 'grid' is EOS, so having your data available on the grid, with a replica at CERN is identical to having it on EOS.

I would suggest you update the starter kit to use Ulrik's suggestion, and just save the file as a DiracFile. If for some reason (and actually I think this is also redundant, see next) people want a replica at CERN (aka on EOS) then either replicate your data to the CERN site after the fact, or require this as part of your job description to start with.

Lastly, people should also remember it's just as easy to run over your data, interactively, with your data on any site using the XRootD protocol, which is available for all sites, and which you can get for any LFN using the Dirac dirac-dms-lfn-accessURL method. So really it's not actually necessary at all to demand your data is all at CERN on EOS to start with...

I think there are two main reasons why people want their files on EOS:

1. You can organise everything into a nice directory structure and the paths are (somewhat) memorable;
2. It would make sense if EOS was faster than accessing a file in Spain when you're at CERN.

For 1, I think as long as there's an obvious way to get the list of file paths you need, it's not too bad. I think you'd need a little loop in Ganga for this.

For 2, I don't know if having the files at CERN actually is any faster; maybe the internet connection to the grid sites is so 🔥 fast that there's some other bottleneck. If people want, they could replicate the file to CERN anyway.

And: if you can easily get a list of files, you could also xrdcp everything from whatever grid site to EOS without too much pain (though you could just ask for replication, as that's what you'd be doing).
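A minimal sketch of that copy step, assuming you already have a list of XRootD source URLs. The function name, the destination directory and the index prefix on the file names are all hypothetical; the only real tool involved is `xrdcp <source> <destination>`. The sketch only builds the commands and does not run them.

```python
import posixpath


def build_xrdcp_commands(source_urls, dest_prefix):
    """Return one xrdcp argument list per source URL.

    dest_prefix is a hypothetical XRootD directory URL (e.g. on EOS).
    Each file name is prefixed with its position in the list, because
    subjob outputs usually share the same name and would otherwise
    overwrite each other.
    """
    commands = []
    for index, url in enumerate(source_urls):
        # Drop any query string (e.g. "?svcClass=...") before taking the name
        path = url.split("?", 1)[0]
        filename = posixpath.basename(path)
        dest = "{}/{}_{}".format(dest_prefix.rstrip("/"), index, filename)
        commands.append(["xrdcp", url, dest])
    return commands


if __name__ == "__main__":
    # Hypothetical source and destination, for illustration only
    urls = ["root://some.grid.site//lhcb/user/a/another/DVntuple.root"]
    dest = "root://eoslhcb.cern.ch//eos/lhcb/user/a/another/ntuples"
    for cmd in build_xrdcp_commands(urls, dest):
        print(" ".join(cmd))
```

Each returned list could be passed to `subprocess.check_call` once you have a valid grid proxy, though asking Dirac for a replica at CERN achieves the same thing with less bookkeeping.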

I dunno. I think the most important thing is that you want the most automation possible; you don't want to be handing out scripts or functions for .ganga.py. Maybe we just need to change people's opinions that you need the file ‘locally’.

betatim · 2016-01-05T12:59:15Z

For MassStorageFiles the path to the output is easy to predict, and the files can be accessed either via xrootd or the eosmount eos trick.

DiracFiles seem to be stored at a location like ~/eos/lhcb/grid/user/lhcb/user/a/another, with subdirectories for each Ganga job and subjob ID. Listing a random directory there:

~/eos/lhcb/grid/user/lhcb/user/t/thead/452.7
$ ls -R
.:
2014_05

./2014_05:
76328

./2014_05/76328:
76328968

./2014_05/76328/76328968:
HLT.xdst

Do we know what those numbers correspond to? You can look them up and generate a file list, but I fear the self-made paths from MassStorageFile are still nicer to use.

alexpearce · 2016-01-05T13:05:59Z

You can look them up and generate a file list, but I fear the self-made paths from MassStorageFile are still nicer to use.

For sure 😞

I think the number is the ID of the grid job, which should be unique across All Grid Jobs Ever. It might be stored in the job's backend object.
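To make the layout concrete, here is a small sketch that splits such a path into its parts. It assumes the pattern seen in the two example paths in this thread: a `YYYY_MM` directory, then the job number without its last three digits, then the full job number, then the file. That pattern is inferred from those two examples only, and whether the number really is the grid job ID is still a guess. The function name is made up for illustration.

```python
import re

# Layout assumed from the listings in this thread:
#   .../<YYYY_MM>/<job number // 1000>/<job number>/<file name>
_OUTPUT_TAIL = re.compile(r"/(\d{4})_(\d{2})/(\d+)/(\d+)/([^/]+)$")


def parse_output_path(path):
    """Split a DiracFile output path into date, job number and file name.

    Returns None if the path does not follow the assumed layout.
    """
    match = _OUTPUT_TAIL.search(path)
    if match is None:
        return None
    year, month, bucket, job_number, filename = match.groups()
    # In both examples the intermediate directory equals job_number // 1000
    if int(bucket) != int(job_number) // 1000:
        return None
    return {
        "year": int(year),
        "month": int(month),
        "job_number": int(job_number),
        "filename": filename,
    }


if __name__ == "__main__":
    print(parse_output_path("/lhcb/user/t/thead/452.7/2014_05/76328/76328968/HLT.xdst"))
```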

If you have a job and want the list of LFNs, this should do it:

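# Run inside a Ganga session, where `jobs` and `DiracFile` are predefined
job = jobs(job_id)  # job_id: the integer ID of the Ganga job
for sj in job.subjobs:
    # Select only the DiracFile outputs of each subjob
    for df in sj.outputfiles.get(DiracFile):
        print(df.lfn)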

If the file is replicated at CERN-USER, the LFN can be mapped to an XRootD path I think (I don't have an example job to play around with).
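A sketch of what that mapping might look like, assuming the EOS path is simply the LFN appended to `/eos/lhcb/grid/user` (inferred from the `~/eos/lhcb/grid/user/lhcb/user/...` listing in this thread) and that the host is `eoslhcb.cern.ch`. Both are assumptions, the function name is hypothetical, and the result is only meaningful if a replica actually exists at CERN-USER.

```python
EOS_HOST = "eoslhcb.cern.ch"             # assumed XRootD host for LHCb EOS
EOS_GRID_PREFIX = "/eos/lhcb/grid/user"  # assumed from the listing in this thread


def lfn_to_eos_url(lfn):
    """Build an XRootD URL on EOS for an LFN with a replica at CERN-USER."""
    if not lfn.startswith("/"):
        raise ValueError("expected an absolute LFN, got {!r}".format(lfn))
    # The extra "/" after the host gives the "//eos/..." form used by XRootD
    return "root://{}/{}{}".format(EOS_HOST, EOS_GRID_PREFIX, lfn)


if __name__ == "__main__":
    print(lfn_to_eos_url("/lhcb/user/t/thead/452.7/2014_05/76328/76328968/HLT.xdst"))
```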

saschastahl · 2016-01-05T13:11:50Z

Yeah, for me it basically boils down to the fact that I want to be able to look into my ntuple within seconds. And I really cannot be bothered to find out some obscure LFNs every time :(.
Though I have never used MassStorageFile, as I was aware of the several fragile steps it includes, and I always copy to EOS by hand.

alexpearce · 2016-01-05T13:16:18Z

Yeah, people want to be able to do root <file>; new TBrowser.

With a file on the CERN grid site/EOS you can do

$ root
root [0] TFile *f = TFile::Open("root://eoslhcb.cern.ch//eos/…")
root [1] TBrowser tb

which isn't awful.

saschastahl · 2016-01-05T13:18:50Z

No, it is ok. But you have to remember the path :-).

egede · 2016-01-05T15:30:22Z

Two comments.

$ root
root [0] TFile *f = TFile::Open("root://eoslhcb.cern.ch//eos/…")
root [1] TBrowser tb

The above syntax also works if the file is not at CERN. The name to use can be obtained either from the dirac-dms-lfn-accessURL command on the command line or from the accessURL method on a DiracFile object in Ganga.

The second comment is that there is a long standing request to allow the user to decide on the directory structure of DiracFile objects. It is pending a change on the Dirac side as far as I understand.

betatim · 2016-02-16T08:21:32Z

Does someone know the state of the ganga issue related to this?

egede · 2016-02-16T09:32:31Z

You mean the user side decision of directory that Dirac stores the file in? There is a missing feature in the LHCbDirac API. Before that is available there is nothing that can be done from the Ganga side.

betatim · 2016-02-16T09:57:16Z

Yup. Do the Dirac guys use GitHub to track the progress on this, or is there an issue in the Ganga repository we can track to keep informed?

egede · 2016-02-16T12:26:03Z

Tracing this further, the ball is in the Ganga camp now (where it has been forgotten). I have created a new issue on GitHub to follow this, ganga-devs/ganga#201.

alexpearce · 2016-10-13T09:26:03Z

There is now a method on DiracFile for getting the full URL to a file, no matter what Grid site it's on:

Ganga In [7]: df.accessURL?
Type:       function
String Form:<function accessURL at 0x7f68f7183320>
File:       /afs/cern.ch/lhcb/software/releases/GANGA/GANGA_v602r2/install/ganga/python/Ganga/GPIDev/Base/Proxy.py
Definition: df.accessURL(*args, **kwargs)
Docstring:
Attempt to find an accessURL which corresponds to the specified SE. If no SE is specified then
return a random one from all the replicas.

For example:

Ganga In [8]: df.accessURL()
['root://clhcbdlf.ads.rl.ac.uk//castor/ads.rl.ac.uk/prod/lhcb/user/a/apearce/2016_09/139512/139512234/DVntuple.root?svcClass=lhcbUser']
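As a sketch of how this could feed a file list: the helper below takes any iterable of objects with an `accessURL()` method that returns a list of URLs (as in the output above) and keeps the first URL for each. The helper name and the `FakeDiracFile` stand-in are hypothetical; the stand-in exists only so the example runs outside Ganga.

```python
def collect_access_urls(dirac_files):
    """Return one access URL per file, skipping files with no URL."""
    urls = []
    for df in dirac_files:
        replicas = df.accessURL()  # a list of URLs, as in the output above
        if replicas:
            urls.append(replicas[0])
    return urls


class FakeDiracFile(object):
    """Stand-in for Ganga's DiracFile, for running outside Ganga."""

    def __init__(self, urls):
        self._urls = urls

    def accessURL(self):
        return list(self._urls)


if __name__ == "__main__":
    files = [
        FakeDiracFile(["root://site-a.example//lhcb/user/a/another/DVntuple.root"]),
        FakeDiracFile([]),  # a file for which no URL was found
    ]
    for url in collect_access_urls(files):
        print(url)
```

Inside Ganga, the input could come from the same loop used for LFNs earlier in this thread, for example `collect_access_urls(df for sj in job.subjobs for df in sj.outputfiles.get(DiracFile))`.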

I think we should advise people to use this, rather than copying everything to 'CERN-USER' and only using eoslhcb.cern.ch URLs. Does that make sense @egede?

egede · 2016-10-13T09:30:04Z

@alexpearce Yes, I think that is a good idea - at least if performance is not harmed. The most important thing is to get rid of the recommendation to use MassStorageFile.

alexpearce · 2016-10-13T09:36:49Z

OK, thanks. We'll push to get this done before the next workshop.

egede · 2016-10-13T09:38:17Z

Should some comment be made that this in effect makes the analysis chain less "CERN-centric"? If you no longer copy files to CERN, there is nothing special about lxplus.

alexpearce · 2016-10-13T09:40:41Z

Indeed. We should keep in mind that AFS will soon be 💀. I don't know if there's a known replacement for the interactive environment, so if users are already able to do things without lxplus (on their local cluster, or on their laptops) that will make the transition easier.

saschastahl · 2016-10-24T11:00:48Z

I have been playing around with this workflow in the last few days and it is sometimes a bit tricky. One problem I encountered is that you have to have a valid grid proxy to use these files. This makes it complicated to use on your own PC or in a batch job.

egede · 2016-10-24T22:21:37Z

@saschastahl Your comment that it takes a bit more effort to get read access to these files is a valid one. However, you can obtain a long life proxy that should more or less get rid of that. Placing the files on EOS will not make your life easier unless you do subsequent analysis inside CERN which I thought we were in general discouraging.

alexpearce · 2016-10-25T07:20:17Z

We do indeed want to discourage location-specific analysis. People should be able to do their work on their own machines wherever they are.

Do you know how to generate this "long life proxy" @egede? That might make things slightly easier. Although there are still hoops to jump through when running on the batch system or on your local machine (that doesn't have the usual Grid machinery on it).

Perhaps we could also provide instructions on how to access these non-CERN Grid files 'locally'? (I could access my Grid files at some sites without a proxy, but not at others. It seems the access policy isn't uniform, so we need a solution that works everywhere.)

saschastahl · 2016-10-25T07:28:26Z

Yes, I was specifically referring to jobs on a batch system. It involved several steps to transfer the grid proxy to my jobs. I can provide the instructions I found on a TWiki page, but it is a bit cumbersome.

egede · 2016-10-25T08:15:48Z

With Ganga 6.3, each object that needs a GridProxy will contain information about it. This does not solve the problem in itself, but it paves the way for a job sent to Batch to forward the proxy automatically (and to fail to submit if no valid proxy is available).

renaudin · 2016-11-29T17:02:52Z

To update the lesson, should we delete the MassStorageFile setup and usage completely, or just leave a side note on it?

cmarinbe · 2016-11-29T23:12:47Z

This should have been solved by #209. Any comments?

alexpearce added the fix-before-november-2016 label Oct 13, 2016
