3 days 21 hours ago
A job run time histogram can be found here:
http://mcplots-dev.cern.ch/production.php?view=revision&rev=2743
The data table behind the histogram shows 0 or 1 jobs in each 5-min bin beyond a 4 h run time.
The peak is the 5-min run time bin containing 18369 jobs, and the average run time is 158.5 min.
Based on that, jobs hitting the 10-day limit are really rare.
I must be getting all of them.
4 days 3 hours ago
You have another computer attached to the project that runs Arch Linux [6.8.9-arch1-1|libc 2.39].
That one successfully runs Theory native even with the runc version from grid.cern.ch.
So you can either compare the setups of the two machines to find out what's different, or replace Ubuntu 22.04.4 with Arch Linux.
1 week 3 days ago
All the new Theory tasks are failing immediately on openSUSE Leap 15.5, and presumably on all previous versions as well.
It appears as if this is because the new tasks require a version of glibc which is not yet available in the main repos (ver 2.31 is the current version while the tasks appear to require 2.34).
It looks like this situation will prevail until Leap 15.6 is released in early June; that release should include glibc 2.38 -- at least, that is what I am reading in the 15.6 repos.
Anyone running openSUSE 15.5 or earlier has two options:
1) stop fetching Theory tasks until you have upgraded your system;
2) replace the OS with either the slowroll version (http://download.opensuse.org/slowroll/) or with Tumbleweed.
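A quick way to see whether a given machine is affected is to compare its glibc version with the minimum the tasks seem to need. The sketch below assumes GNU sort (for -V) and a glibc system where getconf GNU_LIBC_VERSION works; the 2.34 default is only the figure inferred in the post above, not a confirmed requirement, and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: compare the installed glibc version with a required minimum.
# The 2.34 default is the value inferred in the post above (an assumption).

# version_ge A B: succeeds when version A >= version B (relies on sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# glibc_version: prints the installed glibc version, e.g. "2.31".
glibc_version() {
    getconf GNU_LIBC_VERSION | awk '{print $2}'
}

# check_glibc [REQUIRED]: reports whether the local glibc meets the minimum.
check_glibc() {
    required="${1:-2.34}"
    current="$(glibc_version)"
    if version_ge "$current" "$required"; then
        echo "glibc $current >= $required: OK"
    else
        echo "glibc $current < $required: too old"
        return 1
    fi
}
```

After sourcing the snippet, running check_glibc (or check_glibc 2.34) prints one line and returns non-zero when the installed glibc is older than the given minimum.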
3 weeks 1 day ago
Thanks for having a look into that. I've meanwhile compared the VM output of this task to other tasks I have running (they log 30,000-50,000 events after a few hours), so, sadly, I have aborted this task.
4 weeks 1 day ago
I am not getting any for Windows now. I have the project selected and Vbox installed; the site says there are loads, but nothing is downloading?
Do you have 'native' selected in your project preferences?
1 month 1 week ago
As I said, it's been aborted now.
I had only left it alone because I've had others that ran long but were otherwise working normally.
I don't make a habit of digging through workunit logs without cause.
1 month 2 weeks ago
Solved. I ended up installing a new up-to-date OS (Fedora 39). Then I followed this guide to install BOINC on Fedora and also this guide to install CVMFS on Fedora.
I installed CVMFS:
dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-2.11.0/cvmfs-2.11.0-1.fc34.x86_64.rpm https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm http://ecsft.cern.ch/dist/cvmfs/cvmfs-2.11.0/cvmfs-libs-2.11.0-1.fc34.x86_64.rpm
cvmfs_config setup
Then added CVMFS configuration to /etc/cvmfs/default.local:
CVMFS_REPOSITORIES="atlas,atlas-condb,grid,cernvm-prod,sft,alice"
CVMFS_HTTP_PROXY="auto;DIRECT"
CVMFS_USE_CDN=yes
CVMFS_CLIENT_PROFILE=single
And ran the prepare_theory_native_environment script from this board:
sudo /bin/bash -c "export script=\"prepare_theory_native_environment\" && wget https://lhcathome.cern.ch/lhcathome/download/\$script -O /tmp/\$script && chmod u+x /tmp/\$script && /tmp/\$script && rm /tmp/\$script"
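Before fetching native tasks it can help to confirm that the configuration file really contains the settings listed above. The following is a minimal sketch, not part of the original procedure: the function name is hypothetical, and the list of keys is simply the four settings from this post rather than an official list of required options.

```shell
#!/bin/sh
# Sketch: list which of the settings from the post above are absent from a
# CVMFS client configuration file. The key list is taken from the post only.

# missing_cvmfs_keys FILE: prints one line per key that FILE does not set.
missing_cvmfs_keys() {
    file="$1"
    for key in CVMFS_REPOSITORIES CVMFS_HTTP_PROXY CVMFS_USE_CDN CVMFS_CLIENT_PROFILE; do
        grep -Eq "^[[:space:]]*${key}=" "$file" || echo "$key"
    done
}
```

Calling missing_cvmfs_keys /etc/cvmfs/default.local prints nothing when all four keys are present. To check that the repositories actually mount, the CVMFS client's own cvmfs_config probe command is the more direct test.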
2 months ago
Cheers, I think I might just stop doing theory altogether, the project seems a mess at the moment, even the native apps are failing almost instantly.
2 months 1 week ago
Native theories are still failing for me, allowed the machine to grab some and they all instantly failed, any ideas?
https://lhcathome.cern.ch/lhcathome/result.php?resultid=407019270
Is there any way to force the box to get Vbox theories whilst still getting native Atlas tasks? The native tasks seem to complete so much faster.
2 months 2 weeks ago
In your tasks:
Command:
VBoxManage -q closemedium "/var/lib/boinc/projects/lhcathome.cern.ch_lhcathome/Theory_2023_12_13.vdi"
Output:
VBoxManage: error: Cannot close medium '/var/lib/boinc/projects/lhcathome.cern.ch_lhcathome/Theory_2023_12_13.vdi' because it has 2 child media
When it cannot close the medium, it will also not open it. The problem is not with BOINC, but with VirtualBox.
You have to use VirtualBox Manager to solve your problem. See
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=6112&postid=49574,
but remove Theory_2023_12_13.vdi instead of the CMS vdi, and don't delete the file itself.
That seems to have done it. When I looked in Tools/Media I found there were 2 orphaned Theory tasks listed under the .vdi file -- I assume those were the 2 child media the log referred to.
The Theory .vdi couldn't be removed from the VB manager until those were gone.
Thanks for the detailed reply. I would never have found this without it.
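The same check can be done from the command line. The sketch below is an illustration only: it assumes that VBoxManage list hdds prints "UUID:", "Parent UUID:" and "Location:" lines for each medium, and the two function names are hypothetical helpers that only read that listing from standard input.

```shell
#!/bin/sh
# Sketch: find the child media of a disk image from `VBoxManage list hdds`
# output. Both helpers only parse text on stdin; they change nothing.

# uuid_of LOCATION: prints the UUID of the medium stored at LOCATION.
uuid_of() {
    awk -v loc="$1" '
        $1 == "UUID:"     { uuid = $2 }
        $1 == "Location:" { sub(/^Location:[ \t]+/, ""); if ($0 == loc) print uuid }
    '
}

# child_media PARENT_UUID: prints the UUID of every medium whose
# "Parent UUID:" field equals PARENT_UUID.
child_media() {
    awk -v parent="$1" '
        $1 == "UUID:"                                   { uuid = $2 }
        $1 == "Parent" && $2 == "UUID:" && $3 == parent { print uuid }
    '
}
```

For example, VBoxManage list hdds | uuid_of "/var/lib/boinc/projects/lhcathome.cern.ch_lhcathome/Theory_2023_12_13.vdi" gives the parent UUID, and piping the listing into child_media with that UUID shows the children. Each child could then be closed with VBoxManage closemedium disk followed by its UUID, which will presumably only succeed once it is no longer attached to any VM.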
2 months 3 weeks ago
After >5 hours at 100% CPU, the last lines of running.log were:
0 events processed
PYTHIA Warning in DireSpace::branch_IF: used up beam momentum; discard splitting.
PYTHIA Info in DireTimes::pT2nextQCD_FF: Found large acceptance weight for Dire_fsr_qed_11->11&22_notPartial
PYTHIA Info in DireTimes::pT2nextQCD_FF: Found large acceptance weight for Dire_fsr_qcd_21->1&1a
PYTHIA Warning in DireTimes::branch_FI: used up beam momentum; discard splitting.
PYTHIA Warning in Pythia::check: not quite matched particle energy/momentum/mass
I aborted this job. Description: pp wy 13000 - - pythia8 8.303 dire-default
This was the second resend, and the initial wingman let it run the entire 10 days without success.
Workunit:
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=219321083
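A rough way to spot a task like this without reading the whole log is to look at the most recent "N events processed" line in running.log. This is only a sketch: the helper names are hypothetical, the line format is taken from the excerpt above, and a count of 0 is just a hint, since a task may legitimately report no events early in its run.

```shell
#!/bin/sh
# Sketch: read the latest event count from a Theory running.log.
# Assumes progress lines look like "0 events processed", as in the excerpt.

# last_event_count LOGFILE: prints the number from the most recent
# "N events processed" line, or nothing if the log has no such line.
last_event_count() {
    grep -E '^[[:space:]]*[0-9]+ events processed' "$1" | tail -n 1 | awk '{print $1}'
}

# looks_stalled LOGFILE: succeeds when the latest reported count is 0.
looks_stalled() {
    count="$(last_event_count "$1")"
    [ "$count" = "0" ]
}
```

Combined with the elapsed run time, looks_stalled running.log returning success after several hours would match the situation described in this post.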
2 months 3 weeks ago
The Theory task ran to completion, but the output file was not generated.
I don't know what the reason could be.
2 months 3 weeks ago
My system was also grinding to a halt with VirtualBox.
I had to use a Linux VM in VMware Workstation to run lhcathome.
3 months ago
Thanks.
So, the link on the project website also needs to be changed back from mcplots.cern.ch to mcplots-dev.cern.ch
Menu: Jobs -> Theory Jobs
3 months ago
Although the task in question might indeed have gotten stuck, the "failed" list is not helpful for deciding whether any task should be killed or not.
That's because 1 important fact has not been mentioned:
"Therefore no task of them was successful so far..."
Especially when a new mcplots revision starts there are always a couple of runspecs that fail or get lost before they report their first success.
To get removed from the "failed" list a runspec needs to report at least 1 successful result.
3 months 1 week ago
Any news on how to get these working on newer distros yet? thanks
LHC@home: Theory Application