CMS@Home message board

Does anyone else never restart their machine running CMS?

2 weeks 4 days ago
VirtualBox apps keep complex configuration problems away from volunteers who can't deal with them.

All because you didn't install (and configure) a local CVMFS client.


Perhaps the solution is not to require oddball filesystems and configurations from the users? To my knowledge no other project requires this junk. Or if they do, they handle it well like Rosetta so I don't notice it.


As for my errors on Windows, I had about 8 tasks fail on last night's reboot, each with roughly 10 hours already run. Just like I said: 80 core-hours lost.

None of those things are *required*. They're a choice we the users make to run the native application instead of the VirtualBox one. While I have a number of errors in my history, I also accept that the vast majority are my own fault, usually from playing around trying to make things work "better" but instead making them worse.

As for your errors on Windows, just because a work unit has clocked up any amount of time (wall-clock time) while active doesn't mean it's been using the CPU for that same time. They're tracked separately for a reason.
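To illustrate the distinction between wall-clock time and CPU time, here is a minimal Python sketch. It is not BOINC code and does not reflect how the client does its accounting; the `measure` helper is a hypothetical name used only for this example.

```python
import time

def measure(work):
    """Return (wall_seconds, cpu_seconds) spent running work()."""
    wall_start = time.perf_counter()   # elapsed (wall-clock) timer
    cpu_start = time.process_time()    # CPU time used by this process only
    work()
    return time.perf_counter() - wall_start, time.process_time() - cpu_start

# Sleeping advances the wall clock but uses almost no CPU.
idle_wall, idle_cpu = measure(lambda: time.sleep(0.5))

# A busy loop uses CPU for roughly as long as it runs.
busy_wall, busy_cpu = measure(lambda: sum(i * i for i in range(2_000_000)))

print(f"idle: wall={idle_wall:.2f}s cpu={idle_cpu:.2f}s")
print(f"busy: wall={busy_wall:.2f}s cpu={busy_cpu:.2f}s")
```

A task that spends much of its life waiting (on the network, on a VM booting, on a suspended client) looks like the "idle" case: hours on the wall clock, far less actual CPU work lost.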

no new WUs available

4 weeks 2 days ago
new new tasks since last night :-(
sorry, should read "NO new tasks ..."
Well, I was waiting to see if Daniele's multi-core workflow would spawn new jobs, in case my old one was holding it back. That turned out not to happen, so I'm submitting smaller job batches until we work out how to get the multi-core jobs into the system. There may be intermittent disruptions over the next few days if my sleep cycle disagrees with the job queues' needs for attention.

Could not get X509 credentials

2 months ago
Can this problem with getting a proxy credential from LHC be avoided by installing a local proxy server?
If so, how would one go about this in Linux?
In this case, the two meanings of the word "proxy" are quite different. The proxy credential is an authorisation to connect to the service; the (squid) proxy server is a caching server that saves requested files so that they don't need to be transported again if re-requested.

No monitoring with ALT-F2

2 months 1 week ago
Still not fixed.
I'll remind Laurence of this -- my current job does show output for all other Alt-Fn (n=1-6). Also, the running.log isn't being refreshed in the "show graphics" web page.

Virtualbox tasks failing on Linux

2 months 1 week ago
Seem to have sorted it. Adding an override file didn't seem to work, but after some searching here, editing /etc/systemd/system/multi-user.target.wants/boinc-client.service and changing ProtectSystem=strict to ProtectSystem=full seems to have fixed it. I'll keep an eye on the outputs.
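For anyone wanting to confirm which ProtectSystem value a unit file actually carries after such an edit, here is a rough Python sketch. It is only an illustration: `protect_system_value` and `SAMPLE_UNIT` are hypothetical names, the sample text is not the real boinc-client.service, and the parser assumes the last assignment in [Service] is the one that counts.

```python
def protect_system_value(unit_text):
    """Return the last ProtectSystem= value in the [Service] section, or None."""
    section = None
    value = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]  # entering a new section such as [Service]
            continue
        if section == "Service" and "=" in line:
            key, _, val = line.partition("=")
            if key.strip() == "ProtectSystem":
                value = val.strip()  # assumption: a later assignment overrides an earlier one
    return value

# Hypothetical sample text, NOT the real boinc-client.service.
SAMPLE_UNIT = """\
[Unit]
Description=hypothetical sample unit

[Service]
ProtectSystem=strict
"""

print(protect_system_value(SAMPLE_UNIT))                              # strict
print(protect_system_value(SAMPLE_UNIT.replace("=strict", "=full")))  # full
```

To check a real installation, read the contents of the file mentioned above and pass the text to the function; the service would still need to be reloaded and restarted for any change to take effect.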

Tasks have started to fail on Windows again

2 months 1 week ago
I have reset the project and cleared out any old entries in VirtualBox, and the units still fail instantly. Honestly, with CMS and Theory failing on Linux for me and now this, I am tempted to give up on the project :(

Example task https://lhcathome.cern.ch/lhcathome/result.php?resultid=405102137

Looks like the key points are:

VBoxManage.exe: error: Cannot close medium 'E:\Boinc\data\projects\lhcathome.cern.ch_lhcathome\CMS_2022_09_07_prod.vdi' because it has 1 child media
VBoxManage.exe: error: Details: code VBOX_E_OBJECT_IN_USE (0x80bb000c), component MediumWrap, interface IMedium, callee IUnknown
VBoxManage.exe: error: Context: "Close()" at line 1875 of file VBoxManageDisk.cpp
2024-02-03 13:35:46 (22644): Could not create VM
2024-02-03 13:35:46 (22644): ERROR: VM failed to start
2024-02-03 13:35:46 (22644): Powering off VM.
2024-02-03 13:35:46 (22644): Deregistering VM. (boinc_4b854b63c685b678, slot#67)
2024-02-03 13:35:46 (22644): Removing network bandwidth throttle group from VM.
2024-02-03 13:35:46 (22644): Removing VM from VirtualBox.

Checking the VirtualBox GUI shows nothing, no tasks there at all.

ConsoleWrap Error

3 months 1 week ago
You got 10 CMS tasks and you probably started all 10 at once. VirtualBox doesn't like starting many machines in the same second.

Maybe the other 6 are still running. Hopefully they are running well.

Oh, thank you. Yes, the other 6 ran correctly.
I'll keep an eye out

CMS (vbox) tasks failing

3 months 2 weeks ago
Ok, I've noticed two things. They might not be relevant, or they could be ... I don't know.

I ran out of Atlas tasks. Every CMS task that failed did so while I also had Atlas tasks (three at a time) running.
I checked the permissions on the Slots folder, and issued "chmod +wrx -R slots" to address a suspected inconsistency.
The next CMS task I allowed to start has been running for over ten hours now, so I started eleven more to push the issue, and those additional tasks have all been running for about an hour without a problem.

There are also two long running Theory Native tasks currently running and the rest of my 24 threads are taken up with Asteroids tasks (nine) and one thread kept free for system stuff.