
Would a task restart from scratch after a machine crash?

4 weeks ago
Would the machine crash have caused the Theory task to completely restart? When a PC crashes and/or BOINC is not shut down properly, VirtualBox is not able to save the VM state to disk.
After BOINC starts up again, the task either errors out or, if you're lucky/unlucky, the VM starts the job from the beginning.
The progress of the task in runRivet.log on disk is no longer updated, but the progress can still be seen with BOINC Manager's "Show Graphics".
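
If you want to check that on-disk progress yourself, here is a minimal sketch, assuming a Linux client with the default /var/lib/boinc-client data directory (path and recursive search are assumptions; adjust them to your setup). It prints the last few lines of every runRivet.log found under the slot directories:

    # Show the tail of every runRivet.log below the BOINC slot directories.
    # Assumes the default Linux data directory; run with read access to that tree.
    from pathlib import Path

    BOINC_DATA = Path("/var/lib/boinc-client")   # adjust for your installation

    for log in sorted(BOINC_DATA.glob("slots/**/runRivet.log")):
        print(f"=== {log} ===")
        with log.open(errors="replace") as fh:
            for line in fh.readlines()[-5:]:      # last few lines = current progress
                print(line.rstrip())

If the output stops changing while the VM is still up, that matches the "log no longer updated" behaviour described above.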

BTW: https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=238232604 was a resend and the original client returned a valid result a bit too late. I would abort that task.

No new WUs available

1 month ago
1/10/2026 3:16:01 PM | LHC@home | No tasks are available for CMS Simulation

(After trying for an hour I finally got the one I was trying to get.) This seems to be a common problem lately. It also happens here once in a while. In most cases it then takes between 20 and 30 minutes until tasks finally come in. No idea what's going on.
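
For what it's worth, you can roughly measure how long such a dry spell lasts by filtering the client's event log for the relevant messages. A small sketch, assuming a Linux client whose daemon log is /var/lib/boinc-client/stdoutdae.txt (on Windows the Manager's event log carries the same messages):

    # Show when LHC@home reported "no tasks" and when a scheduler request
    # finally delivered work again. Path and message wording are from a stock
    # Linux BOINC client; adjust for your setup.
    from pathlib import Path

    LOG = Path("/var/lib/boinc-client/stdoutdae.txt")

    for line in LOG.read_text(errors="replace").splitlines():
        if "LHC@home" not in line:
            continue
        if ("No tasks are available" in line
                or "Scheduler request completed: got" in line):
            print(line)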

This gonna be long

1 month 1 week ago
Regarding your two linked tasks: the first one was returned too late, but still before your resend was turned in. The second task's VM may have been restarted several times, possibly starting from scratch each time.


Both tasks are still running and it's not clear whether:
- the first task, which already has a valid result, will grant me credit if I finish it?
- if I don't finish the second task within 11 days, it will get cancelled and I will lose all 11 days of running time? If a third replica gets sent out after 10 days, I don't think anyone will finish it within 24 hours (before my hard deadline), so that shouldn't be an issue.

Edit: after re-reading your reply several times I think I figured out the misunderstanding. In both WUs that I linked, I'm running the resends, not the initial tasks.

Lost in ATLAS...

1 month 3 weeks ago
CMS mostly seems to be working OK.
That's wrong.
Your CMS VMs are running empty tasks without any scientific value.
As said, this is due to an error in CERN's backend queue, which does not send out any scientific jobs.
You can't do anything about it, as it must be fixed by CERN staff after their holidays.

Indicators are:
1. short runtimes (see the sketch after this list)
2. CMS Grafana pages:
https://lhcathome.cern.ch/lhcathome/cms_job.php

https://monit-grafana.cern.ch/d/o3dI49GMz/cms-job-monitoring-12m?viewPanel=49&orgId=11&var-group_by=CMS_JobType&var-Tier=All&var-CMS_WMTool=All&var-CMS_SubmissionTool=All&var-CMS_CampaignType=All&var-Site=T3_CH_Volunteer&var-Site=T3_CH_CMSAtHome&var-Type=All&var-CMS_JobType=All&var-CMSPrimaryDataTier=All&var-adhoc=data.RecordTime%7C%3E%7Cnow-7d&var-ScheddName=All&from=now-7d&to=now
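
To check indicator 1 on your own host, you can scan the client's per-project job log for CMS tasks with very short elapsed times. A rough sketch, assuming the default Linux data directory, the usual job_log_<project>.txt file naming and the "nm <name> ... et <elapsed>" field layout BOINC writes there; adjust path, name filter and threshold for your setup:

    # Flag CMS tasks whose elapsed time looks too short to contain a real job.
    from pathlib import Path

    JOB_LOG = Path("/var/lib/boinc-client/job_log_lhcathome.cern.ch_lhcathome.txt")
    SHORT = 600.0   # seconds; the threshold is a guess, tune it for your hardware

    for line in JOB_LOG.read_text().splitlines():
        tokens = line.split()
        fields = dict(zip(tokens[1::2], tokens[2::2]))  # "key value" pairs after the timestamp
        name = fields.get("nm", "")
        elapsed = float(fields.get("et", 0))
        if name.startswith("CMS") and elapsed < SHORT:   # adjust if your CMS tasks are named differently
            print(f"{name}: only {elapsed:.0f} s elapsed - probably an empty job")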


If you want to deliver work with scientific value, switch to Theory.

New 1000-event tasks

2 months ago
Same here. More than a dozen of my tasks got cancelled while running for hours (some >50% in progress). Some got cancelled before they started running, and I'm fine with that.

In addition, I got tasks with validation errors, but those only ran for a few minutes, so that's not as bad compared to the ones that had already been running for hours when they got cancelled.
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=237892957
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=237896161
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=237891554

Event count less easily monitored: eventLoopHeartBeat.txt stays stuck.

2 months 3 weeks ago
Hi!
I've been away for a while. Now I see that the file eventLoopHeartBeat.txt in the [...]/boinc-client/slots/?*/PanDA_Pilot-* directory is no longer constantly updated, so it always reports "1 event read so far". It's possible to find multiple updated eventLoopHeartBeat.txt files, one for each worker, in the [...]/boinc-client/slots/?*/PanDA_Pilot-*/athenaMP-workers-EVNTtoHITS-sim/worker_?* subdirs. However, you have to sum up the number of events to get the total...
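
Until that is fixed, summing the per-worker counts can be scripted. A small sketch, assuming the default Linux slots path and that each worker's eventLoopHeartBeat.txt carries the number of processed events as the first integer in the file; adjust the glob and the parsing to what the files on your host actually contain:

    # Sum the event counts reported by the athenaMP workers of all running ATLAS tasks.
    import re
    from pathlib import Path

    SLOTS = Path("/var/lib/boinc-client/slots")   # adjust for your installation
    PATTERN = "*/PanDA_Pilot-*/athenaMP-workers-EVNTtoHITS-sim/worker_*/eventLoopHeartBeat.txt"

    total = 0
    for hb in SLOTS.glob(PATTERN):
        m = re.search(r"\d+", hb.read_text(errors="replace"))
        if m:
            events = int(m.group())
            total += events
            print(f"{hb.parent.name}: {events} events")
    print(f"total: {total} events")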

I don't think this has been done on purpose, am I wrong?
--
Bye, Lem

Hung Theory task?

2 months 3 weeks ago
There's no 'obvious error' reported back to the project.
In cases like that, no log file from the scientific app is sent back to the project.
Hence, there is nothing to analyse and the task is either marked as 'failed' or 'lost' after the due date.

Even the log snippets you posted do not clearly explain if/why the tasks got stuck.

So, how should the project decide what caused the failure?
It could be any of the following (list may be incomplete):
- hardware
- the OS
- VirtualBox
- BOINC
- vboxwrapper
- data from CVMFS
- scientific app

From the project's perspective there's only the overall task failure rate for the computer itself.
As already mentioned, for this computer it is less than 1 %, covering all possible reasons.

Theory CPU Scheduling oddness

3 months 2 weeks ago
This is a bug in VirtualBox 7.2.4.

On a computer with AMD CPU there's no known workaround so far.
...
After more testing...
Looks like the downgrade left the 7.2.4 kernel module on the system.
It now works after a cleanup and a fresh 7.2.2 installation (package from VirtualBox).

The kvm_amd module must remain blacklisted.
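
To verify that cleanup, something like the following sketch can help (Linux only; the module names come from the post above, and the blacklist file path mentioned in the comment is just the usual example, not a fixed requirement):

    # Check that kvm_amd stays out of the kernel and which vboxdrv build modprobe would load.
    import subprocess
    from pathlib import Path

    loaded = Path("/proc/modules").read_text().splitlines()
    for mod in ("kvm_amd", "vboxdrv"):
        state = "loaded" if any(l.startswith(mod + " ") for l in loaded) else "not loaded"
        print(f"{mod}: {state}")

    # modinfo reports the version of the module file that would be loaded;
    # after the cleanup this should show the fresh 7.2.2 build, not 7.2.4.
    ver = subprocess.run(["modinfo", "-F", "version", "vboxdrv"],
                         capture_output=True, text=True).stdout.strip()
    print(f"vboxdrv module version: {ver or 'not installed'}")

    # To keep kvm_amd blacklisted, a file in /etc/modprobe.d/ containing
    # "blacklist kvm_amd" is the usual approach.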