4 days 12 hours ago
The ATLAS queue has been empty for a while now. There are still some resend tasks sent out every now and then. If you get one, the server seems to give you 24 hours to finish the task before it gets cancelled, even if the deadline is still 7 days away. So keep your cache low and give any ATLAS tasks you receive priority so they get crunched asap.
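If it helps, the cache can be kept low from the command line via a global_prefs_override.xml; a minimal sketch, assuming the Debian/Ubuntu data directory /var/lib/boinc-client (the 0.1-day buffer is just an illustration, pick what suits your setup):

    cat > /var/lib/boinc-client/global_prefs_override.xml <<'EOF'
    <global_preferences>
       <work_buf_min_days>0.1</work_buf_min_days>
       <work_buf_additional_days>0</work_buf_additional_days>
    </global_preferences>
    EOF
    boinccmd --read_global_prefs_override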
2 weeks ago
Today I found two VirtualBox ATLAS tasks in a sort of zombie state with stderr containing:
2025-05-26 22:19:39 (7140): Guest Log: [INFO] Probing /cvmfs/atlas.cern.ch... OK
2025-05-26 22:19:39 (7140): Guest Log: [INFO] Detected branch: prod
2025-05-26 22:21:58 (7140): Guest Log: [DEBUG] Failed to copy ATLASJobWrapper-prod.sh
2025-05-26 22:21:58 (7140): Guest Log: [DEBUG] VM early shutdown initiated due to previous errors.
2025-05-26 22:21:58 (7140): Guest Log: [DEBUG] Cleanup will take a few minutes...
2025-05-26 23:46:54 (7140): Status Report: Elapsed Time: '6000.000000'
2025-05-26 23:46:54 (7140): Status Report: CPU Time: '31.187500'
[...]
2025-05-28 07:44:48 (7140): Status Report: Elapsed Time: '114000.000000'
2025-05-28 07:44:48 (7140): Status Report: CPU Time: '343.546875'
The other log is similar.
Cleanup seems to have failed, so I will abort both.
https://lhcathome.cern.ch/lhcathome/result.php?resultid=422888662
https://lhcathome.cern.ch/lhcathome/result.php?resultid=422887835
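For anyone who prefers the command line, such stuck tasks can also be aborted with boinccmd; a sketch, with TASK_NAME as a placeholder:

    boinccmd --task https://lhcathome.cern.ch/lhcathome/ TASK_NAME abort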
Other ATLAS tasks have been completing successfully on the same system. Does anyone have an explanation for this behavior?
2 weeks 2 days ago
Kernel 6.12.x loads a module that stops VirtualBox from starting. See this post for details.
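For context: the module in question is KVM (kvm_intel on Intel hosts, kvm_amd on AMD), which kernel 6.12 now activates at module load, so VirtualBox can no longer grab the virtualization hardware. A rough sketch of the workaround, assuming an Intel host:

    # temporary, lost at the next reboot:
    sudo modprobe -r kvm_intel kvm

    # or keep KVM but stop it from taking the hardware at load
    # time (kernel parameter new in 6.12), e.g. appended to
    # GRUB_CMDLINE_LINUX in /etc/default/grub:
    #   kvm.enable_virt_at_load=0
    sudo update-grub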
2 weeks 4 days ago
You are correct that I experienced the same problem on a previous Ubuntu setup; however, I didn't remember it.
I have implemented the workaround, and my ATLAS native workunit is now running correctly.
1 month 1 week ago
I encountered a compute error on an ATLAS native task that ran on my system for nearly two weeks before failing. Here's a summary of the failure:
Workunit: 231716588
Application version: ATLAS Simulation v3.01 (native_mt)
Client state: Compute error
Exit status: 195 (0x000000C3) EXIT_CHILD_FAILED
Host: ID 10873401
System: Linux Mint 22.1 (based on Ubuntu 24.04), CVMFS and Apptainer functional
Threads assigned: 5
Stderr shows this likely culprit:
mv: ‘ATLAS.root_0’ and ‘EVNT.44075481._002898.pool.root.1’ are the same file
This appears to crash the job early in its actual processing step, causing the wrapper to return an exit status of 195.
All CVMFS probes passed and the Apptainer container was loaded successfully from CVMFS. This seems to be an issue with the job script logic, likely in start_atlas.sh or its handling of the mv command near the start of execution.
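For illustration, a guard of roughly this shape around the move would avoid the failure; a hypothetical sketch, not the actual start_atlas.sh logic (file names taken from the stderr above):

    src="EVNT.44075481._002898.pool.root.1"
    dst="ATLAS.root_0"
    # -ef is true when both names resolve to the same file
    # (same device and inode), e.g. via a symlink
    if [ ! "$src" -ef "$dst" ]; then
        mv "$src" "$dst"
    fi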
Please let me know if you'd like additional logs or diagnostic output.
Thanks for all your hard work!
1 month 2 weeks ago
Over 500 validate errors during the past couple of days.
Yeah, I had to set my boxes to "No New Tasks" so they don't waste electricity on the bad WUs.
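If it saves anyone some clicking: No New Tasks can also be set per project from the command line; a sketch:

    boinccmd --project https://lhcathome.cern.ch/lhcathome/ nomorework
    # and to undo it once the bad batch is gone:
    boinccmd --project https://lhcathome.cern.ch/lhcathome/ allowmorework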
2 months 1 week ago
...What I might do then is upgrade from 7.1.4 to 7.1.6; maybe it helps. I'll let you know. I upgraded, and now it seems to work :-) So my Win11 machine is able to crunch ATLAS as well (besides CMS and Theory, which have not been a problem before anyway).
Good!
2 months 2 weeks ago
Today, all ATLAS tasks received by several of my hosts errored out after about 5 minutes, with stderr showing
"pilotErrorDiag": "Failed to execute payload:/bin/bash: Sim_tf.py: command not found"
for complete info see:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=420216659

Same problem with some of the tasks my hosts downloaded today.
No idea why these faulty tasks are still being sent out to us.
2 months 2 weeks ago
Thanks, that resolved the issue.
The second ATLAS task has started.
3 months 1 week ago
I don't know if an automatic end is expected to happen after a certain period (?)
I thought that 807,403.00 seconds (more than 9 days and 8 hours) was enough time to give it the chance...

For ATLAS no automatic end is set, but using only 54 min 33 sec of CPU time in over 9 days is sign enough that the task is not doing well.
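A quick way to spot such tasks is to compare CPU time with elapsed time on the client; a rough sketch with boinccmd and awk, assuming the client prints the "current CPU time" line before "elapsed task time" as recent clients do (the 1-hour minimum and 5% threshold are just illustrations):

    boinccmd --get_tasks | awk '
        /current CPU time/  { cpu = $4 }
        /elapsed task time/ { if ($4 > 3600 && cpu / $4 < 0.05)
                                  print "possibly stuck: CPU/elapsed =", cpu / $4 }
    '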
3 months 1 week ago
Then check the settings for the default value.
It looks like it works as expected for other computers (including mine).
As long as you are the only one experiencing/reporting that issue it's most likely not a server issue.
It is also not related to your local BOINC client since you configure it on the server rather than on the client.