ATLAS@Home message board

queue is empty

5 days 10 hours ago
The ATLAS queue has been empty for a while now. Some resend tasks are still sent out every now and then. If you get one, the server seems to give you 24 hours to finish it before it gets cancelled, even if the deadline is still 7 days away. So keep your cache low and prioritize any ATLAS tasks you receive so they get crunched ASAP.

Failed tasks not cleaning up and exiting in reasonable time

2 weeks 1 day ago
Today I found two VirtualBox ATLAS tasks in a sort of zombie state with stderr containing:

2025-05-26 22:19:39 (7140): Guest Log: [INFO] Probing /cvmfs/atlas.cern.ch... OK
2025-05-26 22:19:39 (7140): Guest Log: [INFO] Detected branch: prod
2025-05-26 22:21:58 (7140): Guest Log: [DEBUG] Failed to copy ATLASJobWrapper-prod.sh
2025-05-26 22:21:58 (7140): Guest Log: [DEBUG] VM early shutdown initiated due to previous errors.
2025-05-26 22:21:58 (7140): Guest Log: [DEBUG] Cleanup will take a few minutes...
2025-05-26 23:46:54 (7140): Status Report: Elapsed Time: '6000.000000'
2025-05-26 23:46:54 (7140): Status Report: CPU Time: '31.187500'
[...]
2025-05-28 07:44:48 (7140): Status Report: Elapsed Time: '114000.000000'
2025-05-28 07:44:48 (7140): Status Report: CPU Time: '343.546875'

The other log is similar.

Cleanup seems to have failed so I will abort both.
https://lhcathome.cern.ch/lhcathome/result.php?resultid=422888662
https://lhcathome.cern.ch/lhcathome/result.php?resultid=422887835

Other ATLAS tasks have been completing successfully on the same system. Does anyone have an explanation for this behavior?
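
A stalled task like this one can be spotted from the stderr excerpt alone: elapsed time keeps growing while CPU time barely moves. The sketch below is a minimal Python illustration, not part of BOINC or the ATLAS wrapper. The function names and the 1% threshold are assumptions chosen for this example, and the regular expression assumes the `Status Report` line format shown in the log above.

```python
import re

# Matches lines such as:
# 2025-05-26 23:46:54 (7140): Status Report: Elapsed Time: '6000.000000'
STATUS_RE = re.compile(r"Status Report: (Elapsed Time|CPU Time): '([0-9.]+)'")


def last_status(stderr_text):
    """Return the last reported (elapsed_seconds, cpu_seconds); a missing value is None."""
    elapsed = cpu = None
    for line in stderr_text.splitlines():
        match = STATUS_RE.search(line)
        if not match:
            continue
        if match.group(1) == "Elapsed Time":
            elapsed = float(match.group(2))
        else:
            cpu = float(match.group(2))
    return elapsed, cpu


def looks_stalled(stderr_text, min_cpu_ratio=0.01):
    """Heuristic only: CPU time below min_cpu_ratio of elapsed time (threshold is an assumption)."""
    elapsed, cpu = last_status(stderr_text)
    if not elapsed or cpu is None:
        return False
    return cpu / elapsed < min_cpu_ratio
```

For the last pair of values in the log above (114000 s elapsed, about 344 s CPU), the ratio is roughly 0.3%, so this heuristic would flag the task.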

Apptainer error

2 weeks 5 days ago
You are correct: I experienced the same problem on a previous Ubuntu setup, but I had forgotten about it.
I have implemented the workaround, and my Atlas_native workunit is now running correctly.

ATLAS Task Failure – mv: source and destination are the same file (WU 231716588)

1 month 1 week ago
I encountered a compute error on an ATLAS native task that ran on my system for nearly two weeks before failing. Here's a summary of the failure:

Workunit: 231716588

Application version: ATLAS Simulation v3.01 (native_mt)

Client state: Compute error

Exit status: 195 (0x000000C3) EXIT_CHILD_FAILED

Host: ID 10873401

System: Linux Mint 22.1 (based on Ubuntu 24.04), CVMFS and Apptainer functional

Threads assigned: 5

Stderr shows this likely culprit:

mv: ‘ATLAS.root_0’ and ‘EVNT.44075481._002898.pool.root.1’ are the same file

This appears to crash the job early in its actual processing step, causing the wrapper to return an exit status of 195.

All CVMFS probes passed and the Apptainer container was loaded successfully from CVMFS. This seems to be an issue with the job script logic, likely in start_atlas.sh or its handling of the mv command near the start of execution.
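
The `mv` message means that the two names resolved to one and the same file when the move was attempted. The Python sketch below reproduces that condition in a temporary directory and shows a guard that a script could apply before moving. It is an illustration only: the hard link is a hypothetical way of getting two names for one file, `is_same_file` is a helper invented for this example, and nothing here is taken from start_atlas.sh, so how the names came to coincide in the real job remains unknown.

```python
import os
import tempfile


def is_same_file(src, dst):
    """True if both paths exist and refer to the same file (same device and inode)."""
    return os.path.exists(src) and os.path.exists(dst) and os.path.samefile(src, dst)


with tempfile.TemporaryDirectory() as workdir:
    evnt = os.path.join(workdir, "EVNT.44075481._002898.pool.root.1")
    alias = os.path.join(workdir, "ATLAS.root_0")
    with open(evnt, "w") as handle:
        handle.write("placeholder")
    # Hypothetical setup: a hard link gives one file two names.
    os.link(evnt, alias)
    same = is_same_file(alias, evnt)
    if same:
        print("skip move: source and destination are the same file")
    else:
        os.replace(alias, evnt)
```

A check like this before the move would turn the fatal error into a no-op, but whether that is the right fix depends on what the job script intends at that step.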

Please let me know if you'd like additional logs or diagnostic output.

Thanks for all your hard work!

Extreme event processing times

3 months 1 week ago
I don't know if an automatic end is expected to happen after a certain period (?)
I thought that 807,403.00 seconds (more than 9 days and 8 hours) was enough time to give it a chance...
No automatic end is set for ATLAS, but using only 54 min 33 sec of CPU time in over 9 days is sign enough that the task is not doing well.
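
The figures quoted in this post can be checked with a few lines of Python. This is only a worked calculation on the two numbers given above. The variable names are chosen for this example.

```python
from datetime import timedelta

elapsed_seconds = 807_403.00   # elapsed time quoted in the post
cpu_seconds = 54 * 60 + 33     # 54 min 33 sec of CPU time

elapsed = timedelta(seconds=elapsed_seconds)
cpu_share = cpu_seconds / elapsed_seconds

print(elapsed)                 # 9 days, 8:16:43
print(f"{cpu_share:.2%}")      # 0.41%
```

So the task used well under 1% of its elapsed time as CPU time, which supports the conclusion that it was not making progress.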

Why does the server give me ATLAS tasks if I enable Theory tasks and "If no work for selected applications is available, accept work from other applications"?

3 months 2 weeks ago
Then check the settings for the default value.

It looks like it works as expected for other computers (including mine).
As long as you are the only one experiencing/reporting that issue it's most likely not a server issue.

It is also not related to your local BOINC client since you configure it on the server rather than on the client.
Checked