
1 week 1 day ago

the queue has run dry :-(

Thousands of workunits ready to send, but "No tasks are available for Theory Simulation".

2 weeks ago

Is it known whether Theory_native is going to return at some point in the future, or has it been permanently retired?

Failed tasks not cleaning up and exiting in reasonable time

2 weeks 3 days ago

Today I found two VirtualBox ATLAS tasks in a sort of zombie state with stderr containing:

2025-05-26 22:19:39 (7140): Guest Log: [INFO] Probing /cvmfs/atlas.cern.ch... OK
2025-05-26 22:19:39 (7140): Guest Log: [INFO] Detected branch: prod
2025-05-26 22:21:58 (7140): Guest Log: [DEBUG] Failed to copy ATLASJobWrapper-prod.sh
2025-05-26 22:21:58 (7140): Guest Log: [DEBUG] VM early shutdown initiated due to previous errors.
2025-05-26 22:21:58 (7140): Guest Log: [DEBUG] Cleanup will take a few minutes...
2025-05-26 23:46:54 (7140): Status Report: Elapsed Time: '6000.000000'
2025-05-26 23:46:54 (7140): Status Report: CPU Time: '31.187500'
[...]
2025-05-28 07:44:48 (7140): Status Report: Elapsed Time: '114000.000000'
2025-05-28 07:44:48 (7140): Status Report: CPU Time: '343.546875'

The other log is similar.

Cleanup seems to have failed so I will abort both.
https://lhcathome.cern.ch/lhcathome/result.php?resultid=422888662
https://lhcathome.cern.ch/lhcathome/result.php?resultid=422887835

Other ATLAS tasks have been completing successfully on the same system. Does anyone have an explanation for this behavior?
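
For anyone who wants to spot such tasks without reading every log by hand, here is a minimal sketch that reads the "Status Report" lines in the format shown above and flags a task whose CPU time is tiny compared to its elapsed time. The function names and the thresholds (`min_elapsed`, `max_fraction`) are my own assumptions for illustration, not anything defined by BOINC or the project.

```python
import re

# Matches lines such as:
# 2025-05-26 23:46:54 (7140): Status Report: Elapsed Time: '6000.000000'
STATUS_RE = re.compile(r"Status Report: (Elapsed Time|CPU Time): '([0-9.]+)'")


def last_status(stderr_text):
    """Return (elapsed, cpu) in seconds from the last status reports, or None."""
    elapsed = cpu = None
    for kind, value in STATUS_RE.findall(stderr_text):
        if kind == "Elapsed Time":
            elapsed = float(value)
        else:
            cpu = float(value)
    if elapsed is None or cpu is None:
        return None
    return elapsed, cpu


def looks_stalled(stderr_text, min_elapsed=3600.0, max_fraction=0.01):
    """Guess whether a task is idling: long elapsed time, almost no CPU time.

    Both thresholds are arbitrary illustration values, not project settings.
    """
    status = last_status(stderr_text)
    if status is None:
        return False
    elapsed, cpu = status
    if elapsed < min_elapsed:
        return False  # too early to judge
    return cpu / elapsed < max_fraction
```

With the last two status lines quoted above (114000 s elapsed, about 344 s of CPU time) the ratio is roughly 0.3%, so this check would flag the task.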

Atlas tasks have started to fail.

2 weeks 5 days ago

Kernel 6.12.x loads a module that stops VB from starting. See this post for details.

Apptainer error

3 weeks ago

You are correct: I experienced the same problem on a previous Ubuntu setup, but I had forgotten about it.
I have implemented the workaround, and my Atlas_native workunit is now running correctly.

Network issue?

3 weeks 1 day ago

Since this morning, three CMS tasks have been running.
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10664116

CMS Simulation error

1 month ago

I had this problem:
VM Completion Message: glidein exited with return value 1.

In my case, the solution was to change the network connection.

New tasks all failing?

1 month 1 week ago

Most Theory tasks finish in a very short time (less than a minute), and the Jobs chart shows a high failure rate (~60-70%). How can I tell whether a task has actually completed properly?

2025-05-02 11:47:07 (10908): Guest Log: Environment HTTP proxy: not set
2025-05-02 11:47:08 (10908): Guest Log: job: htmld=/var/www/lighttpd
2025-05-02 11:47:08 (10908): Guest Log: job: unpack exitcode=0
2025-05-02 11:48:40 (10908): Guest Log: job: run exitcode=1
2025-05-02 11:48:40 (10908): Guest Log: job: diskusage=5704
2025-05-02 11:48:40 (10908): Guest Log: job: logsize=4 k
2025-05-02 11:48:40 (10908): Guest Log: job: times=
2025-05-02 11:48:40 (10908): Guest Log: 0m0.002s 0m0.004s
2025-05-02 11:48:40 (10908): Guest Log: 0m0.424s 0m0.248s
2025-05-02 11:48:40 (10908): Guest Log: job: cpuusage=1
2025-05-02 11:48:40 (10908): Guest Log: Job Finished
2025-05-02 11:48:40 (10908): Guest Log: boinc_shutdown called with exit code 0
2025-05-02 11:48:40 (10908): Guest Log: sd_delay: 845
2025-05-02 11:48:40 (10908): Guest Log: ETA: 2025-05-02 10:02:44 UTC
2025-05-02 12:02:45 (10908): VM Completion File Detected.
2025-05-02 12:02:45 (10908): Powering off VM.
2025-05-02 12:02:45 (10908): Successfully stopped VM.
2025-05-02 12:02:45 (10908): Deregistering VM. (boinc_1114d9ba70cb8796, slot#0)
2025-05-02 12:02:46 (10908): Removing network bandwidth throttle group from VM.
2025-05-02 12:02:46 (10908): Removing VM from VirtualBox.
2025-05-02 12:02:51 (10908): called boinc_finish(0)
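
One way to check a log like the one above is to compare the exit code of the job inside the VM with the code passed to boinc_finish. Below is a minimal sketch; it assumes that the "job: run exitcode=" line reflects the outcome of the actual simulation and that 0 means success, which is my reading of the log and not a documented guarantee. The function names are hypothetical.

```python
import re

# Patterns taken from the log lines quoted above.
RUN_EXIT_RE = re.compile(r"Guest Log: job: run exitcode=(-?\d+)")
FINISH_RE = re.compile(r"called boinc_finish\((-?\d+)\)")


def summarize(stderr_text):
    """Extract the in-VM job exit code and the boinc_finish code, if present."""
    run = RUN_EXIT_RE.search(stderr_text)
    fin = FINISH_RE.search(stderr_text)
    return {
        "run_exitcode": int(run.group(1)) if run else None,
        "boinc_finish": int(fin.group(1)) if fin else None,
    }


def job_succeeded(stderr_text):
    """Assumption: the job did useful work only if its run exit code is 0."""
    return summarize(stderr_text)["run_exitcode"] == 0
```

In the log above the job reports run exitcode=1 while the wrapper still calls boinc_finish(0), so under this assumption the task would count as failed inside the VM even though BOINC sees a clean exit.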

ATLAS Task Failure – mv: source and destination are the same file (WU 231716588)

1 month 2 weeks ago

I encountered a compute error on an ATLAS native task that ran on my system for nearly two weeks before failing. Here's a summary of the failure:

Workunit: 231716588

Application version: ATLAS Simulation v3.01 (native_mt)

Client state: Compute error

Exit status: 195 (0x000000C3) EXIT_CHILD_FAILED

Host: ID 10873401

System: Linux Mint 22.1 (based on Ubuntu 24.04), CVMFS and Apptainer functional

Threads assigned: 5

Stderr shows this likely culprit:

mv: ‘ATLAS.root_0’ and ‘EVNT.44075481._002898.pool.root.1’ are the same file

This appears to crash the job early in its actual processing step, causing the wrapper to return an exit status of 195.

All CVMFS probes passed and the Apptainer container was loaded successfully from CVMFS. This seems to be an issue with the job script logic, likely in start_atlas.sh or its handling of the mv command near the start of execution.
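
To illustrate the kind of guard that avoids this class of error, here is a minimal sketch. It is written in Python for illustration only and is not the logic of start_atlas.sh, which I have not inspected; `safe_move` is a hypothetical helper name.

```python
import os
import shutil


def safe_move(src, dst):
    """Move src to dst, skipping the move if both names are the same file.

    os.path.samefile compares device and inode, so it also catches hard
    links and symlinks that point at the same underlying file.
    Returns True if a move happened, False if it was skipped.
    """
    if os.path.exists(dst) and os.path.samefile(src, dst):
        return False
    shutil.move(src, dst)
    return True
```

A wrapper with a check like this would treat an input file that is already in place as a no-op instead of a fatal error.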

Please let me know if you'd like additional logs or diagnostic output.

Thanks for all your hard work!

Tasks available / tasks not available

SixTrack message board

1 month 2 weeks ago

Well, it seems that SixTrack is no longer alive :-(

Waiting for XBoinc...

ERROR: failed to run pythia8 8.313

1 month 3 weeks ago

I will note that not all pythia8 tasks are failing. I very rarely get to sit down and look at what the VM is actually doing, but I've found that some tasks give that error and some don't.

CP5-CR2 is generating events
default-CD is generating events
default-noRap is generating events

qcdcr0 failed (by "failed" I mean the task didn't generate any events)
tune-A2 failed (two of them)
tune-AU2 failed
tune-AU2lox failed
vincia-default failed (two of them)
ropes failed

I haven't gotten any pythia6, sherpa or herwig tasks so I don't know about those.

Failed to execute payload:/bin/bash: Sim_tf.py: command not found

1 month 3 weeks ago

Over 500 validate errors during the past couple of days.
Yeah, I had to set my boxes to "No New Tasks" so they don't waste electricity on the bad WUs.

Windows 10 Theory task stalled or ...?

1 month 3 weeks ago

2025-04-16 20:55:35 (9456): Guest Log: [INFO] Excerpt from "cvmfs_config stat": VERSION HOST PROXY
2025-04-16 20:55:35 (9456): Guest Log: [INFO] 2.7.2.0 http://s1ihep-cvmfs.openhtc.io:8080 http://192.168.1.125:3128
These lines confirm:
- that openhtc.io is used for CVMFS (good!)
- that your local proxy 192.168.1.125 is used (even better!)
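
If you want to check this on several hosts, the two log lines above can be parsed with a few lines of Python. This is a minimal sketch that relies only on the excerpt format shown in this log; the function name and the returned keys are my own choice.

```python
from urllib.parse import urlsplit

# Header text exactly as it appears in the guest log quoted above.
HEADER = 'Excerpt from "cvmfs_config stat": VERSION HOST PROXY'


def parse_cvmfs_excerpt(stderr_text):
    """Return version, host and proxy from the line following the header."""
    lines = stderr_text.splitlines()
    for i, line in enumerate(lines[:-1]):
        if HEADER in line:
            # Keep only what follows the "[INFO]" tag on the next line.
            values = lines[i + 1].split("[INFO]", 1)[-1].split()
            if len(values) != 3:
                return None
            version, host, proxy = values
            return {
                "version": version,
                "host": host,
                "proxy": proxy,
                "host_name": urlsplit(host).hostname,
                "proxy_name": urlsplit(proxy).hostname,
            }
    return None
```

For the log above, `host_name` ends in `openhtc.io` and `proxy_name` is `192.168.1.125`, which matches the two points listed.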

New version v300.95

1 month 3 weeks ago

This new version provides improved handling for configuring CVMFS proxies.

Some Theory tasks on VirtualBox hang Probing /cvmfs/alice.cern.ch...

2 months ago

This happened to me as well but a project reset appears to have fixed it.

Missing "sixtrack test" column on the leaderboards

SixTrack message board

2 months ago

Thanks Laurence, however as of posting they have not been reinstated.

This will just cause confusion when 'B' appears again.

As it stands, there is already confusion, because the credits now shown on the leaderboard do not add up to one's total credit (if the user ran sixtrack test and/or ATLAS long).
And if sixtrack test does appear again, would users' credit start from 0 or from the previous score (assuming it is still in the database)? Transparency about where credit is given is needed to keep users' faith in the credit scoring.

CMS 70.91

2 months ago

OK that explains why that change happened

New native version v300.08