Hi,
The HWRF output task has been failing for a storm that I started two days ago.
It looks like a memory issue.
Error message:
ExitStatusException: batchexe('/apps/nco/4.9.1/intel/18.0.5.274/bin/ncks')['-4','-L','6','-O',u'/lfs4/HFIP/hwrfv3/Agnes.Lim/pytmp/H220_ctrl_intel18/2020070612/05E/intercom/fgat.t202007061200/wrfanl/wrfanl_d02_2020-07-06_12_00_00',u'/lfs4/HFIP/hwrfv3/Agnes.Lim/pytmp/H220_ctrl_intel18/com/2020070612/05E/tmp.wrfanl_d02_2020-07-06_12_00_00.part.B0OkrO'].in('/dev/null',string=False): non-zero exit status (returncode=-9)
slurmstepd: error: Detected 1 oom-kill event(s) in StepId=62374467.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.
tjet is being used for this task. How can I switch this task to one of the other jets?
Thanks
Agnes