Terminating LSF Batch Jobs
To cancel or terminate an Ansys EM LSF batch job, we recommend using the job monitoring UI rather than the bkill command. Terminating the job from the UI allows the batch job to shut down in an orderly fashion.
Using the LSF bkill command without the -s TERM option (which sends SIGTERM), or simply killing the job processes, may cause one or more of the following problems:
- Some engine processes are not shut down and continue to run.
- The LSF job is not fully removed.
- The project .lock file is not removed.
- MainWin core service processes (watchdog, mwrpcss and/or regss) are not stopped.
Some of these problems may interfere with the submission of additional LSF batch jobs. For example, it may be necessary to manually remove the project lock file before another batch job can be submitted for the same project. Leftover MainWin core service processes may also interfere with starting subsequent Ansys EM batch jobs. Normally, these processes time out and end 15 seconds after the Ansys Electromagnetics product shuts down. Any MainWin core service process (watchdog, mwrpcss and/or regss) that continues to run for more than 15 seconds after the product has stopped may be hung. Hung processes may need to be killed manually, but only after confirming that they are associated with an Ansys EM job that has finished or been terminated.
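As a rough aid for the check described above, the following sketch lists any MainWin core service processes owned by the current user. The function name list_mainwin_procs is hypothetical, and the sketch assumes a Linux host where pgrep is available. It only reports candidates: it does not kill anything and cannot tell which job a process belongs to, so that still has to be confirmed by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ansys EM or LSF): print "PID name" for
# each MainWin core service process owned by the current user.
# The process names are the ones listed in the text above.
list_mainwin_procs() {
    user=$(id -un)
    for name in watchdog mwrpcss regss; do
        # -x matches the exact process name, -u limits to one user,
        # -l prints the process name next to the PID.
        pgrep -l -u "$user" -x "$name"
    done
    # pgrep returns non-zero when nothing matches; that is not an error here.
    return 0
}
```

Calling list_mainwin_procs more than 15 seconds after the product has stopped and getting no output suggests that no MainWin processes are left over for that user.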
Stop a job cleanly - this preserves the results obtained so far:
bkill -s TERM <jobid>
Stop a job abruptly - results are most likely lost, and you must manually remove the project lock file:
bkill <jobid>
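The two commands above can be combined into a "clean first, abrupt only as a last resort" sketch. The function name stop_lsf_job, the 60-second default wait, and the 5-second polling interval are assumptions made for illustration. The sketch also assumes an LSF version whose bjobs supports the -o and -noheader options, and it treats an empty status as "job no longer known to LSF".

```shell
#!/bin/sh
# Hypothetical helper: stop_lsf_job JOBID [WAIT_SECONDS]
# Sends TERM first so the job can shut down in an orderly fashion, then
# falls back to a plain bkill only if the job is still present afterwards.
stop_lsf_job() {
    jobid=$1
    wait_s=${2:-60}          # assumed default grace period, in seconds

    bkill -s TERM "$jobid" || return 1

    elapsed=0
    while [ "$elapsed" -lt "$wait_s" ]; do
        # Print only the job state (for example RUN, DONE or EXIT).
        stat=$(bjobs -noheader -o stat "$jobid" 2>/dev/null)
        case $stat in
            DONE|EXIT|"") echo "clean"; return 0 ;;
        esac
        sleep 5
        elapsed=$((elapsed + 5))
    done

    # Grace period expired: abrupt stop. Results are most likely lost and
    # the project lock file must be removed manually afterwards.
    bkill "$jobid"
    echo "forced"
}
```

For example, stop_lsf_job <jobid> 120 would wait up to two minutes for an orderly shutdown before resorting to the abrupt bkill.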