To set up custom code references, you will perform the following steps on the client machine:
To begin, you must create a custom keyword (short word or phrase) that represents your
custom cluster type, and create custom copies of the job script, HPC commands file and
command scripts to be used for the job execution. For this example, our
<keyword> will be CUS_CLIENT. We will
start from the example scripts, which can be identified by the suffix
_CIS, short for "Client Integration Sample".
This sample is based on an LSF cluster implementation; however, you can modify it for any
cluster type.
Perform the following steps on the client machine:
Locate the directory [RSMInstall]\Config\xml. This directory contains the base scripts for all of the supported cluster types.
Make a copy of hpc_commands_CIS.xml and call the copy hpc_commands_CUS_CLIENT.xml, using your keyword in place of CUS_CLIENT.
Navigate to [RSMInstall]\RSM\Config\scripts.
Make a copy of cancelGeneric.py, statusGeneric.py, and submitGeneric.py. Rename the copies by replacing instances of Generic with _CUS_CLIENT (or your specific keyword). For example, rename cancelGeneric.py to cancel_CUS_CLIENT.py, using your keyword in place of CUS_CLIENT.
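If you prefer to script this copy-and-rename step, the short sketch below shows one way to do it with Python's shutil module. The keyword and installation paths shown are placeholders only; substitute your own keyword and the [RSMInstall]\Config\xml and [RSMInstall]\RSM\Config\scripts directories of your local installation.
import os
import shutil

# Placeholder values: substitute your own keyword and local RSM directories.
keyword = "CUS_CLIENT"
xml_dir = r"C:\Program Files\ANSYS Inc\v242\RSM\Config\xml"          # [RSMInstall]\Config\xml
scripts_dir = r"C:\Program Files\ANSYS Inc\v242\RSM\Config\scripts"  # [RSMInstall]\RSM\Config\scripts

# Copy the sample HPC commands file under the new keyword-based name.
shutil.copyfile(os.path.join(xml_dir, "hpc_commands_CIS.xml"),
                os.path.join(xml_dir, "hpc_commands_" + keyword + ".xml"))

# Copy each sample command script, replacing "Generic" in the name with "_<keyword>".
for base in ("cancel", "status", "submit"):
    src = os.path.join(scripts_dir, base + "Generic.py")
    dst = os.path.join(scripts_dir, base + "_" + keyword + ".py")
    shutil.copyfile(src, dst)
    print("Copied " + src + " -> " + dst)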
To customize the copied code to include desired changes, you will perform the following steps on the client machine:
As part of the setup, you must add an entry for your custom cluster keyword in the jobConfiguration.xml file, and reference the files that are needed for this cluster job type.
Navigate to [RSMInstall]/RSM/Config/xml.
Open the jobConfiguration.xml file and add an entry for your custom cluster job type. The sample entry below is for the CUS_CLIENT keyword that we established earlier, and points to the custom hpc_commands_CUS_CLIENT.xml file. Use your own keyword and HPC commands file name where appropriate.
<keyword name="CUS_CLIENT">
    <hpcCommands name="hpc_commands_CUS_CLIENT.xml">
    </hpcCommands>
</keyword>
The hpc_commands file provides the information on how commands or queries related to job execution are executed. The file can also refer to a number of environment variables.
Go to [RSMInstall]\Config\xml.
Edit your custom hpc_commands_CUS_CLIENT.xml file:
Referring to the example below, replace all of the Generic references with _CUS_CLIENT references (or your specific keyword), as was done in Making Copies of Sample Code Files Using a Custom Keyword above.
The example script is set up to be run on a modified LSF cluster. If you are running on a different cluster type, you will need to choose a different parsing script (or write a new one) depending on your cluster type. Parsing scripts are available for supported cluster types: LSF, PBS (Pro or TORQUE), UGE, and MSCC. They are named lsfParsing.py, pbsParsing.py, ugeParsing.py, and msccParsing.py respectively. If you are using an unsupported cluster type, you will need to write your own parsing script (see Parsing of the Commands Output in the Remote Solve Manager User's Guide). You will then need to edit references to the parsing script in your custom hpc_commands file.
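If you do need to write your own parsing script, the fragment below is a minimal, purely illustrative sketch of the string handling involved for a bsub-style submit response: skip everything before the marker line, then pull the job ID out of a line such as "Job <12345> is submitted to default queue <normal>.". It does not show the exact way RSM passes the command output to the parsing script or collects the declared output variables; see Parsing of the Commands Output in the Remote Solve Manager User's Guide for that contract.
import re

def parse_submit_output(output, marker="START"):
    # Ignore everything up to and including the marker line, then look for
    # an LSF-style job ID of the form "Job <12345> ...".
    seen_marker = False
    for line in output.splitlines():
        if not seen_marker:
            seen_marker = (line.strip() == marker)
            continue
        match = re.search(r"Job <(\d+)>", line)
        if match:
            return match.group(1)
    return None

# Example input only; a real parsing script would receive the submit command's output from RSM.
sample = "START\nJob <12345> is submitted to default queue <normal>.\n"
print("Job ID found: " + str(parse_submit_output(sample)))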
The example below uses the %RSM_HPC_SCRIPTS_DIRECTORY_LOCAL% variable, which is appropriate for custom client integrations where you are using the RSM scripts directory location. This variable is set automatically by RSM to the [RSMInstall]/RSM/Config/scripts directory.
<?xml version="1.0" encoding="utf-8"?>
<jobCommands version="3" name="Custom Cluster Commands">
  <environment>
    <env name="RSM_HPC_PARSE">LSF</env>
    <env name="RSM_HPC_PARSE_MARKER">START</env> <!-- Find "START" line before parsing according to parse type -->
    <env name="RSM_HPC_SSH_MODE">ON</env>
    <env name="RSM_HPC_CLUSTER_TARGET_PLATFORM">Linux</env> <!-- Still need to set RSM_HPC_PLATFORM=linx64 on Local Machine -->
  </environment>
  <submit>
    <primaryCommand name="submit">
      <properties>
        <property name="MustRemainLocal">true</property>
      </properties>
      <application><pythonapp>%RSM_HPC_SCRIPTS_DIRECTORY_LOCAL%/submit_CUS_CLIENT.py</pythonapp></application>
      <arguments>
      </arguments>
    </primaryCommand>
    <postcommands>
      <command name="parseSubmit">
        <properties>
          <property name="MustRemainLocal">true</property>
        </properties>
        <application><pythonapp>%RSM_HPC_SCRIPTS_DIRECTORY_LOCAL%/lsfParsing.py</pythonapp></application>
        <arguments>
          <arg>-submit</arg>
          <arg>
            <value>%RSM_HPC_PARSE_MARKER%</value>
            <condition>
              <env name="RSM_HPC_PARSE_MARKER">ANY_VALUE</env>
            </condition>
          </arg>
        </arguments>
        <outputs>
          <variableName>RSM_HPC_OUTPUT_JOBID</variableName>
        </outputs>
      </command>
    </postcommands>
  </submit>
  <queryStatus>
    <primaryCommand name="queryStatus">
      <properties>
        <property name="MustRemainLocal">true</property>
      </properties>
      <application><pythonapp>%RSM_HPC_SCRIPTS_DIRECTORY_LOCAL%/status_CUS_CLIENT.py</pythonapp></application>
      <arguments>
      </arguments>
    </primaryCommand>
    <postcommands>
      <command name="parseStatus">
        <properties>
          <property name="MustRemainLocal">true</property>
        </properties>
        <application><pythonapp>%RSM_HPC_SCRIPTS_DIRECTORY_LOCAL%/lsfParsing.py</pythonapp></application>
        <arguments>
          <arg>-status</arg>
          <arg>
            <value>%RSM_HPC_PARSE_MARKER%</value>
            <condition>
              <env name="RSM_HPC_PARSE_MARKER">ANY_VALUE</env>
            </condition>
          </arg>
        </arguments>
        <outputs>
          <variableName>RSM_HPC_OUTPUT_STATUS</variableName>
        </outputs>
      </command>
    </postcommands>
  </queryStatus>
  <cancel>
    <primaryCommand name="cancel">
      <properties>
        <property name="MustRemainLocal">true</property>
      </properties>
      <application><pythonapp>%RSM_HPC_SCRIPTS_DIRECTORY_LOCAL%/cancel_CUS_CLIENT.py</pythonapp></application>
      <arguments>
      </arguments>
    </primaryCommand>
  </cancel>
</jobCommands>
Note:
If you want to use other types of code, such as C++, you can simply place your compiled (executable) code in the <app> </app> section; arguments are not required. For Python, an interpreter is included in the Ansys Workbench installation, so that is what you see referenced here. If you want to use Python, you can simply replace <app> </app> with <pythonapp> </pythonapp>, as shown, and enter the Python code file name. Any custom code that you want to provide as part of the customization should also be located in the [RSMInstall]\RSM\Config\scripts directory corresponding to your local (client) installation. Alternatively, a full path to the script must be provided along with the name.
For this custom-client integration, code is required and provided for all of the functions in the HPC commands file, as in the LSF examples above. However, we will provide simple overviews of only the Submit and Cancel command scripts, illustrating their inner workings to help you modify them to suit your specific needs.
For a complete list of commands that can be customized, see Customizable Commands in the Remote Solve Manager User's Guide. This includes both job-specific commands such as Submit and Cancel, and file storage commands such as Upload and Download.
Important: The scripts submitGeneric.py and cancelGeneric.py that you copied and renamed to submit_CUS_CLIENT.py and cancel_CUS_CLIENT.py contain fully functional code. However, the code is quite complex, and going over it in detail is beyond the scope of this tutorial. Those scripts are intended for more advanced programmers who are customizing the code.
Here we have provided simpler, commented versions of these scripts with only basic functionality, so that the scripts may be more easily understood by newer programmers. We have illustrated the inner workings of these scripts so that you can modify them or write your own scripts based on your specific needs.
If you want to use the simpler scripts, you can simply replace the content in the original scripts with the following examples for submit_CUS_CLIENT.py and cancel_CUS_CLIENT.py.
"""
Copyright (C) 2015 ANSYS, Inc. and its subsidiaries. All Rights Reserved.
$LastChangedDate$
$LastChangedRevision$
$LastChangedBy$
"""
import sys
import ansLocale
import os
import tempfile
import os.path
import shutil
import glob
import shlex
import subprocess
import time
import platform
print('RSM_HPC_DEBUG=Submitting job...')
# See Below #1
print('Custom Coding goes here')
# SSH needs to use a username to login, either this is defined in 'RSM_HPC_PROTOCOL_OPTION1'
# which is the account name from the Cluster Tab or we will use the currently logged in username.
_sshUser = os.getenv("RSM_HPC_PROTOCOL_OPTION1")
if _sshUser == None:
    _sshUser = "%USERNAME%"
print('RSM_HPC_DEBUG=SSH account: ' + _sshUser);
# RSM_HPC_PROTOCOL_OPTION2 is the name of the cluster node that was entered in the Cluster Tab.
# We will reference 'RSM_HPC_PROTOCOL_OPTION2' below, and the command will not succeed
# if it is not defined, so check it and give a specific error if it is not set.
if os.getenv("RSM_HPC_PROTOCOL_OPTION2") == None:
    print("RSM_HPC_ERROR=RSM_HPC_PROTOCOL_OPTION2 (Remote Cluster Node Name) not defined")
    sys.exit(1)
# Check to see if the computer is running Windows; if so, PuTTY will need to be
# installed, and we reference the PuTTY command 'plink' here to connect to the remote machine.
# See Below #2
if platform.system() == 'Windows':
    _plinkArgs = "plink.exe -i \"%KEYPATH%\" " + _sshUser + "@%RSM_HPC_PROTOCOL_OPTION2% \" ''cd \"%RSM_HPC_STAGING%\"; bsub"
else:
    # NOTE: entire command sent to SSH is wrapped in quotes \" \"
    _plinkArgs = "ssh " + _sshUser + "@$RSM_HPC_PROTOCOL_OPTION2 \" ''cd \"$RSM_HPC_STAGING\"; bsub"
# Check that various environment variables are automatically set by RSM
# and use their values to determine what command line options need to be added to submission command.
# See Below #3
_numcores = os.getenv("RSM_HPC_CORES")
if not _numcores == None:
    _plinkArgs += " -n " + _numcores
_jobname = os.getenv("RSM_HPC_JOBNAME")
if not _jobname == None:
    _plinkArgs += " -J \\\"" + _jobname + "\\\""
_queue = os.getenv("RSM_HPC_QUEUE")
if not _queue == None and not _queue == "":
    _plinkArgs += " -q " + _queue
_staging = os.getenv("RSM_HPC_STAGING")
_plinkArgs += " -cwd \"" + _staging + "\""
_distributed = os.getenv("RSM_HPC_DISTRIBUTED")
if _distributed == None or _distributed == "FALSE":
    _plinkArgs += " -R 'span[hosts=1]'"
_nativeOptions = os.getenv("RSM_HPC_NATIVEOPTIONS")
if not _nativeOptions == None:
    _plinkArgs += " " + _nativeOptions
_stdoutfile = os.getenv("RSM_HPC_STDOUTFILE")
if not _stdoutfile == None:
    _plinkArgs += " -o " + _stdoutfile
_stderrfile = os.getenv("RSM_HPC_STDERRFILE")
if not _stderrfile == None:
    _plinkArgs += " -e " + _stderrfile
# Some environment variables were written directly into the string '_plinkArgs'
# and we want to replace those references with their actual values before submission.
print('RSM_HPC_DEBUG=plink arguments: ' + _plinkArgs);
_plinkArgs = os.path.expandvars(_plinkArgs);
# Other variables, like RSM_HPC_COMMAND, have environment variables
# referenced internally. For instance, this command has $AWP_ROOT242 embedded, and we want to keep
# that environment variable so that it expands on the cluster, since $AWP_ROOTxxx is used on the cluster, not locally.
# See Below #4
_bsubCommand = os.getenv("RSM_HPC_COMMAND")
if not _bsubCommand == None:
    # NOTE: entire command sent to SSH is wrapped in quotes \" \". See same note above
    # NOTE: Staging Directory and Command are also wrapped in quotes \" \"...
    _plinkArgs += " /bin/sh \"" + _staging + "/" + _bsubCommand + "\" \""
print('RSM_HPC_DEBUG=plink arguments: ' + _plinkArgs);
# See Below #5
_process = subprocess.Popen(shlex.split(_plinkArgs), stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, cwd=os.getcwd())
try:
    while _process.poll() == None:
        time.sleep(1)
except:
    pass
print("RSM_HPC_DEBUG=bsub completed")
# Printing the 'START' marker is optional; usually this is done only if
# earlier command output could be confused by the parser with the intended output.
print('START')
for line in _process.stdout:
    print(line)
sys.exit(0)

Note: This code references many RSM-set environment variables. For more information on what environment variables are available and their contents, see Custom Integration Environment Variables in the Remote Solve Manager User's Guide.
1. You can add any code you want to this section; code placed here will execute before the job is submitted. You can also stop the job from submitting by adding controls to the Submit command, if desired (see the example following this list).
2. Basic LSF command line starting point; we will continuously append arguments to this line as necessary to complete the command.
3. Most blocks are composed of three parts: storing an environment variable in a local variable, testing that the variable either is not empty or contains a special value, and then appending a flag to the command line based on the findings.
4. One of the final actions is to read the RSM_HPC_COMMAND variable and append it to the submission command. This command is created by RSM and contains the command line to run the ClusterJobs script, which completes the submission process. It creates the full command line for Ansys solvers by using the controls file created by the individual add-ins. Ansys suggests that you always use RSM_HPC_COMMAND to submit a job whenever possible because of the complexities of the Ansys solver command line for different solvers and on different platforms.
5. Popen finally "runs" the command we have been building. Then we wait for it to finish.
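As an illustration of note 1, the fragment below is a hypothetical replacement for the 'Custom Coding goes here' placeholder in submit_CUS_CLIENT.py. The check itself (requiring a site-specific SITE_PROJECT_CODE environment variable) is only an example; the point is that any code placed there runs before the job is submitted, and exiting with a non-zero status stops the submission.
import os
import sys   # both modules are already imported at the top of submit_CUS_CLIENT.py

# Hypothetical pre-submission check: SITE_PROJECT_CODE is an example
# site-specific variable, not one set by RSM.
_projectCode = os.getenv("SITE_PROJECT_CODE")
if _projectCode == None:
    print("RSM_HPC_ERROR=SITE_PROJECT_CODE must be set before jobs can be submitted")
    sys.exit(1)
print('RSM_HPC_DEBUG=Pre-submission checks passed for project ' + _projectCode)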
Because this script is a Submit script, there are many options for the bsub command, and because this is a custom client integration, the commands are wrapped in an SSH command so that they can be submitted from the local machine to the remote machine. Creating a custom script for the Cancel command is much simpler, although it contains the same basic parts. This process is addressed in the next section.
"""
Copyright (C) 2015 ANSYS, Inc. and its subsidiaries. All Rights Reserved.
$LastChangedDate$
$LastChangedRevision$
$LastChangedBy$
"""
import sys
import ansLocale
import os
import tempfile
import os.path
import shutil
import glob
import shlex
import subprocess
import time
import platform
print('RSM_HPC_DEBUG=Cancelling job...')
# See Below #1
print('Custom Coding goes here')
# SSH needs to use a username to login, either this is defined in 'RSM_HPC_PROTOCOL_OPTION1'
# which is the account name from the Cluster Tab or we will use the currently logged in username.
_sshUser = os.getenv("RSM_HPC_PROTOCOL_OPTION1")
if _sshUser == None:
    _sshUser = "%USERNAME%"
print('RSM_HPC_DEBUG=SSH account: ' + _sshUser);
# RSM_HPC_PROTOCOL_OPTION2 is the name of the cluster node that was entered in the Cluster Tab.
# We will reference 'RSM_HPC_PROTOCOL_OPTION2' below, and the command will not succeed
# if it is not defined, so check it and give a specific error if it is not set.
if os.getenv("RSM_HPC_PROTOCOL_OPTION2") == None:
    print("RSM_HPC_ERROR=RSM_HPC_PROTOCOL_OPTION2 (Remote Cluster Node Name) not defined")
    sys.exit(1)
# Code below is for cancelling a job on a standard LSF cluster.
# On Windows we must use the third-party PuTTY interface "plink" to
# access the remote machine; on Linux we can just use SSH.
# See Below #2
if platform.system() == 'Windows':
    _plinkArgs = "plink.exe -i \"%KEYPATH%\" " + _sshUser + "@%RSM_HPC_PROTOCOL_OPTION2% \" ''cd \"%RSM_HPC_STAGING%\"; bkill "
else:
    # NOTE: entire command sent to SSH is wrapped in quotes \" \"
    _plinkArgs = "ssh " + _sshUser + "@$RSM_HPC_PROTOCOL_OPTION2 \" ''cd \"$RSM_HPC_STAGING\"; bkill "
# Check that various environment variables are automatically set by RSM
# and use their values to determine what command line options need to be added to submission command.
# See Below #3
_jobid = os.getenv("RSM_HPC_JOBID")
if _jobid == None:
    print('RSM_HPC_JOBID not set')
    sys.exit(1)
else:
    _plinkArgs += _jobid
# NOTE: entire command sent to SSH is wrapped in quotes \" \".
# See same note above.
if platform.system() != 'Windows':
    _plinkArgs += " \""
_plinkArgs = os.path.expandvars(_plinkArgs);
print('RSM_HPC_DEBUG=' + _plinkArgs);
# See Below #4
_process = subprocess.Popen(shlex.split(_plinkArgs), bufsize=-1, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, cwd=os.getcwd())
try:
    while _process.poll() == None:
        time.sleep(1)
except:
    pass
print("RSM_HPC_DEBUG=bkill completed")
# See Below #5
for line in _process.stdout:
    print('RSM_HPC_DEBUG='+line)
_error = False
# See Below #5
for line in _process.stderr:
    print('RSM_HPC_ERROR='+line)
    _error = True
if _error:
    sys.exit(1)
sys.exit(0)
Note: This code references many RSM-set environment variables. For more information on what environment variables are available and their contents, see Custom Integration Environment Variables in the Remote Solve Manager User's Guide.
1. You can add any code you want to this section; code placed here will execute before the job is cancelled. Some code could also be run at the end of the script, just before sys.exit(0), if extra steps need to be taken after the job has been cancelled through the scheduler (see the example at the end of this section).
2. Basic LSF command line starting point. You would type bkill <job ID> at the command line to cancel a job in LSF. We will continuously append arguments to this line as necessary to complete the command; in this case, only the job number is added, in block 3.
3. Most blocks are composed of three parts: storing an environment variable in a local variable, testing that the variable is not empty, and then appending a flag to the command line (or stopping the command if an error is found) based on the findings. This environment variable is set by RSM. A list of these useful variables can be found in Environment Variables Set by RSM in the Remote Solve Manager User's Guide.
4. Popen finally "runs" the command we have been building. Then we wait for it to finish.
5. Finally, we simply print out all of the output, along with a line stating that the command has finished, so that we know it has run properly through RSM. Unlike the Submit command, the Cancel command has no output requirements, as described in the Cancel Command section of the Remote Solve Manager User's Guide.
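As an illustration of the second part of note 1, a hypothetical post-cancel step such as the following could be placed just before the final sys.exit(0) in cancel_CUS_CLIENT.py. The log file name and location are examples only; the point is that any extra bookkeeping to be done after the scheduler has cancelled the job goes here.
import os
import time   # both modules are already imported at the top of cancel_CUS_CLIENT.py

# Hypothetical post-cancel bookkeeping: append the cancelled job ID to a local log file.
_jobid = os.getenv("RSM_HPC_JOBID") or "unknown"   # already read earlier in the script
_logPath = os.path.join(os.getcwd(), "cancelled_jobs.log")
with open(_logPath, "a") as _log:
    _log.write(time.strftime("%Y-%m-%d %H:%M:%S") + " cancelled job " + _jobid + "\n")
print('RSM_HPC_DEBUG=Recorded cancellation of job ' + _jobid)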