3.3.2. Specifying RSM Configuration Settings

RSM provides an intelligent and responsive interface for defining configurations.

As you define an RSM configuration for an HPC resource, RSM validates paths and machine names that you enter, and presents settings that are specific to the HPC type and options that you have chosen.

The process consists of the following main steps:

  1. On the HPC Resource tab, specifying HPC information such as the HPC type, machine name of the cluster submit host, or URL of the Cloud portal

  2. On the File Management tab, specifying file management properties that determine how job input files get transferred to the HPC staging directory

  3. On the Queues tab, importing or adding HPC queues, and mapping them to RSM queues

When you finish specifying information on a tab, click Apply to validate the information and apply it to the configuration. Colored icons indicate the status of information on each tab:

If information is missing or invalid, a warning icon is displayed on the tab.

For detailed instructions and examples of specific configuration types, refer to the following:

3.3.2.1. Specifying HPC Resource Information

The first step in defining an RSM configuration is specifying information about the HPC resource, and how the client communicates with the resource. This information is specified on the HPC Resource tab in the editing pane:

General settings include:

HPC Configuration
Name

The name of the configuration as it appears in the HPC Resources list in the left pane. Do not use the name of an existing configuration.

HPC type

Choose one of the following from the drop-down:

  • ARC (Ansys RSM Cluster)

    Choose this option if you are not integrating with a third-party cluster or Cloud portal.

  • Windows HPC

  • LSF

  • PBS Pro

  • SLURM

  • UGE (SGE)

  • Custom

To configure RSM for job submission to a third-party Cloud, contact Ansys Customer Support for assistance.

If UGE (SGE) is selected as the HPC type, settings become available to specify Parallel Environment (PE) names:

Shared memory parallel processing enables you to distribute solve power over multiple processors on the same machine. Distributed parallel processing enables you to distribute solve power across multiple cores on a single node, or across multiple nodes. For information on configuring parallel environments, consult the documentation of the simulation product you are using.
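
If you are unsure which PE names are defined on your cluster, the standard UGE/SGE qconf utility can list them. The following Python sketch simply wraps that command; it assumes the UGE/SGE client tools are on the PATH of the machine where it runs, and the PE names shown in the comment are hypothetical.

    import subprocess

    def list_parallel_environments():
        # Wraps the standard UGE/SGE command "qconf -spl", which prints one
        # parallel environment (PE) name per line.
        result = subprocess.run(["qconf", "-spl"],
                                capture_output=True, text=True, check=True)
        return [line.strip() for line in result.stdout.splitlines() if line.strip()]

    if __name__ == "__main__":
        for pe in list_parallel_environments():
            print(pe)   # for example "smp" or "mpi"; actual names vary per site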

RSM integrates with Windows HPC, LSF, PBS Pro, SLURM, and Altair Grid Engine (UGE) without requiring job script customization. For custom cluster types, customization will likely be necessary to make job submission work. Refer to RSM Custom Integration.

If integrating with a cluster:

Submit host

Identify the machine that serves as the cluster submit host. This is the machine that handles job scheduling. In other words, it is the machine on which scheduling software is installed, or, in the case of an Ansys RSM Cluster (ARC), the machine on which the ARC Master service has been installed.

  • If jobs will be submitted to the cluster submit host from any other machine, enter the submit host's full domain name (for example, machineName.company.com), even if the machine on which you are currently working (the local machine) is the submit host.

  • If the machine on which you are currently working (the local machine) is the cluster submit host, and jobs will not be submitted to it from any other machine, you can enter localhost in this field.


Important:
  • Correctly identifying the submit host is a crucial step, as this is the key piece of information that enables RSM to communicate with the cluster.

  • If the current (local) machine is the submit host, do not enter localhost in the Submit host field if jobs will be submitted to this machine from other machines. You must use the full domain name in this case.

  • If you specify the full domain name for the Submit host and choose to use the RSM internal socket-based file transfer method, you will see a file transfer error such as "port number cannot be zero" when you try to use this configuration locally. If you intend to submit only local jobs using this configuration, change the Submit host value to localhost and use the OS file transfer method. Otherwise, ensure that the client machine and submit host are different machines.


Job submission arguments

Scheduler-specific arguments that will be added to the job submission command line of the job scheduler. For example, you can enter job submission arguments to specify the queue (LSF, PBS, SGE) or the node group (MS HPC) name. For valid entries, see the documentation for your job scheduler.
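
As an illustration only (the command line that RSM builds is internal to RSM), the sketch below shows how such arguments are typically appended to a scheduler's submission command. The bsub/qsub executables and the -q queue flag are standard for LSF, PBS, and SGE/UGE; the job script name is a placeholder.

    def build_submit_command(scheduler, job_script, extra_args=""):
        # Illustrative sketch, not RSM's actual implementation: append
        # user-supplied, scheduler-specific arguments to the submission command.
        base = {"LSF": ["bsub"], "PBS": ["qsub"], "SGE": ["qsub"]}[scheduler]
        return base + extra_args.split() + [job_script]

    print(build_submit_command("LSF", "solve.sh", "-q night"))
    # ['bsub', '-q', 'night', 'solve.sh']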

Use SSH protocol for inter and intra-node communication (Linux only)

This setting is used for distributed computing that involves multiple nodes. It specifies that RSM and solvers use SSH for communication between Linux execution nodes, and within the nodes themselves. If this option is deselected, RSH is used.

This setting is applied to all Linux HPC nodes, allowing solvers to run in distributed parallel mode.

When Ansys Fluent, Ansys CFX, Ansys Mechanical, and Ansys Mechanical APDL are configured to send solves to RSM, their solvers will use the same RSH/SSH settings as RSM.

If integrating with a custom cluster, portal, or Cloud:

Submit host or URL

If integrating with a custom cluster, enter the domain name of the cluster submit host (for example, machineName.company.com), then select the platform of this machine in the adjacent drop-down.

If integrating with a custom portal or Cloud, enter the URL of the resource to which jobs will be submitted, then select any value from the platform drop-down. (Since Custom is a general option that can be used for a variety of HPC resource types, a selection must be made in the OS drop-down in order to proceed. In the case of a portal or Cloud, the selected value is unimportant and will simply not be used.)

Custom HPC type

This is a keyword of your choosing that represents the HPC resource. It is a short word or phrase that you will append to the file names of your custom integration files. For more information, see RSM Custom Integration.

How does the client communicate with [HPC resource]?

(Available if you have selected ARC, LSF, PBS Pro, SLURM, UGE (SGE), or Custom as the HPC type)


Note:  Communication on the HPC Resource tab refers mainly to the submission of jobs to the cluster submit host. The transfer of job files is handled independently according to the settings on the File Management tab. For example, if you select Use SSH or custom communication to the submit host on the HPC Resource tab, this does not mean that you have to use SSH for file transfers. You may instead want to use a custom mechanism or different file transfer method altogether. See Specifying File Management Properties.


Able to directly submit and monitor HPC jobs

Specifies that the RSM client can use the RSM internal communication mechanism to directly submit jobs to the HPC resource, and monitor HPC jobs. This requires that an IT administrator open ports and adjust firewall settings on the HPC submit host to allow communication from the RSM client.

When the submit host is a remote machine, the RSM launcher service launches a user proxy process on the submit host which performs operations such as job submission, monitoring, and file transfer on the user's behalf. The RSM launcher service will use one port, while each user proxy process will use a separate port chosen by RSM. Ports for user proxy processes are chosen from a port range if one has been specified in the RSM application settings (see Specifying a Port Range for User Proxy Processes). Otherwise, RSM will randomly select a port that is free.
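
The sketch below illustrates the port-selection behavior described above; it is not RSM's actual code. It tries each port in a configured range and falls back to letting the operating system assign a free port when no range is configured. The port range shown is hypothetical.

    import socket

    def pick_user_proxy_port(port_range=None):
        # Illustration of the described behavior: prefer a free port from the
        # configured range; otherwise bind to port 0 so the OS picks a free one.
        if port_range:
            for port in range(port_range[0], port_range[1] + 1):
                with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
                    try:
                        s.bind(("", port))
                        return port
                    except OSError:
                        continue
            raise RuntimeError("no free port in the configured range")
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.bind(("", 0))
            return s.getsockname()[1]

    print(pick_user_proxy_port((55000, 55100)))   # hypothetical port range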

Use SSH or custom communication to the submit host

This option is only available when the submit host is a Linux machine and the RSM client is a Windows machine.

When a job is submitted from a Windows client to a remote Linux cluster, this option specifies that SSH will be used to communicate with the submit host instead of RSM's internal communication mechanism. Use this option if your IT administrator does not want to open ports and adjust firewall settings to allow communication from the RSM client, in adherence with your organization's IT policy. In this scenario, no RSM services need to be running on the remote submit host.

In the Account name field, specify the account name that the Windows RSM client will use to access the remote Linux submit host.


Note:
  • This account must be set up before this mode can be used. For information on configuring SSH to allow access from a Windows machine, see Configuring PuTTY SSH.

  • This is not an account that is specified in the Credentials section of the RSM Configuration application. The accounts listed there are RSM client accounts, not user proxy accounts.


3.3.2.2. Specifying File Management Properties

For jobs to be successfully executed on the HPC resource, client job files need to be staged in a location that the HPC resource can access. Also, job output files may need to be transferred back to the client after a job has been executed.

When you submit a job to RSM in a client application, a client working directory is created to which all necessary job files are written. The location of this directory is configured in the client application. For more information, refer to Setting Up Client Working Directories to Eliminate the Need for File Transfers.

If the client working directory is created under a shared directory that is visible to all HPC nodes (in other words, it is already inside the shared HPC staging directory), then it is possible for the job to be run directly in the working directory. Otherwise, if files need to be transferred from the client working directory to an HPC staging directory, you will need to specify this in your RSM configuration.

You will also need to specify where jobs will run on the HPC side.

File management properties are specified on the File Management tab in the editing pane.

File management properties will differ depending on the HPC type:

3.3.2.2.1. File Management Properties for a Cluster

When integrating with an Ansys RSM Cluster or third-party cluster, you will need to specify client-to-HPC and HPC-side file management properties:


Tip:  Use the Tell me more options to view detailed information about each file transfer method so that you can select the method that best suits your IT environment, file storage strategy, and simulation requirements.


Client-to-HPC File Management

Specify how files will get to the HPC staging directory.


Important:  The HPC staging directory must be a shared directory that is visible to all HPC nodes.


Operating system file transfer to existing network share (Samba, CIFS, NFS)

Use this option when the HPC staging directory is a shared location that client machines can access.

The RSM client finds the HPC staging directory via a Windows network share or Linux mount point that has been set up on the client machine, and copies files to it using the built-in operating system copy commands.

In the Staging directory path from client field, specify the path to the shared file system as the RSM client sees it. A Windows client will see the shared file system as a UNC path (for example, \\machine\shareName), while a Linux client will see a mounted directory (for example, /mounts/cluster1/staging).

In the Staging directory path on cluster field, specify a path to the HPC staging directory that all execution nodes can see. This is a path on the HPC side (for example, \\machine\staging on a Windows machine, or /staging on a Linux machine). It is the path to which the client-side network share or mount point is mapped. At minimum, the path must be the location shared to the network (for example, from the Samba configuration). The rest of the path can include subdirectories explicitly; otherwise, they will be inferred from the client path.

If jobs will be running directly on the client machine or a single-node cluster (for example, ARC operating in basic mode), the staging area may just be a preferred local scratch area, and may not need to be a shared path.

When using this option, you must ensure that the HPC staging directory is both visible to and writable by the client machine. For more information, see Enabling OS Copy to the HPC Staging Directory.
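
A quick way to confirm both conditions from the client is to attempt to create a temporary file in the share. The following Python sketch does exactly that; the path shown is a placeholder for your own staging share.

    import os, tempfile

    def staging_dir_is_usable(path):
        # Returns (True, "ok") if the staging directory is visible and writable
        # from this machine, otherwise (False, reason).
        if not os.path.isdir(path):
            return False, "directory is not visible from this machine"
        try:
            with tempfile.NamedTemporaryFile(dir=path, prefix="rsm_check_"):
                pass   # file is created and removed automatically
        except OSError as err:
            return False, "directory is not writable: %s" % err
        return True, "ok"

    print(staging_dir_is_usable(r"\\machine\staging"))   # placeholder UNC path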

No file transfer needed. Client files will already be in an HPC staging directory.

Use this option if the client files are already located in a shared file system that is visible to all cluster nodes.

When the client and cluster are running on the same platform, or the submit host is localhost, further action is not required in most cases.

When the client and cluster platforms differ, it is necessary to map the client-visible path to a cluster-visible path. The most common scenario is a user working on a Windows client whose work files are located in a shared network 'home' directory. For example, the user accesses the files as \\homeServer\myhome, but on the Linux cluster side the same location can be referred to as $HOME.

The Network share paths on the HPC resource table is displayed if the submit host is a Linux machine that is not localhost, and SSH is not being used. Use the table to specify network paths that map to HPC paths:

Client working directories in Windows UNC format (\\machine\shareName) are mapped to Linux format using these mappings. On the cluster side, each network share path is the root of the HPC staging directory, meaning that the Linux mapping directory is substituted for only the \\machine\shareName portion of the client UNC path. For example,

Client directory: \\homeServer\myHome\projects\project1\model1

Mapping directory: $HOME (expands to /nfs/homes/joed). This is the path to which the client-side network share or mount point is mapped. At minimum, the path must be the location shared to the network (from the Samba configuration, for example), as shown here. The rest of the path can include subdirectories explicitly; otherwise, they will be inferred from the client path.

Resulting Linux directory: /nfs/homes/joed/projects/project1/model1
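
The substitution can also be expressed programmatically, as in the sketch below. It only restates the mapping rule above (replace the \\machine\shareName portion with the Linux mapping directory and carry over the subdirectories); the share and mount names are the same illustrative values used in the example.

    import posixpath

    def map_client_path_to_cluster(client_unc, share_root_unc, cluster_mount):
        # Replace only the \\machine\shareName portion of the client UNC path
        # with the Linux mapping directory; subdirectories are carried over.
        if not client_unc.lower().startswith(share_root_unc.lower()):
            raise ValueError("client path is not under the mapped network share")
        subpath = client_unc[len(share_root_unc):].strip("\\")
        parts = [p for p in subpath.split("\\") if p]
        return posixpath.join(cluster_mount, *parts)

    print(map_client_path_to_cluster(
        r"\\homeServer\myHome\projects\project1\model1",   # client directory
        r"\\homeServer\myHome",                            # network share path
        "/nfs/homes/joed"))                                # mapping directory
    # /nfs/homes/joed/projects/project1/model1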


Note:
  • The client directory must be visible to all cluster nodes.

  • If jobs will be submitted from Linux clients to a Linux submit host, you may not need to enter a path in the Network share paths on the HPC resource table if the client working directory can be used as an HPC staging directory.


In all other cases (for example, SSH is being used), you will be prompted to specify the Staging directory path on cluster (or nothing at all):

For information on creating client working directories under a shared HPC directory, see Setting Up Client Working Directories to Eliminate the Need for File Transfers.

RSM internal file transfer mechanism

Use this option when the HPC staging directory is in a remote location that is not visible to client machines.

RSM uses TCP sockets to stream files from the client machine to the submit host machine. In this case you must specify the path to the directory where job files will be staged (as the cluster sees it):

When transferring files to a single-node cluster, it may not be necessary for the staging directory to be a shared path (for example, UNC path on Windows).
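
For illustration only (RSM's internal transfer mechanism and protocol are not documented here), the following sketch shows the general idea of streaming a file over a TCP socket from a client to a receiver on the submit host. The host name, port, and file name are placeholders.

    import socket

    def stream_file(host, port, local_path, chunk_size=64 * 1024):
        # Generic socket-based file streaming; NOT RSM's actual protocol.
        with socket.create_connection((host, port)) as conn, open(local_path, "rb") as src:
            while True:
                chunk = src.read(chunk_size)
                if not chunk:
                    break
                conn.sendall(chunk)

    # stream_file("submit-host.company.com", 55000, "model.dat")   # placeholder values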

External mechanism for file transfer (SCP, Custom)

Use this option when the HPC staging directory is in a remote location that is not visible to client machines, and you need to use an external mechanism such as SCP for file transfers.

For a Linux cluster, you can use either SCP via SSH or a Custom mechanism for file transfers to the HPC staging directory. For a Windows cluster, only the Custom option is available.

RSM has built-in support for SCP transfer. When the SCP protocol is used for communication and file transfer, it is not necessary to have any RSM components running on the remote submit host.

If using a Custom mechanism, the RSM launcher service may or may not need to be running on the submit host. This will depend on whether or not the client needs to communicate with RSM's launcher service on the remote side to handle the file transfer.

Whether you are using SCP via SSH or a Custom mechanism, you will need to specify the path to the HPC staging directory as the cluster sees it (for example, /staging on a Linux machine, or \\server\staging on a Windows machine):

If using a Custom mechanism for file transfers, you will also need to specify the Custom transfer type. This is a keyword of your choosing that represents the transfer type. It is a short word or phrase that you will append to the file names of your custom integration files:

If using SCP via SSH, the Custom transfer type is predetermined and cannot be edited.

Finally, specify the Account name that the RSM client will use to access the remote submit host. For a Linux cluster, if you selected Use SSH or custom communication to the submit host on the HPC Resource tab, the account name that you specified on that tab will automatically populate the Account field on the File Management tab, and cannot be edited. If you selected Able to directly submit and monitor HPC jobs on the HPC Resource tab, the Account field on the File Management tab can be edited. Note, however, that if you are using a Custom mechanism, the Account name is optional.

If using SCP via SSH, you can customize the SSH-specific cluster integration files to suit your needs. If using a Custom mechanism, you will need to create custom versions of these files. For details refer to Configuring SSH/Custom File Transfers.
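
Conceptually, an SCP transfer to the staging directory looks like the command wrapped in the sketch below. This is not how RSM invokes its built-in SCP support (that is driven by the cluster integration files); it only shows the kind of transfer being performed. It assumes passwordless (key-based) SSH access is already configured for the account, and all names are placeholders.

    import subprocess

    def scp_to_staging(local_path, account, submit_host, staging_dir):
        # Copy a job input file to the HPC staging directory using OpenSSH scp.
        # -B runs scp in batch mode so it fails instead of prompting for a password.
        remote = "%s@%s:%s/" % (account, submit_host, staging_dir)
        subprocess.run(["scp", "-B", local_path, remote], check=True)

    # scp_to_staging("model.dat", "joed", "submit-host.company.com", "/staging")   # placeholders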

HPC Side File Management

Available if the HPC type is set to a cluster type, or Custom.

Specify the working directory on the cluster side where job (or solver) commands will start running.

HPC staging directory

This option is recommended if one or both of the following is true:

  • There is a fast network connection between the cluster nodes and the HPC staging directory.

  • You are using a solver that produces fewer, relatively small files as part of the solution and does not make heavy use of local scratch space (for example, the CFX or the Fluent solver).

Scratch directory local to the execution node(s)

This option is recommended to optimize performance when one or both of the following is true:

  • There is a slower network connection between the cluster nodes and the HPC staging directory.

  • You are using a solver that produces numerous, relatively large files as part of the solution and makes heavy use of local scratch space (for example, Mechanical solvers).

All input files will be copied from the HPC staging directory into the local scratch directory. Then, when the job finishes running, the requested output files generated by the job will be copied back to the HPC staging directory.

In the Local HPC scratch directory field, enter the local path of a scratch directory on the cluster node (for example, C:\Shares\Local_Share\ScratchDir on Windows). You can enter the path of the scratch directory manually, or use an environment variable in the format %VAR%.

If the cluster is running on Windows, you must create a network share path for the local scratch directory on each node. In the Share path for local scratch field, enter the network share path of the local scratch directory. This path starts with a non-editable [ExecutionNode] variable. When a job is submitted, the [ExecutionNode] variable will be replaced with the actual machine name of each execution node assigned to the job.
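
The sketch below illustrates how the [ExecutionNode] variable in the share path resolves to a per-node path; the share template and node names are hypothetical.

    def resolve_scratch_share(share_template, execution_node):
        # Replace the [ExecutionNode] placeholder with the actual machine name
        # of an execution node assigned to the job.
        return share_template.replace("[ExecutionNode]", execution_node)

    template = r"\\[ExecutionNode]\Local_Share\ScratchDir"   # hypothetical share path
    for node in ("node01", "node02"):                        # hypothetical execution nodes
        print(resolve_scratch_share(template, node))
    # \\node01\Local_Share\ScratchDir
    # \\node02\Local_Share\ScratchDir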

By default, job files will be deleted from the HPC staging directory after the job has run. Choosing Keep job files in staging directory when job is complete may be useful for troubleshooting failed jobs. However, retained job files will consume disk space, and require manual removal.

3.3.2.3. Defining and Testing RSM Queues

When you choose to submit a job to RSM, you must choose an RSM queue for the job. An RSM queue maps to a queue on the HPC side, and provides a way to link to the RSM configuration.

RSM queues are defined on the Queues tab in the editing pane:

Defining an RSM Queue

RSM queues are the queues that users will see in client applications when they choose to submit jobs to RSM. RSM queue names can match the names of queues or queueing mechanisms defined on the HPC resource. The important thing to remember is that each RSM queue name must be unique.

RSM provides two ways of defining RSM queues: you can either import a list of HPC queues and define an RSM queue for each HPC queue, or you can manually add an RSM queue and assign an HPC queue to it.

  • To import a list of HPC queues, or refresh the list if you have imported HPC queues previously, click  . Then, for each HPC queue, double-click in the RSM Queue field and specify a unique RSM queue name.

  • To add an RSM queue to the list, click  , then specify a unique name for the queue in the RSM Queue field.

    Double-click in the HPC Queue field and enter the name of an existing HPC queue. RSM will check to see if the HPC queue is valid.

Enabling/Disabling an RSM Queue

When you create an RSM queue it is enabled by default. This means that it will be available for selection in client applications.

To control whether or not an RSM queue is available for use, select or clear the queue's Enabled check box.

Testing an RSM Queue

When you test an RSM queue, RSM sends a test job to the HPC resource via the associated HPC queue.

To test an RSM queue, click   in the queue's Test column, or right-click the queue in the tree in the left pane and select Submit Test.


Note:
  • You may need to click Apply on the Queues tab before being able to submit test jobs.

  • Only enabled queues can be tested.


The status of the test is displayed in the Status column:

  • Job is being submitted
  • Job is queued
  • Job is in progress
  • Job completed successfully
  • Job completed successfully and released
  • Job aborted
  • Job aborted and released
  • Job failed
  • Job failed and released

When a job is running, the test button is replaced by an abort button, enabling you to abort the test job if desired.

Performing an Advanced Queue Test

In an advanced test of an RSM queue you can select a client working directory in which to run the test. This is a good way of testing whether or not files are being transferred to the HPC staging directory (if, for example, the client working directory is a network share of the HPC staging directory).

To perform an advanced RSM queue test:

  1. Right-click the queue in the tree in the left pane, then select Advanced Test.

  2. In the Advanced Test dialog box, select or specify the client directory that you want to use for the test job. You can leave it set to the default %TEMP% environment variable, or enter a path or environment variable manually. Manually entered items will be added as drop-down options.

  3. If you want to clean up the client directory after the test job is done, enable the Cleanup Client Directory check box.

  4. Click Submit.

The status of the test is displayed in the Status column of the queue table, as described in Testing an RSM Queue.

Viewing a Test Job Report

If you have submitted a test job to an RSM queue, you can view a detailed test report by clicking   in the queue's Report column.

Saving a Test Job Report

You can save a job report to an HTML file that can be shared with others.

To save the job report:

  1. Click   in the job report window.

  2. Accept or specify the save location, file name, and content to include.

  3. Click Save.

Deleting an RSM Queue

You can delete an RSM queue that appears on the Queues tab in one of three ways:

  • Select the queue(s) in the queue list, then click   on the queues toolbar.

  • Right-click the queue in the queue list, then select Delete Selected RSM Queue(s).

  • Right-click the queue in the tree in the left pane, then select Delete Queue. Note that only enabled queues appear in the tree.