Large Scale DSO Deployment/Configuration
Major Limitation
In the RSM environment, large scale DSO can only be enabled for one product.
Troubleshooting hints (RSM environment only)
The shared drive read/write requirement is a new constraint introduced with large scale DSO. If regular DSO jobs run but large scale DSO jobs fail, one possible cause is that the RSM service does not have privileges to read and write to the project folder on the shared drive.
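One way to test this cause is to probe the project folder for read/write access. The following is a minimal sketch, not an Ansys tool: the function name `can_read_write` is hypothetical, and the result is only meaningful if the script runs under the same account as the RSM service.

```python
import os
import tempfile


def can_read_write(folder):
    """Return True if the current account can create, read back,
    and delete a file in `folder`; return False otherwise."""
    try:
        # Creating the probe file fails if the folder is missing or not writable.
        fd, path = tempfile.mkstemp(prefix="dso_rw_probe_", dir=folder)
    except OSError:
        return False
    try:
        with os.fdopen(fd, "w") as f:
            f.write("probe")
        with open(path) as f:
            return f.read() == "probe"
    except OSError:
        return False
    finally:
        # Clean up the probe file; ignore errors if it is already gone.
        try:
            os.remove(path)
        except OSError:
            pass
```

Calling `can_read_write` with the project folder path on each node, while logged in as the RSM service account, indicates whether that account has the required privileges. A result of `False` on any node points to a permissions or path problem on that node.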
Windows Cluster Configuration
- Shared drive for projects – The cluster must provide a shared drive that hosts job inputs, and the submitted project must be located on it. The shared drive must be accessible using the same path on every node of the cluster.
- Temp directory configuration
The temp directory must be on local storage or on storage with equivalent speed characteristics; that is, the I/O rate of the storage should not vary with network traffic.
The temp directory on each host must have sufficient space to hold the results database for the variations solved on that host. This storage is freed at the end of the analysis.
The amount of required space depends on the number of engines per node and the cumulative number of variations solved on that node.
- RSM environment – In the case of supported scheduler environments, no extra configuration is needed. In the case of the RSM environment, the following additional steps are needed:
RSM must be running on all nodes of the cluster. The credentials of the RSM service must allow read/write access to the shared drive, because the remote engine processes are launched using the credentials of the RSM service.
The desktopjob program (desktopjob.exe) must be registered with the Ansys Electromagnetics RSM service using desktopjob -regserver. To confirm that the registration succeeded, check that the desktopjob entry in the <RSM-installation-folder>/AnsoftRSMService.cfg file is valid.
- RSM and Ansys Electromagnetics products are installed either locally on each node of the cluster (local installation) or on a single shared drive available to all nodes of the cluster (network installation).
- Registration of desktopjob.exe with the RSM service depends on the installation type:
- Network installation – desktopjob.exe is registered with the RSM service once, on any node of the cluster.
- Local installation – Since each node has its own RSM installation, desktopjob.exe must be registered with RSM on each node.
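The registration check described above (inspecting the desktopjob entry in AnsoftRSMService.cfg) can be partly automated. The sketch below is an illustration under assumptions: the internal format of the .cfg file is not described here, so the code only does a plain-text search for lines mentioning desktopjob, and the helper name `find_desktopjob_entries` is hypothetical. Whether a reported entry is valid still has to be judged by reading it.

```python
def find_desktopjob_entries(cfg_path):
    """Return the stripped lines of the config file that mention
    'desktopjob' (case-insensitive). An empty list suggests that
    the registration has not been recorded in this file."""
    with open(cfg_path, encoding="utf-8", errors="replace") as f:
        return [line.strip() for line in f if "desktopjob" in line.lower()]
```

For a local installation, the check would be repeated against the .cfg file on every node; for a network installation, a single check against the shared installation folder should be enough.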
Note:
IMPORTANT! The RSM service must be started using the credentials of a non-system admin account, which has read/write permissions to the project's shared drive. If the RSM service runs as a system user, large scale DSO jobs will fail.
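The temp directory space requirement from the cluster configuration list can also be sanity-checked before a job is submitted. The sketch below assumes a simple linear model that is not part of the product documentation: required space equals the cumulative number of variations solved on the node multiplied by an average results size per variation, which you would have to measure from a trial run. The function names, parameters, and the safety factor are all hypothetical.

```python
import shutil


def required_temp_bytes(variations_on_node, bytes_per_variation):
    """Estimate temp space for one node (assumed linear model):
    cumulative variations on the node times average size per variation."""
    return variations_on_node * bytes_per_variation


def has_enough_temp_space(temp_dir, variations_on_node,
                          bytes_per_variation, safety_factor=1.2):
    """Return True if the free space on the volume holding `temp_dir`
    covers the estimate, padded by an assumed safety factor."""
    needed = required_temp_bytes(variations_on_node, bytes_per_variation)
    free = shutil.disk_usage(temp_dir).free
    return free >= needed * safety_factor
```

Because the number of variations a node ends up solving grows with the number of engines running on it, `variations_on_node` should be estimated per node rather than for the cluster as a whole.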