Autoscaling Cluster Properties

The following tables describe the configurable properties of autoscaling clusters in traditional HPC and Kubernetes environments.

Table 1: Configurable Autoscaling Properties for LSF, PBS, Slurm, and UGE Clusters

For each property, the name shown in the web UI is given first, followed by the corresponding property in the scaling.json file in parentheses, its description, and any notes.

Category: General

Name ("name")
    The name of the cluster as it appears in the list of compute resources.

Task Directory Cleanup ("evaluator_task_directory_cleanup")
    Cleanup policy for task execution/working directories. Possible values are Always ("always"), On success ("on_success"), and Never ("never"). The default is Always.

Category: Backend

Shared Directory ("shared_dir")
    The cluster working directory for job files (inputs and outputs).

Note: The settings in the "backend" section are global defaults for the cluster. Properties such as Local Scratch, Exclusive, and Distributed can also be specified in the properties of individual applications and in job definitions, enabling you to override the global defaults for specific applications or jobs. Property values specified in the job definition take precedence over values specified in application settings; if values have not been specified in either of those places, the global default values are used.

Local Scratch ("use_local_scratch")
    Whether a scratch directory local to the compute node(s) should be used as the job working directory ("true" or "false").

Local Scratch Directory ("local_scratch_dir")
    The path of the local scratch directory (for example, /tmp/scratch).

Scheduler Type ("scheduler_type")
    The scheduler to which the autoscaler submits jobs. Options are LSF, PBS, Slurm, and UGE.

Default Scheduler Queue ("scheduler_queue_default")
    The queue to use when one is not specified in a task definition. The default is None ("null").

Default Number of Cores ("num_cores_default")
    CPU limit applied to each evaluator instance.

Exclusive ("exclusive_default")
    Request that the scheduler hold the node(s) exclusively for one request. The default is false.

Distributed ("distributed_default")
    Allow the scheduler to provide multiple machines to fulfill the request. The default is true.

Category: Process Runner

Type ("plugin_name")
    When set to Service User, orchestrator jobs will run as the service user.

    When set to REST Launcher in the web UI ("process_launcher_service" in the configuration file), orchestrator jobs will run as the submitting user (applies to job schedulers only: LSF, PBS, Slurm, and UGE). This requires deployment of the Process Launcher service.

    When using the Process Launcher, the additional Process Runner properties listed below become available (an illustrative sketch of these settings follows Table 1).

 
Timeout (s) ("timeout")
    The maximum amount of time to wait for the Process Launcher to run the command. The default is 30 seconds.

Launcher URL ("launcher_url")
    The URL of the Process Launcher service (for example, https://hostname:4913).

Verify SSL ("verify_ssl")
    Whether SSL certificate verification is enabled. The default is false.

Shell ("shell")
    Whether to start the Process Launcher in a shell. The default is true. Ansys does not recommend changing this value.

Category: Scaling

Type ("plugin_name")
    The backend plugin that defines the scaling strategy. The default is Maximum Available Resources ("max_available_resource_scaling").

Category: Compute Resources

Instances ("num_instances")
    Maximum number of evaluator instances that can run at one time. The default is blank (no limit).

Num Cores ("num_cores")
    Maximum number of cores that can be launched to the backend resource. The default is blank (no limit).

Memory [B] ("memory")
    Maximum amount of memory, in bytes, that can be launched to the backend resource. The default is blank (no limit).

Disk Space [B] ("disk_space")
    Maximum amount of disk space, in bytes, that can be launched to the backend resource. The default is blank (no limit).

Platform ("platform")
    The cluster platform. Only Linux clusters are currently supported.

Custom ("custom")
    Allows the addition of custom properties.

Category: Applications

Applications ("available_applications")
    Ansys applications will be automatically detected if installed in a standard location. You can also manually add applications to the autoscaling configuration if necessary.

Note: Each application has a set of properties that can be defined specifically for that application. Properties such as Local Scratch, Exclusive, and Distributed are also available in the "backend" section of the configuration file (as global defaults) and in job definitions. Property values specified in the job definition take precedence over values specified in application settings; if values have not been specified in either of those places, the global default values are used.
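
To make the mapping from Table 1 to the scaling.json file more concrete, the sketch below assembles an illustrative configuration for a hypothetical Slurm cluster. It is written as a Python dictionary rather than raw JSON only so that assumptions can be annotated with comments. The property keys are taken from Table 1, but the nesting of the "scaling", "compute_resources", and "process_runner" sections, the placement of the top-level keys, and all of the values are assumptions, not a verbatim scaling.json; the file generated for your own cluster is the authoritative reference.

    # Illustrative sketch only. Keys come from Table 1; the section nesting
    # (other than "backend", which Table 1 mentions) and all values are assumptions.
    import json

    slurm_cluster = {
        "name": "slurm-onprem",                        # assumed example name
        "evaluator_task_directory_cleanup": "always",  # top-level placement assumed
        "backend": {
            "shared_dir": "/shared/hps/jobs",          # assumed example path
            "use_local_scratch": True,
            "local_scratch_dir": "/tmp/scratch",
            "scheduler_type": "slurm",                 # value casing assumed
            "scheduler_queue_default": None,           # serializes to JSON null
            "num_cores_default": 4,
            "exclusive_default": False,
            "distributed_default": True,
        },
        "process_runner": {                            # section name assumed from the web UI category
            "plugin_name": "process_launcher_service", # REST Launcher: jobs run as the submitting user
            "timeout": 30,
            "launcher_url": "https://hostname:4913",
            "verify_ssl": False,
            "shell": True,                             # changing this is not recommended
        },
        "scaling": {                                   # section name assumed from the web UI category
            "plugin_name": "max_available_resource_scaling",
        },
        "compute_resources": {                         # section name assumed from the web UI category
            "num_instances": 10,
            "num_cores": 128,
            "platform": "linux",                       # value format assumed; only Linux is supported
        },
        "available_applications": [],                  # Ansys applications are normally auto-detected
    }

    print(json.dumps(slurm_cluster, indent=2))

Running the script prints the dictionary as formatted JSON, which can be compared against the scaling.json generated for an existing cluster.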


Table 2: Configurable Autoscaling Properties for Kubernetes Clusters

For each property, the name shown in the web UI is given first, followed by the corresponding property in the scaling.json file in parentheses and its description.

Category: General

Name ("name")
    The name of the cluster as it appears in the list of compute resources.

Task Directory Cleanup ("evaluator_task_directory_cleanup")
    Cleanup policy for task execution/working directories. Possible values are Always ("always"), On success ("on_success"), and Never ("never"). The default is Always.

Category: Backend

Shared Directory ("shared_dir")
    The cluster working directory for job files (inputs and outputs).

Default number of cores ("num_cores_default")
    CPU limit applied to each evaluator instance.

Default memory limit ("memory_limit")
    Memory limit applied to each evaluator instance. For a list of supported formats, see https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/.

Namespace ("namespace")
    The namespace where target resource objects will be scaled up and down using KEDA.

Target Resource Kind ("target_resource_kind")
    The kind of Kubernetes resource that should be scaled.

    If evaluators need to be spawned to execute the pending jobs of a specific project and task definition, set this to Job ("job"). In the properties of available applications, the Resource Name ("resource_name") should be set to the image name with tag.

    If evaluators need to be spawned to execute the pending jobs of all projects that specify the same application in their task definitions, set this to Deployment ("deployment"). In the properties of available applications, the Resource Name ("resource_name") should be set to the deployment name.

    See also: Autoscaling with KEDA.

Category: Scaling

Type ("plugin_name")
    The backend plugin that defines the scaling strategy. The default is Kubernetes Resource Scaling ("kubernetes_resource_scaling").

Category: Applications

Applications ("available_applications")
    Ansys applications will be automatically detected if installed in a standard location. You can also manually add applications to the autoscaling configuration if necessary.
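
As with Table 1, the following sketch is only an illustration of how the Table 2 keys might fit together, here for a hypothetical Kubernetes cluster that scales a Deployment per application. It is again written as an annotated Python dictionary so that assumptions can be marked in comments; the section nesting, the layout of the application entry, and all names and values are assumptions rather than a verbatim scaling.json.

    # Illustrative sketch only. Keys come from Table 2; the section nesting,
    # the application entry layout, and all names/values are assumptions.
    import json

    k8s_cluster = {
        "name": "k8s-eval-pool",                       # assumed example name
        "evaluator_task_directory_cleanup": "on_success",
        "backend": {                                   # section name assumed from the web UI category
            "shared_dir": "/shared/hps/jobs",          # assumed example path
            "num_cores_default": 4,
            "memory_limit": "8Gi",                     # Kubernetes quantity format
            "namespace": "hps-evaluators",             # assumed namespace
            "target_resource_kind": "deployment",      # scale a Deployment; "job" targets a specific project's jobs
        },
        "scaling": {                                   # section name assumed from the web UI category
            "plugin_name": "kubernetes_resource_scaling",
        },
        "available_applications": [
            {
                # With "deployment", resource_name is the deployment name;
                # with "job", it would instead be the image name with tag.
                "resource_name": "fluent-evaluator",   # assumed deployment name
            },
        ],
    }

    print(json.dumps(k8s_cluster, indent=2))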