17.4.1. Optimizing Mesh Partitioning

Partitioning in Ansys CFX Parallel is a preprocessing step. In CFX-Solver Manager, you can choose one of several partitioning methods (see Partitioner Tab in the CFX-Solver Manager User's Guide), and you can specify the maximum number of partitions (up to 16384).

In general, you should try to follow these guidelines wherever possible:

Do not run small jobs in parallel

For tetrahedral meshes, aim for a minimum of 30,000 nodes per partition. With smaller partitions, you are unlikely to see any significant performance increase and may even see a parallel slowdown.

For hexahedral meshes, good parallel performance improvements are usually seen only with at least 75,000 nodes per partition.

These numbers are machine dependent and can be higher or lower. Dual-CPU PCs usually give poorer performance because of limited memory bus bandwidth: the two CPUs can demand more memory access than the memory bus can provide.
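
As a rough illustration of the node-count guideline, the following Python sketch estimates the largest partition count that still keeps every partition at or above the quoted minimum. It assumes an even split of nodes across partitions, and the function and dictionary names are hypothetical (they are not part of CFX). Because the thresholds are machine dependent, treat the result as a starting point only.

```python
# Rule-of-thumb minimum nodes per partition, taken from the guideline above.
# These values are machine dependent; adjust them for your hardware.
MIN_NODES_PER_PARTITION = {
    "tetrahedral": 30_000,
    "hexahedral": 75_000,
}


def max_useful_partitions(total_nodes: int, mesh_type: str) -> int:
    """Return the largest partition count that keeps each partition at or
    above the guideline minimum, assuming nodes are split evenly.

    Returns 1 (a serial run) when the mesh is smaller than one minimum-size
    partition.
    """
    minimum = MIN_NODES_PER_PARTITION[mesh_type]
    return max(1, total_nodes // minimum)


if __name__ == "__main__":
    print(max_useful_partitions(1_200_000, "tetrahedral"))  # 40
    print(max_useful_partitions(1_200_000, "hexahedral"))   # 16
    print(max_useful_partitions(20_000, "tetrahedral"))     # 1
```

A result of 1 corresponds to the first guideline: the job is too small to benefit from a parallel run.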

Always try to use the MeTiS partitioning method

MeTiS generates the best partitions with respect to the size of the overlap regions between the partitions, and is therefore the favored, and the default, partitioning method in CFX. Use one of the alternative partitioning methods only if MeTiS fails, or if the memory overheads are unacceptably high.

Use a sensible number of partitions

Partitioning a mesh creates overlap regions at the partition interfaces. These regions are responsible for communication and memory overhead during a parallel run. During partitioning, CFX prints diagnostic information about partition overlaps to the CFX-Solver Manager text window and the CFX-Solver Output file. The number of overlap nodes, expressed as a percentage of the total number of mesh nodes, should ideally be less than 10% for efficient partitioning. Values greater than 20% will impair performance and are not recommended.
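
The overlap check can be sketched in a few lines of Python. The function names and the category labels ("efficient", "marginal", "not recommended") are hypothetical and chosen only for illustration; the node counts would be read manually from the partitioning diagnostics, since this sketch does not parse the CFX-Solver Output file.

```python
def overlap_percentage(overlap_nodes: int, total_nodes: int) -> float:
    """Overlap nodes expressed as a percentage of all mesh nodes."""
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    return 100.0 * overlap_nodes / total_nodes


def classify_overlap(percentage: float) -> str:
    """Apply the 10% and 20% guideline thresholds.

    The labels are illustrative; the guideline itself only says that
    values below 10% are ideal and values above 20% are not recommended.
    """
    if percentage < 10.0:
        return "efficient"
    if percentage <= 20.0:
        return "marginal"
    return "not recommended"


if __name__ == "__main__":
    for overlap in (45_000, 75_000, 125_000):
        pct = overlap_percentage(overlap, 500_000)
        print(overlap, classify_overlap(pct))
```

If the result falls in the "not recommended" range, reducing the number of partitions is the usual first step.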

Store the partition file

As mentioned above, partitioning is a preprocessing step. By default, Ansys CFX performs mesh partitioning automatically as part of a parallel run. The partition file is therefore treated as an intermediate file, and is deleted at the end of the parallel run. However, if the machine that runs the leader process does not have enough memory to run the partitioner, or if several parallel runs are performed with the same number of partitions, it is advisable to first store the partition file permanently with a partition-only run of CFX-Solver.

Select an appropriate partitioning mode

For multi-domain cases, you can set Multidomain Option on the Partitioner tab by selecting Automatic, Independent Partitioning, or Coupled Partitioning. For details, see Partitioner Tab in the CFX-Solver Manager User's Guide.

If the case does not involve particle transport, the Automatic option is the same as the Coupled Partitioning option; otherwise it is the same as the Independent Partitioning option.

Coupled Partitioning provides better partitions with respect to the size of overlap regions between partitions and thus exhibits improved parallel performance. However, Coupled Partitioning requires more memory for the partitioning run: the partitioning memory scales with the size of the largest domain for Independent Partitioning and with the total size of all domains for Coupled Partitioning. Furthermore, the performance of particle transport calculations may degrade when using Coupled Partitioning.
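
The following Python sketch restates the two rules above: how the Automatic option resolves, and which domain size drives the partitioning memory. It uses node count as a stand-in for "domain size", which is an assumption, and the function names are hypothetical. It does not estimate actual memory in bytes.

```python
from typing import Sequence


def resolve_multidomain_option(option: str, has_particle_transport: bool) -> str:
    """Resolve 'Automatic' to the option it is equivalent to."""
    if option != "Automatic":
        return option
    if has_particle_transport:
        return "Independent Partitioning"
    return "Coupled Partitioning"


def memory_scaling_size(domain_nodes: Sequence[int], option: str) -> int:
    """Return the node count that partitioning memory scales with.

    Independent Partitioning: the largest single domain.
    Coupled Partitioning: the sum of all domains.
    """
    if option == "Independent Partitioning":
        return max(domain_nodes)
    if option == "Coupled Partitioning":
        return sum(domain_nodes)
    raise ValueError(f"unresolved or unknown option: {option!r}")


if __name__ == "__main__":
    domains = [800_000, 300_000, 150_000]  # hypothetical three-domain case
    option = resolve_multidomain_option("Automatic", has_particle_transport=False)
    print(option, memory_scaling_size(domains, option))
    # Coupled Partitioning 1250000
    print(memory_scaling_size(domains, "Independent Partitioning"))
    # 800000
```

In this hypothetical case, Coupled Partitioning scales with about 1.56 times as many nodes as Independent Partitioning, which is the trade-off to weigh against the smaller overlap regions.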

Perform partition smoothing

Partition smoothing attempts to minimize the surface area of partition boundaries by swapping vertices between partitions. Partition smoothing reduces communication overhead and improves solver robustness.

During partitioning, "Iso-Partition Connection Statistics" are written to the CFX-Solver Output file. These include:

  • the number of smoothing sweeps performed

  • the number of times that any vertices are swapped between partitions

  • the number of vertices that have a certain percentage of their connections in the same partition

Vertices with a low percentage of neighbors in the same partition tend to reduce numerical stability. The smoothing algorithm performs successive sweeps until either the number of these poorly connected vertices has been minimized, or the maximum number of sweeps has been performed.
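
To make the connection statistic concrete, the Python sketch below computes, for each vertex of a small graph, the percentage of its connections that lie in the same partition, and lists the vertices that fall below a threshold. This is only an illustration of the quantity described above: it is not the CFX smoothing algorithm, the function names are hypothetical, and the 50% threshold is an arbitrary assumption.

```python
from typing import Dict, Iterable, List, Tuple


def same_partition_percentages(
    edges: Iterable[Tuple[int, int]],
    partition_of: Dict[int, int],
) -> Dict[int, float]:
    """For each connected vertex, return the percentage of its connections
    whose other end is in the same partition."""
    same = {v: 0 for v in partition_of}
    total = {v: 0 for v in partition_of}
    for a, b in edges:
        # Each undirected edge counts once for each of its two end vertices.
        for v, w in ((a, b), (b, a)):
            total[v] += 1
            if partition_of[v] == partition_of[w]:
                same[v] += 1
    return {v: 100.0 * same[v] / total[v] for v in partition_of if total[v] > 0}


def poorly_connected(percentages: Dict[int, float], threshold: float = 50.0) -> List[int]:
    """Vertices strictly below the (assumed) threshold, in sorted order."""
    return sorted(v for v, pct in percentages.items() if pct < threshold)


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
    partition_of = {0: 0, 1: 0, 2: 1, 3: 1}
    pct = same_partition_percentages(edges, partition_of)
    print(poorly_connected(pct))  # [2]
```

Here vertex 2 has only one of its three connections in its own partition, so it is the kind of vertex that a smoothing sweep would try to improve by swapping vertices between partitions.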

By default, the smoothing algorithm usually performs a sufficient number of sweeps to optimally smooth your partitions. If necessary, you can increase the maximum number of sweeps by increasing the value of Partition Smoothing > Max. Smooth. Sweeps. Decreasing the maximum number of sweeps is not recommended.

Use appropriate partition node weighting

For meshes with mixed element types, it can be beneficial to use partition node weighting based on element type. For details, see Partition Node Weighting in the CFX-Solver Manager User's Guide.