Note: As of Ansys Polyflow 2021 R1, the classic direct solver is no longer supported.
Ansys Polyflow uses a frontal method to solve Equation 31–1 (with potentially multiple fronts created by domain decomposition). That is, it uses a direct method based on Gaussian elimination.
The basic principle of the frontal method is that the assembly of each element’s contribution to the matrix and the vector can be immediately followed by a Gaussian elimination process on some of the element variables. In fact, the full matrix is never formed. The frontal procedure instead goes through the following steps for each element:
1. Assemble the element contribution to the matrix and the vector.
2. Eliminate by Gaussian elimination the element degrees of freedom that will no longer contribute to the matrix and the vector.
The frontal method requires only the currently active (incompletely assembled) equations to be stored in main memory. Following the complete transformation of the matrix into an upper-triangular form, the solution vector is obtained by back-substitution.
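The assemble-then-eliminate loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Ansys Polyflow's implementation: the function name `frontal_solve`, the `(dofs, ke, fe)` element format, and the 1D test elements are all assumptions made for the example, and no pivoting is performed, so it is only suitable for well-conditioned (for example, symmetric positive-definite) systems.

```python
import numpy as np

def frontal_solve(elements, n_dofs):
    """Solve the assembled system element by element, never forming the full matrix.

    `elements` is a list of (dofs, ke, fe): global dof numbers, element matrix,
    element vector. Hypothetical format chosen for this sketch. No pivoting.
    """
    # A dof is "fully summed" once the last element that touches it is assembled.
    last_use = {}
    for e, (dofs, _, _) in enumerate(elements):
        for d in dofs:
            last_use[d] = e

    front = []                # global dof numbers of the active equations
    A = np.zeros((0, 0))      # front matrix (active equations only)
    b = np.zeros(0)           # front right-hand side
    eliminated = []           # stored pivot rows for back-substitution

    for e, (dofs, ke, fe) in enumerate(elements):
        # Step 1: assemble the element contribution into the front.
        for d in dofs:
            if d not in front:
                front.append(d)
                n = len(front)
                grown = np.zeros((n, n))
                grown[:n - 1, :n - 1] = A
                A = grown
                b = np.append(b, 0.0)
        idx = [front.index(d) for d in dofs]
        A[np.ix_(idx, idx)] += ke
        b[idx] += fe

        # Step 2: eliminate the dofs that receive no further contributions.
        for d in [d for d in front if last_use[d] == e]:
            p = front.index(d)
            keep = np.array([i for i in range(len(front)) if i != p], dtype=int)
            row = A[p, keep] / A[p, p]     # normalized pivot row
            rhs = b[p] / A[p, p]
            col = A[keep, p]
            remaining = [front[i] for i in keep]
            eliminated.append((d, remaining, row, rhs))
            b = b[keep] - col * rhs
            A = A[np.ix_(keep, keep)] - np.outer(col, row)
            front = remaining

    # Back-substitution, in reverse order of elimination.
    x = np.zeros(n_dofs)
    for d, others, coeffs, rhs in reversed(eliminated):
        x[d] = rhs - sum(c * x[o] for c, o in zip(coeffs, others))
    return x

# Four two-node elements on a 1D chain of five unknowns (illustrative data).
ke = np.array([[2.0, -1.0], [-1.0, 2.0]])
fe = np.array([1.0, 1.0])
elements = [((i, i + 1), ke, fe) for i in range(4)]
x = frontal_solve(elements, 5)
```

For this chain the front never holds more than two equations at a time, which is the memory saving the frontal method is designed to deliver.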
By default, Ansys Polyflow incorporates a very efficient “cache-optimized” direct solver, which improves computational efficiency by a factor of 2 to 5 by improving the locality of the data during the frontal elimination. Instead of performing a sequential elimination of the variables, unknowns ready for elimination are clustered in blocks, a (small) matrix is inverted, and the complete block is eliminated from subsequent equations. This “block” elimination of the variables dramatically improves the performance on most hardware platforms, which are extremely cache-sensitive. Another very important factor is that the performance remains mostly unaffected by the problem size.
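The idea of eliminating a whole block of unknowns at once can be illustrated with a small NumPy sketch. The helper name `eliminate_block` and the random test system are assumptions made for this example; the sketch also solves with the small pivot block rather than forming its inverse explicitly, which is a substitution on our part. The update of the remaining equations is a matrix-matrix product, which is the operation that BLAS level 3 routines are built around.

```python
import numpy as np

def eliminate_block(A, b, block):
    """Eliminate all unknowns listed in `block` from A x = b in one step.

    Returns the indices of the remaining unknowns, the reduced system (S, g),
    and the data W needed to recover the eliminated unknowns afterwards.
    """
    rest = np.setdiff1d(np.arange(A.shape[0]), block)
    A_bb = A[np.ix_(block, block)]   # small pivot block
    A_br = A[np.ix_(block, rest)]
    A_rb = A[np.ix_(rest, block)]
    A_rr = A[np.ix_(rest, rest)]
    # One solve with the small block covers every column and the right-hand side.
    W = np.linalg.solve(A_bb, np.column_stack([A_br, b[block]]))
    # Matrix-matrix update of the remaining equations (Schur complement).
    S = A_rr - A_rb @ W[:, :-1]
    g = b[rest] - A_rb @ W[:, -1]
    return rest, S, g, W

# Illustrative, well-conditioned test system.
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A = M @ M.T + 6.0 * np.eye(6)
b = rng.standard_normal(6)
block = np.array([0, 1, 2])

rest, S, g, W = eliminate_block(A, b, block)
x = np.zeros(6)
x[rest] = np.linalg.solve(S, g)
x[block] = W[:, -1] - W[:, :-1] @ x[rest]   # recover the eliminated block
```

Compared with eliminating the three unknowns one at a time, the block form touches the remaining equations once, with a single dense product, which is why it tends to make better use of the cache.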
This block solver makes use of a BLAS3 implementation of Gaussian elimination, and specifically of the DGEMM subroutine (BLAS stands for Basic Linear Algebra Subprograms; see www.netlib.org/blas, especially blas3-paper.ps, for details). If necessary, the size of the blocks can be adjusted by modifying the BLOCS entry of the .p3rc file. The default corresponds to BLOCS 64 BLAS LEVEL3, and this is the recommended value for most superscalar and even vector computers. Although you can set the BLAS level to other values, doing so is unlikely to improve performance.
Optimization of the element numbering is essential in order to obtain the best performance from the Ansys Polyflow solver. Ansys Polyflow performs the optimization automatically by default, as described in Mesh Decomposition and Optimization.
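To see why element numbering matters, the following sketch counts how many equations are simultaneously active in the front for a given element ordering. The function name `max_front_width` and the two orderings are hypothetical and chosen only for illustration; the actual optimization algorithm used by Ansys Polyflow is not shown here.

```python
def max_front_width(elements):
    """Return the largest number of simultaneously active dofs for this ordering.

    `elements` is a list of tuples of global dof numbers, in processing order.
    """
    # A dof leaves the front after the last element that references it.
    last_use = {}
    for e, dofs in enumerate(elements):
        for d in dofs:
            last_use[d] = e

    front, widest = set(), 0
    for e, dofs in enumerate(elements):
        front.update(dofs)                               # assemble
        widest = max(widest, len(front))
        front -= {d for d in dofs if last_use[d] == e}   # eliminate
    return widest

# Eight two-node elements on a 1D chain of nine unknowns.
natural = [(i, i + 1) for i in range(8)]
# Same elements, processed as every second element first, then the others.
interleaved = natural[0::2] + natural[1::2]

print(max_front_width(natural))      # 2
print(max_front_width(interleaved))  # 7
```

The mesh and the resulting solution are identical in both cases; only the processing order differs, yet the front (and therefore the memory and work per elimination) grows from 2 to 7 active equations.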
For large 3D meshes, decomposition of the mesh into subparts is also recommended to improve the solver performance and to take advantage of the multiple fronts. For very large meshes, the speedup and memory savings can be extremely large, typically a factor of 2 to 10 and possibly up to 20; for this reason, Ansys Polyflow performs mesh decomposition automatically, as described in Mesh Decomposition and Optimization.
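The benefit of decomposing the mesh can be illustrated with a two-subdomain sketch: the interior unknowns of each subpart are eliminated independently (one front per subpart), leaving only a small interface system to solve. The partition, the helper name `condense`, and the tridiagonal test matrix are assumptions made for this example and do not describe how Ansys Polyflow partitions a mesh.

```python
import numpy as np

# Chain of 7 unknowns: 0-2 interior to subpart A, 3 on the interface,
# 4-6 interior to subpart B. The two interiors are not coupled directly.
n = 7
K = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
F = np.ones(n)
int_a, iface, int_b = np.array([0, 1, 2]), np.array([3]), np.array([4, 5, 6])

def condense(K, F, interior, iface):
    """Eliminate one subpart's interior unknowns; return its interface contribution."""
    K_ii = K[np.ix_(interior, interior)]
    K_ib = K[np.ix_(interior, iface)]
    K_bi = K[np.ix_(iface, interior)]
    W = np.linalg.solve(K_ii, np.column_stack([K_ib, F[interior]]))
    return K_bi @ W[:, :-1], K_bi @ W[:, -1], W

# These two calls share no data and could run concurrently.
dS_a, dg_a, W_a = condense(K, F, int_a, iface)
dS_b, dg_b, W_b = condense(K, F, int_b, iface)

# Small interface system gathering both contributions.
S = K[np.ix_(iface, iface)] - dS_a - dS_b
g = F[iface] - dg_a - dg_b

x = np.zeros(n)
x[iface] = np.linalg.solve(S, g)
for interior, W in ((int_a, W_a), (int_b, W_b)):
    x[interior] = W[:, -1] - W[:, :-1] @ x[iface]   # back-substitute per subpart
```

Each subpart works with a 3-by-3 interior system instead of the full 7-by-7 matrix; on large 3D meshes the same principle keeps every front much smaller than a single front over the whole mesh.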