The PCG Lanczos eigensolver uses the Lanczos algorithm to compute eigenvalues and eigenvectors (frequencies and mode shapes) for modal analyses, but replaces matrix factorization and multiple solves with multiple iterative solves. In other words, it replaces a direct sparse solver with the iterative PCG solver while keeping the same Lanczos algorithm. An iterative solution is faster than matrix factorization, but usually takes longer than a single block solve. The real power of the PCG Lanczos method is experienced for very large models, usually above a few million DOFs, where matrix factorization and solves become very expensive.
The PCG Lanczos method will automatically choose an appropriate default level of difficulty, but experienced users may improve solution time by manually specifying the level of difficulty via the PCGOPT command. Each successive increase in level of difficulty (the Lev_Diff value on the PCGOPT command) increases the cost per iteration, but also reduces the total iterations required. For Lev_Diff = 5, a direct matrix factorization is used so that the number of total iterations is the same as the number of load cases. This option is best for smaller problems where the memory required for factoring the given matrix is available, and the cost of factorization is not dominant.
The performance summary for PCG Lanczos is contained in the file Jobname.pcs, along with additional information related to the Lanczos solver. The first part of the .pcs file contains information specific to the modal analysis, including the computed eigenvalues and frequencies. The second part of the .pcs file contains performance data similar to that found in a static or transient analysis. As highlighted in the next two examples, the important details in this file are the number of load cases (A), total iterations in PCG (B), level of difficulty (C), and the total elapsed time (D).
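As a rough illustration of pulling items (A) through (D) out of a .pcs file, the following Python sketch searches the text for the labels that appear in the excerpts in this section. The function name `summarize_pcs` and the regular expressions are assumptions made for illustration; they are derived only from the sample output shown here and are not a documented file format.

```python
import re

# Hypothetical patterns, inferred from the .pcs excerpts in this section.
PATTERNS = {
    "load_cases": r"Number of Load Cases:\s*(\d+)",            # (A)
    "total_iterations": r"Total Iterations In PCG:\s*(\d+)",   # (B)
    "level_of_difficulty": r"Level of Difficulty:\s*(\d+)",    # (C)
    "elapsed_seconds": r"TOTAL PCG SOLVER SOLUTION ELAPSED TIME\s*=\s*([\d.]+)",  # (D)
}


def summarize_pcs(text):
    """Return the items found in the given .pcs text as a dictionary."""
    summary = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            value = match.group(1)
            # Values with a decimal point become floats, the rest integers.
            summary[key] = float(value) if "." in value else int(value)
    return summary


# A few lines copied from the Lev_Diff = 3 excerpt in this section.
sample = """
Number of Load Cases: 25 <---A (Lanczos Steps)
*** Level of Difficulty: 3 (internal 2) *** <---C (Preconditioner)
Total Iterations In PCG: 2355 <---B (Convergence)
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 3447.20 secs <---D (Total Time)
"""
summary = summarize_pcs(sample)
print(summary)
```

In practice the text would come from reading Jobname.pcs; labels that are missing from the file are simply left out of the result.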
The number of load cases corresponds to the number of Lanczos steps required to obtain the specified number of eigenvalues. It is usually 2 to 3 times more than the number of eigenvalues desired, unless the Lanczos algorithm has difficulty converging. PCG Lanczos will be increasingly expensive relative to Block Lanczos as the number of desired eigenvalues increases. PCG Lanczos is best for obtaining a relatively small number of modes (up to 100) for large models (over a few million DOF).
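The rule of thumb above can be turned into a quick estimate. This sketch assumes the 2x to 3x factors stated in the text; the function name `expected_lanczos_steps` is hypothetical, and the range is only a guide, since a Lanczos run that has difficulty converging can exceed it.

```python
def expected_lanczos_steps(num_modes, low_factor=2, high_factor=3):
    """Return the (low, high) range of load cases suggested by the rule of thumb."""
    if num_modes <= 0:
        raise ValueError("num_modes must be positive")
    return num_modes * low_factor, num_modes * high_factor


# The examples in this section request 10 modes and report 25 load cases.
low, high = expected_lanczos_steps(10)
print(f"Expected load cases for 10 modes: {low} to {high}")
print("25 load cases within range:", low <= 25 <= high)
```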
The next two examples show parts of the PCS file that report performance statistics described above for a 2 million DOF modal analysis that computes 10 modes. The difference between the two runs is the level of difficulty used (Lev_Diff on the PCGOPT command).
Example 6.4: PCS File for PCG Lanczos, Level of Difficulty = 3 uses PCGOPT,3. The output shows that Lev_Diff = 3 (C), and the total iterations required for 25 Lanczos steps (A) is 2355 (B), or an average of 94.2 iterations per step (E). Example 6.5: PCS File for PCG Lanczos, Level of Difficulty = 5 shows that increasing Lev_Diff to 5 (C) on PCGOPT reduces the iterations required per Lanczos step to just one (E).
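The average in (E) is simply the total PCG iterations (B) divided by the number of load cases (A). A minimal sketch of that arithmetic, using the values reported in the two examples (the function name is hypothetical):

```python
def average_iterations_per_load_case(total_iterations, load_cases):
    """Compute (E) from (B) and (A)."""
    if load_cases <= 0:
        raise ValueError("load_cases must be positive")
    return total_iterations / load_cases


avg_lev3 = average_iterations_per_load_case(2355, 25)  # Lev_Diff = 3 run
avg_lev5 = average_iterations_per_load_case(25, 25)    # Lev_Diff = 5 run
print(f"Lev_Diff = 3: {avg_lev3:.1f} iterations per load case")
print(f"Lev_Diff = 5: {avg_lev5:.1f} iterations per load case")
```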
Though the solution time difference in these examples shows that a Lev_Diff value of 3 is faster in this case (see (D) in both examples), Lev_Diff = 5 can be much faster for more difficult models where the average number of iterations per load case is much higher. The average number of PCG iterations per load case for efficient PCG Lanczos solutions is generally around 100 to 200. If the number of PCG iterations per load case begins to exceed 500, then either the level of difficulty should be increased in order to find a more efficient solution, or it may be more efficient to use the Block Lanczos eigensolver (assuming the problem size does not exceed the limits of the system).
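The guidance above can be expressed as a simple check on the average number of PCG iterations per load case. This is only a sketch: the 200 and 500 thresholds come from the text, while the function name and the wording of the intermediate category are assumptions made for illustration.

```python
def assess_pcg_iterations(avg_iterations, efficient_max=200, action_threshold=500):
    """Classify an average iteration count using the rule of thumb in the text."""
    if avg_iterations < 0:
        raise ValueError("avg_iterations must be non-negative")
    if avg_iterations > action_threshold:
        return "increase Lev_Diff or consider Block Lanczos"
    if avg_iterations > efficient_max:
        # Assumed label: the text gives no explicit advice for this band.
        return "above the typical efficient range"
    return "at or below the typical efficient range"


for avg in (94.2, 350, 800):
    print(f"{avg:>6}: {assess_pcg_iterations(avg)}")
```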
Example 6.4 shows a model that performed quite well with the PCG Lanczos eigensolver. Because it converged in under 100 iterations per load case, the Lev_Diff value of 3 is probably higher than this model needs (especially at higher core counts). In this case, it might be worthwhile to try Lev_Diff = 1 or 2 to see whether solver performance improves. Using more than one core would almost certainly also reduce the time to solution.
Example 6.4: PCS File for PCG Lanczos, Level of Difficulty = 3
Lanczos Solver Parameters
-------------------------
Lanczos Block Size: 1
Eigenpairs computed: 10 lowest
Extra Eigenpairs: 1
Lumped Mass Flag: 0
In Memory Flag: 0
Extra Krylov Dimension: 1
Mass Matrix Singular Flag: 1
PCCG Stopping Criteria Selector: 4
PCCG Stopping Threshold: 1.000000e-04
Extreme Preconditioner Flag: 0
Reortho Type: 1
Number of Reorthogonalizations: 7
nExtraWorkVecs for computing eigenvectors: 4
Rel. Eigenvalue Tolerance: 1.000000e-08
Rel. Eigenvalue Residual Tolerance: 1.000000e-11
Restart Condition Number Threshold: 1.000000e+15
Sturm Check Flag: 0

Shifts Applied: 1.017608e-01

Eigenpairs

Number of Eigenpairs 10
---------------------------------------

No.   Eigenvalue      Frequency(Hz)
---   ----------      -------------
1     1.643988e+03    6.453115e+00
2     3.715504e+04    3.067814e+01
3     5.995562e+04    3.897042e+01
4     9.327626e+04    4.860777e+01
5     4.256303e+05    1.038332e+02
6     7.906460e+05    1.415178e+02
7     9.851501e+05    1.579688e+02
8     1.346627e+06    1.846902e+02
9     1.656628e+06    2.048484e+02
10    2.050199e+06    2.278863e+02

Number of cores used: 1
Degrees of Freedom: 2067051
DOF Constraints: 6171
Elements: 156736
Assembled: 156736
Implicit: 0
Nodes: 689017
Number of Load Cases: 25 <---A (Lanczos Steps)

Nonzeros in Upper Triangular part of Global Stiffness Matrix : 170083104
Nonzeros in Preconditioner: 201288750
*** Precond Reorder: MLD ***
Nonzeros in V: 12401085
Nonzeros in factor: 184753563
Equations in factor: 173336
*** Level of Difficulty: 3 (internal 2) *** <---C (Preconditioner)

Total Operation Count: 3.56161e+12
Total Iterations In PCG: 2355 <---B (Convergence)
Average Iterations Per Load Case: 94.2 <---E (Iterations per Step)
Input PCG Error Tolerance: 0.0001
Achieved PCG Error Tolerance: 9.98389e-05

DETAILS OF PCG SOLVER SETUP TIME(secs)         Cpu        Wall
Gather Finite Element Data                    0.40        0.40
Element Matrix Assembly                      96.24       96.52

DETAILS OF PCG SOLVER SOLUTION TIME(secs)      Cpu        Wall
Preconditioner Construction                   1.74        1.74
Preconditioner Factoring                     51.69       51.73
Apply Boundary Conditions                     5.03        5.03
Eigen Solve                                3379.36     3377.52
Eigen Solve Overhead                        172.49      172.39
Compute MQ                                  154.00      154.11
Reorthogonalization                         123.71      123.67
Computation                                 120.66      120.61
I/O                                           3.05        3.06
Block Tridiag Eigen                           0.00        0.00
Compute Eigenpairs                            1.63        1.63
Output Eigenpairs                             0.64        0.64
Multiply With A                            1912.52     1911.41
Multiply With A22                          1912.52     1911.41
Solve With Precond                         1185.62     1184.85
Solve With Bd                                89.84       89.89
Multiply With V                             192.81      192.53
Direct Solve                                880.71      880.38
******************************************************************************
TOTAL PCG SOLVER SOLUTION CP TIME = 3449.01 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 3447.20 secs <---D (Total Time)
******************************************************************************
Total Memory Usage at Lanczos : 3719.16 MB
PCG Memory Usage at Lanczos : 2557.95 MB
Memory Usage for Matrix : 0.00 MB
******************************************************************************
Multiply with A Memory Bandwidth : 15.52 GB/s
Multiply with A MFLOP Rate : 833.13 MFlops
Solve With Precond MFLOP Rate : 1630.67 MFlops
Precond Factoring MFLOP Rate : 0.00 MFlops
******************************************************************************
Total amount of I/O read : 6917.76 MB
Total amount of I/O written : 6732.46 MB
******************************************************************************
Example 6.5: PCS File for PCG Lanczos, Level of Difficulty = 5
Lanczos Solver Parameters
-------------------------
Lanczos Block Size: 1
Eigenpairs computed: 10 lowest
Extra Eigenpairs: 1
Lumped Mass Flag: 0
In Memory Flag: 0
Extra Krylov Dimension: 1
Mass Matrix Singular Flag: 1
PCCG Stopping Criteria Selector: 4
PCCG Stopping Threshold: 1.000000e-04
Extreme Preconditioner Flag: 1
Reortho Type: 1
Number of Reorthogonalizations: 7
nExtraWorkVecs for computing eigenvectors: 4
Rel. Eigenvalue Tolerance: 1.000000e-08
Rel. Eigenvalue Residual Tolerance: 1.000000e-11
Restart Condition Number Threshold: 1.000000e+15
Sturm Check Flag: 0

Shifts Applied: -1.017608e-01

Eigenpairs

Number of Eigenpairs 10
---------------------------------------

No.   Eigenvalue      Frequency(Hz)
---   ----------      -------------
1     1.643988e+03    6.453116e+00
2     3.715494e+04    3.067810e+01
3     5.995560e+04    3.897041e+01
4     9.327476e+04    4.860738e+01
5     4.256265e+05    1.038328e+02
6     7.906554e+05    1.415187e+02
7     9.851531e+05    1.579690e+02
8     1.346626e+06    1.846901e+02
9     1.656620e+06    2.048479e+02
10    2.050184e+06    2.278854e+02

Number of cores used: 1
Degrees of Freedom: 2067051
DOF Constraints: 6171
Elements: 156736
Assembled: 156736
Implicit: 0
Nodes: 689017
Number of Load Cases: 25 <---A (Lanczos Steps)

Nonzeros in Upper Triangular part of Global Stiffness Matrix : 170083104
Nonzeros in Preconditioner: 4168012731
*** Precond Reorder: MLD ***
Nonzeros in V: 0
Nonzeros in factor: 4168012731
Equations in factor: 2067051
*** Level of Difficulty: 5 (internal 0) *** <---C (Preconditioner)

Total Operation Count: 4.34378e+11
Total Iterations In PCG: 25 <---B (Convergence)
Average Iterations Per Load Case: 1.0 <---E (Iterations per Step)
Input PCG Error Tolerance: 0.0001
Achieved PCG Error Tolerance: 1e-10

DETAILS OF PCG SOLVER SETUP TIME(secs)         Cpu        Wall
Gather Finite Element Data                    0.42        0.43
Element Matrix Assembly                     110.99      111.11

DETAILS OF PCG SOLVER SOLUTION TIME(secs)      Cpu        Wall
Preconditioner Construction                  26.16       26.16
Preconditioner Factoring                   3245.98     3246.01
Apply Boundary Conditions                     5.08        5.08
Eigen Solve                                1106.75     1106.83
Eigen Solve Overhead                        198.14      198.15
Compute MQ                                  161.40      161.28
Reorthogonalization                         130.51      130.51
Computation                                 127.45      127.44
I/O                                           3.06        3.07
Block Tridiag Eigen                           0.00        0.00
Compute Eigenpairs                            1.49        1.49
Output Eigenpairs                             0.62        0.62
Multiply With A                               7.89        7.88
Multiply With A22                             7.89        7.88
Solve With Precond                            0.00        0.00
Solve With Bd                                 0.00        0.00
Multiply With V                               0.00        0.00
Direct Solve                                908.61      908.68
******************************************************************************
TOTAL PCG SOLVER SOLUTION CP TIME = 4395.84 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 4395.96 secs <---D (Total Time)
******************************************************************************
Total Memory Usage at Lanczos : 3622.87 MB
PCG Memory Usage at Lanczos : 2476.45 MB
Memory Usage for Matrix : 0.00 MB
******************************************************************************
Multiply with A Memory Bandwidth : 39.94 GB/s
Solve With Precond MFLOP Rate : 458.69 MFlops
Precond Factoring MFLOP Rate : 0.00 MFlops
******************************************************************************
Total amount of I/O read : 11853.51 MB
Total amount of I/O written : 7812.11 MB
******************************************************************************
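The trade-off between the two runs can be seen by comparing a few wall-clock entries from Example 6.4 and Example 6.5. The sketch below uses only numbers copied from those listings; the dictionary layout and the function name `wall_delta` are assumptions made for illustration.

```python
# Wall-clock seconds copied from the two listings, keyed by Lev_Diff.
WALL_SECONDS = {
    3: {"precond_factoring": 51.73, "eigen_solve": 3377.52, "total_elapsed": 3447.20},
    5: {"precond_factoring": 3246.01, "eigen_solve": 1106.83, "total_elapsed": 4395.96},
}


def wall_delta(phase):
    """Seconds added (positive) or saved (negative) by going from Lev_Diff 3 to 5."""
    return WALL_SECONDS[5][phase] - WALL_SECONDS[3][phase]


for phase in WALL_SECONDS[3]:
    print(f"{phase:>18}: {wall_delta(phase):+.2f} s")
```

For this model, the extra time spent factoring at Lev_Diff = 5 outweighs the time saved in the eigen solve, which is consistent with the total elapsed times reported in (D).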