37.13. Resolving GPU Solver Performance Issues

If you experience performance issues while using the Fluent GPU Solver with a single GPU, the following procedure is recommended:

  • Check that the GPU is free (that is, not already in use by other processes) by executing nvidia-smi in a command prompt window, as outlined in Starting the Fluent GPU Solver from the Command Line.

  • Increase the reporting interval for residuals and monitors with the following TUI command:

    /solve set report-interval 20

    A reporting interval of 20 is recommended.

  • Check whether convergence is as expected, particularly if the residuals plot appears irregular. You can turn on AMG verbosity in the Advanced Solution Controls dialog box (see Setting the Verbosity in the Fluent User's Guide) to see whether the linear solver is converging normally and whether it is running through all 50 cycles, which is the maximum.
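
The nvidia-smi check in the first step above can also be scripted. The following is a minimal Python sketch, not part of Fluent: it assumes nvidia-smi is on the PATH, and the helper names (parse_gpu_status, is_idle, report) and the idle thresholds are hypothetical choices made for illustration only.

```python
import subprocess

# Fields requested from nvidia-smi; all are standard --query-gpu properties.
QUERY_FIELDS = "index,name,utilization.gpu,memory.used,memory.total"


def parse_gpu_status(csv_text):
    """Turn nvidia-smi CSV rows (no header, no units) into a list of dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Assumes no field (including the GPU name) contains a comma
        # and that every numeric field is reported as an integer.
        index, name, util, mem_used, mem_total = [
            field.strip() for field in line.split(",")
        ]
        gpus.append({
            "index": int(index),
            "name": name,
            "util_pct": int(util),
            "mem_used_mib": int(mem_used),
            "mem_total_mib": int(mem_total),
        })
    return gpus


def is_idle(gpu, max_util_pct=5, max_mem_used_mib=500):
    """Return True if the GPU looks free. Thresholds are illustrative assumptions."""
    return (gpu["util_pct"] <= max_util_pct
            and gpu["mem_used_mib"] <= max_mem_used_mib)


def query_gpus():
    """Run nvidia-smi and return the parsed status of every visible GPU."""
    result = subprocess.run(
        ["nvidia-smi",
         f"--query-gpu={QUERY_FIELDS}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_status(result.stdout)


def report():
    """Print one line per GPU stating whether it appears free."""
    for gpu in query_gpus():
        state = "free" if is_idle(gpu) else "in use"
        print(f"GPU {gpu['index']} ({gpu['name']}): {state}, "
              f"{gpu['util_pct']}% utilization, "
              f"{gpu['mem_used_mib']}/{gpu['mem_total_mib']} MiB used")
```

Calling report() on a machine with the NVIDIA driver installed prints one status line per GPU. A GPU reported as "in use" before Fluent is started is likely shared with another process.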

If you experience performance issues while using the Fluent GPU Solver with multiple GPUs, first perform the procedure above for single-GPU performance issues, then perform the following additional steps:

  • Solving a case that is too small causes performance issues on multiple GPUs. For multiple-GPU calculations, it is recommended to have at least two million cells per GPU.

  • Check if the load balance is satisfactory by first reading in your case and then printing the load balance using the following TUI command:

    /parallel/partition/print-active

    If the load balance is not good, it is recommended to use the stored cell partition scheme by entering the following TUI commands:

    /parallel/partition/method metis

    /parallel/partition/use-stored-partitions

    For more information on checking the load balance and interpreting the partition statistics, see Interpreting Partition Statistics.

  • Check if the communication between GPUs is satisfactory using the following TUI commands:

    /parallel/bandwidth

    /parallel/latency

    For more information on interpreting the bandwidth and latency data, see Checking Latency and Bandwidth.
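
The case-size guideline in the first multiple-GPU step above (at least two million cells per GPU) can be checked with simple arithmetic before launching a run. The following Python sketch illustrates this; the function names are hypothetical, and the total cell count is assumed to be known from the mesh.

```python
# Guideline stated in this section: at least two million cells per GPU.
MIN_CELLS_PER_GPU = 2_000_000


def cells_per_gpu(total_cells, num_gpus):
    """Average number of cells each GPU receives for an even partition."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return total_cells / num_gpus


def meets_guideline(total_cells, num_gpus, min_cells=MIN_CELLS_PER_GPU):
    """Return True if the average cell count per GPU meets the guideline."""
    return cells_per_gpu(total_cells, num_gpus) >= min_cells


def max_recommended_gpus(total_cells, min_cells=MIN_CELLS_PER_GPU):
    """Largest GPU count that still meets the guideline (never below 1)."""
    return max(1, total_cells // min_cells)
```

For example, a 12 million cell case on 4 GPUs averages 3 million cells per GPU and meets the guideline, whereas the same case on 8 GPUs averages 1.5 million cells per GPU and does not; max_recommended_gpus(12_000_000) gives 6.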