41.15. System Resource Usage

You can print several reports summarizing processor, GPU, memory, and time usage for the current session from the Parallel ribbon tab.

41.15.1. Processor Information

You can print out a table summarizing the processor usage on each machine that has compute processes spawned for the current session by clicking CPU Info and selecting CPU Info in the Parallel ribbon tab (System group box). For example, if processes have been spawned on three machines, a table similar to the following is displayed:

---------------------------------------------------------------------------------------
                    | CPU                                  | System Mem (GB)
Hostname            | Sock x Core  Clock (MHz)  Load       | Total        Available
---------------------------------------------------------------------------------------
host23              | 8 x 8, HT    2399.85      0.07       | 129.039      89.565
host24              | 8 x 8, HT    2399.95      1.08       | 129.039      32.516
host25              | 8 x 8, HT    2400.4       3.14       | 129.039      62.531
---------------------------------------------------------------------------------------
Total               | 192          -            -          | 387.117      184.612
---------------------------------------------------------------------------------------

Under CPU:

Sock x Core

displays the number of processor sockets, the number of cores per socket, and whether hyper-threading (HT) is enabled.

Clock (MHz)

is the processor clock speed.

Load

is the current workload on the machine.

Under System Mem (GB):

Total

is the total system memory on the machine.

Available

is the available system memory on the machine.
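
Note that the Total row simply aggregates the host rows: in the example above, 3 hosts x 8 sockets x 8 cores = 192 cores, and 3 x 129.039 GB = 387.117 GB of total system memory. If you are scripting a session rather than working in the ribbon, the same report can be printed programmatically. The following is a minimal PyFluent sketch; it assumes the ansys-fluent-core package is installed and that the report/system/sys-stats text command is available in your Fluent version (verify against the Fluent Text Command List):

# Minimal sketch: print the processor-usage table from a scripted session.
# Assumes ansys-fluent-core (PyFluent) can launch a local Fluent installation.
import ansys.fluent.core as pyfluent

solver = pyfluent.launch_fluent(processor_count=4, mode="solver")

# Scripted equivalent of Parallel ribbon tab -> System -> CPU Info -> CPU Info;
# assumes the report/system/sys-stats text command exists in your version.
solver.tui.report.system.sys_stats()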

You can print out information about the GPUs available on the machine by clicking CPU Info and selecting GPU Info in the Parallel ribbon tab (System group box). If your machine has one or more suitable GPUs, you can use them to accelerate parallel computations as described in Using General Purpose Graphics Processing Units (GPGPUs) With the Algebraic Multigrid (AMG) Solver. For example:

CUDA visible GPUs on host01
  CUDA runtime version 5000
  Driver version 6000
  Number of GPUs 1
    0. Quadro K2100M 
       3 SMs
       0.6665 GHz
       2.14748 GBytes
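
In a scripted session, a corresponding report may be available through the report/system/gpgpu-stats text command; the sketch below reuses the solver session from the earlier example (again, verify the command against your version's Text Command List):

# Scripted equivalent of CPU Info -> GPU Info; "solver" is the PyFluent session
# launched in the earlier sketch. Assumes report/system/gpgpu-stats exists.
solver.tui.report.system.gpgpu_stats()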

41.15.2. Memory Information

You can print out tables summarizing memory usage by node and by host by clicking CPU Info and selecting Memory Usage in the Parallel ribbon tab (System group box). For example:

---------------------------------------------
       | Virtual Mem Usage (GB)|            
ID     | Current      Peak     | Page Faults
---------------------------------------------
host   | 0.0680117    0.0695   | 2.667e+04  
n0     | 0.0855195    0.114609 | 7.229e+04  
n1     | 0.0869922    0.115367 | 6.988e+04  
n2     | 0.0881836    0.117613 | 7.318e+04  
n3     | 0.0840781    0.11384  | 6.999e+04  
---------------------------------------------
Total  | 0.412785     0.53093  | 3.12e+05   
---------------------------------------------

-----------------------------------------------------------------
                    | Virtual Mem Usage (GB)    | System Mem (GB)          
Hostname            | Current      Peak         |                          
-----------------------------------------------------------------
host01              | 0.412785     0.53093      | 32.673       
-----------------------------------------------------------------
Total               | 0.412785     0.53093      |            
-----------------------------------------------------------------

Under Virtual Mem Usage (GB):

Current

is the virtual memory usage at the time the report is generated.

Peak

is the peak virtual memory usage.

(Linux only) Under Resident Mem Usage (GB), which is not included in the example output above:

Current

is the resident memory usage at the time the report is generated.

Peak

is the peak resident memory usage.

Page Faults

is the number of page faults that have occurred.

You can use these reports to plan Ansys Fluent jobs and allocate machines accordingly. They can also be useful for diagnosing performance problems.
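
If you are gathering these figures as part of an automated job, the memory report may likewise be printed programmatically; the following minimal sketch works under the same assumptions as the earlier examples (here, the report/system/proc-stats text command):

# Scripted equivalent of CPU Info -> Memory Usage; prints the per-node and
# per-host virtual memory tables shown above. Assumes report/system/proc-stats
# exists in your Fluent version.
solver.tui.report.system.proc_stats()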

41.15.3. Process and Model Timers

You can print out detailed information about the CPU timings for the current session, as well as solver timings, by clicking CPU Info and selecting Time Usage in the Parallel ribbon tab (System group box). The solver timings presented are described in Checking Parallel Performance. For example:

---------------------------------------------
       | CPU Time Usage (Seconds)         
ID     | User         Kernel   Elapsed      
---------------------------------------------
host   | 1.34161      0.452403 2328.63      
n0     | 3.47882      0.702005 2325.88      
n1     | 19.9525      1.15441  2325.87      
n2     | 19.8433      0.873606 2325.87      
n3     | 19.8589      0.764405 2325.86      
---------------------------------------------
Total  | 64.4752      3.94683  -            
---------------------------------------------

Model Timers (Host)
  Other Models Time:                                  2.858 sec
  Total Time:                                         2.858 sec

Model Timers
  Other Models Time:                                  2.871 sec
  Total Time:                                         2.871 sec

Performance Timer for 472 iterations on 4 compute nodes
  Average wall-clock time per iteration:                0.823 sec
  Global reductions per iteration:                         93 ops
  Global reductions time per iteration:                 0.000 sec (0.0%)
  Message count per iteration:                           3842 messages
  Data transfer per iteration:                         17.226 MB
  LE solves per iteration:                                  7 solves
  LE wall-clock time per iteration:                     0.242 sec (29.4%)
  LE global solves per iteration:                           3 solves
  LE global wall-clock time per iteration:              0.001 sec (0.1%)
  LE global matrix maximum size:                           24
  AMG cycles per iteration:                             8.017 cycles
  Relaxation sweeps per iteration:                        374 sweeps
  Relaxation exchanges per iteration:                       0 exchanges
  LE early protections (stall) per iteration:           0.000 times
  LE early protections (divergence) per iteration:      0.000 times
  Total SVARS touched:                                    385
  Time-step updates per iteration:                       0.11 updates
  Time-step wall-clock time per iteration:              0.003 sec (0.4%)

  Total wall-clock time:                              388.669 sec


Simulation wall-clock time for 472 iterations             426 sec
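
As a quick consistency check on the performance timer output, the average wall-clock time per iteration is simply the total wall-clock time divided by the iteration count. The plain-Python sketch below reproduces the figure from the report above; in a scripted session, the timing report itself may be available through the report/system/time-stats text command (verify against your version).

# Consistency check using values taken from the report above.
total_wall_clock = 388.669   # sec ("Total wall-clock time")
iterations = 472             # ("Performance Timer for 472 iterations ...")
print(f"{total_wall_clock / iterations:.3f} sec per iteration")   # -> 0.823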