Ansys Forte uses one or more specific Intel MPI versions on Windows and Linux systems. The supported versions are listed on the Platform Support page of the Ansys website. Note that some MPI versions are compatible only with certain versions of the UCX library and InfiniBand. See the following Intel web pages for more details:
and
Your IT department may need to help you determine whether a specific Intel MPI version is compatible with your cluster. The subsequent sections of this appendix may help you and your IT support diagnose and fix issues that occur when running Forte on your cluster. If you are still having issues after following the advice in these sections, please contact Ansys Support and send us the following information:
The OS version of your cluster
Details of the errors you are experiencing
Details of the workflow you are using to submit the job to the cluster, including all job submission scripts and job output and log files, as well as the job scheduler software and version
A copy of the run_env.sh and run_mpi.sh files for the Forte case you are attempting to run
The Forte MONITOR file
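On a Linux cluster, the OS version requested in the list above can be captured with standard tools. The following is a minimal sketch, not an Ansys-provided script: the function name save_os_version and the output file name os_version.txt are hypothetical, and it assumes a POSIX shell on the node where the job runs.

```shell
# Hypothetical helper: write kernel and distribution details to a text file.
# The default file name os_version.txt is an assumption, not an Ansys convention.
save_os_version() {
    out_file="${1:-os_version.txt}"
    # Kernel name, release, and architecture.
    uname -a > "$out_file"
    # /etc/os-release exists on most current distributions, but not all.
    if [ -r /etc/os-release ]; then
        cat /etc/os-release >> "$out_file"
    fi
}
```

Calling save_os_version with no argument writes os_version.txt in the current directory.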
Please also add the following lines to the run_mpi.sh script before the mpirun command:
lspci | grep -i mellanox > mellanox.txt
ofed_info > ofed_info.txt
ibstat > ibstat.txt
ucx_info -v > ucx_v.txt
ucx_info -d > ucx_d.txt
which mpirun > mprun.txt
printenv > env.sh
ulimit -a > ulimit.txt
mpirun hostname > hostnames.txt
Then resubmit the run that calls run_mpi.sh and send us the resulting output files.
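Some of these diagnostic tools may not be installed on every cluster, in which case the corresponding output file would be empty or the error would appear only in the job log. As a hedged sketch, a small wrapper can record a note when a tool is missing. The function name collect is hypothetical and is not part of Forte or Intel MPI; it assumes a POSIX shell.

```shell
# Hypothetical wrapper: collect NAME CMD [ARGS...]
# Runs CMD and saves its stdout and stderr to NAME.txt.
# If CMD is not installed, NAME.txt records that fact instead.
collect() {
    out_file="$1.txt"
    shift
    if command -v "$1" >/dev/null 2>&1; then
        # Append the exit status if the command fails, so the job continues.
        "$@" > "$out_file" 2>&1 || echo "exit status $?" >> "$out_file"
    else
        echo "command not found: $1" > "$out_file"
    fi
}
```

For instance, `collect ucx_v ucx_info -v` would write ucx_v.txt, and `collect ibstat ibstat` would write ibstat.txt. Pipelines such as the lspci and grep command would still need to be written out directly.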
Sometimes MPI may load the wrong Intel MPI fabric on your cluster. To check this, add:
export I_MPI_DEBUG="1000"
to your run_env.sh file, resubmit the job, and send us the MONITOR file so we can confirm which fabric is being loaded.
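Before sending the file, you can look through it yourself for the fabric-related debug lines. The sketch below is an assumption-laden example: the function name show_fabric_lines is hypothetical, and the search patterns are a guess at the kind of startup lines Intel MPI prints when I_MPI_DEBUG is set (for example, lines mentioning the libfabric provider). The exact wording may differ between Intel MPI versions.

```shell
# Hypothetical helper: print lines that mention the MPI fabric or provider.
# The patterns are assumptions about typical Intel MPI debug output.
show_fabric_lines() {
    monitor_file="${1:-MONITOR}"
    grep -i -E 'libfabric|provider|I_MPI_FABRICS' "$monitor_file"
}
```

Run show_fabric_lines in the case directory, or pass the path to the MONITOR file as the first argument. If nothing is printed, the debug output may have gone to the job scheduler's log instead.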
Your IT Support may also run the Intel cluster checker:
and follow up directly with Intel for advice if the checker reports potential issues on your cluster.