Troubleshooting clustered deployments

If a node in a clustered Granta MI deployment fails or becomes unhealthy, each MI application exhibits different symptoms. Use the symptoms described below to diagnose problems with your deployment and, where appropriate, apply the suggested resolution.

MI Import and MI Export

Jobs created in MI Import and MI Export run on a single node. If that node fails while a job is running, the job is restarted by an available node after a time-out period (default: 5 minutes).

Resolution: For export jobs and for most import jobs, no action is necessary. If an import job stopped after partially completing an import, you may need to modify the job and then rerun it.
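
When deciding whether a stalled job needs attention, it can help to work out when a restart should have happened. The following is a minimal sketch of that arithmetic only. It assumes the time-out is measured from the moment the node failed, and the function names and the one-minute grace margin are hypothetical; it does not describe how Granta MI schedules restarts internally.

```python
from datetime import datetime, timedelta

# Default time-out quoted in the text above. Substitute the value
# configured for your own deployment if it differs.
DEFAULT_TIMEOUT = timedelta(minutes=5)


def earliest_restart(node_failed_at: datetime,
                     timeout: timedelta = DEFAULT_TIMEOUT) -> datetime:
    """Earliest time at which another node could restart the job.

    Assumption: the time-out starts when the node fails.
    """
    return node_failed_at + timeout


def restart_overdue(node_failed_at: datetime,
                    now: datetime,
                    timeout: timedelta = DEFAULT_TIMEOUT,
                    grace: timedelta = timedelta(minutes=1)) -> bool:
    """True if the job should already have been restarted.

    The grace margin is a hypothetical allowance for scheduling delay.
    """
    return now > earliest_restart(node_failed_at, timeout) + grace
```

For example, with the default time-out, a node failure at 12:00 gives an earliest restart time of 12:05. If the job has still not restarted a few minutes after that, check that at least one other node is available.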

MI Viewer

An MI Viewer session is tied to a single node. If that node fails (or is deemed by the load balancer to be in an unhealthy state), then all users whose sessions are associated with that node will experience problems. The exact symptoms depend on how the node has failed and on the load balancer and its configuration.

Resolution: Refreshing the browser window will start a new session on a working node. Session information such as record lists and search results will be lost.
Note: Even if the affected node restarts quickly, the user’s session will be lost.
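
To find out which node is affected, you can probe each node directly and bypass the load balancer. The following is a minimal sketch using only the Python standard library. The node URLs are hypothetical placeholders, and the rule that any 2xx or 3xx response counts as healthy is an assumption; your load balancer may apply a different health check.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, or 0 if the node does not answer."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the node answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, DNS failure, or time-out


def probe_nodes(urls: Iterable[str],
                fetch: Callable[[str], int] = http_status) -> Dict[str, str]:
    """Classify each node URL as 'healthy', 'unhealthy', or 'unreachable'."""
    results = {}
    for url in urls:
        status = fetch(url)
        if status == 0:
            results[url] = "unreachable"
        elif 200 <= status < 400:  # assumed definition of healthy
            results[url] = "healthy"
        else:
            results[url] = "unhealthy"
    return results
```

Call probe_nodes with the address of each node in the cluster, for example probe_nodes(["http://mi-node1.example.com/", "http://mi-node2.example.com/"]). The fetch parameter lets you substitute a different check without changing the classification logic.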

MI Admin and MI Toolbox

As these desktop applications connect directly to an MI Server running on a node, the connection will be lost and the application will automatically close if the MI Server service stops. Any unsaved changes will be lost.

Resolution: Restart the application and connect to a working node.

MI Data Flow

The MI Data Flow API service can run on only one node in an MI cluster. If this node fails (or is deemed by the load balancer to be in an unhealthy state):
  • Workflow instances will not run.
  • MI Data Flow Designer cannot be used.

Resolution: Resolve the issue with the server where MI Data Flow is installed, or move MI Data Flow onto another working node.

MI Material Calibration and MI AI+ (Machine Learning)

MI Material Calibration and MI AI+ services do not support clustering and so run on servers outside of the cluster. If the compute server on which these services are running goes down, users will lose all related functionality, even if the clustered MI nodes are working as normal.

Resolution: Restart the compute server and then refresh the browser where the MI Material Calibration or MI AI+ session was running to start a new session.