Key terminology in Granta MI

A glossary of general machine learning concepts and the corresponding terminology and objects in Granta MI.


Name	Description
algorithm	Within MI Machine Learning, this always refers to a machine learning algorithm - a computer algorithm whose predictions or results improve through exposure to data.
neural network	Type of machine learning algorithm based on the behavior of neurons in living organisms. The Alchemite algorithm used by MI Machine Learning is a neural network.
model	Within MI Machine Learning, model generally means a specific feature matrix, engine, and the Granta MI data (to be) used as the training and test dataset. The term covers partially-defined models (where one or more of these is still to be defined), untrained models, and trained models.
untrained model	A model which hasn't been sent to the machine learning engine to be trained. Partially-defined models are also listed as untrained.
trained model	An algorithm trained on a specific feature matrix and training/test dataset.
cloned model	An untrained copy of a trained or untrained model.
engine	Third-party machine learning algorithm and software able to communicate with Granta MI.
hyperparameters	Settings for the machine learning algorithm or engine, typically relating to algorithm architecture or the training process. These are not currently accessible through MI Machine Learning; the Alchemite engine uses default or pre-optimized values as recommended by Intellegens.
feature	Attribute, property or category included in a model. Within MI Machine Learning, these variables are typically a material property.
feature matrix	Specially formatted input data for a machine learning algorithm, where each column corresponds to a feature of the model, and each row to a single sample or set of measurements. Also called: Dataset
feature matrix column	Each column of a feature matrix corresponds to a feature of the model (one of the attributes, properties or categories whose relationships the model will explore). Columns are axes in the model's parameter space. In MI Machine Learning, each column is created from an attribute stored in Granta MI plus a transform specifying how to convert the attribute's value into a data point in the feature matrix.
feature matrix row	Each row of a feature matrix represents a real-life sample or example, which is also a single point in the model's parameter space. In MI Machine Learning, each record added to the training/test dataset corresponds to one or more rows (depending on whether the record contains one-to-one or one-to-many links in attributes included in the Feature Set).
Input, input feature	Represents a property which is already known or controllable, and can be used to predict other properties. For example, process parameters or AM build data. Also called: Independent feature, Descriptor, Predictor
Output, output feature	Represents a property which can be predicted using, or is influenced by, input features. For example, test results or final properties of an AM part. Also called: Dependent feature, Result
variable feature	A feature that a machine learning algorithm can use as either an input or an output feature, depending on the data available. Not currently supported in MI Machine Learning.
Matrix	A set of tables, attributes and records stored in Granta MI, plus the transforms required to convert them into a true feature matrix.
primary table	The Granta MI table containing the records you want to include in your Matrix. All attributes you want to include should be in this table, or tables linked to it.
Feature Set	The set of Granta MI attributes and their transforms which form the columns of the Matrix (and will become the columns of the feature matrix sent to the engine).
purpose	Whether the feature defined in the Feature Set will be an Input or Output.
transform	The transform you want to apply to an attribute's value when the Matrix is converted into a feature matrix. For example, you may want to use the average value or the maximum/minimum value of a range attribute.
training dataset	The initial feature matrix used to define a model's parameter space and "train" the machine learning algorithm. The size and quality of the training dataset directly affect the quality of the model's predictions. In MI Machine Learning, a single feature matrix is submitted to the engine, and split into training and test datasets. Also called: Input dataset, training feature matrix
test dataset	A second input dataset used after training to validate the newly created model. This dataset is typically much smaller than the training dataset, and does not change the model. In MI Machine Learning, a single feature matrix is submitted to the engine, and split into training and test datasets.
model range	In MI Machine Learning, the model range for a feature is the range of the data the model was trained on (the maximum and minimum values in that column of the training dataset). Outside this range, the model relies on extrapolation and predictions will not be as reliable.
Model Quality	The Alchemite engine's estimate of how well a trained model can predict output features from input features alone. An R² value is calculated for each Output by comparing its values in the test dataset with the values generated by the model; Model Quality is the median of these R² values. Also called: Network error
Correlation	The correlation between (R² value of) two features of a trained model.
correlation matrix	All correlations (R² values) between all features in a trained model. Also called: Correlation analysis, Sensitivity analysis, Importance
Criteria Set	A model-specific set of constraints on input features and target values (goals) for output features, which defines a process optimization problem to be sent to the engine.
unoptimized Criteria Set	A Criteria Set which hasn't been sent to the engine to be solved.
optimized Criteria Set	A Criteria Set plus the results returned by the engine. Predicted values meet the criteria, but do not represent maxima or minima of the parameter space (the input values of the system are optimized to produce a specified result, but the output values are not mathematically optimized).
Weight	Relative importance of the specified Goals in a Criteria Set.
Probability	Part of a Criteria Set's results. An estimate from the model of the overall likelihood of achieving the stated output feature values from the stated input feature values. Also called: Likelihood, Cost function
Uncertainty	Part of a Criteria Set's results. Each feature has a predicted value and an uncertainty of 1 standard deviation. Also called: Error
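
The relationship between records, the Feature Set, transforms and the resulting feature matrix can be illustrated with a minimal Python sketch. This is not MI Machine Learning code: the function `build_feature_matrix`, the `TRANSFORMS` table, the attribute names and the sample values are all hypothetical, and range attributes are assumed to be stored as (low, high) pairs. It covers only the simple case where each record yields one row.

```python
from statistics import mean

# Hypothetical transforms for a range attribute stored as a (low, high) pair.
TRANSFORMS = {
    "average": lambda rng: mean(rng),
    "minimum": lambda rng: min(rng),
    "maximum": lambda rng: max(rng),
}


def build_feature_matrix(records, feature_set):
    """Convert records into feature matrix columns and rows.

    feature_set is a list of (attribute, transform, purpose) tuples.
    Each entry becomes one column; each record becomes one row.
    """
    columns = [f"{attr} ({transform})" for attr, transform, _ in feature_set]
    rows = [
        [TRANSFORMS[transform](record[attr]) for attr, transform, _ in feature_set]
        for record in records
    ]
    return columns, rows


# Illustrative records only; the values are made up.
records = [
    {"Laser power": (190.0, 210.0), "Tensile strength": (950.0, 1010.0)},
    {"Laser power": (240.0, 260.0), "Tensile strength": (1000.0, 1080.0)},
]
feature_set = [
    ("Laser power", "average", "Input"),
    ("Tensile strength", "maximum", "Output"),
]

columns, rows = build_feature_matrix(records, feature_set)
for row in rows:
    print(dict(zip(columns, row)))
```

Each tuple in `feature_set` plays the role of one Feature Set entry (attribute, transform, purpose), so changing a transform changes the values in that column without touching the underlying records.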
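
The model range and Model Quality entries describe simple calculations, sketched below in Python. The glossary does not state which R² definition the Alchemite engine uses, so this example assumes the coefficient of determination (1 minus the ratio of residual to total sum of squares); all function names and data values are hypothetical and the sketch is not the engine's implementation.

```python
from statistics import mean, median


def model_range(training_column):
    """Return the (minimum, maximum) of one column of the training dataset."""
    return min(training_column), max(training_column)


def in_model_range(value, training_column):
    """True if value lies inside the model range, so no extrapolation is needed."""
    low, high = model_range(training_column)
    return low <= value <= high


def r_squared(actual, predicted):
    """Coefficient of determination (an assumed definition of R squared)."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def model_quality(test_outputs, predicted_outputs):
    """Median of the per-Output R squared values.

    Both arguments map an Output name to a list of values.
    """
    return median(
        r_squared(test_outputs[name], predicted_outputs[name])
        for name in test_outputs
    )


# Illustrative values only.
test_outputs = {
    "Density": [1.0, 2.0, 3.0],
    "Hardness": [1.0, 2.0, 3.0],
    "Strength": [0.0, 2.0, 4.0],
}
predicted_outputs = {
    "Density": [1.0, 2.0, 3.0],   # perfect prediction
    "Hardness": [2.0, 2.0, 2.0],  # no better than predicting the mean
    "Strength": [0.0, 2.0, 2.0],
}

quality = model_quality(test_outputs, predicted_outputs)
print(quality)
print(in_model_range(275.0, [200.0, 250.0, 300.0]))
```

Using the median rather than the mean keeps one poorly predicted Output from dominating the overall figure.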