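The following statistical procedures are covered: the mean, covariance, and correlation coefficient of array columns, and random samples drawn from uniform, Gaussian, triangular, beta, and gamma distributions.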
The mean, variance, covariance, and correlation coefficients of a multiple subscripted parameter are computed using the *MOPER command. Refer to Kreyszig ([163]) for the basis of the following formulas. All operations are performed on columns to conform to the database structure. The covariance is taken as a measure of the association between columns.
The following notation is used:
[x] = starting matrix
i = row index of first array parameter matrix
j = column index of first array parameter matrix
m = number of rows in first array parameter matrix
n = number of columns in first array parameter matrix
subscripts s, t = selected column indices
[S] = covariance matrix, n x n
[c] = correlation matrix, n x n
$\sigma_s^2$ = variance of column s
The mean of a column is:
$$\bar{x}_s = \frac{1}{m} \sum_{i=1}^{m} x_{is} \tag{16–18}$$
The covariance of the columns s and t is:
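$$S_{st} = \frac{1}{m-1} \sum_{i=1}^{m} \left(x_{is} - \bar{x}_s\right)\left(x_{it} - \bar{x}_t\right) \tag{16–19}$$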
The variance, $\sigma_s^2$, of column s is the diagonal term $S_{ss}$ of the covariance matrix [S]. The equivalent common definition of variance is:
$$\sigma_s^2 = \frac{1}{m-1} \sum_{i=1}^{m} \left(x_{is} - \bar{x}_s\right)^2 \tag{16–20}$$
The correlation coefficient is a measure of the independence or dependence between one column and the next. The correlation and mean operations are based on Hoel ([164]) and are initiated when CORR is entered in the Oper field of the *MOPER command.
Correlation coefficient:
$$c_{st} = \frac{S_{st}}{\sqrt{S_{ss}}\,\sqrt{S_{tt}}} \tag{16–21}$$
The values of the terms of the correlation matrix range from -1.0 to 1.0, where:
-1.0 = fully inversely related
0.0 = fully independent
1.0 = fully directly related
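The column statistics described above can be sketched in a few lines of Python. This is an illustration only, not the program's implementation: the function name `column_stats` and the list-of-rows matrix layout are hypothetical, and the covariance is assumed to use the sample normalization (m - 1).

```python
import math

def column_stats(x):
    """Return (means, S, c) for a matrix x given as a list of m rows of n columns."""
    m, n = len(x), len(x[0])
    # Mean of each column s
    means = [sum(row[s] for row in x) / m for s in range(n)]
    # Covariance of columns s and t (assumed sample normalization, m - 1)
    S = [[sum((row[s] - means[s]) * (row[t] - means[t]) for row in x) / (m - 1)
          for t in range(n)] for s in range(n)]
    # Correlation: covariance scaled by the standard deviations of both columns
    c = [[S[s][t] / math.sqrt(S[s][s] * S[t][t]) for t in range(n)]
         for s in range(n)]
    return means, S, c

# Column 2 is twice column 1; column 3 decreases as column 1 increases.
x = [[1.0, 2.0, 9.0],
     [2.0, 4.0, 7.0],
     [3.0, 6.0, 5.0],
     [4.0, 8.0, 3.0]]
means, S, c = column_stats(x)
```

In this example the first two columns are fully directly related and the first and third are fully inversely related, so the corresponding correlation terms sit at the two ends of the -1.0 to 1.0 range. A column with zero variance would cause a division by zero in this sketch.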
A vector can be filled with a random sample of real numbers based on a uniform distribution with given lower and upper bounds, using RAND in the Func field on the *VFILL command (see Figure 16.2: Uniform Density):
$$f(x) = \frac{1}{b-a}, \qquad a \le x \le b \tag{16–22}$$
where:
a = lower bound (input as CON1 on *VFILL command)
b = upper bound (input as CON2 on *VFILL command)
The numbers are generated using the URN algorithm of Swain and Swain ([162]). The initial seed numbers are hard-coded into the routine.
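As a rough illustration of a uniform fill, the sketch below scales unit-interval random numbers onto the interval between the lower and upper bounds. It uses Python's built-in generator (Mersenne Twister), not the URN algorithm cited above, and the function name `fill_uniform` and its fixed default seed are hypothetical stand-ins for the hard-coded seed behavior.

```python
import random

def fill_uniform(count, a, b, seed=12345):
    """Return `count` samples uniformly distributed between a and b."""
    rng = random.Random(seed)  # fixed seed: the same sequence on every call
    # Scale and shift unit-interval numbers u: x = a + (b - a) * u
    return [a + (b - a) * rng.random() for _ in range(count)]

samples = fill_uniform(1000, 2.0, 5.0)
```

Because the seed is fixed, repeated calls with the same arguments return the same vector, which mirrors the repeatability implied by hard-coded seeds.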
A vector may be filled with a random sample of real numbers based on a Gaussian distribution with a known mean and standard deviation (using GDIS in the Func field on the *VFILL command).
First, random numbers P(x), with a uniform distribution from 0.0 to 1.0, are generated using a random number generator. These numbers are used as probabilities to enter a cumulative standard normal probability distribution table (Abramowitz and Stegun ([161])), which can be represented by Figure 16.3: Cumulative Probability Function or the Gaussian distribution function:
$$P(x) = \int_{-\infty}^{x} f(t)\, dt \tag{16–23}$$
where:
f(t) = Gaussian density function
The table maps values of P(x) into values of x, which are standard Gaussian distributed random numbers from -5.0 to 5.0, and satisfy the Gaussian density function (Figure 16.4: Gaussian Density):
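$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\; e^{-(x-\mu)^2 / 2\sigma^2} \tag{16–24}$$
where:
μ = mean (input as CON1 on *VFILL command)
σ = standard deviation (input as CON2 on *VFILL command)
The standard x values (for which μ = 0 and σ = 1) are transformed into the final Gaussian distributed set of random numbers, z, with the given mean and standard deviation, by the transformation equation:
$$z = \sigma x + \mu \tag{16–25}$$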
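The inverse-table approach described above can be approximated as follows. This is a sketch under stated assumptions: Python's `statistics.NormalDist.inv_cdf` stands in for the tabulated cumulative standard normal distribution, the standard values are clamped to the -5.0 to 5.0 range mentioned in the text, and the function name `fill_gaussian` and its seed are hypothetical.

```python
import random
from statistics import NormalDist

def fill_gaussian(count, mean, std_dev, seed=12345):
    """Return `count` Gaussian samples with the given mean and standard deviation."""
    rng = random.Random(seed)
    standard = NormalDist()  # mean 0, standard deviation 1
    out = []
    for _ in range(count):
        p = rng.random()  # uniform probability P(x)
        # inv_cdf needs 0 < p < 1, so keep p inside the open interval
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        x = standard.inv_cdf(p)      # map probability to a standard value
        x = min(max(x, -5.0), 5.0)   # limit to the tabulated range
        out.append(std_dev * x + mean)  # shift and scale to the requested mean and deviation
    return out

samples = fill_gaussian(2000, 10.0, 2.0)
```

The last step inside the loop is the transformation from standard values to the requested mean and standard deviation; everything before it works only with the standard distribution.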
A vector may be filled with a random sample of real numbers based on a triangular distribution with a known lower bound, peak value location, and upper bound (using TRIA in the Func field on the *VFILL command).
First, random numbers P(x) are generated as in the Gaussian example. These P(x) values (probabilities) are substituted into the triangular cumulative probability distribution function:
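$$P(x) = \begin{cases} 0 & x < a \\[4pt] \dfrac{(x-a)^2}{(b-a)(c-a)} & a \le x \le c \\[8pt] 1 - \dfrac{(b-x)^2}{(b-a)(b-c)} & c < x \le b \\[8pt] 1 & b < x \end{cases} \tag{16–26}$$
where:
a = lower bound (input as CON1 on *VFILL command)
c = peak location (input as CON2 on *VFILL command)
b = upper bound (input as CON3 on *VFILL command)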
which is solved for values of x. These x values are random numbers with a triangular distribution, and satisfy the triangular density function (Figure 16.5: Triangular Density):
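$$f(x) = \begin{cases} \dfrac{2(x-a)}{(b-a)(c-a)} & a \le x \le c \\[8pt] \dfrac{2(b-x)}{(b-a)(b-c)} & c < x \le b \\[8pt] 0 & \text{otherwise} \end{cases} \tag{16–27}$$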
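Solving the triangular cumulative distribution for x has a closed form, so no table is needed. The sketch below assumes the standard triangular distribution with lower bound a, peak c, and upper bound b (with a < b); the function names and the seed are hypothetical.

```python
import math
import random

def triangular_inverse_cdf(p, a, c, b):
    """Solve P(x) = p for x, with lower bound a, peak c, upper bound b."""
    p_peak = (c - a) / (b - a)  # cumulative probability reached at the peak
    if p <= p_peak:
        # Rising branch: p = (x - a)^2 / ((b - a)(c - a))
        return a + math.sqrt(p * (b - a) * (c - a))
    # Falling branch: p = 1 - (b - x)^2 / ((b - a)(b - c))
    return b - math.sqrt((1.0 - p) * (b - a) * (b - c))

def fill_triangular(count, a, c, b, seed=12345):
    """Return `count` samples with a triangular distribution on [a, b]."""
    rng = random.Random(seed)
    return [triangular_inverse_cdf(rng.random(), a, c, b) for _ in range(count)]

samples = fill_triangular(1000, 0.0, 1.0, 4.0)
```

The branch is chosen by comparing the probability with the cumulative value at the peak, which is where the two pieces of the cumulative function meet.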
A vector may be filled with a random sample of real numbers based on a beta distribution with known lower and upper bounds and α and β parameters (using BETA in the Func field on the *VFILL command).
First, random numbers P(x) are generated as in the Gaussian example. These random values are used as probabilities to enter a cumulative beta probability distribution table, generated by the program. This table can be represented by a curve similar to Figure 16.3: Cumulative Probability Function, or by the beta cumulative probability distribution function:
$$P(x) = \int_{-\infty}^{x} f(t)\, dt \tag{16–28}$$
The table maps values of P(x) into x values which are random numbers from 0.0 to 1.0. The values of x have a beta distribution with given α and β values, and satisfy the beta density function (Figure 16.6: Beta Density):
$$f(x) = \begin{cases} \dfrac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)} & 0 < x < 1 \\[8pt] 0 & \text{otherwise} \end{cases} \tag{16–29}$$
where:
a = lower bound (input as CON1 on *VFILL command)
b = upper bound (input as CON2 on *VFILL command)
α = alpha parameter (input as CON3 on *VFILL command)
β = beta parameter (input as CON4 on *VFILL command)
B(α, β) = beta function
f(t) = beta density function
The x values are transformed into the final beta distributed set of random numbers, with given lower and upper bounds, by the transformation equation:
$$z = a + (b - a)\,x \tag{16–30}$$
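A table-based beta fill along these lines might look like the sketch below. How the program builds its table is not stated in the text, so the midpoint-rule integration, the table size, and the linear interpolation are all assumptions, as are the function names and the seed. Accuracy degrades when α or β is below 1, where the density is unbounded at an endpoint.

```python
import math
import random
from bisect import bisect_left

def beta_cdf_table(alpha, beta, steps=2000):
    """Tabulate the cumulative beta distribution on [0, 1] by the midpoint rule."""
    # Beta function B(alpha, beta) expressed through gamma functions
    norm = math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)
    h = 1.0 / steps
    xs, ps, total = [0.0], [0.0], 0.0
    for k in range(steps):
        t = (k + 0.5) * h  # midpoints never touch the endpoints 0 and 1
        total += t ** (alpha - 1.0) * (1.0 - t) ** (beta - 1.0) / norm * h
        xs.append((k + 1) * h)
        ps.append(total)
    # Rescale so the table ends exactly at 1.0 despite integration error
    return xs, [p / total for p in ps]

def fill_beta(count, a, b, alpha, beta, seed=12345):
    """Return `count` beta-distributed samples rescaled to the bounds a and b."""
    rng = random.Random(seed)
    xs, ps = beta_cdf_table(alpha, beta)
    out = []
    for _ in range(count):
        p = rng.random()
        k = min(max(bisect_left(ps, p), 1), len(ps) - 1)
        # Linear interpolation between table entries k - 1 and k
        span = ps[k] - ps[k - 1]
        frac = (p - ps[k - 1]) / span if span > 0.0 else 0.0
        x = xs[k - 1] + frac * (xs[k] - xs[k - 1])  # value between 0.0 and 1.0
        out.append(a + (b - a) * x)  # relocate to the requested bounds
    return out

samples = fill_beta(3000, 10.0, 20.0, 2.0, 3.0)
```

The table lookup works on the unit interval only; the bounds enter in the final line of the loop, matching the two-stage description in the text.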
A vector may be filled with a random sample of real numbers based on a gamma distribution with a known lower bound and α and β parameters (using GAMM in the Func field on the *VFILL command).
First, random numbers P(x) are generated as in the Gaussian example. These random values are used as probabilities to enter a cumulative gamma probability distribution table, generated by the program. This table can be represented by a curve similar to Figure 16.7: Gamma Density, or the gamma cumulative probability distribution function:
$$P(x) = \int_{-\infty}^{x} f(t)\, dt \tag{16–31}$$
where:
f(t) = gamma density function
The table maps values of P(x) into values of x, which are random numbers having a gamma distribution with given α and β values, and satisfy the gamma distribution density function (Figure 16.7: Gamma Density):
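$$f(x) = \begin{cases} \dfrac{\beta^{-\alpha}\, x^{\alpha-1}\, e^{-x/\beta}}{\Gamma(\alpha)} & x > 0 \\[8pt] 0 & \text{otherwise} \end{cases} \tag{16–32}$$
where:
α = alpha parameter of gamma density function (input as CON2 on *VFILL command)
β = beta parameter of gamma density function (input as CON3 on *VFILL command)
a = lower bound (input as CON1 on *VFILL command)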
The x values are relocated relative to the given lower bound by the transformation equation:
$$z = x + a \tag{16–33}$$
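The gamma fill can be sketched with the same table-lookup idea. Several points here are assumptions rather than documented behavior: β is treated as a scale parameter (density proportional to x^(α-1) e^(-x/β)), the table is truncated at an arbitrary upper limit of β(α + 12√α), and the table is built by midpoint integration with linear interpolation. The function names and the seed are hypothetical.

```python
import math
import random
from bisect import bisect_left

def gamma_cdf_table(alpha, beta, steps=4000):
    """Tabulate the cumulative gamma distribution, treating beta as a scale."""
    upper = beta * (alpha + 12.0 * math.sqrt(alpha))  # assumed truncation point
    h = upper / steps
    coeff = 1.0 / (math.gamma(alpha) * beta ** alpha)
    xs, ps, total = [0.0], [0.0], 0.0
    for k in range(steps):
        t = (k + 0.5) * h  # midpoint of the k-th interval
        total += coeff * t ** (alpha - 1.0) * math.exp(-t / beta) * h
        xs.append((k + 1) * h)
        ps.append(total)
    # Rescale so the truncated table ends exactly at 1.0
    return xs, [p / total for p in ps]

def fill_gamma(count, a, alpha, beta, seed=12345):
    """Return `count` gamma-distributed samples shifted by the lower bound a."""
    rng = random.Random(seed)
    xs, ps = gamma_cdf_table(alpha, beta)
    out = []
    for _ in range(count):
        p = rng.random()
        k = min(max(bisect_left(ps, p), 1), len(ps) - 1)
        # Linear interpolation between table entries k - 1 and k
        span = ps[k] - ps[k - 1]
        frac = (p - ps[k - 1]) / span if span > 0.0 else 0.0
        x = xs[k - 1] + frac * (xs[k] - xs[k - 1])
        out.append(x + a)  # relocate relative to the lower bound
    return out

samples = fill_gamma(3000, 5.0, 2.0, 1.5)
```

Under the scale-parameter assumption the sample mean should approach a + αβ, which gives a quick sanity check on the parameter order.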