A total of 11 performance metrics are used to evaluate an algorithm's performance on a particular problem, each of which is computed by the performance evaluation file.


Metrics 1-3: Average Suboptimality with Penalty for Constraint Violations

The first three metrics attempt to gauge "average suboptimality" as

\[ \frac{1}{k_{\rm final}+1} \sum_{k = 0}^{k_{\rm final}} \left( \phi_p({\bf u}_k) - \phi_p ({\bf u}^*) \right), \]

where \( {\bf u}^* \) is the best known optimal point of the problem. This quantity is particularly important in applications where the degree of optimality of each individual experiment matters, and where it is of interest to minimize the number of strongly suboptimal experiments. A penalty for constraint violations is included so as not to reward algorithms that obtain low (or even negative) average suboptimality by violating the constraints:

\[ \frac{1}{k_{\rm final}+1} \sum_{k = 0}^{k_{\rm final}} \left( \phi_p({\bf u}_k) - \phi_p ({\bf u}^*) \right) + \frac{1}{k_{\rm final}+1} \sum_{k = 0}^{k_{\rm final}} \sum_{j = 1}^{n_{g_p}} \mathop {\max} \left[ 0, \lambda g_{p,j}({\bf u}_k) \right] + \frac{1}{k_{\rm final}+1} \sum_{k = 0}^{k_{\rm final}} \sum_{j = 1}^{n_{g}} \mathop {\max} \left[ 0, \lambda g_{j}({\bf u}_k) \right] + \frac{1}{k_{\rm final}+1} \sum_{k = 0}^{k_{\rm final}} \sum_{i = 1}^{n_u} \mathop {\max} \left[ \lambda (u_i^L - u_{k,i}), 0, \lambda (u_{k,i} - u_i^U) \right], \]

with \( \lambda > 0 \) a penalty coefficient. So as not to introduce discrepancies due to scaling, a scaled version of the above is employed in defining the metrics:

\[ \frac{1}{k_{\rm final}+1} \sum_{k = 0}^{k_{\rm final}} \frac{ \phi_p({\bf u}_k) - \phi_p ({\bf u}^*) }{\Delta \phi_{max}} + \frac{1}{k_{\rm final}+1} \sum_{k = 0}^{k_{\rm final}} \sum_{j = 1}^{n_{g_p}} \mathop {\max} \left[ 0, \lambda \frac{g_{p,j}({\bf u}_k)}{g_{p,j}^{max}} \right] + \frac{1}{k_{\rm final}+1} \sum_{k = 0}^{k_{\rm final}} \sum_{j = 1}^{n_{g}} \mathop {\max} \left[ 0, \lambda \frac{g_{j}({\bf u}_k) }{g_j^{max}} \right] + \frac{1}{k_{\rm final}+1} \sum_{k = 0}^{k_{\rm final}} \sum_{i = 1}^{n_u} \mathop {\max} \left[ \lambda \frac{u_i^L - u_{k,i}}{u_i^U - u_i^L}, 0, \lambda \frac{u_{k,i} - u_i^U}{u_i^U - u_i^L} \right], \]

with \( \Delta \phi_{max} \), \( g_{p,j}^{max} \), and \( g_{j}^{max} \) all defined separately for each problem so as to give each function a range approximately equal to unity. The value of \( \lambda \) is set to 1 for Metric 1, 10 for Metric 2, and 100 for Metric 3 to reflect light, medium, and heavy penalties for violating the constraints, respectively. In the case that a tested algorithm obtains a value of 9999 or above for any of these metrics, which corresponds to dramatically poor performance, the reported value is simply capped at 9999.
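As a concrete illustration, the following is a minimal MATLAB sketch of how such a metric could be computed from a stored experiment history. The function name, variable names, and calling signature are all hypothetical and do not correspond to the actual performance evaluation file:

function J = avg_subopt_metric(phi, phistar, dphimax, gp, gpmax, g, gmax, u, uL, uU, lambda)
% Hypothetical helper sketching Metrics 1-3 (lambda = 1, 10, or 100); not
% part of the actual performance evaluation file. phi is a (kfinal+1)-by-1
% vector of plant objective values, gp and g hold the constraint values
% row-per-experiment, u is (kfinal+1)-by-nu, and gpmax, gmax, uL, uU are
% row vectors. Requires MATLAB R2018b or later for sum(...,'all').
K = numel(phi);                                    % kfinal + 1 experiments
J = sum((phi - phistar)/dphimax)/K;                % scaled average suboptimality
J = J + sum(max(0, lambda*gp./gpmax), 'all')/K;    % plant-constraint penalty
J = J + sum(max(0, lambda*g./gmax), 'all')/K;      % numerical-constraint penalty
r = uU - uL;                                       % input ranges for scaling
J = J + sum(max(0, lambda*(uL - u)./r) ...         % lower- and upper-bound
      + max(0, lambda*(u - uU)./r), 'all')/K;      % violation penalties
J = min(J, 9999);                                  % cap dramatically poor values
end

Here the three-argument max over the input bounds has been split into two max(0, ·) terms, which is equivalent since at most one of the two bounds can be violated at any given experiment.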


Metric 4: Number of Constraint Violations

Metric 4 simply counts the total number of constraint violations without providing any information on the degree of violation, and can take values ranging from 0 to \( (k_{\rm final}+1)(n_{g_p} + n_g + n_u) \). This metric may be useful for problems where any violation is extremely undesirable.
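Under the same hypothetical variable names as in the sketch above, this count can be obtained in a single line, since a violation is simply any positive constraint value or any input outside its bounds:

% Metric 4: total number of violations across all experiments (hypothetical names).
nviol = nnz(gp > 0) + nnz(g > 0) + nnz(u < uL) + nnz(u > uU);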


Metrics 5-7: Suboptimality at the Final Experiment with Penalty for Constraint Violations

Metrics 5-7 are similar to Metrics 1-3 but only consider the final experiment and ignore the suboptimality or constraint violations that may have taken place during the convergence process:

\[ \frac{ \phi_p({\bf u}_{k_{\rm final}}) - \phi_p ({\bf u}^*) }{\Delta \phi_{max}} + \sum_{j = 1}^{n_{g_p}} \mathop {\max} \left[ 0, \lambda \frac{g_{p,j}({\bf u}_{k_{\rm final}})}{g_{p,j}^{max}} \right] + \sum_{j = 1}^{n_{g}} \mathop {\max} \left[ 0, \lambda \frac{g_{j}({\bf u}_{k_{\rm final}}) }{g_j^{max}} \right] + \sum_{i = 1}^{n_u} \mathop {\max} \left[ \lambda \frac{u_i^L - u_{k_{\rm final},i}}{u_i^U - u_i^L}, 0, \lambda \frac{u_{k_{\rm final},i} - u_i^U}{u_i^U - u_i^L} \right]. \]

Such a metric may be important for applications where only the end result matters, with the costs of actually getting there considered negligible. As in Metrics 1-3, the penalty coefficient \( \lambda \) is set to 1 for Metric 5, 10 for Metric 6, and 100 for Metric 7. The upper limit of 9999 is enforced here as well when reporting very poor performance.
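Because the averaging in Metrics 1-3 divides by the number of experiments, passing only the final experiment to the hypothetical helper sketched earlier reduces it exactly to this expression:

% Metrics 5-7: penalized suboptimality of the final experiment only.
Jend = avg_subopt_metric(phi(end), phistar, dphimax, gp(end,:), gpmax, ...
                         g(end,:), gmax, u(end,:), uL, uU, lambda);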


Metrics 8-10: Number of Experiments Needed to Converge within a Tolerance

Metrics 8-10 measure the number of experiments needed for the algorithm to reduce the suboptimality of the initial experiment by a certain fraction and to maintain all further experiments below that threshold. Each is defined as the smallest number of experiments, \( \bar k \), satisfying

\[ \bar k: \phi_p({\bf u}_k) - \phi_p ({\bf u}^*) \leq \gamma \left( \phi_p({\bf u}_0) - \phi_p ({\bf u}^*) \right), \; \forall k \in [\bar k, k_{\rm final} ], \]

where \( \gamma \) is set to 0.5 for Metric 8, 0.3 for Metric 9, and 0.1 for Metric 10 to represent 50%, 70%, and 90% convergence, respectively. In the case that no \( \bar k \) satisfies the above condition, that particular realization is not included in the averaged performance data; however, the failure to converge within the allotted \( k_{\rm final} \) experiments is still made known when the percentage of realizations that converged (out of the 100 tested) is reported. These metrics essentially reflect the convergence speed of the algorithm.
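A minimal sketch of this search, again with hypothetical variable names, locates the last experiment whose suboptimality exceeds the tolerance:

% Metrics 8-10: smallest kbar such that all experiments from kbar onward stay
% within a fraction gamma of the initial suboptimality (hypothetical names).
ok = (phi - phistar) <= gamma*(phi(1) - phistar);   % within tolerance?
kbar = find(~ok, 1, 'last');                        % 1-based index of last violation
if isempty(kbar)
    kbar = 0;                                       % condition already holds at k = 0
elseif kbar == numel(phi)
    kbar = NaN;                                     % never converged; realization excluded
end
% Since phi(1) corresponds to k = 0, the 1-based index of the last violating
% experiment coincides with the 0-based \bar{k} sought above.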


Metric 11: Average Computation Time of Algorithm per Iteration

Metric 11 reports the average time, in seconds, needed for the algorithm to carry out its computations, i.e., the average time required to execute the line

[u(k+2,:),output] = algo(u,phiphat,gphat,sigmaphi,sigmag,uL,uU,algonum,input);

in algotest.m. This may be relevant for applications where the experiments are to be carried out at high frequency and where fast computation is desired.
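Should one wish to reproduce this measurement, a straightforward (if not necessarily identical) approach is to wrap the call with MATLAB's tic/toc stopwatch; the tcomp array below is hypothetical:

t0 = tic;                           % start the stopwatch for this iteration
[u(k+2,:),output] = algo(u,phiphat,gphat,sigmaphi,sigmag,uL,uU,algonum,input);
tcomp(k+1) = toc(t0);               % computation time of iteration k, in seconds
metric11 = mean(tcomp);             % after the run: Metric 11 as the average time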