Algorithm Performance and Statistical Significance

Introduction

When preparing to implement multi-objective optimisation experiments, it is often more convenient to use a ready-made framework or library than to program everything from scratch. Many such libraries and frameworks exist, written in many different programming languages. Because our focus is multi-objective optimisation, the choice is an easy one: we will use Platypus, which is built around multi-objective problems and optimisation.

Platypus is a framework for evolutionary computing in Python with a focus on multiobjective evolutionary algorithms (MOEAs). It differs from existing optimization libraries, including PyGMO, Inspyred, DEAP, and Scipy, by providing optimization algorithms and analysis tools for multiobjective optimization.

In this section, we will use the Platypus framework to compare the performance of the Non-dominated Sorting Genetic Algorithm II (NSGA-II) [1] and the Pareto Archived Evolution Strategy (PAES) [2]. To do this, we will use them to generate solutions to three problems in the ZDT test suite [3].

Because both of these algorithms are stochastic, meaning that they produce different results each time they are executed, we will collect a sample of 30 independent runs per algorithm per test problem. We will also use the default configurations for all the test problems and algorithms employed in this comparison. We will use the Hypervolume Indicator (introduced in earlier sections) as our performance metric.
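As a reminder of what the performance metric measures, here is a minimal sketch of the hypervolume for a two-objective minimisation problem. The function name hypervolume_2d, the example front, and the reference point (1.0, 1.0) are illustrative assumptions, not part of Platypus; in the experiments themselves the framework's own Hypervolume indicator would normally do this work.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def hypervolume_2d(front: Sequence[Point], reference: Point) -> float:
    """Area dominated by `front` and bounded by `reference`.

    Both objectives are assumed to be minimised (hypothetical helper).
    """
    # Points that do not improve on the reference in both objectives
    # contribute no area, so they are ignored.
    inside = [p for p in front if p[0] < reference[0] and p[1] < reference[1]]

    # Sweep from the smallest f1 to the largest, adding one rectangle for
    # each point that improves on the best f2 seen so far.
    area = 0.0
    best_f2 = reference[1]
    for f1, f2 in sorted(inside):
        if f2 < best_f2:
            area += (reference[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


# Hypothetical three-point approximation set, for illustration only.
front = [(0.2, 0.8), (0.5, 0.5), (0.8, 0.2)]
print(round(hypervolume_2d(front, reference=(1.0, 1.0)), 4))  # prints 0.37
```

Each of the 30 runs per algorithm per problem yields one such value, so every algorithm ends up with a sample of 30 hypervolume scores on each test problem.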

This time, we will also try to test the significance of our results.

Significance testing

Finally, let's test the significance of our pairwise comparison. The appropriate significance test depends on the nature of your dataset and other criteria; for example, non-parametric tests are often chosen when the data are not normally distributed. We will use the Wilcoxon signed-rank test, which is available as scipy.stats.wilcoxon():

The Wilcoxon signed-rank test tests the null hypothesis that two related paired samples come from the same distribution. In particular, it tests whether the distribution of the differences x - y is symmetric about zero. It is a non-parametric version of the paired T-test.

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.wilcoxon.html

This will give us some idea as to whether the results from one algorithm are significantly different from those from another algorithm.
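To make the test less of a black box, the following is a minimal standard-library sketch of the signed-rank statistic itself. The function name signed_rank_statistic and the sample values are hypothetical; the sketch assumes zero differences are discarded and tied absolute differences share their average rank, and it returns only the statistic, not a p-value.

```python
from typing import Sequence


def signed_rank_statistic(x: Sequence[float], y: Sequence[float]) -> float:
    """Return min(W+, W-) for the paired samples x and y (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")

    # Pairwise differences; zero differences carry no sign and are dropped.
    diffs = [a - b for a, b in zip(x, y) if a != b]

    # Rank |d| from smallest to largest, giving tied values their average rank.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[start]])):
            end += 1
        average_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1

    # Sum the ranks separately for positive and negative differences.
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return min(w_plus, w_minus)


# Hypothetical hypervolume values for two algorithms, not real results.
algorithm_a = [0.82, 0.85, 0.79, 0.88, 0.84, 0.81]
algorithm_b = [0.80, 0.86, 0.75, 0.83, 0.81, 0.75]
print(signed_rank_statistic(algorithm_a, algorithm_b))  # prints 1.0
```

A small statistic means that almost all of the ranked differences point the same way. In practice, calling scipy.stats.wilcoxon(x, y) on the two samples of hypervolume values returns both a statistic and the p-value needed to judge significance.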
