7

Algorithm Performance and Statistical Significance

 3 years ago
source link: https://datacrayon.com/posts/search-and-optimisation/practical-evolutionary-algorithms/algorithm-performance-and-statistical-significance/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Introduction

When preparing to implement multi-objective optimisation experiments, it's often more convenient to use a ready-made framework/library instead of programming everything from scratch. Many libraries and frameworks have been implemented in many different programming languages. With our focus on multi-objective optimisation, our choice is an easy one. We will choose Platypus which has a focus on multi-objective problems and optimisation.

Platypus is a framework for evolutionary computing in Python with a focus on multiobjective evolutionary algorithms (MOEAs). It differs from existing optimization libraries, including PyGMO, Inspyred, DEAP, and Scipy, by providing optimization algorithms and analysis tools for multiobjective optimization.

In this section, we will use the Platypus framework to compare the performance of the Non-dominated Sorting Genetic Algorithm II (NSGA-II)1 and the Pareto Archived Evolution Strategy (PAES)2. To do this, we will use them to generate solutions to three problems in the ZDT test suite3.

Because both of these algorithms are stochastic, meaning that they will produce different results every time they are executed, we will select a sufficient sample size of 30 per algorithm per test problem. We will also use the default configurations for all the test problems and algorithms employed in this comparison. We will use the Hypervolume Indicator (introduced in earlier sections) as our performance metric.

This time, we will also try to test the significance of our results.

Significance testing

Finally, let's test the significance of our pairwise comparison. The significance test you select depends on the nature of your data-set and other criteria, e.g. some select non-parametric tests if their data-sets are not normally distributed. We will use the Wilcoxon signed-rank test through the following function: scipy.stats.wilcoxon():

The Wilcoxon signed-rank test tests the null hypothesis that two related paired samples come from the same distribution. In particular, it tests whether the distribution of the differences x - y is symmetric about zero. It is a non-parametric version of the paired T-test.

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.wilcoxon.html

This will give us some idea as to whether the results from one algorithm are significantly different from those from another algorithm.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK