Table 1 Experimental setting for each method

From: Effective hyperparameter optimization using Nelder-Mead method in deep learning

Random search: Perform 600 random evaluations.
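As a concrete illustration of this baseline (not the authors' code), a minimal sketch assuming a placeholder objective `f` and search dimension `d`, with all variables in [0,1]:

```python
# Hypothetical sketch: 600 uniform random evaluations over [0,1]^d,
# keeping the best point. `d` and `f` are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d = 5                                   # number of hyperparameters (assumed)
f = lambda x: np.sum((x - 0.3) ** 2)    # placeholder objective

X = rng.random((600, d))                # 600 random evaluations
best = min(X, key=f)
print("best value:", f(best))
```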

Bayesian optimization: Initialize the observation data with the first 100 evaluations of the random search, then run the optimization for 500 evaluations. The kernel is the ARD Matérn 5/2 and the acquisition function is expected improvement (EI) [8, 10].
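A minimal sketch of this setting, assuming a placeholder objective and a random candidate pool for maximizing EI (the table does not specify how the acquisition is optimized):

```python
# Hedged sketch: GP surrogate with an ARD Matern 5/2 kernel and EI,
# warm-started from 100 random evaluations. `d`, `f`, and the candidate
# pool are illustrative assumptions, not the authors' setup.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)
d = 5
f = lambda x: np.sum((x - 0.3) ** 2)

X = rng.random((100, d))                 # first 100 random-search points
y = np.array([f(x) for x in X])

kernel = Matern(length_scale=np.ones(d), nu=2.5)   # vector length scale = ARD
for _ in range(500):                     # 500 further evaluations
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)
    cand = rng.random((1000, d))         # random EI-maximization pool (assumed)
    mu, sd = gp.predict(cand, return_std=True)
    z = (y.min() - mu) / np.maximum(sd, 1e-12)
    ei = (y.min() - mu) * norm.cdf(z) + sd * norm.pdf(z)   # EI (minimization)
    x_next = cand[np.argmax(ei)]
    X, y = np.vstack([X, x_next]), np.append(y, f(x_next))
print("best:", y.min())
```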

CMA-ES: Perform 600 evaluations, i.e., 20 generations of 30 individuals each. \(\langle \mathbf{x} \rangle_{w}^{(0)} = 0.5\), \(\sigma^{(0)} = 0.2\). All variables are scaled to [0,1] [10].
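A minimal sketch of this setting with the pycma package; the objective and dimension are placeholders:

```python
# Hedged sketch: 20 generations x 30 individuals = 600 evaluations,
# initial mean 0.5 in every coordinate, initial step size 0.2, bounds [0,1].
# `d` and `f` are illustrative assumptions.
import numpy as np
import cma

d = 5
f = lambda x: float(np.sum((np.asarray(x) - 0.3) ** 2))

es = cma.CMAEvolutionStrategy(
    0.5 * np.ones(d),                   # <x>_w^(0) = 0.5
    0.2,                                # sigma^(0) = 0.2
    {"popsize": 30, "bounds": [0, 1], "maxiter": 20, "verbose": -9},
)
while not es.stop():
    xs = es.ask()                       # sample one generation
    es.tell(xs, [f(x) for x in xs])     # rank and update the distribution
print("best:", es.result.fbest)
```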

Coordinate-search method: Initialize \(x_{0}\) as the best point of the first 100 random-search evaluations, then perform optimization for up to 500 evaluations. \(\alpha = 0.5\). All variables are scaled to [0,1].
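A minimal sketch of a coordinate (compass) search under this setting; the step-halving rule on a failed sweep and the clipping to [0,1] are assumptions about details the table does not spell out:

```python
# Hedged sketch: poll +/- alpha along each coordinate, accept improvements,
# halve alpha when a full sweep fails, stop after 500 evaluations.
# `d`, `f`, and the halving rule are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d = 5
f = lambda x: np.sum((x - 0.3) ** 2)

X0 = rng.random((100, d))
y0 = np.array([f(x) for x in X0])
x, fx = X0[np.argmin(y0)].copy(), y0.min()   # best of 100 random points

alpha, evals = 0.5, 0
while evals < 500:
    improved = False
    for i in range(d):
        for step in (alpha, -alpha):
            if evals >= 500:
                break
            cand = x.copy()
            cand[i] = np.clip(cand[i] + step, 0.0, 1.0)
            fc, evals = f(cand), evals + 1
            if fc < fx:
                x, fx, improved = cand, fc, True
    if not improved:
        alpha /= 2                            # shrink the step (assumed rule)
print("best:", fx)
```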

Nelder-Mead method: Generate an initial simplex randomly, then perform optimization for up to 600 evaluations (including initialization). \(\gamma^{s} = \frac{1}{2}\), \(\delta^{ic} = -\frac{1}{2}\), \(\delta^{oc} = \frac{1}{2}\), \(\delta^{r} = 1\), and \(\delta^{e} = 2\).
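These are the standard Nelder-Mead coefficients (reflection 1, expansion 2, outside/inside contraction ±1/2, shrink 1/2), which SciPy's non-adaptive implementation also uses, so the setting can be sketched as follows (dimension and objective are placeholders):

```python
# Hedged sketch: random initial simplex in [0,1]^d, at most 600 evaluations.
# SciPy's non-adaptive Nelder-Mead uses the standard coefficients matching
# the table. `d` and `f` are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
d = 5
f = lambda x: np.sum((x - 0.3) ** 2)

simplex = rng.random((d + 1, d))        # random initial simplex
res = minimize(
    f, simplex[0], method="Nelder-Mead",
    options={"initial_simplex": simplex, "maxfev": 600, "adaptive": False},
)
print("best:", res.fun)
```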