From: Effective hyperparameter optimization using Nelder-Mead method in deep learning
Name | Description | Range |
---|---|---|
\(x_{1}\) | Learning rate (\(= 0.1^{x_{1}}\)) | [1, 4] |
\(x_{2}\) | Momentum (\(= 1 - 0.1^{x_{2}}\)) | [0.5, 2] |
\(x_{3}\) | L2 weight decay | [0.001, 0.01] |
\(x_{4}\) | Dropout 1 | [0.4, 0.6] |
\(x_{5}\) | Dropout 2 | [0.4, 0.6] |
\(x_{6}^{*}\) | FC 1 units | [512, 1024] |
\(x_{7}^{*}\) | FC 2 units | [256, 512] |
\(x_{8}\) | Conv 1 initialization deviation | [0.01, 0.05] |
\(x_{9}\) | Conv 2 initialization deviation | [0.01, 0.05] |
\(x_{10}\) | Conv 3 initialization deviation | [0.01, 0.05] |
\(x_{11}\) | FC 1 initialization deviation | [0.001, 0.01] |
\(x_{12}\) | FC 2 initialization deviation | [0.001, 0.01] |
\(x_{13}\) | FC 3 initialization deviation | [0.001, 0.01] |
\(x_{14}\) | Conv 1 bias | [0, 1] |
\(x_{15}\) | Conv 2 bias | [0, 1] |
\(x_{16}\) | Conv 3 bias | [0, 1] |
\(x_{17}\) | FC 1 bias | [0, 1] |
\(x_{18}\) | FC 2 bias | [0, 1] |
\(x_{19}^{*}\) | Normalization 1 localsize (\(= 2x_{19} + 3\)) | [0, 2] |
\(x_{20}^{*}\) | Normalization 2 localsize (\(= 2x_{20} + 3\)) | [0, 2] |
\(x_{21}\) | Normalization 1 alpha | [0.0001, 0.0002] |
\(x_{22}\) | Normalization 2 alpha | [0.0001, 0.0002] |
\(x_{23}\) | Normalization 1 beta | [0.5, 0.95] |
\(x_{24}\) | Normalization 2 beta | [0.5, 0.95] |
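
Three rows of the table search over a transformed variable rather than the hyperparameter itself: the learning rate is \(0.1^{x_{1}}\), the momentum is \(1 - 0.1^{x_{2}}\), and each normalization localsize is \(2x + 3\). The Python sketch below shows one way to decode raw search values into the quantities a training script would consume. The function name `decode`, the dictionary keys, and the rounding step are illustrative assumptions, not taken from the source. In particular, the table does not state what the asterisk means; the sketch assumes it marks integer-valued parameters and rounds them to the nearest integer.

```python
def decode(x):
    """Map raw search values to hyperparameters.

    `x` is a dict keyed by the 1-based index used in the table.
    Rounding the starred entries is an assumption of this sketch.
    """
    return {
        # Searched on a log scale: x1 in [1, 4] gives 0.1 down to 0.0001.
        "learning_rate": 0.1 ** x[1],
        # x2 in [0.5, 2] gives roughly 0.68 up to 0.99.
        "momentum": 1 - 0.1 ** x[2],
        # Starred entries, assumed to be integers.
        "fc1_units": round(x[6]),
        "fc2_units": round(x[7]),
        # x in {0, 1, 2} maps to an odd window size in {3, 5, 7}.
        "norm1_localsize": 2 * round(x[19]) + 3,
        "norm2_localsize": 2 * round(x[20]) + 3,
    }


params = decode({1: 2.0, 2: 1.0, 6: 700.4, 7: 300.0, 19: 1.0, 20: 2.0})
```

Searching over the exponent rather than the value itself spreads the candidates evenly across orders of magnitude, which suits scale-sensitive quantities such as the learning rate.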