optimizer - optimizer class

The file yann.modules.optimizer.py contains the definition for the optimizer:
class yann.modules.optimizer.optimizer(optimizer_init_args, verbose=1)

Optimizer is an important module of the toolbox. Optimizer creates the protocols required for learning.
yann's optimizer supports the following optimization techniques, one of which is sketched after the list:

- Stochastic Gradient Descent
- AdaGrad [1]
- RmsProp [2]
- Adam [3]
- Adadelta [4]
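
To give a concrete flavour of one of these techniques, here is a minimal NumPy sketch of a single AdaGrad update [1]. This is an illustration only, not the toolbox's Theano implementation; all names (adagrad_step, cache, lr, eps) are hypothetical:

    import numpy as np

    def adagrad_step(param, grad, cache, lr=0.01, eps=1e-8):
        # Accumulate the squared gradient, then scale the step per-parameter,
        # so frequently-updated weights receive smaller effective learning rates.
        cache = cache + grad ** 2
        param = param - lr * grad / (np.sqrt(cache) + eps)
        return param, cache
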
Optimizer also supports the following momentum techniques, sketched after the list:
- Polyak [5]
- Nesterov [6]
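
The difference between the two momentum flavours is easiest to see in code. Below is a minimal sketch in plain Python; the function and variable names (polyak_step, nesterov_step, velocity, mom) are illustrative and not part of the toolbox:

    def polyak_step(param, grad, velocity, lr=0.01, mom=0.9):
        # Polyak (classical) momentum [5]: move along an exponentially
        # decaying average of past gradients.
        velocity = mom * velocity - lr * grad
        return param + velocity, velocity

    def nesterov_step(param, grad_fn, velocity, lr=0.01, mom=0.9):
        # Nesterov momentum [6]: evaluate the gradient at the look-ahead
        # point (param + mom * velocity) before taking the step.
        velocity = mom * velocity - lr * grad_fn(param + mom * velocity)
        return param + velocity, velocity

Evaluating the gradient at the look-ahead point is what gives Nesterov momentum its improved convergence rate on smooth convex problems [6].
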
[1] John Duchi, Elad Hazan, and Yoram Singer. "Adaptive subgradient methods for online learning and stochastic optimization." JMLR (2011).
[2] Yann N. Dauphin, Harm de Vries, Junyoung Chung, and Yoshua Bengio. "RMSProp and equilibrated adaptive learning rates for non-convex optimization." arXiv preprint arXiv:1502.04390 (2015).
[3] Kingma, Diederik, and Jimmy Ba. "Adam: A method for stochastic optimization." arXiv preprint arXiv:1412.6980 (2014).
[4] Zeiler, Matthew D. "ADADELTA: An adaptive learning rate method." arXiv preprint arXiv:1212.5701 (2012).
[5] Polyak, Boris Teodorovich. "Some methods of speeding up the convergence of iteration methods." USSR Computational Mathematics and Mathematical Physics 4.5 (1964): 1-17. Implementation adapted from Sutskever, Ilya, et al. "On the importance of initialization and momentum in deep learning." Proceedings of the 30th International Conference on Machine Learning (ICML-13). 2013.
[6] Nesterov, Yurii. "A method of solving a convex programming problem with convergence rate O(1/k^2)." Soviet Mathematics Doklady 27.2 (1983). Adapted from Sebastien Bubeck's blog.

Parameters:

- verbose – Similar to any 3-level verbose in the toolbox.
- optimizer_init_args – optimizer_init_args is a dictionary like:

    optimizer_params = {
        "momentum_type"   : <option>  'false' <no momentum>, 'polyak', 'nesterov'.
                            Default value is 'false'.
        "momentum_params" : (<value in range [0,1]>, <value in range [0,1]>, <int>)
                            (momentum coefficient at the start, at the end, and the
                            epoch at which the momentum increase ends).
                            Default is the tuple (0.5, 0.95, 50).
        "optimizer_type"  : <option>  'sgd', 'adagrad', 'rmsprop', 'adam'.
                            Default is 'sgd'.
        "id"              : id of the optimizer
    }

  A usage sketch based on this dictionary follows the field list below.
Returns: Optimizer object
Return type: yann.modules.optimizer.optimizer
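
Based on the constructor signature documented above, a minimal usage sketch might look like the following; the particular option values chosen here are hypothetical:

    from yann.modules.optimizer import optimizer

    # Hypothetical values for each key documented above.
    optimizer_params = {
        "momentum_type"   : 'nesterov',
        "momentum_params" : (0.5, 0.95, 50),  # coefficient at start, at end, end epoch
        "optimizer_type"  : 'rmsprop',
        "id"              : 'main'
    }

    opt = optimizer(optimizer_init_args=optimizer_params, verbose=1)
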
calculate_gradients(params, objective, verbose=1)

This method initializes the gradients of the objective with respect to the supplied parameters.
Parameters:

- params – Supply the learnable active parameters of a network.
- objective – Supply a theano graph connecting the params to a loss.
- verbose – Similar to any 3-level verbose in the toolbox.
Notes

Once this is set up, optimizer.gradients are available.
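
For instance, assuming an optimizer object opt built as in the earlier sketch, a toy Theano graph can be wired up and its gradients initialized like this (the variables w, x, and loss are illustrative only):

    import numpy as np
    import theano
    import theano.tensor as T

    # A toy shared parameter and a theano graph connecting it to a loss.
    w = theano.shared(np.zeros(10, dtype=theano.config.floatX), name='w')
    x = T.vector('x')
    loss = T.sum((w - x) ** 2)

    # Signature as documented above.
    opt.calculate_gradients(params=[w], objective=loss, verbose=1)

    # Per the Notes above, the gradient expressions are now available.
    print(opt.gradients)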