network - The network module

yann.network.py contains the definition of the base network class. It is the most accessible part of this toolbox and forms the structure of the toolbox itself. Any experiment using this toolbox will begin and end with the network class:

class yann.network.network(verbose=2, **kwargs)[source]

Class definition for the class network:

All network properties, network variables and functionalities are initialized using this class and are contained by the network object. The network.__init__ method initializes the network class and serves many purposes depending on the arguments supplied.

Provide any or all of the following arguments. Appropriate errors will be thrown if the parameters are not supplied correctly.

Todo

  • posteriors in a classifier layer are not really probabilities. Need to fix this.
Parameters:
  • verbose – Similar to any 3-level verbose in the toolbox.
  • type – Takes only ‘classifier’ for now; ‘encoders’ and others will be added later.
  • borrow – Refer to Theano’s borrow. Default is True.
Returns:

network object with parameters setup.

Return type:

yann.network.network
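
The constructor is typically the first call of an experiment. A minimal sketch (the layers and modules that follow are added with add_layer and add_module, described below):

    from yann.network import network

    # create an empty network object; everything else is added to it later
    net = network(verbose = 2)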

add_layer(type, verbose=2, **kwargs)[source]

Todo

Need to add the following:

  • Inception layer.
  • LSTM layer.
  • ...

Parameters:
  • type – <string> options include:
      ‘input’ or ‘data’ - an input layer;
      ‘conv_pool’ or ‘convolution’ - a convolutional-pooling layer;
      ‘deconv’ or ‘deconvolution’ - a fractionally strided convolution layer;
      ‘dot_product’ or ‘hidden’ or ‘mlp’ or ‘fully_connected’ - a hidden fully connected layer;
      ‘classifier’ or ‘softmax’ or ‘output’ or ‘label’ - a classifier layer;
      ‘objective’ or ‘loss’ or ‘energy’ - a layer that creates a loss function;
      ‘merge’ or ‘join’ - a layer that merges two layers;
      ‘flatten’ - a layer that produces a flattened output of a block of data;
      ‘random’ - a layer that produces random numbers;
      ‘rotate’ - a layer that rotates the input images;
      ‘tensor’ - a layer that converts an input tensor into an input layer.
    Everything from here on is an optional keyword argument.
  • id – <string> name by which to identify the layer. Default is the layer number, starting at 0.
  • origin – <string> id of the layer whose output will be used as input to the new layer. Default is the last layer created. For input layers this is not a layer id but a datastream id. For a merge layer, this is a tuple of two layer ids.
  • verbose – similar to the rest of the toolbox.
  • mean_subtract – if True, the mean will be subtracted from each image; otherwise not.
  • num_neurons – number of neurons in the layer
  • dataset – <string> location of the dataset. Used when the layer type is input.
  • activation – <string> takes options that are listed in the activations module. Needed for layers that use activations. Some activations also take support parameters; for instance, maxout takes a maxout type and size, and softmax takes a temperature option. Refer to the activations module to know more.
  • stride – tuple (int , int). Used as convolution stride. Default (1,1)
  • batch_norm – <bool> if True, batch normalization is applied. Default is False.
  • border_mode – Refer to border_mode variable in yann.core.conv, module conv
  • pool_size – Subsample size, default is (1,1).
  • pool_type – Refer to pool for details. {‘max’, ‘sum’, ‘mean’, ‘max_same_size’}
  • learnable – Default is True; if True, we backprop on that layer. If False, the layer is obstinate (its parameters are not updated).
  • shape – tuple (height, width, channels) to unflatten to, used when the layer is an unflatten layer.
  • input_params – Supply params or initializations from a pre-trained system.
  • dropout_rate – If you want to dropout this layer’s output, provide the dropout probability.
  • regularize – True if you want to apply regularization, False if not.
  • num_classes – <int> number of classes to classify.
  • objective – objective provided by the classifier: nll (negative log likelihood), cce (categorical cross entropy), bce (binary cross entropy) or hinge (hinge loss). For the classifier layer.
  • dataset_init_args – same as for the dataset module. This argument is needed only when the dataset module is not set up.
  • datastream_id – When using an input layer or an objective layer, use this to identify which datastream to take data from.
  • regularizer – Default is (0.001, 0.001), the coefficients for the L1 and L2 regularizers.
  • error – merge layers take an option called ‘error’, which can be None or one of the methods in yann.core.errors.
  • angle – Takes a value between [0,1] to capture an angle between [0,180] degrees. Default is None; if None, a random number is generated from a uniform distribution between 0 and 1.
  • layer_type – If a value is supplied it is used, else the default is ‘discriminator’. For other layers, if the layer class takes a type argument, supply that argument here as layer_type. The merge layer, for instance, will use this argument as its type argument.
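
As a sketch, the layers of a small multi-layer perceptron could be stacked as follows. The ids 'input', 'hidden', 'softmax' and 'nll', and the datastream id 'mnist', are hypothetical; the datastream itself is created with add_module, described next:

    # input layer fed from a datastream with id 'mnist'
    net.add_layer(type = "input", id = "input", origin = "mnist", verbose = 2)

    # fully connected hidden layer of 800 neurons with ReLU activation
    net.add_layer(type = "dot_product", id = "hidden", origin = "input",
                  num_neurons = 800, activation = "relu", verbose = 2)

    # softmax classifier over 10 classes
    net.add_layer(type = "classifier", id = "softmax", origin = "hidden",
                  num_classes = 10, activation = "softmax", verbose = 2)

    # negative log likelihood loss on top of the classifier
    net.add_layer(type = "objective", id = "nll", origin = "softmax",
                  objective = "nll", verbose = 2)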
add_module(type, params=None, verbose=2)[source]

Use this function to add a module to the net.

Parameters:
  • type – which module to add. Options are 'resultor', 'visualizer', 'optimizer' and 'datastream'.
  • params

    If the type was 'resultor' params is a dictionary of the form:

    params    =    {
            "root"     : "<root directory to save stuff inside>",
            "results"  : "<results_file_name>.txt",
            "errors"   : "<error_file_name>.txt",
            "costs"    : "<cost_file_name>.txt",
            "confusion": "<confusion_file_name>.txt",
            "network"  : "<network_save_file_name>.pkl",
            "id"       : id of the resultor
                    }
    

    While the filenames are optional, root must be provided. If a particular file is not provided, that value will not be saved. This value is supplied to set up the resultor module of the network.

    If the type was 'visualizer' params is a dictionary of the form:

    params = {
            "root"        : location to save the visualizations
            "frequency"   : <integer>, after how many epochs do you need to
                            visualize. Default value is 1
            "sample_size" : <integer, prefer squares>, simply save down random
                            images from the datasets; activations for the same
                            images are also saved down. Default value is 16
            "rgb_filters" : <bool> flag. If True, 3D-RGB CNN filters are rendered.
                            Default value is False
            "id"          : id of the visualizer
                    }
    

    If the type was 'optimizer' params is a dictionary of the form:

    params =  {
            "momentum_type"       : <option> takes 'false' <no momentum>, 'polyak'
                                    and 'nesterov'. Default value is 'polyak'
            "momentum_params"   : (<value in [0,1]>, <value in [0,1]>, <int>),
                                    (momentum coeffient at start, at end, at what
                                    epoch to end momentum increase). Default is
                                    the tuple (0.5, 0.95,50)
            "learning_rate"   : (initial_learning_rate, fine_tuning_learning_rate,
                                    annealing_decay_rate). Default is the tuple
                                    (0.1,0.001,0.005)
            "regularization"    : (l1_coeff, l2_coeff). Default is (0.001, 0.001)
            "optimizer_type": <option>, takes 'sgd', 'adagrad', 'rmsprop', 'adam'.
                                    Default is 'rmsprop'
            "objective_function": <option>,  takes
                                    'nll'-negative log likelihood,
                                    'cce'-categorical cross entropy,
                                    'bce'-binary cross entropy.
                                    Default is 'nll'
            "id"                : id of the optimizer
                }
    

    If the type was 'datastream', params is a dictionary of the form:

    params = {
                "dataset"   : <location>
                "svm"       : False or True
                    if svm is True, a one-hot label set will also be set up.
                "n_classes" : <int>
                    needed only if svm is True, to know how many classes
                    are present.
                "id"        : id of the datastream
        }
    
  • verbose – Similar to rest of the toolbox.
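
For instance, an optimizer and a datastream might be added as follows. This is a sketch: the parameter values mirror the defaults listed above, the ids 'main' and 'mnist' are hypothetical, and the dataset string is a placeholder for wherever your dataset was saved:

    optimizer_params = {
            "momentum_type"      : "polyak",
            "momentum_params"    : (0.5, 0.95, 50),
            "learning_rate"      : (0.1, 0.001, 0.005),
            "regularization"     : (0.001, 0.001),
            "optimizer_type"     : "rmsprop",
            "objective_function" : "nll",
            "id"                 : "main"
                }
    net.add_module(type = 'optimizer', params = optimizer_params, verbose = 2)

    datastream_params = {
            "dataset" : "<location of your dataset>",   # placeholder
            "svm"     : False,
            "id"      : "mnist"
        }
    net.add_module(type = 'datastream', params = datastream_params, verbose = 2)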
cook(verbose=2, **kwargs)[source]

This function builds the backprop network, and makes the trainer, tester and validator theano functions. The trainer builds the trainers for a particular objective layer and optimizer.

Parameters:
  • optimizer – Supply which optimizer to use. Default is last optimizer created.
  • datastream – Supply which datastream to use. Default is the last datastream created.
  • visualizer – Supply a visualizer to cook with. Default is the last visualizer created.
  • classifier_layer – supply the layer of classifier. Default is the last classifier layer created.
  • objective_layers – Supply a list of ids of layers that have objective functions. Default is the last objective layer created if no classifier is provided.
  • objective_weights – Supply a list of weights to be multiplied by each value of the objective layers. Default is 1.
  • active_layers – Supply a list of active layers. If this parameter is supplied, the ‘learnable’ flag of all layers will be ignored and only these layers will be trained. By default, all the learnable layers are used.
  • verbose – Similar to the rest of the toolbox.
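
Continuing the sketch from add_layer and add_module above (all ids are the hypothetical ones used there):

    net.cook(optimizer = 'main',
             datastream = 'mnist',
             classifier_layer = 'softmax',
             objective_layers = ['nll'],
             verbose = 2)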
deactivate_layer(id, verbose=2)[source]

This method will remove a layer’s parameters from the active_layer dictionary.

Parameters:
  • id – Layer which you want to deactivate.
  • verbose – as usual.

Notes

If the network was cooked, it would have to be re-cooked after deactivation.
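
A sketch, freezing the hypothetical 'hidden' layer from earlier:

    # remove the layer's parameters from the set of trained parameters
    net.deactivate_layer(id = 'hidden', verbose = 2)

    # the network must be cooked again before training resumes
    net.cook(verbose = 2)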

get_params(verbose=2)[source]

This method returns a dictionary of layer weights and bias in numpy format.

Parameters:verbose – As usual.
Returns:A dictionary of parameters.
Return type:OrderedDict
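
For instance, to inspect the shapes of what was learned (a sketch; each entry is assumed to hold the layer's weights and biases as numpy arrays, as described above):

    params = net.get_params(verbose = 2)
    for layer_id, layer_params in params.items():
        # print the shape of every parameter array in the layer
        print(layer_id, [p.shape for p in layer_params])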
layer_activity(id, index=0, verbose=2)[source]

Use this function to visualize or print out the outputs of each layer. I don’t know why this might be useful, but it’s fun to check this out I guess. This will only work after the dataset is initialized.

Parameters:
  • id – id of the layer that you want to visualize the output for.
  • index – Which batch of data should I use for producing the outputs. Default is 0
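
A sketch, using the hypothetical 'hidden' layer id again:

    # print out the activations of layer 'hidden' for minibatch 0
    net.layer_activity(id = 'hidden', index = 0, verbose = 2)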
pretty_print(verbose=2)[source]

This method is used to pretty print the network’s connections. This is going to be deprecated with the use of the visualizer module.

print_status(epoch, print_lr=False, verbose=2)[source]

This function prints the cost of the current epoch, learning rate and momentum of the network at the moment. This also calls the resultor to process results.

Todo

This needs to go to the visualizer.

Parameters:
  • verbose – Just as always.
  • epoch – Which epoch are we at ?
save_params(epoch=0, verbose=2)[source]

This method will save down a list of network parameters.

Parameters:
  • verbose – As usual
  • epoch – the epoch at which the parameters are being saved.
test(show_progress=True, verbose=2)[source]

This function is used for producing the testing accuracy.

Parameters:
  • show_progress – Display progressbar?
  • verbose – As usual
train(verbose=2, **kwargs)[source]

Training function of the network. Calling this will begin training.

Parameters:
  • epochs – (num_epochs for each learning rate ... ) to train with. Default is (20, 20).
  • validate_after_epochs – Default is 1; after how many epochs do you want to validate?
  • save_after_epochs – Default is 1; save the network after that many epochs of training.
  • show_progress – Default is True; will display a clean progressbar. If verbose is 3 or more, this is set to False.
  • early_terminate – True will allow early termination.
  • learning_rates – (annealing_rate, learning_rates ... ); length must be one more than epochs. Default is (0.05, 0.01, 0.001).
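
Putting the pieces together, a typical run after cooking might look like this (the values shown are the defaults listed above):

    net.train(epochs = (20, 20),
              validate_after_epochs = 1,
              learning_rates = (0.05, 0.01, 0.001),
              show_progress = True,
              early_terminate = True,
              verbose = 2)

    # measure accuracy on the test set once training is done
    net.test(show_progress = True, verbose = 2)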
validate(epoch=0, training_accuracy=False, show_progress=False, verbose=2)[source]

This method is used to run validation. It will also load the validation dataset.

Parameters:
  • verbose – Just as always
  • show_progress – Display progressbar ?
  • training_accuracy – Do you want to print accuracy on the training set as well ?
visualize(epoch=0, verbose=2)[source]

This method will use the cooked visualizer to save down the visualizations.

Parameters:epoch – supply the epoch number (used to create directories to save the visualizations).
visualize_activities(epoch=0, verbose=2)[source]

This method will save down all layer activities for the correct epoch.

Parameters:
  • epoch – Which epoch is being run now.
  • verbose – As always.
visualize_filters(epoch=0, verbose=2)[source]

This method will save down all layer filters for the correct epoch.

Parameters:
  • epoch – Which epoch is being run now.
  • verbose – As always.