network - The network module

yann.network.py contains the definition for the base network class. It is the most accessible part of this toolbox and forms the structure of the toolbox itself. Any experiment using this toolbox will begin and end with the network class:
- class yann.network.network(verbose=2, **kwargs)

  Class definition for the network class. All network properties, network variables and functionality are initialized using this class and are contained by the network object. The network.__init__ method initializes the network class and serves many purposes depending on the arguments supplied. Provide any or all of the following arguments; appropriate errors will be thrown if the parameters are not supplied correctly.
  Todo
  - posteriors in a classifier layer are not really probabilities. Need to fix this.
  Parameters:
  - verbose – Similar to any 3-level verbose in the toolbox.
  - type – takes only 'classifier' for now. 'encoders' and others will be added later.
  - borrow – Check theano's borrow. Default is True.

  Returns: network object with parameters set up.
  Return type: yann.network.network
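  As a minimal sketch, constructing a bare network object looks like this (layers and modules are added afterwards):

      from yann.network import network

      # Create an empty network object; nothing is built until layers,
      # modules and a cook() call are supplied.
      net = network(verbose=2)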
- add_layer(type, verbose=2, **kwargs)

  Todo
  Need to add the following: Inception layer, LSTM layer, ...
  Parameters:
  - type – <string> options include:
    - 'input' or 'data' – an input layer
    - 'conv_pool' or 'convolution' – a convolution-pooling layer
    - 'deconv' or 'deconvolution' – a fractional-stride convolution layer
    - 'dot_product' or 'hidden' or 'mlp' or 'fully_connected' – a hidden, fully connected layer
    - 'classifier' or 'softmax' or 'output' or 'label' – a classifier layer
    - 'objective' or 'loss' or 'energy' – a layer that creates a loss function
    - 'merge' or 'join' – a layer that merges two layers
    - 'flatten' – a layer that produces a flattened output of a block of data
    - 'random' – a layer that produces random numbers
    - 'rotate' – a layer that rotates the input images
    - 'tensor' – a layer that converts an input tensor into an input layer
    Everything from here on is an optional argument.
  - id – <string> how to identify the layer. Default is the layer number, starting at 0.
  - origin – id of the layer whose output will be used as input to this layer. Default is the last layer created. For input-type layers this is not a layer but a datastream id. For merge layers, this is a tuple of two layer ids.
  - verbose – similar to the rest of the toolbox.
  - mean_subtract – if True, the mean will be subtracted from each image; otherwise it will not.
  - num_neurons – number of neurons in the layer.
  - dataset – <string> location of the dataset. Used when the layer type is input.
  - activation – string; takes options listed in activations. Needed for layers that use activations. Some activations also take support parameters: for instance, maxout takes a maxout type and size, and softmax takes an optional temperature. Refer to the activations module for more.
  - stride – tuple (int, int). Used as the convolution stride. Default is (1,1).
  - batch_norm – will be used if provided; default is False.
  - border_mode – refer to the border_mode variable in yann.core.conv, module conv.
  - pool_size – subsample size; default is (1,1).
  - pool_type – refer to pool for details. Options: {'max', 'sum', 'mean', 'max_same_size'}.
  - learnable – default is True. If True, we backprop on that layer; if False, the layer is obstinate (its parameters are frozen).
  - shape – tuple of the shape to unflatten to (height, width, channels), in case the layer is an unflatten layer.
  - input_params – supply parameters or initializations from a pre-trained system.
  - dropout_rate – if you want to drop out this layer's output, provide the dropout rate.
  - regularize – True if you want to apply regularization, False if not.
  - num_classes – <int> number of classes to classify.
  - objective – objective provided by the classifier: 'nll' (negative log likelihood), 'cce' (categorical cross entropy), 'bce' (binary cross entropy) or 'hinge' (hinge loss). For the classifier layer.
  - dataset_init_args – same as for the dataset module. In fact, this argument is needed only when the dataset module is not set up.
  - datastream_id – when using an input layer or an objective layer, use this to identify which datastream to take data from.
  - regularizer – default is (0.001, 0.001), the coefficients for the L1 and L2 regularizers.
  - error – merge layers take an option called 'error', which can be None or one of the methods in yann.core.errors.
  - angle – takes a value in [0,1] to capture an angle in [0,180] degrees. Default is None; if None, a random number is drawn from a uniform distribution between 0 and 1.
  - layer_type – supply a value if needed; the default is 'discriminator'. For other layers, if the layer class takes an argument type, supply that argument here as layer_type. The merge layer, for instance, will use this argument as its type argument.
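  As a usage sketch, a small multi-layer perceptron can be assembled with add_layer. The ids ('input', 'fc1', 'fc2', 'softmax', 'nll') and the datastream id 'data' are illustrative assumptions; the datastream is assumed to have been added beforehand with add_module:

      # The input layer draws from a previously added datastream with id 'data'.
      net.add_layer(type='input', id='input', origin='data', verbose=2)

      # Two fully connected layers with ReLU activations.
      net.add_layer(type='dot_product', id='fc1', origin='input',
                    num_neurons=800, activation='relu', verbose=2)
      net.add_layer(type='dot_product', id='fc2', origin='fc1',
                    num_neurons=800, activation='relu', verbose=2)

      # A softmax classifier over 10 classes, followed by an objective layer.
      net.add_layer(type='classifier', id='softmax', origin='fc2',
                    num_classes=10, activation='softmax', verbose=2)
      net.add_layer(type='objective', id='nll', origin='softmax', verbose=2)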
- add_module(type, params=None, verbose=2)

  Use this function to add a module to the net.
  Parameters:
  - type – which module to add. Options are 'resultor', 'visualizer', 'optimizer' and 'datastream'.
  - params – a dictionary whose form depends on type.

    If the type is 'resultor', params is a dictionary of the form:

        params = {
            "root"      : "<root directory to save stuff inside>",
            "results"   : "<results_file_name>.txt",
            "errors"    : "<error_file_name>.txt",
            "costs"     : "<cost_file_name>.txt",
            "confusion" : "<confusion_file_name>.txt",
            "network"   : "<network_save_file_name>.pkl",
            "id"        : "<id of the resultor>"
        }

    While the filenames are optional, root must be provided. If a particular file is not provided, that value will not be saved. This value is supplied to set up the resultor module of the network.

    If the type is 'visualizer', params is a dictionary of the form:

        params = {
            "root"        : "<location to save the visualizations>",
            "frequency"   : "<integer> after how many epochs to visualize; default is 1",
            "sample_size" : "<integer, prefer squares> random images to save down from the datasets (activations for the same images are also saved); default is 16",
            "rgb_filters" : "<bool> if True, 3D RGB CNN filters are rendered; default is False",
            "id"          : "<id of the visualizer>"
        }

    If the type is 'optimizer', params is a dictionary of the form:

        params = {
            "momentum_type"      : "<option> 'false' (no momentum), 'polyak' or 'nesterov'; default is 'polyak'",
            "momentum_params"    : "(<value in [0,1]>, <value in [0,1]>, <int>) - momentum coefficient at start, at end, and the epoch at which to end the momentum increase; default is (0.5, 0.95, 50)",
            "learning_rate"      : "(initial_learning_rate, fine_tuning_learning_rate, annealing_decay_rate); default is (0.1, 0.001, 0.005)",
            "regularization"     : "(l1_coeff, l2_coeff); default is (0.001, 0.001)",
            "optimizer_type"     : "<option> 'sgd', 'adagrad', 'rmsprop' or 'adam'; default is 'rmsprop'",
            "objective_function" : "<option> 'nll' (negative log likelihood), 'cce' (categorical cross entropy) or 'bce' (binary cross entropy); default is 'nll'",
            "id"                 : "<id of the optimizer>"
        }

    If the type is 'datastream', params is a dictionary of the form:

        params = {
            "dataset"   : "<location>",
            "svm"       : "False or True; if True, a one-hot label set will also be set up",
            "n_classes" : "<int> needed when svm is True, to know how many classes are present",
            "id"        : "<id of the datastream>"
        }
  - verbose – similar to the rest of the toolbox.
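  As a hedged sketch, adding an optimizer and a datastream with the dictionaries described above (the ids 'main' and 'data' and the dataset location are illustrative assumptions):

      optimizer_params = {
          "momentum_type"      : 'polyak',
          "momentum_params"    : (0.5, 0.95, 50),
          "learning_rate"      : (0.1, 0.001, 0.005),
          "regularization"     : (0.001, 0.001),
          "optimizer_type"     : 'rmsprop',
          "objective_function" : 'nll',
          "id"                 : 'main'
      }
      net.add_module(type='optimizer', params=optimizer_params, verbose=2)

      dataset_params = {
          "dataset"   : "_datasets/_dataset_12345",  # placeholder location
          "svm"       : False,
          "n_classes" : 10,
          "id"        : 'data'
      }
      net.add_module(type='datastream', params=dataset_params, verbose=2)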
- cook(verbose=2, **kwargs)

  This function builds the backprop network and makes the trainer, tester and validator theano functions. The trainer builds the trainers for a particular objective layer and optimizer.
  Parameters:
  - optimizer – which optimizer to use. Default is the last optimizer created.
  - datastream – which datastream to use. Default is the last datastream created.
  - visualizer – a visualizer to cook with. Default is the last visualizer created.
  - classifier_layer – the classifier layer. Default is the last classifier layer created.
  - objective_layers – a list of ids of the layers that carry the objective function. Default is the last objective layer created, if no classifier is provided.
  - objective_weights – a list of weights by which to multiply the value of each objective layer. Default is 1.
  - active_layers – a list of active layers. If this parameter is supplied, the 'learnable' flag of all layers is ignored and only these layers are trained. By default, all learnable layers are used.
  - verbose – similar to the rest of the toolbox.
- deactivate_layer(id, verbose=2)

  This method removes a layer's parameters from the active_layer dictionary.

  Parameters:
  - id – the layer you want to deactivate.
  - verbose – as usual.

  Notes
  If the network was cooked, it will have to be re-cooked after deactivation.
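  For instance, freezing one layer and re-cooking (the layer id 'fc1' is again an assumption from the earlier sketch):

      net.deactivate_layer(id='fc1', verbose=2)
      net.cook(verbose=2)  # re-cook after deactivation, as noted above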
- get_params(verbose=2)

  This method returns a dictionary of layer weights and biases in numpy format.

  Parameters: verbose – as usual.
  Returns: A dictionary of parameters.
  Return type: OrderedDict
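  A short usage sketch; that each entry holds the layer's weight and bias arrays is an assumption following from the description above:

      params = net.get_params(verbose=2)
      for layer_id, layer_params in params.items():
          # print the shape of each weight/bias array in this layer
          print(layer_id, [p.shape for p in layer_params])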
- layer_activity(id, index=0, verbose=2)

  Use this function to visualize or print out the outputs of each layer. I don't know why this might be useful, but it's fun to check this out, I guess. This will only work after the dataset is initialized.

  Parameters:
  - id – id of the layer whose output you want to visualize.
  - index – which batch of data to use for producing the outputs. Default is 0.
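  For example (layer id 'fc1' is an assumption from the earlier sketch):

      # Print/visualize the activations of layer 'fc1' for the first batch.
      net.layer_activity(id='fc1', index=0, verbose=2)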
- pretty_print(verbose=2)

  This method is used to pretty print the network's connections. It is going to be deprecated in favor of the visualizer module.
- print_status(epoch, print_lr=False, verbose=2)

  This function prints the cost of the current epoch and the learning rate and momentum of the network at the moment. It also calls the resultor to process results.

  Todo
  This needs to go to the visualizer.

  Parameters:
  - verbose – just as always.
  - epoch – which epoch are we at?
- save_params(epoch=0, verbose=2)

  This method saves down a list of network parameters.

  Parameters:
  - verbose – as usual.
  - epoch – the epoch at which the parameters are being saved.
- test(show_progress=True, verbose=2)

  This function is used for producing the testing accuracy.

  Parameters:
  - show_progress – display a progressbar?
  - verbose – as usual.
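  A one-line usage sketch, assuming the network has been cooked and trained:

      net.test(show_progress=True, verbose=2)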
- train(verbose=2, **kwargs)

  Training function of the network. Calling this will begin training.

  Parameters:
  - epochs – (num_epochs for each learning rate ...) to train for. Default is (20, 20).
  - validate_after_epochs – default is 1; after how many epochs do you want to validate?
  - save_after_epochs – default is 1; save the network after this many epochs of training.
  - show_progress – default is True; will display a clean progressbar. Forced to False if verbose is 3 or more.
  - early_terminate – True will allow early termination.
  - learning_rates – (annealing_rate, learning_rates ...); length must be one more than the length of epochs. Default is (0.05, 0.01, 0.001).
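  A usage sketch following the defaults documented above:

      # Train with two learning rates for 20 epochs each, annealed at 0.05.
      net.train(epochs=(20, 20),
                validate_after_epochs=1,
                learning_rates=(0.05, 0.01, 0.001),
                show_progress=True,
                early_terminate=True,
                verbose=2)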
- validate(epoch=0, training_accuracy=False, show_progress=False, verbose=2)

  Method used to run validation. It will also load the validation dataset.

  Parameters:
  - verbose – just as always.
  - show_progress – display a progressbar?
  - training_accuracy – do you want to print the accuracy on the training set as well?
- visualize(epoch=0, verbose=2)

  This method will use the cooked visualizer to save down the visualizations.

  Parameters: epoch – supply the epoch number (used to create the directories in which the visualizations are saved).