Modules

Module interfaces

class pfrl.nn.Recurrent[source]

Recurrent module interface.

This class defines the interface of recurrent modules that PFRL supports.

The interface is similar to that of torch.nn.LSTM except that sequential data are expected to be packed in torch.nn.utils.rnn.PackedSequence.

To implement a model with recurrent layers, you can either use default container classes such as pfrl.nn.RecurrentSequential and pfrl.nn.RecurrentBranched or write your own module extending this class and torch.nn.Module (see the sketch after the forward specification below).

forward(packed_input, recurrent_state)[source]

Multi-step batch forward computation.

Parameters:
  • packed_input (object) – Input sequences. Tensors must be packed in torch.nn.utils.rnn.PackedSequence.
  • recurrent_state (object or None) – Batched recurrent state. If set to None, it is initialized.
Returns:
  • object – Output sequences. Tensors will be packed in torch.nn.utils.rnn.PackedSequence.
  • object or None – New batched recurrent state.
Return type:
  tuple
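
A minimal sketch of a custom module implementing this interface, assuming a GRU-based feature extractor; the class name, sizes, and the way the non-recurrent head is applied are illustrative, not part of PFRL:

    from torch import nn
    import pfrl

    class GRUFeatures(pfrl.nn.Recurrent, nn.Module):
        def __init__(self, obs_size=8, n_out=4, hidden_size=32):
            super().__init__()
            self.gru = nn.GRU(obs_size, hidden_size)
            self.head = nn.Linear(hidden_size, n_out)

        def forward(self, packed_input, recurrent_state):
            # nn.GRU already accepts a PackedSequence and an optional hidden
            # state, where None means a zero-initialized state.
            packed_h, new_state = self.gru(packed_input, recurrent_state)
            # Apply the non-recurrent head to the flattened packed data and
            # repack it with the same batch layout.
            packed_out = nn.utils.rnn.PackedSequence(
                self.head(packed_h.data),
                packed_h.batch_sizes,
                packed_h.sorted_indices,
                packed_h.unsorted_indices,
            )
            return packed_out, new_state

If you only need to stack layers, pfrl.nn.RecurrentSequential (below) performs this packing and state handling for you.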

Module implementations

class pfrl.nn.Branched(*modules)[source]

Module that calls forward functions of child modules in parallel.

When the forward method of this module is called, all the arguments are forwarded to each child module’s forward method.

The returned values from the child modules are returned as a tuple.

Parameters:
  • *modules – Child modules. Each module should be callable.
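
For instance, Branched can compute two heads from the same features, a common pattern for actor-critic models; the sizes and head names below are illustrative:

    import torch
    from torch import nn
    import pfrl

    two_heads = pfrl.nn.Branched(
        nn.Linear(64, 4),  # e.g. action logits
        nn.Linear(64, 1),  # e.g. state value
    )
    features = torch.randn(8, 64)
    logits, value = two_heads(features)  # child outputs come back as a tuple
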
class pfrl.nn.EmpiricalNormalization(shape, batch_axis=0, eps=0.01, dtype=<class 'numpy.float32'>, until=None, clip_threshold=None)[source]

Normalize mean and variance of values based on empirical values.

Parameters:
  • shape (int or tuple of int) – Shape of input values except batch axis.
  • batch_axis (int) – Batch axis.
  • eps (float) – Small value for stability.
  • dtype (dtype) – Dtype of input values.
  • until (int or None) – If this arg is specified, the module learns input values until the sum of batch sizes exceeds it.
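
A small usage sketch; the observation size, data, and clip_threshold value are illustrative:

    import torch
    import pfrl

    normalizer = pfrl.nn.EmpiricalNormalization(3, clip_threshold=5.0)
    batch = 2.0 + 10.0 * torch.randn(32, 3)
    # Updates the running statistics from this batch and returns the
    # normalized (and, since clip_threshold is set, clipped) values.
    normalized = normalizer(batch)
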
class pfrl.nn.FactorizedNoisyLinear(mu_link, sigma_scale=0.4)[source]

Linear layer in a Factorized Noisy Network.

Parameters:
  • mu_link (nn.Linear) – Linear link that computes mean of output.
  • sigma_scale (float) – The hyperparameter sigma_0 in the original paper. Scaling factor of the initial weights of noise-scaling parameters.
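
For example, an existing linear layer can be wrapped so that it computes the mean of the output while factorized noise is added on top; sizes are illustrative:

    import torch
    from torch import nn
    import pfrl

    noisy_fc = pfrl.nn.FactorizedNoisyLinear(nn.Linear(128, 64), sigma_scale=0.4)
    y = noisy_fc(torch.randn(8, 128))  # output with factorized parameter noise applied
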
class pfrl.nn.MLP(in_size, out_size, hidden_sizes, nonlinearity=<function relu>, last_wscale=1)[source]

Multi-Layer Perceptron
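
A minimal construction sketch; the dimensions below are illustrative:

    import torch
    import pfrl

    mlp = pfrl.nn.MLP(in_size=4, out_size=2, hidden_sizes=[64, 64])
    out = mlp(torch.randn(8, 4))  # out has shape (8, 2)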

class pfrl.nn.MLPBN(in_size, out_size, hidden_sizes, normalize_input=True, normalize_output=False, nonlinearity=<function relu>, last_wscale=1)[source]

Multi-Layer Perceptron with Batch Normalization.

Parameters:
  • in_size (int) – Input size.
  • out_size (int) – Output size.
  • hidden_sizes (list of ints) – Sizes of hidden channels.
  • normalize_input (bool) – If set to True, Batch Normalization is applied to inputs.
  • normalize_output (bool) – If set to True, Batch Normalization is applied to outputs.
  • nonlinearity (callable) – Nonlinearity between layers. It must accept a Tensor as an argument and return a Tensor of the same shape. Nonlinearities with learnable parameters such as PReLU are not supported.
  • last_wscale (float) – Scale of weight initialization of the last layer.
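
A similar sketch with input normalization enabled; the dimensions are illustrative, and note that Batch Normalization needs batches of more than one sample while training:

    import torch
    import pfrl

    mlp_bn = pfrl.nn.MLPBN(
        in_size=4, out_size=2, hidden_sizes=[64, 64], normalize_input=True
    )
    out = mlp_bn(torch.randn(8, 4))  # out has shape (8, 2)
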
class pfrl.nn.SmallAtariCNN(n_input_channels=4, n_output_channels=256, activation=<function relu>, bias=0.1)[source]

Small CNN module proposed for DQN in NeurIPS DL Workshop, 2013.

See: https://arxiv.org/abs/1312.5602

class pfrl.nn.LargeAtariCNN(n_input_channels=4, n_output_channels=512, activation=<function relu>, bias=0.1)[source]

Large CNN module proposed for DQN in Nature, 2015.

See: https://www.nature.com/articles/nature14236
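
Both modules can serve as feature extractors for stacked Atari frames; the sketch below assumes the standard DQN preprocessing of four stacked 84x84 frames:

    import torch
    import pfrl

    frames = torch.randn(8, 4, 84, 84)  # a batch of stacked 84x84 frames
    small = pfrl.nn.SmallAtariCNN()     # 256-dimensional features
    large = pfrl.nn.LargeAtariCNN()     # 512-dimensional features
    print(small(frames).shape)          # torch.Size([8, 256])
    print(large(frames).shape)          # torch.Size([8, 512])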

class pfrl.nn.RecurrentBranched(*modules)[source]

Recurrent module that bundles parallel branches.

This is a recurrent analog to pfrl.nn.Branched. It bundles multiple recurrent modules.

Parameters:
  • *modules – Child modules. Each module should be recurrent and callable.
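
A sketch of two recurrent branches consuming the same packed input; the outputs and the new recurrent states are each returned as tuples with one entry per branch (sizes are illustrative):

    import torch
    from torch import nn
    import pfrl

    branches = pfrl.nn.RecurrentBranched(
        nn.LSTM(16, 32),
        nn.GRU(16, 8),
    )
    seqs = [torch.randn(5, 16), torch.randn(3, 16)]  # two variable-length sequences
    packed = nn.utils.rnn.pack_sequence(seqs, enforce_sorted=False)
    (lstm_out, gru_out), (lstm_state, gru_state) = branches(packed, None)
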
class pfrl.nn.RecurrentSequential(*args)[source]

Sequential model that can contain stateless recurrent modules.

This is a recurrent analog to torch.nn.Sequential. It supports the recurrent interface by automatically detecting recurrent modules and handles recurrent states properly.

For non-recurrent layers, this module automatically concatenates the inputs across time steps so that each such layer is applied once to the whole batch, for efficient computation.

Parameters:
  • *layers – Callable objects.
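
A sketch mixing non-recurrent layers with an LSTM; passing None as the recurrent state lets the model initialize it (sizes are illustrative):

    import torch
    from torch import nn
    import pfrl

    model = pfrl.nn.RecurrentSequential(
        nn.Linear(8, 32),
        nn.ReLU(),
        nn.LSTM(32, 32),
        nn.Linear(32, 4),
    )
    seqs = [torch.randn(5, 8), torch.randn(3, 8)]  # two episodes of different lengths
    packed = nn.utils.rnn.pack_sequence(seqs, enforce_sorted=False)
    packed_out, recurrent_state = model(packed, None)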

Module utility functions

pfrl.nn.to_factorized_noisy(module, *args, **kwargs)[source]

Add noisiness to components of the given module.

Currently this function only supports torch.nn.Linear (with and without bias).
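
A sketch of converting an ordinary Q-network; extra arguments such as sigma_scale are forwarded to FactorizedNoisyLinear, and the conversion is assumed to replace each nn.Linear in place (sizes are illustrative):

    from torch import nn
    import pfrl

    q_func = nn.Sequential(
        nn.Linear(4, 64),
        nn.ReLU(),
        nn.Linear(64, 2),
    )
    pfrl.nn.to_factorized_noisy(q_func, sigma_scale=0.5)
    # q_func's linear layers are now FactorizedNoisyLinear modules.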