Policies
Head modules for Gaussian policies
class pfrl.policies.GaussianHeadWithFixedCovariance(scale=1)
Gaussian head with fixed covariance.
This module is intended to be attached to a neural network that outputs the mean of a Gaussian policy. Its covariance is fixed to a diagonal matrix with a given scale.
Parameters: scale (float) – Scale parameter.
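As an illustration of what a fixed covariance means for the resulting policy, the following plain-Python sketch evaluates the log-density of an action under a Gaussian whose mean comes from the network and whose spread is a constant. It is not PFRL's implementation: the helper name fixed_scale_log_prob is hypothetical, and treating scale as a per-dimension standard deviation is an assumption.

```python
import math


def fixed_scale_log_prob(mean, action, scale=1.0):
    """Log-density of `action` under a Gaussian with mean `mean`.

    Assumption: `scale` is the standard deviation shared by every
    action dimension, so the covariance is scale**2 times the identity.
    """
    log_prob = 0.0
    for m, a in zip(mean, action):
        z = (a - m) / scale
        # Per-dimension log-density of a univariate normal distribution.
        log_prob += -0.5 * z * z - math.log(scale) - 0.5 * math.log(2 * math.pi)
    return log_prob


mean = [0.5, -1.0]  # stands in for the output of the policy network
lp_narrow = fixed_scale_log_prob(mean, mean, scale=1.0)
lp_wide = fixed_scale_log_prob(mean, mean, scale=2.0)
```

Because the scale never changes, only the mean depends on the state. A larger scale lowers the density at the mean and spreads probability over a wider range of actions.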
class pfrl.policies.GaussianHeadWithDiagonalCovariance(var_func=<built-in function softplus>)
Gaussian head with diagonal covariance.
This module is intended to be attached to a neural network that outputs a vector that is twice the size of an action vector. The vector is split and interpreted as the mean and diagonal covariance of a Gaussian policy.
Parameters: var_func (callable) – Callable that computes the variance from the second half of the input vector. It should always return positive values.
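The split described above can be sketched in plain Python as follows. This is a minimal illustration rather than the library's code: the helper names softplus and split_mean_and_variance are hypothetical, and the sketch assumes the first half of the vector is the mean and the second half is passed through var_func.

```python
import math


def softplus(x):
    # log(1 + exp(x)), written in a numerically stable form.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def split_mean_and_variance(output, var_func=softplus):
    """Split a network output of length 2 * action_size into two halves.

    Assumption: the first half is the mean and the second half is
    mapped through `var_func` to obtain the diagonal variances.
    """
    half = len(output) // 2
    if 2 * half != len(output):
        raise ValueError("output length must be twice the action size")
    mean = list(output[:half])
    variance = [var_func(v) for v in output[half:]]
    return mean, variance


# A two-dimensional action needs a four-dimensional network output.
mean, variance = split_mean_and_variance([0.1, -0.3, 0.0, 2.0])
```

Passing the raw values through a function such as softplus lets the network emit any real number while the resulting variance stays positive for ordinary inputs.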
class pfrl.policies.GaussianHeadWithStateIndependentCovariance(action_size, var_type='spherical', var_func=<built-in function softplus>, var_param_init=0)
Gaussian head with state-independent learned covariance.
This module is intended to be attached to a neural network that outputs the mean of a Gaussian policy. Its only learnable parameter determines the variance in a state-independent way.
State-independent parameterization of the variance of a Gaussian policy is often used with PPO and TRPO, e.g., in https://arxiv.org/abs/1709.06560.
Parameters: - action_size (int) – Number of dimensions of the action space.
- var_type (str) – Type of parameterization of variance. It must be ‘spherical’ or ‘diagonal’.
- var_func (callable) – Callable that computes the variance from the var parameter. It should always return positive values.
- var_param_init (float) – Initial value of the var parameter.
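The difference between the two var_type settings can be sketched in plain Python. This is an illustration under stated assumptions, not PFRL's implementation: the helper names are hypothetical, and the sketch assumes that 'spherical' means a single var parameter shared by all action dimensions while 'diagonal' means one var parameter per dimension.

```python
import math


def softplus(x):
    # log(1 + exp(x)), written in a numerically stable form.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def state_independent_variance(action_size, var_type="spherical",
                               var_func=softplus, var_param_init=0.0):
    """Return the per-dimension variance implied by the var parameter.

    Assumption: 'spherical' uses one shared parameter and 'diagonal'
    uses one parameter per action dimension.
    """
    if var_type == "spherical":
        n_params = 1
    elif var_type == "diagonal":
        n_params = action_size
    else:
        raise ValueError("var_type must be 'spherical' or 'diagonal'")
    # In a real module these values would be learnable parameters.
    var_param = [float(var_param_init)] * n_params
    variance = [var_func(p) for p in var_param]
    if var_type == "spherical":
        # Broadcast the single shared variance to every dimension.
        variance = variance * action_size
    return variance


spherical = state_independent_variance(3, var_type="spherical")
diagonal = state_independent_variance(3, var_type="diagonal")
```

With the default var_param_init of 0 and softplus as var_func, both settings start from the same variance in every dimension. They differ only in how many values can change during training.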