flint.nn.init¶
Some of the code is borrowed from: https://github.com/pytorch/pytorch/blob/master/torch/nn/init.py
- flint.nn.init.calculate_gain(nonlinearity: str, param: Optional[Union[int, float]] = None)[source]¶
Return the recommended gain value for the given nonlinearity function.
The values are as follows:
nonlinearity        gain
==================  ================================================
Linear / Identity   \(1\)
Conv{1,2,3}D        \(1\)
Sigmoid             \(1\)
Tanh                \(\frac{5}{3}\)
ReLU                \(\sqrt{2}\)
Leaky ReLU          \(\sqrt{\frac{2}{1 + \text{negative\_slope}^2}}\)
SELU                \(\frac{3}{4}\)
==================  ================================================
- Parameters
nonlinearity (str) – Name of the non-linear function
param (Union[int, float], optional) – Optional parameter for the non-linear function
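A minimal usage sketch of the documented signature (the import path is assumed from the module name above)::

    from flint.nn import init

    init.calculate_gain('tanh')             # 5/3 ~ 1.6667
    init.calculate_gain('relu')             # sqrt(2) ~ 1.4142
    init.calculate_gain('leaky_relu', 0.2)  # sqrt(2 / (1 + 0.2**2)) ~ 1.3868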
- flint.nn.init.constant_(tensor: flint.tensor.Tensor, val: float) None [source]¶
Fill the tensor with the given scalar value val.
- Parameters
tensor (Tensor) – A Tensor
val (float) – The value to fill the tensor with
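For illustration, a NumPy sketch of the equivalent fill (not flint's internal implementation)::

    import numpy as np

    def constant_fill(shape, val):
        # Same effect as constant_: every element becomes val
        return np.full(shape, val, dtype=np.float32)

    w = constant_fill((3, 4), 0.5)

The ones_ and zeros_ initializers below are the special cases val=1 and val=0 of this fill.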
- flint.nn.init.kaiming_normal_(tensor: flint.tensor.Tensor, a: float = 0.0, mode: str = 'fan_in', nonlinearity: str = 'leaky_relu') None [source]¶
Implementation of Kaiming initialization (also known as He initialization) proposed in [1], using a normal distribution.
The resulting tensor will have values sampled from \(N(0, \text{std}^2)\), where
\(\text{std} = \frac{\text{gain}}{\sqrt{\text{fan\_mode}}}\)
- Parameters
tensor (Tensor) – A Tensor
a (float, optional, default=0.) – The negative slope of the rectifier used after this layer (only used with ‘leaky_relu’)
mode (str, optional, default='fan_in') – Either 'fan_in' or 'fan_out'. Choosing 'fan_in' preserves the magnitude of the variance of the weights in the forward pass; 'fan_out' preserves the magnitudes in the backward pass.
nonlinearity (str, optional, default='leaky_relu') – Name of the non-linear function, recommended to use only with 'relu' or 'leaky_relu'
References
“Delving Deep into Rectifiers: Surpassing Human-level Performance on ImageNet Classification.” Kaiming He, et al. ICCV 2015.
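A minimal NumPy sketch of the sampling rule (illustrative only, not flint's implementation; assumes a 2-D weight of shape (fan_out, fan_in), the usual convention for linear layers)::

    import numpy as np

    def kaiming_normal(fan_in, fan_out, a=0.0, mode='fan_in'):
        # Gain for leaky_relu from the table above: sqrt(2 / (1 + a^2))
        gain = np.sqrt(2.0 / (1.0 + a ** 2))
        fan = fan_in if mode == 'fan_in' else fan_out
        std = gain / np.sqrt(fan)  # std = gain / sqrt(fan_mode)
        return np.random.normal(0.0, std, size=(fan_out, fan_in)).astype(np.float32)

    w = kaiming_normal(256, 128)   # weight for a 256 -> 128 linear layer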
- flint.nn.init.kaiming_uniform_(tensor: flint.tensor.Tensor, a: float = 0.0, mode: str = 'fan_in', nonlinearity: str = 'leaky_relu') None [source]¶
Implementation of Kaiming initialization (also known as He initialization) proposed in [1], using a uniform distribution.
The resulting tensor will have values sampled from \(U(-\text{bound}, \text{bound})\), where
\(\text{bound} = \text{gain} \times \sqrt{\frac{3}{\text{fan\_mode}}}\)
- Parameters
tensor (Tensor) – A Tensor
a (float, optional, default=0.) – The negative slope of the rectifier used after this layer (only used with ‘leaky_relu’)
mode (str, optional, default='fan_in') – Either 'fan_in' or 'fan_out'. Choosing 'fan_in' preserves the magnitude of the variance of the weights in the forward pass; 'fan_out' preserves the magnitudes in the backward pass.
nonlinearity (str, optional, default='leaky_relu') – Name of the non-linear function, recommended to use only with 'relu' or 'leaky_relu'
References
“Delving Deep into Rectifiers: Surpassing Human-level Performance on ImageNet Classification.” Kaiming He, et al. ICCV 2015.
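A NumPy sketch of the sampling rule (illustrative only; same (fan_out, fan_in) weight assumption as above)::

    import numpy as np

    def kaiming_uniform(fan_in, fan_out, a=0.0, mode='fan_in'):
        gain = np.sqrt(2.0 / (1.0 + a ** 2))  # leaky_relu gain
        fan = fan_in if mode == 'fan_in' else fan_out
        bound = gain * np.sqrt(3.0 / fan)     # bound = gain * sqrt(3 / fan_mode)
        return np.random.uniform(-bound, bound, size=(fan_out, fan_in)).astype(np.float32)

The \(\sqrt{3}\) factor appears because \(U(-b, b)\) has variance \(b^2 / 3\), so matching the target standard deviation requires \(b = \text{std} \times \sqrt{3}\).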
- flint.nn.init.lecun_normal_(tensor: flint.tensor.Tensor) None [source]¶
Implementation of LeCun initialization, using a normal distribution.
The resulting tensor will have values sampled from \(N(0, \text{std}^2)\), where
\(\text{std} = \sqrt{\frac{1}{\text{fan\_in}}}\)
- Parameters
tensor (Tensor) – A Tensor
References
“Efficient Backprop.” Yann LeCun, et al. 1998.
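A NumPy sketch of the sampling rule (illustrative only; (fan_out, fan_in) weight assumed)::

    import numpy as np

    def lecun_normal(fan_in, fan_out):
        std = np.sqrt(1.0 / fan_in)  # std = sqrt(1 / fan_in)
        return np.random.normal(0.0, std, size=(fan_out, fan_in)).astype(np.float32)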
- flint.nn.init.lecun_uniform_(tensor: flint.tensor.Tensor) None [source]¶
Implementation of LeCun initialization, using a uniform distribution.
The resulting tensor will have values sampled from \(U(-\text{bound}, \text{bound})\), where
\(\text{bound} = \sqrt{\frac{3}{\text{fan\_in}}}\)
- Parameters
tensor (Tensor) – A Tensor
References
“Efficient Backprop.” Yann LeCun, et al. 1998.
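A NumPy sketch of the sampling rule (illustrative only; (fan_out, fan_in) weight assumed)::

    import numpy as np

    def lecun_uniform(fan_in, fan_out):
        bound = np.sqrt(3.0 / fan_in)  # bound = sqrt(3 / fan_in)
        return np.random.uniform(-bound, bound, size=(fan_out, fan_in)).astype(np.float32)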
- flint.nn.init.normal_(tensor: flint.tensor.Tensor, mean: float = 0.0, std: float = 1.0) None [source]¶
Fills the tensor with values drawn from the normal distribution.
- Parameters
tensor (Tensor) – A Tensor
mean (float) – The mean of the normal distribution
std (float) – The standard deviation of the normal distribution
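For illustration, the NumPy equivalent of this fill (not flint's internal implementation; the shape is an arbitrary example)::

    import numpy as np

    # Same effect as normal_(w, mean=0.0, std=0.02) on a (128, 64) weight
    w = np.random.normal(0.0, 0.02, size=(128, 64)).astype(np.float32)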
- flint.nn.init.ones_(tensor: flint.tensor.Tensor) None [source]¶
Fill the tensor with the scalar value 1.
- Parameters
tensor (Tensor) – A Tensor
- flint.nn.init.uniform_(tensor: flint.tensor.Tensor, a: float = 0.0, b: float = 1.0) None [source]¶
Fills the tensor with values drawn from the uniform distribution.
- Parameters
tensor (Tensor) – A Tensor
a (float) – The lower bound of the uniform distribution
b (float) – The upper bound of the uniform distribution
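For illustration, the NumPy equivalent of this fill (not flint's internal implementation; the shape is an arbitrary example)::

    import numpy as np

    # Same effect as uniform_(w, a=-0.1, b=0.1) on a (128, 64) weight
    w = np.random.uniform(-0.1, 0.1, size=(128, 64)).astype(np.float32)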
- flint.nn.init.xavier_normal_(tensor: flint.tensor.Tensor, gain: float = 1.0) None [source]¶
Implementation of Xavier initialization (also known as Glorot initialization) proposed in [1], using a normal distribution.
The resulting tensor will have values sampled from \(N(0, \text{std}^2)\), where
\(\text{std} = \text{gain} \times \sqrt{\frac{2}{\text{fan\_in} + \text{fan\_out}}}\)
- Parameters
tensor (Tensor) – A Tensor
gain (float, optional, default=1.) – An optional scaling factor
References
“Understanding the Difficulty of Training Deep Feedforward Neural Networks.” Xavier Glorot and Yoshua Bengio. AISTATS 2010.
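A NumPy sketch of the sampling rule (illustrative only; (fan_out, fan_in) weight assumed)::

    import numpy as np

    def xavier_normal(fan_in, fan_out, gain=1.0):
        # Averages the forward (fan_in) and backward (fan_out) variance constraints
        std = gain * np.sqrt(2.0 / (fan_in + fan_out))
        return np.random.normal(0.0, std, size=(fan_out, fan_in)).astype(np.float32)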
- flint.nn.init.xavier_uniform_(tensor: flint.tensor.Tensor, gain: float = 1.0) None [source]¶
Implementation of Xavier initialization (also known as Glorot initialization) proposed in [1], using a uniform distribution.
The resulting tensor will have values sampled from \(U(-a, a)\), where
\(a = \text{gain} \times \sqrt{\frac{6}{\text{fan\_in} + \text{fan\_out}}}\)
- Parameters
tensor (Tensor) – A Tensor
gain (float, optional, default=1.) – An optional scaling factor
References
“Understanding the Difficulty of Training Deep Feedforward Neural Networks.” Xavier Glorot and Yoshua Bengio. AISTATS 2010.
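A NumPy sketch of the sampling rule (illustrative only; (fan_out, fan_in) weight assumed). The 6 under the root is the uniform-distribution factor 3 times the fan-averaging factor 2::

    import numpy as np

    def xavier_uniform(fan_in, fan_out, gain=1.0):
        a = gain * np.sqrt(6.0 / (fan_in + fan_out))  # a = gain * sqrt(6 / (fan_in + fan_out))
        return np.random.uniform(-a, a, size=(fan_out, fan_in)).astype(np.float32)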
- flint.nn.init.zeros_(tensor: flint.tensor.Tensor) None [source]¶
Fill the tensor with the scalar value 0.
- Parameters
tensor (Tensor) – A Tensor