flint.nn.init¶
Some of the code is borrowed from: https://github.com/pytorch/pytorch/blob/master/torch/nn/init.py
- flint.nn.init.calculate_gain(nonlinearity: str, param: Optional[Union[int, float]] = None)[source]¶
Return the recommended gain value for the given nonlinearity function.
The values are as follows:
nonlinearity        gain
==================  ================================================
Linear / Identity   \(1\)
Conv{1,2,3}D        \(1\)
Sigmoid             \(1\)
Tanh                \(\frac{5}{3}\)
ReLU                \(\sqrt{2}\)
Leaky ReLU          \(\sqrt{\frac{2}{1 + \text{negative\_slope}^2}}\)
SELU                \(\frac{3}{4}\)
==================  ================================================
- Parameters
nonlinearity (str) – Name of the non-linear function
param (Union[int, float], optional) – Optional parameter for the non-linear function
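A minimal usage sketch of the documented signature (the import path is assumed from the module name above)::

    from flint.nn import init

    init.calculate_gain('tanh')             # 5/3 ~ 1.6667
    init.calculate_gain('relu')             # sqrt(2) ~ 1.4142
    init.calculate_gain('leaky_relu', 0.2)  # sqrt(2 / (1 + 0.2**2)) ~ 1.3868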
- flint.nn.init.constant_(tensor: flint.tensor.Tensor, val: float) None [source]¶
Fill the tensor with the given scalar value val.
- Parameters
tensor (Tensor) – A Tensor
val (float) – The value to fill the tensor with
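For illustration, a NumPy sketch of the equivalent fill (not flint's internal implementation)::

    import numpy as np

    def constant_fill(shape, val):
        # Same effect as constant_: every element becomes val
        return np.full(shape, val, dtype=np.float32)

    w = constant_fill((3, 4), 0.5)

The ones_ and zeros_ initializers below are the special cases val=1 and val=0 of this fill.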
- flint.nn.init.kaiming_normal_(tensor: flint.tensor.Tensor, a: float = 0.0, mode: str = 'fan_in', nonlinearity: str = 'leaky_relu') None [source]¶
Implementation of Kaiming initialization (also known as He initialization) proposed in [1], using a normal distribution.
The resulting tensor will have values sampled from \(N(0, \text{std}^2)\), where
\(\text{std} = \frac{\text{gain}}{\sqrt{\text{fan\_mode}}}\)
- Parameters
tensor (Tensor) – A Tensor
a (float, optional, default=0.) – The negative slope of the rectifier used after this layer (only used with ‘leaky_relu’)
mode (str, optional, default='fan_in') – Either 'fan_in' or 'fan_out'. Choosing 'fan_in' preserves the magnitude of the variance of the weights in the forward pass; 'fan_out' preserves the magnitudes in the backward pass.
nonlinearity (str, optional, default='leaky_relu') – Name of the non-linear function, recommended to use only with 'relu' or 'leaky_relu'
References
“Delving Deep into Rectifiers: Surpassing Human-level Performance on ImageNet Classification.” Kaiming He, et al. ICCV 2015.
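A minimal NumPy sketch of the sampling rule (illustrative only, not flint's implementation; assumes a 2-D weight of shape (fan_out, fan_in), the usual convention for linear layers)::

    import numpy as np

    def kaiming_normal(fan_in, fan_out, a=0.0, mode='fan_in'):
        # Gain for leaky_relu from the table above: sqrt(2 / (1 + a^2))
        gain = np.sqrt(2.0 / (1.0 + a ** 2))
        fan = fan_in if mode == 'fan_in' else fan_out
        std = gain / np.sqrt(fan)  # std = gain / sqrt(fan_mode)
        return np.random.normal(0.0, std, size=(fan_out, fan_in)).astype(np.float32)

    w = kaiming_normal(256, 128)   # weight for a 256 -> 128 linear layer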
- flint.nn.init.kaiming_uniform_(tensor: flint.tensor.Tensor, a: float = 0.0, mode: str = 'fan_in', nonlinearity: str = 'leaky_relu') None [source]¶
Implementation of Kaiming initialization (also known as He initialization) proposed in [1], using a uniform distribution.
The resulting tensor will have values sampled from \(U(-\text{bound}, \text{bound})\), where
\(\text{bound} = \text{gain} \times \sqrt{\frac{3}{\text{fan\_mode}}}\)
- Parameters
tensor (Tensor) – A Tensor
a (float, optional, default=0.) – The negative slope of the rectifier used after this layer (only used with ‘leaky_relu’)
mode (str, optional, default='fan_in') – Either 'fan_in' or 'fan_out'. Choosing 'fan_in' preserves the magnitude of the variance of the weights in the forward pass; 'fan_out' preserves the magnitudes in the backward pass.
nonlinearity (str, optional, default='leaky_relu') – Name of the non-linear function, recommended to use only with 'relu' or 'leaky_relu'
References
“Delving Deep into Rectifiers: Surpassing Human-level Performance on ImageNet Classification.” Kaiming He, et al. ICCV 2015.
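A NumPy sketch of the sampling rule (illustrative only; same (fan_out, fan_in) weight assumption as above)::

    import numpy as np

    def kaiming_uniform(fan_in, fan_out, a=0.0, mode='fan_in'):
        gain = np.sqrt(2.0 / (1.0 + a ** 2))  # leaky_relu gain
        fan = fan_in if mode == 'fan_in' else fan_out
        bound = gain * np.sqrt(3.0 / fan)     # bound = gain * sqrt(3 / fan_mode)
        return np.random.uniform(-bound, bound, size=(fan_out, fan_in)).astype(np.float32)

The \(\sqrt{3}\) factor appears because \(U(-b, b)\) has variance \(b^2 / 3\), so matching the target standard deviation requires \(b = \text{std} \times \sqrt{3}\).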
- flint.nn.init.lecun_normal_(tensor: flint.tensor.Tensor) None [source]¶
Implementation of LeCun initialization, using a normal distribution.
The resulting tensor will have values sampled from \(N(0, \text{std}^2)\), where
\(\text{std} = \sqrt{\frac{1}{\text{fan\_in}}}\)
- Parameters
tensor (Tensor) – A Tensor
References
“Efficient Backprop.” Yann LeCun, et al. 1998.
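A NumPy sketch of the sampling rule (illustrative only; (fan_out, fan_in) weight assumed)::

    import numpy as np

    def lecun_normal(fan_in, fan_out):
        std = np.sqrt(1.0 / fan_in)  # std = sqrt(1 / fan_in)
        return np.random.normal(0.0, std, size=(fan_out, fan_in)).astype(np.float32)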
- flint.nn.init.lecun_uniform_(tensor: flint.tensor.Tensor) None [source]¶
Implementation of LeCun initialization, using a uniform distribution.
The resulting tensor will have values sampled from \(U(-\text{bound}, \text{bound})\), where
\(\text{bound} = \sqrt{\frac{3}{\text{fan\_in}}}\)
- Parameters
tensor (Tensor) – A Tensor
References
“Efficient Backprop.” Yann LeCun, et al. 1998.
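A NumPy sketch of the sampling rule (illustrative only; (fan_out, fan_in) weight assumed)::

    import numpy as np

    def lecun_uniform(fan_in, fan_out):
        bound = np.sqrt(3.0 / fan_in)  # bound = sqrt(3 / fan_in)
        return np.random.uniform(-bound, bound, size=(fan_out, fan_in)).astype(np.float32)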
- flint.nn.init.normal_(tensor: flint.tensor.Tensor, mean: float = 0.0, std: float = 1.0) None [source]¶
Fills the tensor with values drawn from the normal distribution.
- Parameters
tensor (Tensor) – A Tensor
mean (float) – The mean of the normal distribution
std (float) – The standard deviation of the normal distribution
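For illustration, the NumPy equivalent of this fill (not flint's internal implementation; the shape is an arbitrary example)::

    import numpy as np

    # Same effect as normal_(w, mean=0.0, std=0.02) on a (128, 64) weight
    w = np.random.normal(0.0, 0.02, size=(128, 64)).astype(np.float32)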
- flint.nn.init.ones_(tensor: flint.tensor.Tensor) None [source]¶
Fill the tensor with the scalar value 1.
- Parameters
tensor (Tensor) – A Tensor
- flint.nn.init.uniform_(tensor: flint.tensor.Tensor, a: float = 0.0, b: float = 1.0) None [source]¶
Fills the tensor with values drawn from the uniform distribution.
- Parameters
tensor (Tensor) – A Tensor
a (float) – The lower bound of the uniform distribution
b (float) – The upper bound of the uniform distribution
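For illustration, the NumPy equivalent of this fill (not flint's internal implementation; the shape is an arbitrary example)::

    import numpy as np

    # Same effect as uniform_(w, a=-0.1, b=0.1) on a (128, 64) weight
    w = np.random.uniform(-0.1, 0.1, size=(128, 64)).astype(np.float32)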
- flint.nn.init.xavier_normal_(tensor: flint.tensor.Tensor, gain: float = 1.0) None [source]¶
Implementation of Xavier initialization (also known as Glorot initialization) proposed in [1], using a normal distribution.
The resulting tensor will have values sampled from \(N(0, \text{std}^2)\), where
\(\text{std} = \text{gain} \times \sqrt{\frac{2}{\text{fan\_in} + \text{fan\_out}}}\)
- Parameters
tensor (Tensor) – A Tensor
gain (float, optional, default=1.) – An optional scaling factor
References
“Understanding the Difficulty of Training Deep Feedforward Neural Networks.” Xavier Glorot and Yoshua Bengio. AISTATS 2010.
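A NumPy sketch of the sampling rule (illustrative only; (fan_out, fan_in) weight assumed)::

    import numpy as np

    def xavier_normal(fan_in, fan_out, gain=1.0):
        # Averages the forward (fan_in) and backward (fan_out) variance constraints
        std = gain * np.sqrt(2.0 / (fan_in + fan_out))
        return np.random.normal(0.0, std, size=(fan_out, fan_in)).astype(np.float32)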
- flint.nn.init.xavier_uniform_(tensor: flint.tensor.Tensor, gain: float = 1.0) None [source]¶
Implementation of Xavier initialization (also known as Glorot initialization) proposed in [1], using a uniform distribution.
The resulting tensor will have values sampled from \(U(-a, a)\), where
\(a = \text{gain} \times \sqrt{\frac{6}{\text{fan\_in} + \text{fan\_out}}}\)
- Parameters
tensor (Tensor) – A Tensor
gain (float, optional, default=1.) – An optional scaling factor
References
“Understanding the Difficulty of Training Deep Feedforward Neural Networks.” Xavier Glorot and Yoshua Bengio. AISTATS 2010.
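A NumPy sketch of the sampling rule (illustrative only; (fan_out, fan_in) weight assumed). The 6 under the root is the uniform-distribution factor 3 times the fan-averaging factor 2::

    import numpy as np

    def xavier_uniform(fan_in, fan_out, gain=1.0):
        a = gain * np.sqrt(6.0 / (fan_in + fan_out))  # a = gain * sqrt(6 / (fan_in + fan_out))
        return np.random.uniform(-a, a, size=(fan_out, fan_in)).astype(np.float32)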
- flint.nn.init.zeros_(tensor: flint.tensor.Tensor) None [source]¶
Fill the tensor with the scalar value 0.
- Parameters
tensor (Tensor) – A Tensor