flint.nn.init

Some of the code is borrowed from: https://github.com/pytorch/pytorch/blob/master/torch/nn/init.py

flint.nn.init.calculate_gain(nonlinearity: str, param: Optional[Union[int, float]] = None)[source]

Return the recommended gain value for the given nonlinearity function.

The values are as follows:

nonlinearity         gain
------------------   ----------------------------------------------------
Linear / Identity    \(1\)
Conv{1,2,3}D         \(1\)
Sigmoid              \(1\)
Tanh                 \(\frac{5}{3}\)
ReLU                 \(\sqrt{2}\)
Leaky ReLU           \(\sqrt{\frac{2}{1 + \text{negative\_slope}^2}}\)
SELU                 \(\frac{3}{4}\)

Parameters
  • nonlinearity (str) – Name of the non-linear function

  • param (Union[int, float], optional) – Optional parameter for the non-linear function (e.g. the negative slope of leaky ReLU)
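
For example, with the gains in the table above (the import path is simply the documented module; outputs are standard double-precision values):

    >>> from flint.nn import init
    >>> init.calculate_gain('tanh')                       # 5/3
    1.6666666666666667
    >>> init.calculate_gain('relu')                       # sqrt(2)
    1.4142135623730951
    >>> round(init.calculate_gain('leaky_relu', 0.2), 5)  # sqrt(2 / (1 + 0.2**2))
    1.38675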

flint.nn.init.constant_(tensor: flint.tensor.Tensor, val: float) None[source]

Fill the tensor with the given scalar value val.

Parameters
  • tensor (Tensor) – A Tensor

  • val (float) – The value to fill the tensor with
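
For example, to fill a bias with a small constant. The Tensor constructor below is an assumption about flint's public API, not something documented on this page:

    >>> import numpy as np
    >>> from flint import Tensor        # assumed constructor
    >>> from flint.nn import init
    >>> b = Tensor(np.empty(16))
    >>> init.constant_(b, 0.1)          # every entry of b is now 0.1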

flint.nn.init.kaiming_normal_(tensor: flint.tensor.Tensor, a: float = 0.0, mode: str = 'fan_in', nonlinearity: str = 'leaky_relu') None[source]

Implementation of the Kaiming initialization proposed in [1] (also known as He initialization), using a normal distribution.

The resulting tensor will have values sampled from \(N(0, \text{std}^2)\), where std = gain / sqrt(fan_mode).

Parameters
  • tensor (Tensor) – A Tensor

  • a (float, optional, default=0.) – The negative slope of the rectifier used after this layer (only used with ‘leaky_relu’)

  • mode (str, optional, default='fan_in') – Either 'fan_in' or 'fan_out'. 'fan_in' for preserving the magnitude of the variance of the weights in the forward pass. 'fan_out' for preserving the magnitudes in the backwards pass.

  • nonlinearity (str, optional, default='leaky_relu') – Name of the non-linear function, recommended to use only with ‘relu’ or ‘leaky_relu’

References

  1. “Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification.” Kaiming He, et al. ICCV 2015.
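
As a rough illustration, here is a minimal NumPy sketch of the sampling rule, not flint's actual implementation. It assumes the Tensor stores its values in a NumPy array called data, and it handles only 2-D weights of shape (fan_out, fan_in):

    import numpy as np
    from flint.nn.init import calculate_gain

    def kaiming_normal_sketch(tensor, a=0.0, mode='fan_in',
                              nonlinearity='leaky_relu'):
        # Only the 2-D case; a real implementation would also account
        # for the receptive-field size of convolution kernels.
        fan_out, fan_in = tensor.data.shape
        fan = fan_in if mode == 'fan_in' else fan_out
        std = calculate_gain(nonlinearity, a) / np.sqrt(fan)
        tensor.data = np.random.normal(loc=0.0, scale=std,
                                       size=tensor.data.shape)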

flint.nn.init.kaiming_uniform_(tensor: flint.tensor.Tensor, a: float = 0.0, mode: str = 'fan_in', nonlinearity: str = 'leaky_relu') None[source]

Implementation of the Kaiming initialization proposed in [1] (also known as He initialization), using a uniform distribution.

The resulting tensor will have values sampled from \(U(-\text{bound}, \text{bound})\), where bound = gain * sqrt(3 / fan_mode).

Parameters
  • tensor (Tensor) – A Tensor

  • a (float, optional, default=0.) – The negative slope of the rectifier used after this layer (only used with ‘leaky_relu’)

  • mode (str, optional, default='fan_in') – Either 'fan_in' or 'fan_out'. 'fan_in' for preserving the magnitude of the variance of the weights in the forward pass. 'fan_out' for preserving the magnitudes in the backwards pass.

  • nonlinearity (str, optional, default='leaky_relu') – Name of the non-linear function, recommended to use only with ‘relu’ or ‘leaky_relu’

References

  1. “Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification.” Kaiming He, et al. ICCV 2015.
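
The bound follows from matching the variance of the uniform distribution to the target standard deviation: \(U(-b, b)\) has variance \(b^2 / 3\), so

\[
\text{bound} = \sqrt{3} \cdot \text{std} = \sqrt{3} \cdot \frac{\text{gain}}{\sqrt{\text{fan\_mode}}} = \text{gain} \cdot \sqrt{\frac{3}{\text{fan\_mode}}}.
\]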

flint.nn.init.lecun_normal_(tensor: flint.tensor.Tensor) None[source]

Implementation of LeCun initialization, using a normal distribution.

The resulting tensor will have values sampled from \(N(0, \text{std}^2)\), where std = sqrt(1 / fan_in).

Parameters

tensor (Tensor) – A Tensor

References

  1. “Efficient BackProp.” Yann LeCun, et al. 1998.
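
The \(1 / \text{fan\_in}\) variance is what preserves activation scale through a linear layer: for \(y = \sum_{i=1}^{\text{fan\_in}} w_i x_i\) with independent, zero-mean weights and inputs,

\[
\text{Var}(y) = \text{fan\_in} \cdot \text{Var}(w_i) \cdot \text{Var}(x_i),
\]

so choosing \(\text{Var}(w_i) = 1 / \text{fan\_in}\) gives \(\text{Var}(y) = \text{Var}(x_i)\).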

flint.nn.init.lecun_uniform_(tensor: flint.tensor.Tensor) None[source]

Implementation of LeCun initialization, using a uniform distribution.

The resulting tensor will have values sampled from \(U(-\text{bound}, \text{bound})\), where bound = sqrt(3 / fan_in).

Parameters

tensor (Tensor) – A Tensor

References

  1. “Efficient BackProp.” Yann LeCun, et al. 1998.

flint.nn.init.normal_(tensor: flint.tensor.Tensor, mean: float = 0.0, std: float = 1.0) None[source]

Fill the tensor with values drawn from the normal distribution \(N(\text{mean}, \text{std}^2)\).

Parameters
  • tensor (Tensor) – A Tensor

  • mean (float, optional, default=0.) – The mean of the normal distribution

  • std (float, optional, default=1.) – The standard deviation of the normal distribution
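
A usage sketch, again assuming a NumPy-backed Tensor constructor (not documented on this page):

    >>> import numpy as np
    >>> from flint import Tensor        # assumed constructor
    >>> from flint.nn import init
    >>> w = Tensor(np.empty((256, 128)))
    >>> init.normal_(w, mean=0.0, std=0.02)   # fills w in place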

flint.nn.init.ones_(tensor: flint.tensor.Tensor) None[source]

Fill the tensor with the scalar value 1.

Parameters

tensor (Tensor) – A Tensor

flint.nn.init.uniform_(tensor: flint.tensor.Tensor, a: float = 0.0, b: float = 1.0) None[source]

Fill the tensor with values drawn from the uniform distribution \(U(a, b)\).

Parameters
  • tensor (Tensor) – A Tensor

  • a (float, optional, default=0.) – The lower bound of the uniform distribution

  • b (float, optional, default=1.) – The upper bound of the uniform distribution

flint.nn.init.xavier_normal_(tensor: flint.tensor.Tensor, gain: float = 1.0) None[source]

Implementation of the Xavier initialization proposed in [1] (also known as Glorot initialization), using a normal distribution.

The resulting tensor will have values sampled from \(N(0, \text{std}^2)\), where std = gain * sqrt(2 / (fan_in + fan_out)).

Parameters
  • tensor (Tensor) – A Tensor

  • gain (float, optional, default=1.) – An optional scaling factor

References

  1. “Understanding the Difficulty of Training Deep Feedforward Neural Networks.” Xavier Glorot and Yoshua Bengio. AISTATS 2010.

flint.nn.init.xavier_uniform_(tensor: flint.tensor.Tensor, gain: float = 1.0) None[source]

Implementation of the Xavier initialization proposed in [1] (also known as Glorot initialization), using a uniform distribution.

The resulting tensor will have values sampled from \(U(-a, a)\), where a = gain * sqrt(6 / (fan_in + fan_out)).

Parameters
  • tensor (Tensor) – A Tensor

  • gain (float, optional, default=1.) – An optional scaling factor

References

  1. “Understanding the Difficulty of Training Deep Feedforward Neural Networks.” Xavier Glorot and Yoshua Bengio. AISTATS 2010.
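
Under the same assumptions as the Kaiming sketch above (NumPy-backed data attribute, 2-D weight of shape (fan_out, fan_in)), a minimal sketch of this rule:

    import numpy as np

    def xavier_uniform_sketch(tensor, gain=1.0):
        # a = gain * sqrt(6 / (fan_in + fan_out))
        fan_out, fan_in = tensor.data.shape
        a = gain * np.sqrt(6.0 / (fan_in + fan_out))
        tensor.data = np.random.uniform(low=-a, high=a,
                                        size=tensor.data.shape)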

flint.nn.init.zeros_(tensor: flint.tensor.Tensor) None[source]

Fill the tensor with the scalar value 0.

Parameters

tensor (Tensor) – A Tensor
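
Putting the pieces together, a common pattern is a Kaiming-initialized weight with a zeroed bias. Only the init calls below come from this page; the Tensor constructor is assumed:

    >>> import numpy as np
    >>> from flint import Tensor        # assumed constructor
    >>> from flint.nn import init
    >>> weight = Tensor(np.empty((128, 64)))
    >>> bias = Tensor(np.empty(128))
    >>> init.kaiming_uniform_(weight, nonlinearity='relu')
    >>> init.zeros_(bias)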