flint.nn

Module

class flint.nn.modules.Module

Bases: object

Base class for all modules.

Parameters

name (str) – name of the module

add_module(name: str, module: Optional[flint.nn.modules.module.Module]) → None

Add a child module to the current module.

Parameters
  • name (str) – name of the child module

  • module (Module) – child module to be added to the module

children() → Iterator[flint.nn.modules.module.Module]

Returns an iterator over immediate child modules.

Yields

module (Module) – A child module

eval() → flint.nn.modules.module.Module

Sets the module in evaluation mode.

This only affects the following modules:

  • flint.nn.Dropout

See their documentation for details of their behavior in training / evaluation mode.

Returns

module

Return type

Module

modules() → Iterator[flint.nn.modules.module.Module]

Returns an iterator over all modules in the network, only yielding the module itself.

Yields

Module – a module in the network

Note

Duplicate modules are returned only once.

named_children() → Iterator[Tuple[str, flint.nn.modules.module.Module]]

Returns an iterator over immediate child modules, yielding both the name of the module and the module itself.

Yields

(string, Module) – Tuple containing a name and child module

named_modules(memo: Optional[Set[flint.nn.modules.module.Module]] = None, prefix: str = '') → Iterator[Tuple[str, flint.nn.modules.module.Module]]

Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself. Borrowed from: https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/module.py

Parameters
  • memo (Set) – a set for recording visited modules

  • prefix (str) – prefix to prepend to all module names

Yields

(string, Module) – Tuple of name and module

Note

Duplicate modules are returned only once.
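The traversal described above can be sketched in plain Python. This is a minimal illustration, not flint's actual implementation: the class name `ToyModule` and its `_modules` dictionary are assumptions made for the example. It shows how a `memo` set lets a module that is registered under two names be yielded only once, and how `prefix` builds dotted names.

```python
from typing import Dict, Iterator, Optional, Set, Tuple


class ToyModule:
    """Hypothetical stand-in for a module; not flint's actual class."""

    def __init__(self) -> None:
        self._modules: Dict[str, Optional["ToyModule"]] = {}

    def add_module(self, name: str, module: Optional["ToyModule"]) -> None:
        self._modules[name] = module

    def named_modules(
        self, memo: Optional[Set["ToyModule"]] = None, prefix: str = ""
    ) -> Iterator[Tuple[str, "ToyModule"]]:
        if memo is None:
            memo = set()
        if self in memo:
            return  # already visited, so a duplicate is not yielded again
        memo.add(self)
        yield prefix, self
        for name, child in self._modules.items():
            if child is None:
                continue
            child_prefix = prefix + ("." if prefix else "") + name
            yield from child.named_modules(memo, child_prefix)


root, shared = ToyModule(), ToyModule()
root.add_module("a", shared)
root.add_module("b", shared)  # the same object registered a second time
names = [name for name, _ in root.named_modules()]
print(names)  # ['', 'a']
```

The root module is reported under the empty prefix, and the shared child appears only under the first name it was reached by.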

named_parameters(prefix: str = '', recurse: bool = True) → Iterator[Tuple[str, flint.nn.parameter.Parameter]]

Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.

Adapted from: https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/module.py

Parameters
  • prefix (str) – prefix to prepend to all parameter names.

  • recurse (bool) – If True, yields parameters of this module and all submodules. If False, yields only parameters that are direct members of this module.

Yields

(string, Parameter) – Tuple containing the name and parameter

parameters(recurse: bool = True) → Iterator[flint.nn.parameter.Parameter]

Returns an iterator over module parameters, only yielding the parameter itself.

Parameters

recurse (bool) – If True, yields parameters of this module and all submodules. If False, yields only parameters that are direct members of this module.

Yields

Parameter – module parameter

register_parameter(name: str, param: Optional[flint.nn.parameter.Parameter]) → None

Add a parameter to the module.

Parameters
  • name (str) – name of the parameter

  • param (Parameter) – parameter to be added to the module

train(mode: bool = True) → flint.nn.modules.module.Module

Sets the module in training mode.

This only affects the following modules:

  • flint.nn.Dropout

See their documentation for details of their behavior in training / evaluation mode.

Parameters

mode (bool, optional, default=True) – Whether to set training mode (True) or evaluation mode (False)

Returns

module

Return type

Module

training: bool
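As a rough sketch of how the `training` flag and the `train()` / `eval()` pair might fit together, the example below uses a hypothetical `ToyLayer` class. It assumes that `eval()` is equivalent to `train(False)` and that the flag is propagated to child modules; the source does not state either detail, so treat both as assumptions.

```python
class ToyLayer:
    """Hypothetical module with a `training` flag; not flint's actual class."""

    def __init__(self, *children: "ToyLayer") -> None:
        self.training = True
        self._children = list(children)

    def train(self, mode: bool = True) -> "ToyLayer":
        self.training = mode
        for child in self._children:
            child.train(mode)  # assumed: the flag is pushed down to submodules
        return self  # returning the module allows call chaining

    def eval(self) -> "ToyLayer":
        return self.train(False)  # assumed: eval() is train(False)


leaf = ToyLayer()
net = ToyLayer(ToyLayer(leaf))

returned = net.eval()
flags_after_eval = (net.training, leaf.training)

net.train()
flags_after_train = (net.training, leaf.training)
print(flags_after_eval, flags_after_train)  # (False, False) (True, True)
```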

Containers

class flint.nn.modules.Sequential(*args: flint.nn.modules.module.Module)
class flint.nn.modules.Sequential(arg: OrderedDict[str, Module])

Bases: flint.nn.modules.module.Module

A sequential container. Modules will be added to it in the order they are passed in the constructor. Alternatively, an ordered dict of modules can also be passed in.
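The two construction styles can be illustrated with a small stand-in container that chains plain callables. `ToySequential` and the helper functions are hypothetical names for this sketch; naming positional layers by their index is an assumption, not something the source confirms.

```python
from collections import OrderedDict


class ToySequential:
    """Hypothetical sequential container over plain callables."""

    def __init__(self, *args) -> None:
        self.layers = OrderedDict()
        if len(args) == 1 and isinstance(args[0], OrderedDict):
            self.layers.update(args[0])  # keep the user-supplied names
        else:
            for index, layer in enumerate(args):
                self.layers[str(index)] = layer  # assumed: name by position

    def __call__(self, x):
        for layer in self.layers.values():
            x = layer(x)  # the output of one layer feeds the next
        return x


def double(x):
    return 2 * x


def add_one(x):
    return x + 1


positional = ToySequential(double, add_one)
named = ToySequential(OrderedDict([("add", add_one), ("double", double)]))
print(positional(3))  # 7
print(named(3))       # 8
```

Because layers run in insertion order, swapping the order of the same two layers changes the result.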

Activations

class flint.nn.modules.ReLU

Bases: flint.nn.modules.module.Module

ReLU (Rectified Linear Unit) activation function. See flint.nn.functional.relu() for more details.

class flint.nn.modules.LeakyReLU(negative_slope: float = 0.01)

Bases: flint.nn.modules.module.Module

Leaky ReLU activation function. See flint.nn.functional.leaky_relu() for more details.

\[\text{LeakyReLU}(x) = \max(0, x) + \text{negative\_slope} * \min(0, x) \]
Parameters

negative_slope (float, optional, default=1e-2) – Controls the angle of the negative slope.
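For a single scalar, the formula above translates directly into Python. The function below is a standalone sketch for illustration only and does not call flint.

```python
def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    # max(0, x) keeps positive inputs; min(0, x) scales negative inputs.
    return max(0.0, x) + negative_slope * min(0.0, x)


values = [leaky_relu(v) for v in (-2.0, 0.0, 3.0)]
print(values)  # [-0.02, 0.0, 3.0]
```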

class flint.nn.modules.Sigmoid

Bases: flint.nn.modules.module.Module

Sigmoid activation function. See flint.nn.functional.sigmoid() for more details.

\[\text{sigmoid}(x) = \frac{1}{1 + \exp(-x)} \]
class flint.nn.modules.Tanh

Bases: flint.nn.modules.module.Module

Tanh (Hyperbolic Tangent) activation function. See flint.nn.functional.tanh() for more details.

\[\text{tanh}(x) = \frac{\sinh(x)}{\cosh(x)} = \frac{\exp(x) - \exp(-x)}{\exp(x) + \exp(-x)} \]
class flint.nn.modules.GELU

Bases: flint.nn.modules.module.Module

Gaussian Error Linear Units (GELU) function. See flint.nn.functional.gelu() for more details.

\[\text{GELU}(x) = x \cdot \Phi(x) = x \cdot \frac{1}{2} [1 + \text{erf} (x / \sqrt{2})] \]

where \(\Phi(x)\) is the cumulative distribution function of the standard Gaussian distribution.

We can approximate it with:

\[\text{GELU}(x) = 0.5 x (1 + \text{tanh}[ \sqrt{2 / \pi} (x + 0.044715 x^3) ]) \]

or

\[\text{GELU}(x) = x \sigma(1.702 x) \]
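To get a feel for how close the two approximations are to the exact form, the three formulas can be compared on scalars using only the standard library. The function names are chosen for this sketch and are not part of flint.

```python
import math


def gelu_exact(x: float) -> float:
    # x * Phi(x), with Phi written in terms of the error function
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def gelu_sigmoid(x: float) -> float:
    # x * sigmoid(1.702 * x)
    return x / (1.0 + math.exp(-1.702 * x))


for x in (-1.0, 0.5, 1.0, 2.0):
    exact = gelu_exact(x)
    print(
        f"x={x:+.1f} exact={exact:.5f} "
        f"tanh_err={abs(gelu_tanh(x) - exact):.5f} "
        f"sigmoid_err={abs(gelu_sigmoid(x) - exact):.5f}"
    )
```

On these sample points the tanh form tracks the exact value more closely than the sigmoid form, which is cheaper to evaluate.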

References

  1. “Gaussian Error Linear Units (GELUs).” Dan Hendrycks and Kevin Gimpel. arXiv 2016.

Loss Functions

class flint.nn.modules.Loss(reduction: str = 'mean')

Bases: flint.nn.modules.module.Module

class flint.nn.modules.BCELoss(reduction: str = 'mean')

Bases: flint.nn.modules.loss.Loss

Binary Cross Entropy Loss:

\[\text{loss} = -\left[ y \log(x) + (1 - y) \log(1 - x) \right] \]

See flint.nn.functional.binary_cross_entropy() for more details.

Parameters

reduction (str, optional) – ‘none’ / ‘mean’ / ‘sum’
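The effect of the three `reduction` modes can be sketched on plain Python lists. This standalone function is an illustration, not flint's implementation; following the usual convention, it negates the log-likelihood so that the loss is non-negative, and it assumes every prediction lies strictly between 0 and 1.

```python
import math
from typing import List, Union


def binary_cross_entropy(
    inputs: List[float], targets: List[float], reduction: str = "mean"
) -> Union[float, List[float]]:
    # Per-element loss; inputs are probabilities strictly inside (0, 1).
    losses = [
        -(y * math.log(x) + (1.0 - y) * math.log(1.0 - x))
        for x, y in zip(inputs, targets)
    ]
    if reduction == "none":
        return losses
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        return sum(losses) / len(losses)
    raise ValueError(f"unknown reduction: {reduction!r}")


probs, labels = [0.9, 0.2], [1.0, 0.0]
per_element = binary_cross_entropy(probs, labels, reduction="none")
total = binary_cross_entropy(probs, labels, reduction="sum")
mean = binary_cross_entropy(probs, labels, reduction="mean")
print(per_element, total, mean)
```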

class flint.nn.modules.MSELoss(reduction: str = 'mean')

Bases: flint.nn.modules.loss.Loss

Mean Squared Error Loss: \((x - y)^2\). See flint.nn.functional.mse_loss() for more details.

Parameters

reduction (str, optional) – ‘none’ / ‘mean’ / ‘sum’

class flint.nn.modules.NllLoss(reduction: str = 'mean')

Bases: flint.nn.modules.loss.Loss

Negative Log Likelihood Loss. See flint.nn.functional.nll_loss() for more details.

Parameters

reduction (str, optional) – ‘none’ / ‘mean’ / ‘sum’

class flint.nn.modules.CrossEntropyLoss(reduction: str = 'mean')

Bases: flint.nn.modules.loss.Loss

Cross Entropy Loss, which combines softmax() and nll_loss(). See flint.nn.functional.cross_entropy() for more details.

Parameters

reduction (str, optional) – ‘none’ / ‘mean’ / ‘sum’
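One way to picture the combination of a softmax step and a negative log likelihood step is the list-based sketch below. It is an assumption-laden illustration rather than flint's code: it applies the logarithm inside a numerically stable `log_softmax` helper, takes integer class indices as targets, and all function names are local to the example.

```python
import math
from typing import List, Union


def log_softmax(logits: List[float]) -> List[float]:
    # Subtracting the maximum avoids overflow in exp().
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]


def nll_loss(log_probs: List[float], target: int) -> float:
    return -log_probs[target]


def cross_entropy(
    batch_logits: List[List[float]], targets: List[int], reduction: str = "mean"
) -> Union[float, List[float]]:
    losses = [nll_loss(log_softmax(row), t) for row, t in zip(batch_logits, targets)]
    if reduction == "none":
        return losses
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        return sum(losses) / len(losses)
    raise ValueError(f"unknown reduction: {reduction!r}")


uniform = cross_entropy([[0.0, 0.0]], [0])         # log(2) for two equal logits
confident = cross_entropy([[1000.0, 0.0]], [0])    # close to zero, no overflow
print(uniform, confident)
```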

Convolution

class flint.nn.modules.Conv1d(in_channels: int, out_channels: int, kernel_size: Union[T, Tuple[T]], stride: Union[T, Tuple[T]] = 1, padding: Union[T, Tuple[T]] = 0, dilation: Union[T, Tuple[T]] = 1, bias: bool = True)

Bases: flint.nn.modules.conv._ConvNd

Apply a 1D convolution over an input signal composed of several input planes.

  • input shape: (batch_size, in_channels, L_in)

  • output shape: (batch_size, out_channels, L_out)

where:

\[\text{L\_out} = \left\lfloor \frac{\text{L\_in + 2 * padding - dilation * (kernel\_size - 1) - 1}}{\text{stride}} \right\rfloor + 1 \]
Parameters
  • in_channels (int) – Number of channels in the input image

  • out_channels (int) – Number of channels produced by the convolution

  • kernel_size (int or tuple) – Size of the convolving kernel

  • stride (int or tuple, optional, default=1) – Stride of the convolution kernels as they move over the input volume

  • padding (int or tuple, optional, default=0) – Zero-padding added to both sides of the input

  • dilation (int or tuple, optional, default=1) – Spacing between kernel elements

  • bias (bool, optional, default=True) – Enable bias or not
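The output-length formula can be checked against a naive single-channel convolution written in plain Python. This is a sketch under simplifying assumptions (one input channel, one output channel, no bias, integer division rounding down); the function names are hypothetical and it does not reflect how flint implements `Conv1d`.

```python
from typing import List


def conv1d_output_length(
    l_in: int, kernel_size: int, stride: int = 1, padding: int = 0, dilation: int = 1
) -> int:
    return (l_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def naive_conv1d(
    signal: List[float],
    kernel: List[float],
    stride: int = 1,
    padding: int = 0,
    dilation: int = 1,
) -> List[float]:
    # Zero-pad both sides, then slide the (dilated) kernel over the signal.
    padded = [0.0] * padding + list(signal) + [0.0] * padding
    l_out = conv1d_output_length(len(signal), len(kernel), stride, padding, dilation)
    return [
        sum(kernel[k] * padded[i * stride + k * dilation] for k in range(len(kernel)))
        for i in range(l_out)
    ]


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [1.0, 0.0, -1.0]
output = naive_conv1d(signal, kernel, stride=2, padding=1)
print(len(output), output)  # 3 [-2.0, -2.0, 4.0]
```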

class flint.nn.modules.Conv2d(in_channels: int, out_channels: int, kernel_size: Union[T, Tuple[T]], stride: Union[T, Tuple[T]] = 1, padding: Union[T, Tuple[T]] = 0, dilation: Union[T, Tuple[T]] = 1, bias: bool = True)

Bases: flint.nn.modules.conv._ConvNd

Apply a 2D convolution over an input signal composed of several input planes. See flint.nn.functional.conv2d() for more details.

  • input shape: (batch_size, in_channels, h_in, w_in)

  • output shape: (batch_size, out_channels, h_out, w_out)

where:

\[\text{h\_out} = \left\lfloor \frac{\text{h\_in + 2 * padding[0] - dilation[0] * (kernel\_size[0] - 1) - 1}}{\text{stride}[0]} \right\rfloor + 1 \]
\[\text{w\_out} = \left\lfloor \frac{\text{w\_in + 2 * padding[1] - dilation[1] * (kernel\_size[1] - 1) - 1}}{\text{stride}[1]} \right\rfloor + 1 \]
Parameters
  • in_channels (int) – Number of channels in the input image

  • out_channels (int) – Number of channels produced by the convolution

  • kernel_size (int or tuple) – Size of the convolving kernel

  • stride (int or tuple[int, int], optional, default=1) – Stride of the convolution kernels as they move over the input volume

  • padding (int or tuple[int, int], optional, default=0) – Zero-padding added to both sides of the input

  • dilation (int or tuple[int, int], optional, default=1) – Spacing between kernel elements

  • bias (bool, optional, default=True) – Enable bias or not
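Since each argument may be either an int or a pair, a small helper that normalizes both forms makes the shape formulas easy to evaluate. The sketch below assumes integer division that rounds down; `_pair` and `conv2d_output_shape` are hypothetical names and not part of flint's API.

```python
from typing import Tuple, Union

IntOrPair = Union[int, Tuple[int, int]]


def _pair(value: IntOrPair) -> Tuple[int, int]:
    # An int applies to both spatial dimensions; a tuple is used as given.
    return value if isinstance(value, tuple) else (value, value)


def conv2d_output_shape(
    h_in: int,
    w_in: int,
    kernel_size: IntOrPair,
    stride: IntOrPair = 1,
    padding: IntOrPair = 0,
    dilation: IntOrPair = 1,
) -> Tuple[int, int]:
    k, s, p, d = _pair(kernel_size), _pair(stride), _pair(padding), _pair(dilation)
    h_out = (h_in + 2 * p[0] - d[0] * (k[0] - 1) - 1) // s[0] + 1
    w_out = (w_in + 2 * p[1] - d[1] * (k[1] - 1) - 1) // s[1] + 1
    return h_out, w_out


print(conv2d_output_shape(32, 32, kernel_size=3, padding=1))   # (32, 32)
print(conv2d_output_shape(28, 28, kernel_size=5, stride=2))    # (12, 12)
print(conv2d_output_shape(7, 9, (3, 5), stride=(2, 1), padding=(1, 2)))  # (4, 9)
```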

Linear

class flint.nn.modules.Linear(in_features: int, out_features: int, bias: bool = True)

Bases: flint.nn.modules.module.Module

Fully connected layer

\[y = x A^T + b \]
  • Input shape: (batch_size, in_features)

  • Output shape: (batch_size, out_features)

Parameters
  • in_features (int) – Size of each input sample.

  • out_features (int) – Size of each output sample.

  • bias (bool, optional, default=True) – Enable bias or not.
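The shapes involved in \(y = x A^T + b\) can be made concrete with nested lists. In this sketch the weight is assumed to have shape (out_features, in_features), which is what the transpose in the formula implies; the function is a standalone illustration, not flint's implementation.

```python
from typing import List, Optional

Matrix = List[List[float]]


def linear(x: Matrix, weight: Matrix, bias: Optional[List[float]] = None) -> Matrix:
    # x: (batch_size, in_features); weight: (out_features, in_features)
    out = [
        [sum(xi * wi for xi, wi in zip(row, w_row)) for w_row in weight]
        for row in x
    ]
    if bias is not None:
        out = [[v + b for v, b in zip(row, bias)] for row in out]
    return out  # (batch_size, out_features)


x = [[1.0, 2.0]]                                   # batch_size=1, in_features=2
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]      # out_features=3
y = linear(x, weight, bias=[0.5, 0.5, 0.5])
print(y)  # [[1.5, 2.5, 3.5]]
```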

class flint.nn.modules.Identity(*args: Any, **kwargs: Any)

Bases: flint.nn.modules.module.Module

A placeholder identity operator that is argument-insensitive.

  • Input shape: \((*)\), where \(*\) means any number of dimensions.

  • Output shape: \((*)\), same shape as the input.

Parameters
  • args (Any) – Any argument (unused).

  • kwargs (Any) – Any keyword argument (unused).

Pooling

class flint.nn.modules.MaxPool1d(kernel_size: Union[T, Tuple[T]], stride: Union[T, Tuple[T]] = 1, padding: Union[T, Tuple[T]] = 0, dilation: Union[T, Tuple[T]] = 1, return_indices: bool = False)

Bases: flint.nn.modules.pooling._MaxPoolNd

Apply a 1D max pooling over an input signal composed of several input planes. See flint.nn.functional.maxpool1d() for more details.

  • input shape: (batch_size, in_channels, L_in)

  • output shape: (batch_size, out_channels, L_out)

where:

\[\text{L\_out} = \left\lfloor \frac{\text{L\_in + 2 * padding - dilation * (kernel\_size - 1) - 1}}{\text{stride}} \right\rfloor + 1 \]

Note

PyTorch's documentation states that the input is implicitly zero-padded when padding is non-zero. In practice, however, PyTorch uses implicit negative-infinity padding rather than zero-padding; see this issue.

In this class, zero-padding is used.

Parameters
  • kernel_size (_size_1_t) – Size of the sliding window, must be > 0.

  • stride (_size_1_t) – Stride of the window, must be > 0. Defaults to kernel_size.

  • padding (_size_1_t, optional, default=0) – Zero-padding added to both sides of the input, must be >= 0 and <= kernel_size / 2.

  • dilation (_size_1_t, optional, default=1) – Spacing between the elements in the window, must be > 0

  • return_indices (bool, optional, default=False) – If True, will return the max indices along with the outputs
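The difference between zero-padding and negative-infinity padding only shows up when the input contains negative values near the border. The single-channel sketch below makes that visible. It is a simplified illustration with hypothetical names: dilation is fixed at 1, `stride` falls back to `kernel_size` as the parameter description states, and a `pad_value` argument (not part of the documented API) switches between the two padding styles.

```python
import math
from typing import List, Optional


def max_pool1d(
    signal: List[float],
    kernel_size: int,
    stride: Optional[int] = None,
    padding: int = 0,
    pad_value: float = 0.0,
) -> List[float]:
    stride = kernel_size if stride is None else stride
    padded = [pad_value] * padding + list(signal) + [pad_value] * padding
    l_out = (len(signal) + 2 * padding - (kernel_size - 1) - 1) // stride + 1
    # Take the maximum of each window of `kernel_size` elements.
    return [max(padded[i * stride : i * stride + kernel_size]) for i in range(l_out)]


signal = [-3.0, -1.0, -2.0, -5.0]
zero_padded = max_pool1d(signal, kernel_size=2, padding=1)
neg_inf_padded = max_pool1d(signal, kernel_size=2, padding=1, pad_value=-math.inf)
print(zero_padded)     # [0.0, -1.0, 0.0]
print(neg_inf_padded)  # [-3.0, -1.0, -5.0]
```

With zero-padding, the padded zeros win at both borders because every real value in those windows is negative; with negative-infinity padding, the padding can never be selected.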

class flint.nn.modules.MaxPool2d(kernel_size: Union[T, Tuple[T]], stride: Optional[Union[T, Tuple[T]]] = None, padding: Union[T, Tuple[T]] = 0, dilation: Union[T, Tuple[T]] = 1, return_indices: bool = False)

Bases: flint.nn.modules.pooling._MaxPoolNd

Apply a 2D max pooling over an input signal composed of several input planes. See flint.nn.functional.maxpool2d() for more details.

  • input shape: (batch_size, in_channels, h_in, w_in)

  • output shape: (batch_size, out_channels, h_out, w_out)

where:

\[\text{h\_out} = \left\lfloor \frac{\text{h\_in + 2 * padding[0] - dilation[0] * (kernel\_size[0] - 1) - 1}}{\text{stride}[0]} \right\rfloor + 1 \]
\[\text{w\_out} = \left\lfloor \frac{\text{w\_in + 2 * padding[1] - dilation[1] * (kernel\_size[1] - 1) - 1}}{\text{stride}[1]} \right\rfloor + 1 \]

Note

PyTorch's documentation states that the input is implicitly zero-padded when padding is non-zero. In practice, however, PyTorch uses implicit negative-infinity padding rather than zero-padding; see this issue.

In this class, zero-padding is used.

Parameters
  • kernel_size (_size_2_t) – Size of the sliding window, must be > 0.

  • stride (_size_2_t) – Stride of the window, must be > 0. Defaults to kernel_size.

  • padding (_size_2_t, optional, default=0) – Zero-padding added to both sides of the input, must be >= 0 and <= kernel_size / 2.

  • dilation (_size_2_t, optional, default=1) – Spacing between the elements in the window, must be > 0

  • return_indices (bool, optional, default=False) – If True, will return the max indices along with the outputs

Unfold

class flint.nn.modules.Unfold(kernel_size: Union[T, Tuple[T]], stride: Union[T, Tuple[T]] = 1, padding: Union[T, Tuple[T]] = 0, dilation: Union[T, Tuple[T]] = 1)

Bases: flint.nn.modules.module.Module

Extracts sliding local blocks from a batched input tensor. See flint.nn.functional.unfold() for more details.

  • input shape: \((N, C, H, W)\)

  • output shape: \((N, C \times \prod(\text{kernel\_size}), L)\)

where:

\[L = \prod_d \left( \left\lfloor \frac{\text{spatial\_size[d] + 2 * padding[d] - dilation[d] * (kernel\_size[d] - 1) - 1}}{\text{stride}[d]} \right\rfloor + 1 \right) \]

where \(\text{spatial\_size}\) is formed by the spatial dimensions of input (H and W above), and \(d\) is over all spatial dimensions.

Parameters
  • input (Tensor) – Input tensor

  • kernel_size (int or tuple) – Size of the sliding blocks.

  • stride (int or tuple, optional, default=1) – Stride of the sliding blocks in the input spatial dimensions.

  • padding (int or tuple, optional, default=0) – Implicit zero padding to be added on both sides of input.

  • dilation (int or tuple, optional, default=1) – A parameter that controls the stride of elements within the neighborhood.
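The output shape can be worked out without touching any tensor data. The helper below is a sketch with hypothetical names that evaluates the number of blocks \(L\) per spatial dimension (rounding down) and multiplies the results; it is not part of flint.

```python
from typing import Tuple, Union

IntOrPair = Union[int, Tuple[int, int]]


def _pair(value: IntOrPair) -> Tuple[int, int]:
    return value if isinstance(value, tuple) else (value, value)


def unfold_output_shape(
    input_shape: Tuple[int, int, int, int],
    kernel_size: IntOrPair,
    stride: IntOrPair = 1,
    padding: IntOrPair = 0,
    dilation: IntOrPair = 1,
) -> Tuple[int, int, int]:
    n, c, h, w = input_shape
    k, s, p, d = _pair(kernel_size), _pair(stride), _pair(padding), _pair(dilation)
    blocks = 1
    for size, ki, si, pi, di in zip((h, w), k, s, p, d):
        # Number of sliding-block positions along this spatial dimension.
        blocks *= (size + 2 * pi - di * (ki - 1) - 1) // si + 1
    return n, c * k[0] * k[1], blocks


print(unfold_output_shape((2, 3, 4, 4), kernel_size=2, stride=2))  # (2, 12, 4)
print(unfold_output_shape((1, 1, 5, 5), kernel_size=3))            # (1, 9, 9)
```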

Dropout

class flint.nn.modules.Dropout(p: float = 0.5)

Bases: flint.nn.modules.module.Module

Dropout randomly zeroes some of the elements of the input tensor with probability p during training, using samples from a Bernoulli distribution. The outputs are also scaled by a factor of \(\frac{1}{1 - p}\) during training. Each channel will be zeroed out independently on every forward call.

During evaluation, the module simply computes an identity function.

This has proven to be an effective technique for regularization and preventing the co-adaptation of neurons as described in the paper [1].

See flint.nn.functional.dropout() for more details.

Parameters

p (float, optional, default=0.5) – Probability of an element to be zeroed
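The training and evaluation behaviors described above can be sketched on a flat list of values. This is an illustrative stand-in rather than flint's implementation: the `training` and `rng` arguments are assumptions added for the example, and the sketch requires `p` to be strictly less than 1 so that the scale factor is defined.

```python
import random
from typing import List, Optional


def dropout(
    values: List[float],
    p: float = 0.5,
    training: bool = True,
    rng: Optional[random.Random] = None,
) -> List[float]:
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1 in this sketch")
    if not training or p == 0.0:
        return list(values)  # evaluation mode: identity
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    # Each element is dropped with probability p; survivors are rescaled
    # so that the expected value of the output matches the input.
    return [0.0 if rng.random() < p else v * scale for v in values]


train_out = dropout([1.0] * 8, p=0.5, training=True, rng=random.Random(0))
eval_out = dropout([1.0] * 8, p=0.5, training=False)
print(train_out)
print(eval_out)
```

Every surviving element in `train_out` equals 2.0, the input scaled by \(\frac{1}{1 - p}\), while `eval_out` is an unchanged copy of the input.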

References

  1. “Improving Neural Networks by Preventing Co-adaptation of Feature Detectors.” Geoffrey E. Hinton, et al. arXiv 2012.

Flatten

class flint.nn.modules.Flatten

Bases: flint.nn.modules.module.Module

Flatten the input. Does not affect the batch size.

Note

If inputs are shaped (batch,) without a feature axis, then flattening adds an extra channel dimension and output shape is (batch, 1).
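In terms of shapes, flattening keeps the first (batch) axis and collapses everything else into one axis. The helper below is a hypothetical sketch that computes only the resulting shape, including the (batch,) case from the note.

```python
from functools import reduce
from operator import mul
from typing import Tuple


def flatten_shape(shape: Tuple[int, ...]) -> Tuple[int, int]:
    batch, *rest = shape
    # The product over an empty list of axes is 1, which gives (batch, 1).
    return batch, reduce(mul, rest, 1)


print(flatten_shape((8, 3, 4, 4)))  # (8, 48)
print(flatten_shape((8,)))          # (8, 1)
```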