flint.nn¶
Module¶
- class flint.nn.modules.Module[source]¶
Bases:
object
Base class for all modules.
- Parameters
name (str) – name of the module
- add_module(name: str, module: Optional[flint.nn.modules.module.Module]) None [source]¶
Add a child module to the current module.
- Parameters
name (str) – name of the child module
module (Module) – child module to be added to the module
- children() Iterator[flint.nn.modules.module.Module] [source]¶
Returns an iterator over immediate children modules.
- Yields
module (Module) – A child module
- eval() flint.nn.modules.module.Module [source]¶
Sets the module in evaluation mode.
This has an effect only on the following modules:
flint.nn.Dropout
See their documentation for details of their behavior in training / evaluation mode.
- Returns
module
- Return type
Module
- modules() Iterator[flint.nn.modules.module.Module] [source]¶
Returns an iterator over all modules in the network, only yielding the module itself.
- Yields
Module – a module in the network
Note
Duplicate modules are returned only once.
- named_children() Iterator[Tuple[str, flint.nn.modules.module.Module]] [source]¶
Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
- Yields
(string, Module) – Tuple containing a name and child module
- named_modules(memo: Optional[Set[flint.nn.modules.module.Module]] = None, prefix: str = '') Iterator[Tuple[str, flint.nn.modules.module.Module]] [source]¶
Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself. Borrowed from: https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/module.py
- Parameters
memo (Set) – a set for recording visited modules
prefix (str) – prefix to prepend to all module names
- Yields
(string, Module) – Tuple of name and module
Note
Duplicate modules are returned only once.
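The traversal described above (a memo set of visited modules plus a dotted name prefix) can be sketched in plain Python. This is only an illustration: `TinyModule` and its `_modules` dict are hypothetical stand-ins, not flint's actual classes or attributes.

```python
from typing import Dict, Iterator, Optional, Set, Tuple


class TinyModule:
    """Hypothetical stand-in for a module; not the flint implementation."""

    def __init__(self) -> None:
        self._modules: Dict[str, "TinyModule"] = {}

    def add_module(self, name: str, module: "TinyModule") -> None:
        self._modules[name] = module

    def named_modules(
        self, memo: Optional[Set["TinyModule"]] = None, prefix: str = ""
    ) -> Iterator[Tuple[str, "TinyModule"]]:
        if memo is None:
            memo = set()
        if self in memo:
            return  # already visited, so duplicates are yielded only once
        memo.add(self)
        yield prefix, self
        for name, child in self._modules.items():
            # Build dotted names such as "block.layer" for nested children.
            child_prefix = f"{prefix}.{name}" if prefix else name
            yield from child.named_modules(memo, child_prefix)


root, shared = TinyModule(), TinyModule()
root.add_module("a", shared)
root.add_module("b", shared)  # the same object registered under a second name
names = [name for name, _ in root.named_modules()]
print(names)  # ['', 'a']
```

The shared child appears only under the first name it is reached by, which is the behavior the note about duplicates refers to.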
- named_parameters(prefix: str = '', recurse: bool = True) Iterator[Tuple[str, flint.nn.parameter.Parameter]] [source]¶
Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
Adapted from: https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/module.py
- Parameters
prefix (str) – prefix to prepend to all parameter names.
recurse (bool) – If True, yields parameters of this module and all submodules. If False, yields only parameters that are direct members of this module.
- Yields
(string, Parameter) – Tuple containing the name and parameter
- parameters(recurse: bool = True) Iterator[flint.nn.parameter.Parameter] [source]¶
Returns an iterator over module parameters, only yielding the parameter itself.
- Parameters
recurse (bool) – If True, yields parameters of this module and all submodules. If False, yields only parameters that are direct members of this module.
- Yields
Parameter – module parameter
- register_parameter(name: str, param: Optional[flint.nn.parameter.Parameter]) None [source]¶
Add a parameter to the module.
- Parameters
name (str) – name of the parameter
param (Parameter) – parameter to be added to the module
- train(mode: bool = True) flint.nn.modules.module.Module [source]¶
Sets the module in training mode.
This has an effect only on the following modules:
flint.nn.Dropout
See their documentation for details of their behavior in training / evaluation mode.
- Parameters
mode (bool, optional, default=True) – Whether to set training mode (True) or evaluation mode (False)
- Returns
module
- Return type
Module
- training: bool¶
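A minimal sketch of how a `training` flag might be toggled by `train()` and `eval()`. It assumes, as in PyTorch, that the mode propagates to child modules and that `eval()` is equivalent to `train(False)`; the source does not confirm either detail for flint, and `ModeModule` is a hypothetical class.

```python
class ModeModule:
    """Hypothetical module that tracks a training flag; not flint's code."""

    def __init__(self):
        self.training = True
        self._modules = {}

    def add_module(self, name, module):
        self._modules[name] = module

    def train(self, mode=True):
        # Assumption: the flag is set on this module and on every child.
        self.training = mode
        for child in self._modules.values():
            child.train(mode)
        return self  # returning self allows call chaining

    def eval(self):
        # Assumption: evaluation mode is simply train(False).
        return self.train(False)


net, child = ModeModule(), ModeModule()
net.add_module("child", child)
net.eval()
print(net.training, child.training)  # False False
net.train()
print(net.training, child.training)  # True True
```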
Containers¶
- class flint.nn.modules.Sequential(*args: flint.nn.modules.module.Module)[source]¶
- class flint.nn.modules.Sequential(arg: OrderedDict[str, Module])
Bases:
flint.nn.modules.module.Module
A sequential container. Modules will be added to it in the order they are passed in the constructor. Alternatively, an ordered dict of modules can also be passed in.
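The two construction styles (positional modules or an ordered dict) can be illustrated with a small stand-in container. `TinySequential` is hypothetical and uses plain callables in place of modules; it is a sketch of the idea, not flint's implementation.

```python
from collections import OrderedDict


class TinySequential:
    """Hypothetical container that applies callables in insertion order."""

    def __init__(self, *args):
        if len(args) == 1 and isinstance(args[0], OrderedDict):
            # Named layers: keep the names given in the ordered dict.
            self.layers = list(args[0].items())
        else:
            # Positional layers: name them "0", "1", ... by position.
            self.layers = [(str(i), layer) for i, layer in enumerate(args)]

    def __call__(self, x):
        for _, layer in self.layers:
            x = layer(x)
        return x


def double(v):
    return 2 * v


def add_one(v):
    return v + 1


result_positional = TinySequential(double, add_one)(3)  # (3 * 2) + 1
result_named = TinySequential(OrderedDict([("inc", add_one), ("dbl", double)]))(3)  # (3 + 1) * 2
print(result_positional, result_named)  # 7 8
```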
Activations¶
- class flint.nn.modules.ReLU[source]¶
Bases:
flint.nn.modules.module.Module
ReLU (Rectified Linear Unit) activation function. See flint.nn.functional.relu() for more details.
- class flint.nn.modules.LeakyReLU(negative_slope: float = 0.01)[source]¶
Bases:
flint.nn.modules.module.Module
Leaky ReLU activation function. See flint.nn.functional.leaky_relu() for more details.
\[\text{LeakyReLU}(x) = \max(0, x) + \text{negative\_slope} * \min(0, x) \]
- Parameters
negative_slope (float, optional, default=1e-2) – Controls the angle of the negative slope.
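The Leaky ReLU formula can be checked on scalars with a few lines of plain Python. This is a direct transcription of the formula for a single float, not the tensor implementation in flint.

```python
def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    """max(0, x) + negative_slope * min(0, x) for a single scalar."""
    return max(0.0, x) + negative_slope * min(0.0, x)


outputs = [leaky_relu(v) for v in (-2.0, 0.0, 3.0)]
print(outputs)  # negative inputs are scaled by negative_slope, positive ones pass through
```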
- class flint.nn.modules.Sigmoid[source]¶
Bases:
flint.nn.modules.module.Module
Sigmoid activation function. See flint.nn.functional.sigmoid() for more details.
\[\text{sigmoid}(x) = \frac{1}{1 + \exp(-x)} \]
- class flint.nn.modules.Tanh[source]¶
Bases:
flint.nn.modules.module.Module
Tanh (Hyperbolic Tangent) activation function. See flint.nn.functional.tanh() for more details.
\[\text{tanh}(x) = \frac{\sinh(x)}{\cosh(x)} = \frac{\exp(x) - \exp(-x)}{\exp(x) + \exp(-x)} \]
- class flint.nn.modules.GELU[source]¶
Bases:
flint.nn.modules.module.Module
Gaussian Error Linear Units (GELU) function. See flint.nn.functional.gelu() for more details.
\[\text{GELU}(x) = x \cdot \Phi(x) = x \cdot \frac{1}{2} [1 + \text{erf} (x / \sqrt{2})] \]
where \(\Phi(x)\) is the cumulative distribution function of the standard Gaussian distribution.
It can be approximated with:
\[\text{GELU}(x) \approx 0.5 x (1 + \text{tanh}[ \sqrt{2 / \pi} (x + 0.044715 x^3) ]) \]
or
\[\text{GELU}(x) \approx x \sigma(1.702 x) \]
References
“Gaussian Error Linear Units (GELUs).” Dan Hendrycks and Kevin Gimpel. arXiv 2016.
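To get a feel for how close the two approximations are to the erf-based definition, the three formulas can be compared on scalars using only the standard library. The function names below are illustrative and are not flint identifiers.

```python
import math


def gelu_exact(x: float) -> float:
    """x * Phi(x), with Phi written in terms of the error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh-based approximation."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def gelu_sigmoid(x: float) -> float:
    """Sigmoid-based approximation: x * sigmoid(1.702 * x)."""
    return x / (1.0 + math.exp(-1.702 * x))


for x in (-2.0, -0.5, 0.0, 1.0, 3.0):
    print(f"{x:5.1f}  exact={gelu_exact(x):.5f}  "
          f"tanh={gelu_tanh(x):.5f}  sigmoid={gelu_sigmoid(x):.5f}")
```

On these inputs the tanh form tracks the exact value more closely than the sigmoid form, which is cheaper to evaluate.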
Loss Functions¶
- class flint.nn.modules.BCELoss(reduction: str = 'mean')[source]¶
Bases:
flint.nn.modules.loss.Loss
Binary Cross Entropy Loss:
\[\text{loss} = -[y \log(x) + (1 - y) \log(1 - x)] \]
See flint.nn.functional.binary_cross_entropy() for more details.
- Parameters
reduction (str, optional) – ‘none’ / ‘mean’ / ‘sum’
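The effect of the three reduction modes can be sketched on plain Python lists. The sketch uses the conventional leading minus sign so that the loss is non-negative, and it assumes that ‘mean’ averages over all elements; `bce_loss` is a hypothetical helper, not flint's function.

```python
import math
from typing import List, Union


def bce_loss(
    x: List[float], y: List[float], reduction: str = "mean"
) -> Union[float, List[float]]:
    """Binary cross entropy for predictions x in (0, 1) and targets y."""
    losses = [-(t * math.log(p) + (1 - t) * math.log(1 - p)) for p, t in zip(x, y)]
    if reduction == "none":
        return losses                      # one loss value per element
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        return sum(losses) / len(losses)   # assumed: average over all elements
    raise ValueError(f"unknown reduction: {reduction!r}")


preds, targets = [0.9, 0.2], [1.0, 0.0]
per_element = bce_loss(preds, targets, reduction="none")
total = bce_loss(preds, targets, reduction="sum")
mean = bce_loss(preds, targets)
print(per_element, total, mean)
```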
- class flint.nn.modules.MSELoss(reduction: str = 'mean')[source]¶
Bases:
flint.nn.modules.loss.Loss
Mean Squared Error Loss: \((x - y)^2\). See flint.nn.functional.mse_loss() for more details.
- Parameters
reduction (str, optional) – ‘none’ / ‘mean’ / ‘sum’
- class flint.nn.modules.NllLoss(reduction: str = 'mean')[source]¶
Bases:
flint.nn.modules.loss.Loss
Negative Log Likelihood Loss. See flint.nn.functional.nll_loss() for more details.
- Parameters
reduction (str, optional) – ‘none’ / ‘mean’ / ‘sum’
- class flint.nn.modules.CrossEntropyLoss(reduction: str = 'mean')[source]¶
Bases:
flint.nn.modules.loss.Loss
Cross Entropy Loss, combines softmax() and nll_loss(). See flint.nn.functional.cross_entropy() for more details.
- Parameters
reduction (str, optional) – ‘none’ / ‘mean’ / ‘sum’
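How a softmax step and a negative log likelihood step compose into cross entropy can be sketched for a single sample. The sketch assumes, as in PyTorch, that the NLL step consumes log-probabilities, so it computes a numerically stable log-softmax; whether flint does the same internally is not stated in the source. All function names are illustrative.

```python
import math
from typing import List


def log_softmax(logits: List[float]) -> List[float]:
    """Stable log-softmax: subtract the max before exponentiating."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]


def nll_loss(log_probs: List[float], target: int) -> float:
    """Negative log likelihood of the target class (assumes log-probabilities)."""
    return -log_probs[target]


def cross_entropy(logits: List[float], target: int) -> float:
    return nll_loss(log_softmax(logits), target)


loss = cross_entropy([2.0, 1.0, 0.1], target=0)
uniform = cross_entropy([0.0, 0.0], target=1)  # two equally likely classes
print(loss, uniform)
```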
Convolution¶
- class flint.nn.modules.Conv1d(in_channels: int, out_channels: int, kernel_size: Union[T, Tuple[T]], stride: Union[T, Tuple[T]] = 1, padding: Union[T, Tuple[T]] = 0, dilation: Union[T, Tuple[T]] = 1, bias: bool = True)[source]¶
Bases:
flint.nn.modules.conv._ConvNd
Apply a 1D convolution over an input signal composed of several input planes.
input shape:
(batch_size, in_channels, L_in)
output shape:
(batch_size, out_channels, L_out)
where:
\[\text{L\_out} = \frac{\text{L\_in + 2 * padding - dilation * (kernel\_size - 1) - 1}}{\text{stride}} + 1 \]
- Parameters
in_channels (int) – Number of channels in the input image
out_channels (int) – Number of channels produced by the convolution
kernel_size (int or tuple) – Size of the convolving kernel
stride (int or tuple, optional, default=1) – Stride of the convolution kernels as they move over the input volume
padding (int or tuple, optional, default=0) – Zero-padding added to both sides of the input
dilation (int or tuple, optional, default=1) – Spacing between kernel elements
bias (bool, optional, default=True) – Enable bias or not
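The output-length formula can be evaluated with a small helper. The division is taken as floor division here, which is an assumption (it matches PyTorch's convention and guarantees an integer length); `conv1d_out_length` is a hypothetical name.

```python
def conv1d_out_length(
    l_in: int, kernel_size: int, stride: int = 1, padding: int = 0, dilation: int = 1
) -> int:
    """L_out from the formula above, using floor division (assumed)."""
    return (l_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


plain = conv1d_out_length(10, kernel_size=3)                       # no padding, stride 1
strided = conv1d_out_length(10, kernel_size=3, stride=2, padding=1)
dilated = conv1d_out_length(10, kernel_size=3, dilation=2)
print(plain, strided, dilated)  # 8 5 6
```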
- class flint.nn.modules.Conv2d(in_channels: int, out_channels: int, kernel_size: Union[T, Tuple[T]], stride: Union[T, Tuple[T]] = 1, padding: Union[T, Tuple[T]] = 0, dilation: Union[T, Tuple[T]] = 1, bias: bool = True)[source]¶
Bases:
flint.nn.modules.conv._ConvNd
Apply a 2D convolution over an input signal composed of several input planes. See flint.nn.functional.conv2d() for more details.
input shape:
(batch_size, in_channels, h_in, w_in)
output shape:
(batch_size, out_channels, h_out, w_out)
where:
\[\text{h\_out} = \frac{\text{h\_in + 2 * padding[0] - dilation[0] * (kernel\_size[0] - 1) - 1}}{\text{stride}[0]} + 1 \]
\[\text{w\_out} = \frac{\text{w\_in + 2 * padding[1] - dilation[1] * (kernel\_size[1] - 1) - 1}}{\text{stride}[1]} + 1 \]
- Parameters
in_channels (int) – Number of channels in the input image
out_channels (int) – Number of channels produced by the convolution
kernel_size (int or tuple) – Size of the convolving kernel
stride (int or tuple[int, int], optional, default=1) – Stride of the convolution kernels as they move over the input volume
padding (int or tuple[int, int], optional, default=0) – Zero-padding added to both sides of the input
dilation (int or tuple[int, int], optional, default=1) – Spacing between kernel elements
bias (bool, optional, default=True) – Enable bias or not
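Because each argument may be an int or a pair, a shape helper first has to normalize them. The sketch below does that and then applies the two formulas; floor division and the `_pair` helper are assumptions for illustration, not flint code.

```python
from typing import Tuple, Union

IntOrPair = Union[int, Tuple[int, int]]


def _pair(value: IntOrPair) -> Tuple[int, int]:
    """Expand a single int to (int, int); leave tuples unchanged."""
    return value if isinstance(value, tuple) else (value, value)


def conv2d_out_size(
    h_in: int,
    w_in: int,
    kernel_size: IntOrPair,
    stride: IntOrPair = 1,
    padding: IntOrPair = 0,
    dilation: IntOrPair = 1,
) -> Tuple[int, int]:
    k, s, p, d = _pair(kernel_size), _pair(stride), _pair(padding), _pair(dilation)
    h_out = (h_in + 2 * p[0] - d[0] * (k[0] - 1) - 1) // s[0] + 1
    w_out = (w_in + 2 * p[1] - d[1] * (k[1] - 1) - 1) // s[1] + 1
    return h_out, w_out


same = conv2d_out_size(32, 32, kernel_size=3, padding=1)
halved = conv2d_out_size(224, 224, kernel_size=7, stride=2, padding=3)
rect = conv2d_out_size(10, 20, kernel_size=(3, 5), stride=(1, 2))
print(same, halved, rect)  # (32, 32) (112, 112) (8, 8)
```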
Linear¶
- class flint.nn.modules.Linear(in_features: int, out_features: int, bias: bool = True)[source]¶
Bases:
flint.nn.modules.module.Module
Fully connected layer:
\[y = x A^T + b \]
Input shape:
(batch_size, in_features)
Output shape:
(batch_size, out_features)
- Parameters
in_features (int) – Size of each input sample.
out_features (int) – Size of each output sample.
bias (bool, optional, default=True) – Enable bias or not.
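The affine map can be written out on nested lists to make the shapes concrete. The sketch assumes the weight `A` is stored with shape `(out_features, in_features)`, which is what the transpose in the formula suggests; `linear` is a hypothetical helper.

```python
from typing import List, Optional

Matrix = List[List[float]]


def linear(x: Matrix, weight: Matrix, bias: Optional[List[float]] = None) -> Matrix:
    """Compute x @ weight^T + bias for x of shape (batch_size, in_features)."""
    out = []
    for row in x:
        # Each output feature is the dot product of the input row with one weight row.
        y = [sum(xi * wi for xi, wi in zip(row, w_row)) for w_row in weight]
        if bias is not None:
            y = [yi + bi for yi, bi in zip(y, bias)]
        out.append(y)
    return out


x = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]   # (batch_size=2, in_features=3)
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]   # (out_features=2, in_features=3), assumed layout
b = [0.5, -1.0]
y = linear(x, A, b)
print(y)  # [[1.5, 4.0], [0.5, 0.0]]
```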
- class flint.nn.modules.Identity(*args: Any, **kwargs: Any)[source]¶
Bases:
flint.nn.modules.module.Module
A placeholder identity operator that is argument-insensitive.
Input shape: \((*)\), where \(*\) means any number of dimensions.
Output shape: \((*)\), same shape as the input.
- Parameters
args (Any) – Any argument (unused).
kwargs (Any) – Any keyword argument (unused).
Pooling¶
- class flint.nn.modules.MaxPool1d(kernel_size: Union[T, Tuple[T]], stride: Union[T, Tuple[T]] = 1, padding: Union[T, Tuple[T]] = 0, dilation: Union[T, Tuple[T]] = 1, return_indices: bool = False)[source]¶
Bases:
flint.nn.modules.pooling._MaxPoolNd
Apply a 1D max pooling over an input signal composed of several input planes. See flint.nn.functional.maxpool1d() for more details.
input shape:
(batch_size, in_channels, L_in)
output shape:
(batch_size, out_channels, L_out)
where:
\[\text{L\_out} = \frac{\text{L\_in + 2 * padding - dilation * (kernel\_size - 1) - 1}}{\text{stride}} + 1 \]
Note
PyTorch's documentation states that the input is implicitly zero-padded when padding is non-zero. In practice, however, it pads with negative infinity rather than zeros; see this issue.
In this class, zero-padding is used.
- Parameters
kernel_size (_size_1_t) – Size of the sliding window, must be > 0.
stride (_size_1_t) – Stride of the window, must be > 0. Defaults to kernel_size.
padding (_size_1_t, optional, default=0) – Zero-padding added to both sides of the input, must be >= 0 and <= kernel_size / 2.
dilation (_size_1_t, optional, default=1) – Spacing between the elements in the window, must be > 0
return_indices (bool, optional, default=False) – If True, will return the max indices along with the outputs
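The practical difference between zero padding and negative-infinity padding shows up when all inputs are negative. The following sketch pools a single 1D signal with either pad value; it omits dilation and is a hypothetical helper, not flint's or PyTorch's implementation.

```python
import math
from typing import List, Optional


def max_pool1d(
    signal: List[float],
    kernel_size: int,
    stride: Optional[int] = None,
    padding: int = 0,
    pad_value: float = 0.0,
) -> List[float]:
    """Max pooling over one channel; stride falls back to kernel_size."""
    stride = kernel_size if stride is None else stride
    padded = [pad_value] * padding + list(signal) + [pad_value] * padding
    l_out = (len(padded) - kernel_size) // stride + 1
    return [max(padded[i * stride : i * stride + kernel_size]) for i in range(l_out)]


signal = [-3.0, -1.0, -4.0, -2.0]
zero_padded = max_pool1d(signal, kernel_size=2, padding=1, pad_value=0.0)
neg_inf_padded = max_pool1d(signal, kernel_size=2, padding=1, pad_value=-math.inf)
print(zero_padded)     # [0.0, -1.0, 0.0]   padding zeros win at the borders
print(neg_inf_padded)  # [-3.0, -1.0, -2.0] padding never wins
```

With zero padding, the border outputs come from the padding itself rather than from the data, which is worth keeping in mind when inputs can be negative.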
- class flint.nn.modules.MaxPool2d(kernel_size: Union[T, Tuple[T]], stride: Optional[Union[T, Tuple[T]]] = None, padding: Union[T, Tuple[T]] = 0, dilation: Union[T, Tuple[T]] = 1, return_indices: bool = False)[source]¶
Bases:
flint.nn.modules.pooling._MaxPoolNd
Apply a 2D max pooling over an input signal composed of several input planes. See flint.nn.functional.maxpool2d() for more details.
input shape:
(batch_size, in_channels, h_in, w_in)
output shape:
(batch_size, out_channels, h_out, w_out)
where:
\[\text{h\_out} = \frac{\text{h\_in + 2 * padding[0] - dilation[0] * (kernel\_size[0] - 1) - 1}}{\text{stride}[0]} + 1 \]
\[\text{w\_out} = \frac{\text{w\_in + 2 * padding[1] - dilation[1] * (kernel\_size[1] - 1) - 1}}{\text{stride}[1]} + 1 \]
Note
PyTorch's documentation states that the input is implicitly zero-padded when padding is non-zero. In practice, however, it pads with negative infinity rather than zeros; see this issue.
In this class, zero-padding is used.
- Parameters
kernel_size (_size_2_t) – Size of the sliding window, must be > 0.
stride (_size_2_t) – Stride of the window, must be > 0. Defaults to kernel_size.
padding (_size_2_t, optional, default=0) – Zero-padding added to both sides of the input, must be >= 0 and <= kernel_size / 2.
dilation (_size_2_t, optional, default=1) – Spacing between the elements in the window, must be > 0
return_indices (bool, optional, default=False) – If True, will return the max indices along with the outputs
Unfold¶
- class flint.nn.modules.Unfold(kernel_size: Union[T, Tuple[T]], stride: Union[T, Tuple[T]] = 1, padding: Union[T, Tuple[T]] = 0, dilation: Union[T, Tuple[T]] = 1)[source]¶
Bases:
flint.nn.modules.module.Module
Extracts sliding local blocks from a batched input tensor. See flint.nn.functional.unfold() for more details.
input shape: \((N, C, H, W)\)
output shape: \((N, C \times \prod(\text{kernel\_size}), L)\)
where:
\[L = \prod_d \left( \frac{\text{spatial\_size[d] + 2 * padding[d] - dilation[d] * (kernel\_size[d] - 1) - 1}}{\text{stride}[d]} + 1 \right) \]
where \(\text{spatial\_size}\) is formed by the spatial dimensions of input (H and W above), and \(d\) is over all spatial dimensions.
- Parameters
input (Tensor) – Input tensor
kernel_size (int or tuple) – Size of the sliding blocks.
stride (int or tuple, optional, default=1) – Stride of the sliding blocks in the input spatial dimensions.
padding (int or tuple, optional, default=0) – Implicit zero padding to be added on both sides of input.
dilation (int or tuple, optional, default=1) – A parameter that controls the stride of elements within the neighborhood.
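The output shape of an unfold operation can be computed without touching any tensor data. The helper below is a hypothetical sketch that takes the per-dimension block count as a floor division plus one and multiplies across both spatial dimensions (the floor is an assumption, following PyTorch's convention).

```python
from typing import Tuple, Union

IntOrPair = Union[int, Tuple[int, int]]


def unfold_output_shape(
    input_shape: Tuple[int, int, int, int],
    kernel_size: IntOrPair,
    stride: IntOrPair = 1,
    padding: IntOrPair = 0,
    dilation: IntOrPair = 1,
) -> Tuple[int, int, int]:
    n, c, h, w = input_shape
    # Normalize every argument to a (height, width) pair.
    k, s, p, d = (
        v if isinstance(v, tuple) else (v, v)
        for v in (kernel_size, stride, padding, dilation)
    )
    blocks = 1
    for size, ki, si, pi, di in zip((h, w), k, s, p, d):
        blocks *= (size + 2 * pi - di * (ki - 1) - 1) // si + 1
    return n, c * k[0] * k[1], blocks


shape_a = unfold_output_shape((2, 5, 3, 4), kernel_size=(2, 3))
shape_b = unfold_output_shape((1, 3, 4, 4), kernel_size=2)
print(shape_a, shape_b)  # (2, 30, 4) (1, 12, 9)
```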
Dropout¶
- class flint.nn.modules.Dropout(p: float = 0.5)[source]¶
Bases:
flint.nn.modules.module.Module
Randomly zeroes some of the elements of the input tensor with probability p during training, using samples from a Bernoulli distribution. The outputs are also scaled by a factor of \(\frac{1}{1 - p}\) during training. Each channel will be zeroed out independently on every forward call.
During evaluation, the module simply computes an identity function.
This has proven to be an effective technique for regularization and preventing the co-adaptation of neurons as described in the paper [1].
See flint.nn.functional.dropout() for more details.
- Parameters
p (float, optional, default=0.5) – Probability of an element to be zeroed
References
[1] “Improving Neural Networks by Preventing Co-adaptation of Feature Detectors.” Geoffrey E. Hinton, et al. arXiv 2012.
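The training-time scaling and the evaluation-time identity can be sketched on a flat list of floats. This is a hypothetical, element-wise illustration that uses Python's `random` module for the Bernoulli draws; flint's tensor implementation may differ in detail.

```python
import random
from typing import List, Optional


def dropout(
    values: List[float],
    p: float = 0.5,
    training: bool = True,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Zero each element with probability p and scale survivors by 1 / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1) for this sketch")
    if not training or p == 0.0:
        return list(values)  # evaluation mode: identity
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * scale for v in values]


ones = [1.0] * 1000
train_out = dropout(ones, p=0.5, rng=random.Random(0))
eval_out = dropout(ones, p=0.5, training=False)
print(sorted(set(train_out)))              # survivors are scaled to 2.0, the rest are 0.0
print(sum(train_out) / len(train_out))     # close to 1.0 on average
```

Scaling by \(\frac{1}{1 - p}\) keeps the expected value of each element unchanged, which is why no rescaling is needed at evaluation time.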
Flatten¶
- class flint.nn.modules.Flatten[source]¶
Bases:
flint.nn.modules.module.Module
Flatten the input. Does not affect the batch size.
Note
If inputs are shaped (batch,) without a feature axis, then flattening adds an extra channel dimension and output shape is (batch, 1).
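The shape rule, including the `(batch,)` edge case from the note, can be expressed as a small shape-only helper. `flatten_shape` is a hypothetical function that operates on shape tuples rather than tensors.

```python
from math import prod
from typing import Tuple


def flatten_shape(shape: Tuple[int, ...]) -> Tuple[int, int]:
    """Keep the batch axis and collapse all remaining axes into one."""
    batch = shape[0]
    # prod(()) is 1, so a (batch,) input becomes (batch, 1).
    return batch, prod(shape[1:])


image_batch = flatten_shape((8, 3, 4, 4))
no_feature_axis = flatten_shape((8,))
print(image_batch, no_feature_axis)  # (8, 48) (8, 1)
```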