flint.nn.functional
- flint.nn.functional.binary_cross_entropy(input: flint.tensor.Tensor, target: flint.tensor.Tensor, reduction: str = 'mean') → flint.tensor.Tensor [source]
Binary Cross Entropy Loss
\[\text{loss} = - (y \log(x) + (1 - y) \log(1 - x)) \]
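As a reference for the formula above, here is a minimal NumPy sketch of the element-wise loss with mean/sum reduction. It is an illustration only, not flint code, and the clamping constant is an assumption to avoid log(0).

```python
import numpy as np

def binary_cross_entropy_ref(x: np.ndarray, y: np.ndarray, reduction: str = "mean"):
    """Reference BCE: assumes x holds probabilities in (0, 1) and y holds 0/1 targets."""
    eps = 1e-12                       # assumed clamp to avoid log(0)
    x = np.clip(x, eps, 1.0 - eps)
    loss = -(y * np.log(x) + (1.0 - y) * np.log(1.0 - x))
    if reduction == "mean":
        return loss.mean()
    if reduction == "sum":
        return loss.sum()
    return loss
```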
- flint.nn.functional.conv1d(input: flint.tensor.Tensor, weight: flint.tensor.Tensor, bias: Optional[flint.tensor.Tensor] = None, stride: Tuple[int] = (1,), padding: Tuple[int] = (0,), dilation: Tuple[int] = (1,))[source]
Apply a 1D convolution over an input signal composed of several input planes.
input shape:
(batch_size, in_channels, L_in)
output shape:
(batch_size, out_channels, L_out)
where:
\[\text{L\_out} = \frac{\text{L\_in + 2 * padding - dilation * (kernel\_size - 1) - 1}}{\text{stride}} + 1 \]
- Parameters
input (Tensor) – Input tensor
weight (Tensor) – Weight of the conv1d layer
bias (Tensor, optional) – Bias of the conv1d layer
stride (Tuple[int], optional, default: (1, )) – Stride of the convolution
padding (Tuple[int], optional, default: (0, )) – Zero-padding added to both sides of the input
dilation (Tuple[int], optional, default: (1, )) – Spacing between kernel elements
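To make the shape formula concrete, here is a small worked example in plain Python (floor division is assumed for the fraction, as is conventional for convolution output sizes):

```python
# Worked example of the L_out formula above (plain Python, independent of flint):
L_in, padding, dilation, kernel_size, stride = 10, 1, 1, 3, 2
L_out = (L_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1
print(L_out)  # 5 -> an input of shape (N, C_in, 10) yields an output of shape (N, C_out, 5)
```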
- flint.nn.functional.conv2d(input: flint.tensor.Tensor, weight: flint.tensor.Tensor, bias: Optional[flint.tensor.Tensor] = None, stride: Tuple[int] = (1, 1), padding: Tuple[int] = (0, 0), dilation: Tuple[int] = (1, 1))[source]
Apply a 2D convolution over an input signal composed of several input planes.
input shape:
(batch_size, in_channels, h_in, w_in)
output shape:
(batch_size, out_channels, h_out, w_out)
where:
\[\text{h\_out} = \frac{\text{h\_in + 2 * padding[0] - dilation[0] * (kernel\_size[0] - 1) - 1}}{\text{stride}[0]} + 1 \]
\[\text{w\_out} = \frac{\text{w\_in + 2 * padding[1] - dilation[1] * (kernel\_size[1] - 1) - 1}}{\text{stride}[1]} + 1 \]
Note
Uses the unfold function to perform the convolution as a single matrix multiplication. For more details, see [1].
- Parameters
input (Tensor) – Input tensor
weight (Tensor) – Weight of the conv2d layer
bias (Tensor, optional) – Bias of the conv2d layer
stride (Tuple[int, int], optional, default=(1, 1)) – Stride of the convolution
padding (Tuple[int, int], optional, default=(0, 0)) – Zero-padding added to both sides of the input
dilation (Tuple[int, int], optional, default=(1, 1)) – Spacing between kernel elements
References
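The following NumPy sketch illustrates the unfold-then-matmul approach described in the note and the documented shape formula. It is a reference implementation for illustration only, not flint's actual code, and the helper name is hypothetical.

```python
import numpy as np

def conv2d_ref(x, w, b=None, stride=(1, 1), padding=(0, 0), dilation=(1, 1)):
    """Illustrative conv2d via an im2col-style unfold + a single matmul (NumPy, not flint)."""
    n, c_in, h_in, w_in = x.shape
    c_out, _, kh, kw = w.shape
    x = np.pad(x, ((0, 0), (0, 0), (padding[0],) * 2, (padding[1],) * 2))
    h_out = (h_in + 2 * padding[0] - dilation[0] * (kh - 1) - 1) // stride[0] + 1
    w_out = (w_in + 2 * padding[1] - dilation[1] * (kw - 1) - 1) // stride[1] + 1
    # Unfold: gather every (kh, kw) patch into a column -> (n, c_in*kh*kw, h_out*w_out)
    cols = np.empty((n, c_in * kh * kw, h_out * w_out))
    for i in range(h_out):
        for j in range(w_out):
            hs, ws = i * stride[0], j * stride[1]
            patch = x[:, :, hs:hs + dilation[0] * (kh - 1) + 1:dilation[0],
                            ws:ws + dilation[1] * (kw - 1) + 1:dilation[1]]
            cols[:, :, i * w_out + j] = patch.reshape(n, -1)
    out = w.reshape(c_out, -1) @ cols            # the single matrix multiplication
    if b is not None:
        out += b.reshape(1, c_out, 1)
    return out.reshape(n, c_out, h_out, w_out)

x = np.random.randn(2, 3, 8, 8)
w = np.random.randn(4, 3, 3, 3)
print(conv2d_ref(x, w, stride=(2, 2), padding=(1, 1)).shape)  # (2, 4, 4, 4)
```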
- flint.nn.functional.cross_entropy(input: flint.tensor.Tensor, target: flint.tensor.Tensor, reduction: str = 'mean') → flint.tensor.Tensor [source]
Cross Entropy Loss
Note
Combines softmax() and nll_loss(), which is different from nn.functional.cross_entropy() in PyTorch!
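Based on the note above, the computation amounts to a softmax followed by a negative log likelihood on the resulting probabilities. Below is a NumPy reference sketch of that composition; the input/target shapes and mean reduction are assumptions, and this is not flint code.

```python
import numpy as np

def cross_entropy_ref(logits: np.ndarray, target: np.ndarray) -> float:
    """logits: (N, num_classes); target: (N,) integer class indices. Mean reduction assumed."""
    z = logits - logits.max(axis=1, keepdims=True)               # shift for numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)     # softmax
    return float(-np.log(probs[np.arange(len(target)), target]).mean())  # nll on probabilities
```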
- flint.nn.functional.dropout(input: flint.tensor.Tensor, p: float = 0.5, training: bool = True) → flint.tensor.Tensor [source]
During training, dropout randomly zeroes some of the elements of the input tensor with probability p, using samples from a Bernoulli distribution, and scales the outputs by a factor of \(\frac{1}{1 - p}\). Each channel will be zeroed out independently on every forward call.
During evaluation, it simply computes an identity function.
This has proven to be an effective technique for regularization and preventing the co-adaptation of neurons as described in the paper [1].
- Parameters
p (float, optional, default=0.5) – Probability of an element to be zeroed
training (bool) – Apply dropout if True
References
“Improving Neural Networks by Preventing Co-adaptation of Feature Detectors.” Geoffrey E. Hinton, et al. arXiv 2012.
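A minimal NumPy sketch of the behaviour described above (inverted dropout with scaling by 1 / (1 - p)); this is an illustration of the technique, not flint's implementation.

```python
import numpy as np

def dropout_ref(x: np.ndarray, p: float = 0.5, training: bool = True) -> np.ndarray:
    """Inverted dropout (NumPy illustration, not flint code)."""
    if not training or p == 0.0:
        return x                                          # identity at evaluation time
    mask = np.random.binomial(1, 1.0 - p, size=x.shape)   # Bernoulli keep-mask
    return x * mask / (1.0 - p)                           # scale kept activations by 1 / (1 - p)
```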
- flint.nn.functional.flatten(input: flint.tensor.Tensor) → flint.tensor.Tensor [source]
Flatten the input. Does not affect the batch size.
Note
If inputs are shaped (batch,) without a feature axis, then flattening adds an extra channel dimension and the output shape is (batch, 1).
- flint.nn.functional.gelu(input: flint.tensor.Tensor) → flint.tensor.Tensor [source]
Compute GELU (Gaussian Error Linear Units) [1] element-wise.
\[\text{GELU}(x) = x \cdot \Phi(x) = x \cdot \frac{1}{2} [1 + \text{erf} (x / \sqrt{2})] \]
where \(\Phi(x)\) is the Cumulative Distribution Function of the Gaussian distribution.
We can approximate it with:
\[\text{GELU}(x) = 0.5 x (1 + \text{tanh}[ \sqrt{2 / \pi} (x + 0.044715 x^3) ]) \]
or
\[\text{GELU}(x) = x \sigma(1.702 x) \]
References
“Gaussian Error Linear Units (GELUs).” Dan Hendrycks and Kevin Gimpel. arXiv 2016.
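A NumPy/SciPy illustration of the exact form and the two approximations above; this is not flint code, only a numeric check of the formulas.

```python
import numpy as np
from scipy.special import erf  # exact Gaussian CDF via the error function

def gelu_exact(x):
    return x * 0.5 * (1.0 + erf(x / np.sqrt(2.0)))

def gelu_tanh(x):
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def gelu_sigmoid(x):
    return x / (1.0 + np.exp(-1.702 * x))

x = np.linspace(-3, 3, 7)
print(np.max(np.abs(gelu_exact(x) - gelu_tanh(x))))     # small: the tanh approximation is close
print(np.max(np.abs(gelu_exact(x) - gelu_sigmoid(x))))  # larger: the sigmoid approximation is cruder
```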
- flint.nn.functional.leaky_relu(input: flint.tensor.Tensor, negative_slope: float = 0.01) → flint.tensor.Tensor [source]
Compute Leaky ReLU element-wise.
\[\text{LeakyReLU}(x) = \max(0, x) + \text{negative\_slope} * \min(0, x) \]
- Parameters
negative_slope (float, optional, default=1e-2) – Controls the angle of the negative slope.
- flint.nn.functional.linear(input: flint.tensor.Tensor, weight: flint.tensor.Tensor, bias: Optional[flint.tensor.Tensor] = None)[source]
Apply a linear transformation to the incoming data.
\[y = x A^T + b \]
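A quick NumPy shape check of \(y = x A^T + b\); this is illustrative only, and the weight layout (out_features, in_features) is inferred from the formula above rather than confirmed from flint's source.

```python
import numpy as np

x = np.random.randn(4, 3)   # (N, in_features)
A = np.random.randn(5, 3)   # (out_features, in_features), assumed layout
b = np.random.randn(5)      # (out_features,)
y = x @ A.T + b
print(y.shape)              # (4, 5)
```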
- flint.nn.functional.max_pool1d(input: flint.tensor.Tensor, kernel_size: Tuple[int], stride: Tuple[int] = (1,), padding: Tuple[int] = (0,), dilation: Tuple[int] = (1,), return_indices: bool = False)[source]
Apply a 1D max pooling over an input signal composed of several input planes.
input shape:
(batch_size, in_channels, L_in)
output shape:
(batch_size, out_channels, L_out)
where:
\[\text{L\_out} = \frac{\text{L\_in + 2 * padding - dilation * (kernel\_size - 1) - 1}}{\text{stride}} + 1 \]
Note
Note that PyTorch's documentation states the input will be implicitly zero-padded when padding is non-zero. In fact, PyTorch uses implicit negative-infinity padding rather than zero-padding (see this issue). Here, zero-padding is used.
- Parameters
kernel_size (Tuple[int]) – Size of the sliding window, must be > 0.
stride (Tuple[int]) – Stride of the window, must be > 0. Defaults to kernel_size.
padding (Tuple[int], optional, default=0) – Zero-padding added to both sides of the input, must be >= 0 and <= kernel_size / 2.
dilation (Tuple[int], optional, default=1) – Spacing between the elements in the window, must be > 0
return_indices (bool, optional, default=False) – If True, will return the max indices along with the outputs
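A naive NumPy reference of 1D max pooling with zero-padding, mirroring the shape formula above; illustrative only, not flint code, and scalar (non-tuple) arguments are used for brevity.

```python
import numpy as np

def max_pool1d_ref(x, kernel_size, stride=1, padding=0, dilation=1):
    """Naive 1D max pooling with zero-padding (NumPy illustration, not flint)."""
    n, c, l_in = x.shape
    x = np.pad(x, ((0, 0), (0, 0), (padding, padding)))   # zero-padding, as documented above
    l_out = (l_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1
    out = np.empty((n, c, l_out))
    for i in range(l_out):
        s = i * stride
        out[:, :, i] = x[:, :, s:s + dilation * (kernel_size - 1) + 1:dilation].max(axis=-1)
    return out

x = np.random.randn(2, 3, 10)
print(max_pool1d_ref(x, kernel_size=3, stride=2, padding=1).shape)  # (2, 3, 5)
```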
- flint.nn.functional.max_pool2d(input: flint.tensor.Tensor, kernel_size: Tuple[int], stride: Tuple[int], padding: Tuple[int] = (0, 0), dilation: Tuple[int] = (1, 1), return_indices: bool = False)[source]
Apply a 2D max pooling over an input signal composed of several input planes.
input shape:
(batch_size, in_channels, h_in, w_in)
output shape:
(batch_size, out_channels, h_out, w_out)
where:
\[\text{h\_out} = \frac{\text{h\_in + 2 * padding[0] - dilation[0] * (kernel\_size[0] - 1) - 1}}{\text{stride}[0]} + 1 \]
\[\text{w\_out} = \frac{\text{w\_in + 2 * padding[1] - dilation[1] * (kernel\_size[1] - 1) - 1}}{\text{stride}[1]} + 1 \]
Note
Uses the unfold function to perform the max pooling as a single matrix multiplication. For more details, see [1].
Note
Note that PyTorch's documentation states the input will be implicitly zero-padded when padding is non-zero. In fact, PyTorch uses implicit negative-infinity padding rather than zero-padding (see this issue). Here, zero-padding is used.
- Parameters
kernel_size (Tuple[int, int]) – Size of the sliding window, must be > 0.
stride (Tuple[int, int]) – Stride/hop of the window. Defaults to kernel_size.
padding (Tuple[int, int], optional, default=(0, 0)) – Zero-padding added to both sides of the input, must be >= 0 and <= kernel_size / 2.
dilation (Tuple[int, int], optional, default=(1, 1)) – Spacing between the elements in the window, must be > 0
return_indices (bool, optional, default=False) – If True, will return the max indices along with the outputs
References
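The NumPy snippet below illustrates the padding caveat from the note: with zero-padding (as used here), an all-negative input picks up zeros at the borders, whereas PyTorch's implicit negative-infinity padding would not. The numbers are made up for illustration.

```python
import numpy as np

x = np.full((1, 1, 4, 4), -5.0)
padded_zero = np.pad(x, ((0, 0), (0, 0), (1, 1), (1, 1)), constant_values=0.0)
padded_ninf = np.pad(x, ((0, 0), (0, 0), (1, 1), (1, 1)), constant_values=-np.inf)
# Max over the top-left 2x2 window of each padded input:
print(padded_zero[0, 0, :2, :2].max())  # 0.0  -> border maxima become 0 with zero-padding
print(padded_ninf[0, 0, :2, :2].max())  # -5.0 -> -inf padding keeps the true maximum
```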
- flint.nn.functional.mse_loss(input: flint.tensor.Tensor, target: flint.tensor.Tensor, reduction: str = 'mean') → flint.tensor.Tensor [source]
Mean Squared Error Loss \((x - y)^2\)
- flint.nn.functional.nll_loss(input: flint.tensor.Tensor, target: flint.tensor.Tensor, reduction: str = 'mean') → flint.tensor.Tensor [source]
Negative Log Likelihood Loss
Note
Here I apply log() to the prediction data, which is different from nn.functional.nll_loss() in PyTorch!
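Per the note, the log is applied to the predictions inside the loss (PyTorch's nll_loss instead expects log-probabilities). A NumPy reference sketch of that interpretation; input/target shapes and mean reduction are assumptions, not flint code.

```python
import numpy as np

def nll_loss_ref(probs: np.ndarray, target: np.ndarray) -> float:
    """probs: (N, num_classes) probabilities; target: (N,) class indices. Mean reduction assumed."""
    picked = probs[np.arange(len(target)), target]
    return float(-np.log(picked).mean())   # log applied here, unlike PyTorch's nll_loss
```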
- flint.nn.functional.pad(input: flint.tensor.Tensor, pad: Tuple[int], value: int = 0) → flint.tensor.Tensor [source]
Pad tensor.
- Parameters
input (Tensor) – N-dimensional tensor
pad (_tuple_any_t[int]) – Padding sizes, an m-element tuple, where m/2 <= number of input dimensions and m is even. The padding sizes apply from the (m/2)-th-from-last dimension through the last dimension; that is, the last m/2 dimensions of the input will be padded.
value (int, optional, default=0) – Fill value for ‘constant’ padding
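An illustration in NumPy of padding the last two dimensions of a 4-D tensor with a constant value, which is what a 4-element pad tuple such as (1, 1, 1, 1) would describe. The exact ordering of the pad tuple (pairs starting from the last dimension, PyTorch-style) is an assumption here, so check flint's implementation.

```python
import numpy as np

x = np.arange(9.0).reshape(1, 1, 3, 3)
# Pad one element of value 0 on each side of the last two dimensions only:
y = np.pad(x, ((0, 0), (0, 0), (1, 1), (1, 1)), constant_values=0.0)
print(y.shape)  # (1, 1, 5, 5)
```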
- flint.nn.functional.relu(input: flint.tensor.Tensor) → flint.tensor.Tensor [source]
Compute ReLU (Rectified Linear Unit) element-wise.
- flint.nn.functional.sigmoid(input: flint.tensor.Tensor) → flint.tensor.Tensor [source]
Compute Sigmoid element-wise.
\[\text{sigmoid}(x) = \frac{1}{1 + \exp(-x)} \]
- flint.nn.functional.tanh(input: flint.tensor.Tensor) → flint.tensor.Tensor [source]
Compute Tanh (Hyperbolic Tangent) element-wise.
\[\text{tanh}(x) = \frac{\sinh(x)}{\cosh(x)} = \frac{\exp(x) - \exp(-x)}{\exp(x) + \exp(-x)} \]
- flint.nn.functional.unfold(input: flint.tensor.Tensor, kernel_size: Union[T, Tuple[T]], stride: Union[T, Tuple[T]] = 1, padding: Union[T, Tuple[T]] = 0, dilation: Union[T, Tuple[T]] = 1)[source]
Extracts sliding local blocks from a batched input tensor.
input shape: \((N, C, H, W)\)
output shape: \((N, C \times \prod(\text{kernel\_size}), L)\)
where:
\[L = \prod_d \left( \frac{\text{spatial\_size[d] + 2 * padding[d] - dilation[d] * (kernel\_size[d] - 1) - 1}}{\text{stride}[d]} + 1 \right) \]
where \(\text{spatial\_size}\) is formed by the spatial dimensions of input (H and W above), and \(d\) ranges over all spatial dimensions.
- Parameters
input (Tensor) – Input tensor
kernel_size (int or tuple) – Size of the sliding blocks.
stride (int or tuple, optional, default=1) – Stride of the sliding blocks in the input spatial dimensions.
padding (int or tuple, optional, default=0) – Implicit zero padding to be added on both sides of input.
dilation (int or tuple, optional, default=1) – A parameter that controls the stride of elements within the neighborhood.
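A worked example of the output-shape formula above, in plain Python/NumPy rather than a flint call; the concrete sizes are chosen only for illustration.

```python
import numpy as np

# input (N, C, H, W) = (1, 3, 8, 8), kernel_size=(2, 2), stride=1, padding=0, dilation=1
N, C, H, W = 1, 3, 8, 8
kh, kw = 2, 2
per_dim_h = (H + 2 * 0 - 1 * (kh - 1) - 1) // 1 + 1   # 7
per_dim_w = (W + 2 * 0 - 1 * (kw - 1) - 1) // 1 + 1   # 7
L = per_dim_h * per_dim_w                              # 49
print((N, C * kh * kw, L))  # expected unfold output shape: (1, 12, 49)
```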