All models inherit from torch.nn.Module—either directly or through torch.nn.Sequential—and therefore function like standard PyTorch models.

FCNet

A fully connected neural network model.

class torch_tools.models._fc_net.FCNet(*args: Any, **kwargs: Any)[source]

Fully-connected neural network.

The network comprises an optional input block, which applies batch normalisation and dropout to the inputs; a series of fully-connected blocks consisting of Linear, BatchNorm1d and LeakyReLU layers; and a final Linear output layer.

Parameters:
  • in_feats (int) – Number of input features to the model.

  • out_feats (int) – Number of output features (classes).

  • hidden_sizes (Tuple[int, ...], optional) – The sizes of the hidden layers (or None).

  • input_bnorm (bool, optional) – Should we apply batch-normalisation to the input batches?

  • input_dropout (float, optional) – The dropout probability to apply to the inputs (not included if zero).

  • hidden_dropout (float, optional) – The dropout probability at each hidden layer (not included if zero).

  • hidden_bnorm (bool, optional) – Should we include batch norms in the hidden layers?

  • negative_slope (float, optional) – The negative slope argument to use in the LeakyReLU layers.

Examples

>>> from torch_tools import FCNet
>>> FCNet(in_feats=256,
          out_feats=2,
          hidden_sizes=(128, 64, 32),
          input_bnorm=True,
          input_dropout=0.1,
          hidden_dropout=0.25,
          hidden_bnorm=True,
          negative_slope=0.2)
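As with any torch.nn.Module, calling the model on a mini-batch produces the predictions. A minimal sketch of a forward pass (the shapes follow from the in_feats and out_feats arguments):

>>> from torch import rand
>>> from torch_tools import FCNet
>>> model = FCNet(in_feats=256, out_feats=2)
>>> # Mini-batch of ten items with 256 features each
>>> mini_batch = rand(10, 256)
>>> preds = model(mini_batch)  # shape (10, 2)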

ConvNet2d

2D CNN model which wraps Torchvision’s ResNet, VGG and MobileNet V3 models.

class torch_tools.models._conv_net_2d.ConvNet2d(*args: Any, **kwargs: Any)[source]

CNN model which wraps Torchvision’s ResNet, VGG and Mobilenet_v3 models.

The model contains:

— An encoder, taken from Torchvision’s ResNet/VGG/MobileNet V3 models.

— An adaptive pooling layer.

— A fully-connected classification/regression head.

Parameters:
  • out_feats (int) – The number of output features the model should produce (for example, the number of classes).

  • in_channels (int) – Number of input channels the model should take. Warning: if you don’t use three input channels, the first conv layer is overwritten, which renders freezing the encoder pointless.

  • encoder_style (str, optional) – The encoder option to use. The encoders are loaded from torchvision’s models. Options include all of torchvision’s VGG, ResNet and MobileNet V3 options (e.g. "vgg11", "vgg11_bn", "resnet18", "mobilenet_v3_small", etc.).

  • pretrained (bool, optional) – Determines whether the encoder is initialised with Torchvision’s pretrained weights. If True, the model will load Torchvision’s most up-to-date ImageNet-trained weights.

  • pool_style (str, optional) – The type of adaptive pooling layer to use. Choose from "avg", "max" or "avg-max-concat" (the latter simply concatenates the former two). See torch_tools.models._adaptive_pools_2d for more info.

  • fc_net_kwargs (Dict[str, Any], optional) – Keyword arguments for torch_tools.models._fc_net.FCNet, which serves as the classification/regression head of the model.

Examples

>>> from torch_tools import ConvNet2d
>>> model = ConvNet2d(out_feats=512,
                      in_channels=3,
                      encoder_style="vgg11_bn",
                      pretrained=True,
                      pool_style="avg-max-concat",
                      fc_net_kwargs={"hidden_sizes": (1024, 1024), "hidden_dropout": 0.25})

Another potentially useful feature is the ability to freeze the encoder and take advantage of the available pretrained weights for transfer learning.

>>> from torch import rand
>>> from torch_tools import ConvNet2d
>>> model = ConvNet2d(out_feats=10, pretrained=True)
>>> # Batch of 10 fake three-channel images of 256x256 pixels
>>> mini_batch = rand(10, 3, 256, 256)
>>> # With the encoder frozen
>>> preds = model(mini_batch, frozen_encoder=True)
>>> # Without the encoder frozen (default behaviour)
>>> preds = model(mini_batch, frozen_encoder=False)

Notes

— If you load pretrained weights but don’t freeze the encoder, you will likely still find better performance than you would by randomly initialising the model, even when it doesn’t seem like it should matter. Welcome to deep learning.

— If you change the number of input channels, don’t bother freezing the encoder: the first convolutional layer is overwritten and randomly initialised.


forward(batch: torch.Tensor, frozen_encoder: bool = False) → torch.Tensor[source]

Pass batch through the model.

Parameters:
  • batch (Tensor) – A mini-batch of inputs with shape (N, C, H, W), where N is the batch-size, C the number of channels and (H, W) the input size.

  • frozen_encoder (bool, optional) – If True, the gradients are disabled in the encoder. If False, the gradients are enabled in the encoder.

Returns:

The result of passing batch through the model.

Return type:

Tensor

get_features(batch: torch.Tensor) → torch.Tensor[source]

Return the features produced by the encoder and pool.

Parameters:

batch (Tensor) – A mini-batch of image-like inputs.

Returns:

The encoded features for the items in batch.

Return type:

Tensor
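The encoder and pool can also serve as a pure feature extractor via get_features, which is handy for transfer learning. A short sketch (the length of the feature vectors depends on the encoder style and pooling option):

>>> from torch import rand
>>> from torch_tools import ConvNet2d
>>> model = ConvNet2d(out_feats=10, pretrained=True)
>>> mini_batch = rand(10, 3, 256, 256)
>>> # Features produced by the encoder and adaptive pool,
>>> # before the fully-connected head
>>> features = model.get_features(mini_batch)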

UNet

UNet model for semantic segmentation.

class torch_tools.models._unet.UNet(*args: Any, **kwargs: Any)[source]

UNet for two-spatial-dimensional (image-like) semantic segmentation.

Parameters:
  • in_chans (int) – The number of input channels.

  • out_chans (int) – The number of output channels.

  • features_start (int, optional) – The number of features produced by the first convolutional block.

  • num_layers (int, optional) – The number of layers in the UNet.

  • pool_style (str, optional) – The pool style to use in the DownBlock blocks. Can be "max" or "avg".

  • bilinear (bool, optional) – Whether to use bilinear interpolation in the upsampling layers. If True, we use bilinear interpolation to upsample; if False, we use ConvTranspose2d.

  • lr_slope (float, optional) – The negative slope argument for LeakyReLU layers.

  • kernel_size (int, optional) – Linear size of the square convolutional kernel to use in the Conv2d layers. Should be a positive, odd, int.

  • block_style (str) – Type of convolutional blocks to use: "double_conv" or "conv_res".

  • dropout (float, optional) – The dropout probability to apply at the output of each convolutional block.

Examples

>>> from torch_tools import UNet
>>> model = UNet(
                in_chans=3,
                out_chans=16,
                features_start=64,
                num_layers=3,
                pool_style="max",
                bilinear=False,
                lr_slope=0.2,
                kernel_size=3,
                )
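
Since the model performs semantic segmentation, the output retains the input’s spatial size. A sketch of a forward pass with the model above (we assume the input height and width are divisible by 2 ** (num_layers - 1), so the down- and up-sampling paths match up):

>>> from torch import rand
>>> # Mini-batch of ten three-channel, 64x64-pixel images
>>> mini_batch = rand(10, 3, 64, 64)
>>> preds = model(mini_batch)  # shape (10, 16, 64, 64)
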
forward(batch: torch.Tensor) → torch.Tensor[source]

Pass batch through the model.

Parameters:

batch (Tensor) – A mini-batch of image-like inputs.

Returns:

The result of passing batch through the model.

Return type:

Tensor

Encoder2d

Two-dimensional convolutional encoder model.

class torch_tools.models._encoder_2d.Encoder2d(*args: Any, **kwargs: Any)[source]

Encoder model for image-like inputs.

A DoubleConvBlock which produces start_features features, followed by num_blocks - 1 DownBlock blocks. The DoubleConvBlock preserves the input’s height and width, while each DownBlock halves the spatial dimensions and doubles the number of channels.

Parameters:
  • in_chans (int) – The number of input channels the encoder should take.

  • start_features (int) – The number of features the first conv block should produce.

  • num_blocks (int) – The number of downsampling blocks in the encoder.

  • pool_style (str) – The type of pooling to use when downsampling ("avg" or "max").

  • lr_slope (float) – The negative slope argument to use in the LeakyReLU layers.

  • kernel_size (int) – Size of the square convolutional kernel to use in the Conv2d layers. Should be a positive, odd, int.

  • max_feats (int, optional) – In each of the down-sampling blocks, the number of features is doubled. Optionally supplying max_feats places a limit on this.

  • block_style (str, optional) – Style of encoding block to use: "double_conv" or "conv_res".

  • dropout (float, optional) – The dropout probability to apply at the output of each block.

Examples

>>> from torch_tools import Encoder2d
>>> model = Encoder2d(
                in_chans=3,
                start_features=64,
                num_blocks=4,
                pool_style="max",
                lr_slope=0.123,
                kernel_size=3,
                max_feats=512,
            )
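Since each DownBlock halves the spatial dimensions and doubles the number of features, the output shape follows from num_blocks. A sketch with the model above (an inference from the architecture description; the input size should be divisible by 2 ** (num_blocks - 1)):

>>> from torch import rand
>>> mini_batch = rand(10, 3, 64, 64)
>>> # Three DownBlocks: 64x64 -> 8x8 pixels, 64 -> 512 features
>>> feats = model(mini_batch)  # shape (10, 512, 8, 8)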

Decoder2d

Two-dimensional decoder model.

class torch_tools.models._decoder_2d.Decoder2d(*args: Any, **kwargs: Any)[source]

Simple decoder model for image-like inputs.

Parameters:
  • in_chans (int) – The number of input channels the model should take.

  • out_chans (int) – The number of output channels the decoder should produce.

  • num_blocks (int) – The number of blocks to include in the decoder.

  • bilinear (bool) – Whether to use bilinear interpolation (True) or ConvTranspose2d (False) to do the upsampling.

  • lr_slope (float) – The negative slope to use in the LeakyReLU layers.

  • kernel_size (int) – The size of the square convolutional kernel in the Conv2d layers. Should be an odd, positive, int.

  • min_up_feats (int, optional) – The minimum number of features the up-sampling blocks are allowed to produce.

  • block_style (str, optional) – Style of decoding block to use: "double_conv" or "conv_res".

  • dropout (float, optional) – Dropout probability to apply at the output of each block.

Examples

>>> from torch_tools import Decoder2d
>>> model = Decoder2d(
                in_chans=128,
                out_chans=3,
                num_blocks=4,
                bilinear=False,
                lr_slope=0.123,
                kernel_size=3,
            )
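Mirroring Encoder2d, we would expect each up-sampling block to double the spatial dimensions and halve the number of channels, with the output mapped to out_chans channels. A sketch with the model above (the shape arithmetic here is an assumption based on that mirroring):

>>> from torch import rand
>>> # Latent images of 8x8 pixels with 128 channels
>>> latents = rand(10, 128, 8, 8)
>>> decoded = model(latents)  # shape (10, 3, 64, 64)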

AutoEncoder2d

A simple image encoder-decoder model.

class torch_tools.models._autoencoder_2d.AutoEncoder2d(*args: Any, **kwargs: Any)[source]

A simple encoder-decoder pair for image-like inputs.

Parameters:
  • in_chans (int) – The number of input channels.

  • out_chans (int) – The number of output channels the model should produce.

  • num_layers (int, optional) – The number of layers in the encoder/decoder.

  • features_start (int, optional) – The number of features produced by the first conv block.

  • lr_slope (float, optional) – The negative slope to use in the LeakyReLU layers.

  • pool_style (str, optional) – The pool style to use in the downsampling blocks ("avg" or "max").

  • bilinear (bool, optional) – Whether to upsample with bilinear interpolation (True) or ConvTranspose2d (False).

  • kernel_size (int, optional) – Size of the square convolutional kernel to use on the Conv2d layers. Must be a positive, odd, int.

  • block_style (str, optional) – Style of convolutional blocks to use in the encoding and decoding blocks. Use either "double_conv" or "conv_res".

  • dropout (float, optional) – The dropout probability to apply at the output of the convolutional blocks.

Notes

— Depending on the application, it may be convenient to pretrain this model and then use it for transfer learning—hence the frozen_encoder and frozen_decoder arguments in the forward method. There are no pretrained weights available, however.

Examples

>>> from torch_tools import AutoEncoder2d
>>> model = AutoEncoder2d(
                in_chans=3,
                out_chans=3,
                features_start=64,
                num_layers=4,
                pool_style="max",
                lr_slope=0.123,
            )

Another potentially useful feature, if you want to do transfer learning, is the ability to freeze (i.e. fix) the parameters of either the encoder or the decoder:

>>> from torch import rand
>>> from torch_tools import AutoEncoder2d
>>> # Mini-batch of ten, three-channel images of 64 by 64 pixels
>>> mini_batch = rand(10, 3, 64, 64)
>>> model = AutoEncoder2d(in_chans=3, out_chans=3)
>>> # With nothing frozen (default behaviour)
>>> pred = model(mini_batch, frozen_encoder=False, frozen_decoder=False)
>>> # With the encoder frozen:
>>> pred = model(mini_batch, frozen_encoder=True, frozen_decoder=False)
>>> # With both the encoder and decoder frozen:
>>> pred = model(mini_batch, frozen_encoder=True, frozen_decoder=True)
forward(batch: torch.Tensor, frozen_encoder: bool = False, frozen_decoder: bool = False) → torch.Tensor[source]

Pass batch through the model.

Parameters:
  • batch (Tensor) – A mini-batch of inputs.

  • frozen_encoder (bool, optional) – Boolean switch controlling whether the encoder’s gradients are enabled or disabled (useful for transfer learning).

  • frozen_decoder (bool, optional) – Boolean switch controlling whether the decoder’s gradients are enabled or disabled (useful for transfer learning).

Returns:

The result of passing batch through the model.

Return type:

Tensor

VAE2d

2D convolutional variational autoencoder.

class torch_tools.models._variational_autoencoder_2d.VAE2d(*args: Any, **kwargs: Any)[source]

2D convolutional variational autoencoder.

Parameters:
  • in_chans (int) – The number of input channels the model should take.

  • out_chans (int) – The number of output channels the model should produce.

  • input_dims (Tuple[int, int]) – The (height, width) of the input images (only necessary if mean_var_net == "linear").

  • start_features (int, optional) – The number of features the first double conv block should produce.

  • num_layers (int, optional) – The number of layers in the U-like architecture.

  • down_pool (str, optional) – The type of pooling to use in the down-sampling layers: "avg" or "max".

  • bilinear (bool, optional) – If True, we use bilinear interpolation in the upsampling. If False, we use ConvTranspose2d.

  • lr_slope (float, optional) – Negative slope to use in the leaky relu layers.

  • kernel_size (int, optional) – Linear size of the square convolutional kernels to use.

  • max_down_feats (int, optional) – Upper limit on the number of features that can be produced by the down-sampling blocks.

  • min_up_feats (int, optional) – Minimum number of features the up-sampling blocks can produce.

  • block_style (str) – Block style to use in the down and up blocks.

  • mean_var_net (str) – The style of the networks which learn the means and variances: "linear" or "conv".

  • dropout (float, optional) – Dropout probability to apply at the output of the convolutional blocks.
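
A minimal usage sketch (we assume VAE2d is importable from the top-level package like the other models, and that, per the forward signature below, the model returns both the decoded batch and the KL divergence during training):

>>> from torch import rand
>>> from torch_tools import VAE2d
>>> model = VAE2d(in_chans=3, out_chans=3, input_dims=(64, 64))
>>> # Mini-batch of ten three-channel, 64x64-pixel images
>>> mini_batch = rand(10, 3, 64, 64)
>>> decoded, kl_div = model(mini_batch)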

decode(features: torch.Tensor, frozen_decoder: bool) → torch.Tensor[source]

Decode the latent features.

Parameters:
  • features (Tensor) – VA-encoded features.

  • frozen_decoder (bool) – Should the decoder’s weights be frozen, or not?

Returns:

The decoded features.

Return type:

Tensor

encode(batch: torch.Tensor, frozen_encoder: bool) → Tuple[torch.Tensor, torch.Tensor][source]

Encode the inputs in batch.

Parameters:
  • batch (Tensor) – Mini-batch of inputs.

  • frozen_encoder (bool) – Should the encoder’s weights be frozen, or not?

Returns:

  • feats (Tensor) – The encoded features.

  • kl_div (Tensor) – The KL divergence between the features and N(0, 1).

forward(batch: torch.Tensor, frozen_encoder: bool = False, frozen_decoder: bool = False) → torch.Tensor | Tuple[torch.Tensor, torch.Tensor][source]

Pass batch through the model.

Parameters:
  • batch (Tensor) – A mini-batch of image-like inputs.

  • frozen_encoder (bool, optional) – Should the encoder’s parameters be fixed?

  • frozen_decoder (bool, optional) – Should the decoder’s weights be fixed?

Returns:

  • decoded (Tensor) – The predicted version of batch.

  • kl_div (Tensor) – The KL divergence between features and N(0, 1).

get_features(means: torch.Tensor, logvar: torch.Tensor) → torch.Tensor[source]

Get the features using the reparameterisation trick.

Parameters:
  • means (Tensor) – The feature means.

  • logvar (Tensor) – The log variance.

Returns:

Features sampled from the latent distributions.

Return type:

Tensor
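For reference, the standard reparameterisation trick draws a sample as means + exp(0.5 * logvar) * eps, with eps ~ N(0, 1), which keeps the sampling step differentiable. A sketch of the idea (not necessarily this method’s exact implementation):

>>> from torch import randn_like
>>> def reparameterise(means, logvar):
        # Shift and scale unit-normal noise so the result has the
        # requested means and variances
        eps = randn_like(means)
        return means + (0.5 * logvar).exp() * eps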

static kl_divergence(means: torch.Tensor, log_var: torch.Tensor) → torch.Tensor[source]

Compute the KL divergence between the dists and a unit normal.

Parameters:
  • means (Tensor) – The means of the feature distributions.

  • log_var (Tensor) – The logarithm of the variances.

Returns:

Kullback-Leibler divergence between the feature dists and unit normals.

Return type:

Tensor
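For a diagonal Gaussian with means mu and log-variances log(sigma^2), the divergence from a unit normal has the closed form 0.5 * sum(mu^2 + sigma^2 - log(sigma^2) - 1). A sketch of that computation (not necessarily this method’s exact reduction):

>>> def kl_divergence(means, log_var):
        # 0.5 * sum(mu^2 + sigma^2 - log(sigma^2) - 1), summed
        # over the feature dimensions
        return 0.5 * (means**2 + log_var.exp() - log_var - 1.0).sum()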

SimpleConvNet2d

A simple two-dimensional convolutional neural network.

class torch_tools.models._simple_conv_2d.SimpleConvNet2d(*args: Any, **kwargs: Any)[source]

A very simple 2D CNN with an encoder, pool, and fully-connected layer.

Parameters:
  • in_chans (int) – The number of input channels.

  • out_feats (int) – The number of output features the fully connected layer should produce.

  • features_start (int) – The number of features the input convolutional block should produce.

  • num_blocks (int) – The number of encoding blocks to use.

  • downsample_pool (str) – The style of downsampling pool to use in the encoder ("avg" or "max").

  • adaptive_pool (str) – The style of adaptive pool to use on the encoder’s output ("avg", "max" or "avg-max-concat").

  • lr_slope (float) – The negative slope to use in the LeakyReLU layers.

  • kernel_size (int) – The size of the square convolutional kernel to use in the Conv2d layers. Must be an odd, positive, int.

  • fc_net_kwargs (Dict[str, Any], optional) – Keyword arguments for torch_tools.models.fc_net.FCNet which serves as the classification/regression part of the model.

  • block_style (str, optional) – Style of encoding blocks to use: choose from "double_conv" or "conv_res".

Examples

>>> from torch_tools import SimpleConvNet2d
>>> SimpleConvNet2d(
        in_chans=3,
        out_feats=128,
        features_start=64,
        num_blocks=4,
        downsample_pool="max",
        adaptive_pool="avg-max-concat",
        lr_slope=0.123,
        fc_net_kwargs={"hidden_sizes": (256, 256,)},
    )
forward(batch: torch.Tensor, frozen_encoder: bool = False) → torch.Tensor[source]

Pass batch through the model.

Parameters:
  • batch (Tensor) – A mini-batch of inputs.

  • frozen_encoder (bool, optional) – Should the encoder’s weights be frozen (i.e. have no grad) during the forward pass?

Returns:

The result of passing batch through the model.

Return type:

Tensor

get_features(batch: torch.Tensor) → torch.Tensor[source]

Get the features produced by the encoder and adaptive pool.

Parameters:

batch (Tensor) – A mini-batch of image-like inputs.

Returns:

The features for each item in batch.

Return type:

Tensor
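
As with ConvNet2d, the encoder and adaptive pool can double as a feature extractor. A short sketch (the feature length depends on features_start, num_blocks and the adaptive pool choice; "avg-max-concat" doubles it):

>>> from torch import rand
>>> from torch_tools import SimpleConvNet2d
>>> model = SimpleConvNet2d(in_chans=3, out_feats=128)
>>> mini_batch = rand(10, 3, 64, 64)
>>> features = model.get_features(mini_batch)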