deepml package

Subpackages

Submodules

deepml.accelerator_trainer module

class deepml.accelerator_trainer.AcceleratorTrainer(task, optimizer, criterion, lr_scheduler=None, lr_scheduler_step_policy='epoch', accelerator_config=None)[source]

Bases: BaseLearner

Training class using HuggingFace Accelerate for distributed training.

This trainer leverages the Accelerate library for seamless distributed training, mixed precision, and device management across CPUs, GPUs, and TPUs. It supports gradient accumulation, gradient clipping, and automatic model/optimizer preparation.

__init__(task, optimizer, criterion, lr_scheduler=None, lr_scheduler_step_policy='epoch', accelerator_config=None)[source]

Initializes the AcceleratorTrainer.

Parameters:
  • task (Task) – Task object defining the learning task (e.g., classification, segmentation).

  • optimizer (Optimizer) – PyTorch optimizer instance for parameter updates.

  • criterion (Module) – Loss function module.

  • lr_scheduler (Optional[_LRScheduler]) – Learning rate scheduler instance. Defaults to None.

  • lr_scheduler_step_policy (str) – When to call scheduler.step(). Valid options are "epoch" (step after each epoch) or "step" (step after each optimizer update). Defaults to "epoch".

  • accelerator_config (Optional[dict]) – Optional dictionary of keyword arguments passed to accelerate.Accelerator() for configuration. Defaults to None (uses Accelerate defaults). Common options include:
    • gradient_accumulation_steps: Number of steps to accumulate gradients

    • mixed_precision: Mixed precision mode (“no”, “fp16”, “bf16”)

    • device_placement: Whether to automatically place tensors on device

    • split_batches: Whether to split batches across devices

Note

Unlike FabricTrainer, this class accepts an lr_scheduler instance directly rather than a factory function (lr_scheduler_fn).
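As a minimal sketch, the options listed for accelerator_config can be collected in a plain dictionary. The values are illustrative assumptions, and the commented-out constructor call assumes that task, optimizer, and criterion objects already exist.

```python
# Illustrative values only; each key is one of the options named above and is
# forwarded as a keyword argument to accelerate.Accelerator().
accelerator_config = {
    "gradient_accumulation_steps": 4,  # accumulate gradients over 4 batches
    "mixed_precision": "bf16",         # one of "no", "fp16", "bf16"
}

# Hypothetical usage, assuming task, optimizer, and criterion are defined:
# trainer = AcceleratorTrainer(
#     task, optimizer, criterion, accelerator_config=accelerator_config
# )
```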

fit(train_loader, val_loader=None, epochs=10, save_model_after_every_epoch=5, metrics=None, gradient_clip_value=None, gradient_clip_max_norm=None, resume_from_checkpoint=None, load_optimizer_state=False, load_scheduler_state=False, logger=None, non_blocking=True, image_inverse_transform=None, logger_img_size=None)[source]

Trains the model for the specified number of epochs using Accelerate.

Handles the complete training workflow including model preparation, distributed training coordination, checkpointing, validation, and metric logging.

Parameters:
  • train_loader (DataLoader) – DataLoader for training data.

  • val_loader (DataLoader) – DataLoader for validation data. Defaults to None.

  • epochs (int) – Total number of epochs to train. Defaults to 10.

  • save_model_after_every_epoch (int) – Frequency (in epochs) to save model checkpoints. Defaults to 5.

  • metrics (Dict[str, Module]) – Dictionary mapping metric names to metric instances. Each metric must be a torch.nn.Module with a forward() method. Defaults to None.

  • gradient_clip_value (Optional[float]) – Maximum absolute value for gradient clipping. Gradients will be clipped to [-gradient_clip_value, gradient_clip_value]. Mutually exclusive with gradient_clip_max_norm. Defaults to None.

  • gradient_clip_max_norm (Optional[float]) – Maximum L2 norm for gradient clipping. Mutually exclusive with gradient_clip_value. Defaults to None.

  • resume_from_checkpoint (str) – Path to checkpoint file to resume training from. Defaults to None.

  • load_optimizer_state (bool) – Whether to load optimizer state from checkpoint. Defaults to False.

  • load_scheduler_state (bool) – Whether to load learning rate scheduler state from checkpoint. Defaults to False.

  • logger (MLExperimentLogger) – Experiment logger for tracking metrics and artifacts. If None, uses TensorboardLogger. Defaults to None.

  • non_blocking (bool) – Whether to use asynchronous CUDA tensor transfers. Defaults to True.

  • image_inverse_transform (Callable) – Transformation to reverse image normalization for visualization in TensorBoard. Defaults to None.

  • logger_img_size (Union[int, Tuple[int, int]]) – Image size (int or tuple) for TensorBoard logging. Defaults to None.

Returns:

Dictionary containing training history with metric names as keys and lists of values as entries.

Raises:
  • ValueError – If both gradient_clip_value and gradient_clip_max_norm are provided.

  • TypeError – If any metric is not a torch.nn.Module with a forward() method.

Note

  • The model, optimizer, scheduler, and dataloaders are all prepared by Accelerate

  • Only the main process saves checkpoints and manages logging

  • All processes synchronize at the end of each epoch using wait_for_everyone()

  • The model is automatically unwrapped when saving best validation checkpoint
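The mutual exclusivity of the two clipping arguments, documented under Raises, can be sketched as follows. The helper name check_clip_args and its return values are hypothetical; only the ValueError on conflicting arguments comes from the documentation above.

```python
from typing import Optional


def check_clip_args(
    gradient_clip_value: Optional[float] = None,
    gradient_clip_max_norm: Optional[float] = None,
) -> Optional[str]:
    """Hypothetical helper: report which clipping mode is active, if any."""
    if gradient_clip_value is not None and gradient_clip_max_norm is not None:
        # fit() is documented to raise ValueError in this situation.
        raise ValueError(
            "Pass either gradient_clip_value or gradient_clip_max_norm, not both."
        )
    if gradient_clip_value is not None:
        return "value"  # clip each gradient element to [-v, v]
    if gradient_clip_max_norm is not None:
        return "norm"   # rescale gradients so their L2 norm stays within the limit
    return None         # no clipping requested
```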

fit_temp(train_loader, val_loader, epochs=10, metrics={})[source]

Temporary/experimental training method with simplified Accelerate workflow.

Warning: This method appears to be legacy/debug code and should not be used in production. Use the fit() method instead.

Parameters:
  • train_loader – DataLoader for training data.

  • val_loader – DataLoader for validation data.

  • epochs – Number of epochs to train. Defaults to 10.

  • metrics (dict) – Dictionary mapping metric names to metric functions. Defaults to {}.

Note

  • This method has several issues compared to the main fit() method:
    • References self.model instead of self._model

    • Hardcoded checkpoint paths

    • Missing checkpoint management features

    • Uses gather_for_metrics() instead of gather()

  • This should likely be removed or refactored to align with fit()

Deprecated:

Use fit() method instead for production training.

deepml.base module

class deepml.base.BaseLearner(task, optimizer, criterion, lr_scheduler=None, lr_scheduler_fn=None, lr_scheduler_step_policy='epoch')[source]

Bases: ABC

__init__(task, optimizer, criterion, lr_scheduler=None, lr_scheduler_fn=None, lr_scheduler_step_policy='epoch')[source]
set_optimizer(optimizer)[source]
set_criterion(criterion)[source]
set_lr_scheduler_policy(lr_scheduler_step_policy='epoch')[source]
static load_optimizer_state(optimizer, state_dict)[source]
static load_lr_schedular_state(lr_scheduler, state_dict)[source]
create_state_dict(model, optimizer, criterion, lr_scheduler=None, epoch=-1, train_loss=inf, val_loss=inf)[source]
save(tag, model, optimizer, criterion, lr_scheduler=None, epoch=-1, train_loss=inf, val_loss=inf, **kwargs)[source]
static init_metrics(metrics)[source]
Return type:

OrderedDict[str, float]

static update_metrics(outputs, targets, metrics_instance_dict, target_metrics_dict)[source]
static update_metrics_with_simple_moving_average(source_metrics_dict, target_metrics_dict, step)[source]
static write_metrics_to_logger(metrics_dict, tag, global_step, logger, history)[source]
static write_lr(optimizer, global_step, logger, history)[source]
log_metrics(val_loader, train_metrics, val_metrics, metrics_history, epochs_completed, logger_img_size, image_inverse_transform)[source]
fit(*args, **kwargs)[source]
predict(*args, **kwargs)[source]

deepml.constants module

deepml.datasets module

class deepml.datasets.ImageRowDataFrameDataset(dataframe, target_column=None, image_size=(28, 28), transform=None)[source]

Bases: Dataset

Dataset for reading images stored as flattened arrays in DataFrame rows.

This dataset treats each row of a DataFrame as a flattened image array, which is then reshaped to the specified image dimensions.

dataframe

DataFrame containing flattened image data (without target column).

target_column

Series containing target labels, if provided.

samples

Number of samples in the dataset.

image_size

Tuple specifying the output image dimensions (height, width).

transform

Optional transformation callable to apply to images.

__init__(dataframe, target_column=None, image_size=(28, 28), transform=None)[source]

Initializes the ImageRowDataFrameDataset.

Parameters:
  • dataframe (DataFrame) – DataFrame where each row contains a flattened image array.

  • target_column (str) – Name of the column containing target labels. If provided, this column is extracted and removed from the DataFrame. Defaults to None.

  • image_size (Tuple[int, int]) – Dimensions to reshape each image to as (height, width). Defaults to (28, 28).

  • transform (Callable) – Optional callable to transform images (e.g., torchvision transforms). Defaults to None.

Note

The DataFrame is reset with a fresh index, and the target column (if specified) is removed from the image data.
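The reshape step can be illustrated with a pure-Python stand-in. This is a sketch only: the function name reshape_row is hypothetical, row-major order is an assumption, and the class itself presumably relies on an array library rather than nested lists.

```python
from typing import List, Sequence, Tuple


def reshape_row(flat: Sequence[float], image_size: Tuple[int, int]) -> List[List[float]]:
    """Reshape one flattened row into image_size = (height, width) rows."""
    height, width = image_size
    if len(flat) != height * width:
        raise ValueError("Row length does not match image_size.")
    # Row-major layout (assumed): the first `width` values form the first row.
    return [list(flat[r * width:(r + 1) * width]) for r in range(height)]


image = reshape_row([0, 1, 2, 3, 4, 5], image_size=(2, 3))
```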

__getitem__(index)[source]

Retrieves an image and its label at the specified index.

Parameters:

index (int) – Index of the sample to retrieve.

Returns:

Tuple of (image, label) where:

  • image: Transformed PIL Image or tensor of shape specified by image_size

  • label: Target label if target_column was provided, otherwise 0

__len__()[source]

Returns the total number of samples in the dataset.

Returns:

Number of samples.

class deepml.datasets.ImageDataFrameDataset(dataframe, image_file_name_column='image', target_columns=None, image_dir=None, transforms=None, target_transform=None, open_file_func=None)[source]

Bases: Dataset

Dataset for reading images from file paths specified in a DataFrame.

This dataset loads images from disk based on file paths listed in a DataFrame, making it suitable for image classification and regression tasks.

dataframe

DataFrame containing image file paths and optional target columns.

image_file_name_column

Name of the column containing image filenames.

target_columns

Column name(s) containing target values.

image_dir

Base directory containing images.

transforms

Transformation callable to apply to images.

samples

Number of samples in the dataset.

target_transform

Transformation callable to apply to targets.

open_file_func

Custom function for opening image files.

__init__(dataframe, image_file_name_column='image', target_columns=None, image_dir=None, transforms=None, target_transform=None, open_file_func=None)[source]

Initializes the ImageDataFrameDataset.

Parameters:
  • dataframe (DataFrame) – DataFrame containing image file paths and optional targets.

  • image_file_name_column (str) – Name of the column containing image filenames. Defaults to “image”.

  • target_columns (Union[int, List[str]]) – Column name(s) containing target values. Can be a single column name (str) or list of column names for multi-target tasks. If None, no targets are loaded. Defaults to None.

  • image_dir (str) – Base directory containing images. If provided, filenames from the DataFrame are joined with this directory. Defaults to None.

  • transforms (Callable) – Optional callable to transform images (e.g., torchvision transforms). Defaults to None.

  • target_transform (Callable) – Optional callable to transform target values. Defaults to None.

  • open_file_func (Callable) – Custom callable to open image files. Should accept a file path and return an image object. If None, uses PIL.Image.open. Defaults to None.

Note

The DataFrame is reset with a fresh index to ensure consistent indexing.

__len__()[source]

Returns the total number of samples in the dataset.

Return type:

int

Returns:

Number of samples.

__getitem__(index)[source]

Retrieves an image and its target at the specified index.

Parameters:

index (int) – Index of the sample to retrieve.

Returns:

Tuple of (image, target) where:

  • image: Transformed image as PIL Image or tensor

  • target: Target value(s) as tensor if target_columns was provided, otherwise 0

Note

If image_dir is provided, the image path is constructed by joining image_dir with the filename from the DataFrame.
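The path construction described in this note amounts to a conditional join. A minimal sketch, with a hypothetical helper name, assuming the filename is used unchanged when image_dir is None:

```python
import os
from typing import Optional


def resolve_image_path(file_name: str, image_dir: Optional[str] = None) -> str:
    """Join image_dir and file_name when a base directory is given."""
    if image_dir is None:
        # Assumption: the DataFrame value is already a usable path.
        return file_name
    return os.path.join(image_dir, file_name)
```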

class deepml.datasets.ImageListDataset(image_dir, transforms=None, open_file_func=None)[source]

Bases: Dataset

Dataset for loading all images from a directory.

This dataset reads all files from a specified directory and treats them as images. It returns both the image and its filename, making it useful for inference or unlabeled image processing tasks.

image_dir

Directory path containing image files.

images

List of image filenames in the directory.

transforms

Optional transformation callable to apply to images.

open_file_func

Custom function for opening image files.

__init__(image_dir, transforms=None, open_file_func=None)[source]

Initializes the ImageListDataset.

Parameters:
  • image_dir (str) – Directory path containing image files.

  • transforms (Callable) – Optional callable to transform images (e.g., torchvision transforms). Defaults to None.

  • open_file_func (Callable) – Custom callable to open image files. Should accept a file path and return an image object. If None, uses PIL.Image.open. Defaults to None.

Note

All files in the directory are assumed to be images. No filtering is applied.
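The consequence of this note is that non-image files in the directory also become dataset entries. The following sketch shows the effect with a temporary directory; the use of os.listdir and the sorting are assumptions about how the file list is built.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as image_dir:
    # Create two image-like files and one unrelated file.
    for name in ("cat.png", "dog.jpg", "notes.txt"):
        open(os.path.join(image_dir, name), "w").close()
    # No extension filtering: every directory entry is listed.
    images = sorted(os.listdir(image_dir))
```

Keeping image_dir free of non-image files avoids errors when such an entry is later opened as an image.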

__len__()[source]

Returns the total number of images in the directory.

Returns:

Number of images.

__getitem__(index)[source]

Retrieves an image and its filename at the specified index.

Parameters:

index (int) – Index of the image to retrieve.

Returns:

Tuple of (image, filename) where:

  • image: Transformed image as PIL Image or tensor

  • filename: String filename of the image

class deepml.datasets.SegmentationDataFrameDataset(dataframe, image_dir, mask_dir=None, image_col='image', mask_col=None, albu_torch_transforms=None, target_transform=None, train=True, open_file_func=None)[source]

Bases: Dataset

Dataset for semantic segmentation with images and corresponding masks.

This dataset loads images and their corresponding segmentation masks from directories specified in a DataFrame. It supports both training mode (with masks) and inference mode (without masks).

dataframe

DataFrame containing image and mask file information.

image_dir

Directory containing input images.

mask_dir

Directory containing segmentation masks (required for training).

image_col

Column name for image filenames.

mask_col

Column name for mask filenames.

albu_torch_transforms

Albumentations transforms for augmentation.

target_transform

Additional transforms for masks only.

samples

Number of samples in the dataset.

train

Whether the dataset is in training mode.

open_file_func

Custom function for opening image files.

Note

Image and mask files should have the same name unless mask_col specifies a different column. The open_file_func should accept an image_file_path and return a numpy array or PIL Image.

__init__(dataframe, image_dir, mask_dir=None, image_col='image', mask_col=None, albu_torch_transforms=None, target_transform=None, train=True, open_file_func=None)[source]

Initializes the SegmentationDataFrameDataset.

Parameters:
  • dataframe (DataFrame) – DataFrame containing image and mask file information.

  • image_dir (str) – Directory path containing input images.

  • mask_dir (str) – Directory path containing segmentation masks. Required when train=True. Defaults to None.

  • image_col (str) – Name of the DataFrame column containing image filenames. Defaults to “image”.

  • mask_col (str) – Name of the DataFrame column containing mask filenames. If None, uses the same filenames as image_col. Defaults to None.

  • albu_torch_transforms (Callable) – Albumentations transforms to apply to both image and mask. Should return a dictionary with “image” and “mask” keys. Defaults to None.

  • target_transform (Callable) – Additional transform to apply only to the mask after albumentations transforms. Defaults to None.

  • train (bool) – Whether the dataset is in training mode. If True, loads and returns masks. If False, returns filenames instead of masks. Defaults to True.

  • open_file_func (Callable) – Custom callable to open image/mask files. Should accept a file path and return a numpy array. If None, uses PIL.Image.open with conversion to numpy array. Defaults to None.

Raises:

AssertionError – If train=True and mask_dir is None.

Note

  • The DataFrame is reset with a fresh index for consistent indexing

  • In training mode, returns (image, mask) tuples

  • In inference mode, returns (image, filename) tuples
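How the two modes and the mask_col fallback interact can be sketched on a single row. The function resolve_paths is hypothetical and works on a plain dictionary instead of a DataFrame row; it only resolves paths and does not load or transform anything.

```python
import os
from typing import Dict, Optional, Tuple


def resolve_paths(
    row: Dict[str, str],
    image_dir: str,
    mask_dir: Optional[str] = None,
    image_col: str = "image",
    mask_col: Optional[str] = None,
    train: bool = True,
) -> Tuple[str, str]:
    """Return (image_path, mask_path) when training, else (image_path, filename)."""
    image_path = os.path.join(image_dir, row[image_col])
    if not train:
        return image_path, row[image_col]
    assert mask_dir is not None, "mask_dir is required when train=True"
    # Without mask_col, the mask is expected to share the image filename.
    mask_name = row[mask_col] if mask_col is not None else row[image_col]
    return image_path, os.path.join(mask_dir, mask_name)
```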

__len__()[source]

Returns the total number of samples in the dataset.

Return type:

int

Returns:

Number of samples.

__getitem__(index)[source]

Retrieves an image and its mask (or filename) at the specified index.

Parameters:

index (int) – Index of the sample to retrieve.

Returns:

Tuple of (image, target) where:

  • image: Transformed image tensor from albumentations

  • target: If train=True, transformed mask tensor. If train=False, string filename of the image.

Note

  • In training mode, applies albumentations transforms to both image and mask

  • In inference mode, applies albumentations transforms only to image

  • Additional target_transform is applied to mask if provided (training only)

deepml.fabric_trainer module

class deepml.fabric_trainer.FabricTrainer(task, optimizer, criterion, lr_scheduler_fn=None, lr_scheduler_step_policy='epoch', accelerator='auto', strategy='auto', devices='auto', precision='32-true', num_nodes=1, fabric_plugins=None)[source]

Bases: BaseLearner

Training class for learning model weights using Lightning Fabric.

This trainer leverages Lightning Fabric for distributed training, mixed precision, and hardware acceleration while maintaining a simple PyTorch-like interface.

It supports features like gradient accumulation, gradient clipping, learning rate scheduling, checkpointing, and logging with experiment tracking integration. The trainer is designed to be flexible and extensible for various types of learning tasks defined by the Task abstraction.

__init__(task, optimizer, criterion, lr_scheduler_fn=None, lr_scheduler_step_policy='epoch', accelerator='auto', strategy='auto', devices='auto', precision='32-true', num_nodes=1, fabric_plugins=None)[source]

Initializes the FabricTrainer.

Parameters:
  • task (Task) – Task object defining the learning task (e.g., classification, segmentation).

  • optimizer (Optimizer) – PyTorch optimizer instance for parameter updates.

  • criterion (Module) – Loss function module.

  • lr_scheduler_fn (Optional[Callable[[Optimizer], _LRScheduler]]) – Factory function that creates a learning rate scheduler. Should accept an optimizer and return a scheduler instance. Example: lambda optimizer: StepLR(optimizer, step_size=5, gamma=0.5). Defaults to None.

  • lr_scheduler_step_policy (str) – When to call scheduler.step(). Valid options are "epoch" (step after each epoch) or "step" (step after each gradient update). Defaults to "epoch".

  • accelerator (Union[str, int]) – Hardware accelerator to use. Options: "cpu", "cuda", "mps", "gpu", "tpu", or "auto". Defaults to "auto".

  • strategy (Union[str, int]) – Distributed training strategy. Options: "dp", "ddp", "fsdp", "deepspeed", "ddp_spawn", or "auto". Defaults to "auto".

  • devices (Union[str, int]) – Number or list of devices to use. Can be int, str, or "auto". Defaults to "auto".

  • precision (str) – Training precision. Options: "16-mixed", "32-true", "64-true", "bf16-mixed", "bf16-true", or "auto". Defaults to "32-true".

  • num_nodes (int) – Number of nodes for multi-node distributed training. Defaults to 1.

  • fabric_plugins (Optional) – Optional Fabric plugins for custom behaviors (e.g., DeepSpeedPlugin, BitsandbytesPrecision). Defaults to None.

Example

>>> from lightning_fabric.plugins import BitsandbytesPrecision
>>> plugin = BitsandbytesPrecision(mode="int8")
>>> trainer = FabricTrainer(
...     task=task,
...     optimizer=optimizer,
...     criterion=criterion,
...     fabric_plugins=plugin
... )

fit(train_loader, val_loader=None, epochs=10, save_model_after_every_epoch=5, metrics=None, gradient_accumulation_steps=1, gradient_clip_value=None, gradient_clip_max_norm=None, resume_from_checkpoint=None, load_optimizer_state=False, load_scheduler_state=False, logger=None, non_blocking=True, image_inverse_transform=None, logger_img_size=None)[source]

Trains the model for the specified number of epochs.

This method launches distributed training using Lightning Fabric and handles checkpointing, logging, and training history management.

Parameters:
  • train_loader (DataLoader) – DataLoader for training data.

  • val_loader (DataLoader) – DataLoader for validation data. Defaults to None.

  • epochs (int) – Total number of epochs to train. Defaults to 10.

  • save_model_after_every_epoch (int) – Frequency (in epochs) to save model checkpoints. Defaults to 5.

  • metrics (Dict[str, Module]) – Dictionary mapping metric names to metric instances. Each metric must be a torch.nn.Module with a forward() method. Defaults to None.

  • gradient_accumulation_steps (int) – Number of steps to accumulate gradients before performing an optimizer step. Simulates larger batch sizes. Defaults to 1.

  • gradient_clip_value (Optional[float]) – Maximum absolute value for gradient clipping. Gradients will be clipped to [-gradient_clip_value, gradient_clip_value]. Defaults to None (no clipping).

  • gradient_clip_max_norm (Optional[float]) – Maximum L2 norm for gradient clipping. Defaults to None (no clipping).

  • resume_from_checkpoint (str) – Path to checkpoint file to resume training from. Defaults to None.

  • load_optimizer_state (bool) – Whether to load optimizer state from checkpoint. Defaults to False.

  • load_scheduler_state (bool) – Whether to load learning rate scheduler state from checkpoint. Defaults to False.

  • logger (MLExperimentLogger) – Experiment logger for tracking metrics and artifacts. If None, uses TensorboardLogger. Defaults to None.

  • non_blocking (bool) – Whether to use asynchronous CUDA tensor transfers. Defaults to True.

  • image_inverse_transform (Callable) – Transformation to reverse image normalization for visualization in TensorBoard. Defaults to None.

  • logger_img_size (Union[int, Tuple[int, int]]) – Image size (int or tuple) for TensorBoard logging. Defaults to None.

Note

After training completes, the latest model checkpoint is automatically loaded into the trainer’s model and optimizer.
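The effect of gradient_accumulation_steps can be sketched by listing the batches after which an optimizer step would run. The function name is hypothetical, and how the trainer handles a trailing partial group of batches is not specified here.

```python
from typing import List


def optimizer_step_batches(num_batches: int, gradient_accumulation_steps: int = 1) -> List[int]:
    """Return the 1-based batch indices after which an optimizer step runs."""
    steps = []
    for batch_idx in range(1, num_batches + 1):
        # Gradients from every batch are accumulated; the optimizer steps
        # only once per group of gradient_accumulation_steps batches.
        if batch_idx % gradient_accumulation_steps == 0:
            steps.append(batch_idx)
    return steps
```

With 8 batches and gradient_accumulation_steps=4, the optimizer steps twice per epoch, which behaves like a batch four times as large.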

predict(loader)[source]

Generates predictions for the given data loader.

Parameters:

loader – DataLoader containing data for prediction.

Returns:

Tuple of (predictions, targets) where predictions are the model outputs and targets are the ground truth labels.

predict_class(loader)[source]

Generates class predictions with probabilities for the given data loader.

Parameters:

loader – DataLoader containing data for prediction.

Returns:

Tuple of (predicted_class, probability, targets) where:

  • predicted_class: Predicted class labels

  • probability: Class probabilities or confidence scores

  • targets: Ground truth labels

show_predictions(loader, image_inverse_transform=None, samples=9, cols=3, figsize=(10, 10), target_known=True)[source]

Visualizes model predictions on sample images.

Parameters:
  • loader – DataLoader containing data for visualization.

  • image_inverse_transform – Transformation to reverse image normalization for display. Defaults to None.

  • samples – Number of samples to display. Defaults to 9.

  • cols – Number of columns in the visualization grid. Defaults to 3.

  • figsize – Figure size as (width, height) tuple. Defaults to (10, 10).

  • target_known – Whether ground truth targets are available for comparison. Defaults to True.

deepml.losses module

class deepml.losses.JaccardLoss(*args: Any, **kwargs: Any)[source]

Bases: Module

Jaccard Loss (Intersection over Union) for segmentation tasks.

Computes 1 - IoU as a differentiable loss function for both binary and multiclass segmentation.

activation

Activation function applied to output logits. Softmax2d for multiclass, Sigmoid for binary.

__init__(is_multiclass)[source]

Initializes JaccardLoss with the appropriate activation.

Parameters:

is_multiclass – If True, uses Softmax2d activation for multiclass segmentation. Otherwise, uses Sigmoid for binary segmentation.

forward(output, target)[source]

Computes the Jaccard loss between predictions and targets.

Parameters:
  • output – Raw model output logits of shape (N, C, H, W).

  • target – Ground truth tensor of the same shape as output.

Returns:

Scalar tensor representing 1 - mean(IoU).
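A soft (differentiable) IoU on a single flattened mask can be sketched in plain Python. This assumes the activation has already turned logits into probabilities, and the smoothing constant eps is an assumption; how the class averages over classes or batch items is not shown.

```python
from typing import Sequence


def soft_jaccard_loss(probs: Sequence[float], target: Sequence[float], eps: float = 1e-7) -> float:
    """Return 1 - IoU for one flattened mask of probabilities."""
    intersection = sum(p * t for p, t in zip(probs, target))
    union = sum(probs) + sum(target) - intersection
    # eps (assumed) keeps the division defined when both masks are empty.
    return 1.0 - intersection / (union + eps)
```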

class deepml.losses.RMSELoss(*args: Any, **kwargs: Any)[source]

Bases: Module

Root Mean Squared Error loss.

Computes sqrt(MSE + eps) to provide a differentiable RMSE loss that avoids numerical instability near zero.

mse

Underlying MSELoss module.

eps

Small epsilon value added before the square root for numerical stability.

__init__(eps=1e-06)[source]

Initializes RMSELoss.

Parameters:

eps – Small constant for numerical stability. Defaults to 1e-6.

forward(output, target)[source]

Computes the RMSE loss.

Parameters:
  • output – Predicted tensor of arbitrary shape.

  • target – Ground truth tensor of the same shape as output.

Returns:

Scalar tensor representing sqrt(MSE(output, target) + eps).
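The formula sqrt(MSE + eps) can be checked with a small pure-Python sketch. The function name is hypothetical; the class itself operates on tensors.

```python
import math
from typing import Sequence


def rmse_loss(output: Sequence[float], target: Sequence[float], eps: float = 1e-6) -> float:
    """Compute sqrt(MSE(output, target) + eps) on plain Python sequences."""
    mse = sum((o - t) ** 2 for o, t in zip(output, target)) / len(output)
    # eps keeps the argument of sqrt strictly positive, so the derivative of
    # the square root stays finite even for a perfect prediction.
    return math.sqrt(mse + eps)
```

For a perfect prediction the loss is sqrt(eps) rather than exactly zero, which is 0.001 with the default eps.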

class deepml.losses.WeightedBCEWithLogitsLoss(*args: Any, **kwargs: Any)[source]

Bases: Module

Weighted Binary Cross-Entropy loss with logits.

Applies separate weights to positive and negative samples in the binary cross-entropy computation.

w_p

Weight for positive samples.

w_n

Weight for negative samples.

__init__(w_p=None, w_n=None)[source]

Initializes WeightedBCEWithLogitsLoss.

Parameters:
  • w_p – Weight applied to the positive class loss term. Defaults to None.

  • w_n – Weight applied to the negative class loss term. Defaults to None.

forward(logits, labels, epsilon=1e-07)[source]

Computes the weighted binary cross-entropy loss.

Parameters:
  • logits – Raw model output logits of shape (N,) or (N, 1).

  • labels – Binary ground truth labels of shape (N,).

  • epsilon – Small constant to avoid log(0). Defaults to 1e-7.

Returns:

Scalar tensor representing the weighted BCE loss.
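One plausible form of the weighted loss is sketched below. The exact formula is an assumption: it applies a sigmoid to each logit, adds epsilon inside both logarithms, treats a missing weight as 1.0, and averages over the batch.

```python
import math
from typing import Optional, Sequence


def weighted_bce_with_logits(
    logits: Sequence[float],
    labels: Sequence[float],
    w_p: Optional[float] = None,
    w_n: Optional[float] = None,
    epsilon: float = 1e-7,
) -> float:
    """Mean of -(w_p * y * log(p) + w_n * (1 - y) * log(1 - p)), assumed form."""
    w_p = 1.0 if w_p is None else w_p  # assumption: None means unweighted
    w_n = 1.0 if w_n is None else w_n
    total = 0.0
    for z, y in zip(logits, labels):
        p = 1.0 / (1.0 + math.exp(-z))  # sigmoid of the raw logit
        total -= w_p * y * math.log(p + epsilon)
        total -= w_n * (1 - y) * math.log(1.0 - p + epsilon)
    return total / len(logits)
```

Raising w_p above w_n penalizes missed positives more heavily, which is a common way to counter class imbalance.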

class deepml.losses.ContrastiveLoss(*args: Any, **kwargs: Any)[source]

Bases: Module

Contrastive loss for siamese networks.

Encourages embeddings of similar pairs to be close together and embeddings of dissimilar pairs to be at least margin apart.

margin

Minimum distance margin between negative pairs.

distance_func

Optional custom distance function. If None, pairwise Euclidean distance is used.

label_transform

Optional transformation applied to target labels before loss computation.

__init__(margin=2.0, distance_func=None, label_transform=None)[source]

Initializes ContrastiveLoss.

Parameters:
  • margin – Minimum distance enforced between the embeddings of a negative (dissimilar) pair. Defaults to 2.0.

  • distance_func – Custom distance function to use. If None, Euclidean pairwise distance is used. Defaults to None.

  • label_transform – Transformation function to apply on the target label, e.g., lambda label: label[:, 0]. Defaults to None.

forward(embeddings, label)[source]

Computes the contrastive loss for a pair of embeddings.

Parameters:
  • embeddings (Tensor) – A tuple of two tensors (embeddings1, embeddings2), each of shape (N, D) where D is the embedding dimension.

  • label (Tensor) – Tensor of shape (N,). A value of 1 indicates a positive (similar) pair; 0 indicates a negative (dissimilar) pair.

Returns:

Scalar tensor representing the mean contrastive loss.
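The behaviour described above matches the standard contrastive loss formulation, sketched here in plain Python. That the class uses exactly this formula (squared terms, no 0.5 factor) is an assumption; the label convention (1 for a similar pair) follows the parameter description.

```python
import math
from typing import Sequence


def contrastive_loss(
    emb1: Sequence[Sequence[float]],
    emb2: Sequence[Sequence[float]],
    labels: Sequence[int],
    margin: float = 2.0,
) -> float:
    """Mean of y * d^2 + (1 - y) * max(margin - d, 0)^2 over all pairs."""
    losses = []
    for a, b, y in zip(emb1, emb2, labels):
        d = math.dist(a, b)  # Euclidean distance between the two embeddings
        positive_term = y * d ** 2                           # pull similar pairs together
        negative_term = (1 - y) * max(margin - d, 0.0) ** 2  # push dissimilar pairs apart
        losses.append(positive_term + negative_term)
    return sum(losses) / len(losses)
```

A dissimilar pair that is already at least margin apart contributes nothing to the loss.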

class deepml.losses.AngularPenaltySMLoss(*args: Any, **kwargs: Any)[source]

Bases: Module

Angular Penalty Softmax Loss for deep face recognition.

Implements three angular margin-based softmax losses:

  • ArcFace: Additive angular margin loss.

  • SphereFace: Multiplicative angular margin loss.

  • CosFace: Additive cosine margin loss.

s

Scaling factor for the logits.

m

Angular or cosine margin penalty.

loss_type

One of ‘arcface’, ‘sphereface’, or ‘cosface’.

in_features

Size of the input feature vector.

out_features

Number of output classes.

fc

Fully connected layer mapping input features to class logits (without bias).

eps

Small epsilon for numerical stability in acos clamping.

__init__(in_features, out_features, loss_type='arcface', eps=1e-07, s=None, m=None)[source]

Initializes AngularPenaltySMLoss.

Parameters:
  • in_features – Dimensionality of the input feature embeddings.

  • out_features – Number of target classes.

  • loss_type – Type of angular penalty loss. Must be one of ‘arcface’, ‘sphereface’, or ‘cosface’. Defaults to ‘arcface’.

  • eps – Small constant for numerical stability when clamping values for acos. Defaults to 1e-7.

  • s – Scaling factor for logits. If None, uses the default for the chosen loss type (64.0 for arcface/sphereface, 30.0 for cosface). Defaults to None.

  • m – Margin penalty. If None, uses the default for the chosen loss type (0.5 for arcface, 1.35 for sphereface, 0.4 for cosface). Defaults to None.

Raises:

AssertionError – If loss_type is not one of the supported types.

forward(x, labels)[source]

Computes the angular penalty softmax loss.

Parameters:
  • x – Input feature embeddings of shape (N, in_features).

  • labels – Ground truth class labels of shape (N,), with values in the range [0, out_features).

Returns:

Scalar tensor representing the negative mean log probability.

Raises:

AssertionError – If input and labels have mismatched batch sizes, or if labels contain values outside the valid range.
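The three margin types differ only in how they penalize the logit of the true class. The sketch below computes that penalized logit from a cosine similarity, using the published ArcFace, SphereFace, and CosFace formulations and the default s and m values listed above. The function name is hypothetical, and the sketch omits the softmax over the remaining classes.

```python
import math
from typing import Optional

# Default (s, m) per loss type, as documented for the s and m parameters.
DEFAULTS = {"arcface": (64.0, 0.5), "sphereface": (64.0, 1.35), "cosface": (30.0, 0.4)}


def penalized_target_logit(
    cos_theta: float,
    loss_type: str = "arcface",
    s: Optional[float] = None,
    m: Optional[float] = None,
    eps: float = 1e-7,
) -> float:
    """Return the scaled, margin-penalized logit for the true class."""
    assert loss_type in DEFAULTS, "unsupported loss_type"
    default_s, default_m = DEFAULTS[loss_type]
    s = default_s if s is None else s
    m = default_m if m is None else m
    if loss_type == "cosface":
        return s * (cos_theta - m)  # additive cosine margin
    # Clamp before acos so the angle stays well defined at the boundaries.
    theta = math.acos(min(max(cos_theta, -1.0 + eps), 1.0 - eps))
    if loss_type == "arcface":
        return s * math.cos(theta + m)  # additive angular margin
    return s * math.cos(m * theta)      # multiplicative angular margin
```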

deepml.lr_scheduler_utils module

deepml.lr_scheduler_utils.setup_one_cycle_lr_scheduler_with_warmup(optimizer, steps_per_epoch, warmup_steps=None, warmup_ratio=None, num_epochs=50, max_lr=0.001, anneal_strategy='cos')[source]

Sets up a OneCycleLR learning rate scheduler with warmup phase.

Creates a OneCycleLR scheduler that includes a warmup phase specified either by the number of steps or as a ratio of total training steps. The scheduler follows the 1-cycle policy: warmup up to max_lr → annealing down to min_lr.

Parameters:
  • optimizer – PyTorch optimizer instance to schedule.

  • steps_per_epoch (int) – Number of optimizer steps in one epoch. Typically len(train_loader). When using gradient accumulation or distributed training, adjust accordingly: len(train_loader) // gradient_accumulation_steps // num_processes.

  • warmup_steps (Optional[int]) – Number of warmup steps before reaching max_lr. Must be less than total training steps. Mutually exclusive with warmup_ratio. Defaults to None.

  • warmup_ratio (Optional[float]) – Ratio of total training steps to use for warmup (0-1). Mutually exclusive with warmup_steps. Defaults to None.

  • num_epochs (int) – Total number of training epochs. Defaults to 50.

  • max_lr (float) – Maximum learning rate during the cycle. Defaults to 1e-3.

  • anneal_strategy (Literal['cos', 'linear']) – Annealing strategy after warmup. Defaults to "cos". Options:
    • "cos": Cosine annealing (smooth decay)

    • "linear": Linear annealing

Returns:

OneCycleLR scheduler instance configured with the specified parameters.

Raises:

Example

>>> from torch.optim import Adam
>>> optimizer = Adam(model.parameters(), lr=1e-4)
>>> scheduler = setup_one_cycle_lr_scheduler_with_warmup(
...     optimizer,
...     steps_per_epoch=100,
...     warmup_ratio=0.1,
...     num_epochs=50,
...     max_lr=1e-3
... )

Note

The OneCycleLR policy divides training into two phases:

  1. Warmup: Learning rate increases from initial_lr to max_lr

  2. Annealing: Learning rate decreases from max_lr towards min_lr

The pct_start parameter controls the fraction of total steps used for warmup.

total_steps is set to num_epochs * steps_per_epoch + 1 rather than the “bare” product. PyTorch’s OneCycleLR.__init__ calls step() once internally during construction, consuming one slot before training begins. The trainer then calls step() once per batch (num_epochs * steps_per_epoch times total). Without the +1 the very last batch would trigger a ValueError: Tried to step N+1 times.
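The bookkeeping described in these notes can be sketched without PyTorch. The helper one_cycle_args is hypothetical: it derives total_steps with the extra step mentioned above and converts either warmup argument into a pct_start fraction. The exception types, the division by total_steps, and the fallback to PyTorch's default pct_start of 0.3 are assumptions.

```python
from typing import Optional, Tuple


def one_cycle_args(
    steps_per_epoch: int,
    num_epochs: int = 50,
    warmup_steps: Optional[int] = None,
    warmup_ratio: Optional[float] = None,
) -> Tuple[int, float]:
    """Hypothetical helper: return (total_steps, pct_start) for OneCycleLR."""
    if warmup_steps is not None and warmup_ratio is not None:
        raise ValueError("warmup_steps and warmup_ratio are mutually exclusive.")
    # The extra step follows the note above on total_steps.
    total_steps = num_epochs * steps_per_epoch + 1
    if warmup_steps is not None:
        if not 0 < warmup_steps < total_steps:
            raise ValueError("warmup_steps must be less than total training steps.")
        pct_start = warmup_steps / total_steps
    elif warmup_ratio is not None:
        pct_start = warmup_ratio
    else:
        pct_start = 0.3  # assumption: fall back to PyTorch's OneCycleLR default
    return total_steps, pct_start
```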

deepml.tasks module

class deepml.tasks.Task(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto')[source]

Bases: ABC

Abstract base class for all deep learning tasks.

This class provides the foundation for task-specific implementations including model management, device handling, and prediction workflows.

Subclasses must implement methods for transforming targets and outputs, batch prediction, training and evaluation steps, and visualization.

__init__(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto')[source]

Initializes the Task.

Parameters:
  • model (Module) – PyTorch model instance to be trained or used for inference.

  • model_dir (str) – Directory path for saving and loading model checkpoints.

  • load_saved_model (bool) – Whether to load model weights from a previously saved checkpoint file in model_dir. Defaults to False.

  • model_file_name (str) – Name of the model checkpoint file. Defaults to “latest_model.pt”.

  • device (str) – Device to use for computation. Options: “auto”, “cpu”, “cuda”, or “mps”. When “auto”, automatically selects the best available device. Defaults to “auto”.

Raises:

AssertionError – If model is not a torch.nn.Module instance, or if model_dir is None, or if model_file_name is not a string, or if device is not one of the valid options.

property model
property model_dir
property device
property model_file_name

move_input_to_device(x, device=None, non_blocking=False, **kwargs)[source]

Moves input data to the specified device.

Handles various input types including tensors, lists, tuples, and dictionaries containing tensors.

Parameters:
  • x (Union[Tensor, list, tuple, dict]) – Input data to move. Can be a single tensor, list/tuple of tensors, or dictionary with tensor values.

  • device (Union[device, str, None]) – Target device. If None, uses the task’s default device. Defaults to None.

  • non_blocking (bool) – Whether to use asynchronous transfer. Defaults to False.

  • **kwargs (dict) – Additional keyword arguments (unused).

Return type:

Union[Tensor, list, tuple, dict]

Returns:

Input data moved to the target device, maintaining the original data structure.
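
The structure-preserving behavior described above might look roughly like the following sketch. It is not the deepml implementation: move_to_device and FakeTensor are hypothetical names, and FakeTensor only stands in for torch.Tensor so that the sketch runs without PyTorch installed.

```python
from typing import Any


def move_to_device(x: Any, device: str, non_blocking: bool = False) -> Any:
    """Recursively move tensor-like objects in lists, tuples, and dicts to `device`."""
    if hasattr(x, "to"):  # torch.Tensor exposes .to(device, non_blocking=...)
        return x.to(device, non_blocking=non_blocking)
    if isinstance(x, dict):
        return {key: move_to_device(value, device, non_blocking) for key, value in x.items()}
    if isinstance(x, (list, tuple)):
        # Rebuild the same container type so the original structure is kept.
        return type(x)(move_to_device(value, device, non_blocking) for value in x)
    return x  # non-tensor leaves (e.g. filename strings) are returned unchanged


class FakeTensor:
    """Stand-in for torch.Tensor that only tracks a device name."""

    def __init__(self, device: str = "cpu") -> None:
        self.device = device

    def to(self, device: str, non_blocking: bool = False) -> "FakeTensor":
        return FakeTensor(device)


batch = {"image": FakeTensor(), "extras": [FakeTensor(), "img_001.png"]}
moved = move_to_device(batch, "cuda")
print(moved["image"].device, moved["extras"][0].device, moved["extras"][1])
# cuda cuda img_001.png
```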

transform_input(x, image_inverse_transform=None)[source]

Applies optional inverse transformation to input images.

Parameters:
  • x (Tensor) – Input image batch in BCHW format.

  • image_inverse_transform (Callable) – Optional transformation function to apply (e.g., denormalization). Defaults to None.

Return type:

Tensor

Returns:

Transformed image batch in BCHW format.

abstract transform_target(y)[source]

Transforms target data for visualization or evaluation.

Parameters:

y – Target data in model format.

Returns:

Transformed target data.

abstract transform_output(prediction)[source]

Transforms model output for visualization or evaluation.

Parameters:

prediction – Model output in raw format.

Returns:

Transformed prediction data.

abstract predict_batch(x, *args, **kwargs)[source]

Performs prediction on a single batch.

Parameters:
  • x – Input batch.

  • *args – Additional positional arguments.

  • **kwargs – Additional keyword arguments.

Returns:

Model predictions for the batch.

abstract train_step(x, y, *args, **kwargs)[source]

Executes a single training step.

Apply any batch-based transformation to the target as well, if needed.

Parameters:
  • x – Input batch.

  • y – Target batch.

  • *args – Additional positional arguments.

  • **kwargs – Additional keyword arguments.

Return type:

Tuple[Any, Any, Any]

Returns:

Tuple of (predictions, processed_inputs, processed_targets).

abstract eval_step(x, y, *args, **kwargs)[source]

Executes a single evaluation step.

Parameters:
  • x – Input batch.

  • y – Target batch.

  • *args – Additional positional arguments.

  • **kwargs – Additional keyword arguments.

Return type:

Tuple[Any, Any, Any]

Returns:

Tuple of (predictions, processed_inputs, processed_targets).

abstract predict(loader)[source]

Generates predictions for all data in the loader.

Parameters:

loader – DataLoader containing data for prediction.

Returns:

Predictions and targets.

abstract predict_class(loader)[source]

Generates class predictions for all data in the loader.

Parameters:

loader – DataLoader containing data for prediction.

Returns:

Predicted classes, probabilities, and targets.

abstract show_predictions(loader, image_inverse_transform=None, samples=9, cols=3, figsize=(10, 10), target_known=True)[source]

Visualizes model predictions.

Parameters:
  • loader – DataLoader containing data for visualization.

  • image_inverse_transform – Transformation to reverse normalization.

  • samples – Number of samples to display.

  • cols – Number of columns in visualization grid.

  • figsize – Figure size tuple.

  • target_known – Whether ground truth is available.

abstract write_prediction_to_logger(tag, loader, logger, image_inverse_transform, global_step, img_size=224)[source]

Writes predictions to experiment logger.

Parameters:
  • tag – Tag identifier for logged data.

  • loader – DataLoader containing data.

  • logger – Experiment logger instance.

  • image_inverse_transform – Transformation to reverse normalization.

  • global_step – Current training step/epoch.

  • img_size – Image size for logging.

abstract evaluate(loader, criterion, metrics=None, non_blocking=False)[source]

Evaluates model performance on the given data.

Parameters:
  • loader (DataLoader) – DataLoader containing evaluation data.

  • criterion (Module) – Loss function module.

  • metrics (Dict[str, Module]) – Dictionary of metric modules.

  • non_blocking – Whether to use async CUDA transfers.

Returns:

Dictionary of evaluation metrics.

class deepml.tasks.NeuralNetTask(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto')[source]

Bases: Task

Base task implementation for general deep learning tasks.

This class provides a simple implementation suitable for any deep learning task. It performs predictions without applying task-specific transformations and does not write to TensorBoard by default.

Use this class when you need a minimal task implementation without specialized handling for classification, segmentation, or regression.

__init__(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto')[source]

Initializes the NeuralNetTask.

Parameters:
  • model (Module) – PyTorch model instance to be trained or used for inference.

  • model_dir (str) – Directory path for saving and loading model checkpoints.

  • load_saved_model (bool) – Whether to load a previously saved model from model_dir. Defaults to False.

  • model_file_name (str) – Name of the model checkpoint file. Defaults to “latest_model.pt”.

  • device (str) – Device to use for computation. Options: “auto”, “cpu”, “cuda”, or “mps”. Defaults to “auto”.

predict_batch(x, *args, **kwargs)[source]

Performs prediction on a single batch.

Parameters:
  • x (Tensor) – Input batch tensor.

  • *args – Additional positional arguments.

  • **kwargs – Additional keyword arguments. If ‘model’ key is present, uses that model instead of the task’s default model.

Returns:

Model predictions for the batch.

train_step(x, y, *args, **kwargs)[source]

Executes a single training step.

Parameters:
  • x – Input batch.

  • y – Target batch.

  • *args – Additional positional arguments.

  • **kwargs – Additional keyword arguments.

Returns:

Tuple of (predictions, inputs, targets).

eval_step(x, y, *args, **kwargs)[source]

Executes a single evaluation step.

Parameters:
  • x – Input batch.

  • y – Target batch.

  • *args – Additional positional arguments.

  • **kwargs – Additional keyword arguments.

Returns:

Tuple of (predictions, inputs, targets).

predict(loader)[source]

Generates predictions for all batches in the data loader.

Parameters:

loader (DataLoader) – DataLoader containing data for prediction.

Returns:

Tuple of (predictions, targets), where:

  • predictions: Concatenated tensor of all model predictions

  • targets: Concatenated tensor or list of all ground truth labels

Raises:

AssertionError – If loader is None or empty.

predict_class(loader)[source]

Generates class predictions for all data in the loader.

Parameters:

loader (DataLoader) – DataLoader containing data for prediction.

Raises:

NotImplementedError – This method must be implemented by subclasses.

show_predictions(loader, image_inverse_transform=None, samples=9, cols=3, figsize=(10, 10), target_known=True)[source]

Visualizes model predictions.

Parameters:
  • loader (DataLoader) – DataLoader containing data for visualization.

  • image_inverse_transform (Callable) – Transformation to reverse normalization.

  • samples (int) – Number of samples to display.

  • cols (int) – Number of columns in visualization grid.

  • figsize (Tuple[int, int]) – Figure size tuple.

  • target_known (bool) – Whether ground truth is available.

Raises:

NotImplementedError – This method must be implemented by subclasses.

transform_target(y)[source]

Transforms target data for visualization or evaluation.

Parameters:

y (Any) – Target data in model format.

Raises:

NotImplementedError – This method must be implemented by subclasses.

transform_output(prediction)[source]

Transforms model output for visualization or evaluation.

Parameters:

prediction – Model output in raw format.

Raises:

NotImplementedError – This method must be implemented by subclasses.

write_prediction_to_logger(tag, loader, logger, image_inverse_transform, global_step, img_size=224, **kwargs)[source]

Writes predictions to experiment logger.

Parameters:
  • tag (str) – Tag identifier for logged data.

  • loader – DataLoader containing data.

  • logger – Experiment logger instance.

  • image_inverse_transform – Transformation to reverse normalization.

  • global_step – Current training step/epoch.

  • img_size – Image size for logging.

  • **kwargs (dict) – Additional keyword arguments.

Note

Default implementation does nothing. Override in subclasses for custom logging behavior.

evaluate(loader, metrics=None, non_blocking=False)

Evaluates the model on the given data loader using specified metrics.

Parameters:
  • loader (DataLoader) – DataLoader containing evaluation data.

  • metrics (Dict[str, Module]) – Dictionary mapping metric names to metric modules. Each metric should be a torch.nn.Module with a forward() method. Defaults to None.

  • non_blocking – Whether to use asynchronous CUDA transfers. Defaults to False.

Returns:

Dictionary mapping metric names to their average values across all batches.

Raises:

Exception – If loader is None.
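
As an illustration of "average values across all batches", the sketch below computes an unweighted mean of each metric over batches. This is an assumption about the averaging scheme, not a statement of how deepml implements it. average_metrics and accuracy are hypothetical names, and plain Python lists stand in for tensors.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Batch = Tuple[List[int], List[int]]  # (predictions, targets)


def average_metrics(
    batches: Iterable[Batch], metrics: Dict[str, Callable[[List[int], List[int]], float]]
) -> Dict[str, float]:
    """Evaluate every metric on every batch and return the per-metric mean."""
    totals: Dict[str, float] = defaultdict(float)
    num_batches = 0
    for predictions, targets in batches:
        for name, metric in metrics.items():
            totals[name] += float(metric(predictions, targets))
        num_batches += 1
    if num_batches == 0:
        return {}
    return {name: total / num_batches for name, total in totals.items()}


def accuracy(predictions: List[int], targets: List[int]) -> float:
    """Fraction of positions where prediction and target agree."""
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


batches = [([1, 0, 1, 1], [1, 0, 0, 1]), ([0, 0], [0, 1])]
print(average_metrics(batches, {"accuracy": accuracy}))  # {'accuracy': 0.625}
```

Note that a per-batch mean weights a small final batch as heavily as a full one; the two batches here score 0.75 and 0.5, giving 0.625 rather than the per-sample accuracy of 4/6.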

class deepml.tasks.Segmentation(model, model_dir, mode='binary', load_saved_model=False, model_file_name='latest_model.pt', device='auto', num_classes=1, threshold=0.5, color_map=None)[source]

Bases: NeuralNetTask

Task implementation for binary and multiclass semantic segmentation.

This class handles pixel-level classification tasks including binary and multiclass segmentation with customizable color mapping for visualization.

mode

Segmentation mode (“binary” or “multiclass”).

num_classes

Number of segmentation classes.

threshold

Threshold for binary segmentation predictions.

class_index_to_color

Dictionary mapping class indices to colors.

palette

Color palette for visualization (PIL format).

__init__(model, model_dir, mode='binary', load_saved_model=False, model_file_name='latest_model.pt', device='auto', num_classes=1, threshold=0.5, color_map=None)[source]

Initializes the Segmentation task.

Parameters:
  • model (Module) – PyTorch model architecture for segmentation.

  • model_dir (str) – Directory path for saving/loading model checkpoints.

  • mode (str) – Segmentation mode. Options: “binary” or “multiclass”. Defaults to “binary”.

  • load_saved_model (bool) – Whether to load a previously saved model. Defaults to False.

  • model_file_name (str) – Name of the model checkpoint file. Defaults to “latest_model.pt”.

  • device (str) – Device to use for computation. Options: “auto”, “cpu”, “cuda”, or “mps”. Defaults to “auto”.

  • num_classes (int) – Number of segmentation classes. For binary segmentation, use 1 (class 0: background, class 1: foreground). Defaults to 1.

  • threshold (float) – Probability threshold for binary segmentation predictions. Defaults to 0.5.

  • color_map (dict) – Dictionary mapping class indices to colors. If None, a default color map is used: {0: 0, 1: 255} (grayscale) for binary mode, or {0: [0,0,0], 1: [R,G,B], …} (RGB triplets) for multiclass mode. For multiclass, random RGB colors are generated if not specified. Class 0 is always background (black). Defaults to None.

Raises:

AssertionError – If num_classes is not an integer or is less than 1.

Example

>>> model = UNet(in_channels=3, out_channels=3)
>>> color_map = {0: [0,0,0], 1: [255,0,0], 2: [0,255,0]}
>>> task = Segmentation(
...     model=model,
...     model_dir="./models",
...     mode="multiclass",
...     num_classes=3,
...     color_map=color_map
... )

predict_batch(x, *args, **kwargs)[source]

Performs prediction on a single batch.

Parameters:
  • x (Union[Tensor, ndarray]) – Input batch tensor.

  • *args – Additional positional arguments.

  • **kwargs (Dict[str, Any]) – Additional keyword arguments. If ‘model’ key is present, uses that model instead of the task’s default model.

Return type:

Tensor

Returns:

Model predictions for the batch.

save_prediction(loader, save_dir)[source]

Generates and saves segmentation predictions as PNG images.

Performs inference on the data loader and saves predicted segmentation masks as PNG files with the appropriate color palette.

Parameters:
  • loader (DataLoader) – DataLoader yielding batches of (images, filenames). The second element must be a list of filename strings.

  • save_dir (str) – Output directory path where prediction PNG files will be saved. Directory will be created if it doesn’t exist.

Raises:

AssertionError – If loader is None, empty, or save_dir is None.

Note

Predictions whose filenames do not end with ‘.png’ are still written in PNG format and given the .png extension.
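
One way to derive the output path is sketched below. It assumes the original extension is replaced rather than having ".png" appended, which the documentation does not specify; png_output_path is a hypothetical helper, not a deepml function.

```python
from pathlib import Path


def png_output_path(save_dir: str, filename: str) -> Path:
    """Build the path of the saved mask, forcing a .png extension."""
    name = Path(filename).name  # drop any directory part of the input filename
    if not name.lower().endswith(".png"):
        # Assumption: swap the existing extension (if any) for .png.
        name = Path(name).with_suffix(".png").name
    return Path(save_dir) / name


print(png_output_path("predictions", "scan_01.jpg").name)  # scan_01.png
print(png_output_path("predictions", "mask_07.png").name)  # mask_07.png
```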

predict_class(loader)[source]

Generates class predictions for all data in the loader.

Parameters:

loader (DataLoader) – DataLoader containing data for prediction.

Raises:

NotImplementedError – This method must be implemented by subclasses.

show_predictions(loader, image_inverse_transform=None, samples=4, cols=3, figsize=(16, 16), target_known=True)[source]

Visualizes segmentation predictions on sample images.

Displays input images, ground truth masks, and predicted masks in a matplotlib figure with overlays.

Parameters:
  • loader (DataLoader) – DataLoader containing data for visualization.

  • image_inverse_transform (Callable) – Transformation to reverse image normalization for display. Defaults to None.

  • samples (int) – Number of samples to display. Defaults to 4.

  • cols (int) – Number of columns in the visualization grid. Defaults to 3.

  • figsize (Tuple[int, int]) – Figure size as (width, height) tuple. Defaults to (16, 16).

  • target_known (bool) – Whether ground truth targets are available for comparison. Defaults to True.

transform_target(y)[source]

Transforms target mask to RGB color image for visualization.

Parameters:

y (Tensor) – Target segmentation mask with class indices.

Returns:

RGB color image tensor decoded using the class color palette.

transform_output(predictions)[source]

Converts model predictions to class indices.

Applies sigmoid (binary) or softmax (multiclass) activation and converts probabilities to discrete class indices.

Parameters:

predictions (Tensor) – Model output logits of shape (B, C, H, W), where B is the batch size, C is the number of classes (1 for binary, >1 for multiclass), H is the height, and W is the width.

Return type:

Tensor

Returns:

Tensor of class indices with shape (B, H, W). For binary segmentation, values are 0 or 1. For multiclass, values are in range [0, num_classes).

Raises:

AssertionError – If predictions is not 4-dimensional (BCHW format).

Note

  • Binary: Uses sigmoid activation with threshold (default 0.5)

  • Multiclass: Uses softmax activation with argmax

decode_segmentation_mask(class_indices)[source]

Converts class indices to RGB color images for visualization.

Parameters:

class_indices (Tensor) – Batch of segmentation masks with shape (B, H, W) containing class indices.

Returns:

Batch of RGB images with shape (B, C, H, W), where:

  • For binary: C=1 (grayscale)

  • For multiclass: C=3 (RGB)

Colors are mapped according to the class_index_to_color palette.

Note

Uses PIL Image palette for efficient color mapping in multiclass mode.
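
The multiclass color lookup can be illustrated without PIL or PyTorch. The sketch below is a simplified stand-in that maps a single (H, W) mask of class indices to a channels-first (3, H, W) nested list using a color_map like the one in the Segmentation example; decode_mask_chw is a hypothetical name.

```python
from typing import Dict, List, Sequence


def decode_mask_chw(
    mask: List[List[int]], color_map: Dict[int, Sequence[int]]
) -> List[List[List[int]]]:
    """Map an (H, W) grid of class indices to a (3, H, W) grid of RGB values."""
    return [
        [[color_map[class_index][channel] for class_index in row] for row in mask]
        for channel in range(3)
    ]


color_map = {0: [0, 0, 0], 1: [255, 0, 0], 2: [0, 255, 0]}
mask = [[0, 1], [2, 1]]
rgb = decode_mask_chw(mask, color_map)
print(rgb[0])  # red channel:   [[0, 255], [0, 255]]
print(rgb[1])  # green channel: [[0, 0], [255, 0]]
```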

log_prediction(tag, predictions, x, targets, logger, image_inverse_transform, global_step, img_size=224, **kwargs)[source]

Logs input images, target masks, and output masks to the experiment logger.

Creates a visualization grid showing input images, ground truth masks, ground truth overlays, predicted masks, and predicted overlays side by side.

Parameters:
  • tag (str) – Tag identifier for the logged images in the experiment tracker.

  • predictions (Tensor) – Model predictions with shape (B, C, H, W) or (B, H, W).

  • x (Tensor) – Input images with shape (B, C, H, W).

  • targets (Tensor) – Ground truth masks with shape (B, H, W) or (B, C, H, W).

  • logger (MLExperimentLogger) – Experiment logger instance for tracking visualizations.

  • image_inverse_transform (Callable) – Callable to reverse image normalization for proper visualization.

  • global_step (int) – Current training step/epoch for the logger.

  • img_size (Union[int, Tuple[int, int], None]) – Target size for resizing images. Can be int or (H, W) tuple. If None, no resizing is performed. Defaults to 224.

  • **kwargs (dict) – Additional keyword arguments passed through.

Note

Override this method to customize the logging behavior. The default implementation creates a grid with 5 images per sample: input, target mask, target overlay, predicted mask, and predicted overlay.

write_prediction_to_logger(tag, loader, logger, image_inverse_transform, global_step, img_size=224, **kwargs)[source]

Writes input images, targets, and predictions to the experiment logger.

Samples random batches from the data loader, generates predictions, and logs visualizations to the experiment tracker.

Parameters:
  • tag (str) – Tag identifier for the logged images in the experiment tracker.

  • loader (DataLoader) – DataLoader containing data for visualization.

  • logger (MLExperimentLogger) – Experiment logger instance for tracking visualizations.

  • image_inverse_transform (Callable) – Callable to reverse image normalization for proper visualization.

  • global_step (int) – Current training step/epoch for the logger.

  • img_size (Union[int, Tuple[int, int], None]) – Target size for resizing images. Can be int or (H, W) tuple. If None, no resizing is performed. Defaults to 224.

  • **kwargs (dict) – Additional keyword arguments passed to eval_step.

class deepml.tasks.ImageRegression(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto')[source]

Bases: NeuralNetTask

Task implementation for image regression problems.

This class handles tasks where the model predicts continuous values from images, such as age estimation, pose estimation, or depth prediction.

The task supports visualization of predictions alongside ground truth values and logging to experiment trackers.

__init__(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto')[source]

Initializes the ImageRegression task.

Parameters:
  • model (Module) – PyTorch model instance for regression.

  • model_dir (str) – Directory path for saving and loading model checkpoints.

  • load_saved_model (bool) – Whether to load a previously saved model from model_dir. Defaults to False.

  • model_file_name (str) – Name of the model checkpoint file. Defaults to “latest_model.pt”.

  • device (str) – Device to use for computation. Options: “auto”, “cpu”, “cuda”, or “mps”. Defaults to “auto”.

show_predictions(loader, image_inverse_transform=None, samples=9, cols=3, figsize=(10, 10), target_known=True)[source]

Visualizes model predictions on sample images.

Displays random samples from the loader with their ground truth values and predicted values in a matplotlib figure.

Parameters:
  • loader (DataLoader) – DataLoader containing data for visualization.

  • image_inverse_transform (Callable) – Transformation to reverse image normalization for display. Defaults to None.

  • samples (int) – Number of samples to display. Defaults to 9.

  • cols (int) – Number of columns in the visualization grid. Defaults to 3.

  • figsize (Tuple[int, int]) – Figure size as (width, height) tuple. Defaults to (10, 10).

  • target_known (bool) – Whether ground truth targets are available for comparison. Defaults to True.

transform_target(y)[source]

Transforms target tensor to a rounded float value.

Parameters:

y (Tensor) – Target tensor (single value).

Returns:

Rounded float value to 2 decimal places.

transform_output(prediction)[source]

Transforms prediction tensor to a rounded float value.

Parameters:

prediction (Tensor) – Prediction tensor (single value).

Returns:

Rounded float value to 2 decimal places.

write_prediction_to_logger(tag, loader, logger, image_inverse_transform, global_step, img_size=224)[source]

Writes predictions with ground truth values to the experiment logger.

Creates a visualization grid showing input images alongside their ground truth and predicted values as text overlays.

Parameters:
  • tag (str) – Unique tag identifier for the logged images.

  • loader (DataLoader) – DataLoader containing data for visualization.

  • logger (MLExperimentLogger) – Experiment logger instance for tracking visualizations.

  • image_inverse_transform (Callable) – Transformation to reverse image normalization.

  • global_step (int) – Current training epoch/step for the logger.

  • img_size (Union[int, Tuple[int, int], None]) – Image size for TensorBoard logging. Can be int or (H, W) tuple. If None, no visualization is written. Defaults to 224.

predict_class(loader)[source]

Generates class predictions for all data in the loader.

Parameters:

loader – DataLoader containing data for prediction.

Raises:

NotImplementedError – This method must be implemented by subclasses.

class deepml.tasks.ImageClassification(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto', classes=None)[source]

Bases: NeuralNetTask

Task implementation for image classification.

This class handles both binary and multiclass classification tasks where each image belongs to exactly one class. Supports custom class labels and visualization of predictions.

_classes

Optional sequence of class names for human-readable labels.

__init__(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto', classes=None)[source]

Initializes the ImageClassification task.

Parameters:
  • model (Module) – PyTorch model instance for classification.

  • model_dir (str) – Directory path for saving and loading model checkpoints.

  • load_saved_model (bool) – Whether to load a previously saved model from model_dir. Defaults to False.

  • model_file_name (str) – Name of the model checkpoint file. Defaults to “latest_model.pt”.

  • device (str) – Device to use for computation. Options: “auto”, “cpu”, “cuda”, or “mps”. Defaults to “auto”.

  • classes (Sequence) – Optional sequence of class names (e.g., [‘cat’, ‘dog’]). If provided, predictions will use these labels instead of class indices. Defaults to None.

predict_class(loader)[source]

Generates class predictions with probabilities for all data.

Parameters:

loader (DataLoader) – DataLoader containing data for prediction.

Returns:

Tuple of (predicted_class, probability, targets), where:

  • predicted_class: Tensor of predicted class indices

  • probability: Tensor of prediction confidence scores

  • targets: Ground truth class labels

transform_target(y)[source]

Transforms target class index to human-readable label if available.

Parameters:

y – Target class index.

Returns:

Class name if classes are defined, otherwise returns the index.

transform_output(predictions)[source]

Converts model predictions to class indices and probabilities.

Applies sigmoid (binary) or softmax (multiclass) activation and extracts the predicted class and its probability.

Parameters:

predictions (Tensor) – Model output logits with shape (B, num_classes) for multiclass or (B, 1) for binary classification.

Returns:

Tuple of (indices, probabilities), where:

  • indices: Tensor of predicted class indices with shape (B,)

  • probabilities: Tensor of prediction confidences with shape (B,)

Note

  • Binary: Uses sigmoid with 0.5 threshold

  • Multiclass: Uses softmax with argmax
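
The two decision rules in the note can be written out in plain Python as a sketch of the arithmetic only. classify is a hypothetical function operating on nested lists rather than tensors. For the binary case it reports the sigmoid probability of the positive class, which is an assumption about what "confidence" means here.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits: List[float]) -> List[float]:
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits_batch: List[List[float]]) -> Tuple[List[int], List[float]]:
    """Return (indices, probabilities) for a batch of logit rows."""
    indices: List[int] = []
    probabilities: List[float] = []
    for logits in logits_batch:
        if len(logits) == 1:
            # Binary: a single logit, sigmoid with a 0.5 threshold.
            probability = sigmoid(logits[0])
            index = int(probability > 0.5)
        else:
            # Multiclass: softmax, then pick the most probable class.
            probs = softmax(logits)
            index = max(range(len(probs)), key=probs.__getitem__)
            probability = probs[index]
        indices.append(index)
        probabilities.append(probability)
    return indices, probabilities


print(classify([[2.0, 0.5, -1.0], [0.1, 3.0, 0.2]])[0])  # [0, 1]
print(classify([[-1.2], [0.8]])[0])  # [0, 1]
```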

show_predictions(loader, image_inverse_transform=None, samples=9, cols=3, figsize=(10, 10), target_known=True)[source]

Visualizes model predictions on sample images.

Displays random samples from the loader with their ground truth labels, predicted labels, and confidence scores in a matplotlib figure.

Parameters:
  • loader (DataLoader) – DataLoader containing data for visualization.

  • image_inverse_transform (Callable) – Transformation to reverse image normalization for display. Defaults to None.

  • samples (int) – Number of samples to display. Defaults to 9.

  • cols (int) – Number of columns in the visualization grid. Defaults to 3.

  • figsize (Tuple[int, int]) – Figure size as (width, height) tuple. Defaults to (10, 10).

  • target_known (bool) – Whether ground truth targets are available for comparison. If True, titles will be colored green (correct) or red (incorrect). Defaults to True.

write_prediction_to_logger(tag, loader, logger, image_inverse_transform, global_step, img_size=224)[source]

Writes predictions with labels to the experiment logger.

Creates a visualization grid showing input images alongside their ground truth and predicted class labels with confidence scores.

Parameters:
  • tag (str) – Unique tag identifier for the logged images.

  • loader – DataLoader containing data for visualization.

  • logger (MLExperimentLogger) – Experiment logger instance for tracking visualizations.

  • image_inverse_transform – Transformation to reverse image normalization.

  • global_step (int) – Current training epoch/step for the logger.

  • img_size – Image size for logging. Can be int or (H, W) tuple. If None, no visualization is written. Defaults to 224.

Note

Predictions are colored green for correct classifications and red for incorrect ones.

class deepml.tasks.MultiLabelImageClassification(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto', classes=None)[source]

Bases: ImageClassification

Task implementation for multi-label image classification.

This class handles classification tasks where each image can belong to multiple classes simultaneously (e.g., an image containing both a cat and a dog).

_classes

Optional sequence of class names for human-readable labels.

__init__(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto', classes=None)[source]

Initializes the MultiLabelImageClassification task.

Parameters:
  • model (Module) – PyTorch model instance for multi-label classification.

  • model_dir – Directory path for saving and loading model checkpoints.

  • load_saved_model (bool) – Whether to load a previously saved model from model_dir. Defaults to False.

  • model_file_name (str) – Name of the model checkpoint file. Defaults to “latest_model.pt”.

  • device (str) – Device to use for computation. Options: “auto”, “cpu”, “cuda”, or “mps”. Defaults to “auto”.

  • classes – Optional sequence of class names for labeling. Defaults to None.

predict_class(loader)[source]

Generates multi-label class predictions with probabilities for all data.

Parameters:

loader – DataLoader containing data for prediction.

Returns:

Tuple of (predicted_class, probability, targets), where:

  • predicted_class: Binary tensor indicating predicted classes

  • probability: Tensor of class probabilities for all classes

  • targets: Ground truth multi-label targets

transform_target(y)[source]

Transforms target class indices to comma-separated class labels.

Parameters:

y – Binary tensor or list where 1 indicates the class is present.

Returns:

Comma-separated string of class names if classes are defined, otherwise returns the original indices.

transform_output(predictions)[source]

Converts model predictions to binary class labels and probabilities.

Applies sigmoid activation and thresholding to convert logits into multi-label predictions.

Parameters:

predictions – Model output logits with shape (B, num_classes).

Returns:

Tuple of (indices, probabilities), where:

  • indices: Binary tensor with shape (B, num_classes). Value is 1 if class is predicted (probability > 0.5), else 0.

  • probabilities: Tensor of class probabilities with shape (B, num_classes) after sigmoid activation.

Note

Uses sigmoid activation with 0.5 threshold for each class independently.
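
A plain-Python sketch of the per-class thresholding, together with the label formatting described for transform_target, is shown below. Both function names are hypothetical, nested lists stand in for tensors, and the ", " separator is an assumption about the exact formatting.

```python
import math
from typing import List, Sequence, Tuple


def multilabel_predict(
    logits_batch: List[List[float]], threshold: float = 0.5
) -> Tuple[List[List[int]], List[List[float]]]:
    """Apply sigmoid to each logit independently, then threshold each class."""
    probabilities = [[1.0 / (1.0 + math.exp(-z)) for z in row] for row in logits_batch]
    indices = [[int(p > threshold) for p in row] for row in probabilities]
    return indices, probabilities


def labels_to_string(indicator: Sequence[int], classes: Sequence[str]) -> str:
    """Join the names of all classes whose indicator value is 1."""
    return ", ".join(name for flag, name in zip(indicator, classes) if flag == 1)


classes = ["cat", "dog", "bird"]
indices, probabilities = multilabel_predict([[2.0, -1.0, 0.3]])
print(indices)  # [[1, 0, 1]]
print(labels_to_string(indices[0], classes))  # cat, bird
```

Unlike the softmax case, the probabilities in a row are not constrained to sum to 1, so any number of classes can be predicted for one image.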

deepml.tracking module

class deepml.tracking.MLExperimentLogger[source]

Bases: ABC

Abstract base class for experiment tracking and logging.

This class defines the interface for logging machine learning experiments across different platforms (TensorBoard, MLflow, Weights & Biases, etc.).

Subclasses must implement all abstract methods to provide platform-specific logging functionality.

__init__()[source]

Initializes the MLExperimentLogger.

abstract log_params(**kwargs)[source]

Logs hyperparameters and configuration for the experiment.

Parameters:

**kwargs – Arbitrary keyword arguments containing parameters to log. Common parameters include model architecture, optimizer settings, learning rate, batch size, etc.

abstract log_metric(tag, value, step)[source]

Logs a scalar metric value at a specific step.

Parameters:
  • tag (str) – Identifier for the metric (e.g., “train/loss”, “val/accuracy”).

  • value (Any) – Numeric value of the metric.

  • step (int) – Training step or epoch number for this metric value.

abstract log_artifact(tag, value, step, artifact_path=None)[source]

Logs an artifact (file, tensor, or other data) to the experiment.

Parameters:
  • tag (str) – Identifier for the artifact.

  • value (Any) – The artifact data to log.

  • step (int) – Training step or epoch number.

  • artifact_path (Optional[str]) – Optional file path for saving the artifact. Defaults to None.

abstract log_model(tag, value, step, artifact_path=None)[source]

Logs a model checkpoint or weights to the experiment.

Parameters:
  • tag (str) – Identifier for the model checkpoint.

  • value (Any) – Model data or checkpoint information.

  • step (int) – Training step or epoch number.

  • artifact_path (Optional[str]) – Optional file path to the model checkpoint. Defaults to None.

abstract log_image(tag, value, step, artifact_path=None)[source]

Logs an image or batch of images to the experiment.

Parameters:
  • tag (str) – Identifier for the image(s).

  • value (Any) – Image data (tensor, numpy array, or PIL Image).

  • step (int) – Training step or epoch number.

  • artifact_path (Optional[str]) – Optional file path for saving the image. Defaults to None.
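
To show what implementing this interface involves, here is a minimal sketch of a logger that keeps everything in memory. Because deepml is not imported, ExperimentLoggerInterface is a stand-in that mirrors the five abstract methods documented above. InMemoryLogger and its attributes are hypothetical and exist only for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional, Tuple


class ExperimentLoggerInterface(ABC):
    """Stand-in mirroring the documented MLExperimentLogger methods."""

    @abstractmethod
    def log_params(self, **kwargs: Any) -> None: ...

    @abstractmethod
    def log_metric(self, tag: str, value: Any, step: int) -> None: ...

    @abstractmethod
    def log_artifact(self, tag: str, value: Any, step: int, artifact_path: Optional[str] = None) -> None: ...

    @abstractmethod
    def log_model(self, tag: str, value: Any, step: int, artifact_path: Optional[str] = None) -> None: ...

    @abstractmethod
    def log_image(self, tag: str, value: Any, step: int, artifact_path: Optional[str] = None) -> None: ...


class InMemoryLogger(ExperimentLoggerInterface):
    """Hypothetical logger that stores everything in Python containers."""

    def __init__(self) -> None:
        self.params: Dict[str, Any] = {}
        self.metrics: Dict[str, List[Tuple[int, float]]] = {}
        self.records: List[Tuple[str, str, int, Optional[str]]] = []

    def log_params(self, **kwargs: Any) -> None:
        self.params.update(kwargs)

    def log_metric(self, tag: str, value: Any, step: int) -> None:
        self.metrics.setdefault(tag, []).append((step, float(value)))

    def log_artifact(self, tag, value, step, artifact_path=None) -> None:
        self.records.append(("artifact", tag, step, artifact_path))

    def log_model(self, tag, value, step, artifact_path=None) -> None:
        self.records.append(("model", tag, step, artifact_path))

    def log_image(self, tag, value, step, artifact_path=None) -> None:
        self.records.append(("image", tag, step, artifact_path))


logger = InMemoryLogger()
logger.log_params(lr=1e-3, batch_size=32)
logger.log_metric("train/loss", 0.42, step=1)
print(logger.metrics["train/loss"])  # [(1, 0.42)]
```

A subclass that omits any of the five methods cannot be instantiated, which is the guarantee the abstract base class provides to trainers that accept a logger.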

class deepml.tracking.TensorboardLogger(model_dir)[source]

Bases: MLExperimentLogger

TensorBoard experiment logger implementation.

This logger writes experiment data to TensorBoard, including metrics, images, model graphs, and other artifacts.

writer

TensorBoard SummaryWriter instance for logging.

__init__(model_dir)[source]

Initializes the TensorboardLogger.

Creates a new run directory within the model directory and initializes the TensorBoard SummaryWriter.

Parameters:

model_dir – Base directory path for saving TensorBoard logs. A new timestamped run directory will be created within this path.

log_params(**kwargs)[source]

Logs hyperparameters and model graph to TensorBoard.

Parameters:

**kwargs – Keyword arguments. If ‘task’ and ‘loader’ are provided, writes the model computational graph to TensorBoard.

log_metric(tag, value, step)[source]

Logs a scalar metric value to TensorBoard.

Parameters:
  • tag (str) – Metric identifier (e.g., “train/loss”, “val/accuracy”).

  • value (float) – Numeric metric value.

  • step (int) – Training step or epoch number.

log_artifact(tag, value, step, artifact_path=None)[source]

Logs an artifact to TensorBoard.

Parameters:
  • tag (str) – Artifact identifier.

  • value (Any) – Artifact data. If a torch.Tensor, logs as images.

  • step (int) – Training step or epoch number.

  • artifact_path (Optional[str]) – Optional file path (unused in this implementation). Defaults to None.

log_model(tag, value, step, artifact_path=None)[source]

Logs a model checkpoint to TensorBoard.

Parameters:
  • tag (str) – Model identifier.

  • value (Any) – Model data.

  • step (int) – Training step or epoch number.

  • artifact_path (Optional[str]) – Optional file path to the model checkpoint. Defaults to None.

log_image(tag, value, step, artifact_path=None)[source]

Logs an image or batch of images to TensorBoard.

Parameters:
  • tag (str) – Image identifier.

  • value (Any) – Image data as a torch.Tensor with shape (B, C, H, W).

  • step (int) – Training step or epoch number.

  • artifact_path (Optional[str]) – Optional file path (unused in this implementation). Defaults to None.

Note

Only logs tensors with 4 dimensions (batch of images).

class deepml.tracking.MLFlowLogger(experiment_name='Default', tracking_uri=None, log_model_weights=True)[source]

Bases: MLExperimentLogger

MLflow experiment logger implementation.

This logger writes experiment data to MLflow tracking server, including metrics, parameters, model checkpoints, and images.

mlflow

MLflow module instance.

log_model_weights

Whether to log model weights as artifacts.

Note

Requires mlflow package to be installed.

__init__(experiment_name='Default', tracking_uri=None, log_model_weights=True)[source]

Initializes the MLFlowLogger.

Sets up the MLflow experiment and optionally configures the tracking URI.

Parameters:
  • experiment_name (str) – Name of the MLflow experiment. Defaults to “Default”.

  • tracking_uri (str) – URI of the MLflow tracking server. If None, uses the default local tracking. Defaults to None.

  • log_model_weights (bool) – Whether to log model weights as artifacts. Defaults to True.

log_params(**kwargs)[source]

Logs hyperparameters to MLflow.

Parameters:

**kwargs – Arbitrary keyword arguments containing parameters to log.

log_metric(tag, value, step)[source]

Logs a scalar metric value to MLflow.

Parameters:
  • tag (str) – Metric identifier (e.g., “train/loss”, “val/accuracy”).

  • value (Any) – Numeric metric value.

  • step (int) – Training step or epoch number.

log_artifact(tag, value, step, artifact_path=None)[source]

Logs an artifact to MLflow.

Parameters:
  • tag (str) – Artifact identifier.

  • value (Any) – Artifact data.

  • step (int) – Training step or epoch number.

  • artifact_path (Optional[str]) – Optional file path to the artifact. Defaults to None.

Note

Currently not implemented. Override to add custom artifact logging.

log_model(tag, value, step, artifact_path=None)[source]

Logs a model checkpoint to MLflow.

Parameters:
  • tag (str) – Model identifier.

  • value (Any) – Model data (unused).

  • step (int) – Training step or epoch number.

  • artifact_path (Optional[str]) – File path to the model checkpoint.

Note

Only logs if log_model_weights is True and artifact_path is provided.

log_image(tag, value, step, artifact_path=None)[source]

Logs an image to MLflow.

Parameters:
  • tag (str) – Image identifier/key.

  • value (Any) – Image data as a numpy array or PIL Image.

  • step (int) – Training step or epoch number.

  • artifact_path (Optional[str]) – Optional file path (unused). Defaults to None.

class deepml.tracking.WandbLogger(delete_intermediate_artifacts_versions=True, **kwargs)[source]

Bases: MLExperimentLogger

Weights & Biases (wandb) experiment logger implementation.

This logger writes experiment data to Weights & Biases, including metrics, parameters, model artifacts, and images. Supports automatic cleanup of intermediate artifact versions to avoid storage overflow.

wandb

Wandb module instance.

delete_intermediate_artifacts_versions

Whether to delete old artifact versions automatically.

Note

Requires wandb package to be installed.

__init__(delete_intermediate_artifacts_versions=True, **kwargs)[source]

Initializes the WandbLogger.

Parameters:
  • delete_intermediate_artifacts_versions (bool) – Whether to delete intermediate versions of artifacts during logging to avoid storage overflow. Defaults to True.

  • **kwargs (dict) – Keyword arguments passed to wandb.init() for initialization. Common arguments include project, entity, name, config, etc.

log_params(**kwargs)[source]

Logs hyperparameters to Weights & Biases.

Parameters:

**kwargs – Arbitrary keyword arguments containing parameters to log. These will be added to the wandb config.

log_metric(tag, value, step)[source]

Logs a scalar metric value to Weights & Biases.

Parameters:
  • tag (str) – Metric identifier (e.g., “train/loss”, “val/accuracy”).

  • value (Any) – Numeric metric value.

  • step (int) – Training step or epoch number (unused, wandb auto-increments).

log_artifact(tag, value, step, artifact_path=None)[source]

Logs an artifact to Weights & Biases.

Parameters:
  • tag (str) – Artifact identifier.

  • value (Any) – Artifact data. If a 4D torch.Tensor, can be logged as images.

  • step (int) – Training step or epoch number.

  • artifact_path (Optional[str]) – Optional file path to the artifact. Defaults to None.

Note

Image logging for tensors is currently not implemented (TODO).

log_model(tag, value, step, artifact_path=None)[source]

Logs a model checkpoint to Weights & Biases.

Creates a wandb Artifact for the model and optionally deletes older versions if delete_intermediate_artifacts_versions is True.

Parameters:
  • tag (str) – Model identifier/artifact name.

  • value (Any) – Model data (unused).

  • step (int) – Training step or epoch number (unused).

  • artifact_path (Optional[str]) – File path to the model checkpoint file.

Note

If delete_intermediate_artifacts_versions is enabled, only the latest version of the artifact is retained to save storage space.

log_image(tag, value, step, artifact_path=None)[source]

Logs an image to Weights & Biases.

Parameters:
  • tag (str) – Image identifier/key for logging.

  • value (Any) – Image data (numpy array, PIL Image, or tensor).

  • step (int) – Training step or epoch number (unused, wandb auto-increments).

  • artifact_path (Optional[str]) – Optional file path (unused). Defaults to None.

deepml.trainer module

class deepml.trainer.Learner(task, optimizer, criterion, lr_scheduler=None, lr_scheduler_step_policy='epoch', load_state=False, use_amp=False)[source]

Bases: object

Training class for learning model weights using PyTorch.

This trainer provides straightforward training functionality with support for learning rate scheduling, automatic mixed precision (AMP), gradient accumulation, and gradient clipping. It’s designed for single-device training and works well in interactive environments like Jupyter notebooks.

For multi-GPU or distributed training, consider using FabricTrainer or AcceleratorTrainer.

epochs_completed

Number of epochs completed in training.

best_val_loss

Best validation loss achieved during training.

history

Dictionary storing training history metrics across epochs.

logger

Experiment logger for tracking metrics and artifacts.

Note

This trainer is ideal for:

  • Single GPU/CPU training

  • Jupyter notebook environments

  • Simple training workflows without distributed requirements

  • Debugging and prototyping

__init__(task, optimizer, criterion, lr_scheduler=None, lr_scheduler_step_policy='epoch', load_state=False, use_amp=False)[source]

Initializes the Learner.

Parameters:
  • task (Task) – Task object defining the learning task (e.g., classification, segmentation).

  • optimizer (Optimizer) – PyTorch optimizer instance for parameter updates.

  • criterion (Module) – Loss function module.

  • lr_scheduler – Learning rate scheduler instance. Defaults to None.

  • lr_scheduler_step_policy (str) – When to call scheduler.step(). Valid options are "epoch" (step after each epoch) or "step" (step after each optimizer update). Defaults to "epoch".

  • load_state (bool) – Whether to resume model training. If True, loads optimizer state, scheduler state (if any), and training history from checkpoint. Defaults to False.

  • use_amp (bool) – Whether to use automatic mixed precision (AMP) for training. Defaults to False.

set_optimizer(optimizer)[source]

Sets the optimizer for training.

Parameters:

optimizer (Optimizer) – PyTorch optimizer instance.

Raises:

AssertionError – If optimizer is not a torch.optim.Optimizer instance.

set_criterion(criterion)[source]

Sets the loss function for training.

Parameters:

criterion (Module) – Loss function module.

Raises:

AssertionError – If criterion is not a torch.nn.Module instance.

set_lr_scheduler(lr_scheduler, lr_scheduler_step_policy='epoch')[source]

Sets the learning rate scheduler.

Parameters:
  • lr_scheduler – Learning rate scheduler instance. If None, no scheduler is used.

  • lr_scheduler_step_policy (str) – When to call scheduler.step(). Valid options are "epoch" or "step". Defaults to "epoch".

Raises:

AssertionError – If lr_scheduler_step_policy is not "epoch" or "step".
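
The difference between the two step policies can be sketched by counting how often scheduler.step() would be called. This is an illustration only: count_scheduler_steps is a hypothetical helper, and it assumes one optimizer update per inner iteration.

```python
def count_scheduler_steps(epochs, updates_per_epoch, policy="epoch"):
    """Count scheduler.step() calls under the documented step policies."""
    assert policy in ("epoch", "step"), "policy must be 'epoch' or 'step'"
    scheduler_steps = 0
    for _ in range(epochs):
        for _ in range(updates_per_epoch):
            # optimizer.step() would run here
            if policy == "step":
                scheduler_steps += 1  # step after each optimizer update
        if policy == "epoch":
            scheduler_steps += 1      # step once after each epoch
    return scheduler_steps
```

With 3 epochs of 10 updates each, the "epoch" policy yields 3 scheduler steps and the "step" policy yields 30, which matters when sizing schedulers that take a total step count.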

save(tag, save_optimizer_state=False, epoch=-1, train_loss=None, val_loss=None)[source]

Saves model checkpoint and training state.

Parameters:
  • tag (str) – Name tag for the checkpoint file (without extension).

  • save_optimizer_state (bool) – Whether to include optimizer state in the checkpoint. Defaults to False.

  • epoch (int) – Current epoch number. Defaults to -1.

  • train_loss (float) – Training loss value for this checkpoint. Defaults to None.

  • val_loss (float) – Validation loss value for this checkpoint. Defaults to None.

Returns:

Full path to the saved checkpoint file.

Return type:

str

Note

  • Automatically handles DataParallel models

  • Saves scheduler state if scheduler is configured

  • Saves AMP scaler state if AMP is enabled

  • Logs the model to the experiment logger

validate(loader, criterion, metrics=None, non_blocking=False)

Evaluates the model on the validation data.

Parameters:
  • loader (DataLoader) – DataLoader for validation data.

  • criterion (Module) – Loss function module.

  • metrics (Dict[str, Module]) – Dictionary mapping metric names to metric modules. Defaults to None.

  • non_blocking – Whether to use asynchronous CUDA transfers. Defaults to False.

Returns:

OrderedDict mapping metric names to their average values across all batches.

Raises:

Exception – If loader is None.

Note

  • Model is set to eval() mode

  • Gradients are disabled via @torch.no_grad() decorator

  • Metrics are computed as running averages
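
A running average over batches can be sketched as follows. The helper name running_average is hypothetical, and the sketch assumes every batch carries equal weight; whether deepml weights batches by size is not stated here.

```python
from collections import OrderedDict


def running_average(batch_values):
    """Incrementally average per-batch metric values, keyed by metric name."""
    averages = OrderedDict()
    counts = {}
    for batch in batch_values:
        for name, value in batch.items():
            n = counts.get(name, 0)
            # Update the mean without storing every past value.
            averages[name] = (averages.get(name, 0.0) * n + value) / (n + 1)
            counts[name] = n + 1
    return averages
```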

set_predictor(predictor)[source]
fit(train_loader, val_loader=None, epochs=10, steps_per_epoch=None, save_model_after_every_epoch=5, metrics=None, gradient_accumulation_steps=1, gradient_clip_value=0, gradient_clip_algorithm='norm', logger=None, non_blocking=True, image_inverse_transform=None, logger_img_size=None)[source]

Trains the model for the specified number of epochs.

Parameters:
  • train_loader (DataLoader) – DataLoader for training data.

  • val_loader (DataLoader) – DataLoader for validation data. Defaults to None.

  • epochs (int) – Total number of epochs to train. Defaults to 10.

  • steps_per_epoch (int) – Number of steps per epoch. Should be around len(train_loader) to ensure full dataset coverage. If None, defaults to len(train_loader). Defaults to None.

  • save_model_after_every_epoch (int) – Frequency (in epochs) to save model checkpoints. Defaults to 5.

  • metrics (Dict[str, Module]) – Dictionary mapping metric names to metric instances. Each metric must be a torch.nn.Module with a forward() method. Defaults to None.

  • gradient_accumulation_steps (int) – Number of steps to accumulate gradients before performing an optimizer step. Simulates larger batch sizes. Must be > 0. Defaults to 1.

  • gradient_clip_value (float) – Maximum value for gradient clipping. If 0, no clipping is applied. Defaults to 0.

  • gradient_clip_algorithm (str) – Gradient clipping algorithm. Defaults to "norm". Options:

    • "norm": Clip by gradient norm (recommended)

    • "value": Clip by gradient value

  • logger (MLExperimentLogger) – Experiment logger for tracking metrics and artifacts. If None, uses TensorboardLogger. Defaults to None.

  • non_blocking (bool) – Whether to use asynchronous CUDA tensor transfers. Defaults to True.

  • image_inverse_transform (Callable) – Transformation to reverse image normalization for visualization in TensorBoard. Defaults to None.

  • logger_img_size (Union[int, Tuple[int, int]]) – Image size (int or tuple) for TensorBoard logging. Defaults to None.

Raises:
  • AssertionError – If steps_per_epoch > len(train_loader).

  • AssertionError – If gradient_accumulation_steps <= 0.

  • AssertionError – If gradient_clip_algorithm not in [“norm”, “value”].

  • TypeError – If any metric is not a torch.nn.Module with a forward() method.

Note

  • Supports automatic mixed precision (AMP) if enabled in __init__

  • Automatically saves best validation model when validation improves

  • Handles DataParallel models automatically

  • Learning rate scheduler can step per epoch or per gradient update

  • For multi-GPU/distributed training, use FabricTrainer or AcceleratorTrainer
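
The two clipping algorithms and the effect of gradient accumulation can be sketched in plain Python. Both helper names are hypothetical. The "norm" branch follows the usual rescale-by-total-norm recipe, and it is an assumption that deepml delegates to the equivalent PyTorch utilities. Leftover micro-batches at the end of an epoch are ignored in this sketch.

```python
import math


def clip_gradients(grads, clip_value, algorithm="norm"):
    """Clip a flat list of gradient values by norm or by value."""
    assert algorithm in ("norm", "value")
    if clip_value == 0:
        return list(grads)  # documented behaviour: 0 disables clipping
    if algorithm == "value":
        # Clamp each element into [-clip_value, clip_value].
        return [max(-clip_value, min(clip_value, g)) for g in grads]
    # "norm": rescale all elements so the L2 norm does not exceed clip_value.
    total_norm = math.sqrt(sum(g * g for g in grads))
    scale = min(1.0, clip_value / (total_norm + 1e-6))
    return [g * scale for g in grads]


def optimizer_updates(steps_per_epoch, gradient_accumulation_steps):
    """Count optimizer updates in one epoch when gradients are accumulated."""
    assert gradient_accumulation_steps > 0
    return sum(
        1
        for step in range(steps_per_epoch)
        if (step + 1) % gradient_accumulation_steps == 0
    )
```

Clipping by norm preserves the direction of the gradient vector, while clipping by value can change it, which is one common reason to prefer "norm".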

predict(loader)[source]

Generates predictions for all data in the loader.

Parameters:

loader – DataLoader containing data for prediction.

Returns:

Tuple of (predictions, targets) where predictions are the model outputs and targets are the ground truth labels.

predict_class(loader)[source]

Generates class predictions with probabilities for all data.

Parameters:

loader – DataLoader containing data for prediction.

Returns:

Tuple of (predicted_class, probability, targets) where:

  • predicted_class: Predicted class labels

  • probability: Class probabilities or confidence scores

  • targets: Ground truth labels
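
One common way to obtain a predicted class and its probability from raw multiclass logits is softmax followed by argmax. The sketch below shows that recipe in plain Python. It is an assumption that predict_class works this way, and predict_class_from_logits is a hypothetical name.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class_from_logits(batch_logits):
    """Return (predicted_class, probability) lists for a batch of logit rows."""
    predicted, probability = [], []
    for logits in batch_logits:
        probs = softmax(logits)
        idx = max(range(len(probs)), key=probs.__getitem__)  # argmax
        predicted.append(idx)
        probability.append(probs[idx])
    return predicted, probability
```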

extract_features(loader, no_of_features, features_csv_file, iterations=1, target_known=True)[source]

Extracts features from the model and saves them to a CSV file.

Parameters:
  • loader – DataLoader containing data for feature extraction.

  • no_of_features – Number of features to extract from the model.

  • features_csv_file – Path to the output CSV file.

  • iterations – Number of passes through the loader. Defaults to 1.

  • target_known – Whether ground truth labels are available. If True, includes labels in the CSV file. Defaults to True.

Note

  • Features are extracted in evaluation mode

  • CSV format: If target_known=True: [class, feat_0, feat_1, …]

  • CSV format: If target_known=False: [feat_0, feat_1, …]
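
The column layout described in the note can be sketched with the standard csv module. The helper write_features_csv is hypothetical, and writing a header row with these column names is an assumption of this sketch.

```python
import csv
import io


def write_features_csv(fileobj, features, targets=None):
    """Write rows as [class, feat_0, ...] when targets are given, else [feat_0, ...]."""
    no_of_features = len(features[0])
    header = [f"feat_{i}" for i in range(no_of_features)]
    if targets is not None:
        header = ["class"] + header
    writer = csv.writer(fileobj)
    writer.writerow(header)
    for i, row in enumerate(features):
        prefix = [targets[i]] if targets is not None else []
        writer.writerow(prefix + list(row))


# Usage: write two feature vectors with known targets to an in-memory buffer.
buffer = io.StringIO()
write_features_csv(buffer, [[0.1, 0.2], [0.3, 0.4]], targets=[1, 0])
```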

show_predictions(loader, image_inverse_transform=None, samples=9, cols=3, figsize=(10, 10), target_known=True)[source]

Visualizes model predictions on sample images.

Parameters:
  • loader – DataLoader containing data for visualization.

  • image_inverse_transform – Transformation to reverse image normalization for display. Defaults to None.

  • samples – Number of samples to display. Defaults to 9.

  • cols – Number of columns in the visualization grid. Defaults to 3.

  • figsize – Figure size as (width, height) tuple. Defaults to (10, 10).

  • target_known – Whether ground truth targets are available for comparison. Defaults to True.

deepml.transforms module

class deepml.transforms.AlbumentationTorchTranforms(albu_transforms=None, torch_transforms=None)[source]

Bases: object

Composes albumentations augmentations with torchvision transforms such as torchvision.transforms.ToTensor(). The albumentations transforms are applied first, followed by the torch transforms, if any.

The albumentations transforms are applied to both the image and the mask, whereas the torch transforms are applied only to the input image, not to the target mask.
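
The order of application can be sketched with plain callables. ComposedTransformSketch is a hypothetical stand-in, not the class documented here. It assumes the albumentations-style calling convention (keyword arguments in, dictionary out) and a call signature of (image, mask=None).

```python
class ComposedTransformSketch:
    """Apply albumentations-style transforms first, then image-only transforms."""

    def __init__(self, albu_transforms=None, torch_transforms=None):
        self.albu_transforms = albu_transforms
        self.torch_transforms = torch_transforms

    def __call__(self, image, mask=None):
        if self.albu_transforms is not None:
            if mask is None:
                image = self.albu_transforms(image=image)["image"]
            else:
                # Spatial augmentations must hit image and mask together.
                augmented = self.albu_transforms(image=image, mask=mask)
                image, mask = augmented["image"], augmented["mask"]
        if self.torch_transforms is not None:
            # Tensor conversion and normalization touch the image only.
            image = self.torch_transforms(image)
        return image, mask
```

Keeping the mask out of the second stage avoids rescaling or normalizing class indices, which would corrupt the segmentation target.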

__init__(albu_transforms=None, torch_transforms=None)[source]
class deepml.transforms.ImageInverseTransform(mean, std)[source]

Bases: object

Inverse transform for images using the given mean and standard deviation. Accepts image_batch in (B, C, H, W) order.
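
Undoing a per-channel normalization amounts to computing x * std + mean for each channel. The sketch below shows that arithmetic on nested lists in (B, C, H, W) order. The helper names are hypothetical, and the constants are the commonly used ImageNet statistics; that ImageNetInverseTransform uses exactly these values is an assumption.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize(value, mean, std):
    """Forward normalization of a single pixel value."""
    return (value - mean) / std


def inverse_normalize_batch(batch, mean, std):
    """Undo per-channel normalization on a (B, C, H, W) nested list."""
    return [
        [
            [[px * std[c] + mean[c] for px in row] for row in channel]
            for c, channel in enumerate(image)
        ]
        for image in batch
    ]
```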

__init__(mean, std)[source]
class deepml.transforms.ImageNetInverseTransform[source]

Bases: ImageInverseTransform

ImageNet inverse transform. Accepts image_batch in (B, C, H, W) order.

__init__()[source]
class deepml.transforms.DivideBy255[source]

Bases: object

Divides the input by 255.

class deepml.transforms.MulticlassSegmentationTargetTransform(num_classes)[source]

Bases: object

Converts categorical class index tensor into one-hot vector required for multiclass segmentation.
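
The conversion can be sketched on a small mask of class indices. The helper one_hot_mask is hypothetical, and the channel-first (C, H, W) output layout is an assumption of this sketch.

```python
def one_hot_mask(index_mask, num_classes):
    """Turn an H x W grid of class ids into a C x H x W grid of 0/1 values."""
    return [
        [[1 if px == c else 0 for px in row] for row in index_mask]
        for c in range(num_classes)
    ]
```

Each output channel marks the pixels that belong to one class, so every pixel position has exactly one channel set to 1.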

__init__(num_classes)[source]

deepml.utils module

deepml.utils.create_text_image(text, img_size=(224, 224), text_color='black')[source]
deepml.utils.transform_target(target, classes=None)[source]

Accepts a target value that is either a single-dimensional torch.Tensor or a scalar (int or float).

Parameters:
  • target

  • classes

deepml.utils.transform_input(x, image_inverse_transform=None)[source]

Accepts an input image batch in (B, C, H, W) form.

Parameters:
  • x – input image batch

  • image_inverse_transform – an optional inverse transform to apply

Returns:

deepml.utils.get_random_samples_batch_from_loader(loader, samples=None)[source]
deepml.utils.get_random_samples_batch_from_dataset(dataset, samples=8)[source]

Returns a random batch of samples from the dataset.

Parameters:
  • dataset – torch.utils.data.Dataset or any iterable dataset.

  • samples – Number of samples to return. Defaults to 8.

Returns:

List of samples from the dataset.

Return type:

list

deepml.utils.blend(image, mask, alpha=0.6, beta=0.4)[source]

Blends an input image with a mask using the specified alpha and beta values.

Parameters:
  • image (Tensor) – Grayscale or RGB image to blend with the mask. Documented as a torch.Tensor of size BCHW, and also as size HWC or HW.

  • mask (Tensor) – Mask to blend with the input image. Documented as a torch.Tensor of size BCHW, and also as size HWC or HW.

  • alpha (float) – Alpha blending factor for the RGB image.

  • beta (float) – Beta blending factor for the mask.

Returns:

Blended image as a torch.Tensor of the original size.

Return type:

array
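
A weighted sum is the usual way to blend an image with a mask, and the sketch below shows it on a small H x W grid. It is an assumption that blend computes alpha * image + beta * mask without further clamping, and blend_pixels is a hypothetical helper.

```python
def blend_pixels(image, mask, alpha=0.6, beta=0.4):
    """Blend two equally sized H x W grids as alpha * image + beta * mask."""
    return [
        [alpha * i + beta * m for i, m in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]
```

With the default factors summing to 1, the blended values stay within the range of the inputs.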

deepml.visualize module

deepml.visualize.plot_images(images, labels=None, cols=4, figsize=(10, 10), fontsize=14)[source]

Displays a grid of images with optional labels using matplotlib.

Creates a multi-panel figure showing images in a grid layout with optional titles for each image.

Parameters:
  • images (List[ndarray]) – List of images as numpy arrays in HWC or HW format.

  • labels (List[str]) – List of labels/titles for each image. If provided, must have the same length as images. Defaults to None.

  • cols (int) – Number of columns in the grid. Rows are calculated automatically. Defaults to 4.

  • figsize (Tuple[int, int]) – Size of the matplotlib figure as (width, height) tuple. Defaults to (10, 10).

  • fontsize (int) – Font size for image titles. Defaults to 14.

Note

The function automatically calculates the number of rows needed based on the number of images and columns. Axes ticks are hidden for cleaner visualization.
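
The row count follows from the number of images and columns by rounding up. A minimal sketch, with grid_shape as a hypothetical helper:

```python
import math


def grid_shape(num_images, cols=4):
    """Return (rows, cols) so that rows * cols can hold num_images panels."""
    rows = math.ceil(num_images / cols)
    return rows, cols
```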

deepml.visualize.plot_images_with_title(image_generator, samples, cols=4, figsize=(10, 10), fontsize=14)[source]

Displays a grid of images with colored titles using matplotlib.

Creates a multi-panel figure showing images in a grid layout with titles that can have custom colors (useful for showing correct/incorrect predictions).

Parameters:
  • image_generator

    Generator or iterable yielding tuples of (image, title, title_color) where:

    • image: numpy array in HWC or HW format

    • title: String title for the image

    • title_color: Optional color string (e.g., ‘red’, ‘green’, ‘#ff0000’). If None, uses default matplotlib text color.

  • samples (int) – Total number of images to display from the generator.

  • cols – Number of columns in the grid. Defaults to 4.

  • figsize – Size of the matplotlib figure as (width, height) tuple. Defaults to (10, 10).

  • fontsize – Font size for image titles. Defaults to 14.

Note

This function is commonly used for showing model predictions where title colors indicate correctness (green for correct, red for incorrect).

deepml.visualize.plot_images_with_bboxes(image_generator, samples, cols=4, figsize=(10, 10), fontsize=14, classes=None, class_color_map=None, cmap='tab10')[source]

Displays a grid of images with bounding boxes and class labels.

Creates a multi-panel figure showing images with drawn bounding boxes and labeled class names. Each bounding box is colored based on its class.

Parameters:
  • image_generator

    Generator or iterable yielding tuples of (image, title, bboxes) where:

    • image: numpy array in HWC or HW format

    • title: String title for the image

    • bboxes: List of bounding boxes, each as [class_id, xmin, ymin, width, height] where class_id can be an integer index or string label.

  • samples (int) – Total number of images to display from the generator.

  • cols – Number of columns in the grid. Defaults to 4.

  • figsize – Size of the matplotlib figure as (width, height) tuple. Defaults to (10, 10).

  • fontsize – Font size for image titles and bbox labels. Defaults to 14.

  • classes (List[str]) – Optional list mapping class indices to class names. If provided and class_id is an integer, uses classes[class_id] as the label. Defaults to None.

  • class_color_map (dict) – Optional dictionary mapping class IDs or names to color strings (e.g., ‘#ff0000’, ‘red’). If a class has no mapping, falls back to the colormap. Defaults to None.

  • cmap (str) – Matplotlib colormap name used as fallback for bbox colors when class_color_map doesn’t provide a color. Defaults to “tab10”.

Note

Bounding boxes are drawn with red edges and labeled with a colored background box containing the class name. Label text is white for better visibility against the colored background.
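
The bounding box format and the label and color lookups described above can be sketched as follows. The helper resolve_bbox and its fallback_palette argument are hypothetical; the palette stands in for the colormap fallback and is not how the library picks colors.

```python
def resolve_bbox(bbox, classes=None, class_color_map=None,
                 fallback_palette=("tab:blue", "tab:orange", "tab:green")):
    """Split a [class_id, xmin, ymin, width, height] box into label, color, corners."""
    class_id, xmin, ymin, width, height = bbox
    # Integer ids index into `classes`; anything else is used as the label itself.
    if classes is not None and isinstance(class_id, int):
        label = classes[class_id]
    else:
        label = str(class_id)
    color = None
    if class_color_map:
        # Accept mappings keyed by class id or by class name.
        color = class_color_map.get(class_id, class_color_map.get(label))
    if color is None:
        index = class_id if isinstance(class_id, int) else sum(map(ord, label))
        color = fallback_palette[index % len(fallback_palette)]
    corners = (xmin, ymin, xmin + width, ymin + height)
    return {"label": label, "color": color, "corners": corners}
```

Note that the box stores width and height rather than a second corner, so datasets in (xmin, ymin, xmax, ymax) form need converting before plotting.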

deepml.visualize.show_images_from_loader(loader, image_inverse_transform=None, samples=9, cols=3, figsize=(5, 5), classes=None, title_color=None)[source]

Displays random samples of images from a DataLoader.

Randomly selects and displays images from a PyTorch DataLoader with their corresponding labels as titles.

Parameters:
  • loader – PyTorch DataLoader returning batches of (image, label) tensors.

  • image_inverse_transform – Optional callable to reverse image normalization or transformations before display (e.g., denormalization). Defaults to None.

  • samples – Number of images to display. Defaults to 9.

  • cols – Number of columns in the grid. Defaults to 3.

  • figsize – Size of the matplotlib figure as (width, height) tuple. Defaults to (5, 5).

  • classes – Optional list of class names for converting label indices to text. If None and loader.dataset has a ‘classes’ attribute, uses that. Defaults to None.

  • title_color – Optional color string for all image titles (e.g., ‘blue’). Defaults to None.

Note

Images are randomly sampled from the DataLoader. If the DataLoader’s dataset has a ‘classes’ attribute, it will be used automatically for label names unless overridden by the classes parameter.

deepml.visualize.show_images_from_dataset(dataset, image_inverse_transform=None, samples=9, cols=3, figsize=(10, 10), classes=None, title_color=None)[source]

Displays random samples of images from a Dataset.

Randomly selects and displays images from a PyTorch Dataset with their corresponding labels as titles.

Parameters:
  • dataset – PyTorch Dataset returning (image, label) tuples.

  • image_inverse_transform – Optional callable to reverse image normalization or transformations before display (e.g., denormalization). Defaults to None.

  • samples – Number of images to display. Defaults to 9.

  • cols – Number of columns in the grid. Defaults to 3.

  • figsize – Size of the matplotlib figure as (width, height) tuple. Defaults to (10, 10).

  • classes – Optional list of class names for converting label indices to text. If None and dataset has a ‘classes’ attribute, uses that. Defaults to None.

  • title_color – Optional color string for all image titles (e.g., ‘blue’). Defaults to None.

Note

Images are randomly sampled from the Dataset. If the Dataset has a ‘classes’ attribute, it will be used automatically for label names unless overridden by the classes parameter.

deepml.visualize.show_images_from_folder(img_dir, images=None, open_file_func=None, samples=9, cols=3, figsize=(10, 10), title_color=None)[source]

Displays random samples of images from a folder.

Randomly selects and displays images from a directory with filenames as titles.

Parameters:
  • img_dir – Directory path containing image files.

  • images – Optional list of image filenames to display. If None, all files in img_dir are used and randomly sampled. Defaults to None.

  • open_file_func (Callable) – Optional callable to open image files. Should accept a file path and return an image object. If None, uses PIL.Image.open. Defaults to None.

  • samples – Number of images to display. If fewer images exist, displays all. Defaults to 9.

  • cols – Number of columns in the grid. Defaults to 3.

  • figsize – Size of the matplotlib figure as (width, height) tuple. Defaults to (10, 10).

  • title_color – Optional color string for all image titles (e.g., ‘blue’). Defaults to None.

Note

If the number of requested samples exceeds available images, all images are displayed. Images are randomly sampled without replacement.
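
Sampling without replacement, capped at the number of available files, can be sketched with the standard random module. The helper pick_samples and its seed argument are hypothetical.

```python
import random


def pick_samples(filenames, samples=9, seed=None):
    """Pick up to `samples` distinct filenames at random."""
    rng = random.Random(seed)
    k = min(samples, len(filenames))  # fewer files than requested: take all
    return rng.sample(list(filenames), k)
```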

deepml.visualize.show_images_from_dataframe(dataframe, img_dir=None, image_file_name_column='image', image_filepath_column=None, open_file_func=None, label_column=None, bbox_label_column=None, samples=9, cols=3, figsize=(10, 10), classes=None, class_color_map=None, cmap='tab10')[source]

Displays random samples of images from a pandas DataFrame.

Randomly selects and displays images specified in a DataFrame, with optional labels and bounding boxes.

Parameters:
  • dataframe – pandas DataFrame containing image file information.

  • img_dir – Directory containing images. Required if image_filepath_column is not provided. Defaults to None.

  • image_file_name_column – Column name containing image filenames (used with img_dir). Defaults to “image”.

  • image_filepath_column – Column name containing absolute image file paths. If provided, takes precedence over image_file_name_column and img_dir. Defaults to None.

  • open_file_func (Callable) – Optional callable to open image files. Should accept a file path and return an image object. If None, uses PIL.Image.open. Defaults to None.

  • label_column (str) – Column name containing image labels. If None, displays row indices instead. Defaults to None.

  • bbox_label_column (str) – Column name containing bounding box data. Each entry should be a list of bounding boxes in format [class_id, xmin, ymin, width, height]. If provided, displays images with bounding boxes. Defaults to None.

  • samples – Number of random images to display from the DataFrame. Defaults to 9.

  • cols – Number of columns in the grid. Defaults to 3.

  • figsize – Size of the matplotlib figure as (width, height) tuple. Defaults to (10, 10).

  • classes – Optional list mapping class indices to class names for bbox labels. Defaults to None.

  • class_color_map (dict) – Optional dictionary mapping class IDs or names to color strings (e.g., ‘#ff0000’, ‘red’). Used for bbox colors. Defaults to None.

  • cmap (str) – Matplotlib colormap name used as fallback for bbox colors when class_color_map doesn’t provide a color. Defaults to “tab10”.

Note

  • If bbox_label_column is provided, displays images with bounding boxes using plot_images_with_bboxes.

  • Otherwise, displays images with titles using plot_images_with_title.

  • Images are randomly sampled from the DataFrame.
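
The documented precedence between image_filepath_column and the img_dir plus image_file_name_column pair can be sketched on a single row represented as a dictionary. The helper resolve_image_path is hypothetical, and raising ValueError when img_dir is missing is an assumption of this sketch.

```python
import os


def resolve_image_path(row, img_dir=None, image_file_name_column="image",
                       image_filepath_column=None):
    """Return the file path for one row, preferring the absolute path column."""
    if image_filepath_column is not None:
        return row[image_filepath_column]
    if img_dir is None:
        raise ValueError("img_dir is required when image_filepath_column is not provided")
    return os.path.join(img_dir, row[image_file_name_column])
```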

Module contents