deepml package
Subpackages
- deepml.geospatial package
- deepml.metrics package
- deepml.model_arch package
- Submodules
- deepml.model_arch.dlinknet module
- deepml.model_arch.refine_net module
- deepml.model_arch.unet module
- Module contents
Submodules
deepml.accelerator_trainer module
- class deepml.accelerator_trainer.AcceleratorTrainer(task, optimizer, criterion, lr_scheduler=None, lr_scheduler_step_policy='epoch', accelerator_config=None)[source]
Bases: BaseLearner
Training class using HuggingFace Accelerate for distributed training.
This trainer leverages the Accelerate library for seamless distributed training, mixed precision, and device management across CPUs, GPUs, and TPUs. It supports gradient accumulation, gradient clipping, and automatic model/optimizer preparation.
- __init__(task, optimizer, criterion, lr_scheduler=None, lr_scheduler_step_policy='epoch', accelerator_config=None)[source]
Initializes the AcceleratorTrainer.
- Parameters:
task (Task) – Task object defining the learning task (e.g., classification, segmentation).
optimizer (Optimizer) – PyTorch optimizer instance for parameter updates.
criterion (Module) – Loss function module.
lr_scheduler (Optional[_LRScheduler]) – Learning rate scheduler instance. Defaults to None.
lr_scheduler_step_policy (str) – When to call scheduler.step(). Valid options are "epoch" (step after each epoch) or "step" (step after each optimizer update). Defaults to "epoch".
accelerator_config (Optional[dict]) – Optional dictionary of keyword arguments passed to accelerate.Accelerator() for configuration. Defaults to None (uses Accelerate defaults). Common options include:
- gradient_accumulation_steps: Number of steps to accumulate gradients
- mixed_precision: Mixed precision mode (“no”, “fp16”, “bf16”)
- device_placement: Whether to automatically place tensors on device
- split_batches: Whether to split batches across devices
Note
Unlike FabricTrainer, this class accepts an lr_scheduler instance directly rather than a factory function (lr_scheduler_fn).
- fit(train_loader, val_loader=None, epochs=10, save_model_after_every_epoch=5, metrics=None, gradient_clip_value=None, gradient_clip_max_norm=None, resume_from_checkpoint=None, load_optimizer_state=False, load_scheduler_state=False, logger=None, non_blocking=True, image_inverse_transform=None, logger_img_size=None)[source]
Trains the model for the specified number of epochs using Accelerate.
Handles the complete training workflow including model preparation, distributed training coordination, checkpointing, validation, and metric logging.
- Parameters:
train_loader (DataLoader) – DataLoader for training data.
val_loader (DataLoader) – DataLoader for validation data. Defaults to None.
epochs (int) – Total number of epochs to train. Defaults to 10.
save_model_after_every_epoch (int) – Frequency (in epochs) to save model checkpoints. Defaults to 5.
metrics (Dict[str,Module]) – Dictionary mapping metric names to metric instances. Each metric must be a torch.nn.Module with a forward() method. Defaults to None.
gradient_clip_value (Optional[float]) – Maximum absolute value for gradient clipping. Gradients will be clipped to [-gradient_clip_value, gradient_clip_value]. Mutually exclusive with gradient_clip_max_norm. Defaults to None.
gradient_clip_max_norm (Optional[float]) – Maximum L2 norm for gradient clipping. Mutually exclusive with gradient_clip_value. Defaults to None.
resume_from_checkpoint (str) – Path to checkpoint file to resume training from. Defaults to None.
load_optimizer_state (bool) – Whether to load optimizer state from checkpoint. Defaults to False.
load_scheduler_state (bool) – Whether to load learning rate scheduler state from checkpoint. Defaults to False.
logger (MLExperimentLogger) – Experiment logger for tracking metrics and artifacts. If None, uses TensorboardLogger. Defaults to None.
non_blocking (bool) – Whether to use asynchronous CUDA tensor transfers. Defaults to True.
image_inverse_transform (Callable) – Transformation to reverse image normalization for visualization in TensorBoard. Defaults to None.
logger_img_size (Union[int,Tuple[int,int]]) – Image size (int or tuple) for TensorBoard logging. Defaults to None.
- Returns:
Dictionary containing training history with metric names as keys and lists of values as entries.
- Raises:
ValueError – If both gradient_clip_value and gradient_clip_max_norm are provided.
TypeError – If any metric is not a torch.nn.Module with a forward() method.
Note
- The model, optimizer, scheduler, and dataloaders are all prepared by Accelerate.
- Only the main process saves checkpoints and manages logging.
- All processes synchronize at the end of each epoch using wait_for_everyone().
- The model is automatically unwrapped when saving the best validation checkpoint.
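The two clipping modes accepted by fit() bound different things: gradient_clip_value limits each gradient entry on its own, while gradient_clip_max_norm limits the L2 norm of the gradient as a whole. The sketch below illustrates that difference on a plain list of numbers. It is a conceptual illustration only, not the trainer's code; the helper names clip_by_value and clip_by_norm are hypothetical. In PyTorch the corresponding utilities are torch.nn.utils.clip_grad_value_ and torch.nn.utils.clip_grad_norm_.

```python
import math


def clip_by_value(grads, clip_value):
    """Clamp every gradient entry to [-clip_value, clip_value]."""
    return [max(-clip_value, min(clip_value, g)) for g in grads]


def clip_by_norm(grads, max_norm):
    """Rescale the whole gradient vector so its L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads)
    scale = max_norm / total_norm
    return [g * scale for g in grads]


grads = [3.0, -4.0]                   # L2 norm is 5.0
by_value = clip_by_value(grads, 1.0)  # each entry clamped independently
by_norm = clip_by_norm(grads, 1.0)    # direction preserved, length scaled down
```

Value clipping can change the direction of the gradient, whereas norm clipping only shortens it, which is one reason the two options are treated as mutually exclusive.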
- fit_temp(train_loader, val_loader, epochs=10, metrics={})[source]
Temporary/experimental training method with simplified Accelerate workflow.
Warning: This method appears to be legacy/debug code and should not be used in production. Use the fit() method instead.
- Parameters:
train_loader – DataLoader for training data.
val_loader – DataLoader for validation data.
epochs – Number of epochs to train. Defaults to 10.
metrics (dict) – Dictionary mapping metric names to metric functions. Defaults to {}.
Note
- This method has several issues compared to the main fit() method:
  - References self.model instead of self._model
  - Hardcoded checkpoint paths
  - Missing checkpoint management features
  - Uses deprecated gather_for_metrics() instead of gather()
- This should likely be removed or refactored to align with fit().
- Deprecated:
Use the fit() method instead for production training.
deepml.base module
- class deepml.base.BaseLearner(task, optimizer, criterion, lr_scheduler=None, lr_scheduler_fn=None, lr_scheduler_step_policy='epoch')[source]
Bases: ABC
- __init__(task, optimizer, criterion, lr_scheduler=None, lr_scheduler_fn=None, lr_scheduler_step_policy='epoch')[source]
- create_state_dict(model, optimizer, criterion, lr_scheduler=None, epoch=-1, train_loss=inf, val_loss=inf)[source]
- save(tag, model, optimizer, criterion, lr_scheduler=None, epoch=-1, train_loss=inf, val_loss=inf, **kwargs)[source]
- static update_metrics_with_simple_moving_average(source_metrics_dict, target_metrics_dict, step)[source]
deepml.constants module
deepml.datasets module
- class deepml.datasets.ImageRowDataFrameDataset(dataframe, target_column=None, image_size=(28, 28), transform=None)[source]
Bases: Dataset
Dataset for reading images stored as flattened arrays in DataFrame rows.
This dataset treats each row of a DataFrame as a flattened image array, which is then reshaped to the specified image dimensions.
- dataframe
DataFrame containing flattened image data (without target column).
- target_column
Series containing target labels, if provided.
- samples
Number of samples in the dataset.
- image_size
Tuple specifying the output image dimensions (height, width).
- transform
Optional transformation callable to apply to images.
- __init__(dataframe, target_column=None, image_size=(28, 28), transform=None)[source]
Initializes the ImageRowDataFrameDataset.
- Parameters:
dataframe (DataFrame) – DataFrame where each row contains a flattened image array.
target_column (str) – Name of the column containing target labels. If provided, this column is extracted and removed from the DataFrame. Defaults to None.
image_size (Tuple[int,int]) – Dimensions to reshape each image to as (height, width). Defaults to (28, 28).
transform (Callable) – Optional callable to transform images (e.g., torchvision transforms). Defaults to None.
Note
The DataFrame is reset with a fresh index, and the target column (if specified) is removed from the image data.
- __getitem__(index)[source]
Retrieves an image and its label at the specified index.
- Parameters:
index (int) – Index of the sample to retrieve.
- Returns:
Tuple of (image, label) where
- image: Transformed PIL Image or tensor of shape specified by image_size
- label: Target label if target_column was provided, otherwise 0
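The reshaping step described for this dataset can be sketched in plain Python. The helper reshape_row below is hypothetical and assumes row-major pixel order, which the documentation does not state; it shows only how a flat row of height * width values maps onto an image of size (height, width).

```python
def reshape_row(flat_pixels, image_size):
    """Reshape a flat pixel sequence into a list of rows (row-major order assumed)."""
    height, width = image_size
    if len(flat_pixels) != height * width:
        raise ValueError(
            f"expected {height * width} values, got {len(flat_pixels)}"
        )
    return [list(flat_pixels[r * width:(r + 1) * width]) for r in range(height)]


row = [0, 1, 2, 3, 4, 5]          # one DataFrame row holding a 2x3 image
image = reshape_row(row, (2, 3))  # two rows of three pixels each
```

With the default image_size of (28, 28), each row would therefore need 784 pixel values once the target column has been removed.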
- class deepml.datasets.ImageDataFrameDataset(dataframe, image_file_name_column='image', target_columns=None, image_dir=None, transforms=None, target_transform=None, open_file_func=None)[source]
Bases: Dataset
Dataset for reading images from file paths specified in a DataFrame.
This dataset loads images from disk based on file paths listed in a DataFrame, making it suitable for image classification and regression tasks.
- dataframe
DataFrame containing image file paths and optional target columns.
- image_file_name_column
Name of the column containing image filenames.
- target_columns
Column name(s) containing target values.
- image_dir
Base directory containing images.
- transforms
Transformation callable to apply to images.
- samples
Number of samples in the dataset.
- target_transform
Transformation callable to apply to targets.
- open_file_func
Custom function for opening image files.
- __init__(dataframe, image_file_name_column='image', target_columns=None, image_dir=None, transforms=None, target_transform=None, open_file_func=None)[source]
Initializes the ImageDataFrameDataset.
- Parameters:
dataframe (DataFrame) – DataFrame containing image file paths and optional targets.
image_file_name_column (str) – Name of the column containing image filenames. Defaults to “image”.
target_columns (Union[int,List[str]]) – Column name(s) containing target values. Can be a single column name (str) or list of column names for multi-target tasks. If None, no targets are loaded. Defaults to None.
image_dir (str) – Base directory containing images. If provided, filenames from the DataFrame are joined with this directory. Defaults to None.
transforms (Callable) – Optional callable to transform images (e.g., torchvision transforms). Defaults to None.
target_transform (Callable) – Optional callable to transform target values. Defaults to None.
open_file_func (Callable) – Custom callable to open image files. Should accept a file path and return an image object. If None, uses PIL.Image.open. Defaults to None.
Note
The DataFrame is reset with a fresh index to ensure consistent indexing.
- __len__()[source]
Returns the total number of samples in the dataset.
- Return type:
int
- Returns:
Number of samples.
- __getitem__(index)[source]
Retrieves an image and its target at the specified index.
- Parameters:
index (int) – Index of the sample to retrieve.
- Returns:
Tuple of (image, target) where
- image: Transformed image as PIL Image or tensor
- target: Target value(s) as tensor if target_columns was provided, otherwise 0
Note
If image_dir is provided, the image path is constructed by joining image_dir with the filename from the DataFrame.
- class deepml.datasets.ImageListDataset(image_dir, transforms=None, open_file_func=None)[source]
Bases: Dataset
Dataset for loading all images from a directory.
This dataset reads all files from a specified directory and treats them as images. It returns both the image and its filename, making it useful for inference or unlabeled image processing tasks.
- image_dir
Directory path containing image files.
- images
List of image filenames in the directory.
- transforms
Optional transformation callable to apply to images.
- open_file_func
Custom function for opening image files.
- __init__(image_dir, transforms=None, open_file_func=None)[source]
Initializes the ImageListDataset.
- Parameters:
image_dir (str) – Directory path containing image files.
transforms (Callable) – Optional callable to transform images (e.g., torchvision transforms). Defaults to None.
open_file_func (Callable) – Custom callable to open image files. Should accept a file path and return an image object. If None, uses PIL.Image.open. Defaults to None.
Note
All files in the directory are assumed to be images. No filtering is applied.
- class deepml.datasets.SegmentationDataFrameDataset(dataframe, image_dir, mask_dir=None, image_col='image', mask_col=None, albu_torch_transforms=None, target_transform=None, train=True, open_file_func=None)[source]
Bases: Dataset
Dataset for semantic segmentation with images and corresponding masks.
This dataset loads images and their corresponding segmentation masks from directories specified in a DataFrame. It supports both training mode (with masks) and inference mode (without masks).
- dataframe
DataFrame containing image and mask file information.
- image_dir
Directory containing input images.
- mask_dir
Directory containing segmentation masks (required for training).
- image_col
Column name for image filenames.
- mask_col
Column name for mask filenames.
- albu_torch_transforms
Albumentations transforms for augmentation.
- target_transform
Additional transforms for masks only.
- samples
Number of samples in the dataset.
- train
Whether the dataset is in training mode.
- open_file_func
Custom function for opening image files.
Note
Image and mask files should have the same name unless mask_col specifies a different column. The open_file_func should accept an image_file_path and return a numpy array or PIL Image.
- __init__(dataframe, image_dir, mask_dir=None, image_col='image', mask_col=None, albu_torch_transforms=None, target_transform=None, train=True, open_file_func=None)[source]
Initializes the SegmentationDataFrameDataset.
- Parameters:
dataframe (DataFrame) – DataFrame containing image and mask file information.
image_dir (str) – Directory path containing input images.
mask_dir (str) – Directory path containing segmentation masks. Required when train=True. Defaults to None.
image_col (str) – Name of the DataFrame column containing image filenames. Defaults to “image”.
mask_col (str) – Name of the DataFrame column containing mask filenames. If None, uses the same filenames as image_col. Defaults to None.
albu_torch_transforms (Callable) – Albumentations transforms to apply to both image and mask. Should return a dictionary with “image” and “mask” keys. Defaults to None.
target_transform (Callable) – Additional transform to apply only to the mask after albumentations transforms. Defaults to None.
train (bool) – Whether the dataset is in training mode. If True, loads and returns masks. If False, returns filenames instead of masks. Defaults to True.
open_file_func (Callable) – Custom callable to open image/mask files. Should accept a file path and return a numpy array. If None, uses PIL.Image.open with conversion to numpy array. Defaults to None.
- Raises:
AssertionError – If train=True and mask_dir is None.
Note
- The DataFrame is reset with a fresh index for consistent indexing.
- In training mode, returns (image, mask) tuples.
- In inference mode, returns (image, filename) tuples.
- __len__()[source]
Returns the total number of samples in the dataset.
- Return type:
int
- Returns:
Number of samples.
- __getitem__(index)[source]
Retrieves an image and its mask (or filename) at the specified index.
- Parameters:
index (int) – Index of the sample to retrieve.
- Returns:
Tuple of (image, target) where
- image: Transformed image tensor from albumentations
- target: If train=True, transformed mask tensor. If train=False, string filename of the image.
Note
- In training mode, applies albumentations transforms to both image and mask.
- In inference mode, applies albumentations transforms only to the image.
- The additional target_transform is applied to the mask if provided (training only).
deepml.fabric_trainer module
- class deepml.fabric_trainer.FabricTrainer(task, optimizer, criterion, lr_scheduler_fn=None, lr_scheduler_step_policy='epoch', accelerator='auto', strategy='auto', devices='auto', precision='32-true', num_nodes=1, fabric_plugins=None)[source]
Bases: BaseLearner
Training class for learning model weights using Lightning Fabric.
This trainer leverages Lightning Fabric for distributed training, mixed precision, and hardware acceleration while maintaining a simple PyTorch-like interface.
It supports features like gradient accumulation, gradient clipping, learning rate scheduling, checkpointing, and logging with experiment tracking integration. The trainer is designed to be flexible and extensible for various types of learning tasks defined by the Task abstraction.
- __init__(task, optimizer, criterion, lr_scheduler_fn=None, lr_scheduler_step_policy='epoch', accelerator='auto', strategy='auto', devices='auto', precision='32-true', num_nodes=1, fabric_plugins=None)[source]
Initializes the FabricTrainer.
- Parameters:
task (Task) – Task object defining the learning task (e.g., classification, segmentation).
optimizer (Optimizer) – PyTorch optimizer instance for parameter updates.
criterion (Module) – Loss function module.
lr_scheduler_fn (Optional[Callable[[Optimizer],_LRScheduler]]) – Factory function that creates a learning rate scheduler. Should accept an optimizer and return a scheduler instance. Example: lambda optimizer: StepLR(optimizer, step_size=5, gamma=0.5). Defaults to None.
lr_scheduler_step_policy (str) – When to call scheduler.step(). Valid options are "epoch" (step after each epoch) or "step" (step after each gradient update). Defaults to "epoch".
accelerator (Union[str,int]) – Hardware accelerator to use. Options: "cpu", "cuda", "mps", "gpu", "tpu", or "auto". Defaults to "auto".
strategy (Union[str,int]) – Distributed training strategy. Options: "dp", "ddp", "fsdp", "deepspeed", "ddp_spawn", or "auto". Defaults to "auto".
devices (Union[str,int]) – Number or list of devices to use. Can be int, str, or "auto". Defaults to "auto".
precision (str) – Training precision. Options: "16-mixed", "32-true", "64-true", "bf16-mixed", "bf16-true", or "auto". Defaults to "32-true".
num_nodes (int) – Number of nodes for multi-node distributed training. Defaults to 1.
fabric_plugins (Optional) – Optional Fabric plugins for custom behaviors (e.g., DeepSpeedPlugin, BitsandbytesPrecision). Defaults to None.
Example
>>> from lightning_fabric.plugins import BitsandbytesPrecision
>>> plugin = BitsandbytesPrecision(mode="int8")
>>> trainer = FabricTrainer(
...     task=task,
...     optimizer=optimizer,
...     criterion=criterion,
...     fabric_plugins=plugin
... )
- fit(train_loader, val_loader=None, epochs=10, save_model_after_every_epoch=5, metrics=None, gradient_accumulation_steps=1, gradient_clip_value=None, gradient_clip_max_norm=None, resume_from_checkpoint=None, load_optimizer_state=False, load_scheduler_state=False, logger=None, non_blocking=True, image_inverse_transform=None, logger_img_size=None)[source]
Trains the model for the specified number of epochs.
This method launches distributed training using Lightning Fabric and handles checkpointing, logging, and training history management.
- Parameters:
train_loader (DataLoader) – DataLoader for training data.
val_loader (DataLoader) – DataLoader for validation data. Defaults to None.
epochs (int) – Total number of epochs to train. Defaults to 10.
save_model_after_every_epoch (int) – Frequency (in epochs) to save model checkpoints. Defaults to 5.
metrics (Dict[str,Module]) – Dictionary mapping metric names to metric instances. Each metric must be a torch.nn.Module with a forward() method. Defaults to None.
gradient_accumulation_steps (int) – Number of steps to accumulate gradients before performing an optimizer step. Simulates larger batch sizes. Defaults to 1.
gradient_clip_value (Optional[float]) – Maximum absolute value for gradient clipping. Gradients will be clipped to [-gradient_clip_value, gradient_clip_value]. Defaults to None (no clipping).
gradient_clip_max_norm (Optional[float]) – Maximum L2 norm for gradient clipping. Defaults to None (no clipping).
resume_from_checkpoint (str) – Path to checkpoint file to resume training from. Defaults to None.
load_optimizer_state (bool) – Whether to load optimizer state from checkpoint. Defaults to False.
load_scheduler_state (bool) – Whether to load learning rate scheduler state from checkpoint. Defaults to False.
logger (MLExperimentLogger) – Experiment logger for tracking metrics and artifacts. If None, uses TensorboardLogger. Defaults to None.
non_blocking (bool) – Whether to use asynchronous CUDA tensor transfers. Defaults to True.
image_inverse_transform (Callable) – Transformation to reverse image normalization for visualization in TensorBoard. Defaults to None.
logger_img_size (Union[int,Tuple[int,int]]) – Image size (int or tuple) for TensorBoard logging. Defaults to None.
Note
After training completes, the latest model checkpoint is automatically loaded into the trainer’s model and optimizer.
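The gradient_accumulation_steps parameter is easier to reason about with a small simulation. The sketch below is a hypothetical, framework-free illustration: accumulate_and_step is not part of deepml, and dividing each micro-batch gradient by the number of accumulation steps is an assumption about how the averaging is done (it is the common convention, but the trainer's exact scaling is not documented here).

```python
def accumulate_and_step(micro_batch_grads, accumulation_steps):
    """Return the gradient that would be applied at each optimizer step.

    Each micro-batch gradient is divided by accumulation_steps before it is
    added to the buffer, so the applied gradient is the mean over the group.
    Trailing micro-batches that do not fill a group are ignored in this sketch.
    """
    applied = []
    buffer = 0.0
    for i, grad in enumerate(micro_batch_grads, start=1):
        buffer += grad / accumulation_steps
        if i % accumulation_steps == 0:
            applied.append(buffer)  # an optimizer step would happen here
            buffer = 0.0            # gradients are reset after the step
    return applied


# Four micro-batches with accumulation over two steps give two optimizer steps.
steps = accumulate_and_step([1.0, 3.0, 2.0, 6.0], accumulation_steps=2)
```

In this picture, the number of optimizer updates per epoch is roughly len(train_loader) // gradient_accumulation_steps, which matters when a scheduler is stepped per update.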
- predict(loader)[source]
Generates predictions for the given data loader.
- Parameters:
loader – DataLoader containing data for prediction.
- Returns:
Tuple of (predictions, targets) where predictions are the model outputs and targets are the ground truth labels.
- predict_class(loader)[source]
Generates class predictions with probabilities for the given data loader.
- Parameters:
loader – DataLoader containing data for prediction.
- Returns:
Tuple of (predicted_class, probability, targets) where
- predicted_class: Predicted class labels
- probability: Class probabilities or confidence scores
- targets: Ground truth labels
- show_predictions(loader, image_inverse_transform=None, samples=9, cols=3, figsize=(10, 10), target_known=True)[source]
Visualizes model predictions on sample images.
- Parameters:
loader – DataLoader containing data for visualization.
image_inverse_transform – Transformation to reverse image normalization for display. Defaults to None.
samples – Number of samples to display. Defaults to 9.
cols – Number of columns in the visualization grid. Defaults to 3.
figsize – Figure size as (width, height) tuple. Defaults to (10, 10).
target_known – Whether ground truth targets are available for comparison. Defaults to True.
deepml.losses module
- class deepml.losses.JaccardLoss(*args: Any, **kwargs: Any)[source]
Bases: Module
Jaccard Loss (Intersection over Union) for segmentation tasks.
Computes 1 - IoU as a differentiable loss function for both binary and multiclass segmentation.
- activation
Activation function applied to output logits. Softmax2d for multiclass, Sigmoid for binary.
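As a rough illustration of the 1 - IoU idea, the sketch below computes a "soft" Jaccard loss on already-activated probabilities for a single binary mask. It is an assumption-laden sketch rather than the module's implementation: the function name soft_jaccard_loss, the eps smoothing term, and the flat-list inputs are all hypothetical.

```python
def soft_jaccard_loss(probs, targets, eps=1e-7):
    """1 - IoU computed on probabilities, so the value varies smoothly with probs."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    union = sum(probs) + sum(targets) - intersection
    # eps (hypothetical) keeps the ratio defined when both masks are empty.
    return 1.0 - (intersection + eps) / (union + eps)


perfect = soft_jaccard_loss([1.0, 0.0, 1.0], [1.0, 0.0, 1.0])  # full overlap
disjoint = soft_jaccard_loss([1.0, 0.0], [0.0, 1.0])           # no overlap
```

A perfect prediction gives a loss near 0 and a completely wrong one gives a loss near 1, matching the usual reading of IoU.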
- class deepml.losses.RMSELoss(*args: Any, **kwargs: Any)[source]
Bases: Module
Root Mean Squared Error loss.
Computes sqrt(MSE + eps) to provide a differentiable RMSE loss that avoids numerical instability near zero.
- mse
Underlying MSELoss module.
- eps
Small epsilon value added before the square root for numerical stability.
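The sqrt(MSE + eps) formula can be checked by hand with a few numbers. The sketch below is a plain-Python illustration; rmse_loss and its eps default of 1e-8 are hypothetical and only mirror the formula stated above.

```python
import math


def rmse_loss(predictions, targets, eps=1e-8):
    """sqrt(MSE + eps): eps keeps the square root away from exactly zero."""
    mse = sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)
    return math.sqrt(mse + eps)


value = rmse_loss([3.0, 5.0], [0.0, 1.0])       # errors of 3 and 4, MSE = 12.5
at_zero_error = rmse_loss([1.0, 2.0], [1.0, 2.0])  # MSE = 0, result is sqrt(eps)
```

The epsilon matters because the derivative of sqrt(x) grows without bound as x approaches 0, so a perfect prediction would otherwise produce an undefined gradient.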
- class deepml.losses.WeightedBCEWithLogitsLoss(*args: Any, **kwargs: Any)[source]
Bases: Module
Weighted Binary Cross-Entropy loss with logits.
Applies separate weights to positive and negative samples in the binary cross-entropy computation.
- w_p
Weight for positive samples.
- w_n
Weight for negative samples.
- __init__(w_p=None, w_n=None)[source]
Initializes WeightedBCEWithLogitsLoss.
- Parameters:
w_p – Weight applied to the positive class loss term. Defaults to None.
w_n – Weight applied to the negative class loss term. Defaults to None.
- forward(logits, labels, epsilon=1e-07)[source]
Computes the weighted binary cross-entropy loss.
- Parameters:
logits – Raw model output logits of shape (N,) or (N, 1).
labels – Binary ground truth labels of shape (N,).
epsilon – Small constant to avoid log(0). Defaults to 1e-7.
- Returns:
Scalar tensor representing the weighted BCE loss.
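One way to read the weighting is as separate multipliers on the positive and negative log terms of binary cross-entropy. The sketch below follows that reading in plain Python. It is an assumption, not the module's code: the mean reduction, the default weights of 1.0 (the documented defaults are None), and the function name weighted_bce_with_logits are all hypothetical.

```python
import math


def weighted_bce_with_logits(logits, labels, w_p=1.0, w_n=1.0, epsilon=1e-7):
    """Mean of -(w_p * y * log(p) + w_n * (1 - y) * log(1 - p)) with p = sigmoid(logit)."""
    total = 0.0
    for z, y in zip(logits, labels):
        p = 1.0 / (1.0 + math.exp(-z))  # sigmoid turns the logit into a probability
        total += -(w_p * y * math.log(p + epsilon)
                   + w_n * (1 - y) * math.log(1 - p + epsilon))
    return total / len(logits)


logits = [0.0, 0.0]  # sigmoid(0) = 0.5 for both samples
labels = [1, 0]
base = weighted_bce_with_logits(logits, labels)
upweighted = weighted_bce_with_logits(logits, labels, w_p=2.0)
```

Raising w_p makes errors on positive samples cost more, which is a common way to compensate for class imbalance.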
- class deepml.losses.ContrastiveLoss(*args: Any, **kwargs: Any)[source]
Bases: Module
Contrastive loss for siamese networks.
Encourages embeddings of similar pairs to be close together and embeddings of dissimilar pairs to be at least margin apart.
- margin
Minimum distance margin between negative pairs.
- distance_func
Optional custom distance function. If None, pairwise Euclidean distance is used.
- label_transform
Optional transformation applied to target labels before loss computation.
- __init__(margin=2.0, distance_func=None, label_transform=None)[source]
Initializes ContrastiveLoss.
- Parameters:
margin – The distance margin between positive and negative class. Defaults to 2.0.
distance_func – Custom distance function to use. If None, Euclidean pairwise distance is used. Defaults to None.
label_transform – Transformation function to apply on the target label, e.g., lambda label: label[:, 0]. Defaults to None.
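The behavior described for this loss (pull similar pairs together, push dissimilar pairs at least margin apart) corresponds to the classic contrastive formulation, sketched below for a single pair. The label convention (a boolean dissimilar flag), the absence of any 0.5 scaling factor, and the function name contrastive_loss are assumptions made for illustration; the module's exact convention is not documented here.

```python
import math


def contrastive_loss(emb_a, emb_b, dissimilar, margin=2.0):
    """Per-pair contrastive loss using Euclidean distance."""
    distance = math.dist(emb_a, emb_b)  # Euclidean distance (Python 3.8+)
    if dissimilar:
        # Penalize only dissimilar pairs that are closer than the margin.
        return max(0.0, margin - distance) ** 2
    # Similar pairs are penalized by their squared distance.
    return distance ** 2


similar_far = contrastive_loss([0.0, 0.0], [3.0, 4.0], dissimilar=False)
dissimilar_far = contrastive_loss([0.0, 0.0], [3.0, 4.0], dissimilar=True)
dissimilar_near = contrastive_loss([0.0, 0.0], [0.0, 1.0], dissimilar=True)
```

Dissimilar pairs that are already farther apart than the margin contribute nothing, so the loss focuses on the pairs that are still confused.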
- class deepml.losses.AngularPenaltySMLoss(*args: Any, **kwargs: Any)[source]
Bases: Module
Angular Penalty Softmax Loss for deep face recognition.
Implements three angular margin-based softmax losses:
ArcFace: Additive angular margin loss. See ArcFace.
SphereFace: Multiplicative angular margin loss. See SphereFace.
CosFace: Additive cosine margin loss. See CosFace.
- s
Scaling factor for the logits.
- m
Angular or cosine margin penalty.
- loss_type
One of ‘arcface’, ‘sphereface’, or ‘cosface’.
- in_features
Size of the input feature vector.
- out_features
Number of output classes.
- fc
Fully connected layer mapping input features to class logits (without bias).
- eps
Small epsilon for numerical stability in acos clamping.
- __init__(in_features, out_features, loss_type='arcface', eps=1e-07, s=None, m=None)[source]
Initializes AngularPenaltySMLoss.
- Parameters:
in_features – Dimensionality of the input feature embeddings.
out_features – Number of target classes.
loss_type – Type of angular penalty loss. Must be one of ‘arcface’, ‘sphereface’, or ‘cosface’. Defaults to ‘arcface’.
eps – Small constant for numerical stability when clamping values for acos. Defaults to 1e-7.
s – Scaling factor for logits. If None, uses the default for the chosen loss type (64.0 for arcface/sphereface, 30.0 for cosface). Defaults to None.
m – Margin penalty. If None, uses the default for the chosen loss type (0.5 for arcface, 1.35 for sphereface, 0.4 for cosface). Defaults to None.
- Raises:
AssertionError – If loss_type is not one of the supported types.
- forward(x, labels)[source]
Computes the angular penalty softmax loss.
- Parameters:
x – Input feature embeddings of shape (N, in_features).
labels – Ground truth class labels of shape (N,), with values in the range [0, out_features).
- Returns:
Scalar tensor representing the negative mean log probability.
- Raises:
AssertionError – If input and labels have mismatched batch sizes, or if labels contain values outside the valid range.
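The three loss types differ only in how the logit of the true class is penalized before the softmax. The sketch below works through one sample in plain Python using the commonly published forms: s * cos(theta + m) for ArcFace, s * cos(m * theta) for SphereFace, and s * (cos(theta) - m) for CosFace. That this matches the module's internals is an assumption, and the function angular_penalty_loss is hypothetical. It takes the cosines between the normalized embedding and each class weight as input.

```python
import math


def angular_penalty_loss(cosines, label, loss_type, s, m, eps=1e-7):
    """Loss for one sample given its cosine similarity to every class weight."""
    # Clamp so that acos stays finite at the boundaries.
    target_cos = min(1.0 - eps, max(-1.0 + eps, cosines[label]))
    if loss_type == "cosface":
        numerator = s * (target_cos - m)
    elif loss_type == "arcface":
        numerator = s * math.cos(math.acos(target_cos) + m)
    elif loss_type == "sphereface":
        numerator = s * math.cos(m * math.acos(target_cos))
    else:
        raise ValueError(f"unsupported loss_type: {loss_type}")
    # The other classes keep their unpenalized scaled cosines.
    others = sum(math.exp(s * c) for j, c in enumerate(cosines) if j != label)
    # Negative log of the softmax probability of the penalized target logit.
    return -(numerator - math.log(math.exp(numerator) + others))


cosines = [0.8, 0.1]  # the sample is close to class 0
no_margin = angular_penalty_loss(cosines, 0, "arcface", s=10.0, m=0.0)
arcface = angular_penalty_loss(cosines, 0, "arcface", s=10.0, m=0.5)
cosface = angular_penalty_loss(cosines, 0, "cosface", s=10.0, m=0.4)
```

With a margin the loss is larger for the same embedding, so the network has to separate classes by more than plain softmax would require. A small s is used here only to keep the numbers readable.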
deepml.lr_scheduler_utils module
- deepml.lr_scheduler_utils.setup_one_cycle_lr_scheduler_with_warmup(optimizer, steps_per_epoch, warmup_steps=None, warmup_ratio=None, num_epochs=50, max_lr=0.001, anneal_strategy='cos')[source]
Sets up a OneCycleLR learning rate scheduler with warmup phase.
Creates a OneCycleLR scheduler that includes a warmup phase specified either by the number of steps or as a ratio of total training steps. The scheduler follows the 1-cycle policy: the learning rate warms up from its initial value to max_lr, then anneals towards its minimum value.
- Parameters:
optimizer – PyTorch optimizer instance to schedule.
steps_per_epoch (int) – Number of optimizer steps in one epoch. Typically len(train_loader). When using gradient accumulation or distributed training, adjust accordingly: len(train_loader) // gradient_accumulation_steps // num_processes.
warmup_steps (Optional[int]) – Number of warmup steps before reaching max_lr. Must be less than total training steps. Mutually exclusive with warmup_ratio. Defaults to None.
warmup_ratio (Optional[float]) – Ratio of total training steps to use for warmup (0-1). Mutually exclusive with warmup_steps. Defaults to None.
num_epochs (int) – Total number of training epochs. Defaults to 50.
max_lr (float) – Maximum learning rate during the cycle. Defaults to 1e-3.
anneal_strategy (Literal['cos','linear']) – Annealing strategy after warmup. Defaults to "cos". Options:
- "cos": Cosine annealing (smooth decay)
- "linear": Linear annealing
- Returns:
OneCycleLR scheduler instance configured with the specified parameters.
- Raises:
AssertionError – If neither warmup_steps nor warmup_ratio is provided.
AssertionError – If both warmup_steps and warmup_ratio are provided.
AssertionError – If warmup_steps >= total training steps.
AssertionError – If warmup_ratio is not between 0 and 1.
Example
>>> from torch.optim import Adam
>>> optimizer = Adam(model.parameters(), lr=1e-4)
>>> scheduler = setup_one_cycle_lr_scheduler_with_warmup(
...     optimizer,
...     steps_per_epoch=100,
...     warmup_ratio=0.1,
...     num_epochs=50,
...     max_lr=1e-3
... )
Note
- The OneCycleLR policy proceeds as follows: during warmup, the learning rate increases from initial_lr to max_lr; during annealing, it decreases from max_lr towards min_lr. The pct_start parameter controls the fraction of total steps spent in warmup.
- total_steps is set to num_epochs * steps_per_epoch + 1 rather than the “bare” product. PyTorch’s OneCycleLR.__init__ calls step() once internally during construction, consuming one slot before training begins. The trainer then calls step() once per batch (num_epochs * steps_per_epoch times in total). Without the +1, the very last batch would trigger a ValueError: Tried to step N+1 times.
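The step arithmetic in the note above can be sketched without PyTorch. The helper one_cycle_step_budget below is hypothetical, and deriving pct_start as warmup_steps / total_steps is an assumption about how the warmup length is translated for OneCycleLR. The sketch raises ValueError for brevity where the documented function raises AssertionError.

```python
def one_cycle_step_budget(steps_per_epoch, num_epochs,
                          warmup_steps=None, warmup_ratio=None):
    """Return (total_steps, pct_start) following the rules described above."""
    if (warmup_steps is None) == (warmup_ratio is None):
        raise ValueError("provide exactly one of warmup_steps or warmup_ratio")
    # +1 accounts for the step() call made while the scheduler is constructed.
    total_steps = num_epochs * steps_per_epoch + 1
    if warmup_ratio is not None:
        if not 0 < warmup_ratio < 1:
            raise ValueError("warmup_ratio must be between 0 and 1")
        pct_start = warmup_ratio
    else:
        if warmup_steps >= total_steps:
            raise ValueError("warmup_steps must be less than total training steps")
        pct_start = warmup_steps / total_steps
    return total_steps, pct_start


# 800 batches per epoch, accumulation over 4 steps, 2 processes.
steps_per_epoch = 800 // 4 // 2
budget = one_cycle_step_budget(steps_per_epoch, num_epochs=50, warmup_ratio=0.1)
```

Here steps_per_epoch comes out as 100, so the scheduler is sized for 5001 steps and spends the first tenth of them warming up.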
deepml.tasks module
- class deepml.tasks.Task(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto')[source]
Bases: ABC
Abstract base class for all deep learning tasks.
This class provides the foundation for task-specific implementations including model management, device handling, and prediction workflows.
Subclasses must implement methods for transforming targets and outputs, batch prediction, training and evaluation steps, and visualization.
- __init__(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto')[source]
Initializes the Task.
- Parameters:
model (Module) – PyTorch model instance to be trained or used for inference.
model_dir (str) – Directory path for saving and loading model checkpoints.
load_saved_model (bool) – Whether to load a previously saved model from model_dir. Defaults to False. Set to True if you want to load model weights from a checkpoint file in model_dir.
model_file_name (str) – Name of the model checkpoint file. Defaults to “latest_model.pt”.
device (str) – Device to use for computation. Options: “auto”, “cpu”, “cuda”, or “mps”. When “auto”, automatically selects the best available device. Defaults to “auto”.
- Raises:
AssertionError – If model is not a torch.nn.Module instance, or if model_dir is None, or if model_file_name is not a string, or if device is not one of the valid options.
- property model
- property model_dir
- property device
- property model_file_name
- move_input_to_device(x, device=None, non_blocking=False, **kwargs)[source]
Moves input data to the specified device.
Handles various input types including tensors, lists, tuples, and dictionaries containing tensors.
- Parameters:
x (Union[Tensor,list,tuple,dict]) – Input data to move. Can be a single tensor, list/tuple of tensors, or dictionary with tensor values.
device (Union[device,str,None]) – Target device. If None, uses the task’s default device. Defaults to None.
non_blocking (bool) – Whether to use asynchronous transfer. Defaults to False.
**kwargs (dict) – Additional keyword arguments (unused).
- Returns:
Input data moved to the target device, maintaining the original data structure.
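Handling tensors, lists, tuples, and dictionaries while keeping the original structure is naturally expressed as a recursive traversal. The sketch below shows that pattern in plain Python with a hypothetical helper, map_nested, and string leaves standing in for tensors; it is not the method's actual implementation. With real tensors, the leaf function would be something like lambda t: t.to(device, non_blocking=non_blocking).

```python
def map_nested(fn, x):
    """Apply fn to every leaf of a nested list/tuple/dict, preserving the structure."""
    if isinstance(x, dict):
        return {key: map_nested(fn, value) for key, value in x.items()}
    if isinstance(x, (list, tuple)):
        # Rebuild the same container type (list stays list, tuple stays tuple).
        return type(x)(map_nested(fn, item) for item in x)
    return fn(x)


batch = {"image": [1, 2], "meta": (3, {"id": 4})}
# The lambda is a stand-in for moving a tensor to a device.
moved = map_nested(lambda leaf: f"{leaf}@cuda:0", batch)
```

The result has the same keys and container types as the input, with only the leaves replaced.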
- transform_input(x, image_inverse_transform=None)[source]
Applies optional inverse transformation to input images.
- abstract transform_target(y)[source]
Transforms target data for visualization or evaluation.
- Parameters:
y – Target data in model format.
- Returns:
Transformed target data.
- abstract transform_output(prediction)[source]
Transforms model output for visualization or evaluation.
- Parameters:
prediction – Model output in raw format.
- Returns:
Transformed prediction data.
- abstract predict_batch(x, *args, **kwargs)[source]
Performs prediction on a single batch.
- Parameters:
x – Input batch.
*args – Additional positional arguments.
**kwargs – Additional keyword arguments.
- Returns:
Model predictions for the batch.
- abstract train_step(x, y, *args, **kwargs)[source]
Executes a single training step.
Apply any batch-based transformation to the target as well, if needed.
- abstract predict(loader)[source]
Generates predictions for all data in the loader.
- Parameters:
loader – DataLoader containing data for prediction.
- Returns:
Predictions and targets.
- abstract predict_class(loader)[source]
Generates class predictions for all data in the loader.
- Parameters:
loader – DataLoader containing data for prediction.
- Returns:
Predicted classes, probabilities, and targets.
- abstract show_predictions(loader, image_inverse_transform=None, samples=9, cols=3, figsize=(10, 10), target_known=True)[source]
Visualizes model predictions.
- Parameters:
loader – DataLoader containing data for visualization.
image_inverse_transform – Transformation to reverse normalization.
samples – Number of samples to display.
cols – Number of columns in visualization grid.
figsize – Figure size tuple.
target_known – Whether ground truth is available.
- abstract write_prediction_to_logger(tag, loader, logger, image_inverse_transform, global_step, img_size=224)[source]
Writes predictions to experiment logger.
- Parameters:
tag – Tag identifier for logged data.
loader – DataLoader containing data.
logger – Experiment logger instance.
image_inverse_transform – Transformation to reverse normalization.
global_step – Current training step/epoch.
img_size – Image size for logging.
- abstract evaluate(loader, criterion, metrics=None, non_blocking=False)[source]
Evaluates model performance on the given data.
- Parameters:
loader (DataLoader) – DataLoader containing evaluation data.
criterion (Module) – Loss function module.
non_blocking – Whether to use async CUDA transfers.
- Returns:
Dictionary of evaluation metrics.
- class deepml.tasks.NeuralNetTask(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto')[source]
Bases: Task
Base task implementation for general deep learning tasks.
This class provides a simple implementation suitable for any deep learning task. It performs predictions without applying task-specific transformations and does not write to TensorBoard by default.
Use this class when you need a minimal task implementation without specialized handling for classification, segmentation, or regression.
- __init__(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto')[source]
Initializes the NeuralNetTask.
- Parameters:
model (Module) – PyTorch model instance to be trained or used for inference.
model_dir (str) – Directory path for saving and loading model checkpoints.
load_saved_model (bool) – Whether to load a previously saved model from model_dir. Defaults to False.
model_file_name (str) – Name of the model checkpoint file. Defaults to “latest_model.pt”.
device (str) – Device to use for computation. Options: “auto”, “cpu”, “cuda”, or “mps”. Defaults to “auto”.
- predict_batch(x, *args, **kwargs)[source]
Performs prediction on a single batch.
- Parameters:
x (Tensor) – Input batch tensor.
*args – Additional positional arguments.
**kwargs – Additional keyword arguments. If ‘model’ key is present, uses that model instead of the task’s default model.
- Returns:
Model predictions for the batch.
- train_step(x, y, *args, **kwargs)[source]
Executes a single training step.
- Parameters:
x – Input batch.
y – Target batch.
*args – Additional positional arguments.
**kwargs – Additional keyword arguments.
- Returns:
Tuple of (predictions, inputs, targets).
- eval_step(x, y, *args, **kwargs)[source]
Executes a single evaluation step.
- Parameters:
x – Input batch.
y – Target batch.
*args – Additional positional arguments.
**kwargs – Additional keyword arguments.
- Returns:
Tuple of (predictions, inputs, targets).
- predict(loader)[source]
Generates predictions for all batches in the data loader.
- Parameters:
loader (DataLoader) – DataLoader containing data for prediction.
- Returns:
Tuple of (predictions, targets) where
predictions: Concatenated tensor of all model predictions
targets: Concatenated tensor or list of all ground truth labels
- Raises:
AssertionError – If loader is None or empty.
- predict_class(loader)[source]
Generates class predictions for all data in the loader.
- Parameters:
loader (DataLoader) – DataLoader containing data for prediction.
- Raises:
NotImplementedError – This method must be implemented by subclasses.
- show_predictions(loader, image_inverse_transform=None, samples=9, cols=3, figsize=(10, 10), target_known=True)[source]
Visualizes model predictions.
- Parameters:
loader (DataLoader) – DataLoader containing data for visualization.
image_inverse_transform (Callable) – Transformation to reverse normalization.
samples (int) – Number of samples to display.
cols (int) – Number of columns in visualization grid.
target_known (bool) – Whether ground truth is available.
- Raises:
NotImplementedError – This method must be implemented by subclasses.
- transform_target(y)[source]
Transforms target data for visualization or evaluation.
- Parameters:
y (Any) – Target data in model format.
- Raises:
NotImplementedError – This method must be implemented by subclasses.
- transform_output(prediction)[source]
Transforms model output for visualization or evaluation.
- Parameters:
prediction – Model output in raw format.
- Raises:
NotImplementedError – This method must be implemented by subclasses.
- write_prediction_to_logger(tag, loader, logger, image_inverse_transform, global_step, img_size=224, **kwargs)[source]
Writes predictions to experiment logger.
- Parameters:
tag (str) – Tag identifier for logged data.
loader – DataLoader containing data.
logger – Experiment logger instance.
image_inverse_transform – Transformation to reverse normalization.
global_step – Current training step/epoch.
img_size – Image size for logging.
**kwargs (dict) – Additional keyword arguments.
Note
Default implementation does nothing. Override in subclasses for custom logging behavior.
- evaluate(loader, metrics=None, non_blocking=False)
Evaluates the model on the given data loader using specified metrics.
- Parameters:
loader (DataLoader) – DataLoader containing evaluation data.
metrics (Dict[str, Module]) – Dictionary mapping metric names to metric modules. Each metric should be a torch.nn.Module with a forward() method. Defaults to None.
non_blocking – Whether to use asynchronous CUDA transfers. Defaults to False.
- Returns:
Dictionary mapping metric names to their average values across all batches.
- Raises:
Exception – If loader is None.
- class deepml.tasks.Segmentation(model, model_dir, mode='binary', load_saved_model=False, model_file_name='latest_model.pt', device='auto', num_classes=1, threshold=0.5, color_map=None)[source]
Bases: NeuralNetTask
Task implementation for binary and multiclass semantic segmentation.
This class handles pixel-level classification tasks including binary and multiclass segmentation with customizable color mapping for visualization.
- mode
Segmentation mode (“binary” or “multiclass”).
- num_classes
Number of segmentation classes.
- threshold
Threshold for binary segmentation predictions.
- class_index_to_color
Dictionary mapping class indices to colors.
- palette
Color palette for visualization (PIL format).
- __init__(model, model_dir, mode='binary', load_saved_model=False, model_file_name='latest_model.pt', device='auto', num_classes=1, threshold=0.5, color_map=None)[source]
Initializes the Segmentation task.
- Parameters:
model (Module) – PyTorch model architecture for segmentation.
model_dir (str) – Directory path for saving/loading model checkpoints.
mode (str) – Segmentation mode. Options: “binary” or “multiclass”. Defaults to “binary”.
load_saved_model (bool) – Whether to load a previously saved model. Defaults to False.
model_file_name (str) – Name of the model checkpoint file. Defaults to “latest_model.pt”.
device (str) – Device to use for computation. Options: “auto”, “cpu”, “cuda”, or “mps”. Defaults to “auto”.
num_classes (int) – Number of segmentation classes. For binary segmentation, use 1 (class 0: background, class 1: foreground). Defaults to 1.
threshold (float) – Probability threshold for binary segmentation predictions. Defaults to 0.5.
color_map (dict) – Dictionary mapping class indices to colors. If None, uses default color maps:
- Binary: {0: 0, 1: 255} (grayscale)
- Multiclass: {0: [0,0,0], 1: [R,G,B], …} (RGB triplets)
For multiclass, random RGB colors are generated if not specified. Class 0 is always background (black). Defaults to None.
- Raises:
AssertionError – If num_classes is not an integer or is less than 1.
Example
>>> model = UNet(in_channels=3, out_channels=3)
>>> color_map = {0: [0,0,0], 1: [255,0,0], 2: [0,255,0]}
>>> task = Segmentation(
...     model=model,
...     model_dir="./models",
...     mode="multiclass",
...     num_classes=3,
...     color_map=color_map
... )
- predict_batch(x, *args, **kwargs)[source]
Performs prediction on a single batch.
- Parameters:
- Return type:
- Returns:
Model predictions for the batch.
- save_prediction(loader, save_dir)[source]
Generates and saves segmentation predictions as PNG images.
Performs inference on the data loader and saves predicted segmentation masks as PNG files with the appropriate color palette.
- Parameters:
loader (DataLoader) – DataLoader yielding batches of (images, filenames). The second element must be a list of filename strings.
save_dir (str) – Output directory path where prediction PNG files will be saved. The directory is created if it does not exist.
- Raises:
AssertionError – If loader is None, empty, or save_dir is None.
Note
Filenames that don’t end with ‘.png’ will be automatically converted to PNG format with the .png extension.
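The filename handling in the note above could look roughly like the sketch below. The helper name `to_png_name` is hypothetical, and replacing the existing extension (rather than appending `.png` to it) is an assumption about the behaviour.

```python
import os


def to_png_name(filename):
    """Return a filename ending in .png (hypothetical helper)."""
    if filename.endswith(".png"):
        return filename
    # Assumption: the old extension is dropped before .png is added.
    stem, _ = os.path.splitext(filename)
    return stem + ".png"


save_dir = "predictions"
for name in ["tile_01.jpg", "mask.png", "scene"]:
    print(os.path.join(save_dir, to_png_name(name)))
```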
- predict_class(loader)[source]
Generates class predictions for all data in the loader.
- Parameters:
loader (DataLoader) – DataLoader containing data for prediction.
- Raises:
NotImplementedError – This method must be implemented by subclasses.
- show_predictions(loader, image_inverse_transform=None, samples=4, cols=3, figsize=(16, 16), target_known=True)[source]
Visualizes segmentation predictions on sample images.
Displays input images, ground truth masks, and predicted masks in a matplotlib figure with overlays.
- Parameters:
loader (DataLoader) – DataLoader containing data for visualization.
image_inverse_transform (Callable) – Transformation to reverse image normalization for display. Defaults to None.
samples (int) – Number of samples to display. Defaults to 4.
cols (int) – Number of columns in the visualization grid. Defaults to 3.
figsize (Tuple[int, int]) – Figure size as (width, height) tuple. Defaults to (16, 16).
target_known (bool) – Whether ground truth targets are available for comparison. Defaults to True.
- transform_target(y)[source]
Transforms target mask to RGB color image for visualization.
- Parameters:
y (Tensor) – Target segmentation mask with class indices.
- Returns:
RGB color image tensor decoded using the class color palette.
- transform_output(predictions)[source]
Converts model predictions to class indices.
Applies sigmoid (binary) or softmax (multiclass) activation and converts probabilities to discrete class indices.
- Parameters:
predictions (Tensor) – Model output logits of shape (B, C, H, W) where:
- B: batch size
- C: number of classes (1 for binary, >1 for multiclass)
- H: height
- W: width
- Return type:
- Returns:
Tensor of class indices with shape (B, H, W). For binary segmentation, values are 0 or 1. For multiclass, values are in range [0, num_classes).
- Raises:
AssertionError – If predictions is not 4-dimensional (BCHW format).
Note
Binary: Uses sigmoid activation with threshold (default 0.5)
Multiclass: Uses softmax activation with argmax
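The two decision rules in the note can be worked through on a tiny example. The sketch below uses plain Python lists for a single image instead of tensors, so it is illustrative only; the function names are hypothetical, and the strict `>` comparison against the threshold is an assumption.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def binary_mask(logits_hw, threshold=0.5):
    """logits_hw: H x W nested list taken from the single output channel."""
    return [[1 if sigmoid(z) > threshold else 0 for z in row] for row in logits_hw]


def multiclass_mask(logits_chw):
    """logits_chw: C x H x W nested list; returns an H x W grid of class indices."""
    num_classes = len(logits_chw)
    height, width = len(logits_chw[0]), len(logits_chw[0][0])
    mask = []
    for h in range(height):
        row = []
        for w in range(width):
            scores = [logits_chw[c][h][w] for c in range(num_classes)]
            # Softmax is monotonic, so the argmax over logits equals
            # the argmax over the softmax probabilities.
            row.append(max(range(num_classes), key=scores.__getitem__))
        mask.append(row)
    return mask


print(binary_mask([[2.0, -1.0], [0.0, 0.3]]))  # prints [[1, 0], [0, 1]]
print(multiclass_mask([[[0.1, 2.0]], [[1.5, 0.2]], [[-0.3, 0.4]]]))  # prints [[1, 0]]
```

On batched tensors, the equivalent PyTorch expressions would likely be `torch.sigmoid(predictions) > threshold` and `torch.softmax(predictions, dim=1).argmax(dim=1)`.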
- decode_segmentation_mask(class_indices)[source]
Converts class indices to RGB color images for visualization.
- Parameters:
class_indices (Tensor) – Batch of segmentation masks with shape (B, H, W) containing class indices.
- Returns:
Batch of RGB images with shape (B, C, H, W) where
For binary: C=1 (grayscale)
For multiclass: C=3 (RGB)
Colors are mapped according to the class_index_to_color palette.
Note
Uses PIL Image palette for efficient color mapping in multiclass mode.
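The index-to-color lookup can be illustrated without PIL. This is a simplified sketch for one mask, assuming a color map like the one in the Segmentation example; `decode_mask` is a hypothetical name, and the result here is laid out as H x W x 3 (channels last) rather than the documented (B, C, H, W).

```python
def decode_mask(class_indices, color_map):
    """Map an H x W grid of class indices to the colors in color_map."""
    return [[color_map[index] for index in row] for row in class_indices]


color_map = {0: [0, 0, 0], 1: [255, 0, 0], 2: [0, 255, 0]}
mask = [[0, 1], [2, 1]]
rgb = decode_mask(mask, color_map)
print(rgb[0][1], rgb[1][0])
```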
- log_prediction(tag, predictions, x, targets, logger, image_inverse_transform, global_step, img_size=224, **kwargs)[source]
Logs input images, target masks, and output masks to the experiment logger.
Creates a visualization grid showing input images, ground truth masks, ground truth overlays, predicted masks, and predicted overlays side by side.
- Parameters:
tag (str) – Tag identifier for the logged images in the experiment tracker.
predictions (Tensor) – Model predictions with shape (B, C, H, W) or (B, H, W).
x (Tensor) – Input images with shape (B, C, H, W).
targets (Tensor) – Ground truth masks with shape (B, H, W) or (B, C, H, W).
logger (MLExperimentLogger) – Experiment logger instance for tracking visualizations.
image_inverse_transform (Callable) – Callable to reverse image normalization for proper visualization.
global_step (int) – Current training step/epoch for the logger.
img_size (Union[int, Tuple[int, int], None]) – Target size for resizing images. Can be int or (H, W) tuple. If None, no resizing is performed. Defaults to 224.
**kwargs (dict) – Additional keyword arguments passed through.
Note
Override this method to customize the logging behavior. The default implementation creates a grid with 5 images per sample: input, target mask, target overlay, predicted mask, and predicted overlay.
- write_prediction_to_logger(tag, loader, logger, image_inverse_transform, global_step, img_size=224, **kwargs)[source]
Writes input images, targets, and predictions to the experiment logger.
Samples random batches from the data loader, generates predictions, and logs visualizations to the experiment tracker.
- Parameters:
tag (str) – Tag identifier for the logged images in the experiment tracker.
loader (DataLoader) – DataLoader containing data for visualization.
logger (MLExperimentLogger) – Experiment logger instance for tracking visualizations.
image_inverse_transform (Callable) – Callable to reverse image normalization for proper visualization.
global_step (int) – Current training step/epoch for the logger.
img_size (Union[int, Tuple[int, int], None]) – Target size for resizing images. Can be int or (H, W) tuple. If None, no resizing is performed. Defaults to 224.
**kwargs (dict) – Additional keyword arguments passed to eval_step.
- class deepml.tasks.ImageRegression(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto')[source]
Bases: NeuralNetTask
Task implementation for image regression problems.
This class handles tasks where the model predicts continuous values from images, such as age estimation, pose estimation, or depth prediction.
The task supports visualization of predictions alongside ground truth values and logging to experiment trackers.
- __init__(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto')[source]
Initializes the ImageRegression task.
- Parameters:
model (Module) – PyTorch model instance for regression.
model_dir (str) – Directory path for saving and loading model checkpoints.
load_saved_model (bool) – Whether to load a previously saved model from model_dir. Defaults to False.
model_file_name (str) – Name of the model checkpoint file. Defaults to “latest_model.pt”.
device (str) – Device to use for computation. Options: “auto”, “cpu”, “cuda”, or “mps”. Defaults to “auto”.
- show_predictions(loader, image_inverse_transform=None, samples=9, cols=3, figsize=(10, 10), target_known=True)[source]
Visualizes model predictions on sample images.
Displays random samples from the loader with their ground truth values and predicted values in a matplotlib figure.
- Parameters:
loader (DataLoader) – DataLoader containing data for visualization.
image_inverse_transform (Callable) – Transformation to reverse image normalization for display. Defaults to None.
samples (int) – Number of samples to display. Defaults to 9.
cols (int) – Number of columns in the visualization grid. Defaults to 3.
figsize (Tuple[int, int]) – Figure size as (width, height) tuple. Defaults to (10, 10).
target_known (bool) – Whether ground truth targets are available for comparison. Defaults to True.
- transform_target(y)[source]
Transforms target tensor to a rounded float value.
- Parameters:
y (Tensor) – Target tensor (single value).
- Returns:
Rounded float value to 2 decimal places.
- transform_output(prediction)[source]
Transforms prediction tensor to a rounded float value.
- Parameters:
prediction (Tensor) – Prediction tensor (single value).
- Returns:
Rounded float value to 2 decimal places.
- write_prediction_to_logger(tag, loader, logger, image_inverse_transform, global_step, img_size=224)[source]
Writes predictions with ground truth values to the experiment logger.
Creates a visualization grid showing input images alongside their ground truth and predicted values as text overlays.
- Parameters:
tag (str) – Unique tag identifier for the logged images.
loader (DataLoader) – DataLoader containing data for visualization.
logger (MLExperimentLogger) – Experiment logger instance for tracking visualizations.
image_inverse_transform (Callable) – Transformation to reverse image normalization.
global_step (int) – Current training epoch/step for the logger.
img_size (Union[int, Tuple[int, int], None]) – Image size for TensorBoard logging. Can be int or (H, W) tuple. If None, no visualization is written. Defaults to 224.
- predict_class(loader)[source]
Generates class predictions for all data in the loader.
- Parameters:
loader – DataLoader containing data for prediction.
- Raises:
NotImplementedError – This method must be implemented by subclasses.
- class deepml.tasks.ImageClassification(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto', classes=None)[source]
Bases: NeuralNetTask
Task implementation for image classification.
This class handles both binary and multiclass classification tasks where each image belongs to exactly one class. Supports custom class labels and visualization of predictions.
- _classes
Optional sequence of class names for human-readable labels.
- __init__(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto', classes=None)[source]
Initializes the ImageClassification task.
- Parameters:
model (Module) – PyTorch model instance for classification.
model_dir (str) – Directory path for saving and loading model checkpoints.
load_saved_model (bool) – Whether to load a previously saved model from model_dir. Defaults to False.
model_file_name (str) – Name of the model checkpoint file. Defaults to “latest_model.pt”.
device (str) – Device to use for computation. Options: “auto”, “cpu”, “cuda”, or “mps”. Defaults to “auto”.
classes (Sequence) – Optional sequence of class names (e.g., [‘cat’, ‘dog’]). If provided, predictions will use these labels instead of class indices. Defaults to None.
- predict_class(loader)[source]
Generates class predictions with probabilities for all data.
- Parameters:
loader (DataLoader) – DataLoader containing data for prediction.
- Returns:
Tuple of (predicted_class, probability, targets) where
predicted_class: Tensor of predicted class indices
probability: Tensor of prediction confidence scores
targets: Ground truth class labels
- transform_target(y)[source]
Transforms target class index to human-readable label if available.
- Parameters:
y – Target class index.
- Returns:
Class name if classes are defined, otherwise returns the index.
- transform_output(predictions)[source]
Converts model predictions to class indices and probabilities.
Applies sigmoid (binary) or softmax (multiclass) activation and extracts the predicted class and its probability.
- Parameters:
predictions (Tensor) – Model output logits with shape (B, num_classes) for multiclass or (B, 1) for binary classification.
- Returns:
Tuple of (indices, probabilities) where
indices: Tensor of predicted class indices with shape (B,)
probabilities: Tensor of prediction confidences with shape (B,)
Note
Binary: Uses sigmoid with 0.5 threshold
Multiclass: Uses softmax with argmax
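For a single sample, the conversion from logits to a class and a confidence can be sketched in plain Python. The names `softmax` and `classify` are hypothetical, and reporting the probability of the chosen class in the binary case is a choice made for this sketch, not confirmed library behaviour.

```python
import math


def softmax(logits):
    # Subtract the maximum for numerical stability; the result is unchanged.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, classes=None):
    """Return (label_or_index, probability) for one sample's logits."""
    if len(logits) == 1:
        # Binary: sigmoid with a 0.5 threshold.
        p_positive = 1.0 / (1.0 + math.exp(-logits[0]))
        index = 1 if p_positive > 0.5 else 0
        probability = p_positive if index == 1 else 1.0 - p_positive
    else:
        # Multiclass: softmax followed by argmax.
        probs = softmax(logits)
        index = max(range(len(probs)), key=probs.__getitem__)
        probability = probs[index]
    label = classes[index] if classes is not None else index
    return label, probability


label, confidence = classify([2.0, 0.5, 0.1], classes=["cat", "dog", "bird"])
print(label, round(confidence, 3))
```

Passing `classes` mirrors the documented behaviour of returning class names when they are defined and indices otherwise.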
- show_predictions(loader, image_inverse_transform=None, samples=9, cols=3, figsize=(10, 10), target_known=True)[source]
Visualizes model predictions on sample images.
Displays random samples from the loader with their ground truth labels, predicted labels, and confidence scores in a matplotlib figure.
- Parameters:
loader (DataLoader) – DataLoader containing data for visualization.
image_inverse_transform (Callable) – Transformation to reverse image normalization for display. Defaults to None.
samples (int) – Number of samples to display. Defaults to 9.
cols (int) – Number of columns in the visualization grid. Defaults to 3.
figsize (Tuple[int, int]) – Figure size as (width, height) tuple. Defaults to (10, 10).
target_known (bool) – Whether ground truth targets are available for comparison. If True, titles will be colored green (correct) or red (incorrect). Defaults to True.
- write_prediction_to_logger(tag, loader, logger, image_inverse_transform, global_step, img_size=224)[source]
Writes predictions with labels to the experiment logger.
Creates a visualization grid showing input images alongside their ground truth and predicted class labels with confidence scores.
- Parameters:
tag (str) – Unique tag identifier for the logged images.
loader – DataLoader containing data for visualization.
logger (MLExperimentLogger) – Experiment logger instance for tracking visualizations.
image_inverse_transform – Transformation to reverse image normalization.
global_step (int) – Current training epoch/step for the logger.
img_size – Image size for logging. Can be int or (H, W) tuple. If None, no visualization is written. Defaults to 224.
Note
Predictions are colored green for correct classifications and red for incorrect ones.
- class deepml.tasks.MultiLabelImageClassification(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto', classes=None)[source]
Bases: ImageClassification
Task implementation for multi-label image classification.
This class handles classification tasks where each image can belong to multiple classes simultaneously (e.g., an image containing both a cat and a dog).
- _classes
Optional sequence of class names for human-readable labels.
- __init__(model, model_dir, load_saved_model=False, model_file_name='latest_model.pt', device='auto', classes=None)[source]
Initializes the MultiLabelImageClassification task.
- Parameters:
model (Module) – PyTorch model instance for multi-label classification.
model_dir – Directory path for saving and loading model checkpoints.
load_saved_model (bool) – Whether to load a previously saved model from model_dir. Defaults to False.
model_file_name (str) – Name of the model checkpoint file. Defaults to “latest_model.pt”.
device (str) – Device to use for computation. Options: “auto”, “cpu”, “cuda”, or “mps”. Defaults to “auto”.
classes – Optional sequence of class names for labeling. Defaults to None.
- predict_class(loader)[source]
Generates multi-label class predictions with probabilities for all data.
- Parameters:
loader – DataLoader containing data for prediction.
- Returns:
Tuple of (predicted_class, probability, targets) where
predicted_class: Binary tensor indicating predicted classes
probability: Tensor of class probabilities for all classes
targets: Ground truth multi-label targets
- transform_target(y)[source]
Transforms target class indices to comma-separated class labels.
- Parameters:
y – Binary tensor or list where 1 indicates the class is present.
- Returns:
Comma-separated string of class names if classes are defined, otherwise returns the original indices.
- transform_output(predictions)[source]
Converts model predictions to binary class labels and probabilities.
Applies sigmoid activation and thresholding to convert logits into multi-label predictions.
- Parameters:
predictions – Model output logits with shape (B, num_classes).
- Returns:
Tuple of (indices, probabilities) where
indices: Binary tensor with shape (B, num_classes). Value is 1 if class is predicted (probability > 0.5), else 0.
probabilities: Tensor of class probabilities with shape (B, num_classes) after sigmoid activation.
Note
Uses sigmoid activation with 0.5 threshold for each class independently.
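The per-class thresholding and the comma-separated label string can be sketched for one sample as follows. Both function names are hypothetical, and the exact separator (`", "`) is an assumption.

```python
import math


def multilabel_predict(logits, threshold=0.5):
    """Return (indicators, probabilities) for one sample's per-class logits."""
    probabilities = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    # Each class is thresholded independently of the others.
    indicators = [1 if p > threshold else 0 for p in probabilities]
    return indicators, probabilities


def labels_from_indicators(indicators, classes=None):
    """Join the names of the active classes, or return the indicators if no names exist."""
    if classes is None:
        return indicators
    return ", ".join(name for name, flag in zip(classes, indicators) if flag == 1)


indicators, probabilities = multilabel_predict([1.2, -0.7, 0.1])
print(indicators)  # prints [1, 0, 1]
print(labels_from_indicators(indicators, classes=["cat", "dog", "bird"]))  # prints cat, bird
```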
deepml.tracking module
- class deepml.tracking.MLExperimentLogger[source]
Bases: ABC
Abstract base class for experiment tracking and logging.
This class defines the interface for logging machine learning experiments across different platforms (TensorBoard, MLflow, Weights & Biases, etc.).
Subclasses must implement all abstract methods to provide platform-specific logging functionality.
- abstract log_params(**kwargs)[source]
Logs hyperparameters and configuration for the experiment.
- Parameters:
**kwargs – Arbitrary keyword arguments containing parameters to log. Common parameters include model architecture, optimizer settings, learning rate, batch size, etc.
- abstract log_artifact(tag, value, step, artifact_path=None)[source]
Logs an artifact (file, tensor, or other data) to the experiment.
- abstract log_model(tag, value, step, artifact_path=None)[source]
Logs a model checkpoint or weights to the experiment.
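A custom logger would implement the abstract methods of this interface. The sketch below does not import deepml; `ExperimentLoggerInterface` is a stand-in that mirrors only the three abstract methods listed here (the real class may declare more), and `InMemoryLogger` is a hypothetical subclass that keeps everything in Python containers.

```python
from abc import ABC, abstractmethod


class ExperimentLoggerInterface(ABC):
    """Stand-in mirroring the abstract methods documented above (assumption)."""

    @abstractmethod
    def log_params(self, **kwargs): ...

    @abstractmethod
    def log_artifact(self, tag, value, step, artifact_path=None): ...

    @abstractmethod
    def log_model(self, tag, value, step, artifact_path=None): ...


class InMemoryLogger(ExperimentLoggerInterface):
    """Hypothetical logger that records calls instead of writing to a tracker."""

    def __init__(self):
        self.params = {}
        self.records = []

    def log_params(self, **kwargs):
        self.params.update(kwargs)

    def log_artifact(self, tag, value, step, artifact_path=None):
        self.records.append(("artifact", tag, step, artifact_path))

    def log_model(self, tag, value, step, artifact_path=None):
        self.records.append(("model", tag, step, artifact_path))


logger = InMemoryLogger()
logger.log_params(lr=0.001, batch_size=32)
logger.log_model("best", value=None, step=5, artifact_path="models/best.pt")
print(logger.params, logger.records)
```

A subclass that omits any abstract method cannot be instantiated, which is how the ABC enforces the interface.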
- class deepml.tracking.TensorboardLogger(model_dir)[source]
Bases: MLExperimentLogger
TensorBoard experiment logger implementation.
This logger writes experiment data to TensorBoard, including metrics, images, model graphs, and other artifacts.
- writer
TensorBoard SummaryWriter instance for logging.
- __init__(model_dir)[source]
Initializes the TensorboardLogger.
Creates a new run directory within the model directory and initializes the TensorBoard SummaryWriter.
- Parameters:
model_dir – Base directory path for saving TensorBoard logs. A new timestamped run directory will be created within this path.
- log_params(**kwargs)[source]
Logs hyperparameters and model graph to TensorBoard.
- Parameters:
**kwargs – Keyword arguments. If ‘task’ and ‘loader’ are provided, writes the model computational graph to TensorBoard.
- class deepml.tracking.MLFlowLogger(experiment_name='Default', tracking_uri=None, log_model_weights=True)[source]
Bases: MLExperimentLogger
MLflow experiment logger implementation.
This logger writes experiment data to MLflow tracking server, including metrics, parameters, model checkpoints, and images.
- mlflow
MLflow module instance.
- log_model_weights
Whether to log model weights as artifacts.
Note
Requires mlflow package to be installed.
- __init__(experiment_name='Default', tracking_uri=None, log_model_weights=True)[source]
Initializes the MLFlowLogger.
Sets up the MLflow experiment and optionally configures the tracking URI.
- log_params(**kwargs)[source]
Logs hyperparameters to MLflow.
- Parameters:
**kwargs – Arbitrary keyword arguments containing parameters to log.
- log_artifact(tag, value, step, artifact_path=None)[source]
Logs an artifact to MLflow.
- Parameters:
Note
Currently not implemented. Override to add custom artifact logging.
- log_model(tag, value, step, artifact_path=None)[source]
Logs a model checkpoint to MLflow.
- Parameters:
Note
Only logs if log_model_weights is True and artifact_path is provided.
- class deepml.tracking.WandbLogger(delete_intermediate_artifacts_versions=True, **kwargs)[source]
Bases: MLExperimentLogger
Weights & Biases (wandb) experiment logger implementation.
This logger writes experiment data to Weights & Biases, including metrics, parameters, model artifacts, and images. Supports automatic cleanup of intermediate artifact versions to avoid storage overflow.
- wandb
Wandb module instance.
- delete_intermediate_artifacts_versions
Whether to delete old artifact versions automatically.
Note
Requires wandb package to be installed.
- __init__(delete_intermediate_artifacts_versions=True, **kwargs)[source]
Initializes the WandbLogger.
- Parameters:
- log_params(**kwargs)[source]
Logs hyperparameters to Weights & Biases.
- Parameters:
**kwargs – Arbitrary keyword arguments containing parameters to log. These will be added to the wandb config.
- log_artifact(tag, value, step, artifact_path=None)[source]
Logs an artifact to Weights & Biases.
- Parameters:
Note
Image logging for tensors is currently not implemented (TODO).
- log_model(tag, value, step, artifact_path=None)[source]
Logs a model checkpoint to Weights & Biases.
Creates a wandb Artifact for the model and optionally deletes older versions if delete_intermediate_artifacts_versions is True.
- Parameters:
Note
If delete_intermediate_artifacts_versions is enabled, only the latest version of the artifact is retained to save storage space.
deepml.trainer module
- class deepml.trainer.Learner(task, optimizer, criterion, lr_scheduler=None, lr_scheduler_step_policy='epoch', load_state=False, use_amp=False)[source]
Bases: object
Training class for learning model weights using PyTorch.
This trainer provides straightforward training functionality with support for learning rate scheduling, automatic mixed precision (AMP), gradient accumulation, and gradient clipping. It’s designed for single-device training and works well in interactive environments like Jupyter notebooks.
For multi-GPU or distributed training, consider using FabricTrainer or AcceleratorTrainer.
- epochs_completed
Number of epochs completed in training.
- best_val_loss
Best validation loss achieved during training.
- history
Dictionary storing training history metrics across epochs.
- logger
Experiment logger for tracking metrics and artifacts.
Note
This trainer is ideal for:
- Single GPU/CPU training
- Jupyter notebook environments
- Simple training workflows without distributed requirements
- Debugging and prototyping
- __init__(task, optimizer, criterion, lr_scheduler=None, lr_scheduler_step_policy='epoch', load_state=False, use_amp=False)[source]
Initializes the Learner.
- Parameters:
task (Task) – Task object defining the learning task (e.g., classification, segmentation).
optimizer (Optimizer) – PyTorch optimizer instance for parameter updates.
criterion (Module) – Loss function module.
lr_scheduler – Learning rate scheduler instance. Defaults to None.
lr_scheduler_step_policy (str) – When to call scheduler.step(). Valid options are "epoch" (step after each epoch) or "step" (step after each optimizer update). Defaults to "epoch".
load_state (bool) – Whether to resume model training. If True, loads optimizer state, scheduler state (if any), and training history from checkpoint. Defaults to False.
use_amp (bool) – Whether to use automatic mixed precision (AMP) for training. Defaults to False.
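The difference between the two lr_scheduler_step_policy values comes down to where scheduler.step() sits in the training loop. The sketch below is a schematic loop, not the Learner's code: `CountingScheduler` and `run_schedule` are hypothetical, and it assumes one optimizer update per batch (no gradient accumulation).

```python
class CountingScheduler:
    """Stand-in scheduler that only counts step() calls."""

    def __init__(self):
        self.steps = 0

    def step(self):
        self.steps += 1


def run_schedule(scheduler, policy, epochs, batches_per_epoch):
    """Run an empty training loop and return how often the scheduler stepped."""
    assert policy in ("epoch", "step"), 'policy must be "epoch" or "step"'
    for _ in range(epochs):
        for _ in range(batches_per_epoch):
            # optimizer.step() would run here
            if policy == "step":
                scheduler.step()  # once per optimizer update
        if policy == "epoch":
            scheduler.step()  # once per epoch
    return scheduler.steps


print(run_schedule(CountingScheduler(), "epoch", epochs=3, batches_per_epoch=100))  # prints 3
print(run_schedule(CountingScheduler(), "step", epochs=3, batches_per_epoch=100))  # prints 300
```

Schedulers whose length is defined in iterations rather than epochs generally need the "step" policy so that their schedule spans the whole run.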
- set_optimizer(optimizer)[source]
Sets the optimizer for training.
- Parameters:
optimizer (Optimizer) – PyTorch optimizer instance.
- Raises:
AssertionError – If optimizer is not a torch.optim.Optimizer instance.
- set_criterion(criterion)[source]
Sets the loss function for training.
- Parameters:
criterion (Module) – Loss function module.
- Raises:
AssertionError – If criterion is not a torch.nn.Module instance.
- set_lr_scheduler(lr_scheduler, lr_scheduler_step_policy='epoch')[source]
Sets the learning rate scheduler.
- Parameters:
lr_scheduler – Learning rate scheduler instance. If None, no scheduler is used.
lr_scheduler_step_policy (str) – When to call scheduler.step(). Valid options are "epoch" or "step". Defaults to "epoch".
- Raises:
AssertionError – If lr_scheduler_step_policy is not "epoch" or "step".
- save(tag, save_optimizer_state=False, epoch=-1, train_loss=None, val_loss=None)[source]
Saves model checkpoint and training state.
- Parameters:
tag (str) – Name tag for the checkpoint file (without extension).
save_optimizer_state (bool) – Whether to include optimizer state in the checkpoint. Defaults to False.
epoch (int) – Current epoch number. Defaults to -1.
train_loss (float) – Training loss value for this checkpoint. Defaults to None.
val_loss (float) – Validation loss value for this checkpoint. Defaults to None.
- Returns:
Full path to the saved checkpoint file.
- Return type:
Note
Automatically handles DataParallel models
Saves scheduler state if scheduler is configured
Saves AMP scaler state if AMP is enabled
Logs the model to the experiment logger
- validate(loader, criterion, metrics=None, non_blocking=False)
Evaluates the model on the validation data.
- Parameters:
loader (DataLoader) – DataLoader for validation data.
criterion (Module) – Loss function module.
metrics (Dict[str, Module]) – Dictionary mapping metric names to metric modules. Defaults to None.
non_blocking – Whether to use asynchronous CUDA transfers. Defaults to False.
- Returns:
OrderedDict mapping metric names to their average values across all batches.
- Raises:
Exception – If loader is None.
Note
Model is set to eval() mode
Gradients are disabled via @torch.no_grad() decorator
Metrics are computed as running averages
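The running-average bookkeeping mentioned in the note can be sketched in plain Python. This is only an illustration of the idea, not deepml's implementation: the RunningAverage helper and the per-batch values are hypothetical stand-ins for what the criterion and metric modules would return.

```python
from collections import OrderedDict


class RunningAverage:
    """Hypothetical helper that tracks the mean of a stream of values."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        # Incremental mean: new_mean = old_mean + (value - old_mean) / n
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


# Made-up per-batch values standing in for criterion and metric outputs.
batch_values = {"loss": [0.9, 0.7, 0.5], "accuracy": [0.5, 0.75, 1.0]}

averages = OrderedDict()
for name, values in batch_values.items():
    tracker = RunningAverage()
    for value in values:
        tracker.update(value)
    averages[name] = tracker.mean

print(averages)
```

The resulting OrderedDict has the same shape as the documented return value: one averaged number per metric name.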
- fit(train_loader, val_loader=None, epochs=10, steps_per_epoch=None, save_model_after_every_epoch=5, metrics=None, gradient_accumulation_steps=1, gradient_clip_value=0, gradient_clip_algorithm='norm', logger=None, non_blocking=True, image_inverse_transform=None, logger_img_size=None)[source]
Trains the model for the specified number of epochs.
- Parameters:
train_loader (DataLoader) – DataLoader for training data.
val_loader (DataLoader) – DataLoader for validation data. Defaults to None.
epochs (int) – Total number of epochs to train. Defaults to 10.
steps_per_epoch (int) – Number of steps per epoch. Should be around len(train_loader) to ensure full dataset coverage. If None, defaults to len(train_loader). Defaults to None.
save_model_after_every_epoch (int) – Frequency (in epochs) to save model checkpoints. Defaults to 5.
metrics (Dict[str, Module]) – Dictionary mapping metric names to metric instances. Each metric must be a torch.nn.Module with a forward() method. Defaults to None.
gradient_accumulation_steps (int) – Number of steps to accumulate gradients before performing an optimizer step. Simulates larger batch sizes. Must be > 0. Defaults to 1.
gradient_clip_value (float) – Maximum value for gradient clipping. If 0, no clipping is applied. Defaults to 0.
gradient_clip_algorithm (str) – Gradient clipping algorithm. Options are "norm" (clip by gradient norm, recommended) or "value" (clip by gradient value). Defaults to "norm".
logger (MLExperimentLogger) – Experiment logger for tracking metrics and artifacts. If None, uses TensorboardLogger. Defaults to None.
non_blocking (bool) – Whether to use asynchronous CUDA tensor transfers. Defaults to True.
image_inverse_transform (Callable) – Transformation to reverse image normalization for visualization in TensorBoard. Defaults to None.
logger_img_size (Union[int, Tuple[int, int]]) – Image size (int or tuple) for TensorBoard logging. Defaults to None.
- Raises:
AssertionError – If steps_per_epoch > len(train_loader).
AssertionError – If gradient_accumulation_steps <= 0.
AssertionError – If gradient_clip_algorithm not in [“norm”, “value”].
TypeError – If any metric is not a torch.nn.Module with a forward() method.
Note
Supports automatic mixed precision (AMP) if enabled in __init__
Automatically saves best validation model when validation improves
Handles DataParallel models automatically
Learning rate scheduler can step per epoch or per gradient update
For multi-GPU/distributed training, use FabricTrainer or AcceleratorTrainer
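How gradient_accumulation_steps and lr_scheduler_step_policy interact can be sketched by counting updates in a simulated run. The count_updates function below is hypothetical and does no real training. It also assumes that a partial accumulation window is flushed at the end of each epoch, which the documentation above does not state.

```python
def count_updates(steps_per_epoch, epochs, gradient_accumulation_steps=1,
                  lr_scheduler_step_policy="epoch"):
    """Counts optimizer and scheduler steps for a simulated training run."""
    assert gradient_accumulation_steps > 0
    assert lr_scheduler_step_policy in ("epoch", "step")

    optimizer_steps = 0
    scheduler_steps = 0
    for _ in range(epochs):
        for step in range(1, steps_per_epoch + 1):
            # A real loop would run forward/backward here on every batch.
            # Assumption: the last, possibly partial, window is also applied.
            is_update_step = (step % gradient_accumulation_steps == 0
                              or step == steps_per_epoch)
            if is_update_step:
                optimizer_steps += 1
                if lr_scheduler_step_policy == "step":
                    scheduler_steps += 1
        if lr_scheduler_step_policy == "epoch":
            scheduler_steps += 1
    return optimizer_steps, scheduler_steps


per_epoch = count_updates(10, 2, gradient_accumulation_steps=4,
                          lr_scheduler_step_policy="epoch")
per_step = count_updates(10, 2, gradient_accumulation_steps=4,
                         lr_scheduler_step_policy="step")
print(per_epoch, per_step)
```

With the "step" policy the scheduler advances once per optimizer update rather than once per batch, so a scheduler configured by total step count should account for the accumulation factor.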
- predict(loader)[source]
Generates predictions for all data in the loader.
- Parameters:
loader – DataLoader containing data for prediction.
- Returns:
Tuple of (predictions, targets) where predictions are the model outputs and targets are the ground truth labels.
- predict_class(loader)[source]
Generates class predictions with probabilities for all data.
- Parameters:
loader – DataLoader containing data for prediction.
- Returns:
Tuple of (predicted_class, probability, targets) where:
predicted_class: Predicted class labels
probability: Class probabilities or confidence scores
targets: Ground truth labels
- extract_features(loader, no_of_features, features_csv_file, iterations=1, target_known=True)[source]
Extracts features from the model and saves them to a CSV file.
- Parameters:
loader – DataLoader containing data for feature extraction.
no_of_features – Number of features to extract from the model.
features_csv_file – Path to the output CSV file.
iterations – Number of passes through the loader. Defaults to 1.
target_known – Whether ground truth labels are available. If True, includes labels in the CSV file. Defaults to True.
Note
Features are extracted in evaluation mode
CSV format: If target_known=True: [class, feat_0, feat_1, …]
CSV format: If target_known=False: [feat_0, feat_1, …]
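The two CSV layouts in the note can be illustrated with the standard csv module. This is a sketch, not the library's code: write_feature_rows and the sample rows are hypothetical, and whether deepml writes a header row is an assumption made here for readability.

```python
import csv
import io


def write_feature_rows(stream, rows, no_of_features, target_known=True):
    """Writes a header and feature rows in the layout described in the note."""
    header = [f"feat_{i}" for i in range(no_of_features)]
    if target_known:
        header = ["class"] + header
    writer = csv.writer(stream)
    writer.writerow(header)
    for label, features in rows:
        assert len(features) == no_of_features
        # The label column is only present when targets are known.
        prefix = [label] if target_known else []
        writer.writerow(prefix + list(features))


# Made-up (label, feature vector) pairs.
rows = [(0, [0.1, 0.2, 0.3]), (1, [0.4, 0.5, 0.6])]

with_labels = io.StringIO()
write_feature_rows(with_labels, rows, no_of_features=3, target_known=True)
labelled_lines = with_labels.getvalue().splitlines()

without_labels = io.StringIO()
write_feature_rows(without_labels, rows, no_of_features=3, target_known=False)
unlabelled_lines = without_labels.getvalue().splitlines()

print(labelled_lines[0])
print(unlabelled_lines[0])
```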
- show_predictions(loader, image_inverse_transform=None, samples=9, cols=3, figsize=(10, 10), target_known=True)[source]
Visualizes model predictions on sample images.
- Parameters:
loader – DataLoader containing data for visualization.
image_inverse_transform – Transformation to reverse image normalization for display. Defaults to None.
samples – Number of samples to display. Defaults to 9.
cols – Number of columns in the visualization grid. Defaults to 3.
figsize – Figure size as (width, height) tuple. Defaults to (10, 10).
target_known – Whether ground truth targets are available for comparison. Defaults to True.
deepml.transforms module
- class deepml.transforms.AlbumentationTorchTranforms(albu_transforms=None, torch_transforms=None)[source]
Bases: object
This class composes albumentations augmentations with torchvision.transforms.ToTensor(). It first applies the albumentations transforms, followed by the torch transforms, if any.
The albumentations transforms are applied to both the image and the mask, whereas the torch transforms are applied only to the input image and not to the target mask.
- class deepml.transforms.ImageInverseTransform(mean, std)[source]
Bases: object
Implementation of the inverse transform for images using mean and standard deviation. Accepts image_batch in #B, #C, #H, #W order.
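The arithmetic behind an inverse normalization can be sketched per pixel in plain Python: if normalization computes (x - mean) / std for each channel, the inverse is x * std + mean. The functions below are hypothetical illustrations rather than the class's code. The mean and std values are the commonly quoted ImageNet statistics, and it is an assumption that ImageNetInverseTransform uses these exact numbers.

```python
# Commonly quoted ImageNet channel statistics (assumed, not taken from deepml).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize(pixel, mean, std):
    """Per-channel normalization: (value - mean) / std."""
    return tuple((value - m) / s for value, m, s in zip(pixel, mean, std))


def inverse_normalize(pixel, mean, std):
    """Per-channel inverse: value * std + mean."""
    return tuple(value * s + m for value, m, s in zip(pixel, mean, std))


original = (0.2, 0.5, 0.8)  # one RGB pixel with values in [0, 1]
normalized = normalize(original, IMAGENET_MEAN, IMAGENET_STD)
restored = inverse_normalize(normalized, IMAGENET_MEAN, IMAGENET_STD)
print(restored)
```

On a real batch the same formula is applied channel-wise across all B, H and W positions.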
- class deepml.transforms.ImageNetInverseTransform[source]
Bases: ImageInverseTransform
ImageNet inverse transform. Accepts image_batch in #B, #C, #H, #W order.
deepml.utils module
- deepml.utils.transform_target(target, classes=None)[source]
Accepts a target value that is either a one-dimensional torch.Tensor or a scalar (int, float).
- deepml.utils.transform_input(x, image_inverse_transform=None)[source]
Accepts input image batch in #BCHW form
- Parameters:
x – input image batch
image_inverse_transform – an optional inverse transform to apply
- Returns:
- deepml.utils.get_random_samples_batch_from_dataset(dataset, samples=8)[source]
Returns a random batch of samples from the dataset.
- Parameters:
dataset – torch.utils.data.Dataset or any iterable dataset.
samples – Number of samples to return. Defaults to 8.
- Returns:
List of samples from the dataset.
- Return type:
list
- deepml.utils.blend(image, mask, alpha=0.6, beta=0.4)[source]
Blends an input image with a mask using specified alpha and beta values.
- Parameters:
image (Tensor) – torch.Tensor of size BCHW; grayscale or RGB image to blend with the mask of size #HWC or #HW.
mask (Tensor) – torch.Tensor of size BCHW; mask to blend with the input image of size #HWC or #HW.
alpha (float) – Alpha blending factor for the RGB image.
beta (float) – Beta blending factor for the mask.
- Returns:
torch.Tensor of original size, blended image.
- Return type:
array
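As a rough illustration of alpha/beta blending, the sketch below computes a per-pixel weighted sum, alpha * image + beta * mask. This is the usual form of such a blend, but the documentation above does not confirm that deepml.utils.blend uses exactly this formula, so treat it as an assumption. blend_pixel and the sample rows are hypothetical.

```python
def blend_pixel(image_value, mask_value, alpha=0.6, beta=0.4):
    """Weighted sum of one image value and one mask value (assumed formula)."""
    return alpha * image_value + beta * mask_value


# One row of a grayscale image and the matching row of a binary mask.
image_row = [1.0, 0.5, 0.0]
mask_row = [0.0, 1.0, 1.0]

blended_row = [blend_pixel(i, m) for i, m in zip(image_row, mask_row)]
print(blended_row)
```

With the default factors summing to 1, blended values stay within the input range when both inputs lie in [0, 1].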
deepml.visualize module
- deepml.visualize.plot_images(images, labels=None, cols=4, figsize=(10, 10), fontsize=14)[source]
Displays a grid of images with optional labels using matplotlib.
Creates a multi-panel figure showing images in a grid layout with optional titles for each image.
- Parameters:
images (List[ndarray]) – List of images as numpy arrays in HWC or HW format.
labels (List[str]) – List of labels/titles for each image. If provided, must have the same length as images. Defaults to None.
cols (int) – Number of columns in the grid. Rows are calculated automatically. Defaults to 4.
figsize (Tuple[int, int]) – Size of the matplotlib figure as (width, height) tuple. Defaults to (10, 10).
fontsize (int) – Font size for image titles. Defaults to 14.
Note
The function automatically calculates the number of rows needed based on the number of images and columns. Axes ticks are hidden for cleaner visualization.
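The automatic row calculation described in the note amounts to a ceiling division. The helper below is a hypothetical sketch of that arithmetic; the exact expression used inside plot_images is not shown in this reference.

```python
import math


def grid_shape(num_images, cols=4):
    """Returns (rows, cols) large enough to hold num_images panels."""
    if cols <= 0:
        raise ValueError("cols must be positive")
    rows = math.ceil(num_images / cols)
    return rows, cols


print(grid_shape(9, cols=4))
print(grid_shape(8, cols=4))
```

When the image count is not a multiple of cols, the last row is only partly filled.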
- deepml.visualize.plot_images_with_title(image_generator, samples, cols=4, figsize=(10, 10), fontsize=14)[source]
Displays a grid of images with colored titles using matplotlib.
Creates a multi-panel figure showing images in a grid layout with titles that can have custom colors (useful for showing correct/incorrect predictions).
- Parameters:
image_generator –
Generator or iterable yielding tuples of (image, title, title_color) where:
image: numpy array in HWC or HW format
title: String title for the image
title_color: Optional color string (e.g., ‘red’, ‘green’, ‘#ff0000’). If None, uses default matplotlib text color.
samples (int) – Total number of images to display from the generator.
cols – Number of columns in the grid. Defaults to 4.
figsize – Size of the matplotlib figure as (width, height) tuple. Defaults to (10, 10).
fontsize – Font size for image titles. Defaults to 14.
Note
This function is commonly used for showing model predictions where title colors indicate correctness (green for correct, red for incorrect).
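A generator that yields the (image, title, title_color) tuples described above might look like the following sketch. The prediction_items function, the class names, and the placeholder images are all hypothetical. In practice each image would be a numpy array in HWC or HW format.

```python
def prediction_items(images, predictions, targets, classes):
    """Yields (image, title, title_color) tuples, green when the prediction is correct."""
    for image, predicted, target in zip(images, predictions, targets):
        title = f"pred: {classes[predicted]} / true: {classes[target]}"
        title_color = "green" if predicted == target else "red"
        yield image, title, title_color


classes = ["cat", "dog"]
# Nested lists stand in for HW image arrays to keep the sketch dependency-free.
images = [[[0, 1], [1, 0]], [[1, 1], [0, 0]]]
predictions = [0, 1]
targets = [0, 0]

items = list(prediction_items(images, predictions, targets, classes))
for _, title, title_color in items:
    print(title, title_color)
```

Such a generator could then be passed as image_generator, with samples set to the number of tuples it yields.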
- deepml.visualize.plot_images_with_bboxes(image_generator, samples, cols=4, figsize=(10, 10), fontsize=14, classes=None, class_color_map=None, cmap='tab10')[source]
Displays a grid of images with bounding boxes and class labels.
Creates a multi-panel figure showing images with drawn bounding boxes and labeled class names. Each bounding box is colored based on its class.
- Parameters:
image_generator –
Generator or iterable yielding tuples of (image, title, bboxes) where:
image: numpy array in HWC or HW format
title: String title for the image
bboxes: List of bounding boxes, each as [class_id, xmin, ymin, width, height] where class_id can be an integer index or string label.
samples (int) – Total number of images to display from the generator.
cols – Number of columns in the grid. Defaults to 4.
figsize – Size of the matplotlib figure as (width, height) tuple. Defaults to (10, 10).
fontsize – Font size for image titles and bbox labels. Defaults to 14.
classes (List[str]) – Optional list mapping class indices to class names. If provided and class_id is an integer, uses classes[class_id] as the label. Defaults to None.
class_color_map (dict) – Optional dictionary mapping class IDs or names to color strings (e.g., ‘#ff0000’, ‘red’). If a class has no mapping, falls back to the colormap. Defaults to None.
cmap (str) – Matplotlib colormap name used as fallback for bbox colors when class_color_map doesn’t provide a color. Defaults to “tab10”.
Note
Bounding boxes are drawn with red edges and labeled with a colored background box containing the class name. Label text is white for better visibility against the colored background.
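The bounding box format [class_id, xmin, ymin, width, height] and the label and color lookup rules can be sketched as below. describe_bbox is a hypothetical helper, not part of deepml. The fallback color string stands in for whatever the colormap would supply, and the lookup order (class ID first, then class name) is an assumption.

```python
def describe_bbox(bbox, classes=None, class_color_map=None, fallback_color="C0"):
    """Resolves the label, color, and corner coordinates for one bounding box."""
    class_id, xmin, ymin, width, height = bbox

    # Integer IDs are translated through `classes` when it is provided.
    if classes is not None and isinstance(class_id, int):
        label = classes[class_id]
    else:
        label = str(class_id)

    # Assumed lookup order: by class ID, then by label, then the fallback.
    color_map = class_color_map or {}
    color = color_map.get(class_id, color_map.get(label, fallback_color))

    corners = (xmin, ymin, xmin + width, ymin + height)
    return label, color, corners


indexed = describe_bbox([1, 10, 20, 30, 40], classes=["cat", "dog"],
                        class_color_map={"dog": "#ff0000"})
named = describe_bbox(["tree", 0, 0, 5, 5])
print(indexed)
print(named)
```

Because the box is given as a top-left corner plus width and height, the opposite corner is (xmin + width, ymin + height).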
- deepml.visualize.show_images_from_loader(loader, image_inverse_transform=None, samples=9, cols=3, figsize=(5, 5), classes=None, title_color=None)[source]
Displays random samples of images from a DataLoader.
Randomly selects and displays images from a PyTorch DataLoader with their corresponding labels as titles.
- Parameters:
loader – PyTorch DataLoader returning batches of (image, label) tensors.
image_inverse_transform – Optional callable to reverse image normalization or transformations before display (e.g., denormalization). Defaults to None.
samples – Number of images to display. Defaults to 9.
cols – Number of columns in the grid. Defaults to 3.
figsize – Size of the matplotlib figure as (width, height) tuple. Defaults to (5, 5).
classes – Optional list of class names for converting label indices to text. If None and loader.dataset has a ‘classes’ attribute, uses that. Defaults to None.
title_color – Optional color string for all image titles (e.g., ‘blue’). Defaults to None.
Note
Images are randomly sampled from the DataLoader. If the DataLoader’s dataset has a ‘classes’ attribute, it will be used automatically for label names unless overridden by the classes parameter.
- deepml.visualize.show_images_from_dataset(dataset, image_inverse_transform=None, samples=9, cols=3, figsize=(10, 10), classes=None, title_color=None)[source]
Displays random samples of images from a Dataset.
Randomly selects and displays images from a PyTorch Dataset with their corresponding labels as titles.
- Parameters:
dataset – PyTorch Dataset returning (image, label) tuples.
image_inverse_transform – Optional callable to reverse image normalization or transformations before display (e.g., denormalization). Defaults to None.
samples – Number of images to display. Defaults to 9.
cols – Number of columns in the grid. Defaults to 3.
figsize – Size of the matplotlib figure as (width, height) tuple. Defaults to (10, 10).
classes – Optional list of class names for converting label indices to text. If None and dataset has a ‘classes’ attribute, uses that. Defaults to None.
title_color – Optional color string for all image titles (e.g., ‘blue’). Defaults to None.
Note
Images are randomly sampled from the Dataset. If the Dataset has a ‘classes’ attribute, it will be used automatically for label names unless overridden by the classes parameter.
- deepml.visualize.show_images_from_folder(img_dir, images=None, open_file_func=None, samples=9, cols=3, figsize=(10, 10), title_color=None)[source]
Displays random samples of images from a folder.
Randomly selects and displays images from a directory with filenames as titles.
- Parameters:
img_dir – Directory path containing image files.
images – Optional list of image filenames to display. If None, all files in img_dir are used and randomly sampled. Defaults to None.
open_file_func (Callable) – Optional callable to open image files. Should accept a file path and return an image object. If None, uses PIL.Image.open. Defaults to None.
samples – Number of images to display. If fewer images exist, displays all. Defaults to 9.
cols – Number of columns in the grid. Defaults to 3.
figsize – Size of the matplotlib figure as (width, height) tuple. Defaults to (10, 10).
title_color – Optional color string for all image titles (e.g., ‘blue’). Defaults to None.
Note
If the number of requested samples exceeds available images, all images are displayed. Images are randomly sampled without replacement.
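The sampling behavior in the note (without replacement, capped at the number of available images) can be sketched with the standard random module. pick_files and its seed parameter are hypothetical, added here only to make the example reproducible.

```python
import random


def pick_files(filenames, samples=9, seed=None):
    """Samples filenames without replacement, capped at the number available."""
    rng = random.Random(seed)
    k = min(samples, len(filenames))
    return rng.sample(list(filenames), k)


few = pick_files(["a.png", "b.png"], samples=9)
many = pick_files([f"img_{i}.png" for i in range(20)], samples=5, seed=0)
print(sorted(few))
print(len(many), len(set(many)))
```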
- deepml.visualize.show_images_from_dataframe(dataframe, img_dir=None, image_file_name_column='image', image_filepath_column=None, open_file_func=None, label_column=None, bbox_label_column=None, samples=9, cols=3, figsize=(10, 10), classes=None, class_color_map=None, cmap='tab10')[source]
Displays random samples of images from a pandas DataFrame.
Randomly selects and displays images specified in a DataFrame, with optional labels and bounding boxes.
- Parameters:
dataframe – pandas DataFrame containing image file information.
img_dir – Directory containing images. Required if image_filepath_column is not provided. Defaults to None.
image_file_name_column – Column name containing image filenames (used with img_dir). Defaults to “image”.
image_filepath_column – Column name containing absolute image file paths. If provided, takes precedence over image_file_name_column and img_dir. Defaults to None.
open_file_func (Callable) – Optional callable to open image files. Should accept a file path and return an image object. If None, uses PIL.Image.open. Defaults to None.
label_column (str) – Column name containing image labels. If None, displays row indices instead. Defaults to None.
bbox_label_column (str) – Column name containing bounding box data. Each entry should be a list of bounding boxes in format [class_id, xmin, ymin, width, height]. If provided, displays images with bounding boxes. Defaults to None.
samples – Number of random images to display from the DataFrame. Defaults to 9.
cols – Number of columns in the grid. Defaults to 3.
figsize – Size of the matplotlib figure as (width, height) tuple. Defaults to (10, 10).
classes – Optional list mapping class indices to class names for bbox labels. Defaults to None.
class_color_map (dict) – Optional dictionary mapping class IDs or names to color strings (e.g., ‘#ff0000’, ‘red’). Used for bbox colors. Defaults to None.
cmap (str) – Matplotlib colormap name used as fallback for bbox colors when class_color_map doesn’t provide a color. Defaults to “tab10”.
Note
If bbox_label_column is provided, displays images with bounding boxes using plot_images_with_bboxes.
Otherwise, displays images with titles using plot_images_with_title.
Images are randomly sampled from the DataFrame.
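The precedence between image_filepath_column and the img_dir plus image_file_name_column pair can be sketched as a small path resolver. resolve_image_path is a hypothetical helper operating on a plain dict in place of a DataFrame row. Raising ValueError when neither source is available is an assumption, not documented behavior.

```python
import os


def resolve_image_path(row, img_dir=None, image_file_name_column="image",
                       image_filepath_column=None):
    """Returns the image path for one row, preferring the full-path column."""
    if image_filepath_column is not None:
        # A full path column takes precedence over img_dir and the file name.
        return row[image_filepath_column]
    if img_dir is None:
        # Assumed error handling; the real function may behave differently.
        raise ValueError("img_dir is required without image_filepath_column")
    return os.path.join(img_dir, row[image_file_name_column])


row = {"image": "cat.png", "path": "/data/full/cat.png"}

from_dir = resolve_image_path(row, img_dir="images")
from_column = resolve_image_path(row, img_dir="images",
                                 image_filepath_column="path")
print(from_dir)
print(from_column)
```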