Main Features

Loading data

Detectools provides a custom Dataset (DetectionDataset) and a custom DataLoader (DetectionLoader) that inherit from the PyTorch Dataset and DataLoader classes and read and transform both images and targets. To use this Dataset, data should be stored in a folder with the following structure:

. Dataset Name
    ├── images
    │   ├── name01.png
    │   ├── name02.png
    │   └── ...
    └── coco_annotations.json

The file “coco_annotations.json” contains detection annotations in COCO format.
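A quick sanity check of this layout can catch path problems before the dataset is built. The sketch below is not part of Detectools: `check_dataset_layout` is a hypothetical helper that uses only the standard library, and it assumes the usual COCO top-level sections `images`, `annotations` and `categories`, with each image entry carrying a `file_name`.

```python
import json
from pathlib import Path


def check_dataset_layout(dataset_path):
    """Return a list of problems found in a dataset folder (empty if none).

    Hypothetical helper, not part of Detectools.
    """
    root = Path(dataset_path)
    images_dir = root / "images"
    annotations_file = root / "coco_annotations.json"
    problems = []

    if not images_dir.is_dir():
        problems.append("missing 'images' folder")
    if not annotations_file.is_file():
        problems.append("missing 'coco_annotations.json'")
        return problems

    coco = json.loads(annotations_file.read_text())
    # Standard COCO detection files carry these three top-level sections.
    for key in ("images", "annotations", "categories"):
        if key not in coco:
            problems.append(f"COCO file has no '{key}' section")

    # Every image listed in the annotations should exist on disk.
    for image in coco.get("images", []):
        if not (images_dir / image["file_name"]).is_file():
            problems.append(f"image not found: {image['file_name']}")
    return problems
```

An empty list means the folder matches the structure shown above.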

class detectools.data.DetectionDataset(dataset_path: str, preprocessing: Callable = Compose(ConvertImageDtype(), Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], inplace=False)), augmentation: Augmentation | None = None, convert_labels_dict: Dict[int, int] | None = None, min_border_size: float = 10, rescale_boxes_from_masks: bool = False)

Detectools Dataset.

Parameters:
  • dataset_path (str) – Path to dataset.

  • preprocessing (Callable[Tensor], optional) – Callable that preprocesses image tensors (e.g. scales values to the 0-1 range, then normalizes each channel with a mean and standard deviation). Defaults to build_preprocessing().

  • augmentation (Augmentation, optional) – Augmentation pipeline for both images and targets. Defaults to None.

  • convert_labels_dict (Dict[int, int], optional) – Dict to dynamically convert labels ({old_labels : new_labels}). Defaults to None.

  • min_border_size (float, optional) – Minimum box border size; boxes with a border smaller than min_border_size are removed. Defaults to 10.

  • rescale_boxes_from_masks (bool, optional) – Compute boxes from masks after transformations (time consuming). Defaults to False.
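The default preprocessing shown in the signature chains a dtype conversion with a per-channel normalization. The arithmetic can be sketched on a single RGB pixel in plain Python; `preprocess_pixel` is a hypothetical name, and the sketch assumes 8-bit input that is scaled by 255 before normalization.

```python
# Per-channel statistics taken from the default Normalize(...) in the signature.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def preprocess_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then normalize each channel.

    Hypothetical illustration of the arithmetic, not the Detectools code.
    """
    scaled = [value / 255.0 for value in rgb]
    return [(s - m) / d for s, m, d in zip(scaled, MEAN, STD)]


# A saturated red channel ends up well above zero, an empty blue channel below.
print(preprocess_pixel((255, 128, 0)))
```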

Example:

>>> from detectools.data import DetectionDataset
>>> data_path = "path/to/data"
>>> dataset = DetectionDataset(data_path)
>>> image, target, image_name = dataset[1]
>>> print(type(image), type(target), type(image_name))
<class 'torch.Tensor'> <class 'DetectionFormat'> <class 'str'>
>>> print(image.shape, target.size, image_name)
torch.Size([3, 512, 512]) 5 img_01.png

Attributes

image_folder

Path to images folder.

Type:

str

coco

COCO dataset.

Type:

COCO

coco_indexes

Indexes of images in coco dataset.

Type:

List[int]

name_id_dict

Dict mapping each image name to its COCO index.

Type:

Dict[str, int]

categories

List of categories, as in the “categories” section of the COCO JSON file.

Type:

List[Dict[str, Any]]

classes

List of class names.

Type:

List[str]

preprocessing

Callable that preprocesses image tensors (e.g. scales values to the 0-1 range, then normalizes each channel with a mean and standard deviation). Defaults to build_preprocessing().

Type:

Callable, optional

augmentation

Augmentation pipeline for both images and targets. Defaults to None.

Type:

Augmentation, optional

convert_labels_dict

Dict to dynamically convert labels ({old_labels : new_labels}). Defaults to None.

Type:

Dict[int, int], optional
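As an illustration of what such a {old_label: new_label} mapping does, the sketch below remaps a list of integer labels in plain Python. `convert_labels` is a hypothetical helper; leaving labels that are absent from the dict unchanged is an assumption, not documented Detectools behavior.

```python
def convert_labels(labels, convert_labels_dict):
    """Remap labels with a {old_label: new_label} dict.

    Hypothetical helper; labels missing from the dict are kept as-is
    (an assumption made for this sketch).
    """
    return [convert_labels_dict.get(label, label) for label in labels]


# Merge classes 2 and 3 into class 1, leave class 0 untouched.
merged = convert_labels([0, 2, 3, 1], {2: 1, 3: 1})
print(merged)
```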

min_border_size

Minimum box border size; boxes with a border smaller than min_border_size are removed. Defaults to 10.

Type:

float, optional
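One way to read this filter is sketched below in plain Python. It assumes boxes are given as (x_min, y_min, x_max, y_max) in pixels and that "border" means the shorter side of the box; both are assumptions for illustration, and `filter_small_boxes` is a hypothetical name.

```python
def filter_small_boxes(boxes, min_border_size=10):
    """Keep boxes whose shorter side is at least min_border_size.

    Hypothetical helper; boxes are (x_min, y_min, x_max, y_max) tuples.
    """
    kept = []
    for x_min, y_min, x_max, y_max in boxes:
        shorter_side = min(x_max - x_min, y_max - y_min)
        if shorter_side >= min_border_size:
            kept.append((x_min, y_min, x_max, y_max))
    return kept


boxes = [(0, 0, 50, 50), (10, 10, 15, 40), (5, 5, 15, 15)]
# The second box is only 5 pixels wide, so it is dropped.
print(filter_small_boxes(boxes))
```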

rescale_boxes_from_masks

Compute boxes from masks after transformations (time consuming). Defaults to False.

Type:

bool, optional
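Recomputing a box from a mask amounts to finding the tightest rectangle around the non-zero pixels. The plain-Python sketch below shows the idea on a mask stored as a list of rows; `box_from_mask` is a hypothetical helper, and returning inclusive pixel coordinates (or None for an empty mask) is a choice made for this example. For tensors, torchvision.ops.masks_to_boxes performs a comparable computation.

```python
def box_from_mask(mask):
    """Return the tight (x_min, y_min, x_max, y_max) box around non-zero pixels.

    Hypothetical helper; mask is a list of equally long rows of 0/1 values.
    Coordinates are inclusive; an empty mask yields None.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    width = len(mask[0])
    cols = [x for x in range(width) if any(row[x] for row in mask)]
    return (min(cols), min(rows), max(cols), max(rows))


mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(box_from_mask(mask))
```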

Methods

export_dataset(dataset_path: str | Path, indices: Sequence[int] | None = None)

Export all or a part of dataset annotations to a json file.

Parameters:
  • dataset_path (Union[str, Path]) – Path to write json.

  • indices (Sequence[int], optional) – Indices of images to export. Defaults to None.
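Exporting part of a COCO file means keeping the selected images together with the annotations that point to them. The sketch below shows this on a plain COCO dict; `subset_coco` is a hypothetical helper, and treating indices as positions in the `images` list is an assumption that may differ from how Detectools interprets them.

```python
import json


def subset_coco(coco, indices=None):
    """Return a COCO dict restricted to the images at the given positions.

    Hypothetical helper; indices=None keeps every image.
    """
    images = coco["images"]
    if indices is not None:
        images = [images[i] for i in indices]
    kept_ids = {image["id"] for image in images}
    # Keep only annotations attached to a kept image.
    annotations = [a for a in coco["annotations"] if a["image_id"] in kept_ids]
    return {
        "images": images,
        "annotations": annotations,
        "categories": coco["categories"],
    }


coco = {
    "images": [{"id": 1, "file_name": "a.png"}, {"id": 2, "file_name": "b.png"}],
    "annotations": [{"id": 10, "image_id": 1}, {"id": 11, "image_id": 2}],
    "categories": [{"id": 0, "name": "object"}],
}
# Serialize the subset that contains only the second image.
print(json.dumps(subset_coco(coco, indices=[1])))
```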

get_image_data(image_name: str, transform: bool = False) -> Tuple[Tensor, BaseFormat, str]

Return the image Tensor and BaseFormat for image_name. Equivalent to __getitem__, but indexed by image name.

Parameters:
  • image_name (str) – Name of image to gather.

  • transform (bool, optional) – Whether to apply transformations. Defaults to False.

Returns:

  • Image Tensor.

  • BaseFormat.

  • Name of image.

Return type:

Tuple[Tensor, BaseFormat, str]

load_from_coco(index: int) -> Tuple[Tensor, BaseFormat, str]

Gather the image, its name and the corresponding annotations from the COCO dataset.

Parameters:

index (int) – Index of Image/Target pair.

Returns:

  • Transformed Image.

  • Transformed Target.

  • Name of image.

Return type:

Tuple[Tensor, BaseFormat, str]

transform(image: Tensor, target: BaseFormat) -> Tuple[Tensor, BaseFormat]

Apply transformations to an image/target pair:
  • Augmentation

  • Preprocessing

  • Labels conversion (if convert_labels_dict)

  • Sanitize boxes

Parameters:
  • image (Tensor) – Tensor image.

  • target (BaseFormat) – Target.

Returns:

  • Transformed Image.

  • Transformed Target.

Return type:

Tuple[Tensor, BaseFormat]

class detectools.data.DetectionLoader(*args, **kwargs)

Child class of DataLoader that batches images and BaseFormats. DetectionLoader supports all features of torch DataLoaders (Sampler, etc.).

>>> from detectools.data import DetectionLoader
>>> loader = DetectionLoader(dataset, batch_size=2)
>>> for i, (images, targets, names) in enumerate(loader):
...     if i >= 1:
...         break
...     print(type(images), type(targets), type(names))
...     print(images.shape, len(targets.formats.keys()), len(names))
<class 'torch.Tensor'> <class 'BatchedFormats'> <class 'list'>
torch.Size([2, 3, 512, 512]) 2 2

Methods

collate_fn(batch: List[Tuple[Tensor, BaseFormat]]) -> Tuple[Tensor, BatchedFormats]

Parameters:

batch (List[Tuple[Tensor, BaseFormat]]) – List of pairs image/target.

Returns:

  • Batch of images (N, 3, H, W).

  • BaseFormats wrapped in a BatchedFormats instance.

Return type:

Tuple[Tensor, BatchedFormats]

pad_to_larger(images: List[Tensor], targets: List[BaseFormat]) -> Tuple[List[Tensor], List[BaseFormat]]

Pad images and targets to the largest image size.

Parameters:
  • images (List[Tensor]) – Images.

  • targets (List[BaseFormat]) – Targets.

Returns:

  • Padded images.

  • Padded targets.

Return type:

Tuple[List[Tensor], List[BaseFormat]]
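Padding to a common size is what lets images of different shapes be stacked into one batch. The plain-Python sketch below pads single-channel images, stored as lists of rows, on the bottom and right; `pad_to_largest` is a hypothetical helper, and the choice of padding sides and fill value is an assumption. With bottom/right padding, absolute box coordinates stay valid without any shift.

```python
def pad_to_largest(images, fill=0):
    """Pad each image (list of rows) to the largest height and width.

    Hypothetical helper; padding is added on the bottom and right only.
    """
    max_h = max(len(image) for image in images)
    max_w = max(len(image[0]) for image in images)
    padded = []
    for image in images:
        # Extend every row to the common width, then append full filler rows.
        rows = [row + [fill] * (max_w - len(row)) for row in image]
        rows += [[fill] * max_w for _ in range(max_h - len(rows))]
        padded.append(rows)
    return padded


small = [[1, 2], [3, 4]]      # 2 x 2 image
wide = [[5, 6, 7]]            # 1 x 3 image
print(pad_to_largest([small, wide]))
```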

Training process

The whole training process can be run with the Trainer class. It wraps all training utilities (optimizer, scheduler, tensorboard, etc.).

class detectools.train.Trainer(model: BaseModel, otpimizer: Optimizer, log_dir: str = '', metrics: List[Metric] = [], device: Literal['cpu', 'cuda'] = 'cpu', nms_threshold: float = 0.45, confidence_threshold: float = 0.5)

Trainer wraps the whole training process.

Parameters:
  • model (BaseModel) – Detectools model.

  • otpimizer (Optimizer) – Optimizer (from torch.optim).

  • log_dir (str, optional) – Path to store tensorboard logs. Defaults to “” (no log).

  • metrics (List[Metric], optional) – List of detectools metrics computed during validation. Defaults to [].

  • device (Literal['cpu', 'cuda'], optional) – Device to use for training. Defaults to “cpu”.

  • nms_thr (float, optional) – IoU threshold above which two boxes are considered overlapping by the NMS algorithm used in the validation loop. Defaults to 0.45.

  • confidence_thr (float, optional) – Minimum confidence score for a predicted object to be kept, used in the validation loop. Defaults to 0.5.

Example: Training loop.

>>> from detectools.data import DetectionDataset, DetectionLoader
>>> from detectools.train import Trainer
>>> from torch.utils.data import random_split
>>> import torch
>>> trainer: Trainer  # Trainer already built
>>> data_path = "path/to/data"
>>> dataset = DetectionDataset(data_path)
>>> train_set, valid_set, test_set = random_split(dataset, [0.8, 0.2, 0.0])
>>> train_loader, valid_loader = DetectionLoader(train_set), DetectionLoader(valid_set)
>>> for e in range(10):
...     trainer.train_epoch(train_loader, e)
...     trainer.valid_epoch(valid_loader, e)
>>> torch.save(trainer.model, "/model/path/model.pth")

Attributes:

model

Detectools model.

Type:

BaseModel

otpimizer

Optimizer (from torch.optim).

Type:

Optimizer

log_dir

Path to store tensorboard logs.

Type:

str, optional

metrics

List of detectools metrics computed during validation.

Type:

List[Metric], optional

device

Device to use for training.

Type:

Literal['cpu', 'cuda'], optional

nms_thr

IoU threshold above which two boxes are considered overlapping by the NMS algorithm used in the validation loop.

Type:

float, optional

confidence_thr

Minimum confidence score for a predicted object to be kept, used in the validation loop.

Type:

float, optional
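How the two thresholds interact can be shown with a generic, class-agnostic filter: low-confidence predictions are dropped first, then greedy NMS removes boxes that overlap a higher-scored box too much. This is a plain-Python sketch under stated assumptions (boxes as (x_min, y_min, x_max, y_max), strict comparison against nms_thr, no per-class handling); `iou` and `filter_predictions` are hypothetical names, not the Detectools implementation.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_predictions(boxes, scores, nms_thr=0.45, confidence_thr=0.5):
    """Return indices of boxes kept after confidence filtering and greedy NMS."""
    # Drop low-confidence boxes, then visit the rest from highest score down.
    candidates = sorted(
        (i for i, score in enumerate(scores) if score >= confidence_thr),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept = []
    for i in candidates:
        # Keep a box only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= nms_thr for j in kept):
            kept.append(i)
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.3]
print(filter_predictions(boxes, scores))
```

In this example the second box overlaps the first with an IoU of about 0.68 and is suppressed, while the last box falls below the confidence threshold.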

Methods

compute_metrics(predictions: BatchedFormats, targets: BatchedFormats) -> Dict[str, Tensor]

Compute metrics.

Parameters:
  • predictions (BatchedFormats) – Predictions.

  • targets (BatchedFormats) – Targets.

Returns:

  • Metric values.

Return type:

Dict[str, Tensor]

epoch(loader: DetectionLoader, ep_number: int, mode: Literal['Train', 'Valid'] = 'Train', tag: str = '') -> Dict[str, Tensor]

Run an epoch in training or validation mode.

Parameters:
  • loader (DetectionLoader) – DetectionLoader.

  • ep_number (int) – Epoch number.

  • mode (Literal['Train', 'Valid'], optional) – Mode of the epoch; if “Valid”, metrics are computed and gradients are disabled. Defaults to “Train”.

  • tag (str, optional) – Tag to link to epoch. Defaults to “”.

Returns:

  • Epoch values (losses and metrics).

Return type:

Dict[str, Tensor]

log_string(epoch_dict: Dict[str, Tensor]) -> str

Convert the epoch dict to a string.

Parameters:

epoch_dict (Dict[str, Tensor]) – Dict of epoch values to display.

Returns:

  • String to print with epoch values.

Return type:

str
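A formatting step of this kind can be sketched as follows. The output layout (name, colon, fixed-precision value, entries joined with " | ") is an assumption chosen for illustration, and `format_epoch_values` is a hypothetical name; float() is used so that zero-dimensional tensors and plain numbers are handled the same way.

```python
def format_epoch_values(epoch_dict, precision=4):
    """Render a {name: value} dict as a single printable line.

    Hypothetical helper; the layout is an illustrative choice.
    """
    parts = [
        f"{name}: {float(value):.{precision}f}"
        for name, value in epoch_dict.items()
    ]
    return " | ".join(parts)


line = format_epoch_values({"loss": 0.5, "box_loss": 0.25})
print(line)
```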

train_epoch(loader: DetectionLoader, ep_number: int, tag: str = 'Train', *args, **kwargs) -> Dict[str, Tensor]

Run a training epoch.

Parameters:
  • loader (DetectionLoader) – DetectionLoader.

  • ep_number (int) – Epoch number.

  • tag (str, optional) – Tag to link to epoch. Defaults to “Train”.

Returns:

  • Epoch values (losses).

Return type:

Dict[str, Tensor]

train_step(images: Tensor, targets: BatchedFormats) -> Dict[str, Tensor]

Run forward pass, loss computation and backward pass.

Parameters:
  • images (Tensor) – Batch of images.

  • targets (BatchedFormats) – Batch targets.

Returns:

  • Dict of losses (the total loss is stored under the key ‘loss’).

Return type:

Dict[str, Tensor]

valid_epoch(loader: DetectionLoader, ep_number: int, tag: str = 'Valid', *args, **kwargs) -> Dict[str, Tensor]

Run a validation epoch.

Parameters:
  • loader (DetectionLoader) – DetectionLoader.

  • ep_number (int) – Epoch number.

  • tag (str, optional) – Tag to link to epoch. Defaults to “Valid”.

Returns:

  • Epoch values (losses and metrics).

Return type:

Dict[str, Tensor]

valid_step(images: Tensor, targets: BatchedFormats) -> Tuple[Dict[str, Tensor], Dict[str, Dict[str, Tensor]]]

Run a validation step and return the losses and metrics.

Parameters:
  • images (Tensor) – Batch of images.

  • targets (BatchedFormats) – Targets.

Returns:

  • Losses and metrics values.

Return type:

Tuple[Dict[str, Tensor], Dict[str, Dict[str, Tensor]]]

Inference process

class detectools.inference.Predictor(model: BaseModel, patch_size: Tuple[int] | None = None, overlap: float = 0.0, nms_thr: float = 0.45, confidence_thr: float = 0.5, max_detection: int = 300, batch_size: int = 16, preprocessing: Callable = Compose(ConvertImageDtype(), Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], inplace=False)), device: Literal['cpu', 'cuda'] = 'cpu')

The Predictor class wraps the whole inference process to predict on images, including large images that are split into patches.

Parameters:
  • model (BaseModel) – Model from detectools.

  • patch_size (Tuple[int], optional) – Size of the patches to predict on; if None, patch_size = image_size. Defaults to None.

  • overlap (float, optional) – Overlap between patches when a large image is patchified; if 0.0, no patchification is done. Defaults to 0.0.

  • nms_thr (float, optional) – IoU threshold to consider 2 boxes as overlapping in NMS algorithm. Defaults to 0.45.

  • confidence_thr (float, optional) – Minimum confidence score to keep predicted objects. Defaults to 0.5.

  • max_detection (int, optional) – Maximum number of objects to keep in each prediction; those with the highest scores are kept. Defaults to 300.

  • batch_size (int, optional) – Batch size for the inference process; patches are processed in batches. Defaults to 16.

  • preprocessing (Callable, optional) – Preprocessing function to prepare image. Defaults to build_preprocessing().

  • device (Literal['cpu', 'cuda'], optional) – Device to use for prediction. Defaults to “cpu”.
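Patch-based inference relies on a grid of patch positions covering the image. The sketch below computes such a grid with a generic sliding window in plain Python. It treats overlap as a fraction of the patch size and shifts the last patch inward so that it stays inside the image; both are assumptions for illustration, as is the tiling produced for overlap = 0.0, which may differ from what Detectools does. `patch_starts` and `patch_grid` are hypothetical names.

```python
def patch_starts(length, patch, overlap=0.0):
    """Start offsets of patches along one axis of size `length`."""
    if length <= patch:
        return [0]
    # Distance between consecutive patch origins.
    stride = max(1, int(patch * (1.0 - overlap)))
    starts = list(range(0, length - patch + 1, stride))
    # Add a final patch flush with the border if the grid stops short of it.
    if starts[-1] != length - patch:
        starts.append(length - patch)
    return starts


def patch_grid(height, width, patch_size, overlap=0.0):
    """(top, left) corners of all patches for an image of height x width."""
    patch_h, patch_w = patch_size
    return [
        (top, left)
        for top in patch_starts(height, patch_h, overlap)
        for left in patch_starts(width, patch_w, overlap)
    ]


print(patch_starts(1000, 512, overlap=0.25))
print(len(patch_grid(1024, 1024, (512, 512))))
```

Predictions made on each patch would then be shifted by the patch corner and merged, for instance with NMS, before the final filtering.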

Example: Predict on image.

>>> from detectools.inference import Predictor
>>> from detectools import write_json
>>> import torch
>>> model = torch.load("/path/to/model.pth")
>>> predictor = Predictor(model=model)
>>> predictions = predictor.predict("/path/to/rgb/image.png")
>>> pred_coco_dict = predictions.coco()
>>> write_json("/path/to/output.json", pred_coco_dict)

Attributes:

model

Model from detectools.

Type:

BaseModel

patch_size

Size of the patches to predict on; if None, patch_size = image_size. Defaults to None.

Type:

Tuple[int]

overlap

Overlap between patches when a large image is patchified; if 0.0, no patchification is done. Defaults to 0.0.

Type:

float

nms_thr

IoU threshold to consider 2 boxes as overlapping in NMS algorithm. Defaults to 0.45.

Type:

float

confidence_thr

Minimum confidence score to keep predicted objects. Defaults to 0.5.

Type:

float

max_detection

Maximum number of objects to keep in each prediction; those with the highest scores are kept. Defaults to 300.

Type:

int
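Capping the number of detections reduces to a top-k selection on the scores. A minimal plain-Python sketch, with `keep_top_scores` as a hypothetical helper returning the indices of the retained predictions:

```python
def keep_top_scores(scores, max_detection=300):
    """Return indices of the max_detection highest scores, best first.

    Hypothetical helper illustrating the top-k selection.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:max_detection]


scores = [0.2, 0.9, 0.5, 0.7]
print(keep_top_scores(scores, max_detection=2))
```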

batch_size

Batch size for the inference process; patches are processed in batches. Defaults to 16.

Type:

int

preprocessing

Preprocessing function to prepare image. Defaults to build_preprocessing().

Type:

Callable

device

Device to use for prediction. Defaults to “cpu”.

Type:

Literal['cpu', 'cuda']

Methods:

forward_pass(batch_patchs: Tensor) -> List[BaseFormat]

Get predictions for patches.

Parameters:

batch_patchs (Tensor) – Batch of image patches.

Returns:

  • Patches predictions.

Return type:

List[BaseFormat]

predict(image: Tensor | str, visualisation_path: str = '') -> BaseFormat

Predict on image.

Parameters:
  • image (Union[Tensor, str]) – Image to predict on. If it is a path, the image is loaded before prediction.

  • visualisation_path (str, optional) – Path to save the prediction visualisation. Defaults to “” (no visualisation).

Returns:

  • Prediction.

Return type:

BaseFormat