Models
Detectools models inherit from the abstract class detectools.BaseModel, which defines the functions required by the training and inference processes (through the Trainer and Predictor classes). They also inherit from a class of the package they come from (e.g. Ultralytics or Hugging Face) so that they can be used in development.
- class detectools.models.Mask2Former(num_classes: int = 1, pretrain: Literal['large', 'medium', 'small', 'tiny'] = 'tiny', overlap_mask_thr: float = 0.8)
Mask2Former model class in detectools. This class inherits from Mask2FormerForUniversalSegmentation (Hugging Face transformers) and BaseModel (detectools). It builds a Mask2Former model from the Hugging Face transformers model architectures.
- Parameters:
num_classes (int, optional) – Number of classes. Defaults to 1.
pretrain (Literal['large', 'medium', 'small', 'tiny'], optional) – Size of the pretrained model. Defaults to “tiny”.
overlap_mask_thr (float, optional) – Mask threshold to merge masks from Mask2FormerOutput. Defaults to 0.8.
Attributes:
- confidence_thr
Confidence score threshold above which an object is considered a true prediction.
- Type:
float
- max_detection
Maximum number of objects to predict on one image.
- Type:
int
- nms_threshold
IoU threshold above which two boxes are considered overlapping by the Non-Maximum Suppression algorithm.
- Type:
float
- num_classes
Number of classes.
- Type:
int
- size_configs
Dict of existing depth configuration for Mask2Former.
- Type:
Dict[str, str]
Methods:
- build_boxes(masks: Tensor) → Tensor
Build boxes from segmentation mask.
- Parameters:
masks (Tensor) – Segmentation mask.
- Returns:
Boxes (N, 4).
- Return type:
Tensor
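The box-building step can be illustrated with a minimal sketch. It assumes the input is a stack of binary masks (one per object) and that boxes use the (x_min, y_min, x_max, y_max) layout; neither is stated for build_boxes itself, and the function name masks_to_boxes below is hypothetical. Plain nested lists are used instead of tensors so the example runs without dependencies.

```python
from typing import List, Tuple

Mask = List[List[int]]  # one binary mask: H rows of W values (0 or 1)


def masks_to_boxes(masks: List[Mask]) -> List[Tuple[int, int, int, int]]:
    """Return one (x_min, y_min, x_max, y_max) box per binary mask."""
    boxes = []
    for mask in masks:
        # Rows and columns that contain at least one foreground pixel.
        ys = [y for y, row in enumerate(mask) if any(row)]
        xs = [x for row in mask for x, value in enumerate(row) if value]
        if not ys:
            # Empty mask: fall back to a degenerate box (an arbitrary choice).
            boxes.append((0, 0, 0, 0))
            continue
        boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes


example_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
example_boxes = masks_to_boxes([example_mask])
```

For tensor inputs, torchvision.ops.masks_to_boxes performs the same kind of computation on an (N, H, W) mask tensor and returns an (N, 4) tensor.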
- build_results(raw_outputs: Mask2FormerForUniversalSegmentationOutput, spatial_size: Tuple[int, int]) → BatchedFormats
Transform model outputs into BatchedFormats for results.
- Parameters:
raw_outputs (Mask2FormerForUniversalSegmentationOutput) – Mask2Former output.
spatial_size (Tuple[int, int]) – Size of original image (H, W).
- Returns:
Model output as BatchedFormats.
- Return type:
BatchedFormats
- get_predictions(images: Tensor) → BatchedFormats
Prepare images, apply the model forward pass and build results.
- Parameters:
images (Tensor) – RGB images Tensor.
- Returns:
Predictions for images as BatchedFormats.
- Return type:
BatchedFormats
- inputs_to_device(input: Any, device: Literal['cpu', 'cuda'])
Send Mask2Former inputs to device.
- prepare(images: Tensor, targets: BatchedFormats | None = None) → Dict[str, Tensor | Dict[Any, Any]]
Transform images and targets into Mask2Former specific format for prediction & loss computation.
- Parameters:
images (Tensor) – Batch images.
targets (BatchedFormats, optional) – Batched targets from DetectionDataset.
- Returns:
Images data prepared for Mask2Former.
If targets: images + targets prepared for Mask2Former.
- Return type:
Union[Any, Tuple[Any]]
- prepare_target(target: SegmentationFormat) → Tuple[Tensor, Dict[int, int]]
Prepare targets for Mask2Former model.
- Parameters:
target (SegmentationFormat) – Target.
- Returns:
Segmentation map.
Dict of correspondence {object_id : object_label}.
- Return type:
Tuple[Tensor, Dict[int, int]]
- run_forward(images: Tensor, targets: BatchedFormats, predict: bool = False) → Dict[str, Tensor] | Tuple[Dict[str, Tensor], BatchedFormats]
Compute the loss from images and targets. If predict is True, return both the loss dict and the predictions.
- Parameters:
images (Tensor) – Batch RGB images.
targets (BatchedFormats) – Batch targets.
predict (bool, optional) – Whether to return predictions. Defaults to False.
- Returns:
Loss dict.
If predict: Predictions.
- Return type:
Union[Dict[str, Tensor], Tuple[Dict[str, Tensor], BatchedFormats]]
- class detectools.models.YoloDetection(architecture: str = 'yolov8m', num_classes: int = 1, pretrained=True, confidence_thr: float = 0.5, max_detection: int = 300, nms_threshold: float = 0.45, *args, **kwargs)
YOLO detection model class in detectools. This class inherits from DetectionModel (Ultralytics) and BaseModel (detectools). It loads a YOLO architecture from the Ultralytics repository. If pretrained is True, it loads a pretrained model from Ultralytics.
- Parameters:
architecture (str, optional) – Architecture used to build the YOLO model. See the architectures available in Ultralytics. Defaults to “yolov8m”.
num_classes (int, optional) – Number of classes in the task. Defaults to 1.
pretrained (bool, optional) – Whether to use pretrained weights. Defaults to True.
confidence_thr (float, optional) – Confidence score threshold above which an object is considered a true prediction. Defaults to 0.5.
max_detection (int, optional) – Maximum number of objects to predict on one image. Defaults to 300.
nms_threshold (float, optional) – IoU threshold above which two boxes are considered overlapping by the Non-Maximum Suppression algorithm. Defaults to 0.45.
Attributes:
- confidence_thr
Confidence score threshold above which an object is considered a true prediction.
- Type:
float
- max_detection
Maximum number of objects to predict on one image.
- Type:
int
- nms_threshold
IoU threshold above which two boxes are considered overlapping by the Non-Maximum Suppression algorithm.
- Type:
float
- num_classes
Number of classes.
- Type:
int
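How confidence_thr, nms_threshold and max_detection interact can be shown with a minimal sketch of greedy Non-Maximum Suppression. This is a generic, class-agnostic version written for illustration; the helper names iou and filter_detections are hypothetical, boxes are assumed to be in (x_min, y_min, x_max, y_max) format, and the actual filtering in detectools or Ultralytics may differ (for example per-class suppression or a strict versus non-strict comparison).

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(
    boxes: Sequence[Box],
    scores: Sequence[float],
    confidence_thr: float = 0.5,
    nms_threshold: float = 0.45,
    max_detection: int = 300,
) -> List[int]:
    """Return the indices of the boxes kept, highest score first."""
    # 1. Drop boxes whose confidence is below the threshold.
    candidates = [i for i, score in enumerate(scores) if score >= confidence_thr]
    # 2. Visit the remaining boxes from the most to the least confident.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in candidates:
        if len(kept) >= max_detection:
            break  # 4. Cap the number of detections per image.
        # 3. Keep a box only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in kept):
            kept.append(i)
    return kept


demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
demo_scores = [0.9, 0.8, 0.7, 0.3]
demo_kept = filter_detections(demo_boxes, demo_scores)
```

In this example the second box overlaps the first with an IoU of about 0.68 and is suppressed, and the last box is discarded because its score is below confidence_thr.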
Methods:
- build_results(raw_outputs: List[Tensor], prebuild_outputs: Tensor) → BatchedFormats
Transform model outputs into Batch DetectionFormat for results.
- Parameters:
raw_outputs (List[Tensor]) – Model outputs.
prebuild_outputs (Tensor) – Extracted boxes from YOLO raw outputs.
- Returns:
Batched predictions.
- Return type:
BatchedFormats
- compute_loss(raw_outputs: Tensor, targets: Dict[str, Tensor]) → Dict[str, Tensor]
Compute loss with predictions & targets.
- Parameters:
raw_outputs (Tensor) – Raw output of model.
targets (Dict[str, Tensor]) – Targets in YOLO format.
- Returns:
Loss dict with total loss (key: “loss”) & sublosses.
- Return type:
Dict[str, Tensor]
- get_predictions(images: Tensor) → BatchedFormats
Prepare images, apply the YOLO forward pass and build results.
- Parameters:
images (Tensor) – RGB images Tensor.
- Returns:
Predictions for images as BatchedFormats.
- Return type:
BatchedFormats
- prepare(images: Tensor, targets: BatchedFormats | None = None) → Tensor | Tuple[Tensor, Dict[str, Tensor]]
Transform images and targets into YOLO specific format for prediction & loss computation.
- Parameters:
images (Tensor) – Batch images.
targets (BatchedFormats, optional) – Batched targets from DetectionDataset.
- Returns:
Images data prepared for YOLO.
If targets: images + targets prepared for YOLO.
- Return type:
Union[Tensor, Tuple[Tensor, Dict[str, Tensor]]]
- prepare_image(images: Tensor) → Tuple[Tensor, Tuple[int]]
Pad images if needed & return padding values.
- Parameters:
images (Tensor) – Batch images.
- Returns:
Padded images.
Padding values.
- Return type:
Tuple[Tensor, Tuple[int]]
- prepare_target(targets: BatchedFormats) → Dict[str, Tensor]
Transform DetectionFormat targets into YOLO targets format.
- Parameters:
targets (BatchedFormats) – Batch targets.
- Returns:
Targets in YOLO format.
- Return type:
Dict[str, Tensor]
- retrieve_spatial_size(raw_outputs: List[Tensor]) → Tuple[int]
Retrieve image shape from raw_outputs and stride values.
- Parameters:
raw_outputs (List[Tensor]) – Raw outputs from YOLO model.
- Returns:
Size of input image (H, W).
- Return type:
Tuple[int]
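The idea of recovering the input size from an output and its stride can be sketched as follows. The sketch assumes that a feature map produced at stride s has a spatial size of (H / s, W / s), which holds for padded inputs whose sides are divisible by the stride; the function name and the example strides are illustrative assumptions, not the detectools implementation.

```python
from typing import Tuple


def spatial_size_from_feature_map(feature_hw: Tuple[int, int], stride: int) -> Tuple[int, int]:
    """Recover the (H, W) of the input image from one feature map and its stride."""
    feature_h, feature_w = feature_hw
    # Each feature-map cell covers a stride x stride patch of the input image.
    return feature_h * stride, feature_w * stride


# An 80 x 80 feature map at stride 8 corresponds to a 640 x 640 input.
square_size = spatial_size_from_feature_map((80, 80), stride=8)
# A 20 x 15 feature map at stride 32 corresponds to a 640 x 480 input.
rect_size = spatial_size_from_feature_map((20, 15), stride=32)
```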
- run_forward(images: Tensor, targets: BatchedFormats, predict: bool = False) → Dict[str, Tensor] | Tuple[Dict[str, Tensor], BatchedFormats]
Compute the loss from images and targets. If predict is True, return both the loss dict and the predictions.
- Parameters:
images (Tensor) – Batch RGB images.
targets (BatchedFormats) – Batch targets.
predict (bool, optional) – Whether to return predictions. Defaults to False.
- Returns:
Loss dict.
If predict: predictions.
- Return type:
Union[Dict[str, Tensor], Tuple[Dict[str, Tensor], BatchedFormats]]
- to_device(device: Literal['cpu', 'cuda'])
Send model & criterion to device.
- Parameters:
device (Literal['cpu', 'cuda']) – Device to send the model to.
- yolo_pad_requirements(input_object: Tensor | DetectionFormat) → List[int]
Return values for padding to fit ‘divisible by 32’ requirement.
- Parameters:
input_object (Union[Tensor, DetectionFormat]) – Input to pad (image or DetectionFormat).
- Returns:
Padding values.
- Return type:
List[int]
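The ‘divisible by 32’ requirement can be turned into padding amounts with simple modular arithmetic. The sketch below is an assumption about how such values could be computed: the function name pad_to_multiple is hypothetical, and the layout of the returned values (extra rows, extra columns) is chosen for readability and may not match the list returned by yolo_pad_requirements.

```python
from typing import Tuple


def pad_to_multiple(height: int, width: int, multiple: int = 32) -> Tuple[int, int]:
    """Return (pad_h, pad_w): rows and columns to add so both sides are divisible by `multiple`."""
    # (-n) % m is the smallest non-negative value p such that (n + p) % m == 0.
    pad_h = (-height) % multiple
    pad_w = (-width) % multiple
    return pad_h, pad_w


# A 600 x 500 image needs 8 extra rows and 12 extra columns to reach 608 x 512.
pad_h, pad_w = pad_to_multiple(600, 500)
```

An image whose sides are already multiples of 32, such as 640 x 640, needs no padding and yields (0, 0).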
- class detectools.models.Yolov8Segmentation(architecture: str = 'yolov8n-seg', pretrained=True, confidence_thr: float = 0.5, max_detection: int = 300, nms_threshold: float = 0.45, num_classes: int = 1, *args, **kwargs)
YOLO segmentation model class in detectools. This class inherits from SegmentationModel (Ultralytics) and BaseModel (detectools). It loads a YOLO architecture from the Ultralytics repository. If pretrained is True, it loads a pretrained model from Ultralytics.
- Parameters:
architecture (str, optional) – Architecture used to build the YOLO model. See the architectures available in Ultralytics. Defaults to “yolov8n-seg”.
pretrained (bool, optional) – Whether to use pretrained weights. Defaults to True.
confidence_thr (float, optional) – Confidence score threshold above which an object is considered a true prediction. Defaults to 0.5.
max_detection (int, optional) – Maximum number of objects to predict on one image. Defaults to 300.
nms_threshold (float, optional) – IoU threshold above which two boxes are considered overlapping by the Non-Maximum Suppression algorithm. Defaults to 0.45.
num_classes (int, optional) – Number of classes in the task. Defaults to 1.
Attributes:
- confidence_thr
Confidence score threshold above which an object is considered a true prediction.
- Type:
float
- max_detection
Maximum number of objects to predict on one image.
- Type:
int
- nms_threshold
IoU threshold above which two boxes are considered overlapping by the Non-Maximum Suppression algorithm.
- Type:
float
- num_classes
Number of classes.
- Type:
int
Methods:
- build_results(raw_output: Tuple[Tensor, ...]) → BatchedFormats
Transform model outputs into Batch SegmentationFormat for results.
- Parameters:
raw_output (Tuple[Tensor, ...]) – Model outputs.
- Returns:
Batched predictions.
- Return type:
BatchedFormats
- compute_loss(predictions: Tuple, target: Dict) → Dict[str, Tensor]
Compute loss with predictions & targets.
- Parameters:
predictions (Tuple) – Raw output of model.
target (Dict) – Targets in YOLO format.
- Returns:
Loss dict with total loss (key: “loss”) & sublosses.
- Return type:
Dict[str, Tensor]
- get_predictions(images: Tensor) → BatchedFormats
Prepare images, apply the YOLO forward pass and build results.
- Parameters:
images (Tensor) – RGB images Tensor.
- Returns:
Predictions for images as BatchedFormats.
- Return type:
BatchedFormats
- mask2yolo(mask: Tensor) → Tensor
Convert a stacked binary mask to a YOLO mask, i.e. a (1, H, W) mask with values in [0, …, N_objs]. This shape is suitable for the YOLOv8 loss.
- Parameters:
mask (Tensor) – Stacked binary mask (N, H, W).
- Returns:
YOLO segmentation mask.
- Return type:
Tensor
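The conversion from a stacked binary mask to a single index mask can be sketched as below. The sketch uses nested lists rather than tensors, assumes at least one object mask, assigns index 0 to the background and indices 1 to N to the objects in stacking order, and lets later objects overwrite earlier ones where masks overlap; the overlap rule and the function name are assumptions made for this illustration.

```python
from typing import List

Mask = List[List[int]]  # one binary mask: H rows of W values (0 or 1)


def stacked_to_index_mask(masks: List[Mask]) -> List[Mask]:
    """Convert N binary masks (N, H, W) into one index mask of shape (1, H, W)."""
    height, width = len(masks[0]), len(masks[0][0])
    index_mask = [[0] * width for _ in range(height)]  # 0 marks the background
    for object_index, mask in enumerate(masks, start=1):
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    # Where masks overlap, the later object wins.
                    index_mask[y][x] = object_index
    return [index_mask]  # leading dimension of size 1


stacked = [
    [[1, 1], [0, 0]],  # object 1 covers the top row
    [[0, 1], [0, 1]],  # object 2 covers the right column
]
yolo_like_mask = stacked_to_index_mask(stacked)
```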
- prebuild_output(raw_output: Tuple[Tensor, ...]) → Tuple[Tensor, ...]
Unpack Yolov8-seg (eval mode) raw results.
- Parameters:
raw_output (Tuple[Tensor, ...]) – Yolov8 raw eval mode results.
- Returns:
boxes (N_batch, N_obj, cxcywh).
cls_scores (N_batch, N_cls).
mask_weights (N_batch, N_obj, 32).
protos (N_batch, protos).
- Return type:
Tuple[Tensor, ...]
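The boxes returned here are in cxcywh format (center x, center y, width, height), whereas other methods such as proto2mask expect XYXY boxes. The conversion between the two is a short piece of arithmetic, sketched below with a hypothetical helper; whether detectools performs the conversion this way internally is not stated in this reference.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def cxcywh_to_xyxy(box: Box) -> Box:
    """Convert (center_x, center_y, width, height) to (x_min, y_min, x_max, y_max)."""
    cx, cy, w, h = box
    half_w, half_h = w / 2, h / 2
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


# A 20 x 10 box centered at (50, 40).
corner_box = cxcywh_to_xyxy((50.0, 40.0, 20.0, 10.0))
```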
- prepare(images: Tensor, targets: BatchedFormats | None = None) → Any | Tuple[Any]
Transform images and targets into YOLO specific format for prediction & loss computation.
- Parameters:
images (Tensor) – Batch images.
targets (BatchedFormats, optional) – Batched targets from DetectionDataset.
- Returns:
Images data prepared for YOLO.
If targets: images + targets prepared for YOLO.
- Return type:
Union[Tensor, Tuple[Tensor, Dict[str, Tensor]]]
- prepare_image(images: Tensor) → Tuple[Tensor, int]
Pad images if needed & return padding values.
- Parameters:
images (Tensor) – Batch images.
- Returns:
Padded images.
Padding values.
- Return type:
Tuple[Tensor, Tuple[int]]
- prepare_target(target: BatchedFormats) → Dict[str, Tensor]
Transform SegmentationFormat targets into YOLO-seg targets format.
- Parameters:
target (BatchedFormats) – Batch targets.
- Returns:
Targets in YOLO format.
- Return type:
Dict[str, Tensor]
- proto2mask(protos: Tensor, weights: Tensor, boxes: Tensor, shape: Tuple[int]) → Tensor
Combine protos and weights to get masks, then crop instances from boxes (useful in predictions).
- Parameters:
protos (Tensor) – Sub masks (32, …).
weights (Tensor) – YOLO mask weights (32, …).
boxes (Tensor) – Boxes (N, 4) in XYXY format.
shape (Tuple[int]) – Original image size (H, W).
- Returns:
YOLO segmentation mask.
- Return type:
Tensor
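The combination of protos and weights can be sketched for a single object as a weighted sum of prototype masks, followed by a sigmoid, a threshold and a crop to the object's box. This is a simplified illustration on nested lists: the function name, the 0.5 threshold and the half-open box convention are assumptions, and the resizing to the original image size that the shape parameter suggests is left out.

```python
import math
from typing import List, Sequence, Tuple

Plane = List[List[float]]  # one prototype mask: H rows of W values


def proto_to_mask(
    protos: Sequence[Plane],
    weights: Sequence[float],
    box: Tuple[int, int, int, int],
    threshold: float = 0.5,
) -> List[List[int]]:
    """Build one binary instance mask from K prototypes and K mask weights."""
    height, width = len(protos[0]), len(protos[0][0])
    x_min, y_min, x_max, y_max = box
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Crop: pixels outside the box stay at 0.
            if not (x_min <= x < x_max and y_min <= y < y_max):
                continue
            # Weighted sum of the prototypes at this pixel.
            logit = sum(w * proto[y][x] for w, proto in zip(weights, protos))
            probability = 1.0 / (1.0 + math.exp(-logit))
            mask[y][x] = int(probability > threshold)
    return mask


demo_protos = [
    [[1.0, -1.0], [1.0, 1.0]],
    [[0.0, 2.0], [0.0, 0.0]],
]
demo_weights = [2.0, 0.5]
full_mask = proto_to_mask(demo_protos, demo_weights, box=(0, 0, 2, 2))
top_row_mask = proto_to_mask(demo_protos, demo_weights, box=(0, 0, 2, 1))
```

With real YOLOv8-seg outputs there are 32 prototypes per image and one vector of 32 weights per detected object, so the same weighted sum is repeated for every object.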
- retrieve_spatial_size(raw_outputs: List[Tensor]) → Tuple[int, int]
Retrieve image shape from raw_outputs and stride values.
- Parameters:
raw_outputs (List[Tensor]) – Raw outputs from YOLO model.
- Returns:
Size of input image (H, W).
- Return type:
Tuple[int, int]
- run_forward(images: Tensor, targets: BatchedFormats, predict: bool = False) → Dict[str, Tensor] | Tuple[Dict[str, Tensor], BatchedFormats]
Compute the loss from images and targets. If predict is True, return both the loss dict and the predictions.
- Parameters:
images (Tensor) – Batch RGB images.
targets (BatchedFormats) – Batch targets.
predict (bool, optional) – Whether to return predictions. Defaults to False.
- Returns:
Loss dict.
If predict: predictions.
- Return type:
Union[Dict[str, Tensor], Tuple[Dict[str, Tensor], BatchedFormats]]
- to_device(device: Literal['cpu', 'cuda'])
Send model & criterion to device.
- Parameters:
device (Literal['cpu', 'cuda']) – Device to send the model to.
- yolo_pad_requirements(input_object: Tensor | SegmentationFormat) → Tuple[int, ...]
Return values for padding to fit ‘divisible by 32’ requirement.
- Parameters:
input_object (Union[Tensor, SegmentationFormat]) – Input to pad (image or SegmentationFormat).
- Returns:
Padding values.
- Return type:
Tuple[int, ...]