torchvision.models.detection.mask_rcnn — Torchvision main …
https://pytorch.org/.../torchvision/models/detection/mask_rcnn.htmlDuring training, the model expects both the input tensors, as well as a targets (list of dictionary), containing: - boxes (``FloatTensor[N, 4]``): the ground-truth boxes in ``[x1, y1, x2, y2]`` format, with ``0 <= x1 < x2 <= W`` and ``0 <= y1 < y2 <= H``. - labels (``Int64Tensor[N]``): the class label for each ground-truth box - masks (``UInt8Tensor[N, H, W]``): the segmentation binary masks ...
1. Predict with pre-trained Mask RCNN models — gluoncv 0.11.0 …
https://cv.gluon.ai/build/examples_instance/demo_mask_rcnn.htmlThe Mask RCNN model returns predicted class IDs, confidence scores, bounding boxes coordinates and segmentation masks. Their shape are (batch_size, num_bboxes, 1), (batch_size, num_bboxes, 1) (batch_size, num_bboxes, 4), and (batch_size, num_bboxes, mask_size, mask_size) respectively. For the model used in this tutorial, mask_size is 14.
[1703.06870] Mask R-CNN - arxiv.org
https://arxiv.org/abs/1703.0687020.03.2017 · The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps.