This paper proposes a cascaded architecture to improve bounding box
quality in object detection. It builds on the Faster R-CNN framework, which
runs in two stages.
Lin et al. (2017) Feature Pyramid Networks for Object Detection
This paper proposes the feature pyramid network for scale-invariant object detection, i.e., a model that can detect objects at different scales. One way to tackle the scale-invariance problem is to form an image pyramid at different scales and process each level with the same model. This is a brute-force approach....
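
The brute-force image pyramid mentioned above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the method of the paper: the helper names (`downsample`, `image_pyramid`, `detect_all_scales`), the nearest-neighbour downsampling, and the toy `model` are all assumptions made for the example.

```python
def downsample(image, factor=2):
    """Nearest-neighbour downsampling: keep every `factor`-th row and column."""
    return [row[::factor] for row in image[::factor]]


def image_pyramid(image, min_size=2, factor=2):
    """Return progressively smaller copies of `image` (a list of rows)."""
    levels = [image]
    while (len(levels[-1]) // factor >= min_size
           and len(levels[-1][0]) // factor >= min_size):
        levels.append(downsample(levels[-1], factor))
    return levels


def detect_all_scales(image, model, min_size=2, factor=2):
    """Brute force: run the same `model` once per pyramid level."""
    return [model(level) for level in image_pyramid(image, min_size, factor)]


# Toy 8x8 "image" and a hypothetical stand-in model that only reports
# the size of the input it was given.
image = [[x + 8 * y for x in range(8)] for y in range(8)]
shapes = detect_all_scales(image, lambda img: (len(img), len(img[0])))
```

The cost is visible in `detect_all_scales`: the model is run once per level, so inference time grows with the number of scales.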
He et al. (2016) Deep Residual Learning for Image Recognition
This is the ResNet paper. It not only proposes the shortcut connection
architecture, but also gives architectures for image classification and
object detection.
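
As a rough illustration of a shortcut connection, the sketch below computes y = relu(F(x) + x), where F is two small layers. It is a simplification under stated assumptions: plain matrix-vector products stand in for the convolutions, batch normalization is omitted, and the names `relu`, `linear`, and `residual_block` are hypothetical helpers for this example only.

```python
def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def linear(v, weights):
    """Matrix-vector product; `weights` is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x) with F(x) = w2 @ relu(w1 @ x).

    The `+ x` term is the shortcut connection: the input skips the two
    layers and is added back to their output.
    """
    fx = linear(relu(linear(x, w1)), w2)
    return relu([f + xi for f, xi in zip(fx, x)])


x = [1.0, -2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

# With all-zero weights F(x) = 0, so the block reduces to relu(x).
y_zero = residual_block(x, zeros, zeros)
y_identity = residual_block(x, identity, identity)
```

With zero weights the block passes its input through the shortcut alone, which shows why a residual layer can fall back to an identity mapping when its learned part contributes nothing.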
He, Gkioxari, Dollár, and Girshick (2017) Mask R-CNN
This is the Mask R-CNN paper. It extends Faster R-CNN to produce pixel masks, i.e., for instance segmentation, by adding a branch that predicts a segmentation mask on each Region of Interest (RoI). However, this requires RoIAlign instead of RoIPooling, to avoid error introduced...
Shelhamer, Long, and Darrell (2017) Fully Convolutional Networks for Semantic Segmentation
This journal paper and its 2015 conference version propose a solution for semantic segmentation, i.e., classifying each pixel of an image into an object class. This is more refined than a bounding box. The proposed solution uses only convolutional layers, with no fully connected layers, so that...