Lin et al. (2017) Feature Pyramid Networks for Object Detection

This paper proposes the feature pyramid network (FPN) for scale-invariant object detection, i.e., a model that can detect objects at different scales. One way to tackle the scale-invariance problem is to build an image pyramid at several scales and process each level with the same model. This is a brute-force approach....
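The brute-force image pyramid described above can be sketched roughly as follows. This is only an illustration, not the paper's code: the 2x2 average-pooling downsampler, the function names, and the `model` callable are all assumptions made for the example.

```python
import numpy as np


def downsample_2x(image):
    """Halve height and width by averaging each 2x2 block (odd edges are cropped)."""
    h, w = image.shape[:2]
    h2, w2 = h // 2, w // 2
    cropped = image[: h2 * 2, : w2 * 2]
    # Give each 2x2 block its own pair of axes, then average over them.
    blocks = cropped.reshape(h2, 2, w2, 2, *image.shape[2:])
    return blocks.mean(axis=(1, 3))


def image_pyramid(image, num_levels=3):
    """Return [full resolution, 1/2, 1/4, ...] copies of the image."""
    levels = [np.asarray(image, dtype=np.float64)]
    for _ in range(num_levels - 1):
        levels.append(downsample_2x(levels[-1]))
    return levels


def detect_over_pyramid(image, model, num_levels=3):
    """Run the same (hypothetical) `model` callable on every pyramid level."""
    return [model(level) for level in image_pyramid(image, num_levels)]
```

The cost is visible in `detect_over_pyramid`: the model runs once per level, so inference time grows with the number of scales.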

He, Gkioxari, Dollár, and Girshick (2017) Mask R-CNN

This is the Mask R-CNN paper. It extends Faster R-CNN to produce pixel-level masks, i.e., for instance segmentation, by adding a branch that predicts a segmentation mask on each Region of Interest (RoI). However, this has to be supported by RoIAlign instead of RoIPool, to avoid the error introduced...
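To make the RoIAlign point concrete, here is a small sketch contrasting a coordinate snapped to the integer grid with bilinear interpolation at the exact fractional location, which is the sampling idea RoIAlign relies on. It is a simplified illustration under stated assumptions: it samples a single point on a 2D feature map, whereas a full RoIAlign aggregates several sampled points per output bin, and the function names are made up for the example.

```python
import numpy as np


def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2D feature map at a fractional (y, x) location."""
    h, w = feature.shape
    # Clamp the location to the valid range of the map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = (1 - dx) * feature[y0, x0] + dx * feature[y0, x1]
    bottom = (1 - dx) * feature[y1, x0] + dx * feature[y1, x1]
    return (1 - dy) * top + dy * bottom


def quantized_sample(feature, y, x):
    """Snap (y, x) down to an integer cell first (illustrative quantization)."""
    h, w = feature.shape
    yi = min(max(int(np.floor(y)), 0), h - 1)
    xi = min(max(int(np.floor(x)), 0), w - 1)
    return feature[yi, xi]


# A ramp whose value equals the x coordinate, so any misalignment is easy to see.
ramp = np.tile(np.arange(5.0), (5, 1))
aligned = bilinear_sample(ramp, 1.3, 2.6)
snapped = quantized_sample(ramp, 1.3, 2.6)
```

On the ramp, the interpolated value tracks the true coordinate while the snapped value is off by the fractional part. For box classification that offset matters little, but for pixel-accurate masks it is a misalignment.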