Liu et al (2016) SSD: Single Shot MultiBox Detector

This paper differs from previous work in that older object detection approaches first hypothesize bounding boxes, resample features for each box, and then apply a classifier. This paper proposes a network that does not resample features for bounding box hypotheses yet is equally accurate. It can do high speed...

Wang et al (2020) HRNet

Classification networks such as AlexNet, VGGNet, GoogLeNet, and ResNet all reduce the spatial size and produce a low-resolution representation. Networks such as U-Net produce a high-resolution representation, for example by using dilated convolution and upsampling. This paper proposes HRNet, which maintains a high-resolution representation through the whole process. It starts from a high-resolution convolution stream...

Wei et al (2016) Convolutional Pose Machines

This paper proposes Convolutional Pose Machines (CPMs), a deep learning computer vision model that identifies human poses in the form of keypoints. The output of the model is a set of 2D belief maps, i.e., heatmaps of the predicted probability of each keypoint's location. The architecture of the...
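To make the belief-map output concrete, here is a minimal sketch of reading one keypoint location from one map by taking the highest-scoring cell. The function name and the toy map are hypothetical, and taking the maximum is a common decoding convention assumed here, not a step this summary attributes to the paper.

```python
def keypoint_from_belief_map(belief_map):
    """Return (row, col, score) of the highest-scoring cell in a 2D belief map.

    belief_map is a list of rows, each a list of scores (hypothetical layout).
    """
    best = (0, 0, belief_map[0][0])
    for r, row in enumerate(belief_map):
        for c, score in enumerate(row):
            if score > best[2]:
                best = (r, c, score)
    return best


# Toy 3x4 belief map for a single keypoint (made-up values).
toy_map = [
    [0.0, 0.1, 0.1, 0.0],
    [0.1, 0.3, 0.7, 0.2],
    [0.0, 0.2, 0.3, 0.1],
]
row, col, score = keypoint_from_belief_map(toy_map)
```

A model that predicts several keypoints would produce one such map per keypoint, and the same lookup would be applied to each map in turn.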

Lin et al (2017) Focal Loss for Dense Object Detection

This paper proposes RetinaNet, along with the focal loss function for better training of object detection models. Object detection models fall into two camps: two-stage, proposal-driven models such as R-CNN, and one-stage detectors such as YOLO and SSD. The paper claims that the prior result on one-stage...
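As a rough illustration of the focal loss, the sketch below uses its commonly cited binary form, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The function names are hypothetical, and the defaults gamma = 2 and alpha = 0.25 are assumed here as typical settings; the excerpt above does not state the formula or these values.

```python
import math


def cross_entropy(p, y):
    """Binary cross-entropy for one prediction; p is P(class = 1), y is 0 or 1."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction (assumed defaults, see lead-in)."""
    p_t = p if y == 1 else 1.0 - p               # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, 1)   # confident, correct prediction
hard = focal_loss(0.1, 1)   # confident, wrong prediction
```

With gamma = 0 and alpha = 0.5 the expression reduces to half the ordinary cross-entropy, so the modulating factor (1 - p_t)^gamma is what down-weights easy examples relative to hard ones.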