Abstract
This work introduces a novel two-step framework for multi-person pose estimation. Multi-person pose estimation in images in the wild is a challenging problem, in which the human detector inevitably suffers from errors in both localization and recognition. These errors ultimately cause most CNN-based single-person pose estimators to fail. In this paper, a novel regional multi-person pose estimation (RMPE) framework is proposed to facilitate single-person pose estimation in the presence of an inaccurate human detector. In particular, our framework consists of three novel techniques: symmetric spatial transformer networks (SSTN), a deep proposals generator (DPG), and parametric pose non-maximum suppression (NMS). Extensive experiments demonstrate the validity and effectiveness of the proposed approach. Compared with state-of-the-art approaches, it achieves a 16% relative increase in mAP and a 600-fold speedup on the MPII (multi-person) dataset.