HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences

CVPR 2021

Feitong Tan¹,²      Danhang Tang¹      Mingsong Dou¹      Kaiwen Guo¹      Rohit Pandey¹      Cem Keskin¹
Ruofei Du¹      Deqing Sun¹      Sofien Bouaziz¹      Sean Fanello¹      Ping Tan²      Yinda Zhang¹

¹ Google                                 ² Simon Fraser University
                                       






Abstract


In this paper, we address the problem of building dense correspondences between human images under arbitrary camera viewpoints and body poses. Prior art either assumes small motion between frames or relies on local descriptors, which cannot handle large motion or visually ambiguous body parts, e.g., left vs. right hand. In contrast, we propose a deep learning framework that maps each pixel to a feature space, where the feature distances reflect the geodesic distances among pixels as if they were projected onto the surface of a 3D human scan. To this end, we introduce novel loss functions to push features apart according to their geodesic distances on the surface. Without any semantic annotation, the proposed embeddings automatically learn to differentiate visually similar parts and align different subjects into a unified feature space. Extensive experiments show that the learned embeddings can produce accurate correspondences between images with remarkable generalization capabilities in both intra-subject and inter-subject settings.
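
To make the idea in the abstract concrete, the following is a minimal NumPy sketch, not the released implementation. It assumes per-pixel feature maps of shape (H, W, C) and boolean foreground masks are already available, and it matches pixels by nearest neighbor in feature space. The function names, the Euclidean metric, and `geodesic_consistency_loss` are illustrative assumptions; in particular, the loss shown is a simple stand-in and not one of the loss functions defined in the paper.

```python
import numpy as np


def pairwise_feature_distances(feats_a, feats_b):
    """Euclidean distance between every row of feats_a (N, C) and feats_b (M, C)."""
    diff = feats_a[:, None, :] - feats_b[None, :, :]
    return np.linalg.norm(diff, axis=-1)  # shape (N, M)


def nearest_neighbor_correspondences(feat_map_a, feat_map_b, mask_a, mask_b):
    """Match each foreground pixel of image A to a foreground pixel of image B.

    feat_map_*: (H, W, C) per-pixel features (assumed to come from a trained network).
    mask_*:     (H, W) boolean foreground masks.
    Returns two (N, 2) arrays of (row, col) coordinates: pixels in A and their matches in B.
    """
    coords_a = np.argwhere(mask_a)  # row-major order, same order as boolean indexing
    coords_b = np.argwhere(mask_b)
    feats_a = feat_map_a[mask_a]    # (N, C)
    feats_b = feat_map_b[mask_b]    # (M, C)
    dists = pairwise_feature_distances(feats_a, feats_b)
    best = dists.argmin(axis=1)     # index of the closest feature in B for each pixel in A
    return coords_a, coords_b[best]


def geodesic_consistency_loss(feats, geodesic):
    """Illustrative stand-in loss (an assumption, NOT the paper's formulation).

    Penalizes the squared gap between pairwise feature distances and the given
    (N, N) matrix of geodesic distances on the body surface.
    """
    d_feat = pairwise_feature_distances(feats, feats)
    return float(np.mean((d_feat - geodesic) ** 2))
```

The dense distance matrix costs O(N·M·C) memory, so this sketch is only practical for small images or subsampled pixels; a real pipeline would likely batch the search or use an approximate nearest-neighbor index.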




Paper



[arXiv]     [GitHub]    

Download the paper (6M) here.

Download the supplementary (9M) here.


Citation

BibTeX, 1 KB

@inproceedings{tan2021humangps,
        author = {Tan, Feitong and Tang, Danhang and Dou, Mingsong and Guo, Kaiwen and Pandey, Rohit and Keskin, Cem and Du, Ruofei and Sun, Deqing and Bouaziz, Sofien and Fanello, Sean and Tan, Ping and Zhang, Yinda},
        title = {HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences},
        booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
        month = {June},
        year = {2021}
    }
        


Video Presentation



Download the video (21M) here.



Live Demo


1. Click 'Choose File' to upload a human image and a mask image (recommended w : h = 256 : 384), or use the example images.
2. Click the 'Process' button to run the model. You may use the drawing tools to doodle on the mask image.
Note that the first 'Process' run takes longer because of initialization. The model may take a few seconds to load in some regions.
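
If your own image does not match the recommended 256 : 384 ratio, one option is to center-crop it before uploading. The helper below is a hypothetical convenience sketch (the name `center_crop_box` and the cropping strategy are assumptions, not part of the demo); it only computes the crop rectangle, which can then be passed to any image library, and the same box should be applied to the image and its mask.

```python
def center_crop_box(width, height, target_w=256, target_h=384):
    """Return (left, top, right, bottom) of the largest centered box
    whose aspect ratio is approximately target_w : target_h."""
    if width * target_h > height * target_w:
        # Image is too wide: keep full height, trim the sides.
        new_w = height * target_w // target_h
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    # Image is too tall (or already matches): keep full width, trim top and bottom.
    new_h = width * target_h // target_w
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)
```

For example, a 300 x 900 input keeps its full width and is trimmed to the middle 450 rows, after which it can be resized to 256 x 384.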

HumanGPS







HumanGPS vs. DensePose



Videos from Yasamin Jafarian et al. They are not used in the paper and are shown for demo purposes only.



Applications



Nonrigid Tracking and Fusion


Qualitative comparison of non-rigid fusion with learned correspondences

Quantitative comparison of non-rigid tracking with learned correspondences





Morphing Examples


Intra-subject Morphing
(Three examples, each shown as Input 1, Input 2, Result.)

Inter-subject Morphing
(Four examples, each shown as Input 1, Input 2, Result.)