TY - GEN
T1 - Towards Accurate 3D Human Body Reconstruction from Silhouettes
AU - Smith, Brandon M.
AU - Chari, Visesh
AU - Agrawal, Amit
AU - Rehg, James M.
AU - Sever, Ram
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/9
Y1 - 2019/9
AB - We propose a novel computer vision system for reconstructing 3D body shapes from 2D images with the goal of producing highly accurate anthropomorphic measurements from a pair of images. We adopt a supervised learning approach that maps silhouette images to 3D body shapes via a convolutional neural network (CNN). We propose three key improvements over previous approaches: (1) Large-scale realistic synthetic data generation, including more realistic variations in segmentation noise and camera viewpoints. (2) A multi-task learning (MTL) approach to predicting multiple outputs such as shape, 3D joint locations, pose angles, and body volume. (3) A new network architecture that additionally takes known body measurements (e.g., height) and per-pixel segmentation confidence as input. Ablation studies show the improvement in accuracy due to the various components of our system. Results demonstrate that our system produces state-of-the-art results on body circumference errors. We also analyze the repeatability of our system in the presence of realistic camera, background, and pose variations. Our system achieves a vertex standard deviation of ~3 mm on the CAESAR dataset [36].
KW - Anthropomorphic Measurements
KW - Human Body Reconstruction
KW - Multitask Learning
KW - Segmentation Confidence
KW - Shape From Silhouette
KW - Synthetic Data
UR - http://www.scopus.com/inward/record.url?scp=85075024090&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85075024090&partnerID=8YFLogxK
U2 - 10.1109/3DV.2019.00039
DO - 10.1109/3DV.2019.00039
M3 - Conference contribution
AN - SCOPUS:85075024090
T3 - Proceedings - 2019 International Conference on 3D Vision, 3DV 2019
SP - 279
EP - 288
BT - Proceedings - 2019 International Conference on 3D Vision, 3DV 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 7th International Conference on 3D Vision, 3DV 2019
Y2 - 15 September 2019 through 18 September 2019
ER -