Face detection with end-to-end integration of a convNet and a 3D model

Yunzhu Li, Benyuan Sun, Tianfu Wu, Yizhou Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


This paper presents a method for face detection in the wild, which integrates a ConvNet and a 3D mean face model in an end-to-end multi-task discriminative learning framework. The 3D mean face model is predefined and fixed (e.g., we used the one provided in the AFLW dataset). The ConvNet consists of two components: (i) The face proposal component computes face bounding box proposals by estimating facial key-points and the 3D transformation (rotation and translation) parameters for each predicted key-point w.r.t. the 3D mean face model. (ii) The face verification component computes detection results by pruning and refining proposals using configuration pooling based on facial key-points. The proposed method addresses two issues in adapting state-of-the-art generic object detection ConvNets (e.g., Faster R-CNN) for face detection: (i) One is to eliminate the heuristic design of predefined anchor boxes in the region proposal network (RPN) by exploiting a 3D mean face model. (ii) The other is to replace the generic RoI (Region-of-Interest) pooling layer with a configuration pooling layer to respect underlying object structures. The multi-task loss consists of three terms: the classification Softmax loss and the location smooth l1-losses of both the facial key-points and the face bounding boxes. In experiments, our ConvNet is trained on the AFLW dataset only and tested on the FDDB benchmark with fine-tuning and on the AFW benchmark without fine-tuning. The proposed method obtains very competitive state-of-the-art performance on the two benchmarks.
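The three-term multi-task loss described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the weights `w_kp` and `w_box`, the input shapes, and the choice to apply the two location terms only to positive (face) samples are all assumptions made for illustration, the last one borrowed from common practice in generic object detectors.

```python
import numpy as np

def smooth_l1(pred, target):
    """Smooth l1 loss summed over all coordinates (quadratic near 0, linear elsewhere)."""
    diff = np.abs(pred - target)
    return np.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5).sum()

def softmax_cross_entropy(logits, label):
    """Softmax classification loss for one sample, computed in a numerically stable way."""
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[label]

def multi_task_loss(logits, label, kp_pred, kp_true, box_pred, box_true,
                    w_kp=1.0, w_box=1.0):
    """Classification loss plus smooth l1 losses on key-points and bounding box.

    Assumptions (not from the paper): label 1 means "face", label 0 means
    "background"; location terms are skipped for background samples; w_kp and
    w_box are hypothetical balancing weights.
    """
    loss = softmax_cross_entropy(logits, label)
    if label == 1:
        loss += w_kp * smooth_l1(kp_pred, kp_true)
        loss += w_box * smooth_l1(box_pred, box_true)
    return loss
```

A call such as `multi_task_loss(logits, 1, kp_pred, kp_true, box_pred, box_true)` returns a single scalar for one proposal; a training loop would average it over a mini-batch.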

Original language: English (US)
Title of host publication: Computer Vision - 14th European Conference, ECCV 2016, Proceedings
Editors: Bastian Leibe, Jiri Matas, Nicu Sebe, Max Welling
Number of pages: 17
ISBN (Print): 9783319464862
State: Published - 2016
Externally published: Yes
Event: 14th European Conference on Computer Vision, ECCV 2016 - Amsterdam, Netherlands
Duration: Oct 11 2016 → Oct 14 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9907 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Other: 14th European Conference on Computer Vision, ECCV 2016


  • ConvNet
  • Deep learning
  • Face 3D model
  • Face detection
  • Multi-task learning

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

