这篇论文中,我们提出了一个基于结合人脸检测和关键点对准的多任务级联的CNNs。实验结果表明我们的方法始终比当前最先进的方法要表现要好,同时也保持了运行速度。未来,我们将会把人脸检测和其他人脸分析任务结合起来,进一步提升效果。
REFERENCES
[1] B. Yang, J. Yan, Z. Lei, and S. Z. Li, “Aggregate channel eatures for
multi-view face detection,” in IEEE International Joint Conference on
Biometrics, 2014, pp. 1-8.
Fig. 4. (a) Evaluation on FDDB. (b-d) Evaluation on three subsets of WIDER
FACE. The number following the method indicates the average accuracy. (e)
Evaluation on AFLW for face alignment
Fig. 3. (a) Validation loss of O-Net with and without hard sample mining. (b)
“JA” denotes joint face alignment learning while “No JA” denotes do not joint
it. “No JA in BBR” denotes do not joint it while training the CNN for bounding
box regression.5
[2] P. Viola and M. J. Jones, “Robust real-time face detection. International
journal of computer vision,” vol. 57, no. 2, pp. 137-154, 2004
[3] M. T. Pham, Y. Gao, V. D. D. Hoang, and T. J. Cham, “Fast polygonal
integration and its application in extending haar-like features to improve
object detection,” in IEEE Conference on Computer Vision and Pattern
Recognition, 2010, pp. 942-949.
[4] Q. Zhu, M. C. Yeh, K. T. Cheng, and S. Avidan, “Fast human detection
using a cascade of histograms of oriented gradients,” in IEEE Computer
Conference on Computer Vision and Pattern Recognition, 2006, pp.
1491-1498.
[5] M. Mathias, R. Benenson, M. Pedersoli, and L. Van Gool, “Face detection
without bells and whistles,” in European Conference on Computer Vision,
2014, pp. 720-735.
[6] J. Yan, Z. Lei, L. Wen, and S. Li, “The fastest deformable part model for
object detection,” in IEEE Conference on Computer Vision and Pattern
Recognition, 2014, pp. 2497-2504.
[7] X. Zhu, and D. Ramanan, “Face detection, pose estimation, and landmark
localization in the wild,” in IEEE Conference on Computer Vision and
Pattern Recognition, 2012, pp. 2879-2886.
[8] M. Köstinger, P. Wohlhart, P. M. Roth, and H. Bischof, “Annotated facial
landmarks in the wild: A large-scale, real-world database for facial land
mark localization,” in IEEE Conference on Computer Vision and Pattern
Recognition Workshops, 2011, pp. 2144-2151.
[9] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification
with deep convolutional neural networks,” in Advances in neural infor
mation processing systems, 2012, pp. 1097-1105.
[10] Y. Sun, Y. Chen, X. Wang, and X. Tang, “Deep learning face representa
tion by joint identification-verification,” in Advances in Neural Infor
mation Processing Systems, 2014, pp. 1988-1996.
[11] S. Yang, P. Luo, C. C. Loy, and X. Tang, “From facial parts responses to
face detection: A deep learning approach,” in IEEE International Confer
ence on Computer Vision, 2015, pp. 3676-3684.
[12] X. P. Burgos-Artizzu, P. Perona, and P. Dollar, “Robust face landmark
estimation under occlusion,” in IEEE International Conference on Com
puter Vision, 2013, pp. 1513-1520.
[13] X. Cao, Y. Wei, F. Wen, and J. Sun, “Face alignment by explicit shape
regression,” International Journal of Computer Vision, vol 107, no. 2, pp.
177-190, 2012.
[14] T. F. Cootes, G. J. Edwards, and C. J. Taylor, “Active appearance models,”
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23,
no. 6, pp. 681-685, 2001.
[15] X. Yu, J. Huang, S. Zhang, W. Yan, and D. Metaxas, “Pose-free facial
landmark fitting via optimized part mixtures and cascaded deformable
shape model,” in IEEE International Conference on Computer Vision,
2013, pp. 1944-1951.
[16] J. Zhang, S. Shan, M. Kan, and X. Chen, “Coarse-to-fine auto-encoder
networks (CFAN) for real-time face alignment,” in European Conference
on Computer Vision, 2014, pp. 1-16.
[17] Luxand Incorporated: Luxand face SDK, http://www.luxand.com/
[18] D. Chen, S. Ren, Y. Wei, X. Cao, and J. Sun, “Joint cascade face detection
and alignment,” in European Conference on Computer Vision, 2014, pp.
109-122.
[19] H. Li, Z. Lin, X. Shen, J. Brandt, and G. Hua, “A convolutional neural
network cascade for face detection,” in IEEE Conference on Computer
Vision and Pattern Recognition, 2015, pp. 5325-5334.
[20] C. Zhang, and Z. Zhang, “Improving multiview face detection with mul
ti-task deep convolutional neural networks,” IEEE Winter Conference
on Applications of Computer Vision, 2014, pp. 1036-1041.
[21] X. Xiong, and F. Torre, “Supervised descent method and its applications to
face alignment,” in IEEE Conference on Computer Vision and Pattern
Recognition, 2013, pp. 532-539.
[22] Z. Zhang, P. Luo, C. C. Loy, and X. Tang, “Facial landmark detection by
deep multi-task learning,” in European Conference on Computer Vision,
2014, pp. 94-108.
[23] Z. Liu, P. Luo, X. Wang, and X. Tang, “Deep learning face attributes in the
wild,” in IEEE International Conference on Computer Vision, 2015, pp.
3730-3738.
[24] S. Yang, P. Luo, C. C. Loy, and X. Tang, “WIDER FACE: A Face Detec
tion Benchmark”. arXiv preprint arXiv:1511.06523.
[25] V. Jain, and E. G. Learned-Miller, “FDDB: A benchmark for face detec
tion in unconstrained settings,” Technical Report UMCS-2010-009, Uni
versity of Massachusetts, Amherst, 2010.
[26] B. Yang, J. Yan, Z. Lei, and S. Z. Li, “Convolutional channel features,” in
IEEE International Conference on Computer Vision, 2015, pp. 82-90.
[27] R. Ranjan, V. M. Patel, and R. Chellappa, “A deep pyramid deformable
part model for face detection,” in IEEE International Conference on Bio
metrics Theory, Applications and Systems, 2015, pp. 1-8.
[28] G. Ghiasi, and C. C. Fowlkes, “Occlusion Coherence: Detecting and
Localizing Occluded Faces,” arXiv preprint arXiv:1506.08347.
[29] S. S. Farfade, M. J. Saberian, and L. J. Li, “Multi-view face detection using
deep convolutional neural networks,” in ACM on International Conference
on Multimedia Retrieval, 2015, pp. 643-650.