TY - GEN
T1 - DeepFont
T2 - 23rd ACM International Conference on Multimedia, MM 2015
AU - Wang, Zhangyang
AU - Yang, Jianchao
AU - Jin, Hailin
AU - Shechtman, Eli
AU - Agarwala, Aseem
AU - Brandt, Jonathan
AU - Huang, Thomas S.
N1 - Publisher Copyright: © 2015 ACM.
PY - 2015/10/13
Y1 - 2015/10/13
N2 - As font is one of the core design concepts, automatic font identification and similar font suggestion from an image or photo has been on the wish list of many designers. We study the Visual Font Recognition (VFR) problem [4], and advance the state-of-the-art remarkably by developing the DeepFont system. First of all, we build up the first available large-scale VFR dataset, named AdobeVFR, consisting of both labeled synthetic data and partially labeled real-world data. Next, to combat the domain mismatch between available training and testing data, we introduce a Convolutional Neural Network (CNN) decomposition approach, using a domain adaptation technique based on a Stacked Convolutional Auto-Encoder (SCAE) that exploits a large corpus of unlabeled real-world text images combined with synthetic data preprocessed in a specific way. Moreover, we study a novel learning-based model compression approach, in order to reduce the DeepFont model size without sacrificing its performance. The DeepFont system achieves an accuracy of higher than 80% (top-5) on our collected dataset, and also produces a good font similarity measure for font selection and suggestion. We also achieve around 6 times compression of the model without any visible loss of recognition accuracy.
AB - As font is one of the core design concepts, automatic font identification and similar font suggestion from an image or photo has been on the wish list of many designers. We study the Visual Font Recognition (VFR) problem [4], and advance the state-of-the-art remarkably by developing the DeepFont system. First of all, we build up the first available large-scale VFR dataset, named AdobeVFR, consisting of both labeled synthetic data and partially labeled real-world data. Next, to combat the domain mismatch between available training and testing data, we introduce a Convolutional Neural Network (CNN) decomposition approach, using a domain adaptation technique based on a Stacked Convolutional Auto-Encoder (SCAE) that exploits a large corpus of unlabeled real-world text images combined with synthetic data preprocessed in a specific way. Moreover, we study a novel learning-based model compression approach, in order to reduce the DeepFont model size without sacrificing its performance. The DeepFont system achieves an accuracy of higher than 80% (top-5) on our collected dataset, and also produces a good font similarity measure for font selection and suggestion. We also achieve around 6 times compression of the model without any visible loss of recognition accuracy.
KW - Deep Learning
KW - Domain Adaptation
KW - Model Compression
KW - Visual Font Recognition
UR - http://www.scopus.com/inward/record.url?scp=84962846971&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84962846971&partnerID=8YFLogxK
U2 - 10.1145/2733373.2806219
DO - 10.1145/2733373.2806219
M3 - Conference contribution
AN - SCOPUS:84962846971
T3 - MM 2015 - Proceedings of the 2015 ACM Multimedia Conference
SP - 451
EP - 459
BT - MM 2015 - Proceedings of the 2015 ACM Multimedia Conference
PB - Association for Computing Machinery
Y2 - 26 October 2015 through 30 October 2015
ER -