TY - GEN
T1 - Lipreading by locality discriminant graph
AU - Fu, Yun
AU - Zhou, Xi
AU - Liu, Ming
AU - Hasegawa-Johnson, Mark
AU - Huang, Thomas S.
PY - 2007
Y1 - 2007
N2 - The major problem in building a good lipreading system is extracting effective visual features from the enormous quantity of video sequence data. For appearance-based feature analysis in lipreading, classical methods, e.g. DCT, PCA and LDA, are usually applied for dimensionality reduction. We present a new pattern classification algorithm, called Locality Discriminant Graph (LDG), and develop a novel lipreading framework that applies LDG to the problem. LDG takes advantage of both manifold learning and the Fisher criterion to seek a linear embedding that preserves local neighborhood affinity within the same class while discriminating between neighborhoods of different classes. The LDG embedding is computed in closed form and tuned by a single free parameter, the number of nearest neighbors k. Experiments on the AVICAR corpus provide evidence that graph-based pattern classification methods can outperform classical ones for lipreading.
KW - Audio-visual speech
KW - Discrete cosine transform
KW - Discriminant analysis
KW - Graph embedding
KW - Lipreading
UR - http://www.scopus.com/inward/record.url?scp=48149097576&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=48149097576&partnerID=8YFLogxK
U2 - 10.1109/ICIP.2007.4379312
DO - 10.1109/ICIP.2007.4379312
M3 - Conference contribution
AN - SCOPUS:48149097576
SN - 1424414377
SN - 9781424414376
T3 - Proceedings - International Conference on Image Processing, ICIP
SP - III325-III328
BT - 2007 IEEE International Conference on Image Processing, ICIP 2007 Proceedings
T2 - 14th IEEE International Conference on Image Processing, ICIP 2007
Y2 - 16 September 2007 through 19 September 2007
ER -