This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.
Yehia, H. C., Kuratate, T., & Vatikiotis-Bateson, E. (2002). Linking facial animation, head motion and speech acoustics. Journal of Phonetics, 30(3), 555–568.
Greenwood, D., Laycock, S., & Matthews, I. (2017). Predicting head pose from speech with a conditional variational autoencoder. Interspeech 2017, 3991–3995.
Czap, L., & Kilik, R. (2015). Automatic gesture generation. Production Systems and Information Engineering, 7, 5–14.
Zhou, Y., Han, X., Shechtman, E., Echevarria, J., Kalogerakis, E., & Li, D. (2020). MakeItTalk: Speaker-aware talking-head animation. ACM Transactions on Graphics (TOG), 39(6), 1–15.
Kim, H., Garrido, P., Tewari, A., Xu, W., Thies, J., Niessner, M., ... & Theobalt, C. (2018). Deep video portraits. ACM Transactions on Graphics (TOG), 37(4), 1–14.
Cheng, Y., & Church, G. M. (2000). Biclustering of expression data. Proceedings of the International Conference on Intelligent Systems for Molecular Biology (ISMB), 8, 93–103.
Getz, G., Levine, E., & Domany, E. (2000). Coupled two-way clustering analysis of gene microarray data. Proceedings of the National Academy of Sciences, 97(22), 12079–12084.
Deng, Z., Narayanan, S., Busso, C., & Neumann, U. (2004). Audio-based head motion synthesis for avatar-based telepresence systems. Proceedings of the 2004 ACM SIGMM Workshop on Effective Telepresence, 24–30.
Busso, C., Deng, Z., Grimm, M., Neumann, U., & Narayanan, S. (2005). Natural head motion synthesis driven by acoustic prosodic features. Computer Animation and Virtual Worlds, 16(3–4), 283–290.
Busso, C., Deng, Z., Grimm, M., Neumann, U., & Narayanan, S. (2007). Rigid head motion in expressive speech animation: Analysis and synthesis. IEEE Transactions on Audio, Speech, and Language Processing, 15(3), 1075–1086.
Greenwood, D., Matthews, I., & Laycock, S. (2018). Joint learning of facial expression and head pose from speech. Interspeech 2018, 2484–2488.
Hofer, G., & Shimodaira, H. (2007). Automatic head motion prediction from speech data. Interspeech 2007, 722–725.
Ji, X., et al. (2022). EAMM: One-shot emotional talking face via audio-based emotion-aware motion model. ACM SIGGRAPH 2022 Conference Proceedings.
Lu, Y., Chai, J., & Cao, X. (2021). Live speech portraits: Real-time photorealistic talking-head animation. ACM Transactions on Graphics (TOG), 40(6), 1–17.
Ben Youssef, A., Shimodaira, H., & Braude, D. A. (2013). Articulatory features for speech-driven head motion synthesis. Proceedings of Interspeech, Lyon, France.
Baudat, G., & Anouar, F. (2000). Generalized discriminant analysis using a kernel approach. Neural Computation, 12(10), 2385–2404.
Liu, X., Yin, J., Feng, Z., Dong, J., & Wang, L. (2007). Orthogonal neighborhood preserving embedding for face recognition. 2007 IEEE International Conference on Image Processing (ICIP), 1, 133–136.
Teh, Y. W., & Roweis, S. T. (2002). Automatic alignment of local representations. Advances in Neural Information Processing Systems, 15, 841–848.
Tibshirani, R., Walther, G., & Hastie, T. (2001). Estimating the number of clusters in a data set via the gap statistic. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(2), 411–423.
Davies, D. L., & Bouldin, D. W. (1979). A cluster separation measure. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1(2), 224–227.
Caliński, T., & Harabasz, J. (1974). A dendrite method for cluster analysis. Communications in Statistics - Theory and Methods, 3(1), 1–27.