About me

I hold a PhD, with a thesis titled “Learning the Audiovisual Representation of Speech Applied to Emotion Recognition”, and currently work as a Postdoctoral Researcher at INRIA Grenoble under the supervision of Xavier Alameda-Pineda.

My research focuses on multimodal generative models for audiovisual speech. I aim to develop interpretable generative models to enhance data analysis, control, and generation.

News

  • 🎉 Article Accepted (Jan. 2025) – “AnCoGen: Analysis, Control, and Generation of Speech with a Masked Autoencoder” accepted at ICASSP 2025.
  • 🎉 Thesis Defense (March 8, 2024) – “Learning the Audiovisual Representation of Speech Applied to Emotion Recognition”.
  • 🎉 Article Accepted (Jan. 9, 2024) – “A Multimodal Dynamical Variational Autoencoder for Audiovisual Speech Representation Learning” accepted in Neural Networks, 2024.
  • 🎉 Article Accepted (Apr. 14, 2023) – “A Vector Quantized Masked Autoencoder for Speech Emotion Recognition” accepted at the Self-Supervision in Audio, Speech, and Beyond (SASB) Workshop, ICASSP 2023.
  • 🎉 Article Accepted (Apr. 14, 2023) – “Learning and Controlling the Source-Filter Representation of Speech with a Variational Autoencoder” accepted in Speech Communication.