Publications

AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder

Published in IEEE ICASSP, 2025

We present AnCoGen, a new method that uses a masked autoencoder to unify speech signal analysis, control, and generation in a single model.

Recommended citation: Samir Sadok, Simon Leglaive, Laurent Girin, Gaël Richard, Xavier Alameda-Pineda. AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder. IEEE ICASSP, 2025

Learning and controlling the source-filter representation of speech with a variational autoencoder

Published in Speech Communication, 2023

We show that the source-filter model of speech production naturally emerges in the latent space of an unsupervised VAE, and we propose a weakly-supervised method to control the pitch and formant frequencies of speech signals in that latent space.

Recommended citation: Samir Sadok, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda, Renaud Séguier. Learning and controlling the source-filter representation of speech with a variational autoencoder. Speech Communication, vol. 148, 2023. https://www-sciencedirect-com.ezproxy.universite-paris-saclay.fr/science/article/pii/S0167639323000304