Posts by Collection



Learning and controlling the source-filter representation of speech with a variational autoencoder

Published in Speech Communication, 2023

We show that the source-filter model of speech production naturally emerges in the latent space of an unsupervised VAE, and we propose a weakly supervised method to control the pitch and formant frequencies of speech signals in the VAE latent space.

Recommended citation: Samir Sadok, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda, Renaud Séguier. Learning and controlling the source-filter representation of speech with a variational autoencoder. Speech Communication, vol. 148, 2023.

A multimodal dynamical variational autoencoder for audiovisual speech representation learning

Published in Neural Networks (Elsevier), 2024

We present a multimodal and dynamical VAE (MDVAE) applied to unsupervised audio-visual speech representation learning.

Recommended citation: Samir Sadok, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda, Renaud Séguier. A multimodal dynamical variational autoencoder for audiovisual speech representation learning. Neural Networks (Elsevier), 2024.



Practical work

Workshop, University of Rennes 1, 2021

I supervised students during their practical work.

Project supervision

Workshop, CentraleSupélec, 2022

Supervised two student projects on machine learning and deep learning.