Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Page Not Found

Page not found. Your pixels are in another canvas.

Jupyter notebook markdown generator

Posts

Future Blog Post

less than 1 minute read

Published: January 01, 2199

This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false.

Blog Post number 1

less than 1 minute read

Published: August 14, 2012

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

portfolio

Portfolio item number 1

Short description of portfolio item number 1

publications

Learning and controlling the source-filter representation of speech with a variational autoencoder

Published in Speech Communication, 2023

We show that the source-filter model of speech production naturally emerges in the latent space of an unsupervised VAE and we propose a weakly-supervised method to control the pitch and formant frequencies of speech signals in the VAE latent space.

Recommended citation: Samir Sadok, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda, Renaud Séguier. Learning and controlling the source-filter representation of speech with a variational autoencoder. Speech Communication, vol. 148, 2023. https://www-sciencedirect-com.ezproxy.universite-paris-saclay.fr/science/article/pii/S0167639323000304

A vector quantized masked autoencoder for speech emotion recognition

Published in Workshop ICASSP (SASB), 2023

Combined VQ-VAE (unsupervised) with MAE (self-supervised) for speech emotion recognition.

Recommended citation: Samir Sadok, Simon Leglaive, Renaud Séguier. A vector quantized masked autoencoder for speech emotion recognition. 2023.

A multimodal dynamical variational autoencoder for audiovisual speech representation learning

Published in Neural Networks (Elsevier), 2024

We present a multimodal and dynamical VAE (MDVAE) applied to unsupervised audio-visual speech representation learning.

Recommended citation: Samir Sadok, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda, Renaud Séguier. A multimodal dynamical variational autoencoder for audiovisual speech representation learning. Neural Networks (Elsevier), 2024.

AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder

Published in IEEE ICASSP, 2025

We present AnCoGen, a new method that uses a masked autoencoder to unify speech signal analysis, control, and generation in a single model.

Recommended citation: Samir Sadok, Simon Leglaive, Laurent Girin, Gaël Richard, Xavier Alameda-Pineda. AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder. IEEE ICASSP, 2025.

teaching

Practical work

Workshop, University of Rennes 1, 2021

I supervised students during their practical work.

Project supervision

Workshop, CentraleSupélec, 2022

Supervised two student projects on machine learning and deep learning.

Pages

Page Not Found

About me

Archive Layout with Content

Posts by Category

Posts by Collection

CV

Markdown

Page not in menu

Page Archive

Portfolio

Publications

Sitemap

Posts by Tags

Talk map

Talks and presentations

Teaching

Terms and Privacy Policy

Blog posts

Jupyter notebook markdown generator

Posts

Future Blog Post

Blog Post number 1

portfolio

Portfolio item number 1

publications

Learning and controlling the source-filter representation of speech with a variational autoencoder

A vector quantized masked autoencoder for speech emotion recognition

A multimodal dynamical variational autoencoder for audiovisual speech representation learning

AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder

talks

Apprentissage des représentations de la parole et du langage (Learning speech and language representations)

CFA: French Congress for Acoustics

Méthodes en traitement du signal pour l’écoute artificielle (Signal processing methods for machine listening)

teaching

Practical work

Project supervision