Audio Foundation Models (Self-Supervised Speech/Sound Pre-training and Representation Learning Toolkit)
Updated Mar 16, 2023 - Python
so-vits-svc fork with REALTIME support (voice changer) and greatly improved interface.
A mini, simple, and fast end-to-end automatic speech recognition toolkit.
Phoneme segmentation using pre-trained speech models
This repo contains the source code of the first deep learning-based singing voice beat tracking system. It leverages the WavLM and DistilHuBERT pre-trained speech models to create vocal embeddings, and trains linear multi-head self-attention layers on top of them to extract vocal beat activations. Then, it uses an HMM decoder to infer singing beats and t…
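The last stage of the pipeline above, decoding discrete beats from frame-wise activations with an HMM, can be sketched with a simple two-state Viterbi decoder. This is a minimal illustration, not the repo's actual decoder: the two states (beat / no beat), the `p_stay` transition probability, and the uniform initial distribution are all assumptions made for the example.

```python
import numpy as np

def viterbi_beats(activations, p_stay=0.9):
    """Decode a binary beat/no-beat state sequence from frame-wise
    beat activations with a 2-state HMM via Viterbi decoding.

    activations: per-frame beat probabilities in [0, 1].
    p_stay: assumed probability of remaining in the same state.
    Returns an int array of states (0 = no beat, 1 = beat)."""
    T = len(activations)
    eps = 1e-9  # avoid log(0)
    # Emission log-probs: column 0 = no beat, column 1 = beat.
    log_emit = np.log(np.stack([1.0 - activations + eps,
                                activations + eps], axis=1))
    # Symmetric transition matrix: stay with p_stay, switch otherwise.
    log_trans = np.log(np.array([[p_stay, 1.0 - p_stay],
                                 [1.0 - p_stay, p_stay]]))
    delta = np.zeros((T, 2))          # best log-prob ending in each state
    back = np.zeros((T, 2), dtype=int)  # backpointers
    delta[0] = np.log(0.5) + log_emit[0]  # uniform initial distribution
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans  # (from_state, to_state)
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_emit[t]
    # Backtrack the most likely state path.
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path

# Example: noisy activations with a clear beat region in the middle.
acts = np.array([0.05, 0.1, 0.6, 0.95, 0.9, 0.85, 0.2, 0.05])
states = viterbi_beats(acts)
```

The transition penalty smooths over isolated noisy frames, which is the usual reason to decode activations with an HMM rather than simply thresholding them.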
An implementation of Speech Emotion Recognition based on the HuBERT model, trained with PyTorch and the HuggingFace framework and fine-tuned on the RAVDESS dataset.
The official implementation of the method discussed in the paper Improving Spoken Language Identification with Map-Mix (accepted at ICASSP 2023).
Speech keyword detection using the Wav2Vec model.
Repository created to store files, scripts, and the final report for the statistical modeling course of the Big Data Diploma program at the Pontificia Universidad Católica de Chile.
Code for our paper DistilALHuBERT: A Distilled Parameter Sharing Audio Representation Model.