speech-recognition
Here are 2,573 public repositories matching this topic...
Fedora & apt-get
Specs
- Leon version: latest
- OS (or browser) version: Fedora 30
- Node.js version: 10.16.3
- Complete "npm run check" output:
➡ Here is the diagnosis about your current setup
✔ Run
✔ Run modules
✔ Reply you by texting
❗ Amazon Polly text-to-speech
❗ Google Cloud text-to-speech
❗ Watson text-to-speech
❗ Offline text-to-speech
❗ Google Cloud speech-to-text
❗ Watson spee
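Polyphonic characters are currently handled with pypinyin or g2pM, whose accuracy is limited. I would like to build a polyphone prediction model based on BERT (or ERNIE). Put simply: suppose a language has 100 polyphonic characters and each has at most 3 pronunciations. We can then attach 100 three-way classifiers (simple fc layers are enough) on top of BERT and, at prediction time, look up the classifier for the character in question and classify with it.
Reference paper:
tencent_polyphone.pdf
For data, the dataset provided by https://github.com/kakaobrain/g2pM can be used.
Advanced: multi-task BERT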
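The head-selection idea above can be illustrated with a minimal standard-library sketch. It is not the proposed model: the encoder is replaced by a stand-in vector (in practice this would be the BERT or ERNIE hidden state at the character's position), the weights are random and untrained, and the names `PRONUNCIATIONS`, `HEADS`, `make_head`, and `predict`, as well as the three-character inventory, are hypothetical.

```python
import random

HIDDEN = 8      # stand-in for the encoder hidden size (assumption, e.g. 768 for BERT-base)
MAX_PRON = 3    # at most 3 pronunciations per polyphonic character

# Hypothetical inventory: character -> candidate pronunciations.
PRONUNCIATIONS = {
    "行": ["xing2", "hang2"],
    "长": ["chang2", "zhang3"],
    "和": ["he2", "he4", "huo4"],
}


def make_head(rng):
    """One fc layer: MAX_PRON rows of HIDDEN weights plus one bias per row."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(HIDDEN)]
               for _ in range(MAX_PRON)]
    bias = [0.0] * MAX_PRON
    return weights, bias


rng = random.Random(0)
# One classifier head per polyphonic character (untrained, random weights).
HEADS = {char: make_head(rng) for char in PRONUNCIATIONS}


def predict(char, hidden):
    """Look up the head for `char` and return its highest-scoring pronunciation."""
    weights, bias = HEADS[char]
    candidates = PRONUNCIATIONS[char]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(weights, bias)]
    # Ignore output slots that have no pronunciation for this character.
    valid = logits[:len(candidates)]
    best = max(range(len(valid)), key=valid.__getitem__)
    return candidates[best]


# Stand-in for the encoder output at the position of the polyphonic character.
hidden = [rng.uniform(-1, 1) for _ in range(HIDDEN)]
print(predict("行", hidden))
```

Keeping every head at the same width (`MAX_PRON`) and slicing off unused slots is one simple way to handle characters with fewer than three pronunciations; a trained model would also need those slots excluded from the loss.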


Fast Tokenizer for DeBERTa-V3 and mDeBERTa-V3
Motivation
DeBERTa V3 is an improved version of DeBERTa. With V3, the authors also released a multilingual model, "mDeBERTa-base", that outperforms XLM-R-base. However, DeBERTa V3 currently lacks a FastTokenizer implementation, which makes it impossible to use with some of the example scripts (they require a Fa