OpenSeq2Seq

OpenSeq2Seq: a toolkit for distributed and mixed-precision training of sequence-to-sequence models

OpenSeq2Seq's main goal is to let researchers explore sequence-to-sequence models as effectively as possible. This efficiency comes from full support for distributed and mixed-precision training. OpenSeq2Seq is built on TensorFlow and provides all the building blocks needed to train encoder-decoder models for neural machine translation, automatic speech recognition, speech synthesis, and language modeling.

Documentation and installation instructions

https://nvidia.github.io/OpenSeq2Seq/

Features

  1. Models for:
     - Neural Machine Translation
     - Automatic Speech Recognition
     - Speech Synthesis
     - Language Modeling
     - NLP tasks (e.g., sentiment analysis)
  2. Data-parallel distributed training:
     - Multi-GPU
     - Multi-node
  3. Mixed-precision training for NVIDIA Volta/Turing GPUs
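Mixed-precision training typically keeps a float32 master copy of the weights and applies loss scaling so that small float16 gradients do not underflow. A minimal, framework-free sketch of the loss-scaling idea (conceptual only; this is not OpenSeq2Seq's actual implementation, and `LOSS_SCALE` is a hypothetical static factor):

```python
# Conceptual sketch of static loss scaling in mixed-precision training.
# Illustrative only; not OpenSeq2Seq code.

LOSS_SCALE = 1024.0  # hypothetical static scale factor

def grads_of_scaled_loss(x, scale):
    # Toy model: loss = w * x, so d(loss)/dw = x.
    # Gradients are linear in the loss, so scaling the loss by `scale`
    # scales every gradient by the same factor.
    return [x * scale]

def loss_scaled_step(x):
    # 1) Backprop through the scaled loss (in real training this part
    #    would run in float16, where tiny gradients underflow to zero).
    scaled_grads = grads_of_scaled_loss(x, LOSS_SCALE)
    # 2) Unscale before applying to the float32 master weights.
    return [g / LOSS_SCALE for g in scaled_grads]

print(loss_scaled_step(3.0))  # → [3.0], the unscaled gradient equals x
```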

Software Requirements

  1. Python >= 3.5
  2. TensorFlow >= 1.10
  3. CUDA >= 9.0, cuDNN >= 7.0
  4. Horovod >= 0.13 (Horovod itself is optional, but highly recommended for multi-GPU setups)
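The requirements above can be sanity-checked programmatically. A small convenience sketch (not part of OpenSeq2Seq) that compares installed versions against the minimums listed:

```python
# Sanity-check the environment against the requirements listed above.
# Convenience sketch only; not part of OpenSeq2Seq.
import sys

def version_tuple(v):
    """Convert a dotted version string like '1.13.1' into a comparable
    tuple of its leading numeric components, e.g. (1, 13)."""
    parts = []
    for p in v.split(".")[:2]:
        digits = "".join(ch for ch in p if ch.isdigit())
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

assert sys.version_info >= (3, 5), "Python >= 3.5 is required"

try:
    import tensorflow as tf
    assert version_tuple(tf.__version__) >= (1, 10), "TensorFlow >= 1.10 is required"
except ImportError:
    print("TensorFlow is not installed")
```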

Acknowledgments

The speech-to-text workflow uses some parts of the Mozilla DeepSpeech project.

The beam search decoder with language-model re-scoring (in the decoders directory) is based on Baidu's DeepSpeech.
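The general idea behind beam search with language-model re-scoring can be sketched in a few lines. This is a toy illustration of the algorithm, not the project's decoder implementation; `lm_logprob` and the weight `alpha` are hypothetical caller-supplied names:

```python
import math

def beam_search(step_log_probs, beam_width, lm_logprob, alpha=0.5):
    """Toy beam search over per-step acoustic log-probabilities,
    with each extension re-scored by a language-model term.
    step_log_probs: list of dicts {token: log_prob}, one per time step.
    lm_logprob(token, prefix): log-probability of `token` after `prefix`.
    alpha: weight of the language-model score (hypothetical name)."""
    beams = [((), 0.0)]  # (token sequence, cumulative score)
    for log_probs in step_log_probs:
        candidates = []
        for prefix, score in beams:
            for tok, lp in log_probs.items():
                candidates.append(
                    (prefix + (tok,), score + lp + alpha * lm_logprob(tok, prefix))
                )
        # Keep only the `beam_width` highest-scoring hypotheses.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0][0]

# Usage with a trivial uniform language model:
steps = [
    {"a": math.log(0.6), "b": math.log(0.4)},
    {"a": math.log(0.7), "b": math.log(0.3)},
]
uniform_lm = lambda tok, prefix: math.log(0.5)
print(beam_search(steps, beam_width=2, lm_logprob=uniform_lm))  # → ('a', 'a')
```

A real CTC decoder additionally collapses repeated tokens and blanks; this sketch only shows how acoustic and language-model scores are combined while pruning hypotheses.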

The text-to-text workflow uses some functions from Tensor2Tensor and the Neural Machine Translation (seq2seq) Tutorial.

Disclaimer

This is a research project, not an official NVIDIA product.

Related resources

Paper

If you use OpenSeq2Seq, please cite the following paper:

```
@misc{openseq2seq,
    title={Mixed-Precision Training for NLP and Speech Recognition with OpenSeq2Seq},
    author={Oleksii Kuchaiev and Boris Ginsburg and Igor Gitman and Vitaly Lavrukhin and Jason Li and Huyen Nguyen and Carl Case and Paulius Micikevicius},
    year={2018},
    eprint={1805.10387},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```