OpenSeq2Seq: toolkit for distributed and mixed precision training of sequence-to-sequence models

The main goal of OpenSeq2Seq is to let researchers explore various
sequence-to-sequence models as effectively as possible. It achieves this
efficiency through full support for distributed and mixed-precision training.
OpenSeq2Seq is built on TensorFlow and provides the building blocks needed to
train encoder-decoder models for neural machine translation, automatic speech recognition, speech synthesis, and language modeling.
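
To make the idea of data-parallel distributed training concrete, here is a minimal, framework-free sketch. It is not OpenSeq2Seq code: the toy model `y = w * x`, the helper names `gradient` and `all_reduce_mean`, and the data are all assumptions made for illustration. Each simulated worker computes a gradient on its own shard of the batch, and the averaged result matches the gradient of the full batch when the shards are equal in size.

```python
def gradient(w, batch):
    """Gradient of the mean squared error for the toy model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def all_reduce_mean(values):
    """Stand-in for an all-reduce: every worker would receive this average."""
    return sum(values) / len(values)


# Hypothetical data and worker count, chosen only for this example.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w = 0.0
num_workers = 2

# Split the batch into equal shards, one per simulated worker.
shard_size = len(data) // num_workers
shards = [data[i * shard_size:(i + 1) * shard_size] for i in range(num_workers)]

# Each worker computes a local gradient, then the gradients are averaged.
local_grads = [gradient(w, shard) for shard in shards]
synced_grad = all_reduce_mean(local_grads)

# Reference: the gradient computed on the whole batch by a single worker.
full_grad = gradient(w, data)

print(synced_grad, full_grad)
```

In a real multi-GPU or multi-node run, the averaging step would be performed by a communication library rather than a Python function.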

Documentation and installation instructions

https://nvidia.github.io/OpenSeq2Seq/

Features

1. Models for:
   1. Neural Machine Translation
   2. Automatic Speech Recognition
   3. Speech Synthesis
   4. Language Modeling
   5. NLP tasks (sentiment analysis)
2. Data-parallel distributed training
   1. Multi-GPU
   2. Multi-node
3. Mixed-precision training for NVIDIA Volta/Turing GPUs
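
The sketch below illustrates one reason mixed-precision training needs care: very small gradient values underflow to zero in half precision. Loss scaling is a common remedy in mixed-precision training generally; this README does not describe how OpenSeq2Seq itself handles it, so treat the helper `to_fp16`, the gradient value, and the scale factor as assumptions made for illustration. The example uses only the Python standard library.

```python
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


# Illustrative gradient, smaller than fp16's smallest subnormal (about 6e-8).
tiny_grad = 1e-8
# Hypothetical scale factor chosen for this example.
loss_scale = 1024.0

# Stored directly in fp16, the gradient is lost.
unscaled = to_fp16(tiny_grad)

# Scaled before the cast, the value fits in fp16.
scaled = to_fp16(tiny_grad * loss_scale)

# Dividing by the scale in higher precision recovers an approximation.
recovered = scaled / loss_scale

print(unscaled, recovered)
```

The recovered value is only approximately equal to the original because half precision has limited resolution, but it is no longer zero.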

Software Requirements

Python >= 3.5
TensorFlow >= 1.10
CUDA >= 9.0, cuDNN >= 7.0
Horovod >= 0.13 (using Horovod is not required, but is highly recommended for multi-GPU setup)
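
When checking these minimums in a script, compare version numbers field by field rather than as strings, because "1.9" sorts after "1.10" as text. The following is a minimal sketch under that assumption; `parse_version`, `meets_minimum`, and the `MINIMUMS` table are hypothetical helpers written for this example and are not part of OpenSeq2Seq.

```python
import sys
from itertools import takewhile


def parse_version(text):
    """Turn a string such as '1.10.0rc1' into (1, 10, 0)."""
    fields = []
    for piece in text.split("."):
        # Keep only the leading digits of each dot-separated field.
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        fields.append(int(digits))
    return tuple(fields)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad with zeros so that '1.10' and '1.10.0' compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Minimum versions copied from the requirements list above.
MINIMUMS = {"TensorFlow": "1.10", "CUDA": "9.0", "cuDNN": "7.0", "Horovod": "0.13"}

# The Python requirement can be checked directly against the interpreter.
python_ok = sys.version_info >= (3, 5)

print(meets_minimum("1.12.0", MINIMUMS["TensorFlow"]))
print(meets_minimum("1.9.0", MINIMUMS["TensorFlow"]))
```

The installed version strings passed to `meets_minimum` here are made-up inputs; in practice they would come from whatever each installed package reports.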

Acknowledgments

The speech-to-text workflow uses some parts of the Mozilla DeepSpeech project.

The implementation of the beam search decoder with language model re-scoring (in decoders) is based on Baidu DeepSpeech.

The text-to-text workflow uses some functions from Tensor2Tensor and the Neural Machine Translation (seq2seq) Tutorial.

Disclaimer

This is a research project, not an official NVIDIA product.

Paper

If you use OpenSeq2Seq, please cite this paper:

@misc{openseq2seq,
    title={Mixed-Precision Training for NLP and Speech Recognition with OpenSeq2Seq},
    author={Oleksii Kuchaiev and Boris Ginsburg and Igor Gitman and Vitaly Lavrukhin and Jason Li and Huyen Nguyen and Carl Case and Paulius Micikevicius},
    year={2018},
    eprint={1805.10387},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Name and owner	NVIDIA/OpenSeq2Seq
Primary language	Python
Languages	Shell (number of languages: 8)
License	Apache License 2.0

Created	2017-09-08 20:53:07
Last pushed	2021-05-11 15:50:05
Last commit	2020-09-09 07:37:04
Releases	8
Latest release	18.12
First release	v0.2

Stars	1.6k
Watchers	89
Forks	371
Commits	1.7k
Issues	256
Open issues	80
Pull requests	245
Open pull requests	7
Closed pull requests	46
