Dilated recurrent neural networks

Shiyu Chang, Yang Zhang, Wei Han, Mo Yu, Xiaoxiao Guo, Wei Tan, Xiaodong Cui, Michael Witbrock, Mark Allan Hasegawa-Johnson, Thomas S Huang

Research output: Contribution to journal › Conference article

Abstract

Learning with recurrent neural networks (RNNs) on long sequences is a notoriously difficult task. There are three major challenges: 1) complex dependencies, 2) vanishing and exploding gradients, and 3) efficient parallelization. In this paper, we introduce a simple yet effective RNN connection structure, the DilatedRNN, which simultaneously tackles all of these challenges. The proposed architecture is characterized by multi-resolution dilated recurrent skip connections, and can be combined flexibly with diverse RNN cells. Moreover, the DilatedRNN reduces the number of parameters needed and enhances training efficiency significantly, while matching state-of-the-art performance (even with standard RNN cells) in tasks involving very long-term dependencies. To provide a theory-based quantification of the architecture's advantages, we introduce a memory capacity measure, the mean recurrent length, which is more suitable for RNNs with long skip connections than existing measures. We rigorously prove the advantages of the DilatedRNN over other recurrent neural architectures. The code for our method is publicly available.
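
For orientation, below is a minimal NumPy sketch of the dilated recurrence the abstract describes: a recurrent layer whose hidden state at step t is computed from the state at step t - d for dilation d, with layers stacked at exponentially increasing dilations. This is an illustrative forward pass only, not the authors' released implementation; the function names, weight initialization, and layer sizes are assumptions made for the example.

    import numpy as np

    def dilated_rnn_layer(inputs, dilation, hidden_size, cell, rng):
        # Single recurrent layer whose state skips `dilation` time steps:
        # h_t = cell(x_t, h_{t - dilation}). Over a length-T sequence the
        # gradient path therefore crosses only about T / dilation transitions.
        T, input_size = inputs.shape
        Wx = 0.1 * rng.standard_normal((hidden_size, input_size))
        Wh = 0.1 * rng.standard_normal((hidden_size, hidden_size))
        # One independent state per phase 0..dilation-1; phase t % dilation
        # holds the state last updated at step t - dilation.
        states = [np.zeros(hidden_size) for _ in range(dilation)]
        outputs = np.zeros((T, hidden_size))
        for t in range(T):
            h = cell(inputs[t], states[t % dilation], Wx, Wh)
            states[t % dilation] = h
            outputs[t] = h
        return outputs

    def vanilla_tanh_cell(x, h_prev, Wx, Wh):
        # Plain tanh RNN cell; the skip structure, not the cell, carries the idea.
        return np.tanh(Wx @ x + Wh @ h_prev)

    # Stack layers with exponentially increasing dilations (1, 2, 4, 8), the
    # multi-resolution structure the abstract refers to.
    rng = np.random.default_rng(0)
    h = rng.standard_normal((64, 8))   # (time steps, input features)
    for d in (1, 2, 4, 8):
        h = dilated_rnn_layer(h, d, hidden_size=16, cell=vanilla_tanh_cell, rng=rng)
    print(h.shape)  # (64, 16)

Because layer k only needs states from d_k steps back, each layer can process d_k time steps in parallel, which is the source of the parallelization benefit claimed above.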

Original language: English (US)
Pages (from-to): 77-87
Number of pages: 11
Journal: Advances in Neural Information Processing Systems
ISSN: 1049-5258
Volume: 2017-December
State: Published - Jan 1 2017
Event: 31st Annual Conference on Neural Information Processing Systems, NIPS 2017 - Long Beach, United States
Duration: Dec 4 2017 - Dec 9 2017

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Cite this

Chang, S., Zhang, Y., Han, W., Yu, M., Guo, X., Tan, W., ... Huang, T. S. (2017). Dilated recurrent neural networks. Advances in Neural Information Processing Systems, 2017-December, 77-87.
