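In 2017, researchers from Michigan State University, IBM Research, and Cornell University presented a study at the Knowledge Discovery and Data Mining (KDD) conference.[25][26][27] Their work describes a novel neural network that outperforms the widely used LSTM neural network on certain datasets.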
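An RNN that uses LSTM units can be trained in a supervised fashion on a set of training sequences. Training uses an optimization algorithm such as gradient descent, combined with backpropagation through time (BPTT) to compute the gradients needed during the optimization process. Each weight of the LSTM network is then changed in proportion to the derivative of the error (measured at the output layer of the LSTM network) with respect to that weight.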
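The training procedure described above can be illustrated with a minimal sketch. It assumes a single scalar LSTM unit without peephole connections, a squared error measured only on the final hidden state, and plain gradient descent. The names `forward`, `loss_and_grads`, and `sgd_step`, the starting weights, and the toy sequence are hypothetical choices made for illustration and do not come from any particular library or from the cited papers.

```python
import math

GATES = ("i", "f", "o", "z")  # input, forget, output gates and candidate value

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(params, xs):
    """Run a scalar LSTM over xs; return the per-step cache and the final h."""
    h, c, cache = 0.0, 0.0, []
    for x in xs:
        # params[g] = [input weight, recurrent weight, bias] for gate g
        pre = {g: params[g][0] * x + params[g][1] * h + params[g][2] for g in GATES}
        i, f, o = sigmoid(pre["i"]), sigmoid(pre["f"]), sigmoid(pre["o"])
        z = math.tanh(pre["z"])
        c_new = f * c + i * z            # cell state update
        h_new = o * math.tanh(c_new)     # hidden state (the unit's output)
        cache.append((x, h, c, i, f, o, z, c_new))
        h, c = h_new, c_new
    return cache, h

def loss_and_grads(params, xs, target):
    """Squared error on the last output, with gradients computed by BPTT."""
    cache, h_last = forward(params, xs)
    loss = 0.5 * (h_last - target) ** 2
    grads = {g: [0.0, 0.0, 0.0] for g in GATES}
    dh, dc = h_last - target, 0.0        # error derivative at the output
    for x, h_prev, c_prev, i, f, o, z, c in reversed(cache):
        tc = math.tanh(c)
        dc += dh * o * (1.0 - tc * tc)   # error reaching the cell state
        d_pre = {                        # derivatives w.r.t. pre-activations
            "i": dc * z * i * (1.0 - i),
            "f": dc * c_prev * f * (1.0 - f),
            "o": dh * tc * o * (1.0 - o),
            "z": dc * i * (1.0 - z * z),
        }
        for g in GATES:
            grads[g][0] += d_pre[g] * x
            grads[g][1] += d_pre[g] * h_prev
            grads[g][2] += d_pre[g]
        # Propagate the error one step back in time.
        dh = sum(d_pre[g] * params[g][1] for g in GATES)
        dc = dc * f
    return loss, grads

def sgd_step(params, grads, lr):
    """Change each weight in proportion to the derivative of the error."""
    for g in GATES:
        for k in range(3):
            params[g][k] -= lr * grads[g][k]

# Toy usage: arbitrary starting weights, one short sequence, one target value.
params = {g: [0.5, -0.3, 0.1] for g in GATES}
xs, target = [1.0, 0.5, -0.5], 0.8
for step in range(200):
    loss, grads = loss_and_grads(params, xs, target)
    sgd_step(params, grads, lr=0.1)
```

A practical implementation would use vector-valued states, mini-batches of sequences, and an automatic differentiation framework instead of hand-written derivatives. The scalar form is used here only to keep each step of the backward pass visible.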
Li, Xiangang; Wu, Xihong (15 October 2014). "Constructing Long Short-Term Memory based Deep Recurrent Neural Networks for Large Vocabulary Speech Recognition". arXiv:1410.4281 [cs.CL].
Klaus Greff; Rupesh Kumar Srivastava; Jan Koutník; Bas R. Steunebrink; Jürgen Schmidhuber (2015). "LSTM: A Search Space Odyssey". IEEE Transactions on Neural Networks and Learning Systems 28 (10): 2222–2232. arXiv:1503.04069. doi:10.1109/TNNLS.2016.2582924. PMID 27411231.
Felix A. Gers; Jürgen Schmidhuber; Fred Cummins (2000). "Learning to Forget: Continual Prediction with LSTM". Neural Computation 12 (10): 2451–2471. doi:10.1162/089976600300015015.
Graves, A.; Liwicki, M.; Fernández, S.; Bertolami, R.; Bunke, H.; Schmidhuber, J. (May 2009). "A Novel Connectionist System for Unconstrained Handwriting Recognition". IEEE Transactions on Pattern Analysis and Machine Intelligence 31 (5): 855–868. doi:10.1109/tpami.2008.137. ISSN 0162-8828. PMID 19299860.
Xingjian Shi; Zhourong Chen; Hao Wang; Dit-Yan Yeung; Wai-kin Wong; Wang-chun Woo (2015). "Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting". Proceedings of the 28th International Conference on Neural Information Processing Systems: 802–810. arXiv:1506.04214. Bibcode:2015arXiv150604214S.
Graves, Alex; Fernández, Santiago; Gomez, Faustino (2006). "Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks". In Proceedings of the International Conference on Machine Learning, ICML 2006: 369–376.
Mayer, H.; Gomez, F.; Wierstra, D.; Nagy, I.; Knoll, A.; Schmidhuber, J. (October 2006). A System for Robotic Heart Surgery that Learns to Tie Knots Using Recurrent Neural Networks. pp. 543–548. doi:10.1109/IROS.2006.282190. ISBN 978-1-4244-0258-8.
Graves, A.; Schmidhuber, J. (2005). "Framewise phoneme classification with bidirectional LSTM and other neural network architectures". Neural Networks 18 (5–6): 602–610. doi:10.1016/j.neunet.2005.06.042. PMID 16112549.
Graves, Alex; Mohamed, Abdel-rahman; Hinton, Geoffrey (2013). "Speech Recognition with Deep Recurrent Neural Networks". Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on: 6645–6649.
Eck, Douglas; Schmidhuber, Jürgen (2002-08-28). Learning the Long-Term Structure of the Blues. Lecture Notes in Computer Science. 2415. Springer, Berlin, Heidelberg. pp. 284–289. doi:10.1007/3-540-46084-5_47. ISBN 978-3540460848.
Schmidhuber, J.; Gers, F.; Eck, D. (2002). "Learning nonregular languages: A comparison of simple recurrent networks and LSTM". Neural Computation 14 (9): 2039–2041. doi:10.1162/089976602320263980. PMID 12184841.
A. Graves, J. Schmidhuber. Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks. Advances in Neural Information Processing Systems 22, NIPS'22, pp 545–552, Vancouver, MIT Press, 2009.
M. Baccouche, F. Mamalet, C. Wolf, C. Garcia, A. Baskurt. Sequential Deep Learning for Human Action Recognition. 2nd International Workshop on Human Behavior Understanding (HBU), A.A. Salah, B. Lepri ed. Amsterdam, Netherlands. pp. 29–39. Lecture Notes in Computer Science 7065. Springer. 2011.
Hochreiter, S.; Heusel, M.; Obermayer, K. (2007). "Fast model-based protein homology detection without alignment". Bioinformatics 23 (14): 1728–1736. doi:10.1093/bioinformatics/btm247. PMID 17488755.
Thireou, T.; Reczko, M. (2007). "Bidirectional Long Short-Term Memory Networks for predicting the subcellular localization of eukaryotic proteins". IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB) 4 (3): 441–446. doi:10.1109/tcbb.2007.1015. PMID 17666763.