Skip to content

Lstm

LSTM

Long Short-Term Memory networks

→ Type of recurrent-neural-network that solves vanishing gradient problem → Used in NLP for sequential data processing → Contains memory cells with gates (forget, input, output)

Solves vanishing-gradient and exploding-gradient in rnn

Uses gates to control information flow:

  • Forget gate
  • Input gate
  • Output gate

Before transformer architectures dominated NLP


*References

#ml-notes

lstm


*References

On this page