A simple overview of RNN, LSTM and Attention Mechanism
RNNs to the rescue
There’s something magical about Recurrent Neural Networks.
— Andrej Karpathy
Recurrent Neural Networks address a key drawback of vanilla NNs, their inability to handle variable-length sequential input, with a simple yet elegant mechanism, and are great at modeling sequential data.
rolled RNN
Does the rolled RNN diagram look strange to you? Let me explain and clear up the confusion.
First, take a simple feed-forward neural network, shown below. The input (red dot) flows into the hidden layer (blue dot), which produces the output (black dot).
A simple NN
An RNN feeds its output back to itself at the next time step, forming a loop that passes along much-needed information.
RNN feeding hidden state value to itself
To better understand the flow, look at the unrolled version below, where the same RNN cell receives a different input (a token of the sequence) and produces an output at each time step.
Unrolled RNN, from time step 0 to t
The network A takes an input at each time step, emits an output h, and passes information forward to itself for the next input at step t+1. The incoming sequential data is first encoded by the RNN, and that encoding is then passed to another feed-forward network that determines the intent/action.
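The recurrence described above can be sketched in a few lines of NumPy. This is a minimal, untrained toy (random weights, made-up dimensions, illustrative names like `W_xh`), not a production implementation; the point is only the update rule h_t = tanh(x_t · W_xh + h_{t-1} · W_hh + b).

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One unrolled step: mix the current input with the previous hidden state."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 3                      # toy sizes
W_xh = rng.normal(size=(input_dim, hidden_dim)) * 0.1   # input-to-hidden weights
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1  # hidden-to-hidden (the loop)
b_h = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)                  # initial hidden state
for t in range(5):                        # five time steps of a toy sequence
    x_t = rng.normal(size=input_dim)      # stand-in for an embedded token
    h = rnn_step(x_t, h, W_xh, W_hh, b_h) # h carries information to step t+1
print(h.shape)  # (3,)
```

Note that the same weight matrices are reused at every time step; only the hidden state changes, which is exactly what the rolled diagram's loop depicts.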
RNNs have become the go-to networks for various tasks involving sequential data, such as speech recognition, language modeling, translation, and image captioning.
Let's say we ask a question of an in-house AI assistant named Piri (or whatever 'iri' you prefer): "What time is it?" Below, we break the sequence into tokens and color-code it.
How RNNs work for the tokens of sequence
Final query retrieved as a result of processing the entire sequence
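A hedged sketch of how that query might be processed token by token: each word updates the hidden state, and the final hidden state, which summarizes the whole sequence, is handed to a feed-forward layer for the intent decision. The vocabulary, dimensions, and weights here are all made up for illustration, and the weights are untrained, so the predicted intent is meaningless.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab = {"what": 0, "time": 1, "is": 2, "it": 3, "?": 4}
embed_dim, hidden_dim, n_intents = 8, 6, 3

E = rng.normal(size=(len(vocab), embed_dim)) * 0.1       # toy token embeddings
W_xh = rng.normal(size=(embed_dim, hidden_dim)) * 0.1    # input-to-hidden
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1   # hidden-to-hidden
W_hy = rng.normal(size=(hidden_dim, n_intents)) * 0.1    # decision layer

h = np.zeros(hidden_dim)
for token in "what time is it ?".split():
    h = np.tanh(E[vocab[token]] @ W_xh + h @ W_hh)  # accumulate context token by token

logits = h @ W_hy                # feed-forward network acting on the final encoding
intent = int(np.argmax(logits))  # in a real system: index of an intent like "ask_time"
```

The key idea matches the figure: only after the entire sequence has been consumed does the hidden state represent the full query.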
Memory: An essential requirement for making Neural Networks smart(er)
Humans retrieve information from memory, short-term or long-term, combine it with current information, and reason their way to the next action (or act on impulse/habit, again shaped by previous experience).
The idea behind making an RNN hold on to previous information, or state, is similar. Since the output of a recurrent neuron at time step t is a function of all inputs up to time step t-1 (think of it as the current input plus accumulated information), this mechanism can be considered a form of memory. Any part of a neural network that preserves state, even partially, across time steps is usually referred to as a memory cell.
Each recurrent neuron produces an output as well as a hidden state, which is passed to the neuron at the next time step.
hidden state
Unrolled RNN with hidden state and output at each time step
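To make the output-plus-hidden-state distinction concrete, here is a toy loop (random, untrained weights; illustrative names) that collects a separate output at every time step while the hidden state alone is carried forward:

```python
import numpy as np

rng = np.random.default_rng(2)
hidden_dim, out_dim = 6, 4
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1  # hidden-to-hidden weights
W_hy = rng.normal(size=(hidden_dim, out_dim)) * 0.1     # hidden-to-output weights
xs = rng.normal(size=(5, hidden_dim)) * 0.1             # 5 pre-projected toy inputs

h = np.zeros(hidden_dim)
outputs = []
for x_t in xs:
    h = np.tanh(x_t + h @ W_hh)  # hidden state: carried to the next step
    outputs.append(h @ W_hy)     # output: emitted at this step, not fed back
outputs = np.stack(outputs)      # shape (5, 4): one output per time step
```

Depending on the task, one can use every per-step output (e.g. tagging each token) or only the last one (e.g. classifying the whole sequence), as in the intent example above.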
Trending AI/ML Article Identified & Digested via Granola by Ramsey Elbasheer; a Machine-Driven RSS Bot
via WordPress https://ramseyelbasheer.wordpress.com/2021/01/30/a-simple-overview-of-rnn-lstm-and-attention-mechanism/