In progress
Deep Learning This series of blog posts contains my notes from the class 11-785 Introduction to Deep Learning, taught by Bhiksha Raj at CMU. For the sake of my own understanding and for simplicity, the blog has bee...
Attention Models Problem with vanilla Seq2Seq Models In the vanilla sequence-to-sequence (Seq2Seq) model with an encoder–decoder setup: The encoder reads the entire input sequence (e.g., I ...
Neural Networks Depth - length of the longest path from source to sink Layer - set of all neurons at the same depth with respect to the source Gradient For a scalar function $f(x)$ wit...
Planning and Decision Making
Deep Learning - Attention & Transformers