This post is for people working on or learning deep learning for natural language processing. Minimum background: familiarity with Hugging Face Transformers or equivalent.
Having read the following three papers is also helpful:
- Neural Machine Translation by Jointly Learning to Align and Translate
- Effective Approaches to Attention-based Neural Machine Translation
- Attention Is All You Need
There are three styles of attention mechanism that should not be confused with one another:
| Name | TensorFlow implementation | PyTorch implementation | Paper |
| --- | --- | --- | --- |
| Bahdanau attention | `AdditiveAttention` layer | attention decoder (seq2seq tutorial) | Neural Machine Translation by Jointly Learning to Align and Translate |
| Luong attention | `Attention` layer | decoder's attention (seq2seq tutorial) | Effective Approaches to Attention-based Neural Machine Translation |
| Multi-head attention | `MultiHeadAttention` layer | `MultiheadAttention` layer | Attention Is All You Need |
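To make the distinction concrete, here is a minimal NumPy sketch of the two classical scoring functions (variable names and toy shapes are my own; real implementations operate on batches and learn the weight matrices):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy shapes: decoder state s of size d, encoder states H of shape (T, d)
d, T = 4, 3
rng = np.random.default_rng(0)
s = rng.normal(size=d)
H = rng.normal(size=(T, d))

# Bahdanau (additive) scoring: score(s, h) = v^T tanh(W1 @ s + W2 @ h)
W1 = rng.normal(size=(d, d))
W2 = rng.normal(size=(d, d))
v = rng.normal(size=d)
additive_scores = np.array([v @ np.tanh(W1 @ s + W2 @ h) for h in H])

# Luong (multiplicative, "dot" variant) scoring: score(s, h) = s^T h
dot_scores = H @ s

# In both cases, attention weights are a softmax over the scores,
# and the context vector is the weighted sum of encoder states.
weights = softmax(dot_scores)
context = weights @ H
```

Multi-head attention from "Attention Is All You Need" builds on the scaled dot-product form (`scores / sqrt(d)`), projecting queries, keys, and values into several subspaces and running the dot-product attention in parallel in each.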
Recommended reading: the three papers listed above, in that order.
NOTE: Please comment with any attention mechanisms not covered in this post, along with their papers and implementations.