In short, an attention-based model "focuses" on each element of the input (a word in a sentence or a different position in an image, etc.). "Focusing" means projecting different levels of attention so that the input elements are treated differently and each element of the input is weighted differently to influence the result; a non-attention model treats each element "equally".
2021-09-21