Which two are essential components of the Transformer architecture? (Select TWO)
Correct: Core component of Transformers.
Why this answer
The self-attention mechanism is essential because it allows each token in the input sequence to attend to every other token, capturing long-range dependencies without the sequential bottleneck of RNNs. This mechanism computes attention scores using queries, keys, and values, enabling parallel processing and forming the core of the Transformer's ability to model context.
Exam trap
Oracle often tests the misconception that Transformers still use recurrence or convolution for sequence processing, when in fact they rely solely on self-attention and feed-forward networks.