Introduction

The input and output sequences do not need to be the same length; the output length is decided by the model itself. A model that works this way is called a Seq2Seq (sequence-to-sequence) model.

(figure: Seq2Seq model structure)

A Seq2Seq model is made of two parts: an encoder and a decoder.

The Transformer is an enhanced Seq2Seq model.
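To make the encoder-decoder split concrete, here is a minimal sketch using PyTorch's `nn.Transformer`; the dimensions, sequence lengths, and random tensors are illustrative assumptions, not values from the notes.

```python
import torch
import torch.nn as nn

# Minimal Seq2Seq sketch: source and target lengths differ.
model = nn.Transformer(d_model=64, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2)

src = torch.rand(10, 1, 64)  # input sequence: 10 steps (e.g. speech frames)
tgt = torch.rand(4, 1, 64)   # output sequence so far: 4 steps (e.g. characters)

out = model(src, tgt)        # output length follows the target, not the source
print(out.shape)             # torch.Size([4, 1, 64])
```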

Encoder

(figure: the encoder in the Transformer)


The encoder is not a single layer; it is a stack of multiple identical blocks.

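A minimal sketch of this stacking, using PyTorch's built-in encoder layer; the layer count and dimensions are assumptions for illustration.

```python
import torch
import torch.nn as nn

# The encoder is not one layer but a stack of identical blocks.
block = nn.TransformerEncoderLayer(d_model=64, nhead=4)
encoder = nn.TransformerEncoder(block, num_layers=6)  # 6 stacked blocks

x = torch.rand(10, 1, 64)  # (seq_len, batch, d_model)
h = encoder(x)             # same shape; each block refines the representation
print(h.shape)             # torch.Size([10, 1, 64])
```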

Decoder

There are two kinds of decoders; this section introduces the first one, the autoregressive (AT) decoder.

The decoder generates the output sequence.

There is a way to use the sequence generated so far as the decoder's input: starting from a special BEGIN token, the decoder's own output is fed back in at each step.

Each output step is a distribution over a dictionary (the vocabulary); after a softmax, the token with the highest probability is chosen, just like a classification problem.


The next character is generated based on the characters produced so far.


What about error propagation? Since each step conditions on earlier outputs, one wrong token can mislead everything that follows.

How does the model decide the output length?

We need a special 'stop' token in the dictionary; when the decoder outputs it, generation ends.
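A minimal sketch of this greedy autoregressive loop; the `decoder` model and the token ids `BEGIN_ID` and `END_ID` are hypothetical names for illustration, not from the notes.

```python
import torch

BEGIN_ID, END_ID = 0, 1  # hypothetical ids of the special BEGIN and stop tokens

def greedy_decode(decoder, memory, max_len=50):
    """Generate one token at a time; each step conditions on the tokens so far."""
    tokens = [BEGIN_ID]
    for _ in range(max_len):
        inp = torch.tensor(tokens).unsqueeze(0)       # (1, t): output so far as input
        logits = decoder(inp, memory)                 # (1, t, vocab_size)
        probs = torch.softmax(logits[0, -1], dim=-1)  # distribution over the dictionary
        next_id = int(probs.argmax())                 # take the most probable token
        if next_id == END_ID:                         # the special 'stop' token
            break                                     # model decides the length itself
        tokens.append(next_id)                        # feed the output back as input
    return tokens[1:]
```

Because each step feeds its own (possibly wrong) output back in, a mistake early in the loop can mislead every later step, which is exactly the error-propagation concern above.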

NAT (Non-Autoregressive)

The output is not generated one token after another; all positions are produced in parallel.

The output has the same length as the number of BEGIN tokens fed in.


For example, in speech synthesis the speech can be slowed down by feeding in more BEGIN tokens, which lengthens the output.

While NAT decodes in parallel and is therefore faster, it usually performs worse than AT.
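A minimal sketch of the non-autoregressive idea; the `decoder` and `out_len` are hypothetical, and in practice the output length would come from a separate length predictor.

```python
import torch

BEGIN_ID = 0  # hypothetical id of the special BEGIN token

def nat_decode(decoder, memory, out_len):
    """Generate every position in one parallel pass."""
    inp = torch.full((1, out_len), BEGIN_ID)   # one BEGIN token per output position
    logits = decoder(inp, memory)              # (1, out_len, vocab_size), single pass
    return logits.argmax(dim=-1)[0].tolist()   # all tokens decoded at once
```

Increasing `out_len` (feeding more BEGIN tokens) directly lengthens the output, which is what makes the length controllable.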

Putting it together

(figure: the full model with encoder and decoder put together)

CROSS-ATTENTION

(figure: cross-attention between encoder and decoder)

In cross-attention, the queries come from the decoder while the keys and values come from the encoder output; this is how the decoder reads information from the encoder.
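A minimal single-head sketch of cross-attention with illustrative dimensions; `W_q`, `W_k`, `W_v` stand for the usual learned projections.

```python
import torch
import torch.nn as nn

d = 64
W_q, W_k, W_v = nn.Linear(d, d), nn.Linear(d, d), nn.Linear(d, d)

enc_out = torch.rand(1, 10, d)  # encoder output: keys and values come from here
dec_h   = torch.rand(1, 4, d)   # decoder hidden states: queries come from here

Q, K, V = W_q(dec_h), W_k(enc_out), W_v(enc_out)
scores = Q @ K.transpose(-2, -1) / d ** 0.5  # (1, 4, 10) decoder-to-encoder scores
attn = torch.softmax(scores, dim=-1)         # each decoder step attends over encoder steps
out = attn @ V                               # (1, 4, 64) encoder info pulled into decoder
print(out.shape)
```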

