2024-03-28T16:19:22Z

Transformer

GitHub User: eason280711

  • Sequence-to-sequence

  • Application Scenarios

  • Transformer's Encoder

    • Architecture
    • Residual
    • Norm
  • Transformer's Decoder

    • Architecture
    • Autoregressive
    • Masked Multi-Head Attention
    • Non-autoregressive
    • Cross Attention
  • Training

    • The Loss Function
    • Teacher Forcing
  • Copy Mechanism

  • Guided Attention

  • Beam Search

  • Sampling

  • Optimizing Evaluation Metrics

  • Scheduled Sampling