TransformerDecoderLayer — PyTorch 1.10.1 documentation
https://pytorch.org/docs/stable/generated/torch.nn.TransformerDecoder...TransformerDecoderLayer¶ class torch.nn. TransformerDecoderLayer (d_model, nhead, dim_feedforward=2048, dropout=0.1, activation=<function relu>, layer_norm_eps=1e-05, batch_first=False, norm_first=False, device=None, dtype=None) [source] ¶. TransformerDecoderLayer is made up of self-attn, multi-head-attn and feedforward network. …