RoFormer: Enhanced Transformer with Rotary Position Embedding
Apr 20, 2021

Position encoding in the transformer architecture provides supervision for dependency modeling between elements at different positions in the sequence. We investigate various methods to encode positional information in transformer-based language models and propose a novel implementation named Rotary Position Embedding (RoPE). The proposed RoPE encodes the absolute position with a rotation matrix and meanwhile incorporates the explicit relative position dependency in the self-attention formulation.
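The rotation the abstract describes can be sketched concretely: each consecutive pair of dimensions in a query or key vector is rotated by an angle proportional to the token's position. Below is a minimal NumPy sketch assuming the interleaved dimension pairing and the base-10000 frequency schedule from the paper; the function name and shapes are illustrative, not an official API.

```python
import numpy as np

def rotary_position_embedding(x, base=10000):
    """Sketch of RoPE for x of shape (seq_len, dim); dim must be even.

    Each dimension pair (2i, 2i+1) at position m is rotated by the
    angle m * theta_i, where theta_i = base^(-2i/dim).
    """
    seq_len, dim = x.shape
    half = dim // 2
    # Per-pair rotation frequencies theta_i = base^(-2i/dim)
    theta = base ** (-np.arange(half) * 2.0 / dim)
    # Angle for position m and pair i: m * theta_i
    angles = np.arange(seq_len)[:, None] * theta[None, :]  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    # Even/odd components of each dimension pair
    x1, x2 = x[:, 0::2], x[:, 1::2]
    # Apply the 2D rotation to every pair
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

Applying the same rotation to both query and key vectors before the attention dot product makes the score between positions m and n depend only on the offset m − n, which is the explicit relative-position dependency the abstract refers to.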