Draft:RWKV

RWKV[1] stands for Receptance Weighted Key Value, a neural network architecture designed for generative language modeling. RWKV is a stateful, linear architecture: it replaces the Transformer's quadratic self-attention with a recurrent time-mixing mechanism, so inference carries a fixed-size state from token to token and can handle sequences of any length at constant memory per step. In the introducing paper, RWKV was shown to match the performance of similarly sized Transformers at model scales up to 14 billion parameters.
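
A minimal sketch of the per-token recurrence may clarify the "stateful, linear" claim. The paper expresses its attention replacement as a WKV operator whose numerator and denominator are exponentially decayed running sums, so each step updates a fixed-size state instead of attending over all previous tokens. The NumPy function below is an illustrative reconstruction under that description, not code from an official RWKV implementation: the names wkv_recurrent, w (per-channel decay), and u (current-token bonus) are chosen here for exposition, and the numerical-stability bookkeeping (tracking a running maximum exponent) used by real implementations is omitted.

    import numpy as np

    def wkv_recurrent(ks, vs, w, u):
        """Illustrative WKV recurrence for one sequence.

        ks, vs: (T, d) arrays of keys and values.
        w:      (d,) positive per-channel decay rates.
        u:      (d,) per-channel bonus applied to the current token.
        Each step costs O(d), independent of how many tokens came before.
        """
        T, d = ks.shape
        num = np.zeros(d)          # decayed running sum of exp(k_i) * v_i
        den = np.zeros(d)          # decayed running sum of exp(k_i)
        out = np.empty((T, d))
        decay = np.exp(-w)
        for t in range(T):
            cur = np.exp(u + ks[t])                      # current token gets the u bonus
            out[t] = (num + cur * vs[t]) / (den + cur)
            num = decay * num + np.exp(ks[t]) * vs[t]    # fold token t into the state
            den = decay * den + np.exp(ks[t])
        return out

    # Toy usage: 5 tokens, 4 channels
    rng = np.random.default_rng(0)
    y = wkv_recurrent(rng.standard_normal((5, 4)), rng.standard_normal((5, 4)),
                      w=np.full(4, 0.5), u=np.zeros(4))

Because the carried state is just the pair of running sums, generation over arbitrarily long sequences needs constant memory per step, which is the property described above.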

References

  1. Peng, Bo; et al. (2023). "RWKV: Reinventing RNNs for the Transformer Era". arXiv:2305.13048 [cs.CL].