Power Attention

A CUDA implementation of symmetric power attention, which achieves transformer-level performance with the linear-cost computation of an RNN.

Getting Started