Power Attention
A CUDA implementation of symmetric power attention, which achieves transformer-level performance while computing as an RNN whose cost is linear in sequence length.
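To make the "linear-cost RNN" idea concrete, here is a minimal pure-Python sketch for degree 2. It assumes that power attention weights each key by `(q·k)^p`, and that the symmetric feature map keeps only the upper-triangular entries of `x ⊗ x`. Normalization and any gating are omitted. All function names are hypothetical and are not this package's API; the real kernels are written in CUDA.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sym_power2(x):
    """Degree-2 symmetric feature map (illustrative helper).

    Keeps only entries i <= j of the outer product x (x) x, scaling the
    off-diagonal ones by sqrt(2), so dot(phi(a), phi(b)) == dot(a, b) ** 2.
    """
    d = len(x)
    feats = []
    for i in range(d):
        for j in range(i, d):
            scale = 1.0 if i == j else math.sqrt(2.0)
            feats.append(scale * x[i] * x[j])
    return feats

def power_attention_quadratic(Q, K, V, p=2):
    """Causal, unnormalized attention with weights (q.k)^p over all pairs: O(T^2)."""
    out = []
    for t, q in enumerate(Q):
        y = [0.0] * len(V[0])
        for s in range(t + 1):
            w = dot(q, K[s]) ** p
            for c, v in enumerate(V[s]):
                y[c] += w * v
        out.append(y)
    return out

def power_attention_recurrent(Q, K, V):
    """Same outputs for p=2 using a fixed-size running state: O(T) in sequence length."""
    d, dv = len(K[0]), len(V[0])
    D = d * (d + 1) // 2  # size of the symmetric degree-2 feature vector
    state = [[0.0] * dv for _ in range(D)]
    out = []
    for q, k, v in zip(Q, K, V):
        # State update: S += phi(k) v^T
        phi_k = sym_power2(k)
        for i in range(D):
            for c in range(dv):
                state[i][c] += phi_k[i] * v[c]
        # Readout: y = phi(q)^T S
        phi_q = sym_power2(q)
        out.append([sum(phi_q[i] * state[i][c] for i in range(D)) for c in range(dv)])
    return out
```

Both functions return the same values up to floating-point rounding. The recurrent form never revisits earlier tokens, which is where the linear cost in sequence length comes from; the price is a state whose size grows with the feature dimension `d * (d + 1) / 2`.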
Getting Started
- Installation: Build configuration and requirements
- Benchmarking: Performance evaluation methodology