Parallelizing Linear Transformers with the Delta Rule over Sequence Length
1 week ago · NeurIPS