SpecExec: Massively Parallel Speculative Decoding For Interactive LLM Inference on Consumer Devices
1 week ago · NeurIPS