Scalable MatMul-free Language Modeling
https://x.com/DimitrisPapail/status/1799629008558014683
Why read the FlashAttention paper
"MatMul-free isn't Mult-free. There are Hadamard products.
- Less caching on GPUs without MMM/VMMs. But custom HW benefits way more."
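A minimal sketch of the point in the quote, assuming weights restricted to {-1, 0, 1} (a ternary-weight scheme; the note above does not spell this out). Under that assumption a matrix-vector product needs only additions and subtractions, while a Hadamard (element-wise) product still multiplies. The names `ternary_matvec`, `hadamard`, and `gate` are illustrative, not taken from the paper's code.

```python
def ternary_matvec(x, w):
    """Compute x @ w for a ternary matrix w (entries in {-1, 0, 1}).

    Only additions and subtractions are used; no multiplications.
    """
    n_out = len(w[0])
    y = [0.0] * n_out
    for i, xi in enumerate(x):
        for j in range(n_out):
            t = w[i][j]
            if t == 1:
                y[j] += xi      # weight +1: accumulate
            elif t == -1:
                y[j] -= xi      # weight -1: subtract
            # weight 0: contributes nothing
    return y


def hadamard(a, b):
    """Element-wise product; this step still needs real multiplications."""
    return [ai * bi for ai, bi in zip(a, b)]


x = [0.5, -2.0, 3.0]
w = [[1, 0],
     [-1, 1],
     [0, -1]]

y = ternary_matvec(x, w)    # multiplication-free
gate = [0.5, 2.0]           # hypothetical gating vector
out = hadamard(gate, y)     # one multiply per element

# Reference result with ordinary multiply-accumulate, for comparison.
ref = [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]
print(y, ref, out)
```

The accumulate loop costs no multiplications per output, whereas the gate costs one multiplication per element, which is why "MatMul-free" does not imply "Mult-free".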