Thursday, April 16, 2026

getting-into-ai-infra

https://timzaman.com/getting-into-ai-infra

https://x.com/cosminnegruseri/status/2012041539137257544 - Making LLMs practical

https://x.com/reinerpope/status/2044525525646119419 - MatX, chip design, and where silicon designed for LLMs is headed

https://x.com/FabianGloeckle/status/2044791592247066702 - Formally verified code

https://aleximas.substack.com/p/how-will-ai-driven-automation-actually

Chipstack uses https://wavedrom.com/images/SNUG2016_WaveDrom.pdf


https://cheekypint.substack.com/p/reiner-pope-of-matx-on-accelerating

(8:17) Tightly coupling SRAM and HBM on one chip
(14:03) More MoE FLOPS, smaller KV cache load
(16:08) Numerics: from 32-bit to 4-bit
(19:02) Targeting both training and inference
(22:14) Chip timelines
(27:15) Logic and memory scarcity
(29:42) Compute costs
(32:07) Latency: from 20ms to 1ms as the new table stakes
(40:50) Programming the chip
(43:00) Starting MatX
(47:11) Codesign without seeing the models
(51:57) Interconnect design
(55:44) Performance modeling philosophy
(1:07:02) Prefill vs. decode
(1:13:47) What's next

https://pdf.isaak.net/thesis - Scaling Brain emulation


https://x.com/danielhanchen/status/1931468866279932208 - FP8 in H100

https://x.com/amanrsanger/status/1668144627004903424 - probably needs an update

https://x.com/dwarkesh_sp/status/2032493847666659780 - Space GPUs

https://www.cs.cmu.edu/~213/schedule.html - Introduction to Computer Systems