d2l.ai is often read as a machine-learning textbook. But for someone interested in hardware, it’s more useful to treat it as a workload specification.
Below are three ways to use it effectively.
1. d2l Chapters → Hardware Stress Points
Here’s a concrete mapping from d2l topics to what they stress in hardware.
MLPs
- d2l focus: linear layers, activations
- Hardware reality:
  - Dominated by GEMMs
  - Compute-heavy, easy to saturate GPUs
  - Memory reuse is excellent
- Hardware lesson: why dense matmul is the ideal accelerator workload (sketch below)
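A minimal sketch of the point (PyTorch assumed; the shapes are illustrative, not taken from d2l): a single linear layer is one GEMM, and its FLOP count dwarfs the data it touches, which is exactly what accelerators want.

```python
import torch

# One MLP layer as a GEMM: for batch B, in-features K, out-features N,
# the matmul does ~2*B*K*N FLOPs while touching only B*K + K*N + B*N values.
B, K, N = 4096, 4096, 4096
x = torch.randn(B, K)   # activations
w = torch.randn(K, N)   # weights

y = x @ w  # the GEMM that dominates MLP cost

flops = 2 * B * K * N
bytes_moved = 4 * (B * K + K * N + B * N)  # fp32
print(f"arithmetic intensity ~ {flops / bytes_moved:.0f} FLOPs/byte")
```

At roughly 680 FLOPs per byte here, the kernel is solidly compute-bound: every operand fetched from memory is reused thousands of times.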
CNNs
- d2l focus: convolutions, pooling
- Hardware reality:
  - Convs lower to matmul or stencils
  - Spatial reuse is critical
  - Sensitive to memory layout
- Hardware lesson: why tiling and data locality matter (sketch below)
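A sketch of what "lowering" means in practice (PyTorch assumed; shapes are illustrative): `F.unfold` performs im2col, rearranging input patches into columns so the convolution becomes a single GEMM, and the result can be checked against the library convolution.

```python
import torch
import torch.nn.functional as F

# Lower a 3x3 convolution to a matmul via im2col (unfold).
N, C, H, W = 1, 16, 32, 32      # batch, channels, height, width
O, kh, kw = 32, 3, 3            # output channels, kernel size

x = torch.randn(N, C, H, W)
weight = torch.randn(O, C, kh, kw)

cols = F.unfold(x, (kh, kw), padding=1)   # (N, C*kh*kw, H*W) patch matrix
out = weight.view(O, -1) @ cols           # the GEMM
out = out.view(N, O, H, W)

ref = F.conv2d(x, weight, padding=1)      # reference convolution
print(torch.allclose(out, ref, atol=1e-4))
```

The cost of im2col is duplicated data (each pixel lands in up to nine columns), which is why layout and tiling decide whether the GEMM sees cache hits or DRAM traffic.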
RNNs / LSTMs
- d2l focus: recurrence, sequence modeling
- Hardware reality:
  - Sequential dependency kills parallelism
  - Poor GPU utilization
- Hardware lesson: why RNNs were a dead end for scaling (sketch below)
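A sketch of the serialization problem (PyTorch assumed; a bare tanh recurrence rather than d2l's full LSTM): every step needs the previous hidden state, so the time loop cannot be parallelized no matter how many cores are available.

```python
import torch

T, B, H = 128, 32, 512            # time steps, batch, hidden size
x = torch.randn(T, B, H)
W_x = torch.randn(H, H)
W_h = torch.randn(H, H)
h = torch.zeros(B, H)

for t in range(T):                # strictly sequential: h_t needs h_{t-1}
    h = torch.tanh(x[t] @ W_x + h @ W_h)
```

Each iteration is a small (B, H) x (H, H) GEMM, far too little work to fill a modern GPU, and the dependency chain forbids batching the steps together. A transformer processes all T positions in one large matmul instead.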
Transformers
- d2l focus: attention, feed-forward blocks
- Hardware reality:
  - Attention = memory-bound matmul + reductions
  - Training is parallel; inference is sequential
- Hardware lesson: why memory bandwidth dominates modern accelerators (sketch below)
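A sketch of single-head scaled dot-product attention (PyTorch assumed; sizes are illustrative): two matmuls with a softmax reduction between them, where the intermediate (n, n) score matrix is the bandwidth problem.

```python
import torch

n, d = 2048, 64                          # sequence length, head dimension
q, k, v = (torch.randn(n, d) for _ in range(3))

scores = (q @ k.T) / d**0.5              # (n, n): grows quadratically with n
weights = torch.softmax(scores, dim=-1)  # reduction over each row
out = weights @ v                        # second matmul, back to (n, d)
```

Because d is small, each matmul does little compute per byte of the (n, n) intermediate it reads or writes; fused kernels in the FlashAttention style avoid materializing that matrix by streaming tiles through on-chip SRAM.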
Optimization (SGD, Adam)
- d2l focus: update rules
- Hardware reality:
  - Small kernels, lots of memory traffic
  - Often fused to reduce overhead
- Hardware lesson: why optimizer design affects kernel fusion and bandwidth (sketch below)
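A sketch of why updates are bandwidth-bound (PyTorch assumed; a momentum-SGD step rather than full Adam): each update line below is an element-wise pass over every parameter, so memory traffic rather than arithmetic sets the cost.

```python
import torch

p = torch.randn(10_000_000)       # a 10M-parameter tensor
g = torch.randn_like(p)           # its gradient
m = torch.zeros_like(p)           # momentum buffer
lr, beta = 1e-3, 0.9

# Unfused: each line launches separate kernels that re-read the
# tensors from memory, doing only a few FLOPs per element each time.
m = beta * m + (1 - beta) * g
p = p - lr * m
```

Fusing these passes into one kernel (PyTorch exposes this as, e.g., `torch.optim.Adam(..., fused=True)`) cuts the number of trips over the parameters, which is the whole game for optimizer performance.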
2. Which d2l Sections Matter Most for Hardware
If you’re reading selectively:
⭐ High value for hardware understanding
- Linear layers & MLPs
- CNNs (especially convolution lowering)
- Attention & Transformers
- GPU usage chapter (use-gpu.html)
⚠️ Lower priority (hardware-agnostic)
- Statistical learning theory
- Classical ML (k-NN, Naive Bayes)
- High-level ML math not tied to execution
The goal is not mastering ML theory — it’s understanding what patterns hardware must execute efficiently.
3. How to Read ML Textbooks as a Hardware Person
This mindset shift is the key.
When reading d2l.ai, ask:
- Where are the matmuls?
- How big are the tensors?
- Is this compute-bound or memory-bound?
- Can kernels be fused?
- Is execution parallel or sequential?
Example:
“Attention has O(n²) complexity”
becomes
“This will thrash memory and stress bandwidth unless carefully tiled.”
That’s hardware literacy.
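To make that concrete, a quick back-of-envelope calculation (illustrative numbers, not from d2l: fp16, one attention head):

```python
n = 8192                      # sequence length
score_bytes = 2 * n * n       # fp16 (n, n) score matrix
print(f"{score_bytes / 1e6:.0f} MB per head per layer")  # ~134 MB
```

At over a hundred megabytes per head, the naive score matrix cannot live in on-chip SRAM (tens of MB on current GPUs), so untiled attention is forced to stream it through HBM on every pass.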
Why Patterson Ch. 7 and d2l Complement Each Other
- Patterson Ch. 7 teaches machines
- d2l.ai teaches workloads
Neither is sufficient alone.
Together, they let you:
- predict bottlenecks
- understand why TPUs look the way they do
- see why scaling succeeds or fails
Dive into Deep Learning (d2l.ai) is a useful companion for understanding the structure of modern ML workloads. For hardware-minded readers, it’s most valuable once accelerator fundamentals are clear, as model descriptions can then be interpreted in terms of matrix multiplication, memory movement, and kernel behavior rather than abstract math.
One-Sentence Takeaway
Hardware explains how fast you can go; d2l explains what you’re trying to run.
Once you understand both, everything else becomes legible.