Sunday, December 28, 2025

quantbeckman Golden Python rules

1. No Python loops in the critical path

Instead of using for, while, or comprehensions, rely on vectorized operations.

Bad:

import numpy as np
x = np.random.rand(1000000)
y = np.zeros_like(x)
for i in range(len(x)):
    y[i] = x[i] ** 2 + 3

Good:

y = x**2 + 3 # fully vectorized

✅ Python never iterates manually; the work is done in compiled NumPy code.
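
A rough way to check this on your own machine is to time both versions. The following is a minimal sketch, assuming NumPy is installed; the function names are hypothetical and the array is smaller than the 1,000,000 elements used above so the loop timing stays short. Timings depend on hardware, so none are quoted here.

```python
import timeit

import numpy as np


def square_plus_three_loop(x):
    """Element-by-element version: every iteration runs in the interpreter."""
    y = np.zeros_like(x)
    for i in range(len(x)):
        y[i] = x[i] ** 2 + 3
    return y


def square_plus_three_vec(x):
    """Vectorized version: the loop runs inside NumPy's compiled code."""
    return x**2 + 3


rng = np.random.default_rng(0)
x = rng.random(100_000)

# Both versions must agree before the timings mean anything.
assert np.allclose(square_plus_three_loop(x), square_plus_three_vec(x))

t_loop = timeit.timeit(lambda: square_plus_three_loop(x), number=3)
t_vec = timeit.timeit(lambda: square_plus_three_vec(x), number=3)
print(f"loop: {t_loop:.4f} s   vectorized: {t_vec:.4f} s")
```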


2. No Python objects in the critical path

Use typed arrays (numpy.ndarray, array.array, numba typed memory) instead of lists/dicts/tuples.

Bad:

lst = [i*i for i in range(1000000)]

Good:

import numpy as np
arr = np.arange(1000000, dtype=np.float64)
arr = arr**2

Here the array stores raw float64 values in one contiguous buffer rather than a separate Python object per element.
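
To make the difference concrete, the sketch below compares the memory used by a list of floats with that of a float64 array. It assumes CPython (where `sys.getsizeof` reports per-object overhead) and uses a smaller `n` than the example above; exact byte counts vary by interpreter version, so none are quoted.

```python
import sys

import numpy as np

n = 100_000
lst = [float(i) for i in range(n)]    # list of pointers to boxed Python floats
arr = np.arange(n, dtype=np.float64)  # one contiguous buffer of raw 8-byte floats

# getsizeof(lst) counts only the pointer table; each float is its own object.
list_bytes = sys.getsizeof(lst) + sum(sys.getsizeof(v) for v in lst)
array_bytes = arr.nbytes              # payload only: n * 8 bytes

print(f"list:  {list_bytes} bytes")
print(f"array: {array_bytes} bytes")
```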


3. Avoid repeated calls

Function calls in Python are expensive. Fuse operations when possible.

Bad:

y = np.sqrt(np.abs(x))

np.abs creates a new array, then np.sqrt creates another.

Better:

y = np.abs(x) # one new array
np.sqrt(y, out=y) # square root written in place, no second array

Or in Numba you can do:

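import numpy as np
from numba import njit
@njit
def f(x): # assumes a 1-D array
    out = np.empty_like(x)
    for i in range(x.size):
        out[i] = np.sqrt(np.abs(x[i])) # one compiled pass, no temporaries
    return out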

4. No allocations in the hot path

Preallocate arrays instead of creating new ones repeatedly.

Bad:

for i in range(1000):
    tmp = np.zeros(1000) # repeated allocation
    tmp += i

Good:

tmp = np.zeros(1000)
for i in range(1000):
    tmp[:] = i # reuse buffer
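
The same idea applies to arithmetic: NumPy ufuncs accept an `out=` argument, so results can be written into a buffer allocated once. This is a minimal sketch, assuming NumPy is installed; the arrays `a`, `b`, and `buf` and the running `total` are hypothetical names chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random(1000)
b = rng.random(1000)
buf = np.empty_like(a)  # allocated once, outside the loop
total = 0.0

for i in range(1000):
    np.multiply(a, b, out=buf)  # a * b written into buf, no new array
    np.add(buf, i, out=buf)     # buf + i, again in place
    total += buf[0]

print(total)
```

Writing `a * b + i` inside the loop would produce the same numbers but would allocate two temporary arrays on every iteration.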

5. Avoid branches (if/else)

A per-element if/else forces a Python loop. Express the condition as a mask or a lookup table so it is evaluated for the whole array at once.

Bad:

y = np.zeros_like(x)
for i in range(len(x)):
    if x[i] > 0:
        y[i] = x[i]
    else:
        y[i] = 0

Good:

y = np.where(x > 0, x, 0)

Or with boolean masking:

y = x.copy()
y[x < 0] = 0
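
The rule also mentions lookup tables. One way to sketch that, assuming the branch selects among a small set of integer codes, is to index a table array with the codes. The names `codes` and `weights_table` and their values are hypothetical.

```python
import numpy as np

# Hypothetical data: integer category codes 0..3, each mapped to a weight.
codes = np.array([0, 3, 1, 1, 2, 0], dtype=np.int64)
weights_table = np.array([0.0, 0.5, 1.0, 2.0])  # index = code, value = weight

# One indexing call replaces an if/elif chain evaluated per element.
weights = weights_table[codes]
print(weights)
```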

6. Avoid conversions

Converting between dtypes, or between arrays and Python objects, copies the data. Keep the dtype consistent from the start.

Bad:

arr = np.array([1,2,3], dtype=np.int32)
arr_float = arr.astype(np.float64) # expensive

Good:

arr = np.array([1,2,3], dtype=np.float64)

Or preallocate with the correct type to begin with.
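
Conversions can also happen implicitly when operands have different dtypes. The sketch below, assuming NumPy is installed, shows an implicit upcast and the `copy=False` option of `astype`, which skips the copy when the array already has the requested dtype. The variable names are hypothetical.

```python
import numpy as np

a32 = np.ones(4, dtype=np.float32)
b64 = np.ones(4, dtype=np.float64)

mixed = a32 + b64  # a32 is upcast to float64 before the addition
same = a32 + a32   # no conversion: inputs and output are all float32
print(mixed.dtype, same.dtype)

# With copy=False, astype returns the input when no conversion is needed.
already_ok = b64.astype(np.float64, copy=False)
print(np.shares_memory(already_ok, b64))
```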


7. Avoid I/O or logging

Never print, log, or read/write files inside the hot path.

Bad:

for i in range(1000000):
    print(i) # kills performance

Good:

result = x**2 + 3 # compute first
# log or save only once outside hot loop
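
When a loop is unavoidable and you still want diagnostics, one option is to record numbers into a preallocated array and emit a single log message afterwards. This is a minimal sketch under that assumption; the logger name, step count, and array sizes are hypothetical, and the loop body is only a stand-in for real work.

```python
import logging

import numpy as np

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("hot_path_demo")  # hypothetical logger name

rng = np.random.default_rng(0)
n_steps = 1000
data = rng.random((n_steps, 256))  # all inputs generated up front
step_means = np.empty(n_steps)     # diagnostics buffer, allocated once

for step in range(n_steps):
    x = data[step]                        # a view, not a copy
    step_means[step] = (x**2 + 3).mean()  # record a number, do not print it

# One log call after the loop instead of n_steps calls inside it.
log.info(
    "steps=%d mean=%.4f min=%.4f max=%.4f",
    n_steps, step_means.mean(), step_means.min(), step_means.max(),
)
```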
