Jan 1, 2026
Spectral Scaling
Senior thesis with Prof. Elad Hazan on scaling laws for large language models.

The project studied Transformer, Spectral Transform Unit, and Mamba2-style sequence mixer architectures, with a focus on how architecture affects scaling and long-memory behavior.
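For readers unfamiliar with the term, a scaling law is often summarized as a power law relating loss to model size, and architectures can be compared by their fitted exponents. The sketch below is only an illustration under that assumption: the function name `fit_power_law`, the functional form `loss = a * size**(-alpha)`, and the data points are all hypothetical and synthetic, not the thesis's method or results.

```python
import math


def fit_power_law(sizes, losses):
    """Fit loss ~ a * size**(-alpha) by ordinary least squares in log-log space.

    Hypothetical helper for illustration; returns the pair (a, alpha).
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(l) for l in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope and intercept of the best-fit line log(loss) = intercept + slope * log(size).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic points generated from a known power law (made-up numbers, not measurements).
sizes = [1e6, 1e7, 1e8, 1e9]
losses = [50.0 * s ** -0.1 for s in sizes]

a, alpha = fit_power_law(sizes, losses)
print(f"a = {a:.3f}, alpha = {alpha:.3f}")
```

Under this form, a larger fitted `alpha` means loss falls faster as the model grows, which is one simple way to compare how different sequence mixers scale.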