My Projects
Exploring technology and building solutions at the intersection of research and engineering
Sky-Light
UC Berkeley • Nov 2025
PyTorch
CUDA
Sparse Attention
LLMs
A framework for sparse attention research on large language models. Sky-Light unifies the implementation, evaluation, and optimization of sparse attention methods, so that research ideas can move into production inference.
- Unified, extensible codebase for sparse attention methods
- Standardized evaluation pipeline with public leaderboards
- Two-tier system: algorithmic discovery (PyTorch) → system optimization (kernel level)
- Targets the quadratic cost of dense attention in sequence length
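To make the scaling point concrete, here is a minimal sketch in C that counts how many query-key pairs get scored under a dense causal mask versus a sliding-window causal mask. The sliding window is used only as one well-known example of a sparse pattern; it is an assumption for illustration and not necessarily a method Sky-Light implements. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Dense causal attention: query i scores keys 0..i, so n(n+1)/2 pairs total. */
size_t dense_causal_pairs(size_t n)
{
    return n * (n + 1) / 2;
}

/* Hypothetical sliding-window causal mask: query i may attend to key j only
 * if j is not in the future and lies within the last `window` positions. */
int window_mask(size_t i, size_t j, size_t window)
{
    return j <= i && i - j < window;
}

/* Count the pairs the windowed mask keeps. For n much larger than `window`
 * this grows roughly as n * window instead of n * n / 2. */
size_t window_causal_pairs(size_t n, size_t window)
{
    assert(window > 0);
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        /* Only keys in [lo, i] can pass the mask, so skip the rest. */
        size_t lo = (i + 1 > window) ? i + 1 - window : 0;
        for (size_t j = lo; j <= i; ++j)
            count += (size_t)window_mask(i, j, window);
    }
    return count;
}
```

With a fixed window, doubling the sequence length roughly doubles the windowed count, while the dense count roughly quadruples.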
Published research with ongoing development
fusedNeural
From-Scratch Neural Network in C
C
Numerics
Performance
Quantization
A neural network written from scratch in C, with an emphasis on performance in resource-constrained environments. It implements forward and backward passes, kernel fusion for better cache locality, and post-training quantization.
- Forward pass and backpropagation with MSE loss
- Kernel fusion to reduce memory traffic
- Post-training quantization of tensors and operations
- Modular components: tensors, ops, quantized ops, loss functions
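The kernel fusion and post-training quantization ideas listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not fusedNeural's actual code: the function names, the row-major weight layout, the choice of ReLU as the fused activation, and the symmetric per-tensor int8 scheme are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fused kernel: y = relu(W x + b) in a single pass.
 * W is row-major with shape [out_dim][in_dim]. Each output is accumulated in
 * a local variable and activated before it is stored, so no intermediate
 * pre-activation buffer is written to memory and read back. */
void dense_bias_relu_fused(const float *W, const float *b, const float *x,
                           float *y, size_t out_dim, size_t in_dim)
{
    assert(W && b && x && y);
    for (size_t i = 0; i < out_dim; ++i) {
        float acc = b[i];
        for (size_t j = 0; j < in_dim; ++j)
            acc += W[i * in_dim + j] * x[j];
        y[i] = acc > 0.0f ? acc : 0.0f;
    }
}

/* Hypothetical symmetric per-tensor int8 quantization:
 * scale = max|v| / 127 and q = round(v / scale), clamped to [-127, 127].
 * Returns the scale, which is needed to dequantize. */
float quantize_int8(const float *v, int8_t *q, size_t n)
{
    assert(v && q);
    float max_abs = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float a = v[i] < 0.0f ? -v[i] : v[i];
        if (a > max_abs)
            max_abs = a;
    }
    /* An all-zero tensor gets scale 1 to avoid dividing by zero. */
    float scale = max_abs > 0.0f ? max_abs / 127.0f : 1.0f;
    for (size_t i = 0; i < n; ++i) {
        float t = v[i] / scale;
        /* Round half away from zero without needing <math.h>. */
        int r = (int)(t >= 0.0f ? t + 0.5f : t - 0.5f);
        if (r > 127)
            r = 127;
        if (r < -127)
            r = -127;
        q[i] = (int8_t)r;
    }
    return scale;
}

/* Map a quantized value back to an approximate float. */
float dequantize_int8(int8_t q, float scale)
{
    return (float)q * scale;
}
```

An unfused version would write `W x + b` to a temporary buffer and then run a second loop for the activation; the fused form trades that extra memory round trip for a slightly larger inner kernel. For the quantizer, the round-trip error per element is at most about half the returned scale.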
Core primitives complete; exploring multi-layer extensions