Experience
2023
Machine Learning Engineer
Aimino
Implemented computer vision solutions robust against distribution shifts.
2022
Research Assistant
Explainable Machine Learning
Explored catastrophic forgetting in foundation models.
I co-authored "Momentum-based Weight Interpolation of Zero-Shot Models for Continual Learning", which won the "Best Paper Award" at INTERPOLATE @ NeurIPS and has been cited by publications affiliated with Google DeepMind [1][2][3][4][5] and Meta AI [6][7].
Open-source
I am a passionate proponent of open source and have contributed to several libraries:
| EleutherAI/lm-evaluation-harness | #1928, #1897, #1893, #1863, #1916, #1878, #1865 |
| huggingface/transformers | #30722, #30678, #30395 |
| aws-neuron/aws-neuron-sdk | #1019 |
| ContinualAI/avalanche | #896 |
Projects
Sometimes, I build stuff:
- Word Game Bench – evaluating language models on puzzles. OpenRouter sponsored the project, and Mark Chen (SVP of Research at OpenAI) expressed interest in the results.
- Answers to Chip Huyen's ML Interview Questions – a booklet answering interview questions on math, computer science, ML workflows, and algorithms.
- Laser Hockey – my winning entry to a Reinforcement Learning tournament with 70 participants, organized by the Max Planck Institute for Intelligent Systems.
Blog
Other times, I write stuff:
- Speeding up decoder inference with a Key-Value (KV) cache
- Boosting transformer efficiency with Grouped-Query Attention (GQA)
- Stabilizing training and improving model convergence with RMSNorm
- Activating neurons with Gated Linear Units (GLU) and Friends
- Encoding positional information with Rotary Position Embeddings (RoPE)
- Decoupling weight decay with AdamW
- Overcoming catastrophic forgetting with Elastic Weight Consolidation (EWC)