This event has passed.
Pushing the Frontier of (Small) Language Models
Abstract:
In this talk, I will explore key research contributions in efficient deep learning, with a focus on training smaller yet highly capable language models. I will discuss approaches such as curating high-quality datasets and designing effective training curricula. The talk will cover different stages of training, including pre-training, mid-training, and agentic reasoning, and will highlight techniques for pushing the boundary of performance via transfer from larger models and/or a more powerful union of models. I will conclude by outlining promising future research directions aligned with these ideas.
Bio:
Mojan Javaheripi leads the mid-training and synthetic data pillar at Reflection AI. Prior to joining Reflection, she was a principal researcher and technical advisor to the CTO at Microsoft, as well as a resident researcher at OpenAI. Her research enhances open-source LLMs through new data sources, training regimens, and model architectures. She received her PhD from the University of California San Diego, where her dissertation focused on efficient deep learning training and inference, adversarial robustness, and privacy-preserving deep learning.
Host: Dongkuan Xu
Note: This seminar is virtual. Zoom instructions ⤵️
Zoom URL: https://ncsu.zoom.us/j/95437928324?pwd=wzLmLrAfDsqW3dl13iD8i90Fa5biEV.1&jst=2
Zoom Meeting ID: 954 3792 8324
Zoom Passcode: 604198