Accepted abstracts

2025

  1. Alignment as Distribution Learning: Your Preference Model is Explicitly a Language Model
    Jihun Yun, Juno Kim, Jongho Park, Junhyuck Kim, Jongha Jon Ryu, Jaewoong Cho, and Kwang-Sung Jun
  2. Second-Order Bounds for [0,1]-Valued Regression via Betting Loss
    Yinan Li, Ethan Huang, Sungjoon Yoon, and Kwang-Sung Jun
  3. When Can Proxies Improve the Sample Complexity of Preference Learning?
    Yuchen Zhu, Daniel Souza, Zhengyan Shi, Mengyue Yang, Pasquale Minervini, Matt J. Kusner, and Alexander Nicholas D’Amour
  4. The Importance of Online Data: Understanding Preference Fine-tuning via Coverage
    Yuda Song, Gokul Swamy, Aarti Singh, Drew Bagnell, and Wen Sun
  5. ROC-Climbing: Test-time scaling with imperfect verifiers
    Florian E. Dorner, Yatong Chen, André F. Cruz, and Fanny Yang
  6. All Roads Lead to Likelihood: The Value of Reinforcement Learning in Fine-Tuning
    Gokul Swamy, Sanjiban Choudhury, Wen Sun, Steven Wu, and Drew Bagnell
  7. Parameter Efficient Model Merging
    Margalit Glasgow, Alexander Rakhlin, Sasha Voitovych, and Fan Chen
  8. Quantitative Bounds for Length Generalization in Transformers
    Zachary Izzo, Eshaan Nichani, and Jason D. Lee
  9. Accelerating Nash Learning from Human Feedback via Mirror Prox
    Daniil Tiapkin, Daniele Calandriello, Denis Belomestny, Eric Moulines, Alexey Naumov, Kashif Rasul, Michal Valko, and Pierre Ménard
  10. Low-rank fine-tuning lies between lazy training and feature learning
    Arif Kerem Dayi and Sitan Chen
  11. SharedRep-RLHF: A Shared Representation Approach to RLHF with Diverse Preferences
    Arpan Mukherjee, Marcello Bullo, and Deniz Gündüz
  12. Certifiably Safe Post-Training
    Pierre Fasterling, Leo Elmecker-Plakolm, Philip Sosnin, Calvin Tsay, and Matthew Robert Wicker
  13. Weak-to-Strong Generalization Even in Random Feature Networks, Provably
    Marko Medvedev, Kaifeng Lyu, Dingli Yu, Sanjeev Arora, Zhiyuan Li, and Nathan Srebro
  14. PILAF: Optimal Human Preference Sampling for Reward Modeling
    Yunzhen Feng, Ariel Kwiatkowski, Kunhao Zheng, Yaqi Duan, and Julia Kempe
  15. Imitation Learning and Supervised Fine Tuning with Deterministic vs Stochastic Policies and Generators
    Nirmit Joshi, Gene Li, Gal Vardi, and Nathan Srebro