@article{paper1,title={Alignment as Distribution Learning: Your Preference Model is Explicitly a Language Model},author={Yun, Jihun and Kim, Juno and Park, Jongho and Kim, Junhyuck and Jon Ryu, Jongha and Cho, Jaewoong and Jun, Kwang-Sung},year={2025}}
Second-Order Bounds for [0,1]-Valued Regression via Betting Loss
Yinan Li, Ethan Huang, Sungjoon Yoon, and Kwang-Sung Jun
@article{paper2,title={Second-Order Bounds for [0,1]-Valued Regression via Betting Loss},author={Li, Yinan and Huang, Ethan and Yoon, Sungjoon and Jun, Kwang-Sung},year={2025}}
When Can Proxies Improve the Sample Complexity of Preference Learning?
Yuchen Zhu, Daniel Augusto de Souza, Zhengyan Shi, Mengyue Yang, Pasquale Minervini, Matt J. Kusner, and Alexander Nicholas D'Amour
@article{paper4,title={When Can Proxies Improve the Sample Complexity of Preference Learning?},author={Zhu, Yuchen and Augusto de Souza, Daniel and Shi, Zhengyan and Yang, Mengyue and Minervini, Pasquale and Kusner, Matt J. and D'Amour, Alexander Nicholas},year={2025}}
The Importance of Online Data: Understanding Preference Fine-tuning via Coverage
Yuda Song, Gokul Swamy, Aarti Singh, Drew Bagnell, and Wen Sun
@article{paper6,title={The Importance of Online Data: Understanding Preference Fine-tuning via Coverage},author={Song, Yuda and Swamy, Gokul and Singh, Aarti and Bagnell, Drew and Sun, Wen},year={2025}}
ROC-Climbing: Test-time scaling with imperfect verifiers
Florian E. Dorner, Yatong Chen, Andre F Cruz, and Fanny Yang
@article{paper8,title={ROC-Climbing: Test-time scaling with imperfect verifiers},author={Dorner, Florian E. and Chen, Yatong and Cruz, Andre F and Yang, Fanny},year={2025}}
All Roads Lead to Likelihood: The Value of Reinforcement Learning in Fine-Tuning
Gokul Swamy, Sanjiban Choudhury, Wen Sun, Steven Wu, and Drew Bagnell
@article{paper9,title={All Roads Lead to Likelihood: The Value of Reinforcement Learning in Fine-Tuning},author={Swamy, Gokul and Choudhury, Sanjiban and Sun, Wen and Wu, Steven and Bagnell, Drew},year={2025}}
Parameter Efficient Model Merging
Margalit Glasgow, Alexander Rakhlin, Sasha Voitovych, and Fan Chen
@article{paper10,title={Parameter Efficient Model Merging},author={Glasgow, Margalit and Rakhlin, Alexander and Voitovych, Sasha and Chen, Fan},year={2025}}
Quantitative Bounds for Length Generalization in Transformers
Zachary Izzo, Eshaan Nichani, and Jason D. Lee
@article{paper11,title={Quantitative Bounds for Length Generalization in Transformers},author={Izzo, Zachary and Nichani, Eshaan and Lee, Jason D.},year={2025}}
Accelerating Nash Learning from Human Feedback via Mirror Prox
Daniil Tiapkin, Daniele Calandriello, Denis Belomestny, Eric Moulines, Alexey Naumov, Kashif Rasul, Michal Valko, and Pierre Menard
@article{paper12,title={Accelerating Nash Learning from Human Feedback via Mirror Prox},author={Tiapkin, Daniil and Calandriello, Daniele and Belomestny, Denis and Moulines, Eric and Naumov, Alexey and Rasul, Kashif and Valko, Michal and Menard, Pierre},year={2025}}
Low-rank fine-tuning lies between lazy training and feature learning
SharedRep-RLHF: A Shared Representation Approach to RLHF with Diverse Preferences
Arpan Mukherjee, Marcello Bullo, and Deniz Gunduz
@article{paper14,title={SharedRep-RLHF: A Shared Representation Approach to RLHF with Diverse Preferences},author={Mukherjee, Arpan and Bullo, Marcello and Gunduz, Deniz},year={2025}}
Certifiably Safe Post-Training
Pierre Fasterling, Leo Elmecker-Plakolm, Philip Sosnin, Calvin Tsay, and Matthew Robert Wicker
@article{paper15,title={Certifiably Safe Post-Training},author={Fasterling, Pierre and Elmecker-Plakolm, Leo and Sosnin, Philip and Tsay, Calvin and Wicker, Matthew Robert},year={2025}}
Weak-to-Strong Generalization Even in Random Feature Networks, Provably
Marko Medvedev, Kaifeng Lyu, Dingli Yu, Sanjeev Arora, Zhiyuan Li, and Nathan Srebro
@article{paper16,title={Weak-to-Strong Generalization Even in Random Feature Networks, Provably},author={Medvedev, Marko and Lyu, Kaifeng and Yu, Dingli and Arora, Sanjeev and Li, Zhiyuan and Srebro, Nathan},year={2025}}
PILAF: Optimal Human Preference Sampling for Reward Modeling
Yunzhen Feng, Ariel Kwiatkowski, Kunhao Zheng, Yaqi Duan, and Julia Kempe
@article{paper17,title={PILAF: Optimal Human Preference Sampling for Reward Modeling},author={Feng, Yunzhen and Kwiatkowski, Ariel and Zheng, Kunhao and Duan, Yaqi and Kempe, Julia},year={2025}}
Imitation Learning and Supervised Fine Tuning with Deterministic vs Stochastic Policies and Generators
Nirmit Joshi, Gene Li, Gal Vardi, and Nathan Srebro
@article{paper18,title={Imitation Learning and Supervised Fine Tuning with Deterministic vs Stochastic Policies and Generators},author={Joshi, Nirmit and Li, Gene and Vardi, Gal and Srebro, Nathan},year={2025}}