DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm

Published in arXiv Preprint, 2024

Recommended citation: Yaswanth, N., et al. (2024). "DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm." arXiv preprint arXiv:2501.03271. https://arxiv.org/abs/2501.03271

This paper introduces DPO Kernels, an approach to direct preference optimization (DPO) that combines semantic awareness, kernel-based enhancements, and a rich set of divergence measures to improve the alignment of language models with human preferences.

Paper link: https://arxiv.org/abs/2501.03271