Exploring the Optimization of RLHF and its Variants in Aligning Large Models with Human Preferences
ITM Web Conf., 78 (2025) 01038
Published online: 08 September 2025
DOI: 10.1051/itmconf/20257801038

