Proximalized Preference Optimization for Diverse Feedback Types: A Decomposed Perspective on DPO