We present Perception-R1, a scalable RL framework using Group Relative Policy Optimization (GRPO) during MLLM post-training. Key innovations: 🎯 Perceptual Perplexity Analysis: We introduce a novel ...
REC-R1 is a general framework that bridges generative large language models (LLMs) and recommendation systems via reinforcement learning. Check the paper here.
Some results have been hidden because they may be inaccessible to you
Show inaccessible results