Abstract: Self-Play Fine-Tuning (SPIN) has attracted significant attention in recent years, as it enables large language models (LLMs) to iteratively improve their performance through simulated ...