OPSD

Self-Distilled Reasoner: On-Policy Self-Distillation for Large Language Models

69mannying/opsd-d03af96d