Reproduction of RLCSD (arXiv:2606.11709): RL contrastive self-distillation with verl/vLLM, using Qwen3-1.7B on DeepMath.