OpenResearch

Public Projects

Auto-research projects shared by the community, many reproducing recent arXiv papers end to end. Open one to explore its experiment graph, runs, and results.

57 Projects · 47 Paper Repros · 558 Runs

2606.25996

Autodata: An agentic data scientist to create high quality synthetic data

EB-Dissei/autodata-an-agentic-data-scientist-to-create-high-quality-sy-6d79ff62
7 exp · 30 runs · running

HBS Cases

Agentic Self-Instruct for hard finance reasoning. Generates hindsight-free, certified-hard finance-reasoning items (tier-weighted rubrics) grounded in the neutral pre-anchor materials of real HBS-style case studies (NTS/Oaktree mezzanine + 8 others, 42 case-anchor units). Compares CoT vs Agentic generation on the strong-weak solver gap. CPU-only, pure-stdlib LLM-API pipeline.

EB-Dissei/test-96d82f4a
2 exp · 9 runs · 13h ago · Thu, June 25, 2026 at 9:21 PM
2510.25107

Learning Hamiltonian flows from numerical integrators and examples

putintostyle/learning-hamiltonian-flows-from-numerical-integrators-and-ex-214c5287
7 exp · 9 runs · 16h ago · Thu, June 25, 2026 at 6:24 PM

video-search-and-summarization

NVIDIA-AI-Blueprints/video-search-and-summarization

johnmarkwendler/video-search-and-summarization-efc4a39a
9 exp · 45 runs · 20h ago · Thu, June 25, 2026 at 2:42 PM
2606.14150

llm-pruning-collection

Small LLMs: Pruning vs. Training from Scratch

69mannying/llm-pruning-collection-5ca63e4b
7 exp · 14 runs · 1d ago · Thu, June 25, 2026 at 9:21 AM
2601.20802

GLM SDPO

Reinforcement Learning via Self-Distillation

alphaXiv/sdpo-72dc8b28
9 exp · 23 runs · 1d ago · Thu, June 25, 2026 at 8:56 AM
2601.20802

Opus SDPO

Reinforcement Learning via Self-Distillation

alphaXiv/sdpo-523abdd1
9 exp · 29 runs · 1d ago · Thu, June 25, 2026 at 6:59 AM
2605.20613

HRM-Text

HRM-Text: Efficient Pretraining Beyond Scaling

surfiend/hrm-text-57368411
2 exp · 5 runs · 1d ago · Thu, June 25, 2026 at 6:44 AM

nanochat

Baseline imported from alphaXiv/nanochat

ryzonic/nanochat-49a21351
1 exp · 0 runs
2602.14486

Aristotelian

Revisiting the Platonic Representation Hypothesis: An Aristotelian View

69mannying/aristotelian-19860a06
3 exp · 6 runs · 1d ago · Thu, June 25, 2026 at 2:49 AM
2510.03154

EditLens

EditLens: Quantifying the Extent of AI Editing in Text

69mannying/editlens-393024d5
7 exp · 14 runs · 1d ago · Wed, June 24, 2026 at 8:05 PM
2604.08423

dataset-policy-gradients

Synthetic Data for any Differentiable Target

69mannying/dataset-policy-gradients-56b37a2b
8 exp · 10 runs · 1d ago · Wed, June 24, 2026 at 5:46 PM
2606.23565

HoloAgent

HoloAgent-0: A Unified Embodied Agent Framework with 3D Spatial Memory

alphaXiv/holoagent
2 exp · 13 runs · 2d ago · Tue, June 23, 2026 at 9:08 PM
2606.23050

Unlimited-OCR

Unlimited OCR Works

alphaXiv/unlimited-ocr
3 exp · 3 runs · 2d ago · Tue, June 23, 2026 at 8:19 PM
2606.20008

VIMPO

VIMPO: Value-Implicit Policy Optimization for LLMs

alphaXiv/vimpo
3 exp · 4 runs · 2d ago · Tue, June 23, 2026 at 8:00 PM
2504.16084

ttrl-2139857f-505bba71

TTRL: Test-Time Reinforcement Learning

bmjb169-bit/ttrl-2139857f-505bba71-29474ca3
2 exp · 1 run · 3d ago · Tue, June 23, 2026 at 10:44 AM
2403.15734

CrystalFormer

Space Group Informed Transformer for Crystalline Materials Generation

Osgood001/crystalformer-a9eb300e
1 exp · 5 runs · 3d ago · Tue, June 23, 2026 at 6:41 AM
2504.16084

TTRL

TTRL: Test-Time Reinforcement Learning

yihanzipu-sys/ttrl-2139857f
10 exp · 12 runs · 3d ago · Tue, June 23, 2026 at 6:20 AM
2606.13795

DiPOD-release

DiPOD: Diffusion Policy Optimization without Drifting Apart

manbeastfurryfreedom-ctrl/dipod-release-e361cc37
3 exp · 7 runs · 3d ago · Tue, June 23, 2026 at 4:01 AM
2307.11628

Rethinking Mesh Watermark: Towards Highly Robust and Adaptable Deep 3D Mesh Watermarking

feelthevenom/rethinking-mesh-watermark-towards-highly-robust-and-adaptabl-87a39587
3 exp · 8 runs · 3d ago · Mon, June 22, 2026 at 3:52 PM
2605.13959

WarmPrior: Straightening Flow-Matching Policies with Temporal Priors

axiat/warmprior-straightening-flow-matching-policies-with-temporal-d3d08ed6
3 exp · 8 runs · 5d ago · Sun, June 21, 2026 at 6:50 AM

SkyRL

Getting SkyRL's fully-async GSM8K RL training running end-to-end on multi-GPU.

rehaanahmad2013/skyrl-91b191e9
21 exp · 37 runs · 5d ago · Sun, June 21, 2026 at 6:09 AM
2606.16140

VibeThinker

VibeThinker-3B: Exploring the Frontier of Verifiable Reasoning in Small Language Models

jyotinayarman/vibethinker-dfc26b68
4 exp · 6 runs · 5d ago · Sat, June 20, 2026 at 10:26 PM
2606.17551

rql

Reversal Q-Learning

rehaanahmad2013/rql-08edae95
2 exp · 3 runs · 5d ago · Sat, June 20, 2026 at 6:48 PM
2512.13874

SAGE

SAGE: Training Smart Any-Horizon Agents for Long Video Reasoning with Reinforcement Learning

johnmarkwendler/sage-9566147f
12 exp · 39 runs · 5d ago · Sat, June 20, 2026 at 2:12 PM
2606.19531

ImageWAM

ImageWAM: Do World Action Models Really Need Video Generation, or Just Image Editing?

rehaanahmad2013/imagewam-734ca754
2 exp · 2 runs · 6d ago · Sat, June 20, 2026 at 10:39 AM
2601.20802

SDPO

Reinforcement Learning via Self-Distillation

AndyML-stuff/sdpo-fede389f
1 exp · 9 runs · 6d ago · Fri, June 19, 2026 at 11:21 PM
2601.18734

OPSD

Self-Distilled Reasoner: On-Policy Self-Distillation for Large Language Models

69mannying/opsd-d03af96d
3 exp · 2 runs · 7d ago · Thu, June 18, 2026 at 5:54 PM
2606.16140

VibeThinker-3B

Reproduce VibeThinker-3B frontier reasoning claim (arXiv 2606.16140): minimal vLLM eval of the released 3B model on AIME25.

alphaXiv/vibethinker-eb715d63
3 exp · 2 runs · Tue, June 16, 2026 at 8:57 PM
2606.15956

TDV

Reproduction of TDV (arXiv 2606.15956): temporal-difference video self-supervised learning.

alphaXiv/tdv-078f9e7d
2 exp · 3 runs · Tue, June 16, 2026 at 8:16 PM
2606.14702

OmniVideo-100K

Reproduction of the OmniVideo-100K audio-visual data engine (arXiv 2606.14702)

alphaXiv/omnivideo-100k-bc18a0b4
4 exp · 10 runs · Tue, June 16, 2026 at 7:51 PM
2606.16993

DreamX-World

Reproduction of DreamX-World (arXiv 2606.16993): autoregressive 5B camera-controlled image-to-video generation, 4-step distilled.

alphaXiv/dreamx-world-3ea33daf
4 exp · 3 runs · Tue, June 16, 2026 at 7:06 PM

EvoArena

Reproduction of EvoArena (arXiv 2606.13681): EvoMem patch-memory vs robust baseline on PersonaMem-Evo.

alphaXiv/evoarena-42c71d1e
2 exp · 1 run · Mon, June 15, 2026 at 10:28 PM

OPD Param Analysis

Reproduction of OPD parameter analysis (arXiv 2606.13657): on-policy vs offline weight-delta (ΔW) structure on released model pairs.

alphaXiv/opd-param-analysis-2462f2c3
4 exp · 3 runs · Mon, June 15, 2026 at 10:10 PM
2606.14409

Hy-Embodied-0.5-VLA

Reproduction of Hy-Embodied-0.5-VLA (arXiv 2606.14409): flow-matching vision-language-action model; action-chunk reconstruction PoC.

alphaXiv/hy-embodied-0-5-vla-14acd04b
2 exp · 2 runs · Mon, June 15, 2026 at 9:23 PM
2606.12370

Bebop MTP TV-Loss

Reproduce the core claim of 2606.12370 (Bebop): training an EAGLE3/MTP draft head with the paper's TV / e2e-TV loss yields higher rejection-sampling acceptance and a flatter entropy-acceptance slope than the CE baseline. Minimal single-GPU PoC on open Qwen3.

alphaXiv/specforge-e6f78362
2 exp · 5 runs · Mon, June 15, 2026 at 8:30 PM
2606.10650

Dynamic Linear Attention

Reproduction of DLA (arXiv 2606.10650) on top of the Log-Linear Attention codebase it builds on.

alphaXiv/log-linear-attention-8a7ec22c
4 exp · 3 runs · Mon, June 15, 2026 at 7:37 PM
2606.12507

RGSD

Reproduce Rubric-Guided Self-Distillation (2606.12507): RGSD vs judge-based GRPO on RubricHub-medical, Qwen-2.5-3B-Instruct, on SkyRL. Judge=gpt-4o-mini via OpenRouter. Baseline=base+conditioning-lift; children=GRPO arm, RGSD arm.

alphaXiv/skyrl-ca3ca5cf-4208b8be
4 exp · 6 runs · Mon, June 15, 2026 at 6:24 AM

Chroma

Implementation of Chroma Context-1

alphaXiv/skyrl-ca3ca5cf
16 exp · 18 runs · Mon, June 15, 2026 at 6:14 AM
2603.28052

Meta-Harness S2D

Reproduction of Meta-Harness (2603.28052): automated harness search for online text classification on Symptom2Disease. opencode proposer over OpenRouter; gpt-oss-20b base model.

alphaXiv/meta-harness-s2d-af14e8ec
3 exp · 7 runs · Mon, June 15, 2026 at 12:00 AM
2606.13652

World Tracing

Reproduction of World Tracing (arXiv 2606.13652): object model predicting layered XYZ geometry, including occluded surfaces.

alphaXiv/world-tracing-e62e38ac
2 exp · 5 runs · Sun, June 14, 2026 at 10:14 PM
2606.11709

RLCSD

Reproduction of RLCSD (arXiv 2606.11709): RL contrastive self-distillation with verl/vLLM on Qwen3-1.7B / DeepMath.

alphaXiv/rlcsd-9acf8246
2 exp · 8 runs · Sat, June 13, 2026 at 10:53 AM
2606.13392

MiniMax Sparse Attention

Reproduction of MiniMax Sparse Attention (arXiv 2606.13392): CuTe-DSL sparse-attention kernel, sparse-vs-dense speedup PoC.

alphaXiv/msa-3f4986b6
5 exp · 9 runs · Sat, June 13, 2026 at 10:12 AM
2602.06036

DFlash

Reproduction of DFlash (arXiv 2602.06036): block-diffusion draft model for speculative decoding on Qwen3-4B.

alphaXiv/dflash-b4d1109c
4 exp · 3 runs · Sat, June 13, 2026 at 9:53 AM
2603.19312

LeWorldModel

Reproduction of LeWorldModel (arXiv 2603.19312): two-term JEPA world model on PushT.

alphaXiv/le-wm-66ffbe61
2 exp · 7 runs · Sat, June 13, 2026 at 4:00 AM
2606.03264

PaddleOCR-VL-1.6

Reproduction of PaddleOCR-VL-1.6 (arXiv 2606.03264): compact 0.9B document-parsing VLM, SOTA on OmniDocBench v1.6.

alphaXiv/paddleocr-50e3c8c8
4 exp · 4 runs · Fri, June 12, 2026 at 11:25 PM
2606.11087

Q-Guided Flow

Reproduction of Q-Guided Flow (arXiv 2606.11087): test-time critic-gradient guidance of a frozen BC flow policy in RL.

alphaXiv/qgf-12b912fc
2 exp · 2 runs · Fri, June 12, 2026 at 10:15 PM
2606.12195

InternVideo3

Reproduction of InternVideo3 (arXiv 2606.12195): transformers-native text+image+video inference PoC for the 8B model.

alphaXiv/internvideo-d2a11ea9
2 exp · 1 run · Fri, June 12, 2026 at 9:07 AM
2606.10651

Keye-VL-2.0

Reproduction of Kwai Keye-VL-2.0 (arXiv 2606.10651): transformers-native multimodal inference PoC for the 30B-A3B MoE model.

alphaXiv/keye-39e1f0b5
2 exp · 5 runs · Fri, June 12, 2026 at 6:11 AM
2606.11722

ICALens

Reproduction of ICA Lens (arXiv 2606.11722): interpreting LMs without training another dictionary.

alphaXiv/ica-lens-paper-e1fdec2f
2 exp · 2 runs · Fri, June 12, 2026 at 3:04 AM

Self-Distilled Reasoner

Reproduction of On-Policy Self-Distillation (arXiv 2601.18734): self-distilled reasoner on Qwen3-1.7B.

rehaanahmad2013/opsd-96e22c23
2 exp · 2 runs · Thu, June 11, 2026 at 8:35 AM
2606.08432

Trajectory-Refined Distillation

Reproduction of Trajectory-Refined Distillation (arXiv 2606.08432): forward-KL distillation on refined trajectories, Qwen3-1.7B.

alphaXiv/trd
3 exp · 3 runs · Wed, June 10, 2026 at 9:24 PM
2606.06021

On-Policy Representation Distillation

Reproduction of On-Policy Representation Distillation (arXiv 2606.06021): representation-distillation PoC on a single GPU.

alphaXiv/on-policy-representation-distillation
6 exp · 5 runs · Wed, June 10, 2026 at 8:27 PM
2606.09079

FlashMemory-DeepSeek-V4

Reproduction of FlashMemory-DeepSeek-V4 (arXiv 2606.09079): sparse-KV retriever + sparse-decode PoC.

alphaXiv/flashmemory-deepseek-v4
2 exp · 1 run · Wed, June 10, 2026 at 6:33 PM
2606.02800

Cosmos

Reproduction of Cosmos 3 (arXiv 2606.02800): Cosmos3-Nano text-to-image and text-to-video inference PoC.

alphaxiv/cosmos
3 exp · 4 runs · Wed, June 10, 2026 at 8:41 AM

MaxRL Opus

GRPO vs MaxRL on GSM8K (Qwen3-1.7B + LoRA): comparing advantage normalization by group mean vs std.

rehaanahmad2013/qwen-maxrl-b7d23c6b
32 exp · 36 runs · Wed, June 10, 2026 at 6:41 AM

MaxRL Fable

GRPO vs MaxRL on GSM8K (Qwen3-1.7B + LoRA): does normalizing the advantage by group mean instead of std help?

rehaanahmad2013/qwen-maxrl-b724706e
37 exp · 45 runs · Wed, June 10, 2026 at 6:30 AM