Verifiable Process Rewards for Agentic Reasoning
Papers
SALAD: Achieve High-Sparsity Attention via Efficient Linear Attention Tuning for Video Diffusion Transformer
Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models
Collections for paper "R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing"
- R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing
  Paper • 2505.21600 • Published • 71
- nics-efc/R2R_router
  Token Classification • Updated • 1
- nics-efc/R2R_Router_Training
  Viewer • Updated • 8.19M • 314 • 4
- nics-efc/R2R_query
  Viewer • Updated • 2.93k • 35
MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs
Accepted by ICLR 2026
- MARS: Reinforcing Multi-Agent Reasoning of LLMs through Self-Play in Strategic Games
  Paper • 2510.15414 • Published • 1
- nics-efc/MARSHAL-Generalist-Qwen3-4B
  Text Generation • 4B • Updated • 30
- nics-efc/MARSHAL-Generalist-Qwen3-8B
  Text Generation • 8B • Updated • 12
- nics-efc/MARSHAL-Tic-Tac-Toe-Qwen3-4B
  Text Generation • 4B • Updated • 13
Artifacts of paper "Cache-to-Cache: Direct Semantic Communication Between Large Language Models"
- R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing
  Paper • 2505.21600 • Published • 71
- Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching
  Paper • 2412.17153 • Published • 39
- Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding
  Paper • 2307.15337 • Published • 39
- DiTFastAttn: Attention Compression for Diffusion Transformer Models
  Paper • 2406.08552 • Published • 25