AI Paper Briefing
Two loops lift SWE-bench from 43 to 64
17 papers selected from 319
Top picks
OPD-Evolver: Cultivating Holistic Agent Evolver via On-Policy Distillation
score 10
Featured in HF Daily Papers; HF popularity: 26 upvotes (+4); code available; keywords (1): distillation
GameCraft-Bench: Can Agents Build Playable Games End-to-End in a Real Game Engine?
score 10
Featured in HF Daily Papers; HF popularity: 42 upvotes (+4); code available; keywords (1): coding
Variable-Width Transformers
score 8
Featured in HF Daily Papers; HF popularity: 5 upvotes (+2); code available; keywords (2): scaling, MoE
Unified Multimodal Autoregressive Modeling with Shared Context-Visual Tokenizer is Key to Unification
score 9
Featured in HF Daily Papers; HF popularity: 11 upvotes (+3); code available; keywords (4): scaling, quantization, fine-tuning, pre-training
ActWorld: From Explorable to Interactive World Model via Action-Aware Memory
score 6
Featured in HF Daily Papers; HF popularity: 6 upvotes (+2); keywords (3): compression, real-time, reasoning
Zone of Proximal Policy Optimization: Teacher in Prompts, Not Gradients
score 8
Featured in HF Daily Papers; HF popularity: 48 upvotes (+4); keywords (3): distillation, GRPO, vision-language
Also worth noting
FoundCause: Causal Discovery with Latent Confounders from Observational Data
score 4
Institution: Amazon; keywords (3): edge, reasoning, synthetic data
SuCo: Sufficiency-guided Continuous Adaptive Reasoning
score 4
Keywords (2): fine-tuning, reasoning; accepted at top venue: ICML
Dynamic Rollout Editing for Reducing Overthinking in RL-Trained Reasoning Models
score 4
Institution: Huawei; keywords (3): GRPO, post-training, reasoning
SoftMoE: Soft Differentiable Routing for Mixture-of-Experts in LLMs
score 4
Keywords (2): scaling, MoE; accepted at top venue: ICML
LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling
score 8
Featured in HF Daily Papers; HF popularity: 134 upvotes (+4); keywords (6): scaling, latency, instruction tuning, agentic, code generation
Looped World Models
score 6
Featured in HF Daily Papers; HF popularity: 8 upvotes (+2); keywords (1): scaling
The Discrete-Log Clock: How a Transformer Learns Modular Multiplication
score 3
Institution: Stanford
Temporal Preference Optimization for Unsupervised Retrieval
score 3
Accepted at top venue: ICML
Bridging Functional Correctness and Runtime Efficiency Gaps in LLM-Based Code Translation
score 3
Accepted at top venue: ICML
MLLMs Get It Right, Then Get It Wrong: Tracing and Correcting Late-Layer Textual Bias
score 3
Accepted at top venue: IJCAI
Adaptive Volumetric Mechanical Property Fields Invariant to Resolution
score 3
Institution: NVIDIA