Paper Sources | Olympiad Gold Medal Packaged into a Two-Step Recipe

Top Picks

Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling score 10
Featured in HF Daily Papers; HF popularity: 135 upvotes (+4); code available; keywords (2): scaling, reasoning
MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory score 10
Featured in HF Daily Papers; HF popularity: 48 upvotes (+4); code available; keywords (1): reasoning
Warp-as-History: Generalizable Camera-Controlled Video Generation from One Training Video score 10
Featured in HF Daily Papers; HF popularity: 34 upvotes (+4); code available; keywords (3): lightweight, finetuning, post-training
Self-Distilled Agentic Reinforcement Learning score 10
Featured in HF Daily Papers; HF popularity: 75 upvotes (+4); code available; keywords (4): distillation, GRPO, post-training, agentic
Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems score 10
Featured in HF Daily Papers; HF popularity: 42 upvotes (+4); code available; keywords (2): tool use, reasoning
Many-Shot CoT-ICL: Making In-Context Learning Truly Learn score 8
Featured in HF Daily Papers; HF popularity: 28 upvotes (+4); keywords (3): scaling, fine-tuning, reasoning
Retrieval is Cheap, Show Me the Code: Executable Multi-Hop Reasoning for Retrieval-Augmented Generation score 8
Featured in HF Daily Papers; HF popularity: 8 upvotes (+2); code available; keywords (3): retrieval-augmented, RAG, reasoning
Orchard: An Open-Source Agentic Modeling Framework score 9
Featured in HF Daily Papers; HF popularity: 12 upvotes (+3); code available; keywords (7): lightweight, agentic, tool use, coding, reasoning
Orthrus: Memory-Efficient Parallel Token Generation via Dual-View Diffusion score 9
Featured in HF Daily Papers; HF popularity: 10 upvotes (+3); code available; keywords (2): lightweight, throughput
RealICU: Do LLM Agents Understand Long-Context ICU Data? A Benchmark Beyond Behavior Imitation score 8
Featured in HF Daily Papers; HF popularity: 7 upvotes (+2); code available; keywords (1): reasoning

Also Worth Noting

VGGT-Edit: Feed-forward Native 3D Scene Editing with Residual Field Prediction score 6
Featured in HF Daily Papers; HF popularity: 13 upvotes (+3)
PersonalAI 2.0: Enhancing knowledge graph traversal/retrieval with planning mechanism for Personalized LLM Agents score 4
Featured in HF Daily Papers; keywords (2): retrieval-augmented, reasoning
Data Difficulty and the Generalization–Extrapolation Tradeoff in LLM Fine-Tuning score 4
Institution: Amazon; keywords (1): fine-tuning
Nexus: An Agentic Framework for Time Series Forecasting score 4
Featured in HF Daily Papers; keywords (2): agentic, reasoning
Learning to Build the Environment: Self-Evolving Reasoning RL via Verifiable Environment Synthesis score 6
Featured in HF Daily Papers; HF popularity: 5 upvotes (+2); keywords (2): reasoning, synthetic data
Monitoring Data-aware Temporal Properties (Extended Version) score 4
Keywords (1): reasoning; accepted at top venue: IJCAI
Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining score 4
Keywords (2): pre-training, pretraining; accepted at top venue: ICML