-
MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios
score 13
Institution: Baidu; featured in HF Daily Papers; HF popularity: 98 upvotes (+4); code available; keywords (1): tool use
-
Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization
score 11
Featured in HF Daily Papers; HF popularity: 30 upvotes (+4); keywords (1): GRPO; accepted at top conference: ICLR
-
EmbodMocap: In-the-Wild 4D Human-Scene Reconstruction for Embodied Agents
score 11
Institution: Zhejiang University; featured in HF Daily Papers; HF popularity: 9 upvotes (+2); code available; keywords (2): fine-tune, embodied
-
The Trinity of Consistency as a Defining Principle for General World Models
score 10
Featured in HF Daily Papers; HF popularity: 187 upvotes (+4); code available; keywords (2): scaling, reasoning
-
From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models
score 10
Featured in HF Daily Papers; HF popularity: 146 upvotes (+4); code available; keywords (1): reasoning
-
OmniGAIA: Towards Native Omni-Modal AI Agents
score 10
Featured in HF Daily Papers; HF popularity: 49 upvotes (+4); code available; keywords (3): reasoning, vision-language, open-source
-
Imagination Helps Visual Reasoning, But Not Yet in Latent Space
score 10
Featured in HF Daily Papers; HF popularity: 36 upvotes (+4); code available; keywords (1): reasoning
-
AgentDropoutV2: Optimizing Information Flow in Multi-Agent Systems via Test-Time Rectify-or-Reject Pruning
score 10
Featured in HF Daily Papers; HF popularity: 24 upvotes (+4); code available; keywords (4): pruning, fine-tuning, retrieval-augmented, reasoning
-
Search More, Think Less: Rethinking Long-Horizon Agentic Search for Efficiency and Generalization
score 9
Featured in HF Daily Papers; HF popularity: 18 upvotes (+3); code available; keywords (5): scaling, latency, fine-tuning, agentic, reasoning
-
MediX-R1: Open Ended Medical Reinforcement Learning
score 9
Featured in HF Daily Papers; HF popularity: 16 upvotes (+3); code available; keywords (4): lightweight, reasoning, vision-language, open-source