-
K-BrowseComp: A Web Browsing Agent Benchmark Grounded in Korean Contexts
score 10
Featured in HF Daily Papers; HF popularity: 51 upvotes (+4); code available; keywords (2): agentic, reasoning
-
X-Stream: Exploring MLLMs as Multiplexers for Multi-Stream Understanding
score 10
Featured in HF Daily Papers; HF popularity: 30 upvotes (+4); code available; keywords (1): reasoning
-
On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters
score 8
Featured in HF Daily Papers; HF popularity: 135 upvotes (+4); keywords (3): scaling, serving, fine-tuning
-
MCP-Persona: Benchmarking LLM Agents on Real-World Personal Applications via Environment Simulation
score 9
Featured in HF Daily Papers; HF popularity: 14 upvotes (+3); code available; keywords (1): tool use
-
VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization
score 8
Featured in HF Daily Papers; HF popularity: 25 upvotes (+4); keywords (4): scaling, lightweight, reasoning, vision-language
-
OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents
score 9
Featured in HF Daily Papers; HF popularity: 15 upvotes (+3); code available; keywords (4): post-training, agentic, reasoning, open-source
-
Joint Agent Memory and Exploration Learning via Novelty Signals
score 9
Featured in HF Daily Papers; HF popularity: 12 upvotes (+3); code available; keywords (1): agentic
-
LongLive-RAG: A General Retrieval-Augmented Framework for Long Video Generation
score 9
Featured in HF Daily Papers; HF popularity: 15 upvotes (+3); code available; keywords (3): lightweight, retrieval-augmented, RAG
-
Multi-Agent Computer Use
score 6
Featured in HF Daily Papers; HF popularity: 5 upvotes (+2); keywords (1): scaling
-
Off-the-Shelf LLMs as Process Scorers: Training-Free Alternative to PRMs for Mathematical Reasoning
score 6
Featured in HF Daily Papers; HF popularity: 6 upvotes (+2); keywords (1): reasoning