Sources | A 4B Web Agent Catches Up to Closed CUAs on a Few Thousand Trajectories

Featured

K-BrowseComp: A Web Browsing Agent Benchmark Grounded in Korean Contexts score 10
Featured in HF Daily Papers; HF popularity: 51 upvotes (+4); code available; keywords (2): agentic, reasoning
X-Stream: Exploring MLLMs as Multiplexers for Multi-Stream Understanding score 10
Featured in HF Daily Papers; HF popularity: 30 upvotes (+4); code available; keywords (1): reasoning
On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters score 8
Featured in HF Daily Papers; HF popularity: 135 upvotes (+4); keywords (3): scaling, serving, fine-tuning
MCP-Persona: Benchmarking LLM Agents on Real-World Personal Applications via Environment Simulation score 9
Featured in HF Daily Papers; HF popularity: 14 upvotes (+3); code available; keywords (1): tool use
VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization score 8
Featured in HF Daily Papers; HF popularity: 25 upvotes (+4); keywords (4): scaling, lightweight, reasoning, vision-language
OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents score 9
Featured in HF Daily Papers; HF popularity: 15 upvotes (+3); code available; keywords (4): post-training, agentic, reasoning, open-source
Joint Agent Memory and Exploration Learning via Novelty Signals score 9
Featured in HF Daily Papers; HF popularity: 12 upvotes (+3); code available; keywords (1): agentic
LongLive-RAG: A General Retrieval-Augmented Framework for Long Video Generation score 9
Featured in HF Daily Papers; HF popularity: 15 upvotes (+3); code available; keywords (3): lightweight, retrieval-augmented, RAG
Multi-Agent Computer Use score 6
Featured in HF Daily Papers; HF popularity: 5 upvotes (+2); keywords (1): scaling
Off-the-Shelf LLMs as Process Scorers: Training-Free Alternative to PRMs for Mathematical Reasoning score 6
Featured in HF Daily Papers; HF popularity: 6 upvotes (+2); keywords (1): reasoning

Also Worth Noting

PhyScene3D: Physically Consistent Interactive 3D Tabletop Scene Generation score 4
keywords (1): reasoning; accepted at top venue: ICML
Understanding Identity Continuity in Thermal Video through Scene-Level Consistency score 4
keywords (1): lightweight; accepted at top venue: CVPR
Improving Visual Token Reduction via Rectifying Distortions for Efficient Multimodal LLM Inference score 4
keywords (2): latency, vision-language; accepted at top venue: ICML
Order within Chaos: Capturing Intrinsic Energy Anomalies for AI-Manipulated Image Forgery Localization score 4
keywords (1): synthetic data; accepted at top venue: ICML
Explainable Forensics of Manipulated Segments in Untrimmed Long Videos score 4
keywords (1): reasoning; accepted at top venue: ICML
Initialization is Half the Battle: Generating Diverse Images from a Guidance Potential Posterior score 4
keywords (1): text-to-image; accepted at top venue: ICML
Active Exploring like a Pigeon: Reinforcing Spatial Reasoning via Agentic Vision-Language Models score 4
keywords (5): serving, finetuning, agentic, reasoning, vision-language; accepted at top venue: ICML
MASER: Modality-Adaptive Specialist Routing for Embodied 3D Spatial Intelligence score 4
keywords (4): lightweight, reasoning, vision-language, embodied; accepted at top venue: CVPR
Thinking in Blender: Staged Executable Inverse Graphics with Vision-Language Models score 4
Featured in HF Daily Papers; keywords (2): agentic, vision-language
Demystifying Multimodal Biomolecular Co-design With Intrinsic Geodesic Coupling score 3
accepted at top venue: ICML
Divide and Conquer: Reliable Multi-View Evidential Learning for Deepfake Detection score 3
accepted at top venue: ICML
Convex Distance Operator Transport: A Convex and Geometry-Preserving Formulation score 3
accepted at top venue: ICML
Rethinking Evaluation Paradigms in IBP-based Certified Training score 3
accepted at top venue: ICML