Sources | Open-Source Search Agent Wins With 12K Samples, Agent Skills Mostly Fail

Featured

Mixture-of-Depths Attention score 10
Selected for HF Daily Papers; HF popularity: 66 upvotes (+4); code available; keywords (1): scaling
Code-A1: Adversarial Evolving of Code LLM and Test LLM via Reinforcement Learning score 8
Selected for HF Daily Papers; HF popularity: 9 upvotes (+2); code available; keywords (1): code generation
HSImul3R: Physics-in-the-Loop Reconstruction of Simulation-Ready Human-Scene Interactions score 10
Selected for HF Daily Papers; HF popularity: 141 upvotes (+4); code available; keywords (1): embodied
OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data score 10
Selected for HF Daily Papers; HF popularity: 135 upvotes (+4); code available; keywords (3): pre-training, reasoning, open-source
Grounding World Simulation Models in a Real-World Metropolis score 10
Selected for HF Daily Papers; HF popularity: 121 upvotes (+4); code available; keywords (1): retrieval-augmented
Anatomy of a Lie: A Multi-Stage Diagnostic Framework for Tracing Hallucinations in Vision-Language Models score 10
Selected for HF Daily Papers; HF popularity: 26 upvotes (+4); code available; keywords (4): deployment, state space, reasoning, vision-language
ViFeEdit: A Video-Free Tuner of Your Video Diffusion Transformer score 9
Selected for HF Daily Papers; HF popularity: 23 upvotes (+4); code available
SWE-Skills-Bench: Do Agent Skills Actually Help in Real-World Software Engineering? score 9
Selected for HF Daily Papers; HF popularity: 15 upvotes (+3); code available; keywords (1): deployment
Semi-Autonomous Formalization of the Vlasov-Maxwell-Landau Equilibrium score 9
Selected for HF Daily Papers; HF popularity: 12 upvotes (+3); code available; keywords (3): agentic, coding, reasoning
Understanding Reasoning in LLMs through Strategic Information Allocation under Uncertainty score 9
Selected for HF Daily Papers; HF popularity: 11 upvotes (+3); code available; keywords (2): post-training, reasoning

Also Worth Noting

Tri-Prompting: Video Diffusion with Unified Control over Scene, Subject, and Motion score 4
Selected for HF Daily Papers; HF popularity: 4 upvotes (+1)
Flash-Unified: A Training-Free and Task-Aware Acceleration Framework for Native Unified Models score 4
Keywords (2): pruning, deployment; accepted at top venue: CVPR
Unlocking the Value of Text: Event-Driven Reasoning and Multi-Level Alignment for Time Series Forecasting score 4
Keywords (1): reasoning; accepted at top venue: ICLR
Kimodo: Scaling Controllable Human Motion Generation score 4
Institution: NVIDIA; keywords (2): scaling, robotics
Evolving Contextual Safety in Multi-Modal Large Language Models via Inference-Time Self-Reflective Memory score 4
Keywords (2): reasoning, jailbreak; accepted at top venue: CVPR
Learning to Recall with Transformers Beyond Orthogonal Embeddings score 4
Institution: University of Toronto; keywords (1): scaling
Embodied Foundation Models at the Edge: A Survey of Deployment Constraints and Mitigation Strategies score 4
Institution: Imperial College; keywords (8): compression, deployment, latency, real-time, edge
A Family of LLMs Liberated from Static Vocabularies score 4
Institution: Aleph Alpha; keywords (3): compression, fine-tuning, pre-training
MobileLLM-Flash: Latency-Guided On-Device LLM Design for Industry Scale score 4
Institution: Mila; keywords (4): deployment, latency, real-time, pretraining
GASP: Guided Asymmetric Self-Play For Coding LLMs score 4
Keywords (3): edge, post-training, coding; accepted at top venue: ICLR
Deriving Hyperparameter Scaling Laws via Modern Optimization Theory score 4
Institution: Mila; keywords (1): scaling
Seeing Beyond: Extrapolative Domain Adaptive Panoramic Segmentation score 3
Accepted at top venue: CVPR
AC-Foley: Reference-Audio-Guided Video-to-Audio Synthesis with Acoustic Transfer score 3
Accepted at top venue: ICLR