AI Research Brief
Open-Source Search Agent Wins With 12K Samples, Agent Skills Mostly Fail
23 selected from 255 papers
Featured
Mixture-of-Depths Attention
score 10
Featured in HF Daily Papers; HF popularity: 66 upvotes (+4); Code available; Keywords (1): scaling
Code-A1: Adversarial Evolving of Code LLM and Test LLM via Reinforcement Learning
score 8
Featured in HF Daily Papers; HF popularity: 9 upvotes (+2); Code available; Keywords (1): code generation
HSImul3R: Physics-in-the-Loop Reconstruction of Simulation-Ready Human-Scene Interactions
score 10
Featured in HF Daily Papers; HF popularity: 141 upvotes (+4); Code available; Keywords (1): embodied
OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data
score 10
Featured in HF Daily Papers; HF popularity: 135 upvotes (+4); Code available; Keywords (3): pre-training, reasoning, open-source
Grounding World Simulation Models in a Real-World Metropolis
score 10
Featured in HF Daily Papers; HF popularity: 121 upvotes (+4); Code available; Keywords (1): retrieval-augmented
Anatomy of a Lie: A Multi-Stage Diagnostic Framework for Tracing Hallucinations in Vision-Language Models
score 10
Featured in HF Daily Papers; HF popularity: 26 upvotes (+4); Code available; Keywords (4): deployment, state space, reasoning, vision-language
ViFeEdit: A Video-Free Tuner of Your Video Diffusion Transformer
score 9
Featured in HF Daily Papers; HF popularity: 23 upvotes (+4); Code available
SWE-Skills-Bench: Do Agent Skills Actually Help in Real-World Software Engineering?
score 9
Featured in HF Daily Papers; HF popularity: 15 upvotes (+3); Code available; Keywords (1): deployment
Semi-Autonomous Formalization of the Vlasov-Maxwell-Landau Equilibrium
score 9
Featured in HF Daily Papers; HF popularity: 12 upvotes (+3); Code available; Keywords (3): agentic, coding, reasoning
Understanding Reasoning in LLMs through Strategic Information Allocation under Uncertainty
score 9
Featured in HF Daily Papers; HF popularity: 11 upvotes (+3); Code available; Keywords (2): post-training, reasoning
Also Worth Noting
Tri-Prompting: Video Diffusion with Unified Control over Scene, Subject, and Motion
score 4
Featured in HF Daily Papers; HF popularity: 4 upvotes (+1)
Flash-Unified: A Training-Free and Task-Aware Acceleration Framework for Native Unified Models
score 4
Keywords (2): pruning, deployment; Top-conference acceptance: CVPR
Unlocking the Value of Text: Event-Driven Reasoning and Multi-Level Alignment for Time Series Forecasting
score 4
Keywords (1): reasoning; Top-conference acceptance: ICLR
Kimodo: Scaling Controllable Human Motion Generation
score 4
Institution: NVIDIA; Keywords (2): scaling, robotics
Evolving Contextual Safety in Multi-Modal Large Language Models via Inference-Time Self-Reflective Memory
score 4
Keywords (2): reasoning, jailbreak; Top-conference acceptance: CVPR
Learning to Recall with Transformers Beyond Orthogonal Embeddings
score 4
Institution: University of Toronto; Keywords (1): scaling
Embodied Foundation Models at the Edge: A Survey of Deployment Constraints and Mitigation Strategies
score 4
Institution: Imperial College; Keywords (8): compression, deployment, latency, real-time, edge
A Family of LLMs Liberated from Static Vocabularies
score 4
Institution: Aleph Alpha; Keywords (3): compression, fine-tuning, pre-training
MobileLLM-Flash: Latency-Guided On-Device LLM Design for Industry Scale
score 4
Institution: Mila; Keywords (4): deployment, latency, real-time, pretraining
GASP: Guided Asymmetric Self-Play For Coding LLMs
score 4
Keywords (3): edge, post-training, coding; Top-conference acceptance: ICLR
Deriving Hyperparameter Scaling Laws via Modern Optimization Theory
score 4
Institution: Mila; Keywords (1): scaling
Seeing Beyond: Extrapolative Domain Adaptive Panoramic Segmentation
score 3
Top-conference acceptance: CVPR
AC-Foley: Reference-Audio-Guided Video-to-Audio Synthesis with Acoustic Transfer
score 3
Top-conference acceptance: ICLR