-
Self-Improving Language Models with Bidirectional Evolutionary Search
score 10
Featured in HF Daily Papers; HF popularity: 52 upvotes (+4); code available; keywords (4): post-training, agentic, embodied, open-source
-
Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players
score 8
Featured in HF Daily Papers; HF popularity: 356 upvotes (+4); keywords (3): scaling, real-time, embodied
-
DenoiseRL: Bootstrapping Reasoning Models to Recover from Noisy Prefixes
score 10
Featured in HF Daily Papers; HF popularity: 43 upvotes (+4); code available; keywords (2): reasoning, data curation
-
GEM: Generative Supervision Helps Embodied Intelligence
score 10
Featured in HF Daily Papers; HF popularity: 37 upvotes (+4); code available; keywords (5): pre-training, reasoning, vision-language, robotics, embodied
-
MemTrace: Tracing and Attributing Errors in Large Language Model Memory Systems
score 10
Featured in HF Daily Papers; HF popularity: 36 upvotes (+4); code available; keywords (2): RAG, reasoning
-
ProRL: Effective Reinforcement Learning for Proactive Recommendation via Rectified Policy Gradient Estimation
score 9
Featured in HF Daily Papers; HF popularity: 78 upvotes (+4); code available
-
Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Agents
score 9
Featured in HF Daily Papers; HF popularity: 34 upvotes (+4); code available
-
OSP-Next: Efficient High-Quality Video Generation with Sparse Sequence Parallelism, HiF8 Quantization, and Reinforcement Learning
score 9
Featured in HF Daily Papers; HF popularity: 19 upvotes (+3); code available; keywords (5): quantization, fine-tuning, GRPO, post-training, text-to-video
-
SkillGrad: Optimizing Agent Skills Like Gradient Descent
score 10
Featured in HF Daily Papers; HF popularity: 23 upvotes (+4); code available; keywords (1): lightweight
-
HRBench: Benchmarking and Understanding Thinking-Mode Switch Strategies in Hybrid-Reasoning LLMs
score 9
Featured in HF Daily Papers; HF popularity: 12 upvotes (+3); code available; keywords (1): reasoning