- May 30, 2026 Agents Start Improving Themselves, and Reaching for Fewer Tools Daily
- May 27, 2026 The Rulers We Use to Measure What Models Really Think Are Broken Daily
- May 23, 2026 Optimizer Choice Stretches Capacity Scaling 2.3x Daily
- May 21, 2026 Dual-Stream MoE Unifies Multimodal, Garment Video 30x Faster Daily
- May 20, 2026 Stop When Reasoning Converges, Save 26% of Tokens Daily
- May 14, 2026 Flow-OPD Lifts GenEval From 63 to 92 Daily
- May 8, 2026 T²PO Stabilizes Multi-Turn RL; MotionCache Cuts Video Steps 6x Daily
- May 4, 2026 ViT Pre-Trains Like an LLM, Skips the CLIP Stage Daily
- May 1, 2026 Recursive MAS Cuts Tokens 35%, T2I Repaints Instead of Editing Daily
- Apr 29, 2026 Emotion Probes Crash From 82% to 5% Without Keywords Daily
- Apr 27, 2026 Full Traces Lift Multi-Agent Attribution Accuracy 76% Daily
- Apr 26, 2026 4B Agent on 10K Data, MoE Upcycling Saves 32% Compute Daily
- Apr 22, 2026 Agents Ignore Answers Placed in Plain Sight Daily
- Apr 20, 2026 Open Omni Hits Flagship Scale, Self-Judge Breaks, Reasoning Leaks Forgotten Facts Daily
- Apr 18, 2026 Tencent Open-Sources 3D World Generation, VLM Modal Bias Probe Daily
- Apr 17, 2026 Big Models Resist Rumors but Fall for Noise Daily
- Apr 11, 2026 1.7x Faster From Fine-Tuning Alone, Token Collapse Misdiagnosed Daily
- Apr 9, 2026 120B on One GPU, and 40% of Video Benchmarks Are Guessable Daily
- Apr 4, 2026 Single Neurons Remember Entities, Reusable Routines Boost 19% Daily
- Apr 3, 2026 Minimalist Agents Match MCP, Code Models Think Mid-Stream Daily
- Mar 19, 2026 Open-Source Search Agent Wins With 12K Samples, Agent Skills Mostly Fail Daily
- Mar 18, 2026 700K Paper Pairs Distill Taste, Null Spaces Expose Blind Spots Daily
- Mar 17, 2026 Expert Reasoning Structure for CoT, +13% on Novel Class Discovery Daily
- Mar 16, 2026 Budget-Aware Agents Beat 4x Brute-Force Sampling Daily
- Mar 14, 2026 Encode the Answer, Not the Question — Embeddings Gain 9% Daily
- Mar 12, 2026 Write Code Before You Draw, Layouts Improve 68% Daily
- Mar 11, 2026 4-Step Diffusion Beats 100-Step Baselines, Layer Skipping Saves 18% Daily
- Mar 6, 2026 Code Agents Can't Cross Repo Boundaries, Under 45% Success Daily
- Mar 4, 2026 9K Samples Rival R1, Most RL Gains Trace Back to SFT Daily
- Mar 1, 2026 Latent Reasoning's Gains Aren't From Reasoning Daily
- Feb 25, 2026 Token Probabilities as Zero-Shot Rewards Hit 0.95 Correlation Daily
- Feb 21, 2026 Agents Score Higher but Fail the Same Way Daily
- Feb 20, 2026 Example Pairs Replace Prompts, Agents Play Favorites Daily
- Feb 12, 2026 Text Diffusion Hits Practical Speed, RL Spreads Everywhere Daily
- Feb 10, 2026 LinkedIn Ships LLM-Powered Search Ranking at Scale Daily
- Feb 5, 2026 Kimi K2.5 Open-Sources Agent Swarm, CoT Plans Only 2-3 Steps Ahead Daily
- Feb 4, 2026 Better SFT Makes Worse RL, Distillation Waste, Reward Circuits Daily
- Feb 3, 2026 Zero-Cost Data Mix Search, Guided RLVR, Selective SFT Daily
- Feb 1, 2026 Open-Source Deep Research Beats GPT-5, Embedding Scaling Outshines Experts Daily