Paper Sources | Keeping just 256 tokens is enough to approach full-attention performance

Top Picks

FASA: Frequency-aware Sparse Attention score 12
Featured in HF Daily Papers; HF popularity: 112 upvotes (+4); keywords (4): pruning, deployment, attention, reasoning; accepted at top venue: ICLR
SpatiaLab: Can Vision-Language Models Perform Spatial Reasoning in the Wild? score 11
Featured in HF Daily Papers; HF popularity: 10 upvotes (+3); keywords (4): reasoning, vision-language, benchmark, evaluation; accepted at top venue: ICLR
AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration score 9
Featured in HF Daily Papers; HF popularity: 81 upvotes (+4); keywords (5): efficient, agent, agents, agentic, cost
HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing score 9
Featured in HF Daily Papers; HF popularity: 40 upvotes (+4); keywords (2): attention, MoE
Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis score 9
Featured in HF Daily Papers; HF popularity: 39 upvotes (+4); keywords (5): fast, distillation, inference, text-to-image, cost
SWE-World: Building Software Engineering Agents in Docker-Free Environments score 9
Featured in HF Daily Papers; HF popularity: 38 upvotes (+4); keywords (4): scaling, agent, agents, evaluation
SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training score 9
Featured in HF Daily Papers; HF popularity: 34 upvotes (+4); keywords (8): scaling, inference, post-training, agent, agents
CoBA-RL: Capability-Oriented Budget Allocation for Reinforcement Learning in LLMs score 9
Featured in HF Daily Papers; HF popularity: 33 upvotes (+4); keywords (4): efficiency, GRPO, post-training, reasoning
Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization score 9
Featured in HF Daily Papers; HF popularity: 32 upvotes (+4); keywords (5): efficiency, quantization, deployment, latency, diffusion
Accurate Failure Prediction in Agents Does Not Imply Effective Failure Prevention score 9
Featured in HF Daily Papers; HF popularity: 25 upvotes (+4); keywords (3): deployment, agents, benchmark

Also Worth Noting

Accelerating Scientific Research with Gemini: Case Studies and Common Techniques score 4
Featured in HF Daily Papers; HF popularity: 4 upvotes (+1)
Rare Event Early Detection: A Dataset of Sepsis Onset for Critically Ill Trauma Patients score 2
Keywords (2): benchmark, cost
Weighted Sum-of-Trees Model for Clustered Data score 2
Keywords (2): lightweight, inference
Equal Access, Unequal Interaction: A Counterfactual Audit of LLM Fairness score 2
Keywords (2): evaluation, safety
3D-Learning: Diffusion-Augmented Distributionally Robust Decision-Focused Learning score 2
Keywords (3): serving, edge, diffusion
SRA-Seg: Synthetic to Real Alignment for Semi-Supervised Medical Image Segmentation score 2
Keywords (3): edge, alignment, synthetic data
Variational Sparse Paired Autoencoders (vsPAIR) for Inverse Problems and Uncertainty Quantification score 2
Keywords (2): fast, inference
Nüwa: Mending the Spatial Integrity Torn by VLM Token Pruning score 2
Keywords (4): efficient, pruning, alignment, attention
UAT-LITE: Inference-Time Uncertainty-Aware Attention for Pretrained Transformers score 2
Keywords (5): deployment, inference, transformer, attention, cost
Synthetic Data Augmentation for Medical Audio Classification: A Preliminary Evaluation score 2
Keywords (4): diffusion, audio, evaluation, synthetic data
Human-Centric Traffic Signal Control for Equity: A Multi-Agent Action Branching Deep Reinforcement Learning Approach score 2
Keywords (2): agent, multimodal
Generative Engine Optimization: A VLM and Agent Framework for Pinterest Acquisition Growth score 2
Keywords (8): production, real-time, fine-tune, agent, agents