Paper Sources | A VLM Without CLIP Performs Better; Prefill Accelerated 28x

Top Picks

Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders score 13
Institution: Tencent; featured in HF Daily Papers; HF popularity: 64 upvotes (+4); code available; keywords (6): scaling, lightweight, deployment, edge, pretraining
PixARMesh: Autoregressive Mesh-Native Single-View Scene Reconstruction score 6
Featured in HF Daily Papers; code available; keywords (2): lightweight, reasoning
FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling score 8
Featured in HF Daily Papers; HF popularity: 6 upvotes (+2); code available; keywords (1): latency
Physical Simulator In-the-Loop Video Generation score 6
Featured in HF Daily Papers; accepted at top venue: CVPR

Also Worth Noting

TumorChain: Interleaved Multimodal Chain-of-Thought Reasoning for Traceable Clinical Tumor Analysis score 4
Keywords (2): reasoning, vision-language; accepted at top venue: ICLR
BlackMirror: Black-Box Backdoor Detection for Text-to-Image Models via Instruction-Response Deviation score 4
Keywords (1): text-to-image; accepted at top venue: CVPR
Learning to Generate via Understanding: Understanding-Driven Intrinsic Rewarding for Unified Multimodal Models score 4
Keywords (1): text-to-image; accepted at top venue: CVPR
Making Training-Free Diffusion Segmentors Scale with the Generative Power score 4
Keywords (1): text-to-image; accepted at top venue: CVPR
Cut to the Chase: Training-free Multimodal Summarization via Chain-of-Events score 4
Keywords (2): lightweight, reasoning; accepted at top venue: CVPR
DC-Merge: Improving Model Merging with Directional Consistency score 4
Keywords (2): fine-tuning, vision-language; accepted at top venue: CVPR
Dynamic Chunking Diffusion Transformer score 5
Featured in HF Daily Papers; HF popularity: 3 upvotes (+1); keywords (2): compression, post-training
Pano3DComposer: Feed-Forward Compositional 3D Scene Generation from Single Panoramic Image score 3
Accepted at top venue: CVPR
Unify the Views: View-Consistent Prototype Learning for Few-Shot Segmentation score 3
Accepted at top venue: CVPR
Imagine How To Change: Explicit Procedure Modeling for Change Captioning score 3
Accepted at top venue: ICLR
Dynamic Momentum Recalibration in Online Gradient Learning score 3
Accepted at top venue: CVPR
Learning to Solve Orienteering Problem with Time Windows and Variable Profits score 3
Accepted at top venue: ICLR
SCOPE: Scene-Contextualized Incremental Few-Shot 3D Segmentation score 3
Accepted at top venue: CVPR