AI Paper Briefing
VLMs get stronger without CLIP; prefill sped up 28x
17 papers selected from 283
Highlights
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders
score 13
Institution: Tencent; featured in HF Daily Papers; HF popularity: 64 upvotes (+4); code available; keywords (6): scaling, lightweight, deployment, edge, pretraining
PixARMesh: Autoregressive Mesh-Native Single-View Scene Reconstruction
score 6
Featured in HF Daily Papers; code available; keywords (2): lightweight, reasoning
FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling
score 8
Featured in HF Daily Papers; HF popularity: 6 upvotes (+2); code available; keywords (1): latency
Physical Simulator In-the-Loop Video Generation
score 6
Featured in HF Daily Papers; accepted at top venue: CVPR
Also worth noting
TumorChain: Interleaved Multimodal Chain-of-Thought Reasoning for Traceable Clinical Tumor Analysis
score 4
Keywords (2): reasoning, vision-language; accepted at top venue: ICLR
BlackMirror: Black-Box Backdoor Detection for Text-to-Image Models via Instruction-Response Deviation
score 4
Keywords (1): text-to-image; accepted at top venue: CVPR
Learning to Generate via Understanding: Understanding-Driven Intrinsic Rewarding for Unified Multimodal Models
score 4
Keywords (1): text-to-image; accepted at top venue: CVPR
Making Training-Free Diffusion Segmentors Scale with the Generative Power
score 4
Keywords (1): text-to-image; accepted at top venue: CVPR
Cut to the Chase: Training-free Multimodal Summarization via Chain-of-Events
score 4
Keywords (2): lightweight, reasoning; accepted at top venue: CVPR
DC-Merge: Improving Model Merging with Directional Consistency
score 4
Keywords (2): fine-tuning, vision-language; accepted at top venue: CVPR
Dynamic Chunking Diffusion Transformer
score 5
Featured in HF Daily Papers; HF popularity: 3 upvotes (+1); keywords (2): compression, post-training
Pano3DComposer: Feed-Forward Compositional 3D Scene Generation from Single Panoramic Image
score 3
Accepted at top venue: CVPR
Unify the Views: View-Consistent Prototype Learning for Few-Shot Segmentation
score 3
Accepted at top venue: CVPR
Imagine How To Change: Explicit Procedure Modeling for Change Captioning
score 3
Accepted at top venue: ICLR
Dynamic Momentum Recalibration in Online Gradient Learning
score 3
Accepted at top venue: CVPR
Learning to Solve Orienteering Problem with Time Windows and Variable Profits
score 3
Accepted at top venue: ICLR
SCOPE: Scene-Contextualized Incremental Few-Shot 3D Segmentation
score 3
Accepted at top venue: CVPR