AI Research Brief
Drop CLIP, Gain Performance: VLMs Work Better Without It
17 selected from 283 papers
Featured
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders
score 12
Institution: Tencent; featured in HF Daily Papers; HF popularity: 18 upvotes (+3); code available; keywords (6): scaling, lightweight, deployment, edge, pretraining
PixARMesh: Autoregressive Mesh-Native Single-View Scene Reconstruction
score 6
Featured in HF Daily Papers; code available; keywords (2): lightweight, reasoning
FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling
score 6
Featured in HF Daily Papers; code available; keywords (1): latency
Physical Simulator In-the-Loop Video Generation
score 6
Featured in HF Daily Papers; accepted at top venue: CVPR
Also Worth Noting
TumorChain: Interleaved Multimodal Chain-of-Thought Reasoning for Traceable Clinical Tumor Analysis
score 4
Keywords (2): reasoning, vision-language; accepted at top venue: ICLR
BlackMirror: Black-Box Backdoor Detection for Text-to-Image Models via Instruction-Response Deviation
score 4
Keywords (1): text-to-image; accepted at top venue: CVPR
Learning to Generate via Understanding: Understanding-Driven Intrinsic Rewarding for Unified Multimodal Models
score 4
Keywords (1): text-to-image; accepted at top venue: CVPR
Making Training-Free Diffusion Segmentors Scale with the Generative Power
score 4
Keywords (1): text-to-image; accepted at top venue: CVPR
Cut to the Chase: Training-free Multimodal Summarization via Chain-of-Events
score 4
Keywords (2): lightweight, reasoning; accepted at top venue: CVPR
DC-Merge: Improving Model Merging with Directional Consistency
score 4
Keywords (2): fine-tuning, vision-language; accepted at top venue: CVPR
Dynamic Chunking Diffusion Transformer
score 4
Featured in HF Daily Papers; keywords (2): compression, post-training
Pano3DComposer: Feed-Forward Compositional 3D Scene Generation from Single Panoramic Image
score 3
Accepted at top venue: CVPR
Unify the Views: View-Consistent Prototype Learning for Few-Shot Segmentation
score 3
Accepted at top venue: CVPR
Imagine How To Change: Explicit Procedure Modeling for Change Captioning
score 3
Accepted at top venue: ICLR
Dynamic Momentum Recalibration in Online Gradient Learning
score 3
Accepted at top venue: CVPR
Learning to Solve Orienteering Problem with Time Windows and Variable Profits
score 3
Accepted at top venue: ICLR
SCOPE: Scene-Contextualized Incremental Few-Shot 3D Segmentation
score 3
Accepted at top venue: CVPR