AI Paper Brief
Vision models are starting to redesign how they produce output
7 papers selected from 111
Highlights
SpatialBench: Is Your Spatial Foundation Model an All-Round Player?
score 10
Featured in HF Daily Papers; HF popularity: 63 upvotes (+4); code available; keywords (2): scaling, embodied
Gemini Embedding 2: A Native Multimodal Embedding Model from Gemini
score 7
Featured in HF Daily Papers; HF popularity: 15 upvotes (+3); keywords (1): RAG
LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding
score 8
Featured in HF Daily Papers; HF popularity: 114 upvotes (+4); keywords (2): throughput, vision-language
JLT: Clean-Latent Prediction in Latent Diffusion Transformers
score 10
Featured in HF Daily Papers; HF popularity: 26 upvotes (+4); code available; keywords (1): compression
MRT: Masked Region Transformer for Layered Image Generation and Editing at Scale
score 6
Featured in HF Daily Papers; HF popularity: 5 upvotes (+2); keywords (2): distillation, real-time
Also worth noting
Verus-SpecGym: An Agentic Environment for Evaluating Specification Autoformalization
score 7
Featured in HF Daily Papers; HF popularity: 3 upvotes (+1); code available; keywords (3): edge, agentic, coding
PlayClass: Automated Play Behaviour Classification in Poultry
score 3
Institution: Imperial College