Paper Sources | Training 120B on a Single GPU · Video Benchmarks Are 40% Guesswork

Highlights

ClawsBench: Evaluating Capability and Safety of LLM Productivity Agents in Simulated Workspaces score 11
Institution: Apple; featured in HF Daily Papers; HF popularity: 16 upvotes (+3); code available
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding score 10
Featured in HF Daily Papers; HF popularity: 201 upvotes (+4); code available; keywords (2): reasoning, leaderboard
Watch Before You Answer: Learning from Visually Grounded Post-Training score 10
Featured in HF Daily Papers; HF popularity: 26 upvotes (+4); code available; keywords (4): post-training, reasoning, vision-language, data curation
MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU score 10
Featured in HF Daily Papers; HF popularity: 25 upvotes (+4); code available; keywords (1): throughput
General Multimodal Protein Design Enables DNA-Encoding of Chemistry score 10
Featured in HF Daily Papers; HF popularity: 21 upvotes (+4); code available; keywords (1): scaling
MedGemma 1.5 Technical Report score 6
Featured in HF Daily Papers; HF popularity: 9 upvotes (+2); keywords (1): reasoning