AI Research Brief
An 8B Model Beats a 235B One at Science Reasoning
8 selected from 167 papers
Also Worth Noting
Mat-Pref: Verifiable-Reward Training Improves Compositional Reasoning in Inorganic Materials
score 4
Keywords (3): fine-tuning, GRPO, reasoning; Accepted at top venue: ICML
Beyond Value Benchmarks: Measuring Value-Structure Alignment in Large Language Models via Symmetric Q-Sorts
score 4
Keywords (1): reasoning; Accepted at top venue: ACL
Denoising-Enhanced Coarse-to-Fine Infrared Small Target Detection with Attention Prior-Guided Knowledge Distillation
score 4
Keywords (3): lightweight, distillation, real-time; Accepted at top venue: ECCV
Provably Efficient Policy-Reward Co-Pretraining for Adversarial Imitation Learning
score 4
Keywords (1): pretraining; Accepted at top venue: ICML
Drowning in Routine: Signal Dilution in Multi-Turn Agent Training
score 4
Institution: Mila; Keywords (2): scaling, GRPO
Multi4D: High-Fidelity Dynamic Gaussian Splatting via Multi-Level Competitive Allocation
score 4
Keywords (1): real-time; Accepted at top venue: ECCV
Beyond Flat Labels: Level-Restricted Contrastive Learning for Hierarchical Fine-Grained Vision Classification
score 3
Accepted at top venue: CVPR
Residue-Level Attributions in Protein Language Models Do Not Recover Allergen Epitopes
score 3
Institution: ETH Zurich