AI Research Brief
The Rulers We Use to Measure What Models Really Think Are Broken
23 selected from 232 papers
Featured
Faithfulness Metrics Don't Measure Faithfulness: A Meta-Evaluation with Ground Truth
score 11
Institution: Google; selected for HF Daily Papers; HF popularity: 12 upvotes (+3); code available
Your Embedding Model is SMARTer Than You Think
score 10
Selected for HF Daily Papers; HF popularity: 23 upvotes (+4); code available; keywords (4): lightweight, finetuning, post-training, open source
DarkForest: Less Talk, Higher Accuracy for Multi-Agent LLMs
score 8
Selected for HF Daily Papers; HF popularity: 7 upvotes (+2); code available; keywords (2): latency, reasoning
Injecting Image Guidance into Text-Conditioned Diffusion Models at Inference
score 7
Selected for HF Daily Papers; HF popularity: 4 upvotes (+1); code available; keywords (3): lightweight, fine-tuning, text-to-image
STREAM: A Data-Centric Framework for Mining High-Value Task-Oriented Dialogues from Streaming Media
score 6
Selected for HF Daily Papers; code available; keywords (2): retrieval-augmented, RAG
Geometry-Aware Image Flow Matching
score 5
Selected for HF Daily Papers; HF popularity: 9 upvotes (+2)
SimuWoB: Simulating Real-World Mobile Apps for Fast and Faithful GUI Agent Benchmarking
score 5
Selected for HF Daily Papers; HF popularity: 2 upvotes (+1); keywords (1): open-source
Directional Alignment Mitigates Reward Hacking in Reinforcement Learning for Language Models
score 5
Selected for HF Daily Papers; HF popularity: 2 upvotes (+1); keywords (1): reasoning
Also Worth Noting
AOEPT: Breaking the Implicit Modality-Reduction Bottleneck in Modality-Missing Prompt Tuning
score 4
Keywords (3): lightweight, serving, reasoning; accepted at top venue: ICML
Geo-Expert: Towards Expert-Level Geological Reasoning via Parameter-Efficient Fine-Tuning
score 4
Keywords (4): scaling, deployment, fine-tuning, reasoning; accepted at top venue: ICML
When Does Adaptive Guidance Help? Belief-Aware Privileged Distillation for Autonomous Driving Under Partial Observability
score 4
Keywords (1): distillation; accepted at top venue: CVPR
Clustering as Reasoning: A $k$-Means Interpretation of Chain-of-Thought Graph Learning
score 4
Keywords (1): reasoning; accepted at top venue: ICML
Efficient DP-SGD for LLMs with Randomized Clipping
score 4
Keywords (1): fine-tuning; accepted at top venue: ICML
ProActor: Timing-Aware Reinforcement Learning for Proactive Task Scheduling Agents
score 4
Keywords (1): GRPO; accepted at top venue: ACL
NITP: Next Implicit Token Prediction for LLM Pre-training
score 4
Keywords (2): pre-training, MoE; accepted at top venue: ICML
Three-Step Conditional Diffusion 3D Reconstruction for Light-Field Microscopy
score 4
Keywords (2): lightweight, real-time; accepted at top venue: CVPR
Furina: Fragmented Uncertainty-Driven Refusal Instability Attack
score 4
Institution: Princeton; keywords (1): jailbreak
Language Bias in LVLMs: From In-Depth Analysis to Simple and Effective Mitigation
score 4
Keywords (3): DPO, instruction tuning, vision-language; accepted at top venue: ICML
Scale When Needed: Adaptive Neuron-level Mixed Precision Quantization Aware Training
score 4
Keywords (3): compression, quantization, edge; accepted at top venue: ICML
Blocked Gibbs meets Diffusion Transformers: Unsupervised Learning for Constraint Optimization
score 4
Institution: University of Toronto; keywords (1): reasoning
HCL-FF: Hierarchical and Contrastive Learning for Forward-Forward Algorithm
score 3
Accepted at top venue: CVPR
Unifying Value Alignment and Assignment in Cross-Domain Offline Reinforcement Learning with Heterogeneous Datasets
score 3
Accepted at top venue: ICML
Large Language Model Selection with Limited Annotations
score 3
Institution: Oxford