AI Paper Briefing
Agents Ignore the Answer Even When It Is Right in Front of Them
22 papers selected from 178
Top picks
SkillFlow: Benchmarking Lifelong Skill Discovery and Evolution for Autonomous Agents
score 9
Featured in HF Daily Papers; HF popularity: 15 upvotes (+3); code available; keywords (1): agentic
Agents Explore but Agents Ignore: LLMs Lack Environmental Curiosity
score 9
Institution: Cohere; featured in HF Daily Papers; HF popularity: 5 upvotes (+2); keywords (1): reasoning
When Background Matters: Breaking Medical Vision Language Models by Transferable Attack
score 6
Featured in HF Daily Papers; code available; keywords (3): fine-tuning, reasoning, vision-language
EvoMaster: A Foundational Evolving Agent Framework for Agentic Science at Scale
score 6
Featured in HF Daily Papers; code available; keywords (1): agentic
The Continuity Layer: Why Intelligence Needs an Architecture for What It Carries Forward
score 6
Featured in HF Daily Papers; code available; keywords (1): agentic
HSG: Hyperbolic Scene Graph
score 6
Featured in HF Daily Papers; code available; keywords (1): reasoning
Back to Repair: A Minimal Denoising Network for Time Series Anomaly Detection
score 5
Featured in HF Daily Papers; code available
Also worth noting
LookasideVLN: Direction-Aware Aerial Vision-and-Language Navigation
score 4
Keywords (2): lightweight, reasoning; accepted at top venue: CVPR
Are Emotion and Rhetoric Neurons in LLM? Neuron Recognition and Adaptive Masking for Emotion-Rhetoric Prediction Steering
score 4
Keywords (1): reasoning; accepted at top venue: ACL
Depth Adaptive Efficient Visual Autoregressive Modeling
score 4
Keywords (1): pruning; accepted at top venue: CVPR
A Survey of Reinforcement Learning for Large Language Models under Data Scarcity: Challenges and Solutions
score 7
Institution: Peking University; keywords (2): post-training, reasoning; accepted at top venue: ACL
Calibrated? Not for Everyone: How Sexual Orientation and Religious Markers Distort LLM Accuracy and Confidence in Medical QA
score 4
Keywords (1): deployment; accepted at top venue: ACL
AnchorMem: Anchored Facts with Associative Contexts for Building Memory in Large Language Models
score 4
Institution: Tsinghua; keywords (1): open-source
Speculative Decoding for Autoregressive Video Generation
score 4
Institution: Tsinghua; keywords (2): distillation, serving
PBSBench: A Multi-Level Vision-Language Framework and Benchmark for Hematopathology Whole Slide Image Interpretation
score 4
Keywords (3): instruction tuning, reasoning, vision-language; accepted at top venue: CVPR
ThreadSumm: Summarization of Nested Discourse Threads Using Tree of Thoughts
score 4
Keywords (1): reasoning; accepted at top venue: ACL
Modeling Multi-Dimensional Cognitive States in Large Language Models under Cognitive Crowding
score 3
Accepted at top venue: ACL
Cognitive Policy-Driven LLM for Diagnosis and Intervention of Cognitive Distortions in Emotional Support Conversation
score 3
Accepted at top venue: ACL
Rethinking Meeting Effectiveness: A Benchmark and Framework for Temporal Fine-grained Automatic Meeting Effectiveness Evaluation
score 3
Accepted at top venue: ACL
From Admission to Invariants: Measuring Deviation in Delegated Agent Systems
score 3
Institution: FAIR
Contraction and Hourglass Persistence for Learning on Graphs, Simplices, and Cells
score 3
Accepted at top venue: ICLR
MAPLE: A Meta-learning Framework for Cross-Prompt Essay Scoring
score 3
Accepted at top venue: ACL