-
Baichuan-M3: Modeling Clinical Inquiry for Reliable Medical Decision-Making
score 8
Featured in HF Daily Papers; HF popularity: 17 upvotes (+3); keywords (2): reasoning, safety
-
Judging What We Cannot Solve: A Consequence-Based Approach for Oracle-Free Evaluation of Research-Level Math
score 8
Featured in HF Daily Papers; HF popularity: 13 upvotes (+3); keywords (2): reasoning, evaluation
-
POINTS-GUI-G: GUI-Grounding Journey
score 8
Featured in HF Daily Papers; HF popularity: 10 upvotes (+3); keywords (7): inference, fine-tuning, fine-tune, agents, reasoning
-
DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos
score 7
Featured in HF Daily Papers; HF popularity: 6 upvotes (+2); keywords (8): distillation, real-time, post-training, pretraining, agents
-
PlanViz: Evaluating Planning-Oriented Image Generation and Editing for Computer-Use Tasks
score 7
Featured in HF Daily Papers; HF popularity: 5 upvotes (+2); keywords (6): efficiency, reasoning, planning, multimodal, benchmark
-
InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning
score 7
Featured in HF Daily Papers; HF popularity: 5 upvotes (+2); keywords (7): scaling, efficient, efficiency, inference, latency
-
F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare
score 6
Featured in HF Daily Papers; HF popularity: 4 upvotes (+1); keywords (4): scaling, lightweight, GRPO, cost
-
SEMA: Simple yet Effective Learning for Multi-Turn Jailbreak Attacks
score 6
Featured in HF Daily Papers; HF popularity: 3 upvotes (+1); keywords (7): fine-tuning, DPO, alignment, preference, open-source
-
Revisiting the Shape Convention of Transformer Language Models
score 5
Featured in HF Daily Papers; keywords (3): efficient, transformer, attention
-
VowelPrompt: Hearing Speech Emotions from Text via Vowel-level Prosodic Augmentation
score 5
Keywords (6): fine-tuning, GRPO, reasoning, multimodal, speech; accepted at top-tier conference: ICLR
-
MPIB: A Benchmark for Medical Prompt Injection Attacks and Clinical Safety in LLMs
score 2
Keywords (5): retrieval-augmented, RAG, benchmark, evaluation, safety
-
RoPE-LIME: RoPE-Space Locality + Sparse-K Sampling for Efficient LLM Attribution
score 2
Keywords (3): efficient, reasoning, open-source
-
An Interpretable Vision Transformer as a Fingerprint-Based Diagnostic Aid for Kabuki and Wiedemann-Steiner Syndromes
score 2
Keywords (2): transformer, attention
-
SOCKET: SOft Collision Kernel EsTimator for Sparse Attention
score 2
Keywords (6): scaling, efficient, inference, throughput, attention
-
MMEarth-Bench: Global Model Adaptation via Multimodal Test-Time Training
score 2
Keywords (3): pretraining, multimodal, benchmark
-
Do LLMs Act Like Rational Agents? Measuring Belief Coherence in Probabilistic Decision Making
score 2
Keywords (2): agent, agents
-
Zero-shot Multi-Contrast Brain MRI Registration by Intensity Randomizing T1-weighted MRI (LUMIR25)
score 2
Keywords (3): lightweight, inference, multimodal
-
Accelerating Vision Transformers on Brain Processing Unit
score 2
Keywords (5): efficient, deployment, inference, fine-tuning, transformer
-
Lost in Speech: Benchmarking, Evaluation, and Parsing of Spoken Code-Switching Beyond Standard UD Assumptions
score 2
Keywords (4): agentic, speech, benchmark, evaluation
-
The Condensate Theorem: Transformers are O(n), Not $O(n^2)$
score 2
Keywords (2): inference, attention
-
Exposing Weaknesses of Large Reasoning Models through Graph Algorithm Problems
score 2
Keywords (3): reasoning, benchmark, evaluation
-
Online Adaptive Reinforcement Learning with Echo State Networks for Non-Stationary Dynamics
score 2
Keywords (7): efficiency, lightweight, deployment, real-time, edge
-
Halt the Hallucination: Decoupling Signal and Semantic OOD Detection Based on Cascaded Early Rejection
score 2
Keywords (4): efficient, inference, benchmark, safety
-
Don't Break the Boundary: Continual Unlearning for OOD Detection Based on Free Energy Repulsion
score 2
Keywords (4): efficient, fine-tuning, cost, safety
-
Taming SAM3 in the Wild: A Concept Bank for Open-Vocabulary Segmentation
score 2
Keywords (2): efficiency, alignment