Agent Papers
Updated on 2026.04.03
| Publish Date | Title | Authors | arXiv ID |
|---|---|---|---|
| 2026-04-02 | Glia: A Human-Inspired AI for Automated Systems Design and Optimization | Pouya Hamadanian et.al. | 2510.27176 |
| 2026-04-01 | Universal YOCO for Efficient Depth Scaling | Yutao Sun et.al. | 2604.01220 |
| 2026-03-30 | Heddle: A Distributed Orchestration System for Agentic RL Rollout | Zili Zhang et.al. | 2603.28101 |
| 2026-03-30 | Multi-Agent Home Energy Management Assistant | Wooyoung Jung et.al. | 2602.15219 |
| 2026-03-26 | Large Language Models as Optimization Controllers: Adaptive Continuation for SIMP Topology Optimization | Shaoliang Yang et.al. | 2603.25099 |
| 2026-03-25 | LMetric: Simple is Better - Multiplication May Be All You Need for LLM Request Scheduling | Dingyan Zhang et.al. | 2603.15202 |
| 2026-03-25 | Language-Grounded Multi-Agent Planning for Personalized and Fair Participatory Urban Sensing | Xusen Guo et.al. | 2603.24014 |
| 2026-03-23 | Chimera: Latency- and Performance-Aware Multi-agent Serving for Heterogeneous LLMs | Kangqi Ni et.al. | 2603.22206 |
| 2026-03-23 | AI Co-Scientist for Ranking: Discovering Novel Search Ranking Models alongside LLM-based AI Agents with Cloud Computing Access | Liwei Wu et.al. | 2603.22376_(ACL) |
| 2026-03-22 | The Workload-Router-Pool Architecture for LLM Inference Optimization: A Vision Paper from the vLLM Semantic Router Project | Huamin Chen et.al. | 2603.21354 |
| 2026-03-22 | Improving Coherence and Persistence in Agentic AI for System Optimization | Pantea Karimi et.al. | 2603.21321 |
| 2026-03-22 | CALVO: Improve Serving Efficiency for LLM Inferences with Intense Network Demands | Weiye Wang et.al. | 2603.21257 |
| 2026-03-18 | IEMAS: An Incentive-Efficiency Routing Framework for Open Agentic Web Ecosystems | Hongze Liu et.al. | 2603.17302 |
| 2026-03-17 | EfficientNav: Towards On-Device Object-Goal Navigation with Navigation Map Caching and Retrieval | Zebin Yang et.al. | 2510.18546_(NeurIPS) |
| 2026-03-17 | Efficient LLM Serving for Agentic Workflows: A Data Systems Perspective | Noppanat Wadlom et.al. | 2603.16104 |
| 2026-03-17 | MetaClaw: Just Talk – An Agent That Meta-Learns and Evolves in the Wild | Peng Xia et.al. | 2603.17187 |
| 2026-03-15 | OxyGen: Unified KV Cache Management for Vision-Language-Action Models under Multi-Task Parallelism | Xiangyu Li et.al. | 2603.14371 |
| 2026-03-14 | ATCC: Adaptive Concurrency Control for Unforeseen Agentic Transactions | Weixing Zhou et.al. | 2603.13906 |
| 2026-03-14 | Retrieve, Schedule, Reflect: LLM Agents for Chip QoR Optimization | Yikang Ouyang et.al. | 2603.13767 |
| 2026-03-14 | Justitia: Fair and Efficient Scheduling of Task-parallel LLM Agents with Selective Pampering | Mingyan Yang et.al. | 2510.17015 |
| 2026-03-13 | AgentRM: An OS-Inspired Resource Manager for LLM Agent Systems | Jianshu She et.al. | 2603.13110 |
| 2026-03-13 | ARL-Tangram: Unleash the Resource Efficiency in Agentic Reinforcement Learning | Bangjun Xiao et.al. | 2603.13019 |
| 2026-03-13 | Orla: A Library for Serving LLM-Based Multi-Agent Systems | Rana Shahout et.al. | 2603.13605 |
| 2026-03-12 | Slow-Fast Inference: Training-Free Inference Acceleration via Within-Sentence Support Stability | Xingyu Xie et.al. | 2603.12038 |
| 2026-03-11 | Beyond Max Tokens: Stealthy Resource Amplification via Tool Calling Chains in LLM Agents | Kaiyu Zhou et.al. | 2601.10955 |
| 2026-03-11 | BLITZRANK: Principled Zero-shot Ranking Agents with Tournament Graphs | Sheshansh Agrawal et.al. | 2602.05448 |
| 2026-03-10 | ThunderAgent: A Simple, Fast and Program-Aware Agentic Inference System | Hao Kang et.al. | 2602.13692 |
| 2026-03-10 | FlexServe: A Fast and Secure LLM Serving System for Mobile Devices with Flexible Resource Isolation | Yinpeng Wu et.al. | 2603.09046 |
| 2026-03-09 | Advancing Automated Algorithm Design via Evolutionary Stagewise Design with LLMs | Chen Lu et.al. | 2603.07970 |
| 2026-03-09 | Not All Prefills Are Equal: PPD Disaggregation for Multi-turn LLM Serving | Zongze Li et.al. | 2603.13358 |
| 2026-03-09 | Can LLMs Perceive Time? An Empirical Investigation | Aniketh Garikaparthi et.al. | 2604.00010_(ICLR) |
| 2026-03-08 | Quine: Realizing LLM Agents as Native POSIX Processes | Hao Ke et.al. | 2603.18030 |
| 2026-03-06 | Characterizing Faults in Agentic AI: A Taxonomy of Types, Symptoms, and Root Causes | Mehil B Shah et.al. | 2603.06847 |
| 2026-03-06 | MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens | Yu Chen et.al. | 2603.23516 |
| 2026-03-05 | PerfGuard: A Performance-Aware Agent for Visual Content Generation | Zhipeng Chen et.al. | 2601.22571_(FG) |
| 2026-03-04 | ChatNeuroSim: An LLM Agent Framework for Automated Compute-in-Memory Accelerator Deployment and Optimization | Ming-Yen Lee et.al. | 2603.08745 |
| 2026-03-03 | ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution | Liu Yang et.al. | 2603.02510 |
| 2026-03-03 | CraniMem: Cranial Inspired Gated and Bounded Memory for Agentic Systems | Pearl Mody et.al. | 2603.15642 |
| 2026-03-02 | SimuHome: A Temporal- and Environment-Aware Benchmark for Smart Home LLM Agents | Gyuhyeon Seo et.al. | 2509.24282_(ICLR) |
| 2026-03-02 | Solving the Granularity Mismatch: Hierarchical Preference Learning for Long-Horizon LLM Agents | Heyang Gao et.al. | 2510.03253_(ICLR) |
| 2026-03-02 | HeRo: Adaptive Orchestration of Agentic RAG on Heterogeneous Mobile SoC | Maoliang Li et.al. | 2603.01661_(DAC) |
| 2026-03-01 | MagicAgent: Towards Generalized Agent Planning | Xuhui Ren et.al. | 2602.19000 |
| 2026-02-28 | ZeroDVFS: Zero-Shot LLM-Guided Core and Frequency Allocation for Embedded Platforms | Mohammad Pivezhandi et.al. | 2601.08166 |
| 2026-02-28 | RelayCaching: Accelerating LLM Collaboration via Decoding KV Cache Reuse | Yingsheng Geng et.al. | 2603.13289 |
| 2026-02-27 | Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis | Pengfei Zhang et.al. | 2602.15909_(ICLR) |
| 2026-02-27 | SideQuest: Model-Driven KV Cache Management for Long-Horizon Agentic Reasoning | Sanjay Kariyappa et.al. | 2602.22603 |
| 2026-02-27 | Efficient Long-Horizon GUI Agents via Training-Free KV Cache Compression | Bowen Zhou et.al. | 2603.00188 |
| 2026-02-27 | Verifier-Bound Communication for LLM Agents: Certified Bounds on Covert Signaling | Om Tailor et.al. | 2603.00381 |
| 2026-02-27 | ICaRus: Identical Cache Reuse for Efficient Multi Model Inference | Sunghyeon Woo et.al. | 2603.13281 |
| 2026-02-26 | DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference | Yongtong Wu et.al. | 2602.21548 |
| 2026-02-25 | Two-Stage Active Distribution Network Voltage Control via LLM-RL Collaboration: A Hybrid Knowledge-Data-Driven Approach | Xu Yang et.al. | 2602.21715 |
| 2026-02-24 | Towards Efficient Agents: A Co-Design of Inference Architecture and System | Weizhe Lin et.al. | 2512.18337 |
| 2026-02-24 | Hierarchical Decision Mamba Meets Agentic AI: A Novel Approach for RAN Slicing in 6G | Md Arafat Habib et.al. | 2512.23502_(Networking) |
| 2026-02-24 | ST-EVO: Towards Generative Spatio-Temporal Evolution of Multi-Agent Communication Topologies | Xingjian Wu et.al. | 2602.14681 |
| 2026-02-24 | Architecting AgentOS: From Token-Level Context to Emergent System-Level Intelligence | ChengYou Li et.al. | 2602.20934 |
| 2026-02-23 | ContextPilot: Fast Long-Context Inference via Context Reuse | Yinsicheng Jiang et.al. | 2511.03475 |
| 2026-02-22 | KVComm: Enabling Efficient LLM Communication through Selective KV Sharing | Xiangyu Shi et.al. | 2510.03346_(ICLR) |
| 2026-02-20 | Aurora: Neuro-Symbolic AI Driven Advising Agent | Lorena Amanda Quincoso Lugones et.al. | 2602.17999 |
| 2026-02-20 | Alignment in Time: Peak-Aware Orchestration for Long-Horizon Agentic Systems | Hanjing Shi et.al. | 2602.17910 |
| 2026-02-19 | KLong: Training LLM Agent for Extremely Long-horizon Tasks | Yue Liu et.al. | 2602.17547 |
| 2026-02-19 | Position: AI Agents Are Not (Yet) a Panacea for Social Simulation | Yiming Li et.al. | 2603.00113 |
| 2026-02-19 | ARKV: Adaptive and Resource-Efficient KV Cache Management under Limited Memory Budget for Long-Context Inference in LLMs | Jianlong Lei et.al. | 2603.08727 |
| 2026-02-17 | Agent Memory Below the Prompt: Persistent Q4 KV Cache for Multi-Agent LLM Inference on Edge Devices | Yakov Pyotr Shkolnikov et.al. | 2603.04428 |
| 2026-02-16 | OpenAgentSafety: A Comprehensive Framework for Evaluating Real-World AI Agent Safety | Sanidhya Vijayvargiya et.al. | 2507.06134_(ICLR) |
| 2026-02-16 | Efficient Multi-round LLM Inference over Disaggregated Serving | Wenhao He et.al. | 2602.14516 |
| 2026-02-15 | HyMem: Hybrid Memory Architecture with Dynamic Retrieval Scheduling | Xiaochen Zhao et.al. | 2602.13933 |
| 2026-02-12 | PrefillShare: A Shared Prefill Module for KV Reuse in Multi-LLM Disaggregated Serving | Sunghyeon Woo et.al. | 2602.12029 |
| 2026-02-12 | Deep Kernel Fusion for Transformers | Zixi Zhang et.al. | 2602.11808 |
| 2026-02-12 | IMAGAgent: Orchestrating Multi-Turn Image Editing via Constraint-Aware Planning and Reflection | Fei Shen et.al. | 2603.29602 |
| 2026-02-10 | AOI: Context-Aware Multi-Agent Operations via Dynamic Scheduling and Hierarchical Memory Compression | Zishan Bai et.al. | 2512.13956 |
| 2026-02-10 | Learning to Evict from Key-Value Cache | Luca Moschella et.al. | 2602.10238 |
| 2026-02-10 | ATPO: Adaptive Tree Policy Optimization for Multi-Turn Medical Dialogue | Ruike Cao et.al. | 2603.02216_(ICLR) |
| 2026-02-09 | Dr. MAS: Stable Reinforcement Learning for Multi-Agent LLM Systems | Lang Feng et.al. | 2602.08847 |
| 2026-02-08 | DeltaKV: Residual-Based KV Cache Compression via Long-Range Similarity | Jitai Hao et.al. | 2602.08005 |
| 2026-02-06 | Lemon Agent Technical Report | Haipeng Jiang et.al. | 2602.07092 |
| 2026-02-05 | AriadneMem: Threading the Maze of Lifelong Memory for LLM Agents | Wenhui Zhu et.al. | 2603.03290 |
| 2026-02-03 | Cortex: Achieving Low-Latency, Cost-Efficient Remote Data Access For LLM via Semantic-Aware Knowledge Caching | Chaoyi Ruan et.al. | 2509.17360 |
| 2026-02-03 | ENGRAM: Effective, Lightweight Memory Orchestration for Conversational Agents | Daivik Patel et.al. | 2511.12960 |
| 2026-02-03 | LegalOne: A Family of Foundation Models for Reliable Legal Reasoning | Haitao Li et.al. | 2602.00642 |
| 2026-02-03 | Agent Primitives: Reusable Latent Building Blocks for Multi-Agent Systems | Haibo Jin et.al. | 2602.03695 |
| 2026-02-02 | CoMeT: Collaborative Memory Transformer for Efficient Long Context Modeling | Runsong Zhao et.al. | 2602.01766 |
| 2026-02-02 | AgentRAN: An Agentic AI Architecture for Autonomous Control of Open 6G Networks | Maxime Elkael et.al. | 2508.17778 |
| 2026-02-02 | Implicit Bias in LLMs for Transgender Populations | Micaela Hirsch et.al. | 2602.13253 |
| 2026-02-01 | PaperDebugger: A Plugin-Based Multi-Agent System for In-Editor Academic Writing, Review, and Editing | Junyi Hou et.al. | 2512.02589 |
| 2026-02-01 | LRAgent: Efficient KV Cache Sharing for Multi-LoRA LLM Agents | Hyesung Jeon et.al. | 2602.01053 |
| 2026-01-31 | FastTTS: Accelerating Test-Time Scaling for Edge LLM Reasoning | Hao Mark Chen et.al. | 2509.00195_(ASPLOS) |
| 2026-01-30 | Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live | Hanchen Li et.al. | 2511.02230 |
| 2026-01-30 | Can Transformer Memory Be Corrupted? Investigating Cache-Side Vulnerabilities in Large Language Models | Elias Hossain et.al. | 2510.17098 |
| 2026-01-30 | CONCUR: High-Throughput Agentic Batch Inference of LLM via Congestion-Based Concurrency Control | Qiaoling Chen et.al. | 2601.22705 |
| 2026-01-29 | Emergent Coordination in Multi-Agent Systems via Pressure Fields and Temporal Decay | Roland Rodriguez et.al. | 2601.08129 |
| 2026-01-29 | CORE: Toward Ubiquitous 6G Intelligence Through Collaborative Orchestration of Large Language Model Agents Over Hierarchical Edge | Zitong Yu et.al. | 2601.21822 |
| 2026-01-29 | Heterogeneous Computing: The Key to Powering the Future of AI Agent Inference | Yiren Zhao et.al. | 2601.22001 |
| 2026-01-28 | PEARL: Self-Evolving Assistant for Time Management with Reinforcement Learning | Bingxuan Li et.al. | 2601.11957 |
| 2026-01-27 | GTA: Generative Traffic Agents for Simulating Realistic Mobility Behavior | Simon Lämmer et.al. | 2601.16778_(CHI) |
| 2026-01-21 | IB-GRPO: Aligning LLM-based Learning Path Recommendation with Educational Objectives via Indicator-Based Group Relative Policy Optimization | Shuai Wang et.al. | 2601.14686 |
| 2026-01-20 | LLMOrbit: A Circular Taxonomy of Large Language Models - From Scaling Walls to Agentic AI Systems | Badri N. Patro et.al. | 2601.14053 |
| 2026-01-20 | IGAA: Intent-Driven General Agentic AI for Edge Services Scheduling using Generative Meta Learning | Yan Sun et.al. | 2601.13702 |
| 2026-01-19 | Batch Query Processing and Optimization for Agentic Workflows | Junyi Shen et.al. | 2509.02121 |
| 2026-01-19 | Sutradhara: An Intelligent Orchestrator-Engine Co-design for Tool-based Agentic Inference | Anish Biswas et.al. | 2601.12967 |
| 2026-01-19 | IntAgent: NWDAF-Based Intent LLM Agent Towards Advanced Next Generation Networks | Abdelrahman Soliman et.al. | 2601.13114 |
| 2026-01-18 | Rethinking the Value of Multi-Agent Workflow: A Strong Single Agent Baseline | Jiawei Xu et.al. | 2601.12307 |
| 2026-01-16 | AstroReason-Bench: Evaluating Unified Agentic Planning across Heterogeneous Space Planning Problems | Weiyi Wang et.al. | 2601.11354 |
| 2026-01-13 | StackPilot: Autonomous Function Agents for Scalable and Environment-Free Code Execution | Xinkui Zhao et.al. | 2508.11665 |
| 2026-01-13 | When KV Cache Reuse Fails in Multi-Agent Systems: Cross-Candidate Interaction is Crucial for LLM Judges | Sichu Liang et.al. | 2601.08343 |
| 2026-01-13 | Unleashing Tool Engineering and Intelligence for Agentic AI in Next-Generation Communication Networks | Yinqiu Liu et.al. | 2601.08259 |
| 2026-01-12 | OpenTinker: Separating Concerns in Agentic Reinforcement Learning | Siqi Zhu et.al. | 2601.07376 |
| 2026-01-11 | RealMem: Benchmarking LLMs in Real-World Memory-Driven Interaction | Haonan Bian et.al. | 2601.06966 |
| 2026-01-09 | FlashMem: Distilling Intrinsic Latent Memory via Computation Reuse | Yubo Hou et.al. | 2601.05505 |
| 2026-01-08 | Nalar: An agent serving framework | Marco Laju et.al. | 2601.05109 |
| 2026-01-07 | NEMO-4-PAYPAL: Leveraging NVIDIA’s Nemo Framework for empowering PayPal’s Commerce Agent | Sudhanshu Garg et.al. | 2512.21578 |
| 2026-01-06 | Agent.xpu: Efficient Scheduling of Agentic LLM Workloads on Heterogeneous SoC | Xinming Wei et.al. | 2506.24045 |
| 2026-01-03 | Chimera: Harnessing Multi-Agent LLMs for Automatic Insider Threat Simulation | Jiongchi Yu et.al. | 2508.07745_(NDSS) |
| 2026-01-03 | Warp-Cortex: An Asynchronous, Memory-Efficient Architecture for Million-Agent Cognitive Scaling on Consumer Hardware | Jorge L. Ruiz Williams et.al. | 2601.01298 |
| 2026-01-03 | ReliabilityBench: Evaluating LLM Agent Reliability Under Production-Like Stress Conditions | Aayush Gupta et.al. | 2601.06112 |
| 2025-12-31 | Context-aware LLM-based AI Agents for Human-centered Energy Management Systems in Smart Buildings | Tianzhi He et.al. | 2512.25055 |
| 2025-12-30 | SmartFlow Reinforcement Learning and Agentic AI for Bike-Sharing Optimisation | Aditya Sreevatsa K et.al. | 2601.00868 |
| 2025-12-28 | Accelerating Language Model Workflows with Prompt Choreography | TJ Bai et.al. | 2512.23049_(ACL) |
| 2025-12-27 | Agentic Auto-Scheduling: An Experimental Study of LLM-Guided Loop Optimization | Massinissa Merouani et.al. | 2511.00592_(CHI) |
| 2025-12-27 | AgentMath: Empowering Mathematical Reasoning for Large Language Models via Tool-Augmented Agent | Haipeng Luo et.al. | 2512.20745 |
| 2025-12-24 | V-Rex: Real-Time Streaming Video LLM Acceleration via Dynamic KV Cache Retrieval | Donghyuk Kim et.al. | 2512.12284_(HPCA) |
| 2025-12-22 | JITServe: SLO-aware LLM Serving with Imprecise Request Information | Wei Zhang et.al. | 2504.20068 |
| 2025-12-21 | IntelliCode: A Multi-Agent LLM Tutoring System with Centralized Learner Modeling | Jones David et.al. | 2512.18669_(ACL) |
| 2025-12-21 | Does It Tie Out? Towards Autonomous Legal Agents in Venture Capital | Pierre Colombo et.al. | 2512.18658 |
| 2025-12-18 | MEPIC: Memory Efficient Position Independent Caching for LLM Serving | Qian Wang et.al. | 2512.16822 |
| 2025-12-16 | SemShareKV: Efficient KVCache Sharing for Semantically Similar Prompts via Token-Level LSH Matching | Xinye Zhao et.al. | 2509.24832 |
| 2025-12-16 | Astraea: A State-Aware Scheduling Engine for LLM-Powered Agents | Hongqiu Ni et.al. | 2512.14142 |
| 2025-12-16 | Model-First Reasoning LLM Agents: Reducing Hallucinations through Explicit Problem Modeling | Annu Rana et.al. | 2512.14474 |
| 2025-12-15 | Beyond Training: Enabling Self-Evolution of Agents with MOBIMEM | Zibin Liu et.al. | 2512.15784 |
| 2025-12-12 | Aragog: Just-in-Time Model Routing for Scalable Serving of Agentic Workflows | Yinwei Dai et.al. | 2511.20975 |
| 2025-12-08 | EvoMem: Improving Multi-Agent Planning with Dual-Evolving Memory | Wenzhe Fan et.al. | 2511.01912 |
| 2025-12-07 | Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning | Yulei Qin et.al. | 2509.22601 |
| 2025-12-05 | MARINE: Theoretical Optimization and Design for Multi-Agent Recursive IN-context Enhancement | Hongwei Zhang et.al. | 2512.07898 |
| 2025-12-03 | KVNAND: Efficient On-Device Large Language Model Inference Using DRAM-Free In-Flash Computing | Lishuo Deng et.al. | 2512.03608 |
| 2025-12-02 | DynTaskMAS: A Dynamic Task Graph-driven Framework for Asynchronous and Parallel LLM-based Multi-Agent Systems | Junwei Yu et.al. | 2503.07675 |
| 2025-11-29 | A CPU-Centric Perspective on Agentic AI | Ritik Raj et.al. | 2511.00739 |
| 2025-11-28 | Beyond Curve Fitting: Neuro-Symbolic Agents for Context-Aware Epidemic Forecasting | Joongwon Chae et.al. | 2511.23276 |
| 2025-11-28 | LegalWebAgent: Empowering Access to Justice via LLM-Based Web Agents | Jinzhe Tan et.al. | 2512.04105 |
| 2025-11-27 | Optimizing NetGPT via Routing-Based Synergy and Reinforcement Learning | Yuxuan Chen et.al. | 2511.22217 |
| 2025-11-27 | Q-KVComm: Efficient Multi-Agent Communication Via Adaptive KV Cache Compression | Boris Kriuk et.al. | 2512.17914 |
| 2025-11-25 | Reducing Latency of LLM Search Agent via Speculation-based Algorithm-System Co-Design | Zixiao Huang et.al. | 2511.20048 |
| 2025-11-25 | Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation | Inferix Team et.al. | 2511.20714 |
| 2025-11-23 | Hybrid Agentic AI and Multi-Agent Systems in Smart Manufacturing | Mojtaba A. Farahani et.al. | 2511.18258 |
| 2025-11-18 | KnowCoder-A1: Incentivizing Agentic Reasoning Capability with Outcome Supervision for KBQA | Zhuo Chen et.al. | 2510.25101 |
| 2025-11-05 | ALAS: Transactional and Dynamic Multi-Agent LLM Planning | Longling Geng et.al. | 2511.03094 |
| 2025-11-05 | FREESH: Fair, Resource- and Energy-Efficient Scheduling for LLM Serving on Heterogeneous GPUs | Xuan He et.al. | 2511.00807_(ISS) |
| 2025-11-04 | LiveSecBench: A Dynamic and Culturally-Relevant AI Safety Benchmark for LLMs in Chinese Context | Yudong Li et.al. | 2511.02366 |
| 2025-11-04 | Optimal-Agent-Selection: State-Aware Routing Framework for Efficient Multi-Agent Collaboration | Jingbo Wang et.al. | 2511.02200 |
| 2025-11-04 | Using Span Queries to Optimize for Cache and Attention Locality | Paul Castro et.al. | 2511.02749 |
| 2025-11-03 | Scaling Graph Chain-of-Thought Reasoning: A Multi-Agent Framework with Efficient LLM Serving | Chengying Huan et.al. | 2511.01633 |
| 2025-11-03 | TPS-Bench: Evaluating AI Agents’ Tool Planning & Scheduling Abilities in Compounding Tasks | Hanwen Xu et.al. | 2511.01527 |
| 2025-11-02 | HEXGEN-FLOW: Optimizing LLM Inference Request Scheduling for Agentic Text-to-SQL | You Peng et.al. | 2505.05286 |
| 2025-11-01 | KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems | Hancheng Ye et.al. | 2510.12872_(FAST) |
| 2025-10-31 | Tokencake: A KV-Cache-centric Serving Framework for LLM-based Multi-Agent Applications | Zhuohang Bian et.al. | 2510.18586 |
| 2025-10-30 | Agentic AI Home Energy Management System: A Large Language Model Framework for Residential Load Scheduling | Reda El Makroum et.al. | 2510.26603 |
| 2025-10-28 | Pie: A Programmable Serving System for Emerging LLM Applications | In Gim et.al. | 2510.24051_(SOSP) |
| 2025-10-28 | Improving LLM Reasoning via Dependency-Aware Query Decomposition and Logic-Parallel Content Expansion | Xianjun Gao et.al. | 2510.24390 |
| 2025-10-28 | From Narrative to Action: A Hierarchical LLM-Agent Framework for Human Mobility Generation | Qiumeng Li et.al. | 2510.24802 |
| 2025-10-26 | SwiftSolve: A Self-Iterative, Complexity-Aware Multi-Agent Framework for Competitive Programming | Adhyayan Veer Singh et.al. | 2510.22626 |
| 2025-10-23 | Accelerating Mobile Language Model via Speculative Decoding and NPU-Coordinated Execution | Zhiyang Chen et.al. | 2510.15312 |
| 2025-10-22 | PTFA: An LLM-based Agent that Facilitates Online Consensus Building through Parallel Thinking | Wen Gu et.al. | 2503.12499 |
| 2025-10-21 | The Trust Paradox in LLM-Based Multi-Agent Systems: When Collaboration Becomes a Security Vulnerability | Zijie Xu et.al. | 2510.18563 |
| 2025-10-19 | STARK: Strategic Team of Agents for Refining Kernels | Juncheng Dong et.al. | 2510.16996 |
| 2025-10-18 | Ripple Effect Protocol: Coordinating Agent Populations | Ayush Chopra et.al. | 2510.16572 |
| 2025-10-16 | Terrarium: Revisiting the Blackboard for Multi-Agent Safety, Privacy, and Security Studies | Mason Nakamura et.al. | 2510.14312 |
| 2025-10-15 | Cortex: Workflow-Aware Resource Pooling and Scheduling for Agentic Serving | Nikos Pagonas et.al. | 2510.14126 |
| 2025-10-14 | Evaluating the Quality of Randomness and Entropy in Tasks Supported by Large Language Models | Rabimba Karanjai et.al. | 2510.12080 |
| 2025-10-13 | Part II: ROLL Flash – Accelerating RLVR and Agentic Training with Asynchrony | Han Lu et.al. | 2510.11345 |
| 2025-10-13 | ShishuLM: Lightweight Language Model with Hybrid Decoder-MLP Architecture and Paired Weight Sharing | Shivanshu Kumar et.al. | 2510.13860 |
| 2025-10-13 | StreamAgent: Towards Anticipatory Agents for Streaming Video Understanding | Haolin Yang et.al. | 2508.01875 |
| 2025-10-11 | Agentic Troubleshooting Guide Automation for Incident Management | Jiayi Mao et.al. | 2510.10074 |
| 2025-10-10 | OrcaLoca: An LLM Agent Framework for Software Issue Localization | Zhongming Yu et.al. | 2502.00350 |
| 2025-10-10 | StreamingVLM: Real-Time Understanding for Infinite Video Streams | Ruyi Xu et.al. | 2510.09608 |
| 2025-10-08 | FLEET: Formal Language-Grounded Scheduling for Heterogeneous Robot Teams | Corban Rivera et.al. | 2510.07417 |
| 2025-10-06 | Multi-Agent Collaborative Intelligence: Dual-Dial Control for Reliable LLM Reasoning | Edward Y. Chang et.al. | 2510.04488 |
| 2025-10-06 | ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering | Yuki Imajuku et.al. | 2506.09050_(NeurIPS) |
| 2025-10-03 | Automatic Building Code Review: A Case Study | Hanlong Wan et.al. | 2510.02634 |
| 2025-10-01 | GUI-KV: Efficient GUI Agents via KV Cache with Spatio-Temporal Awareness | Kung-Hsiang Huang et.al. | 2510.00536 |
| 2025-09-30 | Towards Agentic OS: An LLM Agent Framework for Linux Schedulers | Yusheng Zheng et.al. | 2509.01245 |
| 2025-09-30 | AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges | Ranjan Sapkota et.al. | 2505.10468 |
| 2025-09-27 | Runtime Adaptive Pruning for LLM Inference | Huanrong Liu et.al. | 2505.17138 |
| 2025-09-26 | LLM Assisted Alpha Fairness for 6 GHz WiFi and NR_U Coexistence: An Agentic Orchestrator for Throughput, Energy, and SLA | Qun Wang et.al. | 2510.17814 |
| 2025-09-26 | ProRe: A Proactive Reward System for GUI Agents via Reasoner-Actor Collaboration | Gaole Dai et.al. | 2509.21823 |
| 2025-09-25 | Nova: Real-Time Agentic Vision-Language Model Serving with Adaptive Cross-Stage Parallelization | Yuhang Xu et.al. | 2509.21301 |
| 2025-09-24 | CollaPipe: Adaptive Segment-Optimized Pipeline Parallelism for Collaborative LLM Training in Heterogeneous Edge Networks | Jiewei Chen et.al. | 2509.19855 |
| 2025-09-20 | Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games | Niv Eckhaus et.al. | 2506.05309 |
| 2025-09-19 | Overhearing LLM Agents: A Survey, Taxonomy, and Roadmap | Andrew Zhu et.al. | 2509.16325 |
| 2025-09-17 | CrowdAgent: Multi-Agent Managed Multi-Source Annotation System | Maosheng Qin et.al. | 2509.14030 |
| 2025-08-30 | LLM-Assisted Iterative Evolution with Swarm Intelligence Toward SuperBrain | Li Weigang et.al. | 2509.00510 |
| 2025-08-29 | Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic Thought Reward | Yong Deng et.al. | 2508.12800 |
| 2025-08-20 | Entropy-Constrained Strategy Optimization in Urban Floods: A Multi-Agent Framework with LLM and Knowledge Graph Integration | Peilin Ji et.al. | 2508.14654 |
| 2025-08-18 | Datarus-R1: An Adaptive Multi-Step Reasoning LLM for Automated Data Analysis | Ayoub Ben Chaliah et.al. | 2508.13382 |
| 2025-08-18 | AutoChemSchematic AI: Agentic Physics-Aware Automation for Chemical Manufacturing Scale-Up | Sakhinana Sagar Srinivas et.al. | 2505.24584 |
| 2025-08-11 | From Natural Language to Solver-Ready Power System Optimization: An LLM-Assisted, Validation-in-the-Loop Framework | Yunkai Hu et.al. | 2508.08147 |
| 2025-08-09 | Kairos: Low-latency Multi-Agent Serving with Shared LLMs and Excessive Loads in the Public Cloud | Jinyuan Chen et.al. | 2508.06948 |
| 2025-08-06 | AquaChat++: LLM-Assisted Multi-ROV Inspection for Aquaculture Net Pens with Integrated Battery Management and Thruster Fault Tolerance | Abdelhaleem Saad et.al. | 2508.06554 |
| 2025-08-05 | REALM-Bench: A Benchmark for Evaluating Multi-Agent Systems on Real-world, Dynamic Planning and Scheduling Tasks | Longling Geng et.al. | 2502.18836 |
| 2025-08-01 | CyGATE: Game-Theoretic Cyber Attack-Defense Engine for Patch Strategy Optimization | Yuning Jiang et.al. | 2508.00478 |
| 2025-07-29 | Forecasting LLM Inference Performance via Hardware-Agnostic Analytical Modeling | Rajeev Patwari et.al. | 2508.00904 |
| 2025-07-29 | StaffPro: an LLM Agent for Joint Staffing and Profiling | Alessio Maritan et.al. | 2507.21636 |
| 2025-07-21 | LLM Economist: Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra | Seth Karten et.al. | 2507.15815 |
| 2025-07-18 | DREAMS: Density Functional Theory Based Research Engine for Agentic Materials Simulation | Ziqi Wang et.al. | 2507.14267 |
| 2025-07-18 | CodeEdu: A Multi-Agent Collaborative Platform for Personalized Coding Education | Jianing Zhao et.al. | 2507.13814 |
| 2025-07-14 | DroidSpeak: KV Cache Sharing for Cross-LLM Communication and Multi-LLM Serving | Yuhan Liu et.al. | 2411.02820 |
| 2025-07-10 | KVFlow: Efficient Prefix Caching for Accelerating LLM-Based Multi-Agent Workflows | Zaifeng Pan et.al. | 2507.07400 |
| 2025-07-09 | Gradientsys: A Multi-Agent LLM Scheduler with ReAct Orchestration | Xinyuan Song et.al. | 2507.06520 |
| 2025-07-07 | StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling | Meng Wei et.al. | 2507.05240 |
| 2025-06-28 | FairMarket-RL: LLM-Guided Fairness Shaping for Multi-Agent Reinforcement Learning in Peer-to-Peer Markets | Shrenik Jadhav et.al. | 2506.22708 |
| 2025-06-26 | CitySim: Modeling Urban Behaviors and City Dynamics with Large-Scale LLM-Driven Agent Simulation | Nicolas Bougie et.al. | 2506.21805 |
| 2025-06-26 | MobiVerse: Scaling Urban Mobility Simulation with Hybrid Lightweight Domain-Specific Generator and Large Language Models | Yifan Liu et.al. | 2506.21784 |
| 2025-06-25 | MAGPIE: A dataset for Multi-AGent contextual PrIvacy Evaluation | Gurusha Juneja et.al. | 2506.20737 |
| 2025-06-17 | LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification | Penghui Yang et.al. | 2502.17421 |
| 2025-06-16 | AlphaEvolve: A coding agent for scientific and algorithmic discovery | Alexander Novikov et.al. | 2506.13131 |
| 2025-06-11 | SAFEFLOW: A Principled Protocol for Trustworthy and Transactional Autonomous Agent Systems | Peiran Li et.al. | 2506.07564 |
| 2025-06-09 | DeepServe: Serverless Large Language Model Serving at Scale | Junhao Hu et.al. | 2501.14417 |
| 2025-06-07 | EconEvals: Benchmarks and Litmus Tests for LLM Agents in Unknown Environments | Sara Fish et.al. | 2503.18825 |
| 2025-06-04 | AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance | Dhaval Patel et.al. | 2506.03828 |
| 2025-06-01 | A Survey of LLM × DATA | Xuanhe Zhou et.al. | 2505.18458 |
| 2025-05-28 | Design and testing of an agent chatbot supporting decision making with public transport data | Luca Fantin et.al. | 2505.22698 |
| 2025-05-26 | Task Memory Engine: Spatial Memory for Robust Multi-Step LLM Agents | Ye Ye et.al. | 2505.19436 |
| 2025-05-23 | Curriculum Guided Reinforcement Learning for Efficient Multi Hop Retrieval Augmented Generation | Yuelyu Ji et.al. | 2505.17391 |
| 2025-05-20 | Log-Augmented Generation: Scaling Test-Time Reasoning with Reusable Computation | Peter Baile Chen et.al. | 2505.14398 |
| 2025-05-19 | Learning Virtual Machine Scheduling in Cloud Computing through Language Agents | JieHao Wu et.al. | 2505.10117 |
| 2025-05-18 | ALAS: A Stateful Multi-LLM Agent Framework for Disruption-Aware Planning | Edward Y. Chang et.al. | 2505.12501 |
| 2025-05-17 | Demystifying and Enhancing the Efficiency of Large Language Model Based Search Agents | Tiannuo Yang et.al. | 2505.12065 |
| 2025-05-17 | OptimAI: Optimization from Natural Language Using LLM-Powered AI Agents | Raghav Thind et.al. | 2504.16918 |
| 2025-04-24 | Throughput-Optimal Scheduling Algorithms for LLM Inference and AI Agents | Yueying Li et.al. | 2504.07347 |
| 2025-04-21 | PLANET: A Collection of Benchmarks for Evaluating LLMs’ Planning Capabilities | Haoming Li et.al. | 2504.14773 |
| 2025-04-02 | MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding | Ranajoy Sadhukhan et.al. | 2408.11049 |
| 2025-04-01 | Personality-Driven Decision-Making in LLM-Based Autonomous Agents | Lewis Newsham et.al. | 2504.00727 |
| 2025-04-01 | HERA: Hybrid Edge-cloud Resource Allocation for Cost-Efficient AI Agents | Shiyi Liu et.al. | 2504.00434 |
| 2025-03-25 | Agent-Initiated Interaction in Phone UI Automation | Noam Kahlon et.al. | 2503.19537 |
| 2025-03-19 | Exploring Large Language Models for Word Games: Who is the Spy? | Chentian Wei et.al. | 2503.15235 |
| 2025-03-12 | COLA: A Scalable Multi-Agent Framework For Windows UI Task Automation | Di Zhao et.al. | 2503.09263 |
| 2025-03-11 | LLM4MAC: An LLM-Driven Reinforcement Learning Framework for MAC Protocol Emergence | Renxuan Tan et.al. | 2503.08123 |
| 2025-03-05 | Pretrained LLMs as Real-Time Controllers for Robot Operated Serial Production Line | Muhammad Waseem et.al. | 2503.03889 |
| 2025-02-28 | ARIES: Autonomous Reasoning with LLMs on Interactive Thought Graph Environments | Pedro Gimenes et.al. | 2502.21208 |
| 2025-02-27 | TripCraft: A Benchmark for Spatio-Temporally Fine Grained Travel Planning | Soumyabrata Chaudhuri et.al. | 2502.20508 |
| 2025-02-20 | Vending-Bench: A Benchmark for Long-Term Coherence of Autonomous Agents | Axel Backlund et.al. | 2502.15840 |
| 2025-02-20 | Plan-over-Graph: Towards Parallelable LLM Agent Schedule | Shiqi Zhang et.al. | 2502.14563 |
| 2025-02-19 | Autellix: An Efficient Serving Engine for LLM Agents as General Programs | Michael Luo et.al. | 2502.13965 |
| 2025-02-16 | Streamlining the Collaborative Chain of Models into A Single Forward Pass in Generation-Based Tasks | Yuanjie Lyu et.al. | 2502.11083 |
| 2025-02-06 | Division-of-Thoughts: Harnessing Hybrid Language Model Synergy for Efficient On-Device Agents | Chenyang Shao et.al. | 2502.04392 |
| 2025-01-29 | MACI: Multi-Agent Collaborative Intelligence for Adaptive Reasoning and Temporal Planning | Edward Y. Chang et.al. | 2501.16689 |
| 2025-01-27 | LLM-powered Multi-agent Framework for Goal-oriented Learning in Intelligent Tutoring System | Tianfu Wang et.al. | 2501.15749_(WWW) |
| 2025-01-14 | CuAsmRL: Optimizing GPU SASS Schedules via Deep Reinforcement Learning | Guoliang He et.al. | 2501.08071_(CGO) |
| 2024-12-24 | TimelyLLM: Segmented LLM Serving System for Time-sensitive Robotic Applications | Neiwen Ling et.al. | 2412.18695 |
| 2024-12-21 | SYMPHONY: Improving Memory Management for LLM Inference Workloads | Saurabh Agarwal et.al. | 2412.16434 |
| 2024-11-02 | NEO: Saving GPU Memory Crisis with CPU Offloading for Online LLM Inference | Xuanlin Jiang et.al. | 2411.01142 |
| 2024-06-06 | SGLang: Efficient Execution of Structured Language Model Programs | Lianmin Zheng et.al. | 2312.07104 |
| 2024-05-14 | Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis | Yao Fu et.al. | 2405.08944 |