Agent Papers

Updated on 2026.04.03

Publish Date Title Authors PDF
2026-04-02 Glia: A Human-Inspired AI for Automated Systems Design and Optimization Pouya Hamadanian et.al. 2510.27176
2026-04-01 Universal YOCO for Efficient Depth Scaling Yutao Sun et.al. 2604.01220
2026-03-30 Heddle: A Distributed Orchestration System for Agentic RL Rollout Zili Zhang et.al. 2603.28101
2026-03-30 Multi-Agent Home Energy Management Assistant Wooyoung Jung et.al. 2602.15219
2026-03-26 Large Language Models as Optimization Controllers: Adaptive Continuation for SIMP Topology Optimization Shaoliang Yang et.al. 2603.25099
2026-03-25 LMetric: Simple is Better - Multiplication May Be All You Need for LLM Request Scheduling Dingyan Zhang et.al. 2603.15202
2026-03-25 Language-Grounded Multi-Agent Planning for Personalized and Fair Participatory Urban Sensing Xusen Guo et.al. 2603.24014
2026-03-23 Chimera: Latency- and Performance-Aware Multi-agent Serving for Heterogeneous LLMs Kangqi Ni et.al. 2603.22206
2026-03-23 AI Co-Scientist for Ranking: Discovering Novel Search Ranking Models alongside LLM-based AI Agents with Cloud Computing Access Liwei Wu et.al. 2603.22376_(ACL)
2026-03-22 The Workload-Router-Pool Architecture for LLM Inference Optimization: A Vision Paper from the vLLM Semantic Router Project Huamin Chen et.al. 2603.21354
2026-03-22 Improving Coherence and Persistence in Agentic AI for System Optimization Pantea Karimi et.al. 2603.21321
2026-03-22 CALVO: Improve Serving Efficiency for LLM Inferences with Intense Network Demands Weiye Wang et.al. 2603.21257
2026-03-18 IEMAS: An Incentive-Efficiency Routing Framework for Open Agentic Web Ecosystems Hongze Liu et.al. 2603.17302
2026-03-17 EfficientNav: Towards On-Device Object-Goal Navigation with Navigation Map Caching and Retrieval Zebin Yang et.al. 2510.18546_(NeurIPS)
2026-03-17 Efficient LLM Serving for Agentic Workflows: A Data Systems Perspective Noppanat Wadlom et.al. 2603.16104
2026-03-17 MetaClaw: Just Talk – An Agent That Meta-Learns and Evolves in the Wild Peng Xia et.al. 2603.17187
2026-03-15 OxyGen: Unified KV Cache Management for Vision-Language-Action Models under Multi-Task Parallelism Xiangyu Li et.al. 2603.14371
2026-03-14 ATCC: Adaptive Concurrency Control for Unforeseen Agentic Transactions Weixing Zhou et.al. 2603.13906
2026-03-14 Retrieve, Schedule, Reflect: LLM Agents for Chip QoR Optimization Yikang Ouyang et.al. 2603.13767
2026-03-14 Justitia: Fair and Efficient Scheduling of Task-parallel LLM Agents with Selective Pampering Mingyan Yang et.al. 2510.17015
2026-03-13 AgentRM: An OS-Inspired Resource Manager for LLM Agent Systems Jianshu She et.al. 2603.13110
2026-03-13 ARL-Tangram: Unleash the Resource Efficiency in Agentic Reinforcement Learning Bangjun Xiao et.al. 2603.13019
2026-03-13 Orla: A Library for Serving LLM-Based Multi-Agent Systems Rana Shahout et.al. 2603.13605
2026-03-12 Slow-Fast Inference: Training-Free Inference Acceleration via Within-Sentence Support Stability Xingyu Xie et.al. 2603.12038
2026-03-11 Beyond Max Tokens: Stealthy Resource Amplification via Tool Calling Chains in LLM Agents Kaiyu Zhou et.al. 2601.10955
2026-03-11 BLITZRANK: Principled Zero-shot Ranking Agents with Tournament Graphs Sheshansh Agrawal et.al. 2602.05448
2026-03-10 ThunderAgent: A Simple, Fast and Program-Aware Agentic Inference System Hao Kang et.al. 2602.13692
2026-03-10 FlexServe: A Fast and Secure LLM Serving System for Mobile Devices with Flexible Resource Isolation Yinpeng Wu et.al. 2603.09046
2026-03-09 Advancing Automated Algorithm Design via Evolutionary Stagewise Design with LLMs Chen Lu et.al. 2603.07970
2026-03-09 Not All Prefills Are Equal: PPD Disaggregation for Multi-turn LLM Serving Zongze Li et.al. 2603.13358
2026-03-09 Can LLMs Perceive Time? An Empirical Investigation Aniketh Garikaparthi et.al. 2604.00010_(ICLR)
2026-03-08 Quine: Realizing LLM Agents as Native POSIX Processes Hao Ke et.al. 2603.18030
2026-03-06 Characterizing Faults in Agentic AI: A Taxonomy of Types, Symptoms, and Root Causes Mehil B Shah et.al. 2603.06847
2026-03-06 MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens Yu Chen et.al. 2603.23516
2026-03-05 PerfGuard: A Performance-Aware Agent for Visual Content Generation Zhipeng Chen et.al. 2601.22571_(FG)
2026-03-04 ChatNeuroSim: An LLM Agent Framework for Automated Compute-in-Memory Accelerator Deployment and Optimization Ming-Yen Lee et.al. 2603.08745
2026-03-03 ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution Liu Yang et.al. 2603.02510
2026-03-03 CraniMem: Cranial Inspired Gated and Bounded Memory for Agentic Systems Pearl Mody et.al. 2603.15642
2026-03-02 SimuHome: A Temporal- and Environment-Aware Benchmark for Smart Home LLM Agents Gyuhyeon Seo et.al. 2509.24282_(ICLR)
2026-03-02 Solving the Granularity Mismatch: Hierarchical Preference Learning for Long-Horizon LLM Agents Heyang Gao et.al. 2510.03253_(ICLR)
2026-03-02 HeRo: Adaptive Orchestration of Agentic RAG on Heterogeneous Mobile SoC Maoliang Li et.al. 2603.01661_(DAC)
2026-03-01 MagicAgent: Towards Generalized Agent Planning Xuhui Ren et.al. 2602.19000
2026-02-28 ZeroDVFS: Zero-Shot LLM-Guided Core and Frequency Allocation for Embedded Platforms Mohammad Pivezhandi et.al. 2601.08166
2026-02-28 RelayCaching: Accelerating LLM Collaboration via Decoding KV Cache Reuse Yingsheng Geng et.al. 2603.13289
2026-02-27 Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis Pengfei Zhang et.al. 2602.15909_(ICLR)
2026-02-27 SideQuest: Model-Driven KV Cache Management for Long-Horizon Agentic Reasoning Sanjay Kariyappa et.al. 2602.22603
2026-02-27 Efficient Long-Horizon GUI Agents via Training-Free KV Cache Compression Bowen Zhou et.al. 2603.00188
2026-02-27 Verifier-Bound Communication for LLM Agents: Certified Bounds on Covert Signaling Om Tailor et.al. 2603.00381
2026-02-27 ICaRus: Identical Cache Reuse for Efficient Multi Model Inference Sunghyeon Woo et.al. 2603.13281
2026-02-26 DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference Yongtong Wu et.al. 2602.21548
2026-02-25 Two-Stage Active Distribution Network Voltage Control via LLM-RL Collaboration: A Hybrid Knowledge-Data-Driven Approach Xu Yang et.al. 2602.21715
2026-02-24 Towards Efficient Agents: A Co-Design of Inference Architecture and System Weizhe Lin et.al. 2512.18337
2026-02-24 Hierarchical Decision Mamba Meets Agentic AI: A Novel Approach for RAN Slicing in 6G Md Arafat Habib et.al. 2512.23502_(Networking)
2026-02-24 ST-EVO: Towards Generative Spatio-Temporal Evolution of Multi-Agent Communication Topologies Xingjian Wu et.al. 2602.14681
2026-02-24 Architecting AgentOS: From Token-Level Context to Emergent System-Level Intelligence ChengYou Li et.al. 2602.20934
2026-02-23 ContextPilot: Fast Long-Context Inference via Context Reuse Yinsicheng Jiang et.al. 2511.03475
2026-02-22 KVComm: Enabling Efficient LLM Communication through Selective KV Sharing Xiangyu Shi et.al. 2510.03346_(ICLR)
2026-02-20 Aurora: Neuro-Symbolic AI Driven Advising Agent Lorena Amanda Quincoso Lugones et.al. 2602.17999
2026-02-20 Alignment in Time: Peak-Aware Orchestration for Long-Horizon Agentic Systems Hanjing Shi et.al. 2602.17910
2026-02-19 KLong: Training LLM Agent for Extremely Long-horizon Tasks Yue Liu et.al. 2602.17547
2026-02-19 Position: AI Agents Are Not (Yet) a Panacea for Social Simulation Yiming Li et.al. 2603.00113
2026-02-19 ARKV: Adaptive and Resource-Efficient KV Cache Management under Limited Memory Budget for Long-Context Inference in LLMs Jianlong Lei et.al. 2603.08727
2026-02-17 Agent Memory Below the Prompt: Persistent Q4 KV Cache for Multi-Agent LLM Inference on Edge Devices Yakov Pyotr Shkolnikov et.al. 2603.04428
2026-02-16 OpenAgentSafety: A Comprehensive Framework for Evaluating Real-World AI Agent Safety Sanidhya Vijayvargiya et.al. 2507.06134_(ICLR)
2026-02-16 Efficient Multi-round LLM Inference over Disaggregated Serving Wenhao He et.al. 2602.14516
2026-02-15 HyMem: Hybrid Memory Architecture with Dynamic Retrieval Scheduling Xiaochen Zhao et.al. 2602.13933
2026-02-12 PrefillShare: A Shared Prefill Module for KV Reuse in Multi-LLM Disaggregated Serving Sunghyeon Woo et.al. 2602.12029
2026-02-12 Deep Kernel Fusion for Transformers Zixi Zhang et.al. 2602.11808
2026-02-12 IMAGAgent: Orchestrating Multi-Turn Image Editing via Constraint-Aware Planning and Reflection Fei Shen et.al. 2603.29602
2026-02-10 AOI: Context-Aware Multi-Agent Operations via Dynamic Scheduling and Hierarchical Memory Compression Zishan Bai et.al. 2512.13956
2026-02-10 Learning to Evict from Key-Value Cache Luca Moschella et.al. 2602.10238
2026-02-10 ATPO: Adaptive Tree Policy Optimization for Multi-Turn Medical Dialogue Ruike Cao et.al. 2603.02216_(ICLR)
2026-02-09 Dr. MAS: Stable Reinforcement Learning for Multi-Agent LLM Systems Lang Feng et.al. 2602.08847
2026-02-08 DeltaKV: Residual-Based KV Cache Compression via Long-Range Similarity Jitai Hao et.al. 2602.08005
2026-02-06 Lemon Agent Technical Report Haipeng Jiang et.al. 2602.07092
2026-02-05 AriadneMem: Threading the Maze of Lifelong Memory for LLM Agents Wenhui Zhu et.al. 2603.03290
2026-02-03 Cortex: Achieving Low-Latency, Cost-Efficient Remote Data Access For LLM via Semantic-Aware Knowledge Caching Chaoyi Ruan et.al. 2509.17360
2026-02-03 ENGRAM: Effective, Lightweight Memory Orchestration for Conversational Agents Daivik Patel et.al. 2511.12960
2026-02-03 LegalOne: A Family of Foundation Models for Reliable Legal Reasoning Haitao Li et.al. 2602.00642
2026-02-03 Agent Primitives: Reusable Latent Building Blocks for Multi-Agent Systems Haibo Jin et.al. 2602.03695
2026-02-02 CoMeT: Collaborative Memory Transformer for Efficient Long Context Modeling Runsong Zhao et.al. 2602.01766
2026-02-02 AgentRAN: An Agentic AI Architecture for Autonomous Control of Open 6G Networks Maxime Elkael et.al. 2508.17778
2026-02-02 Implicit Bias in LLMs for Transgender Populations Micaela Hirsch et.al. 2602.13253
2026-02-01 PaperDebugger: A Plugin-Based Multi-Agent System for In-Editor Academic Writing, Review, and Editing Junyi Hou et.al. 2512.02589
2026-02-01 LRAgent: Efficient KV Cache Sharing for Multi-LoRA LLM Agents Hyesung Jeon et.al. 2602.01053
2026-01-31 FastTTS: Accelerating Test-Time Scaling for Edge LLM Reasoning Hao Mark Chen et.al. 2509.00195_(ASPLOS)
2026-01-30 Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live Hanchen Li et.al. 2511.02230
2026-01-30 Can Transformer Memory Be Corrupted? Investigating Cache-Side Vulnerabilities in Large Language Models Elias Hossain et.al. 2510.17098
2026-01-30 CONCUR: High-Throughput Agentic Batch Inference of LLM via Congestion-Based Concurrency Control Qiaoling Chen et.al. 2601.22705
2026-01-29 Emergent Coordination in Multi-Agent Systems via Pressure Fields and Temporal Decay Roland Rodriguez et.al. 2601.08129
2026-01-29 CORE: Toward Ubiquitous 6G Intelligence Through Collaborative Orchestration of Large Language Model Agents Over Hierarchical Edge Zitong Yu et.al. 2601.21822
2026-01-29 Heterogeneous Computing: The Key to Powering the Future of AI Agent Inference Yiren Zhao et.al. 2601.22001
2026-01-28 PEARL: Self-Evolving Assistant for Time Management with Reinforcement Learning Bingxuan Li et.al. 2601.11957
2026-01-27 GTA: Generative Traffic Agents for Simulating Realistic Mobility Behavior Simon Lämmer et.al. 2601.16778_(CHI)
2026-01-21 IB-GRPO: Aligning LLM-based Learning Path Recommendation with Educational Objectives via Indicator-Based Group Relative Policy Optimization Shuai Wang et.al. 2601.14686
2026-01-20 LLMOrbit: A Circular Taxonomy of Large Language Models - From Scaling Walls to Agentic AI Systems Badri N. Patro et.al. 2601.14053
2026-01-20 IGAA: Intent-Driven General Agentic AI for Edge Services Scheduling using Generative Meta Learning Yan Sun et.al. 2601.13702
2026-01-19 Batch Query Processing and Optimization for Agentic Workflows Junyi Shen et.al. 2509.02121
2026-01-19 Sutradhara: An Intelligent Orchestrator-Engine Co-design for Tool-based Agentic Inference Anish Biswas et.al. 2601.12967
2026-01-19 IntAgent: NWDAF-Based Intent LLM Agent Towards Advanced Next Generation Networks Abdelrahman Soliman et.al. 2601.13114
2026-01-18 Rethinking the Value of Multi-Agent Workflow: A Strong Single Agent Baseline Jiawei Xu et.al. 2601.12307
2026-01-16 AstroReason-Bench: Evaluating Unified Agentic Planning across Heterogeneous Space Planning Problems Weiyi Wang et.al. 2601.11354
2026-01-13 StackPilot: Autonomous Function Agents for Scalable and Environment-Free Code Execution Xinkui Zhao et.al. 2508.11665
2026-01-13 When KV Cache Reuse Fails in Multi-Agent Systems: Cross-Candidate Interaction is Crucial for LLM Judges Sichu Liang et.al. 2601.08343
2026-01-13 Unleashing Tool Engineering and Intelligence for Agentic AI in Next-Generation Communication Networks Yinqiu Liu et.al. 2601.08259
2026-01-12 OpenTinker: Separating Concerns in Agentic Reinforcement Learning Siqi Zhu et.al. 2601.07376
2026-01-11 RealMem: Benchmarking LLMs in Real-World Memory-Driven Interaction Haonan Bian et.al. 2601.06966
2026-01-09 FlashMem: Distilling Intrinsic Latent Memory via Computation Reuse Yubo Hou et.al. 2601.05505
2026-01-08 Nalar: An agent serving framework Marco Laju et.al. 2601.05109
2026-01-07 NEMO-4-PAYPAL: Leveraging NVIDIA’s Nemo Framework for empowering PayPal’s Commerce Agent Sudhanshu Garg et.al. 2512.21578
2026-01-06 Agent.xpu: Efficient Scheduling of Agentic LLM Workloads on Heterogeneous SoC Xinming Wei et.al. 2506.24045
2026-01-03 Chimera: Harnessing Multi-Agent LLMs for Automatic Insider Threat Simulation Jiongchi Yu et.al. 2508.07745_(NDSS)
2026-01-03 Warp-Cortex: An Asynchronous, Memory-Efficient Architecture for Million-Agent Cognitive Scaling on Consumer Hardware Jorge L. Ruiz Williams et.al. 2601.01298
2026-01-03 ReliabilityBench: Evaluating LLM Agent Reliability Under Production-Like Stress Conditions Aayush Gupta et.al. 2601.06112
2025-12-31 Context-aware LLM-based AI Agents for Human-centered Energy Management Systems in Smart Buildings Tianzhi He et.al. 2512.25055
2025-12-30 SmartFlow Reinforcement Learning and Agentic AI for Bike-Sharing Optimisation Aditya Sreevatsa K et.al. 2601.00868
2025-12-28 Accelerating Language Model Workflows with Prompt Choreography TJ Bai et.al. 2512.23049_(ACL)
2025-12-27 Agentic Auto-Scheduling: An Experimental Study of LLM-Guided Loop Optimization Massinissa Merouani et.al. 2511.00592_(CHI)
2025-12-27 AgentMath: Empowering Mathematical Reasoning for Large Language Models via Tool-Augmented Agent Haipeng Luo et.al. 2512.20745
2025-12-24 V-Rex: Real-Time Streaming Video LLM Acceleration via Dynamic KV Cache Retrieval Donghyuk Kim et.al. 2512.12284_(HPCA)
2025-12-22 JITServe: SLO-aware LLM Serving with Imprecise Request Information Wei Zhang et.al. 2504.20068
2025-12-21 IntelliCode: A Multi-Agent LLM Tutoring System with Centralized Learner Modeling Jones David et.al. 2512.18669_(ACL)
2025-12-21 Does It Tie Out? Towards Autonomous Legal Agents in Venture Capital Pierre Colombo et.al. 2512.18658
2025-12-18 MEPIC: Memory Efficient Position Independent Caching for LLM Serving Qian Wang et.al. 2512.16822
2025-12-16 SemShareKV: Efficient KVCache Sharing for Semantically Similar Prompts via Token-Level LSH Matching Xinye Zhao et.al. 2509.24832
2025-12-16 Astraea: A State-Aware Scheduling Engine for LLM-Powered Agents Hongqiu Ni et.al. 2512.14142
2025-12-16 Model-First Reasoning LLM Agents: Reducing Hallucinations through Explicit Problem Modeling Annu Rana et.al. 2512.14474
2025-12-15 Beyond Training: Enabling Self-Evolution of Agents with MOBIMEM Zibin Liu et.al. 2512.15784
2025-12-12 Aragog: Just-in-Time Model Routing for Scalable Serving of Agentic Workflows Yinwei Dai et.al. 2511.20975
2025-12-08 EvoMem: Improving Multi-Agent Planning with Dual-Evolving Memory Wenzhe Fan et.al. 2511.01912
2025-12-07 Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning Yulei Qin et.al. 2509.22601
2025-12-05 MARINE: Theoretical Optimization and Design for Multi-Agent Recursive IN-context Enhancement Hongwei Zhang et.al. 2512.07898
2025-12-03 KVNAND: Efficient On-Device Large Language Model Inference Using DRAM-Free In-Flash Computing Lishuo Deng et.al. 2512.03608
2025-12-02 DynTaskMAS: A Dynamic Task Graph-driven Framework for Asynchronous and Parallel LLM-based Multi-Agent Systems Junwei Yu et.al. 2503.07675
2025-11-29 A CPU-Centric Perspective on Agentic AI Ritik Raj et.al. 2511.00739
2025-11-28 Beyond Curve Fitting: Neuro-Symbolic Agents for Context-Aware Epidemic Forecasting Joongwon Chae et.al. 2511.23276
2025-11-28 LegalWebAgent: Empowering Access to Justice via LLM-Based Web Agents Jinzhe Tan et.al. 2512.04105
2025-11-27 Optimizing NetGPT via Routing-Based Synergy and Reinforcement Learning Yuxuan Chen et.al. 2511.22217
2025-11-27 Q-KVComm: Efficient Multi-Agent Communication Via Adaptive KV Cache Compression Boris Kriuk et.al. 2512.17914
2025-11-25 Reducing Latency of LLM Search Agent via Speculation-based Algorithm-System Co-Design Zixiao Huang et.al. 2511.20048
2025-11-25 Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation Inferix Team et.al. 2511.20714
2025-11-23 Hybrid Agentic AI and Multi-Agent Systems in Smart Manufacturing Mojtaba A. Farahani et.al. 2511.18258
2025-11-18 KnowCoder-A1: Incentivizing Agentic Reasoning Capability with Outcome Supervision for KBQA Zhuo Chen et.al. 2510.25101
2025-11-05 ALAS: Transactional and Dynamic Multi-Agent LLM Planning Longling Geng et.al. 2511.03094
2025-11-05 FREESH: Fair, Resource- and Energy-Efficient Scheduling for LLM Serving on Heterogeneous GPUs Xuan He et.al. 2511.00807_(ISS)
2025-11-04 LiveSecBench: A Dynamic and Culturally-Relevant AI Safety Benchmark for LLMs in Chinese Context Yudong Li et.al. 2511.02366
2025-11-04 Optimal-Agent-Selection: State-Aware Routing Framework for Efficient Multi-Agent Collaboration Jingbo Wang et.al. 2511.02200
2025-11-04 Using Span Queries to Optimize for Cache and Attention Locality Paul Castro et.al. 2511.02749
2025-11-03 Scaling Graph Chain-of-Thought Reasoning: A Multi-Agent Framework with Efficient LLM Serving Chengying Huan et.al. 2511.01633
2025-11-03 TPS-Bench: Evaluating AI Agents’ Tool Planning & Scheduling Abilities in Compounding Tasks Hanwen Xu et.al. 2511.01527
2025-11-02 HEXGEN-FLOW: Optimizing LLM Inference Request Scheduling for Agentic Text-to-SQL You Peng et.al. 2505.05286
2025-11-01 KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems Hancheng Ye et.al. 2510.12872_(FAST)
2025-10-31 Tokencake: A KV-Cache-centric Serving Framework for LLM-based Multi-Agent Applications Zhuohang Bian et.al. 2510.18586
2025-10-30 Agentic AI Home Energy Management System: A Large Language Model Framework for Residential Load Scheduling Reda El Makroum et.al. 2510.26603
2025-10-28 Pie: A Programmable Serving System for Emerging LLM Applications In Gim et.al. 2510.24051_(SOSP)
2025-10-28 Improving LLM Reasoning via Dependency-Aware Query Decomposition and Logic-Parallel Content Expansion Xianjun Gao et.al. 2510.24390
2025-10-28 From Narrative to Action: A Hierarchical LLM-Agent Framework for Human Mobility Generation Qiumeng Li et.al. 2510.24802
2025-10-26 SwiftSolve: A Self-Iterative, Complexity-Aware Multi-Agent Framework for Competitive Programming Adhyayan Veer Singh et.al. 2510.22626
2025-10-23 Accelerating Mobile Language Model via Speculative Decoding and NPU-Coordinated Execution Zhiyang Chen et.al. 2510.15312
2025-10-22 PTFA: An LLM-based Agent that Facilitates Online Consensus Building through Parallel Thinking Wen Gu et.al. 2503.12499
2025-10-21 The Trust Paradox in LLM-Based Multi-Agent Systems: When Collaboration Becomes a Security Vulnerability Zijie Xu et.al. 2510.18563
2025-10-19 STARK: Strategic Team of Agents for Refining Kernels Juncheng Dong et.al. 2510.16996
2025-10-18 Ripple Effect Protocol: Coordinating Agent Populations Ayush Chopra et.al. 2510.16572
2025-10-16 Terrarium: Revisiting the Blackboard for Multi-Agent Safety, Privacy, and Security Studies Mason Nakamura et.al. 2510.14312
2025-10-15 Cortex: Workflow-Aware Resource Pooling and Scheduling for Agentic Serving Nikos Pagonas et.al. 2510.14126
2025-10-14 Evaluating the Quality of Randomness and Entropy in Tasks Supported by Large Language Models Rabimba Karanjai et.al. 2510.12080
2025-10-13 Part II: ROLL Flash – Accelerating RLVR and Agentic Training with Asynchrony Han Lu et.al. 2510.11345
2025-10-13 ShishuLM: Lightweight Language Model with Hybrid Decoder-MLP Architecture and Paired Weight Sharing Shivanshu Kumar et.al. 2510.13860
2025-10-13 StreamAgent: Towards Anticipatory Agents for Streaming Video Understanding Haolin Yang et.al. 2508.01875
2025-10-11 Agentic Troubleshooting Guide Automation for Incident Management Jiayi Mao et.al. 2510.10074
2025-10-10 OrcaLoca: An LLM Agent Framework for Software Issue Localization Zhongming Yu et.al. 2502.00350
2025-10-10 StreamingVLM: Real-Time Understanding for Infinite Video Streams Ruyi Xu et.al. 2510.09608
2025-10-08 FLEET: Formal Language-Grounded Scheduling for Heterogeneous Robot Teams Corban Rivera et.al. 2510.07417
2025-10-06 Multi-Agent Collaborative Intelligence: Dual-Dial Control for Reliable LLM Reasoning Edward Y. Chang et.al. 2510.04488
2025-10-06 ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering Yuki Imajuku et.al. 2506.09050_(NeurIPS)
2025-10-03 Automatic Building Code Review: A Case Study Hanlong Wan et.al. 2510.02634
2025-10-01 GUI-KV: Efficient GUI Agents via KV Cache with Spatio-Temporal Awareness Kung-Hsiang Huang et.al. 2510.00536
2025-09-30 Towards Agentic OS: An LLM Agent Framework for Linux Schedulers Yusheng Zheng et.al. 2509.01245
2025-09-30 AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges Ranjan Sapkota et.al. 2505.10468
2025-09-27 Runtime Adaptive Pruning for LLM Inference Huanrong Liu et.al. 2505.17138
2025-09-26 LLM Assisted Alpha Fairness for 6 GHz WiFi and NR_U Coexistence: An Agentic Orchestrator for Throughput, Energy, and SLA Qun Wang et.al. 2510.17814
2025-09-26 ProRe: A Proactive Reward System for GUI Agents via Reasoner-Actor Collaboration Gaole Dai et.al. 2509.21823
2025-09-25 Nova: Real-Time Agentic Vision-Language Model Serving with Adaptive Cross-Stage Parallelization Yuhang Xu et.al. 2509.21301
2025-09-24 CollaPipe: Adaptive Segment-Optimized Pipeline Parallelism for Collaborative LLM Training in Heterogeneous Edge Networks Jiewei Chen et.al. 2509.19855
2025-09-20 Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games Niv Eckhaus et.al. 2506.05309
2025-09-19 Overhearing LLM Agents: A Survey, Taxonomy, and Roadmap Andrew Zhu et.al. 2509.16325
2025-09-17 CrowdAgent: Multi-Agent Managed Multi-Source Annotation System Maosheng Qin et.al. 2509.14030
2025-08-30 LLM-Assisted Iterative Evolution with Swarm Intelligence Toward SuperBrain Li Weigang et.al. 2509.00510
2025-08-29 Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic Thought Reward Yong Deng et.al. 2508.12800
2025-08-20 Entropy-Constrained Strategy Optimization in Urban Floods: A Multi-Agent Framework with LLM and Knowledge Graph Integration Peilin Ji et.al. 2508.14654
2025-08-18 Datarus-R1: An Adaptive Multi-Step Reasoning LLM for Automated Data Analysis Ayoub Ben Chaliah et.al. 2508.13382
2025-08-18 AutoChemSchematic AI: Agentic Physics-Aware Automation for Chemical Manufacturing Scale-Up Sakhinana Sagar Srinivas et.al. 2505.24584
2025-08-11 From Natural Language to Solver-Ready Power System Optimization: An LLM-Assisted, Validation-in-the-Loop Framework Yunkai Hu et.al. 2508.08147
2025-08-09 Kairos: Low-latency Multi-Agent Serving with Shared LLMs and Excessive Loads in the Public Cloud Jinyuan Chen et.al. 2508.06948
2025-08-06 AquaChat++: LLM-Assisted Multi-ROV Inspection for Aquaculture Net Pens with Integrated Battery Management and Thruster Fault Tolerance Abdelhaleem Saad et.al. 2508.06554
2025-08-05 REALM-Bench: A Benchmark for Evaluating Multi-Agent Systems on Real-world, Dynamic Planning and Scheduling Tasks Longling Geng et.al. 2502.18836
2025-08-01 CyGATE: Game-Theoretic Cyber Attack-Defense Engine for Patch Strategy Optimization Yuning Jiang et.al. 2508.00478
2025-07-29 Forecasting LLM Inference Performance via Hardware-Agnostic Analytical Modeling Rajeev Patwari et.al. 2508.00904
2025-07-29 StaffPro: an LLM Agent for Joint Staffing and Profiling Alessio Maritan et.al. 2507.21636
2025-07-21 LLM Economist: Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra Seth Karten et.al. 2507.15815
2025-07-18 DREAMS: Density Functional Theory Based Research Engine for Agentic Materials Simulation Ziqi Wang et.al. 2507.14267
2025-07-18 CodeEdu: A Multi-Agent Collaborative Platform for Personalized Coding Education Jianing Zhao et.al. 2507.13814
2025-07-14 DroidSpeak: KV Cache Sharing for Cross-LLM Communication and Multi-LLM Serving Yuhan Liu et.al. 2411.02820
2025-07-10 KVFlow: Efficient Prefix Caching for Accelerating LLM-Based Multi-Agent Workflows Zaifeng Pan et.al. 2507.07400
2025-07-09 Gradientsys: A Multi-Agent LLM Scheduler with ReAct Orchestration Xinyuan Song et.al. 2507.06520
2025-07-07 StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling Meng Wei et.al. 2507.05240
2025-06-28 FairMarket-RL: LLM-Guided Fairness Shaping for Multi-Agent Reinforcement Learning in Peer-to-Peer Markets Shrenik Jadhav et.al. 2506.22708
2025-06-26 CitySim: Modeling Urban Behaviors and City Dynamics with Large-Scale LLM-Driven Agent Simulation Nicolas Bougie et.al. 2506.21805
2025-06-26 MobiVerse: Scaling Urban Mobility Simulation with Hybrid Lightweight Domain-Specific Generator and Large Language Models Yifan Liu et.al. 2506.21784
2025-06-25 MAGPIE: A dataset for Multi-AGent contextual PrIvacy Evaluation Gurusha Juneja et.al. 2506.20737
2025-06-17 LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification Penghui Yang et.al. 2502.17421
2025-06-16 AlphaEvolve: A coding agent for scientific and algorithmic discovery Alexander Novikov et.al. 2506.13131
2025-06-11 SAFEFLOW: A Principled Protocol for Trustworthy and Transactional Autonomous Agent Systems Peiran Li et.al. 2506.07564
2025-06-09 DeepServe: Serverless Large Language Model Serving at Scale Junhao Hu et.al. 2501.14417
2025-06-07 EconEvals: Benchmarks and Litmus Tests for LLM Agents in Unknown Environments Sara Fish et.al. 2503.18825
2025-06-04 AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance Dhaval Patel et.al. 2506.03828
2025-06-01 A Survey of LLM × DATA Xuanhe Zhou et.al. 2505.18458
2025-05-28 Design and testing of an agent chatbot supporting decision making with public transport data Luca Fantin et.al. 2505.22698
2025-05-26 Task Memory Engine: Spatial Memory for Robust Multi-Step LLM Agents Ye Ye et.al. 2505.19436
2025-05-23 Curriculum Guided Reinforcement Learning for Efficient Multi Hop Retrieval Augmented Generation Yuelyu Ji et.al. 2505.17391
2025-05-20 Log-Augmented Generation: Scaling Test-Time Reasoning with Reusable Computation Peter Baile Chen et.al. 2505.14398
2025-05-19 Learning Virtual Machine Scheduling in Cloud Computing through Language Agents JieHao Wu et.al. 2505.10117
2025-05-18 ALAS: A Stateful Multi-LLM Agent Framework for Disruption-Aware Planning Edward Y. Chang et.al. 2505.12501
2025-05-17 Demystifying and Enhancing the Efficiency of Large Language Model Based Search Agents Tiannuo Yang et.al. 2505.12065
2025-05-17 OptimAI: Optimization from Natural Language Using LLM-Powered AI Agents Raghav Thind et.al. 2504.16918
2025-04-24 Throughput-Optimal Scheduling Algorithms for LLM Inference and AI Agents Yueying Li et.al. 2504.07347
2025-04-21 PLANET: A Collection of Benchmarks for Evaluating LLMs’ Planning Capabilities Haoming Li et.al. 2504.14773
2025-04-02 MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding Ranajoy Sadhukhan et.al. 2408.11049
2025-04-01 Personality-Driven Decision-Making in LLM-Based Autonomous Agents Lewis Newsham et.al. 2504.00727
2025-04-01 HERA: Hybrid Edge-cloud Resource Allocation for Cost-Efficient AI Agents Shiyi Liu et.al. 2504.00434
2025-03-25 Agent-Initiated Interaction in Phone UI Automation Noam Kahlon et.al. 2503.19537
2025-03-19 Exploring Large Language Models for Word Games: Who is the Spy? Chentian Wei et.al. 2503.15235
2025-03-12 COLA: A Scalable Multi-Agent Framework For Windows UI Task Automation Di Zhao et.al. 2503.09263
2025-03-11 LLM4MAC: An LLM-Driven Reinforcement Learning Framework for MAC Protocol Emergence Renxuan Tan et.al. 2503.08123
2025-03-05 Pretrained LLMs as Real-Time Controllers for Robot Operated Serial Production Line Muhammad Waseem et.al. 2503.03889
2025-02-28 ARIES: Autonomous Reasoning with LLMs on Interactive Thought Graph Environments Pedro Gimenes et.al. 2502.21208
2025-02-27 TripCraft: A Benchmark for Spatio-Temporally Fine Grained Travel Planning Soumyabrata Chaudhuri et.al. 2502.20508
2025-02-20 Vending-Bench: A Benchmark for Long-Term Coherence of Autonomous Agents Axel Backlund et.al. 2502.15840
2025-02-20 Plan-over-Graph: Towards Parallelable LLM Agent Schedule Shiqi Zhang et.al. 2502.14563
2025-02-19 Autellix: An Efficient Serving Engine for LLM Agents as General Programs Michael Luo et.al. 2502.13965
2025-02-16 Streamlining the Collaborative Chain of Models into A Single Forward Pass in Generation-Based Tasks Yuanjie Lyu et.al. 2502.11083
2025-02-06 Division-of-Thoughts: Harnessing Hybrid Language Model Synergy for Efficient On-Device Agents Chenyang Shao et.al. 2502.04392
2025-01-29 MACI: Multi-Agent Collaborative Intelligence for Adaptive Reasoning and Temporal Planning Edward Y. Chang et.al. 2501.16689
2025-01-27 LLM-powered Multi-agent Framework for Goal-oriented Learning in Intelligent Tutoring System Tianfu Wang et.al. 2501.15749_(WWW)
2025-01-14 CuAsmRL: Optimizing GPU SASS Schedules via Deep Reinforcement Learning Guoliang He et.al. 2501.08071_(CGO)
2024-12-24 TimelyLLM: Segmented LLM Serving System for Time-sensitive Robotic Applications Neiwen Ling et.al. 2412.18695
2024-12-21 SYMPHONY: Improving Memory Management for LLM Inference Workloads Saurabh Agarwal et.al. 2412.16434
2024-11-02 NEO: Saving GPU Memory Crisis with CPU Offloading for Online LLM Inference Xuanlin Jiang et.al. 2411.01142
2024-06-06 SGLang: Efficient Execution of Structured Language Model Programs Lianmin Zheng et.al. 2312.07104
2024-05-14 Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis Yao Fu et.al. 2405.08944
