AgentProg is a novel framework that addresses the context-management bottleneck in long-horizon GUI automation. Traditional agents struggle with ever-expanding interaction histories, which lead to context overflow and the loss of important information. AgentProg reframes this challenge by representing the agent's execution as a structured program with explicit variables and control flow, providing a principled mechanism to retain essential information while discarding irrelevant details.
- [2025.12.31] We released AW-Extend, a benchmark extension based on AndroidWorld designed to evaluate agents on long-horizon compositional and iterative tasks. The evaluation code for AndroidWorld has also been updated.
- [2025.12.07] Official release of the AgentProg source code.
AgentProg provides the Semantic Task Program, a semantic-tolerant domain-specific language that bridges rigid programming and flexible natural language:
- Structured Control Flow: Organizes complex tasks using loops, conditionals, and functions
- Explicit Variables: Captures and maintains critical information (e.g., task lists, user data) throughout execution through explicit variable declarations
- Semantic Tolerance: Uses natural language instructions that adapt to different environments, unlike brittle traditional scripts
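To make the three properties above concrete, here is a minimal sketch of the demo task expressed as a program with explicit variables, a loop, and natural-language steps. The `Step` and `ForEach` classes and the `expand` helper are hypothetical names invented for this illustration; they are not AgentProg's actual DSL syntax.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """A natural-language instruction (hypothetical, not AgentProg's real syntax)."""
    instruction: str


@dataclass
class ForEach:
    """A loop that binds each item of a list variable to a name and runs its body."""
    var: str      # loop variable name, e.g. "c"
    items: str    # name of the variable that holds the list
    body: list    # nested Step / ForEach nodes


def expand(program, variables):
    """Unroll loops and fill in variables, yielding plain-language instructions."""
    steps = []
    for node in program:
        if isinstance(node, ForEach):
            for item in variables[node.items]:
                steps.extend(expand(node.body, {**variables, node.var: item}))
        else:
            steps.append(node.instruction.format(**variables))
    return steps


# Explicit variables: the data that must survive the whole task.
variables = {
    "contacts": [
        {"name": "Hana Ferreira", "number": "+10662908339"},
        {"name": "Sophie Martin", "number": "+18723713947"},
        {"name": "Olivia Alves", "number": "+16278036185"},
    ]
}

# Structured control flow around semantically tolerant instructions.
program = [
    ForEach("c", "contacts", [
        Step("add {c[name]} as a new contact with number {c[number]}"),
    ]),
    ForEach("c", "contacts", [
        Step("send 'hello, {c[name]}' to {c[name]} using Simple SMS Messenger"),
    ]),
]

steps = expand(program, variables)
for line in steps:
    print(line)
```

The instructions stay in natural language, so how each one maps to taps and text input is left to the agent at run time.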
AgentProg interprets high-level program instructions into executable low-level actions:
- Dual-Mode Execution: Alternates between (1) generating Python code for GUI actions based on current program instruction and environment state, and (2) updating the Program Counter (PC) to control execution flow
- Runtime Adaptation: Translates abstract natural language instructions (e.g., "get user name") into concrete API calls (e.g., click(), input(), swipe()) based on real-time GUI observations
- Decoupled Logic: Separates task planning from execution details, preventing context overflow while maintaining flexibility to handle dynamic environments
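The alternation between action generation and Program Counter updates can be sketched as a small interpreter loop. This is an illustration under stated assumptions, not AgentProg's implementation: `run`, `generate_actions`, `update_pc`, and `observe` are hypothetical names, and the two model calls are replaced with stubs.

```python
def run(program, generate_actions, update_pc, observe, max_steps=100):
    """Alternate between (1) producing actions for the current instruction
    and (2) choosing the next program counter value."""
    pc = 0
    trace = []
    for _ in range(max_steps):
        if pc is None or pc >= len(program):
            break  # program finished
        instruction = program[pc]
        # Mode 1: turn the instruction into low-level actions for the current screen.
        actions = generate_actions(instruction, observe())
        trace.append((pc, actions))
        # Mode 2: decide where execution continues (next step, jump, or stop).
        pc = update_pc(pc, instruction, observe())
    return trace


# Stubs standing in for the model calls and the device (assumptions for this sketch).
def fake_observe():
    return {"screen": "placeholder"}


def fake_generate_actions(instruction, observation):
    return [f"# actions for: {instruction}"]


def fake_update_pc(pc, instruction, observation):
    return pc + 1  # straight-line program: always advance


program = [
    "open the contacts app",
    "create a new contact named agent prog",
    "save the contact",
]

trace = run(program, fake_generate_actions, fake_update_pc, fake_observe)
```

Keeping the PC update as a separate step is what lets loops and conditionals be expressed as jumps rather than as a longer prompt.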
AgentProg intelligently manages agent context through program structure:
- Control Flow Pruning: Uses the execution tree structure to discard irrelevant branches, completed loop iterations, and the details of function invocations
- Data Flow Persistence: Preserves task-critical information through explicit variable declarations
- Step-Aware Retrieval: Recalls previous executions of the same program step to improve consistency in repetitive tasks
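As a rough illustration of control flow pruning, the sketch below walks an execution tree and collapses every completed subtree to a one-line summary, while in-progress nodes are expanded in full. The `Node` class and `build_context` function are hypothetical and cover only the pruning idea; variable persistence and step-aware retrieval are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a hypothetical execution tree."""
    label: str
    done: bool = False
    summary: str = ""
    children: list = field(default_factory=list)


def build_context(node, depth=0):
    """Return context lines, dropping the children of completed nodes."""
    indent = "  " * depth
    if node.done:
        # Completed subtree: keep the summary only, discard the details.
        return [f"{indent}{node.label}: done ({node.summary})"]
    lines = [f"{indent}{node.label}: in progress"]
    for child in node.children:
        lines.extend(build_context(child, depth + 1))
    return lines


tree = Node("for each contact", children=[
    Node("iteration 1: Hana Ferreira", done=True, summary="contact added", children=[
        Node("tap 'Create contact'", done=True, summary="form opened"),
        Node("type name and number", done=True, summary="fields filled"),
    ]),
    Node("iteration 2: Sophie Martin", children=[
        Node("tap 'Create contact'", done=True, summary="form opened"),
        Node("type name and number"),
    ]),
])

lines = build_context(tree)
print("\n".join(lines))
```

The finished first iteration contributes a single line to the context, no matter how many low-level actions it took.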
AgentProg addresses partial observability and environmental dynamics in GUI environments:
- Active Monitoring: Maintains hypotheses about hidden UI states (clipboard content, navigation stack, off-screen elements)
- Runtime Verification: Continuously validates assumptions against real-time observations
- Anomaly Recovery: Detects belief-reality gaps and triggers corrective actions (e.g., app crashes, form submission failure)
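The verification step can be pictured as a comparison between what the agent believes and what it can currently observe. The following sketch is an assumption-laden illustration: the `find_gaps` helper and the belief keys are invented here and do not reflect AgentProg's internal representation.

```python
def find_gaps(beliefs, observation):
    """Return {key: (believed, observed)} for each observed key that contradicts a belief.

    Keys missing from the observation (hidden state such as the clipboard)
    cannot be checked, so they are left untouched.
    """
    return {
        key: (beliefs[key], observed)
        for key, observed in observation.items()
        if key in beliefs and beliefs[key] != observed
    }


# Hypothetical belief state, including state that is not visible on screen.
beliefs = {
    "foreground_app": "Contacts",
    "clipboard": "+10662908339",
    "form_submitted": True,
}

# What the current screen reveals; the app appears to have crashed to the launcher.
observation = {"foreground_app": "Launcher", "form_submitted": True}

gaps = find_gaps(beliefs, observation)
if gaps:
    print("belief-reality gaps:", gaps)  # a real agent would trigger recovery here
```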
Task: ContactsAddMultipleContactsAndSms
Task description:
For the following persons:
Name: Hana Ferreira, Number: +10662908339
Name: Sophie Martin, Number: +18723713947
Name: Olivia Alves, Number: +16278036185,
add them as new contacts, and then use Simple SMS Messenger to send each of them a 'hello, [Name]' message, where [Name] is the name of each contact.
Video (accelerated):
demo.mp4
Set up the agentprog package:
git clone https://github.com/MobileLLM/AgentProg.git
cd AgentProg
pip install -e .
Python 3.11+ is recommended for running the agent.
Next, edit the .env file to include the required API keys. These keys are necessary to access the underlying large language models.
# Credentials for Google Gemini-2.5-Pro
GEMINI_API_KEY=<YOUR_GEMINI_API_KEY_HERE>
# Credentials for UI-TARS-1.5 on the Volcengine Ark platform
ARK_API_KEY=<YOUR_ARK_API_KEY_HERE>
DOUBAO_BASE_URL=<YOUR_DOUBAO_BASE_URL_HERE>
Prepare an Android phone or Android emulator and connect it using adb.
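Before running the agent, it can help to confirm that the device is visible to adb and to note its serial. The helper below is a small convenience sketch, not part of AgentProg; it assumes `adb` is on the PATH and parses the standard `adb devices` output.

```python
import subprocess


def parse_adb_devices(output):
    """Extract serials of ready devices from `adb devices` output."""
    serials = []
    for line in output.splitlines():
        parts = line.split()
        # Ready devices are listed as "<serial>\tdevice"; other states
        # (e.g. "unauthorized", "offline") and the header line are skipped.
        if len(parts) >= 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


def list_devices():
    """Run `adb devices` and return the serials of connected, ready devices."""
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)


# Example with captured output, so the parsing can be checked without a device.
sample = "List of devices attached\nemulator-5554\tdevice\nR58M123\tunauthorized\n\n"
print(parse_adb_devices(sample))
```

A serial returned by `list_devices()` is the value to pass as `--serial`.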
For CLI usage:
agentprog [task requirements] --serial [serial name, e.g., emulator-5554]
For example:
agentprog "create a new contact named agent prog in contacts app." --serial emulator-5554
You can also use AgentProg in Python:
from agentprog import agentprog_pipeline, AgentProgConfig
config = AgentProgConfig(task_description="create a new contact named agent prog in contacts app.", serial="emulator-5554")
agentprog_pipeline(config)
@misc{tian2025agentprog,
title={AgentProg: Empowering Long-Horizon GUI Agents with Program-Guided Context Management},
author={Shizuo Tian and Hao Wen and Yuxuan Chen and Jiacheng Liu and Shanhui Zhao and Guohong Liu and Ju Ren and Yunxin Liu and Yuanchun Li},
year={2025},
eprint={2512.10371},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2512.10371},
}