Application Management - Wuying Agent Development Kit (AgentBay) - Alibaba Cloud Help Center

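This topic describes how to use the AgentBay SDK to manage applications on a cloud computer, including how to discover, launch, monitor, and control desktop applications in the cloud environment.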

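Overview

The Computer Use module provides the following application management capabilities for desktop environments:

Application discovery - Find the applications installed on the system.
Application lifecycle management - Start and stop desktop applications.
Process monitoring - Track running applications and their processes.
Desktop automation - Automate complex desktop workflows.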

Create a session

import os
from agentbay import AgentBay
from agentbay.session_params import CreateSessionParams

api_key = os.getenv("AGENTBAY_API_KEY")
if not api_key:
    raise ValueError("AGENTBAY_API_KEY environment variable is required")

agent_bay = AgentBay(api_key=api_key)

params = CreateSessionParams(image_id="linux_latest")
result = agent_bay.create(params)

if result.success:
    session = result.session
    print(f"Session created: {session.session_id}")
    # Output: Session created: session-xxxxxxxxxxxxxxxxx
else:
    print(f"Failed to create session: {result.error_message}")
    exit(1)

Get installed applications

result = session.computer.get_installed_apps(
    start_menu=True,
    desktop=False,
    ignore_system_apps=True
)

# Verification: Result type is InstalledAppListResult
# Verification: Success = True
# Verification: Found 76 installed applications on test system

if result.success:
    apps = result.data
    print(f"Found {len(apps)} installed applications")
    # Output: Found 76 installed applications
    
    for app in apps[:5]:
        print(f"Name: {app.name}")
        print(f"Start Command: {app.start_cmd}")
        print(f"Stop Command: {app.stop_cmd if app.stop_cmd else 'N/A'}")
        print(f"Work Directory: {app.work_directory if app.work_directory else 'N/A'}")
        print("---")
    # Output example:
    # Name: AptURL
    # Start Command: apturl %u
    # Stop Command: N/A
    # Work Directory: N/A
    # ---
    # Name: Bluetooth Transfer
    # Start Command: bluetooth-sendto
    # Stop Command: N/A
    # Work Directory: N/A
    # ---
else:
    print(f"Error: {result.error_message}")

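Parameters:

start_menu (bool): Whether to include applications from the Start menu.
desktop (bool): Whether to include applications from the desktop.
ignore_system_apps (bool): Whether to filter out system applications.

Return value:

An InstalledAppListResult that contains a list of InstalledApp objects.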
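Because `stop_cmd` is optional, it can be useful to split the returned list into applications that define a stop command and those that do not. The following is a minimal offline sketch, assuming only the `InstalledApp` attributes described on this page. `AppInfo`, `partition_by_stop_cmd`, and the "Demo Service" entry are hypothetical and exist only for illustration; they are not part of the SDK.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class AppInfo:
    """Hypothetical stand-in with the same attributes as InstalledApp."""
    name: str
    start_cmd: str
    stop_cmd: Optional[str] = None
    work_directory: Optional[str] = None


def partition_by_stop_cmd(apps: Iterable[AppInfo]) -> Tuple[List[AppInfo], List[AppInfo]]:
    """Return (apps with a non-empty stop_cmd, apps without one)."""
    with_cmd: List[AppInfo] = []
    without_cmd: List[AppInfo] = []
    for app in apps:
        (with_cmd if app.stop_cmd else without_cmd).append(app)
    return with_cmd, without_cmd


sample = [
    AppInfo("AptURL", "apturl %u"),
    AppInfo("Bluetooth Transfer", "bluetooth-sendto"),
    AppInfo("Demo Service", "demo --start", stop_cmd="demo --stop"),  # invented entry
]
with_cmd, without_cmd = partition_by_stop_cmd(sample)
print([app.name for app in with_cmd])
print([app.name for app in without_cmd])
```

In real use, `sample` would be replaced by `result.data` from `get_installed_apps()`.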

Start an application

Start by command

start_cmd = "/usr/bin/google-chrome-stable"

result = session.computer.start_app(start_cmd)

# Verification: Result type is ProcessListResult
# Verification: Success = True
# Verification: Started 6 processes (main chrome process + helper processes)

if result.success:
    processes = result.data
    print(f"Application started with {len(processes)} processes")
    # Output: Application started with 6 processes
    
    for process in processes:
        print(f"Process: {process.pname} (PID: {process.pid})")
    # Example output:
    # Process: chrome (PID: 4443)
    # Process: cat (PID: 4448)
    # Process: cat (PID: 4449)
    # Process: chrome (PID: 4459)
    # Process: chrome (PID: 4460)
    # Process: chrome (PID: 4462)
else:
    print(f"Failed to start application: {result.error_message}")
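The process list returned by `start_app` can contain helper processes as well as the application itself (the sample output above lists `cat` next to `chrome`). A minimal sketch for summarizing such a list by process name follows. `ProcessInfo` is a hypothetical stand-in that mirrors the `pname`, `pid`, and `cmdline` attributes; the sample values are copied from the example output above.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ProcessInfo:
    """Hypothetical stand-in with the same attributes as a Process object."""
    pname: str
    pid: int
    cmdline: Optional[str] = None


def count_by_name(processes: Iterable[ProcessInfo]) -> Counter:
    """Count how many processes share each process name."""
    return Counter(proc.pname for proc in processes)


sample = [
    ProcessInfo("chrome", 4443),
    ProcessInfo("cat", 4448),
    ProcessInfo("cat", 4449),
    ProcessInfo("chrome", 4459),
    ProcessInfo("chrome", 4460),
    ProcessInfo("chrome", 4462),
]
counts = count_by_name(sample)
for pname, total in counts.most_common():
    print(f"{pname}: {total}")
```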

Start with a working directory

start_cmd = "/usr/bin/google-chrome-stable"
work_directory = "/tmp"

result = session.computer.start_app(
    start_cmd=start_cmd,
    work_directory=work_directory
)

# Verification: Result type is ProcessListResult
# Verification: Success = True
# Verification: The application starts in the specified working directory

if result.success:
    processes = result.data
    print(f"Application started with {len(processes)} processes")
    # Output: Application started with 6 processes
else:
    print(f"Failed to start application: {result.error_message}")

Start from the installed application list

result = session.computer.get_installed_apps(
    start_menu=True,
    desktop=False,
    ignore_system_apps=True
)

# Verification: The installed application list was retrieved successfully

if result.success:
    apps = result.data
    
    target_app = None
    for app in apps:
        if "chrome" in app.name.lower():
            target_app = app
            break
    
    # Verification: "Google Chrome" was found in the application list
    
    if target_app:
        print(f"Starting {target_app.name}...")
        # Output: Starting Google Chrome...
        
        start_result = session.computer.start_app(target_app.start_cmd)
        
        # Verification: The application started successfully
        
        if start_result.success:
            print("Application started successfully!")
            # Output: Application started successfully!
        else:
            print(f"Failed to start: {start_result.error_message}")
    else:
        print("Target application not found")
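The name lookup in the loop above can be factored into a small reusable helper. This is a sketch under the assumption that each entry exposes a `name` attribute as described for `InstalledApp`. `AppInfo` and `find_app` are hypothetical names, and the sample start commands are illustrative values only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class AppInfo:
    """Hypothetical stand-in with the same attributes as InstalledApp."""
    name: str
    start_cmd: str
    stop_cmd: Optional[str] = None
    work_directory: Optional[str] = None


def find_app(apps: Iterable[AppInfo], keyword: str) -> Optional[AppInfo]:
    """Return the first app whose name contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return next((app for app in apps if needle in app.name.lower()), None)


sample = [
    AppInfo("AptURL", "apturl %u"),
    AppInfo("Google Chrome", "/usr/bin/google-chrome-stable"),
]
match = find_app(sample, "chrome")
print(match.name if match else "Target application not found")
```

Returning `None` instead of raising keeps the caller's "not found" branch explicit, as in the original example.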

Stop an application

Stop by PID

start_result = session.computer.start_app("/usr/bin/google-chrome-stable")

# Verification: The application started successfully with multiple processes

if start_result.success:
    target_pid = None
    for process in start_result.data:
        print(f"Process: {process.pname} (PID: {process.pid})")
        # Example output:
        # Process: chrome (PID: 6378)
        # Process: cat (PID: 6383)
        # Process: cat (PID: 6384)
        
        if 'chrome' in process.pname.lower():
            target_pid = process.pid
            break
    
    if target_pid:
        result = session.computer.stop_app_by_pid(target_pid)
        
        # Verification: Result type is AppOperationResult
        # Verification: Success = True
        
        if result.success:
            print(f"Successfully stopped process {target_pid}")
            # Output: Successfully stopped process 6378
        else:
            print(f"Failed to stop process: {result.error_message}")
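The example above stops only the first matching PID. If every matching process should be stopped, one option is to call `stop_app_by_pid` for each of them and collect the failures. The sketch below runs offline against `FakeComputer`, a hypothetical stand-in for `session.computer`; `OpResult`, `ProcessInfo`, and `stop_matching` are likewise illustrative names. Whether stopping the first PID already terminates the others is not stated on this page, so a failure for a later PID may simply mean the process has already exited.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class ProcessInfo:
    """Hypothetical stand-in with the pname and pid attributes of Process."""
    pname: str
    pid: int


@dataclass
class OpResult:
    """Hypothetical stand-in for AppOperationResult."""
    success: bool
    error_message: str = ""


class FakeComputer:
    """Offline stand-in for session.computer; fails for the PIDs in `fail`."""

    def __init__(self, fail: Iterable[int] = ()) -> None:
        self.fail = set(fail)
        self.stopped: List[int] = []

    def stop_app_by_pid(self, pid: int) -> OpResult:
        if pid in self.fail:
            return OpResult(False, f"cannot stop {pid}")
        self.stopped.append(pid)
        return OpResult(True)


def stop_matching(computer, processes: Iterable[ProcessInfo], keyword: str) -> List[Tuple[int, str]]:
    """Stop every process whose name contains keyword; return (pid, error) pairs for failures."""
    failures: List[Tuple[int, str]] = []
    for proc in processes:
        if keyword.lower() not in proc.pname.lower():
            continue  # leave unrelated helper processes alone
        result = computer.stop_app_by_pid(proc.pid)
        if not result.success:
            failures.append((proc.pid, result.error_message))
    return failures


procs = [ProcessInfo("chrome", 4443), ProcessInfo("cat", 4448),
         ProcessInfo("chrome", 4459), ProcessInfo("chrome", 4460)]
computer = FakeComputer(fail={4460})
failures = stop_matching(computer, procs, "chrome")
print(computer.stopped, failures)
```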

Stop by process name

start_result = session.computer.start_app("/usr/bin/google-chrome-stable")

# Verification: The application started successfully

if start_result.success:
    target_pname = None
    for process in start_result.data:
        print(f"Process: {process.pname} (PID: {process.pid})")
        target_pname = process.pname
        break
    
    # Verification: Obtained the process name "chrome"
    
    if target_pname:
        result = session.computer.stop_app_by_pname(target_pname)
        
        # Verification: Result type is AppOperationResult
        # Verification: Success = True
        
        if result.success:
            print(f"Successfully stopped process {target_pname}")
            # Output: Successfully stopped process chrome
        else:
            print(f"Failed to stop process: {result.error_message}")

Stop by stop command

result = session.computer.get_installed_apps(
    start_menu=True,
    desktop=False,
    ignore_system_apps=True
)

# Verification: The installed applications were retrieved successfully

if result.success:
    apps = result.data
    
    target_app = None
    for app in apps:
        if app.stop_cmd:
            target_app = app
            break
    
    # Note: Most desktop applications on Linux do not define a stop_cmd.
    # This is expected; use stop_app_by_pid or stop_app_by_pname instead.
    
    if target_app:
        start_result = session.computer.start_app(target_app.start_cmd)
        
        if start_result.success:
            print("Application started successfully!")
            
            result = session.computer.stop_app_by_cmd(target_app.stop_cmd)
            
            # Verification: Result type is AppOperationResult
            
            if result.success:
                print("Successfully stopped application using command")
            else:
                print(f"Failed to stop application: {result.error_message}")
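Since most Linux desktop applications do not define a `stop_cmd`, a caller may want a single helper that uses the stop command when it exists and otherwise falls back to stopping by process name. The following offline sketch illustrates that choice. `RecordingComputer` is a hypothetical stand-in for `session.computer` that only records calls, and `stop_with_fallback`, `AppInfo`, `ProcessInfo`, and `OpResult` are illustrative names rather than SDK APIs.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class AppInfo:
    name: str
    start_cmd: str
    stop_cmd: Optional[str] = None


@dataclass
class ProcessInfo:
    pname: str
    pid: int


@dataclass
class OpResult:
    """Hypothetical stand-in for AppOperationResult."""
    success: bool
    error_message: str = ""


class RecordingComputer:
    """Offline stand-in for session.computer that only records calls."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, str]] = []

    def stop_app_by_cmd(self, stop_cmd: str) -> OpResult:
        self.calls.append(("cmd", stop_cmd))
        return OpResult(True)

    def stop_app_by_pname(self, pname: str) -> OpResult:
        self.calls.append(("pname", pname))
        return OpResult(True)


def stop_with_fallback(computer, app: AppInfo, processes: Sequence[ProcessInfo]) -> Optional[OpResult]:
    """Prefer the app's stop_cmd; otherwise stop by the first process name."""
    if app.stop_cmd:
        return computer.stop_app_by_cmd(app.stop_cmd)
    if processes:
        return computer.stop_app_by_pname(processes[0].pname)
    return None  # no stop command and no known process: nothing to do


computer = RecordingComputer()
chrome = AppInfo("Google Chrome", "/usr/bin/google-chrome-stable")
stop_with_fallback(computer, chrome, [ProcessInfo("chrome", 6378)])
print(computer.calls)
```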

List running applications

result = session.computer.list_visible_apps()

# Verification: Result type is ProcessListResult
# Verification: Success = True
# Verification: Found 1 visible application (chrome, which has a visible window)

if result.success:
    visible_apps = result.data
    print(f"Found {len(visible_apps)} running applications")
    # Output: Found 1 running applications
    
    for app in visible_apps:
        print(f"Process: {app.pname}")
        print(f"PID: {app.pid}")
        print(f"Command: {app.cmdline}")
        print("---")
    # Example output:
    # Process: chrome
    # PID: 6378
    # Command: /opt/google/chrome/chrome
    # ---
else:
    print(f"Error: {result.error_message}")

Process object attributes:

pname (str): The process name.
pid (int): The process ID.
cmdline (Optional[str]): The full command line used to start the process.
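An application's window may take a moment to appear after launch, so a fixed delay can be either too short or longer than needed. One alternative is to poll `list_visible_apps()` until a process with the expected name shows up or a timeout expires. The sketch below is a minimal offline illustration of that idea. `wait_for_visible`, `FakeComputer`, `ListResult`, and `ProcessInfo` are hypothetical names; only the `list_visible_apps()` call and the `success`, `data`, `pname`, and `pid` attributes come from this page.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ProcessInfo:
    pname: str
    pid: int
    cmdline: Optional[str] = None


@dataclass
class ListResult:
    """Hypothetical stand-in for ProcessListResult."""
    success: bool
    data: List[ProcessInfo] = field(default_factory=list)


class FakeComputer:
    """Offline stand-in: reports no visible apps for the first `empty_polls` calls."""

    def __init__(self, empty_polls: int) -> None:
        self.empty_polls = empty_polls
        self.calls = 0

    def list_visible_apps(self) -> ListResult:
        self.calls += 1
        if self.calls <= self.empty_polls:
            return ListResult(True)
        return ListResult(True, [ProcessInfo("chrome", 6378, "/opt/google/chrome/chrome")])


def wait_for_visible(computer, pname: str, timeout: float = 10.0, interval: float = 0.5,
                     sleep: Callable[[float], None] = time.sleep) -> Optional[ProcessInfo]:
    """Poll list_visible_apps() until pname is visible; return None on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        result = computer.list_visible_apps()
        if result.success:
            for proc in result.data:
                if proc.pname == pname:
                    return proc
        if time.monotonic() >= deadline:
            return None
        sleep(interval)


computer = FakeComputer(empty_polls=2)
found = wait_for_visible(computer, "chrome", timeout=5.0, interval=0.0, sleep=lambda _s: None)
print(found, computer.calls)
```

The `sleep` parameter is injectable only so the sketch can run without real delays; with a live session the default `time.sleep` would be used.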

Complete workflow example

import os
import time
from agentbay import AgentBay
from agentbay.session_params import CreateSessionParams

api_key = os.getenv("AGENTBAY_API_KEY")
if not api_key:
    raise ValueError("AGENTBAY_API_KEY environment variable is required")

agent_bay = AgentBay(api_key=api_key)

params = CreateSessionParams(image_id="linux_latest")
result = agent_bay.create(params)

if not result.success:
    print(f"Failed to create session: {result.error_message}")
    exit(1)

session = result.session
print(f"Session created: {session.session_id}")
# Output: Session created: session-xxxxxxxxxxxxxxxxx

print("Step 1: Finding installed applications...")
apps_result = session.computer.get_installed_apps(
    start_menu=True,
    desktop=False,
    ignore_system_apps=True
)

# Verification: 76 applications were retrieved successfully

if not apps_result.success:
    print(f"Failed to get apps: {apps_result.error_message}")
    agent_bay.delete(session)
    exit(1)

target_app = None
for app in apps_result.data:
    if "chrome" in app.name.lower():
        target_app = app
        break

# Verification: The "Google Chrome" application was found

if not target_app:
    print("Google Chrome not found")
    agent_bay.delete(session)
    exit(1)

print(f"Found application: {target_app.name}")
# Output: Found application: Google Chrome

print("Step 2: Launching application...")
start_result = session.computer.start_app(target_app.start_cmd)

# Verification: 6 processes were started successfully

if not start_result.success:
    print(f"Failed to start app: {start_result.error_message}")
    agent_bay.delete(session)
    exit(1)

print(f"Application started with {len(start_result.data)} processes")
# Output: Application started with 6 processes

for process in start_result.data:
    print(f"  - {process.pname} (PID: {process.pid})")
# Example output:
#   - chrome (PID: 6420)
#   - cat (PID: 6425)
#   - cat (PID: 6426)
#   - chrome (PID: 6436)
#   - chrome (PID: 6437)
#   - chrome (PID: 6439)

print("Step 3: Waiting for application to load...")
time.sleep(5)

print("Step 4: Checking running applications...")
visible_result = session.computer.list_visible_apps()

# Verification: Found 1 visible application

if visible_result.success:
    print(f"Found {len(visible_result.data)} visible applications")
    # Output: Found 1 visible applications

print("Step 5: Stopping application...")
if start_result.data:
    stop_result = session.computer.stop_app_by_pid(start_result.data[0].pid)
    
    # Verification: The application was stopped successfully
    
    if stop_result.success:
        print("Application stopped successfully")
        # Output: Application stopped successfully
    else:
        print(f"Failed to stop application: {stop_result.error_message}")

print("Cleaning up session...")
agent_bay.delete(session)
print("Workflow completed!")
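# Output: Workflow completed!

# === Complete workflow verification results ===
# ✓ Session creation: succeeded
# ✓ Get installed applications: found 76 applications
# ✓ Find target application: found Google Chrome
# ✓ Start application: started 6 processes
# ✓ List visible applications: 1 visible application
# ✓ Stop application: stopped successfully
# ✓ Session cleanup: succeeded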
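The workflow above calls `agent_bay.delete(session)` on every early-exit path, and an unexpected exception would skip cleanup entirely. One way to centralize this is a context manager that always deletes the session on exit. The sketch below assumes only the `create(params)` and `delete(session)` calls and the `success`, `session`, and `error_message` attributes shown on this page. `managed_session`, `FakeAgentBay`, and `FakeResult` are hypothetical names used so the example runs offline.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Any, Iterator, List


@dataclass
class FakeResult:
    """Hypothetical stand-in for the result of agent_bay.create()."""
    success: bool
    session: Any = None
    error_message: str = ""


class FakeAgentBay:
    """Offline stand-in for the AgentBay client; records deleted sessions."""

    def __init__(self) -> None:
        self.deleted: List[Any] = []

    def create(self, params: Any) -> FakeResult:
        return FakeResult(True, session=f"session-for-{params}")

    def delete(self, session: Any) -> None:
        self.deleted.append(session)


@contextmanager
def managed_session(agent_bay, params) -> Iterator[Any]:
    """Create a session and delete it on exit, even if the body raises."""
    result = agent_bay.create(params)
    if not result.success:
        raise RuntimeError(f"Failed to create session: {result.error_message}")
    try:
        yield result.session
    finally:
        agent_bay.delete(result.session)


client = FakeAgentBay()
try:
    with managed_session(client, "linux_latest") as session:
        raise ValueError("simulated failure mid-workflow")
except ValueError:
    pass
print(client.deleted)
```

With the real client, the body of the `with` block would contain the discovery, launch, and stop steps, and the explicit `delete` calls could be dropped.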

API reference

Cloud computer application management methods

All application management methods are accessed through session.computer.*:

Method	Parameters	Return value	Description
`get_installed_apps()`	`start_menu: bool = True`<br/>`desktop: bool = False`<br/>`ignore_system_apps: bool = True`	`InstalledAppListResult`	Gets the list of installed applications
`start_app()`	`start_cmd: str`<br/>`work_directory: str = ""`<br/>`activity: str = ""`	`ProcessListResult`	Starts an application
`stop_app_by_pid()`	`pid: int`	`AppOperationResult`	Stops an application by process ID
`stop_app_by_pname()`	`pname: str`	`AppOperationResult`	Stops an application by process name
`stop_app_by_cmd()`	`stop_cmd: str`	`AppOperationResult`	Stops an application by its stop command
`list_visible_apps()`	None	`ProcessListResult`	Lists the currently visible applications

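Return types

InstalledAppListResult

success (bool): Whether the operation succeeded.
data (List[InstalledApp]): The list of installed applications.
error_message (str): The error message if the operation failed.
request_id (str): The unique request identifier.

InstalledApp

name (str): The application name.
start_cmd (str): The command that starts the application.
stop_cmd (Optional[str]): The command that stops the application.
work_directory (Optional[str]): The working directory of the application.

ProcessListResult

success (bool): Whether the operation succeeded.
data (List[Process]): The list of process objects.
error_message (str): The error message if the operation failed.
request_id (str): The unique request identifier.

Process

pname (str): The process name.
pid (int): The process ID.
cmdline (Optional[str]): The full command line.

AppOperationResult

success (bool): Whether the operation succeeded.
error_message (str): The error message if the operation failed.
request_id (str): The unique request identifier.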
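For unit tests that should not contact the service, the field lists above can be mirrored with plain dataclasses. The following is only a sketch of the documented shapes, not the SDK's actual class definitions: the default values and the absence of any inheritance between the result types are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InstalledApp:
    name: str
    start_cmd: str
    stop_cmd: Optional[str] = None
    work_directory: Optional[str] = None


@dataclass
class Process:
    pname: str
    pid: int
    cmdline: Optional[str] = None


@dataclass
class InstalledAppListResult:
    success: bool
    data: List[InstalledApp] = field(default_factory=list)
    error_message: str = ""  # default is an assumption
    request_id: str = ""     # default is an assumption


@dataclass
class ProcessListResult:
    success: bool
    data: List[Process] = field(default_factory=list)
    error_message: str = ""
    request_id: str = ""


@dataclass
class AppOperationResult:
    success: bool
    error_message: str = ""
    request_id: str = ""


# Build a canned result for a test double of list_visible_apps().
canned = ProcessListResult(
    success=True,
    data=[Process("chrome", 6378, "/opt/google/chrome/chrome")],
    request_id="req-demo",  # invented identifier
)
print(canned.data[0].pname, canned.data[0].pid)
```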