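This article describes the application management capabilities that the AgentBay SDK provides for cloud computers. It covers how to discover, launch, monitor, and control desktop applications in a cloud environment.
Overview
The Computer Use module provides comprehensive application management for desktop environments, including:
Application discovery - Find the applications installed on the system.
Application lifecycle management - Start and stop desktop applications.
Process monitoring - Track running applications and their processes.
Desktop automation - Automate complex desktop workflows.
Create a session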
import os
import sys

from agentbay import AgentBay
from agentbay.session_params import CreateSessionParams

# Read the API key from the environment rather than hard-coding it.
api_key = os.getenv("AGENTBAY_API_KEY")
if not api_key:
    raise ValueError("AGENTBAY_API_KEY environment variable is required")

agent_bay = AgentBay(api_key=api_key)

# Create a session from the linux_latest image.
params = CreateSessionParams(image_id="linux_latest")
result = agent_bay.create(params)

if result.success:
    session = result.session
    print(f"Session created: {session.session_id}")
    # Output: Session created: session-xxxxxxxxxxxxxxxxx
else:
    print(f"Failed to create session: {result.error_message}")
    sys.exit(1)

Get installed applications
result = session.computer.get_installed_apps(
    start_menu=True,
    desktop=False,
    ignore_system_apps=True
)
# Verification: Result type is InstalledAppListResult
# Verification: Success = True
# Verification: Found 76 installed applications on test system

if result.success:
    apps = result.data
    print(f"Found {len(apps)} installed applications")
    # Output: Found 76 installed applications

    # Print only the first five entries.
    for app in apps[:5]:
        print(f"Name: {app.name}")
        print(f"Start Command: {app.start_cmd}")
        print(f"Stop Command: {app.stop_cmd or 'N/A'}")
        print(f"Work Directory: {app.work_directory or 'N/A'}")
        print("---")
    # Output example:
    # Name: AptURL
    # Start Command: apturl %u
    # Stop Command: N/A
    # Work Directory: N/A
    # ---
    # Name: Bluetooth Transfer
    # Start Command: bluetooth-sendto
    # Stop Command: N/A
    # Work Directory: N/A
    # ---
else:
    print(f"Error: {result.error_message}")

Parameters:
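start_menu (bool): Whether to include applications from the start menu.
desktop (bool): Whether to include desktop applications.
ignore_system_apps (bool): Whether to filter out system applications.
Return value:
An InstalledAppListResult containing a list of InstalledApp objects.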
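As a small illustration of working with the returned list, the sketch below looks up an application by a case-insensitive keyword. `FakeInstalledApp` and `find_app` are hypothetical names introduced here and are not part of the SDK; the sample entries (including the Chrome start command) are illustrative stand-ins for `result.data` so the snippet runs without a session.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class FakeInstalledApp:
    """Hypothetical stand-in mirroring the InstalledApp fields used in this article."""
    name: str
    start_cmd: str
    stop_cmd: Optional[str] = None
    work_directory: Optional[str] = None


def find_app(apps: Iterable, keyword: str):
    """Return the first app whose name contains keyword (case-insensitive), else None."""
    needle = keyword.lower()
    return next((app for app in apps if needle in app.name.lower()), None)


# Illustrative data only; a real list would come from get_installed_apps().
sample_apps = [
    FakeInstalledApp(name="AptURL", start_cmd="apturl %u"),
    FakeInstalledApp(name="Bluetooth Transfer", start_cmd="bluetooth-sendto"),
    FakeInstalledApp(name="Google Chrome", start_cmd="/usr/bin/google-chrome-stable"),
]

match = find_app(sample_apps, "chrome")
print(match.start_cmd if match else "not found")
```

With a real session, the same function could presumably be given `result.data` directly, since it only reads the `name` attribute.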
Start an application
Start by command
start_cmd = "/usr/bin/google-chrome-stable"
result = session.computer.start_app(start_cmd)
# Verification: Result type is ProcessListResult
# Verification: Success = True
# Verification: Started 6 processes (the main chrome process plus helper processes)

if result.success:
    processes = result.data
    print(f"Application started with {len(processes)} processes")
    # Output: Application started with 6 processes

    for process in processes:
        print(f"Process: {process.pname} (PID: {process.pid})")
    # Output example:
    # Process: chrome (PID: 4443)
    # Process: cat (PID: 4448)
    # Process: cat (PID: 4449)
    # Process: chrome (PID: 4459)
    # Process: chrome (PID: 4460)
    # Process: chrome (PID: 4462)
else:
    print(f"Failed to start application: {result.error_message}")
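Because a single launch can return several processes (chrome plus helper processes such as cat in the output above), it can help to collect only the PIDs that belong to one process name. The sketch below is one way to do that; `FakeProcess` and `pids_for` are hypothetical names, not SDK members, and the sample list copies the example output instead of calling the service.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FakeProcess:
    """Hypothetical stand-in with the pname and pid fields shown above."""
    pname: str
    pid: int


def pids_for(processes: Iterable, pname: str) -> List[int]:
    """Collect the PIDs of every process whose name equals pname (case-insensitive)."""
    wanted = pname.lower()
    return [p.pid for p in processes if p.pname.lower() == wanted]


# Sample data copied from the example output above.
sample = [
    FakeProcess("chrome", 4443),
    FakeProcess("cat", 4448),
    FakeProcess("cat", 4449),
    FakeProcess("chrome", 4459),
    FakeProcess("chrome", 4460),
    FakeProcess("chrome", 4462),
]

print(pids_for(sample, "chrome"))  # [4443, 4459, 4460, 4462]
```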
Start with a working directory
start_cmd = "/usr/bin/google-chrome-stable"
work_directory = "/tmp"

result = session.computer.start_app(
    start_cmd=start_cmd,
    work_directory=work_directory
)
# Verification: Result type is ProcessListResult
# Verification: Success = True
# Verification: The application started in the specified working directory

if result.success:
    processes = result.data
    print(f"Application started with {len(processes)} processes")
    # Output: Application started with 6 processes
else:
    print(f"Failed to start application: {result.error_message}")
Start from the installed applications list
result = session.computer.get_installed_apps(
    start_menu=True,
    desktop=False,
    ignore_system_apps=True
)
# Verification: The installed applications list was retrieved successfully

if result.success:
    apps = result.data

    # Pick the first application whose name contains "chrome".
    target_app = None
    for app in apps:
        if "chrome" in app.name.lower():
            target_app = app
            break
    # Verification: Found "Google Chrome" in the application list

    if target_app:
        print(f"Starting {target_app.name}...")
        # Output: Starting Google Chrome...

        start_result = session.computer.start_app(target_app.start_cmd)
        # Verification: The application started successfully

        if start_result.success:
            print("Application started successfully!")
            # Output: Application started successfully!
        else:
            print(f"Failed to start: {start_result.error_message}")
    else:
        print("Target application not found")

Stop an application
Stop by PID
start_result = session.computer.start_app("/usr/bin/google-chrome-stable")
# Verification: The application started successfully with multiple processes

if start_result.success:
    # List every process that was started.
    for process in start_result.data:
        print(f"Process: {process.pname} (PID: {process.pid})")
    # Output example:
    # Process: chrome (PID: 6378)
    # Process: cat (PID: 6383)
    # Process: cat (PID: 6384)

    # Choose the first chrome process as the one to stop.
    target_pid = None
    for process in start_result.data:
        if "chrome" in process.pname.lower():
            target_pid = process.pid
            break

    if target_pid:
        result = session.computer.stop_app_by_pid(target_pid)
        # Verification: Result type is AppOperationResult
        # Verification: Success = True

        if result.success:
            print(f"Successfully stopped process {target_pid}")
            # Output: Successfully stopped process 6378
        else:
            print(f"Failed to stop process: {result.error_message}")

Stop by process name
start_result = session.computer.start_app("/usr/bin/google-chrome-stable")
# Verification: The application started successfully

if start_result.success:
    # Use the name of the first process that was returned.
    target_pname = None
    for process in start_result.data:
        print(f"Process: {process.pname} (PID: {process.pid})")
        target_pname = process.pname
        break
    # Verification: Obtained the process name "chrome"

    if target_pname:
        result = session.computer.stop_app_by_pname(target_pname)
        # Verification: Result type is AppOperationResult
        # Verification: Success = True

        if result.success:
            print(f"Successfully stopped process {target_pname}")
            # Output: Successfully stopped process chrome
        else:
            print(f"Failed to stop process: {result.error_message}")

Stop by stop command
result = session.computer.get_installed_apps(
    start_menu=True,
    desktop=False,
    ignore_system_apps=True
)
# Verification: The installed applications were retrieved successfully

if result.success:
    apps = result.data

    # Pick the first application that defines a stop command.
    target_app = None
    for app in apps:
        if app.stop_cmd:
            target_app = app
            break
    # Note: Most desktop applications on Linux do not define a stop_cmd.
    # This is normal; use stop_app_by_pid or stop_app_by_pname instead.

    if target_app:
        start_result = session.computer.start_app(target_app.start_cmd)

        if start_result.success:
            print("Application started successfully!")

            result = session.computer.stop_app_by_cmd(target_app.stop_cmd)
            # Verification: Result type is AppOperationResult

            if result.success:
                print("Successfully stopped application using command")
            else:
                print(f"Failed to stop application: {result.error_message}")

List running applications
result = session.computer.list_visible_apps()
# Verification: Result type is ProcessListResult
# Verification: Success = True
# Verification: Found 1 visible application (chrome, which has a visible window)

if result.success:
    visible_apps = result.data
    print(f"Found {len(visible_apps)} running applications")
    # Output: Found 1 running applications

    for app in visible_apps:
        print(f"Process: {app.pname}")
        print(f"PID: {app.pid}")
        print(f"Command: {app.cmdline}")
        print("---")
    # Output example:
    # Process: chrome
    # PID: 6378
    # Command: /opt/google/chrome/chrome
    # ---
else:
    print(f"Error: {result.error_message}")

Process object attributes:
pname (str): Process name.
pid (int): Process ID.
cmdline (Optional[str]): The full command line used to start the process.
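One possible use of these attributes is to check which of the processes returned by start_app also appear in the list_visible_apps result. The sketch below matches them by PID, which assumes both calls report the same PID for the same process (the example outputs in this article are consistent with that, but it is not stated as a guarantee). `FakeProcess` and `with_visible_window` are hypothetical names, not part of the SDK.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class FakeProcess:
    """Hypothetical stand-in with the three attributes listed above."""
    pname: str
    pid: int
    cmdline: Optional[str] = None


def with_visible_window(started: Iterable, visible: Iterable) -> List:
    """Keep the started processes whose PID also appears in the visible list."""
    visible_pids = {p.pid for p in visible}
    return [p for p in started if p.pid in visible_pids]


# Sample values taken from the example outputs in this article.
started = [FakeProcess("chrome", 6378), FakeProcess("cat", 6383), FakeProcess("cat", 6384)]
visible = [FakeProcess("chrome", 6378, "/opt/google/chrome/chrome")]

for proc in with_visible_window(started, visible):
    print(f"{proc.pname} (PID: {proc.pid}) has a visible window")
```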
Complete workflow example
import os
import sys
import time

from agentbay import AgentBay
from agentbay.session_params import CreateSessionParams

api_key = os.getenv("AGENTBAY_API_KEY")
if not api_key:
    raise ValueError("AGENTBAY_API_KEY environment variable is required")

agent_bay = AgentBay(api_key=api_key)

params = CreateSessionParams(image_id="linux_latest")
result = agent_bay.create(params)
if not result.success:
    print(f"Failed to create session: {result.error_message}")
    sys.exit(1)

session = result.session
print(f"Session created: {session.session_id}")
# Output: Session created: session-xxxxxxxxxxxxxxxxx

print("Step 1: Finding installed applications...")
apps_result = session.computer.get_installed_apps(
    start_menu=True,
    desktop=False,
    ignore_system_apps=True
)
# Verification: Retrieved 76 applications
if not apps_result.success:
    print(f"Failed to get apps: {apps_result.error_message}")
    agent_bay.delete(session)
    sys.exit(1)

# Pick the first application whose name contains "chrome".
target_app = None
for app in apps_result.data:
    if "chrome" in app.name.lower():
        target_app = app
        break
# Verification: Found the "Google Chrome" application
if not target_app:
    print("Google Chrome not found")
    agent_bay.delete(session)
    sys.exit(1)

print(f"Found application: {target_app.name}")
# Output: Found application: Google Chrome

print("Step 2: Launching application...")
start_result = session.computer.start_app(target_app.start_cmd)
# Verification: Started 6 processes
if not start_result.success:
    print(f"Failed to start app: {start_result.error_message}")
    agent_bay.delete(session)
    sys.exit(1)

print(f"Application started with {len(start_result.data)} processes")
# Output: Application started with 6 processes
for process in start_result.data:
    print(f" - {process.pname} (PID: {process.pid})")
# Output example:
# - chrome (PID: 6420)
# - cat (PID: 6425)
# - cat (PID: 6426)
# - chrome (PID: 6436)
# - chrome (PID: 6437)
# - chrome (PID: 6439)

print("Step 3: Waiting for application to load...")
time.sleep(5)

print("Step 4: Checking running applications...")
visible_result = session.computer.list_visible_apps()
# Verification: Found 1 visible application
if visible_result.success:
    print(f"Found {len(visible_result.data)} visible applications")
    # Output: Found 1 visible applications

print("Step 5: Stopping application...")
if start_result.data:
    # Stop the first process that start_app returned.
    stop_result = session.computer.stop_app_by_pid(start_result.data[0].pid)
    # Verification: The application was stopped successfully
    if stop_result.success:
        print("Application stopped successfully")
        # Output: Application stopped successfully
    else:
        print(f"Failed to stop application: {stop_result.error_message}")

print("Cleaning up session...")
agent_bay.delete(session)
print("Workflow completed!")
# Output: Workflow completed!

# === Complete workflow verification results ===
# ✓ Session creation: succeeded
# ✓ Get installed applications: found 76 applications
# ✓ Find target application: found Google Chrome
# ✓ Start application: started 6 processes
# ✓ List visible applications: 1 visible application
# ✓ Stop application: stopped successfully
# ✓ Session cleanup: succeeded

API reference
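Cloud computer application management methods
All application management methods are accessed through session.computer.*:
| Method | Parameters | Returns | Description |
| get_installed_apps() | start_menu (bool), desktop (bool), ignore_system_apps (bool) | InstalledAppListResult | Get the list of installed applications |
| start_app() | start_cmd (str), work_directory (str, optional) | ProcessListResult | Start an application |
| stop_app_by_pid() | process ID (int) | AppOperationResult | Stop an application by process ID |
| stop_app_by_pname() | process name (str) | AppOperationResult | Stop an application by process name |
| stop_app_by_cmd() | stop command (str) | AppOperationResult | Stop an application by its stop command |
| list_visible_apps() | None | ProcessListResult | List the currently visible applications |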
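Since this article shows three ways to stop an application (stop_app_by_cmd, stop_app_by_pid, stop_app_by_pname) and notes that most Linux desktop applications define no stop command, a caller may want a fallback order. The sketch below shows one possible order; `choose_stop_call` is a hypothetical helper, the preference order is an assumption rather than SDK guidance, and the stop command string in the usage lines is made up.

```python
from typing import Optional, Tuple, Union


def choose_stop_call(
    stop_cmd: Optional[str],
    pid: Optional[int],
    pname: Optional[str],
) -> Tuple[str, Union[str, int]]:
    """Return (method name, argument) for the first stop option that is available."""
    if stop_cmd:
        return ("stop_app_by_cmd", stop_cmd)
    if pid is not None:
        return ("stop_app_by_pid", pid)
    if pname:
        return ("stop_app_by_pname", pname)
    raise ValueError("no stop command, PID, or process name available")


# No stop command defined, so the PID is used.
print(choose_stop_call(None, 6378, "chrome"))
# A (made-up) stop command takes precedence when present.
print(choose_stop_call("example-app --quit", 6378, "chrome"))
```

With a live session, the returned pair could then be dispatched with something like `getattr(session.computer, method)(argument)`.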
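Return types
InstalledAppListResult
success (bool): Whether the operation succeeded.
data (List[InstalledApp]): The list of installed applications.
error_message (str): The error message when the operation fails.
request_id (str): A unique request identifier.
InstalledApp
name (str): Application name.
start_cmd (str): The command that starts the application.
stop_cmd (Optional[str]): The command that stops the application.
work_directory (Optional[str]): The application's working directory.
ProcessListResult
success (bool): Whether the operation succeeded.
data (List[Process]): The list of process objects.
error_message (str): The error message when the operation fails.
request_id (str): A unique request identifier.
Process
pname (str): Process name.
pid (int): Process ID.
cmdline (Optional[str]): The full command line.
AppOperationResult
success (bool): Whether the operation succeeded.
error_message (str): The error message when the operation fails.
request_id (str): A unique request identifier.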
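All three result types share success, error_message, and request_id, so the repeated "check success, print error" pattern can be folded into one helper. The following is a minimal sketch of that idea; `FakeResult`, `AppOperationError`, and `require_success` are hypothetical names introduced for illustration and do not exist in the SDK.

```python
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Hypothetical stand-in carrying the fields shared by the result types."""
    success: bool
    error_message: str = ""
    request_id: str = ""


class AppOperationError(RuntimeError):
    """Hypothetical exception type; not part of the SDK."""


def require_success(result, action: str):
    """Return result unchanged on success; otherwise raise with the error details."""
    if not result.success:
        raise AppOperationError(
            f"{action} failed: {result.error_message} (request_id={result.request_id})"
        )
    return result


ok = require_success(FakeResult(success=True, request_id="req-1"), "start_app")
print(ok.request_id)

try:
    require_success(FakeResult(False, "process not found", "req-2"), "stop_app_by_pid")
except AppOperationError as exc:
    print(exc)
```

Including the request_id in the message keeps the identifier at hand when a failure has to be reported or traced.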