AgentRun

更新时间:
复制 MD 格式

AgentRunTry AgentRun Now

AgentRun is an all-in-one Agentic AI infrastructure platform that is code-centric, open, and composable. It provides full lifecycle management, including development, deployment, and operations, for enterprise-grade agent applications.

In essence, AgentRun combines a cloud native runtime for agent applications, a sandbox platform, model governance and a tool ecosystem, and built-in security and observability capabilities.

It enables your team to focus on business logic and agent behavior instead of building and managing the entire underlying infrastructure. With AgentRun, you get a serverless platform optimized for agent scenarios, including the execution environment, model gateway, tool invocation, logging, monitoring, and permission systems.

Capabilities

Build and evolve agent applications

  • Offers three development modes: no-code, low-code, and pro-code:

    • No-code (AI Studio): For business or operations staff to build agents through a visual interface.

    • Low-code (Quick Create Agent): Quickly build a runnable agent by selecting a model, writing a prompt, and configuring tools and a sandbox in the UI.

    • Pro-code (Create Agent with Code): Implement complex logic and achieve production-readiness by using languages like Python, Node.js, or Java with any framework.

  • Switch from low-code to pro-code with one click:

    • After you validate a prototype in the quick-create mode, you can convert it to the code mode with a single click.

    • The platform generates well-structured, maintainable code based on your current configuration, allowing you to iterate in pro-code mode without rewriting from scratch.

  • Provides open SDKs and APIs to enhance pro-code integration:

    • A unified HTTP API, compatible with protocols like OpenAI Chat Completions, lets you directly call agents, models, and tools from any language or backend service.

    • Multi-language SDKs for Python, Node.js, and more encapsulate authentication and call details. This enables pro-code projects to perform the following actions with minimal code:

      • Call agents hosted on AgentRun from your existing services.

      • Call the model proxy layer to use governance features like multi-model fallback and load balancing.

      • Use the serverless sandbox to perform complex tasks such as code execution and Browser Use.

  • Stay open and avoid vendor lock-in:

    • AgentRun integrates with mainstream agent frameworks like LangChain, AgentScope, CrewAI, and Google ADK. For more information, see the AgentRun SDK Documentation.

    • You can use AgentRun's capabilities selectively. For example, use only the sandbox, model proxy, or observability features and integrate them with your existing systems. This modular, plug-and-play approach eliminates the need to migrate your entire stack to a closed platform.

Production-grade agent runtime

  • A serverless runtime powered by Function Compute (FC):

    • Handles sparse agent invocations and traffic bursts with ease by using the elastic scaling capabilities of a serverless architecture.

    • Session affinity ensures that requests from the same session are routed to the same instance, which simplifies continuous conversations and state management.

    • Supports scale to zero. Resources are automatically released after a session becomes inactive. Billing is adjusted during this light sleep state to balance performance and cost.

  • Built-in multi-language and multi-type runtimes:

    • The agent runtime supports popular languages like Python, Node.js, and Java.

    • The sandbox runtime has built-in support for over 50 languages and also allows you to use a custom image.

    • You do not need to manage servers, containers, or Kubernetes clusters.

Unified model access and governance

  • Centralized management of large models:

    • Supports models from leading providers such as Qwen and DeepSeek, as well as open-source models.

    • Use FunModel to host open-source models as OpenAI-compatible APIs with a single click.

    • Supports vector model management for retrieval and Retrieval-Augmented Generation (RAG) scenarios.

  • Model governance and high availability:

    • A unified model proxy layer abstracts away the differences between various provider APIs.

    • Features built-in multi-model load balancing, fallback, concurrency control, and timeout control.

    • Supports content moderation, token-based rate limiting, automatic retries, and cost monitoring.

Out-of-the-box sandbox capabilities

  • Sandbox as a service:

    • Code Interpreter: Securely execute code in languages like Python, Node.js, and Java, with support for file and session management.

    • Browser Use: Provides a stable browser automation environment using the CDP over WebSocket protocol, compatible with Puppeteer and Playwright.

    • Computer/Mobile Use (Planned): Future extensions will support advanced operations like GUI interaction and remote desktop.

  • Enterprise-grade security and isolation:

    • Multi-level isolation at the request, instance, and session levels based on secure containers (MicroVMs).

    • Storage isolation with support for mounting Object Storage Service (OSS) or NAS.

    • The platform maintains and updates the sandbox environment, freeing developers from building, patching, and managing dependencies.

  • Performance features of the serverless sandbox:

    • Millisecond-level wakeup from light sleep: Briefly idle sandbox instances automatically enter a light sleep state and can be woken up in milliseconds when a new request arrives, virtually eliminating cold start delays.

    • Second-level wakeup from deep sleep: Instances with no traffic for an extended period enter a more power-efficient deep sleep mode and can still be restored within seconds, balancing cost and performance.

    • Concurrent execution of millions of sandbox templates: The underlying infrastructure supports the concurrent execution of millions of function-level sandbox templates. This capability allows large-scale, diverse sandboxes to run simultaneously on a single platform, making it ideal for enterprise scenarios with multiple teams and business lines.

Unified tool ecosystem and MCP support

  • Tool Hub:

    • Provides a vast collection of tools that can be deployed with one click.

    • Allows you to publish custom tools and build your own agent tool ecosystem.

  • Dual-protocol support for MCP and Function Call:

    • Convert anything into an MCP tool: Agents, sandboxes, and API-based tools can be converted to the MCP format with one click.

    • Supports advanced extensions like pre- and post-processing hooks, semantic analysis, and intelligent routing.

    • Compatible with most MCP and Function Call tools available on the market.

    • Package MCP tools: Aggregate multiple MCP tools and API tools into a unified MCP gateway. This exposes them through a single endpoint for simplified management and access.

Observability and cost analysis

  • End-to-end observability:

    • Provides end-to-end tracing based on OpenTelemetry for the entire request lifecycle, from user request to external dependency.

    • Displays key metrics such as QPS, latency distribution, and error rates.

  • Cost and performance assessment:

    • Token-level cost attribution: Pinpoint costs associated with specific operations like model calls, vector searches, and tool invocations.

    • Multi-dimensional analysis: Analyze costs and performance by user, session, or agent type.

    • Unified log storage and analysis: Use logs for quality assessment, security auditing, and semantic analysis.

Enterprise-grade security and data residency

  • Multi-layered security isolation:

    • Isolation at the request, instance, and session levels.

    • Secure isolation between the agent runtime and the sandbox runtime.

    • Connect to your VPC or IDC network to ensure your data never leaves your private environment.

  • Flexible deployment of data and memory:

    • Deeply integrates with open-source projects like Mem0 and RAGFlow.

    • Supports both a one-click hosted mode and the ability to connect to your existing deployments in a VPC or IDC.

    • Enterprises can choose to deploy core data on-premises while hosting general data in the cloud, balancing security and efficiency.

Core components and architecture

image

AgentRuntime

Provides a unified execution environment and lifecycle management for agents.

Key features:

  • Multiple development modes:

  • Multi-language runtimes:

    • Python 3.10/3.12, Node.js 18/20, Java 8/11/17, and more.

  • Deployment methods:

    • Upload a code package (from a local machine or OSS), use the online editor, or specify a custom image.

  • Runtime capabilities:

    • Session affinity, serverless elasticity, and multi-instance concurrency.

    • Version management, endpoint management, and canary releases.

  • Integration ecosystem:

    • SDK integration, API integration (compatible with OpenAI Chat Completions), UI integration (for full-stack applications), and MCP integration.

Sandbox

Provides a secure, high-performance, and serverless sandbox for code execution and browser automation.

Key features:

  • Multiple sandbox types:

    • Code Interpreter and Browser Use, with future support for All-in-One, Reinforcement Learning (RL), Simulation (Sim), and more.

  • Isolation and elasticity:

    • Based on secure containers (MicroVMs) with multi-level isolation.

    • Supports scale to zero and elastic scheduling based on requests.

    • Millisecond-level wakeup and rapid delivery of tens of thousands of instances per minute.

  • Integration methods:

    • Can be integrated into an agent through SDK calls or as an MCP tool.

    • Supports both pre-built and custom images.

Model management

A centralized hub for accessing, managing, and governing large models.

Key features:

  • Model sources:

    • Third-party models (such as Qwen and DeepSeek), self-hosted open-source models (using frameworks such as vLLM, SGLang, Ollama, or LMDeploy), and vector models.

  • Model service provider plugins:

    • Centrally manage authentication credentials and connection details for various model services.

  • Model runtime:

    • A serverless model runtime that supports out-of-the-box usage, secondary development with DevPods, elastic GPU delivery, and scaling to zero during off-peak hours.

  • Model governance:

    • Multi-model load balancing, fallback, concurrency control, timeouts, and caching.

    • Content moderation, token-based rate limiting, and cost monitoring.

Tool management

A centralized hub for defining, invoking, and governing tools.

Key features:

  • Unified tool interface:

    • Supports both MCP and Function Call protocols.

    • Manages tool invocation logic through a unified API, which reduces development complexity.

  • Tool Hub ecosystem:

    • Provides a large number of common tools for one-click integration.

    • Supports publishing and sharing custom tools.

  • Intelligent extensions:

    • Supports advanced features like hook injection, semantic analysis, and intelligent routing.

    • Planned features include an AI-powered engine for automatically generating tool definitions and recommending tools.

Credential management

Centrally manages all credentials required for accessing agents, sandboxes, LLMs, and tools.

Key features:

  • Supports multiple credential types:

    • API keys, JWT, Basic Auth, AccessKey pairs (AK/SK), and more.

  • Dynamic credential injection:

    • Works with the AgentRun runtime to securely inject credentials at runtime.

  • Enable/Disable control:

    • Allows you to instantly disable credentials that are suspected of being compromised, which mitigates security risks.

Observability and operations

Eliminates the "black box" nature of agents by providing the insights needed to observe and optimize production workloads.

Key features:

  • End-to-end tracing:

    • Provides consistent tracing from the initial request through the gateway, agent, model, tools, and external dependencies.

  • AI application monitoring:

    • Build monitoring dashboards with Prometheus or ARMS to analyze model performance, token costs, GPU anomalies, and more.

  • Logging and evaluation:

    • Unified log storage with support for search and SQL analysis.

    • Supports secondary analysis of model call logs for quality, security, and intent evaluation.

Prerequisites and access methods

Service-Linked Role authorization

When you use AgentRun for the first time, you must grant permissions through a Service-Linked Role (SLR):

  • Log in to the AgentRun Console.

  • The system automatically checks for the required SLR permissions and, if they are missing, displays a dialog box to guide you through authorization. Follow the prompts to create and authorize the following roles:

    • Custom role AliyunDevsCustomRole (trusted entity: devs.aliyuncs.com)

    • Default role AliyunDevsDefaultRole

    • Service-Linked Role AliyunServiceRoleForFC and AliyunServiceRoleForAgentRun

  • After authorization is complete, you can create and manage agents, sandboxes, models, and other resources.

Permissions and networking

  • You must grant your account or RAM user the necessary permissions to access resources like Function Compute (FC), sandboxes, large models, logs, OSS, and VPCs. For more information, see Authorize a RAM user to use AgentRun.

  • To access AgentRun services from within a VPC, you can configure a private endpoint by using PrivateLink. For more information, see Access AgentRun resources over PrivateLink.

  • Using the UI integration feature requires permissions related to the devs pipeline.

Summary

AgentRun is not just an "agent development framework" or a "model invocation SDK." It is an all-in-one infrastructure platform for enterprise-grade agents that provides:

  • A serverless agent runtime and sandbox runtime.

  • A model runtime and model governance.

  • A tool/MCP ecosystem with unified invocation governance.

  • Centralized credential management and enterprise-grade security isolation.

  • End-to-end observability and cost analysis.

This enables teams to seamlessly build and evolve agent applications on a single platform—from no-code prototypes to pro-code production systems—while maintaining full control over data security and technology choices.