Model APIs

Model APIs let AI application teams configure and debug the AI Gateway and pre-configure plugins for AI proxy, AI observability, consumer authorization, and content moderation.

Create Model API

Log on to the AI Gateway console and choose Instance. In the top menu bar, select a region, then click the target instance ID.
In the navigation pane on the left, choose Model API, then click Create Model API.
Select a use case and click Create.

The use case determines the available Protocol options and default routes. Supported use cases:
- Text Generation: Supports OpenAI-compatible and Anthropic protocols.
- Image Generation
- Video Generation
- Speech Synthesis
- Embedding
- Rerank
- Others
Configure basic information.

In the dialog, complete the Select a scenario. step, then configure the Create Model API form:
- Protocol: Each protocol provides a set of built-in default routes for the selected use case, which allows you to quickly generate compatible interfaces for services such as OpenAI, DashScope, and vLLM.
  
  Note
  Protocol conversion may change the Token statistics structure. For example, the input token statistics for the Alibaba Cloud Model Studio (DashScope) protocol include cache tokens, while the input token statistics for the Anthropic protocol do not include cache tokens. Pay attention to the differences in statistical metrics across protocols when viewing observability data.
- API Name: A unique name within your account. Up to 64 characters; supports letters, digits, underscores (_), and hyphens (-).
- Domain Name: Select one or more domain names for API access. Each domain name + BasePath combination must be unique.
  
  If you do not have a domain name, click the Add Domain Name button to create one.
- base path: The base request path of the API. Defaults to /. Optionally enable Remove during backend forwarding.
  Note
  If you enable Remove during backend forwarding, the system strips the base path from the request before forwarding to the backend. For example:
  - The base path is set to /api.
  - The original request path is /api/users.
  - The path forwarded to the backend service becomes /users.
- AI Request Monitoring: Enables metrics, logs, and traces. Logging and tracing require SLS. Select Record request content and Log response to record model requests and responses.
  
  Important
  When enabled, all AI request content including the request body is recorded to the access log. Ensure SLS is properly configured and data security safeguards are in place.
- Model Service: Supports Single-model Service, Multi-model Service (by model name), Multi-model Service (by proportion), Multiple Services (by observability metrics), and Multiple Services (intelligent routing).
  - Single-model Service: Select one AI service and set the Model Name. The model name can be passed through or rewritten.
  - Multi-model Service (by model name): Routes requests to different services by matching the model name in the request body with a rule. The rule supports wildcards such as ? and *. For example, qwen-* can match qwen-max and qwen-long.
  - Multi-model Service (by proportion): Select multiple AI services and set their weights. The model name can be passed through or rewritten.
  - Multi-Service (by Metrics): Automatically routes requests to the optimal service based on observability metrics such as response time and success rate, without manual weight configuration.
  - Multi-Model Service (Smart Routing): The system automatically selects the most suitable model based on model characteristics. Intelligent Routing.
    
    Note
    To use the Multiple Services (by observability metrics) and Multiple Services (intelligent routing) features, you must upgrade AI Gateway to version 2.1.15 or later.
- fallback: You can Enable this feature to configure a sequence of fallback policies. The same service can be used in multiple policies.
- first packet timeout: Maximum wait time (ms) for the first data packet in a streaming response. Set to 0 to disable.
- resource: Select the default resource group, an existing one, or create a new one for grouping, authorizing, and monitoring resources.
  
  To create a new Resource Group, click Create Resource Group.
Review your configuration and click OK.

Default route

Each protocol and use case combination generates a set of default routes.

Text generation

Protocol: OpenAI compatible (`OpenAI/v1`)

Route name	Path	Method	Description
`create-chat-completion`	`/v1/chat/completions`	POST	Creates a model response for the given chat conversation.
`create-completion`	`/v1/completions`	POST	Creates a completion for the provided prompt and parameters.

Protocol: Anthropic (`Anthropic`)

The Anthropic protocol provides native message formats for Anthropic models such as Claude. Ideal for applications that use the native Anthropic API format.

Note

Providers supporting this protocol include Alibaba Cloud Model Studio (Qwen), Claude, Moonshot AI, and Zhipu AI. Their AI services natively support the Anthropic protocol with no additional configuration.

Route name	Path	Method	Description
`create-message`	`/v1/messages`	POST	Creates a message for the given chat conversation using Anthropic's native message format.

Image generation

Protocol: Alibaba Cloud Model Studio

Route name	Path	Method	Description
`dashscope-text-to-image-synthesis`	`/api/v1/services/aigc/text2image/image-synthesis`	POST	Generates an image using text-to-image synthesis.
`dashscope-image-to-image-synthesis`	`/api/v1/services/aigc/image2image/image-synthesis`	POST	Generates an image using image-to-image synthesis.
`dashscope-image-to-image-outpainting`	`/api/v1/services/aigc/image2image/out-painting`	POST	Performs image-to-image outpainting.
`dashscope-virtual-model-generation`	`/api/v1/services/aigc/virtualmodel/generation`	POST	Generates a virtual model image.
`dashscope-background-generation`	`/api/v1/services/aigc/background-generation/generation`	POST	Generates a background image.
`tasks`	`/api/v1/tasks`	GET/POST/PUT/PATCH/DELETE	Manages asynchronous tasks.

Protocol: OpenAI compatibility

Route name	Path	Method	Description
`openai-image-generation`	`/api/v1/images/generations`	POST	Generates an image.
`openai-image-edit`	`/api/v1/images/edits`	POST	Edits an image.
`openai-image-variation`	`/api/v1/images/variations`	POST	Creates a variation of a given image.

Protocol: ComfyUI

Route name	Path	Method	Description
`comfyui-websocket`	`/ws`	GET	Provides a WebSocket endpoint for real-time communication with the server.
`comfyui-embeddings`	`/embeddings`	GET	Lists available embeddings.
`comfyui-extensions`	`/extensions`	GET	Lists extensions that register a web directory.
`comfyui-features`	`/features`	GET	Gets server features and capabilities.
`comfyui-models`	`/models`	GET	Lists available model types.
`comfyui-models-folder`	`/models/{folder}`	GET	Gets models from a specific folder.
`comfyui-workflow-templates`	`/workflow_templates`	GET	Gets a map of custom node modules and their associated template workflows.
`comfyui-upload-image`	`/upload/image`	POST	Uploads an image.
`comfyui-upload-mask`	`/upload/mask`	POST	Uploads a mask.
`comfyui-view`	`/view`	GET	Views an image with multiple options.
`comfyui-view-metadata`	`/view_metadata/`	GET	Gets metadata for a model.
`comfyui-system-stats`	`/system_stats`	GET	Gets system information, such as Python version, devices, and VRAM.
`comfyui-prompt`	`/prompt`	GET/POST	Gets the current queue status and execution information, or submits a prompt.
`comfyui-object-info`	`/object_info`	GET	Gets details of all node types.
`comfyui-object-info-class`	`/object_info/{node_class}`	GET	Gets details for a specific node type.
`comfyui-history`	`/history`	GET/POST	Gets the queue history.
`comfyui-history-prompt-id`	`/history/{prompt_id}`	GET	Gets the queue history for a specific prompt.
`comfyui-queue`	`/queue`	GET/POST	Gets the queue status or manages queue operations.
`comfyui-interrupt`	`/interrupt`	POST	Stops the current workflow execution.
`comfyui-free`	`/free`	POST	Frees memory by unloading specified models.
`comfyui-userdata`	`/userdata`	GET	Lists user data files in a specified directory.
`comfyui-userdata-v2`	`/v2/userdata`	GET	Lists files and directories in a structured format.
`comfyui-userdata-file`	`/userdata/{file}`	GET/POST/DELETE	Gets, uploads, updates, or deletes a specific user data file.
`comfyui-userdata-file-move`	`/userdata/{file}/move/{dest}`	POST	Moves or renames a user data file.
`comfyui-users`	`/users`	GET/POST	Gets user information or creates a new user.

Video generation

Alibaba Cloud Model Studio protocol

Route name	Path	Method	Description
`dashscope-video-generation-synthesis`	`/api/v1/services/aigc/video-generation/video-synthesis`	POST	Generates a video.
`dashscope-image-to-video-synthesis`	`/api/v1/services/aigc/image2video/video-synthesis`	POST	Generates a video from an image.
`tasks`	`/api/v1/tasks`	GET/POST/PUT/PATCH/DELETE	Manages asynchronous tasks.

Speech synthesis

Alibaba Cloud Model Studio

Route name	Path	Method	Description
`dashscope-text-to-audio-synthesis`	`/api-ws/v1/inference`	GET	Synthesizes audio from text.

OpenAI compatible (`OpenAI/v1`)

Route name	Path	Method	Description
`openai-audio-speech`	`/api/v1/audio/speech`	POST	Synthesizes audio from text.

Embedding

Protocol: OpenAI compatible (`OpenAI/v1`)

Route name	Path	Method	Description
`create-embedding`	`/v1/embeddings`	POST	Generates an embedding vector representing the input text.

Text reranking (rerank)

Protocol: Alibaba Cloud Model Studio text reranking

Route name	Path	Method	Description
`rerank`	`/api/v1/services/rerank/text-rerank/text-rerank`	POST	Reranks the given documents based on query relevance.

Protocol: vLLM

Route name	Path	Method	Description
`rerank`	`/v1/rerank`	POST	Reranks the given documents based on query relevance.

Others

Protocol: OpenAI-compatible (`OpenAI/v1`)

Route name	Path	Method	Description
`models`	`/v1/models`	GET, POST, PUT, PATCH, DELETE	Manage models.
`files`	`/v1/files`	GET, POST, PUT, PATCH, DELETE	Manage files.
`batches`	`/v1/batches`	GET, POST, PUT, PATCH, DELETE	Manage batches.
`fine-tuning`	`/v1/fine_tuning`	GET, POST, PUT, PATCH, DELETE	Manage fine-tuning jobs.

Note

AI services from providers supporting the Anthropic protocol (Alibaba Cloud Model Studio, Claude, Moonshot AI, Zhipu AI) automatically support multiple protocols, including OpenAI-compatible and Anthropic. Select the appropriate protocol when creating a Model API.

Intelligent routing

Different LLMs excel in specific domains:

Code generation: The Qwen-Coder series excels in code understanding and generation.
Mathematical reasoning: The Qwen-Math series excels at solving complex mathematical problems.
Translation: The Qwen-MT series is optimized for multilingual translation.
Rapid response: The Qwen-Flash series offers ultra-low latency for time-sensitive scenarios.
Complex reasoning: Models such as Qwen-Max and DeepSeek-R1 have an edge in complex reasoning.

Manually routing requests to these models presents challenges:

Fragmented user experience: Users must manually select models without guidance on which fits best.
Inefficient resource utilization: High-cost models handle simple tasks that cheaper models could serve.
High development complexity: Application-layer routing logic increases development and maintenance costs.
No unified endpoint: Multiple model deployments lead to scattered APIs that are hard to manage.

The AI Gateway intelligent routing feature uses semantic analysis to automatically route requests to the best-fit model based on these intent classifications:

Intent code	Description	Scenarios
`Coder`	Code writing and debugging	Programming questions, code generation, bug fixes
`Math`	Mathematical computation and reasoning	Mathematical proofs, formula derivation, statistical analysis
`Translation`	Multilingual translation	Document translation, real-time translation, localization
`Flash`	Fast and simple responses	Simple Q&A, information lookups, everyday conversations
`Complex`	Complex reasoning	Deep analysis, complex decision-making, long-context understanding

Edit Model API

Log on to the AI Gateway console and choose Instance. In the top menu bar, select a region, then click the target instance ID.
In the navigation pane, click Model API, then click Edit in the Actions column of the target API. Modify the parameters in the Edit Model API panel. Parameter descriptions: Create a Model API.
Click OK.

Debug Model API

Note

Currently, you can only debug text generation using the /v1/chat/completions endpoint.

Log on to the AI Gateway console and choose Instance. In the top menu bar, select a region, then click the target instance ID.
In the left navigation pane, select Model API and click Debug in the Actions column for the target API.
In the Debug panel, select a domain name and model. If needed, enable the Streaming switch and configure parameters and custom parameters. On the Model Returned tab, enter your content and click Send.

Parameters: system prompt (system instruction, up to 100 characters), max_tokens (0–8192, default: 1024), top_p (0–1, default: 0.95), and temperature (0–2, default: 1; adjust with caution as it significantly impacts results). The cURL command and raw output tabs are available on the right.

Delete a model API

Log on to the AI Gateway console and choose Instance. In the top menu bar, select a region, then click the target instance ID.
In the navigation pane, select Model API. In the row for the target API, click Delete in the Actions column. In the confirmation dialog box that appears, enter the API name and click Delete.

Create Model API

Default route

Text generation

Protocol: OpenAI compatible (OpenAI/v1)

Protocol: Anthropic (Anthropic)

Image generation

Protocol: Alibaba Cloud Model Studio

Protocol: OpenAI compatibility

Protocol: ComfyUI

Video generation

Alibaba Cloud Model Studio protocol

Speech synthesis

Alibaba Cloud Model Studio

OpenAI compatible (OpenAI/v1)

Embedding

Protocol: OpenAI compatible (OpenAI/v1)

Text reranking (rerank)

Protocol: Alibaba Cloud Model Studio text reranking

Protocol: vLLM

Others

Protocol: OpenAI-compatible (OpenAI/v1)

Intelligent routing

Edit Model API

Debug Model API

Delete a model API

Protocol: OpenAI compatible (`OpenAI/v1`)

Protocol: Anthropic (`Anthropic`)

OpenAI compatible (`OpenAI/v1`)

Protocol: OpenAI compatible (`OpenAI/v1`)

Protocol: OpenAI-compatible (`OpenAI/v1`)