Model Gateway Management

Model Gateway lets AI application teams create and debug Model APIs with pre-configured plugins for model proxying, observability, consumer authentication, and Content Moderation.

Create Model API

Log on to the AI Gateway console and choose Instance. In the top menu bar, select a region, then click the target instance ID.
In the navigation pane on the left, choose Model API, then click Create Model API.
Select a use case and click Create.

The use case determines the available Protocol options and default routes. Supported use cases:
- Text Generation: Supports OpenAI-compatible and Anthropic protocols.
- Image Generation
- Video Generation
- Speech Synthesis
- Embedding
- Rerank
- Others
Configure basic information.

In the dialog, complete the Select a scenario. step, then configure the Create Model API form:
- Protocol: Each protocol provides a set of built-in default routes for the selected use case, which allows you to quickly generate compatible interfaces for services such as OpenAI, DashScope, and vLLM.
- API Name: A unique name within your account. Up to 64 characters; supports letters, digits, underscores (_), and hyphens (-).
- Domain Name: Select one or more domain names for API access. Each domain name + BasePath combination must be unique.
  
  If you do not have a domain name, click the Add Domain Name button to create one.
- base path: The base request path of the API. Defaults to /. Optionally enable Remove during backend forwarding.
  Note
  If you enable Remove during backend forwarding, the system strips the base path from the request before forwarding to the backend. For example:
  - The base path is set to /api.
  - The original request path is /api/users.
  - The path forwarded to the backend service becomes /users.
- AI Request Monitoring: Enables metrics, logs, and traces. Logging and tracing require SLS. Select Record request content and Log response to record model requests and responses.
  
  Important
  When enabled, all AI request content including the request body is recorded to the access log. Ensure SLS is properly configured and data security safeguards are in place.
- Model Service: Supports Single-model Service, Multi-model Service (by model name), Multi-model Service (by proportion), Multiple Services (by observability metrics), and Multiple Services (intelligent routing).
  - Single-model Service: Select one AI service and set the Model Name. The model name can be passed through or rewritten.
  - Multi-model Service (by model name): Routes requests to different services by matching the model name in the request body with a rule. The rule supports wildcards such as ? and *. For example, qwen-* can match qwen-max and qwen-long.
  - Multi-model Service (by proportion): Select multiple AI services and set their weights. The model name can be passed through or rewritten.
  - Multi-Service (by Metrics): Automatically routes requests to the optimal service based on observability metrics such as response time and success rate, without manual weight configuration.
  - Multi-Model Service (Smart Routing): The system automatically selects the most suitable model based on model characteristics. Intelligent Routing.
    
    Note
    To use the Multiple Services (by observability metrics) and Multiple Services (intelligent routing) features, you must upgrade AI Gateway to version 2.1.15 or later.
- fallback: You can Enable this feature to configure a sequence of fallback policies. The same service can be used in multiple policies.
- first packet timeout: Maximum wait time (ms) for the first data packet in a streaming response. Set to 0 to disable.
- resource: Select the default resource group, an existing one, or create a new one for grouping, authorizing, and monitoring resources.
  
  To create a new Resource Group, click Create Resource Group.
Review your configuration and click OK.

Debug Model API

Note

Currently, you can only debug text generation using the /v1/chat/completions endpoint.

Log on to the AI Gateway console and choose Instance. In the top menu bar, select a region, then click the target instance ID.
In the left navigation pane, select Model API and click Debug in the Actions column for the target API.
In the Debug panel, select a domain name and model. If needed, enable the Streaming switch and configure parameters and custom parameters. On the Model Returned tab, enter your content and click Send.

Parameters: system prompt (system instruction, up to 100 characters), max_tokens (0–8192, default: 1024), top_p (0–1, default: 0.95), and temperature (0–2, default: 1; adjust with caution as it significantly impacts results). The cURL command and raw output tabs are available on the right.