Alibaba Cloud Model Studio is a one-stop LLM development and application platform. It integrates Qwen and mainstream third-party models, provides OpenAI-compatible APIs, and delivers full-lifecycle model services — from API calls to visual application building.
Key capabilities:
Generate content and summaries with a few lines of code. Model Studio is OpenAI-compatible. Update the API key, base URL, and model name to migrate existing OpenAI code. PythonNode.jscurlThe base URL varies by region. The following example uses the base URL for the China (Beijing) region.
| ||
Build an AI assistant for customer inquiries using visual tools.
| Visual orchestration lets non-technical staff design workflows without writing code.
| Customize models using visual fine-tuning without writing code.
|
Model service
Models
Model Studio provides ready-to-use model services, including the proprietary Qwen series and third-party models such as DeepSeek, Kimi, and GLM. See Recommended models.
Qwen flagship models:
Qwen-Max: The highest-performing model in the Qwen series, suited for complex, multi-step tasks.
The latest qwen3.7-max delivers significant reasoning improvements over its predecessor. Recommended.
Qwen-Plus: Balances performance, speed, and cost — recommended for most scenarios.
Qwen-Flash: Low-cost and low-latency — suited for simple tasks that require fast responses.
Multimodal coverage: Includes text generation, visual understanding, image generation, video generation, speech recognition and synthesis, and embedding.
Domain-specific models: Models for long-text processing, translation, data mining, legal, intent recognition, role-playing, and in-depth research.
Model fine-tuning, deployment, and evaluation
Model fine-tuning: Supports supervised fine-tuning (SFT), continued pre-training (CPT), and direct preference optimization (DPO).
Model deployment: Deploy pre-built or custom models as dedicated inference services for high-concurrency, low-latency scenarios. Billing options include duration-based, monthly subscription, and token volume-based plans.
Model evaluation: Compare models, verify fine-tuning results, and identify threats with manual, automatic, and baseline evaluations.
Application building
Application types: Both visual and high-code development modes are available. Use the visual mode to create agent applications and workflow applications. A high-code application deploys a Python project as a backend service with automated O&M, observability, and Simple Log Service integration.
Feature extension: Connect to private data and domain knowledge using a knowledge base (RAG). Call external services via plugins and Model Context Protocol (MCP).
Sharing and publishing: Share or publish applications on web apps, DingTalk robots, WeChat Official Accounts, and audio/video interactive agents. See Application sharing.
Billing
Activating Model Studio is free. Costs apply only when you invoke , fine-tune, or deploy models. See Billable items.
Free quota for new users
New users receive a free quota in the China (Beijing) region to try model invocation.
Unverified users cannot continue using the service after the free quota is depleted. They must complete identity verification and top up their account to switch to pay-as-you-go billing.
Verified users are automatically switched to pay-as-you-go billing after the free quota is depleted. To avoid unexpected charges, enable the Free quota only feature — the service stops when the quota is depleted.
For more information, see Free quota for new users.
Payment methods
Model calls are billed per minute. Make sure your account balance is sufficient — add funds on the Expenses and Costs page.
View bills and usage
Billing details: Go to the Billing Details and Cost Analysis pages.
Call statistics: About one hour after making a model call, go to the Model Studio console, select your region from the top-right corner, go to the Model Monitoring page, set your query conditions, click Monitor in the Actions column for the target model, and view call volume, token consumption, success rate, and other statistics. See Model monitoring.
Coding Plan usage: If you are subscribed to Coding Plan, view quota consumption on the Coding Plan page. Coding Plan uses a fixed monthly fee with a monthly request quota for AI coding tools. See Coding Plan overview.
Getting started
Try models online:
Open the Model Studio console and select your region from the top-right corner.
Go to the Playground and select a model.
Make your first API call: Make the first call to a Qwen API
Build your first LLM application: Build a Q&A application with zero code


