Create an application


You can create multiple applications in the console. Each application lets you combine models and features for different development scenarios.

Create an application

  1. On the Alibaba Cloud Model Studio website, go to Application Market > App Practice, and select the Multimodal Interaction Development Suite.

    Click the App Template tab.

  2. In the upper-right corner, click Free Trial. On the activation page, agree to the service agreement, and then click Buy Now to complete activation.

    The activation page shows the pay-as-you-go information for the Multimodal Interaction Development Suite, covering six API capabilities: chat and plugins, knowledge base retrieval, speech translation, visual understanding, podcast, and other Model Studio applications. The billing mode is post-paid after activation.

  3. After the service is activated, click Management Console to return to the Alibaba Cloud Model Studio website.

    Note

    After returning to the Alibaba Cloud Model Studio website, click Application Market to view the application list.

  4. Click Application Market, find the Multimodal Interaction Development Suite application, and click Create Application. You can create different types of applications based on your business requirements.

    The application overview page offers two application types: Multimodal interaction application (real-time "video + voice" conversations, ideal for hardware and software with a camera and microphone, such as AI glasses) and Voice interaction application (voice-only conversations, ideal for hardware and software with a microphone only, such as AI headphones). After selecting the target type, click the Create Application button on the corresponding card.

  • Multimodal interaction application: Supports real-time "video + voice" conversations. This type is ideal for software and hardware equipped with a camera and microphone, such as AI-powered smart glasses and AI video chat apps.

    • All-in-One Edition: Suitable for a wide range of smart hardware. It provides both voice and real-time video conversation features, allowing seamless switching between the two modes. It also supports plugins and Agents.

    • Vision Edition: Designed for real-time video interaction products with an always-on camera. This edition provides video-only calls (voice conversation is not available) and defaults to video mode. You can still use commands, plugins, and Agents. (Note: The knowledge base feature is not currently supported.)

      • When you create an application, select Multimodal Application-Vision Edition. On the configuration page, you can select a visual understanding model.

      • Under Recommended Models, you can choose from the Balanced and Advanced editions of the visual understanding model. Under More Models, you can select other multimodal models such as Qwen3.6-plus, Qwen3.5-plus, and Qwen3.5-flash.

      • On the Multimodal Interaction Development Suite page, click Start from a scenario template, select a device template (such as AR glasses), and in the version selection popup, select Vision Edition to create the application.

  • Voice interaction application: Supports real-time, voice-only conversations. This type is ideal for software and hardware equipped with a microphone, such as AI-powered headphones and children's toys. You can choose either the All-in-One Edition or the Lightweight Edition. (This application does not support real-time video conversations.)

    • All-in-One Edition: Supports advanced features like intent recognition, tool calling, Internet search, and multi-scenario Agents. It is suitable for a wide variety of interaction scenarios.

    • Lightweight Edition: Offers a faster, more cost-effective solution for casual voice chat. It does not support advanced capabilities like intent recognition, tool calling, or Agents.
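The application types and editions above can be summarized as a small capability table, which may help when deciding which edition fits a device. The following is a minimal sketch, not part of any Model Studio SDK: the keys, capability names, and the `editions_supporting` helper are hypothetical, and the table records only the capabilities that this page states explicitly.

```python
# Hypothetical capability table built only from the descriptions on this page.
# The identifiers below are illustrative and are not Model Studio API values.
EDITION_CAPABILITIES = {
    ("multimodal", "all-in-one"): {"voice", "video", "plugins", "agents"},
    ("multimodal", "vision"): {"video", "commands", "plugins", "agents"},
    ("voice", "all-in-one"): {
        "voice", "intent_recognition", "tool_calling",
        "internet_search", "agents",
    },
    ("voice", "lightweight"): {"voice"},
}


def editions_supporting(required):
    """Return the (application type, edition) pairs that list every required capability."""
    needed = set(required)
    return sorted(
        key for key, caps in EDITION_CAPABILITIES.items() if needed <= caps
    )


if __name__ == "__main__":
    # A device with a camera and microphone that needs both conversation modes.
    print(editions_supporting({"voice", "video"}))
    # A voice-only toy that needs tool calling.
    print(editions_supporting({"tool_calling"}))
```

A capability that is missing from a set means only that this page does not mention it for that edition. Confirm the actual feature list on the configuration page before you choose.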

Manage applications

On the My Applications page, you can view all of your applications.

Click API Access or Download SDK to view the corresponding developer documentation.

You can Copy or Delete an application. A deleted application cannot be recovered.

Important

When you delete an application, its associated online services become unavailable. Proceed with caution.

Dashboard

The dashboard displays metrics for a single application, including the number of conversations, average conversation latency, and license activation and consumption data. You can filter this data by a specific time range.

Note

The latency for a single conversational turn is measured from the end of speech recognition to the first character of the speech synthesis output. If no speech model is integrated, latency is measured from the text input to the first character of the text output.
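To reproduce this measurement on the client side, for example to compare your own logs with the dashboard, the definition above can be sketched as follows. This is an illustration under assumptions: the `TurnTimestamps` fields and both functions are hypothetical names, timestamps are assumed to be in seconds, and the dashboard's own calculation may differ in detail.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class TurnTimestamps:
    """Hypothetical timestamps (in seconds) recorded for one conversational turn."""
    asr_end: Optional[float] = None          # end of speech recognition
    first_tts_char: Optional[float] = None   # first character of speech synthesis output
    text_input: Optional[float] = None       # text input received (no speech model)
    first_text_char: Optional[float] = None  # first character of text output


def turn_latency(t: TurnTimestamps) -> float:
    """Latency of one turn, following the definition in the note above."""
    if t.asr_end is not None and t.first_tts_char is not None:
        # Speech model integrated: end of recognition -> first synthesized character.
        return t.first_tts_char - t.asr_end
    if t.text_input is not None and t.first_text_char is not None:
        # No speech model: text input -> first character of text output.
        return t.first_text_char - t.text_input
    raise ValueError("turn is missing the timestamps needed to compute latency")


def average_latency(turns: Iterable[TurnTimestamps]) -> float:
    """Mean latency across turns; raises ValueError if there are none."""
    values = [turn_latency(t) for t in turns]
    if not values:
        raise ValueError("no turns to average")
    return sum(values) / len(values)
```

For example, a voice turn with `asr_end=10.0` and `first_tts_char=10.5` has a latency of 0.5 seconds, and a text-only turn with `text_input=2.0` and `first_text_char=2.25` has a latency of 0.25 seconds.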

The top of the dashboard page shows the application data overview, including the total number of applications, the number of published applications, the number of draft applications, and the number of published applications being edited. You can filter applications by type and application ID. The lower part of the page displays a line chart of the conversation count trend over the selected time range.

Alternatively, in the application list, click View Data to open the dashboard.