Alibaba Cloud Container Compute Service (ACS) provides elastic, serverless computing resources through a Kubernetes-native interface for running containerized applications. This tutorial walks you through deploying a generative AI chat application in an ACS cluster using the ACS console and a cluster certificate, and monitoring its status.
Prerequisites
An Alibaba Cloud account with completed real-name verification. For details, see Register an Alibaba Cloud account and Complete individual real-name verification.
Background
- This tutorial uses two open-source projects: RWKV-Runner, which serves a 0.1-billion-parameter model through a RESTful inference API, and ChatGPT-Next-Web, which provides the chat web UI. Together, they form a decoupled frontend-backend AI chat application that is deployed to an ACS cluster as container images.
- To learn Kubernetes fundamentals, see the CNCF × Alibaba Cloud-Native Technology Open Course.
Procedure
To use ACS for the first time, activate the service, grant permissions, create a cluster, and deploy the application.
Step 1: Activate and authorize ACS
Before you use ACS for the first time, you must activate the service and grant it the necessary permissions to access other cloud resources.
- Log on to the ACS console and click Activate.
- On the ACS activation page, follow the on-screen instructions to activate the service.
- Return to the ACS console, refresh the page, and click Go to Authorize.
- On the ACS authorization page, follow the on-screen instructions to grant the required permissions.
After you grant the permissions, refresh the console to start using ACS.
Step 2: Create an ACS cluster
This section shows how to create an ACS cluster by configuring only its key parameters.
- Log on to the ACS console. In the left-side navigation pane, click Clusters.
- On the Clusters page, click Create Kubernetes Cluster in the upper-left corner.
- On the Create Kubernetes Cluster page, configure the following parameters. You can use the default values for any parameters not listed here.

| Parameter | Description | Example |
| --- | --- | --- |
| Cluster Name | Enter a name for the cluster. | ACS-Demo |
| Region | Select the region where you want to create the cluster. | China (Beijing) |
| Select VPC | Set the network for the cluster. ACS clusters support only VPCs. You can choose Create VPC or Select Existing VPC. Create VPC: the system automatically creates a VPC and a NAT gateway, and configures SNAT rules. Select Existing VPC: select an existing VPC and vSwitch. If the cluster needs internet access, for example to pull container images, you must configure a NAT gateway. We recommend that you upload container images to Container Registry (ACR) in the same region as your cluster and pull the images over the internal VPC network. For more information, see Create and manage a VPC. | Select Create VPC. |
| API Server Access Settings | Specify whether to expose the cluster's API server to the public internet. To manage the cluster remotely from the internet, you must configure an elastic IP address (EIP). | Select Expose API server with EIP. |
| Service Discovery | Click Show Advanced Options and specify whether to enable service discovery for the cluster. If you need service discovery, select CoreDNS. | Select CoreDNS. |

- Click Confirm, review and accept the terms of service, and then click Create Kubernetes Cluster.
Note: Cluster creation takes about 10 minutes. After the cluster is created, it appears on the Clusters page.
Step 3: Deploy RWKV-Runner with the console
Deploy the RWKV-Runner stateless application (Deployment) on a general-purpose instance in your ACS cluster and expose its RESTful API within the cluster. For more information, see Create a stateless workload Deployment.
- Log on to the ACS console. On the Clusters page, click the name of your target cluster (ACS-Demo).
- In the left-side navigation pane, go to the Deployments page.
- On the Deployments page, click Create from Image.
- In the Basic Information step, set Application Name to rwkv-runner, select General-purpose for Instance Type and default for QoS Type, and then click Next.
- In the Container step, configure the container and click Next.

| Parameter | Description | Example value |
| --- | --- | --- |
| Image Name | Enter the image address without a tag, or click Select Image to select an image. | registry.cn-beijing.aliyuncs.com/acs-demo-ns/rwkv-runner |
| Image Version | Click Select Image Version to select an image version. | 1.0.0 |
| CPU | Number of CPU cores for the application. | 1 Core |
| Memory | Amount of memory for the application. | 2 GiB |
| Port | Container ports. | Name: runner; Container Port: 8000; Protocol: TCP |

- In the Advanced step, click Create to the right of Services.
- In the Create Service dialog box, configure the following parameters and click Create. This exposes the rwkv-runner RESTful API within the cluster.

| Parameter | Description | Example value |
| --- | --- | --- |
| Name | Name of the service. | rwkv-runner-svc |
| Type | Service type. Determines how the service is accessed. | ClusterIP |
| Port Mapping | Set the Service Port and Container Port. The Container Port must match the port exposed by the backend pod. | Name: runner; Service Port: 80; Container Port: 8000; Protocol: TCP |

- In the Advanced step, click Create in the lower-right corner.
After creation, the Complete step shows the application objects. Click View Details to review the application.
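For reference, the console settings in this step correspond roughly to the Deployment and Service below. This is a sketch, not the manifest that the console generates. The image, resources, ports, and Service name come from the tables above; the file name rwkv-runner.yaml, the `app: rwkv-runner` label, and the two `alibabacloud.com/compute-*` pod labels (assumed here to be how ACS expresses Instance Type and QoS Type) are assumptions. The script only writes the file.

```shell
# Write a manifest that mirrors the console settings from Step 3.
# Assumptions: the file name, the app label, and the ACS compute labels.
cat > rwkv-runner.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rwkv-runner
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rwkv-runner
  template:
    metadata:
      labels:
        app: rwkv-runner
        # Assumed ACS labels for Instance Type and QoS Type.
        alibabacloud.com/compute-class: general-purpose
        alibabacloud.com/compute-qos: default
    spec:
      containers:
      - name: rwkv-runner
        image: registry.cn-beijing.aliyuncs.com/acs-demo-ns/rwkv-runner:1.0.0
        resources:
          requests:
            cpu: "1"
            memory: 2Gi
        ports:
        - name: runner
          containerPort: 8000
          protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: rwkv-runner-svc
spec:
  type: ClusterIP
  selector:
    app: rwkv-runner
  ports:
  - name: runner
    port: 80          # port that other pods in the cluster connect to
    targetPort: 8000  # container port of rwkv-runner
    protocol: TCP
EOF
```

Because the console has already created these objects, the file is useful mainly as documentation or for recreating the backend in another cluster with kubectl apply -f rwkv-runner.yaml.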
Step 4: Deploy ChatGPT-Next-Web with a certificate
Use your cluster certificate (kubeconfig) to deploy the ChatGPT-Next-Web stateless application (Deployment) and expose it to the internet. For more information, see Create a stateless workload Deployment.
- Log on to the ACS console. On the Clusters page, click the name of your target cluster (ACS-Demo).
- On the Cluster Information page, click the Connection Information tab. Obtain the public access certificate and follow the on-screen instructions to save it to the correct location.
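Before you deploy anything, it can help to confirm that kubectl can reach the API server with the saved certificate. The sketch below assumes that the certificate was saved to $HOME/.kube/config, which is the default location that kubectl reads; the function name check_cluster_access is illustrative.

```shell
# Assumption: the certificate was saved to kubectl's default kubeconfig path.
KUBECONFIG_FILE="$HOME/.kube/config"

check_cluster_access() {
  # Fail early with a readable message if the certificate file is missing.
  if [ ! -f "$KUBECONFIG_FILE" ]; then
    echo "kubeconfig not found at $KUBECONFIG_FILE" >&2
    return 1
  fi
  # Restrict permissions, because the file contains cluster credentials.
  chmod 600 "$KUBECONFIG_FILE"
  # Query the API server; this fails if the certificate or endpoint is wrong.
  kubectl --kubeconfig "$KUBECONFIG_FILE" cluster-info
}
```

Run check_cluster_access after saving the certificate. If it prints the API server address, the remaining kubectl commands in this tutorial should be able to connect.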
- Create a file named chat-next-web.yaml and add the following content.
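A minimal sketch of what such a manifest might contain is shown below. Only the file name, the Service name chat-frontend-svc, and the backend Service rwkv-runner-svc come from this tutorial. The image (the public yidadaa/chatgpt-next-web image is assumed here), the Deployment name and label, container port 3000, and the BASE_URL and OPENAI_API_KEY environment variables are assumptions; replace them with the values for your environment.

```shell
# Assumption: a publicly pullable ChatGPT-Next-Web image; override CHAT_IMAGE as needed.
CHAT_IMAGE="${CHAT_IMAGE:-yidadaa/chatgpt-next-web:latest}"

# Write the frontend Deployment and an internet-facing Service.
cat > chat-next-web.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: chat-frontend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: chat-frontend
  template:
    metadata:
      labels:
        app: chat-frontend
    spec:
      containers:
      - name: chat-frontend
        image: ${CHAT_IMAGE}
        ports:
        - containerPort: 3000   # assumed listening port of the web UI
        env:
        - name: BASE_URL        # assumed: points the UI at the backend Service
          value: http://rwkv-runner-svc
        - name: OPENAI_API_KEY  # assumed: dummy value for the self-hosted backend
          value: placeholder
---
apiVersion: v1
kind: Service
metadata:
  name: chat-frontend-svc
spec:
  type: LoadBalancer   # requests an external IP for internet access
  selector:
    app: chat-frontend
  ports:
  - port: 80
    targetPort: 3000
EOF
```

The frontend reaches the backend through the in-cluster DNS name rwkv-runner-svc on port 80, which is why Service Discovery (CoreDNS) was enabled when the cluster was created.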
- Run the following command to apply the resources to your ACS cluster.
kubectl apply -f chat-next-web.yaml
Step 5: Create an initialization job
Use your cluster certificate to create a Kubernetes job that initializes the RWKV-Runner model. This job runs on a BestEffort QoS class instance. For more information, see Create a job workload.
- Create a file named rwkv-init-job.yaml and add the following content.
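One way such a job could look is sketched below: a single pod that sends one HTTP request to the backend Service to load the model. Everything except the file name and the Service name rwkv-runner-svc is an assumption, including the job name rwkv-init, the curlimages/curl image, the /switch-model endpoint and its JSON fields, the model path, and the `alibabacloud.com/compute-qos: best-effort` label that is assumed to request the BestEffort QoS class.

```shell
# Assumption: hypothetical model path inside the rwkv-runner container.
MODEL_PATH="${MODEL_PATH:-./models/example-0.1B.pth}"

# Write a Job that asks the backend to load the model, then exits.
cat > rwkv-init-job.yaml <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: rwkv-init
spec:
  backoffLimit: 4
  template:
    metadata:
      labels:
        # Assumed ACS label that requests the BestEffort QoS class.
        alibabacloud.com/compute-qos: best-effort
    spec:
      restartPolicy: OnFailure
      containers:
      - name: init
        image: curlimages/curl:latest
        command: ["curl"]
        args:
        - "--fail"
        - "-H"
        - "Content-Type: application/json"
        - "-d"
        - '{"model": "${MODEL_PATH}", "strategy": "cpu fp32"}'
        - "http://rwkv-runner-svc/switch-model"
EOF
```

The --fail flag makes curl exit with a non-zero status on an HTTP error, so a failed request marks the pod as failed and the job retries it up to backoffLimit times.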
- Run the following command to submit the initialization job.
kubectl apply -f rwkv-init-job.yaml
- Confirm that the initialization job completed.
kubectl get pod
The STATUS column of the job pod shows Completed.
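Instead of polling the pod list by hand, you can have kubectl block until the job finishes. The sketch below assumes that the job is named rwkv-init, which is a hypothetical name; set JOB_NAME to the name used in your rwkv-init-job.yaml.

```shell
# Assumption: the job is named rwkv-init; adjust to match your manifest.
JOB_NAME="${JOB_NAME:-rwkv-init}"

wait_for_init_job() {
  # Block until the Job reports the Complete condition, or give up after 5 minutes.
  kubectl wait --for=condition=complete "job/${JOB_NAME}" --timeout=300s || return 1
  # Print the job pod's output to confirm what the initialization did.
  kubectl logs "job/${JOB_NAME}"
}
```

Call wait_for_init_job after submitting the job. A non-zero return status means the job did not complete within the timeout.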
Step 6: Test the application
Access the deployed application through its service.
- Log on to the ACS console. On the Clusters page, click the name of your target cluster (ACS-Demo).
- In the left-side navigation pane, choose Network > Services.
- On the Services page, find the newly created service (chat-frontend-svc) and click the IP address in the External IP column to access your generative AI chat application.
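The same address can be read from the command line. This sketch assumes that chat-frontend-svc is a LoadBalancer Service that listens on port 80; the function name get_chat_url is illustrative.

```shell
get_chat_url() {
  # Read the external IP that the load balancer assigned to the Service.
  ip="$(kubectl get service chat-frontend-svc \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
  if [ -z "$ip" ]; then
    echo "external IP not assigned yet" >&2
    return 1
  fi
  # Assumption: the Service listens on port 80, so no port suffix is needed.
  echo "http://${ip}"
}
```

For example, curl -I "$(get_chat_url)" requests only the response headers, which is a quick way to check that the frontend answers before you open it in a browser.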
Clean up resources
ACS cluster fees have two components:
- Compute power used by workloads, charged by ACS.
- Other Alibaba Cloud resources, charged by their respective services.
After completing this tutorial:
- If you no longer need the cluster, delete it and its associated resources. For more information, see Delete a cluster.
- To keep using the cluster, maintain an account balance of at least CNY 100.00. For more information, see Cloud resource billing.
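If you keep the cluster but no longer need the chat application, you can remove the tutorial workloads so that they stop consuming compute power. The sketch below assumes that both manifest files are in the current directory and that the console-created objects use the names from Step 3 (rwkv-runner and rwkv-runner-svc); the function name is illustrative.

```shell
cleanup_tutorial_workloads() {
  # Delete the objects created from the two manifests; skip any already gone.
  kubectl delete -f chat-next-web.yaml -f rwkv-init-job.yaml --ignore-not-found
  # Delete the backend Deployment and Service that were created in the console.
  kubectl delete deployment/rwkv-runner service/rwkv-runner-svc --ignore-not-found
}
```

Deleting chat-frontend-svc should also release the load balancer behind its external IP, but check the billing pages of the related services to confirm that nothing is left running.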