Create a Stream Storage cluster for Apache Fluss

Prerequisites

An Alibaba Cloud account is required. To create one, see Sign up for an Alibaba Cloud account.
If you use a RAM user or a RAM role, ensure that it has the required permissions to purchase and create resources. For more information, see RAM authorization.

Create a Fluss cluster

Note

Stream Storage for Apache Fluss is in free public preview. For more information, see Fluss public preview notice.

Log on to the Realtime Compute management console.
Click Free Public Preview next to Stream Storage Fluss.
At the bottom of the purchase page, first create a service-linked role, and then fill out the configuration information. For more information about roles, see Service-linked role.

Configure the parameters.

Parameter	Description	Example
Region	Select the same region as your compute engine. Instances in different regions cannot communicate over the internal network.	China (Beijing)
Deployment mode	single-AZ: deploys the cluster in a single availability zone. High Availability: provides disaster recovery across availability zones; the system automatically deploys the cluster, the lakehouse integration service, and cold storage with multi-AZ redundancy.	single-AZ
VPC	Select a VPC in the specified region. You cannot change the VPC after the instance is created.	-
vSwitch	Each vSwitch corresponds to one availability zone. Each TabletServer requires one IP address, so plan the CIDR block to leave enough available IP addresses for the size of your cluster.	-
Instance name	The name must start with a lowercase letter and can contain lowercase letters, digits, and hyphens (-). It cannot end with a hyphen. The name can be up to 60 characters long. The name must be unique within the same region and cannot be changed after creation.	fluss-test
TabletServer specifications	The specifications for a single TabletServer node. 1 RCU is equivalent to 1 vCPU core and 8 GiB of memory. For more information, see Cluster capacity planning to select an appropriate configuration.	4 RCU
Number of TabletServers	The number of nodes must be between 3 and 300.	3
Disk capacity	The disk size for a single TabletServer. ESSDs are used by default. Data is stored in three replicas by default, so a 900 GiB disk provides 300 GiB of usable capacity. The minimum size is 500 GiB and the maximum is 2 TiB. You cannot reduce the disk capacity after it is set. Scale out when local disk usage exceeds 80%.	900 GiB
Cold storage	You are billed based on the amount of data stored and the storage duration. Cold storage serves as a remote extension of the local disk. The system writes to both local storage and cold storage simultaneously to prevent service disruptions caused by a full local disk.	Billed by actual usage
Fixed resources	Reserved compute resources for lakehouse integration sync jobs. 1 CU is equivalent to 1 vCPU core and 4 GiB of memory.	-
Elastic resources	The upper limit of elastic compute resources for lakehouse integration sync jobs. You are billed based on actual usage.	100 CU
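
The limits in the table above can be checked before you open the purchase page. The following Python sketch is illustrative only: the function and constant names are hypothetical and are not part of any Alibaba Cloud SDK or API. It encodes only the rules stated in the table (instance name format, TabletServer count, disk size range, three-replica storage, and the RCU definition) and assumes the default replica count of three.

```python
import re

# Limits transcribed from the parameter table. Every name below is
# hypothetical and exists only for this illustration.
NAME_RE = re.compile(r"[a-z](?:[a-z0-9-]*[a-z0-9])?")  # starts with a-z, no trailing hyphen
MAX_NAME_LEN = 60
TABLET_SERVER_RANGE = (3, 300)
DISK_GIB_RANGE = (500, 2048)   # 500 GiB to 2 TiB (2 TiB = 2048 GiB)
DEFAULT_REPLICAS = 3           # assumption: the default replica count is unchanged
MEMORY_GIB_PER_RCU = 8         # 1 RCU = 1 vCPU core and 8 GiB of memory


def check_plan(name, tablet_servers, disk_gib):
    """Return the names of the parameters that violate the table's limits."""
    problems = []
    if len(name) > MAX_NAME_LEN or NAME_RE.fullmatch(name) is None:
        problems.append("instance name")
    low, high = TABLET_SERVER_RANGE
    if not low <= tablet_servers <= high:
        problems.append("number of TabletServers")
    low, high = DISK_GIB_RANGE
    if not low <= disk_gib <= high:
        problems.append("disk capacity")
    return problems


def usable_disk_gib(disk_gib, replicas=DEFAULT_REPLICAS):
    """Usable share of one disk when every byte is stored `replicas` times."""
    return disk_gib / replicas


def node_resources(rcu):
    """Return (vCPU cores, memory in GiB) for one TabletServer of `rcu` RCUs."""
    return rcu, rcu * MEMORY_GIB_PER_RCU


print(check_plan("fluss-test", 3, 900))  # []
print(usable_disk_gib(900))              # 300.0
print(node_resources(4))                 # (4, 32)
```

With the example values from the table, the plan passes every check, a 900 GiB disk yields 300 GiB of usable capacity, and a 4 RCU TabletServer corresponds to 4 vCPU cores and 32 GiB of memory. The console remains the authority on what is accepted; this sketch does not check name uniqueness within a region or available IP addresses in the vSwitch.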

Note

The new lakehouse integration service uses a fully-managed mode. The system automatically creates, runs, and maintains sync jobs, eliminating the need to manage the underlying job status. For more information, see Fully-managed lakehouse integration service.

After completing payment, go to the management console. The new Fluss cluster appears with a "Creating" status. Cluster creation typically takes 15 to 20 minutes.
- View cluster details: In the Realtime Compute management console, click Details for the target cluster to view its information.
- Manage permissions: To grant a RAM user access to the cluster, you must configure permissions. For more information, see Authorize access to a Fluss cluster.
- Modify cluster specifications: In the Realtime Compute management console, click More for the target cluster to adjust its resources.

Get started with Fluss

Basic concepts

Understand table classifications and basic characteristics.

Integrate with the Flink engine

Create and write data to Fluss tables.

Lakehouse integration

A new data architecture concept that merges data lakes with real-time streams.

Best practices

Fluss solutions for typical business scenarios.