This topic describes how to create a Stream Storage for Apache Fluss cluster and provides important notes for the creation process.
Prerequisites
- An Alibaba Cloud account is required. To create one, see Sign up for an Alibaba Cloud account.
- If you use a RAM user or a role, ensure you have the required permissions to purchase and create resources. For more information, see RAM authorization.
Create a Fluss cluster
Stream Storage for Apache Fluss is in free public preview. For more information, see Fluss public preview notice.
- Log on to the Realtime Compute management console.
- Click Free Public Preview next to Stream Storage Fluss.
- At the bottom of the purchase page, first create a service-linked role, and then fill out the configuration information. For more information about roles, see Service-linked role.
- Configure the parameters.
| Parameter | Description | Example |
| --- | --- | --- |
| Region | Select the same region as your compute engine. Instances in different regions cannot communicate over the internal network. | China (Beijing) |
| Deployment mode | single-AZ: Deploys the cluster in a single availability zone. High Availability: Provides cross-availability zone disaster recovery. The system automatically deploys the cluster, lakehouse integration service, and cold storage with multi-AZ redundancy. | single-AZ |
| VPC | Select a VPC in the specified region. You cannot change the VPC after the instance is created. | - |
| VSwitch | Each VSwitch corresponds to one availability zone. Each TabletServer requires one IP address. Plan your network CIDR block based on the scale of your cluster. | - |
| Instance name | The name must start with a lowercase letter and can contain lowercase letters, digits, and hyphens (-). It cannot end with a hyphen. The name can be up to 60 characters long. The name must be unique within the same region and cannot be changed after creation. | fluss-test |
| TabletServer specifications | The specifications for a single TabletServer node. 1 RCU is equivalent to 1 vCPU core and 8 GiB of memory. To select an appropriate configuration, see Cluster capacity planning. | 4 RCU |
| Number of TabletServers | The number of nodes must be between 3 and 300. | 3 |
| Disk capacity | The disk size for a single TabletServer. ESSD is used by default. By default, data is stored in three replicas. Therefore, a 900 GiB disk provides 300 GiB of usable capacity. The minimum size is 500 GiB and the maximum is 2 TiB. You cannot scale in the disk capacity after it is set. You must scale out if local disk usage exceeds 80%. | 900 GiB |
| Cold storage | You are billed based on the amount of data stored and the storage duration. Cold storage serves as a remote extension of the local disk. The system writes to both local storage and cold storage simultaneously to prevent service disruptions caused by a full local disk. | Billed by actual usage |
| Fixed resources | Reserved compute resources for lakehouse integration sync jobs. 1 CU is equivalent to 1 vCPU core and 4 GiB of memory. | - |
| Elastic resources | The upper limit of elastic compute resources for lakehouse integration sync jobs. You are billed based on actual usage. | 100 CU |
Note: The new lakehouse integration service uses a fully-managed mode. The system automatically creates, runs, and maintains sync jobs, so you do not need to manage the underlying job status. For more information, see Fully-managed lakehouse integration service.
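The naming rule and sizing figures in the parameter table can be checked before you open the purchase page. The following Python sketch is illustrative only and is not part of any Fluss or Alibaba Cloud SDK: the function names `is_valid_instance_name` and `plan_cluster` are hypothetical, the cluster-wide usable capacity assumes that the three-replica rule applies uniformly across all TabletServers, and the IP count is a lower bound that covers TabletServers only.

```python
import re

# Naming rule from the parameter table: starts with a lowercase letter,
# contains only lowercase letters, digits, and hyphens, does not end with
# a hyphen, and is at most 60 characters long.
_NAME_RE = re.compile(r"[a-z](?:[a-z0-9-]*[a-z0-9])?")

REPLICAS = 3             # default number of replicas stated in the table
RCU_VCPU = 1             # 1 RCU = 1 vCPU core
RCU_MEMORY_GIB = 8       # 1 RCU = 8 GiB of memory
MIN_DISK_GIB = 500
MAX_DISK_GIB = 2 * 1024  # 2 TiB


def is_valid_instance_name(name: str) -> bool:
    """Return True if the name satisfies the documented format rule.

    Uniqueness within a region cannot be checked locally.
    """
    return len(name) <= 60 and _NAME_RE.fullmatch(name) is not None


def plan_cluster(rcu_per_server: int, servers: int, disk_gib: int) -> dict:
    """Derive totals from per-TabletServer settings (hypothetical helper)."""
    if not 3 <= servers <= 300:
        raise ValueError("number of TabletServers must be between 3 and 300")
    if not MIN_DISK_GIB <= disk_gib <= MAX_DISK_GIB:
        raise ValueError("disk capacity must be between 500 GiB and 2 TiB")
    usable_per_server = disk_gib / REPLICAS
    return {
        "total_vcpu": rcu_per_server * RCU_VCPU * servers,
        "total_memory_gib": rcu_per_server * RCU_MEMORY_GIB * servers,
        "usable_gib_per_server": usable_per_server,
        # Assumption: usable capacity scales linearly with the node count.
        "usable_gib_cluster": usable_per_server * servers,
        # Local usage above this value means the disk must be scaled out.
        "scale_out_threshold_gib": disk_gib * 80 / 100,
        # Lower bound: one IP address per TabletServer in the VSwitch.
        "min_ip_addresses": servers,
    }


if __name__ == "__main__":
    print(is_valid_instance_name("fluss-test"))
    print(plan_cluster(rcu_per_server=4, servers=3, disk_gib=900))
```

With the example values from the table (4 RCU, 3 TabletServers, 900 GiB), the sketch reports 300 GiB of usable capacity per TabletServer, which matches the figure given in the Disk capacity row.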
- After completing payment, go to the management console. The new Fluss cluster appears with a "Creating" status. Cluster creation typically takes 15 to 20 minutes.
- View cluster details: In the Realtime Compute management console, click Details for the target cluster to view its information.
- Manage permissions: To grant a RAM user access to the cluster, you must configure permissions. For more information, see Authorize access to a Fluss cluster.
- Modify cluster specifications: In the Realtime Compute management console, click More for the target cluster to adjust its resources.
Get started with Fluss
- Basic concepts: Understand table classifications and basic characteristics.
- Integrate with the Flink engine: Create and write data to Fluss tables.
- Lakehouse integration: A new data architecture concept that merges data lakes with real-time streams.
- Best practices: Fluss solutions for typical business scenarios.