Create a Stream Storage for Apache Fluss cluster

This topic describes how to create a Stream Storage for Apache Fluss cluster and provides important notes for the creation process.

Prerequisites

  • An Alibaba Cloud account is required. To create one, see Sign up for an Alibaba Cloud account.

  • If you use a RAM user or a RAM role, make sure that it has the permissions required to purchase and create resources. For more information, see RAM authorization.

Create a Fluss cluster

Note

Stream Storage for Apache Fluss is in free public preview. For more information, see Fluss public preview notice.

  1. Log on to the Realtime Compute management console.

  2. Click Free Public Preview next to Stream Storage Fluss.

  3. At the bottom of the purchase page, first create a service-linked role, and then fill out the configuration information. For more information about roles, see Service-linked role.

  4. Configure the parameters.

    | Parameter | Description | Example |
    | --- | --- | --- |
    | Region | Select the same region as your compute engine. Instances in different regions cannot communicate over the internal network. | China (Beijing) |
    | Deployment mode | single-AZ: Deploys the cluster in a single availability zone. High Availability: Provides cross-availability zone disaster recovery. The system automatically deploys the cluster, lakehouse integration service, and cold storage with multi-AZ redundancy. | single-AZ |
    | VPC | Select a VPC in the specified region. You cannot change the VPC after the instance is created. | - |
    | VSwitch | Each VSwitch corresponds to one availability zone. Each TabletServer requires one IP address. Plan your network CIDR block based on the scale of your cluster. | - |
    | Instance name | The name must start with a lowercase letter and can contain lowercase letters, digits, and hyphens (-). It cannot end with a hyphen. The name can be up to 60 characters long. The name must be unique within the same region and cannot be changed after creation. | fluss-test |
    | TabletServer specifications | The specifications for a single TabletServer node. 1 RCU is equivalent to 1 vCPU core and 8 GiB of memory. For more information, see Cluster capacity planning to select an appropriate configuration. | 4 RCU |
    | Number of TabletServers | The number of nodes must be between 3 and 300. | 3 |
    | Disk capacity | The disk size for a single TabletServer. ESSD is used by default. By default, data is stored in three replicas. Therefore, a 900 GiB disk provides 300 GiB of usable capacity. The minimum size is 500 GiB and the maximum is 2 TiB. You cannot scale in the disk capacity after it is set. You must scale out if local disk usage exceeds 80%. | 900 GiB |
    | Cold storage | You are billed based on the amount of data stored and the storage duration. Cold storage serves as a remote extension of the local disk. The system writes to both local storage and cold storage simultaneously to prevent service disruptions caused by a full local disk. | Billed by actual usage |
    | Fixed resources | Reserved compute resources for lakehouse integration sync jobs. 1 CU is equivalent to 1 vCPU core and 4 GiB of memory. | - |
    | Elastic resources | The upper limit of elastic compute resources for lakehouse integration sync jobs. You are billed based on actual usage. | 100 CU |

    Note

    The new lakehouse integration service uses a fully-managed mode. The system automatically creates, runs, and maintains sync jobs, eliminating the need to manage the underlying job status. For more information, see Fully-managed lakehouse integration service.

  5. After completing payment, go to the management console. The new Fluss cluster appears with a "Creating" status. Cluster creation typically takes 15 to 20 minutes.

    • View cluster details: In the Realtime Compute management console, click Details for the target cluster to view its information.

    • Manage permissions: To grant a RAM user access to the cluster, you must configure permissions. For more information, see Authorize access to a Fluss cluster.

    • Modify cluster specifications: In the Realtime Compute management console, click More for the target cluster to adjust its resources.
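
The limits listed for the parameters in step 4 can be checked offline before you open the purchase page. The following Python sketch is illustrative only: `ClusterPlan` and `needs_scale_out` are hypothetical helpers, not part of any Alibaba Cloud SDK or API. It assumes that 2 TiB means 2048 GiB, that a one-character name is acceptable (no minimum length is stated), and that the three-replica overhead applies uniformly across the cluster. Uniqueness of the name within a region cannot be checked locally.

```python
import re
from dataclasses import dataclass

# Limits taken from the parameter descriptions in step 4.
NAME_PATTERN = re.compile(r"[a-z](?:[a-z0-9-]*[a-z0-9])?")
MAX_NAME_LENGTH = 60
MIN_TABLET_SERVERS, MAX_TABLET_SERVERS = 3, 300
MIN_DISK_GIB, MAX_DISK_GIB = 500, 2048  # assumption: 2 TiB = 2048 GiB
REPLICAS = 3                             # default replica count
VCPU_PER_RCU, MEM_GIB_PER_RCU = 1, 8     # 1 RCU = 1 vCPU core + 8 GiB memory
SCALE_OUT_THRESHOLD = 0.80               # scale out above 80% local disk usage


@dataclass
class ClusterPlan:
    """Hypothetical planning helper; not part of any Alibaba Cloud SDK."""
    name: str
    tablet_servers: int
    rcu_per_server: int
    disk_gib_per_server: int

    def validate(self) -> list:
        """Return a list of violated limits (empty if the plan looks valid)."""
        errors = []
        if len(self.name) > MAX_NAME_LENGTH or not NAME_PATTERN.fullmatch(self.name):
            errors.append("invalid instance name")
        if not MIN_TABLET_SERVERS <= self.tablet_servers <= MAX_TABLET_SERVERS:
            errors.append("number of TabletServers must be between 3 and 300")
        if not MIN_DISK_GIB <= self.disk_gib_per_server <= MAX_DISK_GIB:
            errors.append("disk capacity must be between 500 GiB and 2 TiB")
        return errors

    def summary(self) -> dict:
        """Derive totals from the per-node settings."""
        total_rcu = self.tablet_servers * self.rcu_per_server
        raw_disk = self.tablet_servers * self.disk_gib_per_server
        return {
            "total_vcpu": total_rcu * VCPU_PER_RCU,
            "total_memory_gib": total_rcu * MEM_GIB_PER_RCU,
            # Each TabletServer needs one IP address in the VSwitch.
            "min_ip_addresses": self.tablet_servers,
            "usable_disk_gib_per_server": self.disk_gib_per_server / REPLICAS,
            "usable_disk_gib_cluster": raw_disk / REPLICAS,
        }


def needs_scale_out(used_gib: float, disk_gib: float) -> bool:
    """True when local disk usage exceeds the 80% threshold."""
    return used_gib / disk_gib > SCALE_OUT_THRESHOLD


# Example values from the "Example" column of the parameter table.
plan = ClusterPlan(name="fluss-test", tablet_servers=3,
                   rcu_per_server=4, disk_gib_per_server=900)
print(plan.validate())
print(plan.summary())
```

With the example values from the table, the plan passes all checks, and each 900 GiB disk yields 300 GiB of usable capacity, which matches the figure given for the Disk capacity parameter. The VSwitch CIDR block needs room for at least one address per TabletServer, plus headroom for any later scale-out.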

Get started with Fluss

  • Basic concepts: Understand table classifications and basic characteristics.

  • Integrate with the Flink engine: Create and write data to Fluss tables.

  • Lakehouse integration: A new data architecture concept that merges data lakes with real-time streams.

  • Best practices: Fluss solutions for typical business scenarios.