FAQ

This topic answers frequently asked questions about EMR Serverless StarRocks.

How do I access OSS across accounts?

By default, EMR Serverless StarRocks provides password-free access to OSS buckets in the same account. To access OSS resources in another account, you must disable this default, configure the target account's AccessKey pair, and apply the new configuration.

  1. Disable password-free access: On the Parameter Configuration tab, clear the values of the following configuration items in the specified files.

    • core-site.xml

      fs.oss.credentials.provider =
    • jindosdk.cfg

      fs.oss.provider.format =
      fs.oss.provider.endpoint =
  2. Add the AccessKey pair for the target account: On the Parameter Configuration tab, click Add Configuration Item and add the following configurations to the specified files.

    • core-site.xml

      fs.oss.accessKeyId = AccessKey ID of the target account
      fs.oss.accessKeySecret = AccessKey Secret of the target account
    • jindosdk.cfg

      fs.oss.accessKeyId = AccessKey ID of the target account
      fs.oss.accessKeySecret = AccessKey Secret of the target account
  3. Apply the configuration: On the Parameter Configuration tab, click Submit Parameters.

Use UDF and JDBC connector drivers

Before you use UDF and JDBC drivers, you must obtain the required JAR files from an external source.

  1. Upload the JAR files to OSS. For more information, see Upload files.

    When you upload the files, set the object ACL to Public Read/Write to grant the JAR files public read and write permissions.

  2. Obtain the URL for each JAR file.

    In the OSS console, find the link for each successfully uploaded JAR file. Use the HTTP URL of the internal endpoint, which must be in one of the following formats:

    • For a JDBC driver: http://<YourBucketName>.oss-cn-xxxx-internal.aliyuncs.com/mysql-connector-java-*.jar.

    • For a UDF: http://<YourBucketName>.oss-cn-xxxx-internal.aliyuncs.com/<YourPath>/<jar_package_name>.

  3. Use the JAR files. For more information, see Java UDF and JDBC Catalog.
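
The URL formats in step 2 can be assembled programmatically. The following is a minimal sketch, assuming the internal endpoint follows the `<bucket>.oss-<region>-internal.aliyuncs.com` pattern shown above. The function name, bucket, region, and object names are hypothetical and used for illustration only.

```python
from urllib.parse import quote


def internal_jar_url(bucket: str, region: str, object_key: str) -> str:
    """Build the internal-endpoint HTTP URL of an object stored in OSS.

    region is the region ID without the "oss-" prefix, for example "cn-hangzhou".
    """
    # Drop a leading slash and percent-encode unsafe characters; "/" is kept as-is.
    key = quote(object_key.lstrip("/"))
    return f"http://{bucket}.oss-{region}-internal.aliyuncs.com/{key}"


# Hypothetical names, for illustration only.
print(internal_jar_url("my-bucket", "cn-hangzhou", "mysql-connector-java-8.0.28.jar"))
print(internal_jar_url("my-bucket", "cn-hangzhou", "udf/my_udf.jar"))
```

Copying the link from the OSS console remains the authoritative source; the sketch only helps when many JAR files share the same bucket and region.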

How do I reset the instance password?

Important
  • Resetting the instance password interrupts client-server connections. To minimize production impact, perform this operation during off-peak hours.

  • Only users with the AliyunEMRStarRocksFullAccess permission can reset the password.

  1. Go to the instance details page.

    1. Log in to the E-MapReduce console.

    2. In the left-side navigation pane, choose EMR Serverless > StarRocks.

    3. Click the name of the target instance.

  2. On the Instance Details page, in the Basic Information section, click Reset Password.

  3. In the dialog box that appears, enter and confirm the new password, and then click OK.

Error writing data to Paimon tables

  • Symptom: When you use StarRocks to write data to a Paimon table, you may receive the following error message:

    (5025, 'Backend node not found. Check if any backend node is down.')
  • Cause: A permission check in Paimon tables can prevent StarRocks from correctly identifying BE nodes during write operations.

  • Solution:

    • Upgrade the version (Recommended): If your instance version is earlier than one of the following, perform a minor version update to apply the fix.

      • StarRocks 3.2: 3.2.11-1.89 or later

      • StarRocks 3.3: 3.3.8-1.88 or later

    • Workaround: On the Parameter Configuration tab of the StarRocks instance, add the following configuration item to the core-site.xml file.

      dlf.permission.clientCheck=false
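
To decide whether an instance needs the minor version update, its version can be compared with the fixed versions listed above. The sketch below assumes that every dot- or hyphen-separated component of the version string compares numerically, which this topic does not state explicitly. The function names are hypothetical.

```python
import re

# Minimum fixed versions quoted above, keyed by release line.
MIN_FIXED = {"3.2": "3.2.11-1.89", "3.3": "3.3.8-1.88"}


def parse_version(version: str) -> tuple[int, ...]:
    """Split a version such as "3.2.11-1.89" into integers for comparison."""
    return tuple(int(part) for part in re.split(r"[.-]", version))


def needs_minor_update(instance_version: str) -> bool:
    """Return True if the instance is older than the fixed version of its release line."""
    line = ".".join(instance_version.split(".")[:2])
    minimum = MIN_FIXED.get(line)
    if minimum is None:
        raise ValueError(f"no fixed version listed for release line {line}")
    return parse_version(instance_version) < parse_version(minimum)


print(needs_minor_update("3.2.9-1.80"))   # True: 3.2.9 is older than 3.2.11
print(needs_minor_update("3.3.8-1.88"))   # False: already at the fixed version
```

Comparing integers rather than strings matters here: as plain text, "3.2.9" would sort after "3.2.11".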

"current user is not a RAM user" error when creating a foreign table

  • Symptom: When creating a foreign table in StarRocks, you may receive the following error message:

    current user is not a RAM user
  • Cause: This error is caused by insufficient permissions or an outdated instance version.

  • Solution:

    1. Check the RAM user permissions: Ensure that the Resource Access Management (RAM) user has the required permissions for StarRocks. For more information, see Grant permissions to a RAM user.

    2. If the permissions are correct, check and upgrade the kernel version on the StarRocks Instance Details page.

      If your instance version is earlier than one of the following, perform a minor version update to apply the fix.

      • StarRocks 3.2: 3.2.11-1.89 or later

      • StarRocks 3.3: 3.3.8-1.88 or later

Error with semicolons in the SQL Editor

  • Symptom: When you run an SQL statement that contains a semicolon (;), for example inside a string value, in the SQL Editor, the statement fails with error code 1064. The error details include Unexpected input '<EOF>' and the most similar input is {a legal identifier}, and report the position of the syntax error (for example, line 3, column 11).

  • Cause: The SQL Editor uses the semicolon (;) as the statement terminator by default. A semicolon inside the statement splits it at that point, and the resulting fragment fails to parse.

  • Solution:

    1. Set a custom delimiter.

      Before you run an SQL statement that contains a semicolon, set a custom delimiter to prevent syntax parsing errors. For example, you can change the delimiter to $$.

      delimiter $$
    2. Run the SQL statement that contains a semicolon. An example is shown below:

      INSERT INTO sr_test VALUES 
      (1, 'asdsd,asdsads'), 
      (2, 'sadsad;asdsads');
    3. Restore the default delimiter.

      After the SQL statement is executed, restore the default delimiter (;) so that subsequent SQL operations can run as expected.

      delimiter ;
    4. Verify the result.

      Run a query to verify that the data was inserted correctly.

      delimiter ;
      SELECT * FROM sr_test;

      Output:

         test_id    test_desc
      0        1    asdsd,asdsads
      1        2    sadsad;asdsads
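
The cause described above can be illustrated with a naive splitter. This is a sketch of the general failure mode only: it assumes a splitter that cuts on every delimiter without looking at quotes, and it is not the SQL Editor's actual parser. The function name is hypothetical.

```python
def naive_split(script: str, delimiter: str = ";") -> list[str]:
    """Split a script on every occurrence of the delimiter, ignoring quotes."""
    return [part.strip() for part in script.split(delimiter) if part.strip()]


statement = "INSERT INTO sr_test VALUES (1, 'asdsd,asdsads'), (2, 'sadsad;asdsads')"

# Default delimiter: the semicolon inside the string value cuts the statement in two,
# and neither fragment is valid SQL on its own.
for fragment in naive_split(statement + ";"):
    print(repr(fragment))

# Custom delimiter: the embedded semicolon is no longer special,
# so the statement stays in one piece.
print(naive_split(statement + "$$", delimiter="$$") == [statement])
```

With the default delimiter, the first fragment ends in the middle of the string value, which is consistent with the Unexpected input '<EOF>' detail in the error.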

Failure to import data or access foreign tables

  • Symptom: When you use EMR Serverless StarRocks to import data or access a foreign table, the import or connection may fail if the destination is a public IP address.

  • Cause: An EMR Serverless StarRocks instance runs in a Virtual Private Cloud (VPC) environment by default, which may not have direct access to the internet. Therefore, requests to public resources, such as for data imports or foreign table queries, fail unless internet access is configured.

  • Solution: You can deploy an Internet NAT gateway in the VPC and enable the SNAT feature. This allows the EMR Serverless StarRocks instance to access public resources through the gateway. For more information, see Use the SNAT feature of an Internet NAT gateway to access the Internet.

Prevent connection closure by SLB/CLB idle timeout

  • Symptom: When using SLB with a StarRocks instance, the SLB forcibly closes the client connection if an SQL query runs for more than 900 seconds, preventing the query from returning a result. For more information about enabling SLB, see Manage gateways.

  • Cause: SLB closes any TCP connection that is idle for more than 900 seconds. This can happen during a long-running SQL query, interrupting the connection before StarRocks returns a result.

  • Solution: Configure client-side TCP Keepalive parameters to prevent the SLB from closing idle connections.

    • Global kernel parameter settings (system-level)

      Modify the operating system's kernel parameters to enable and configure appropriate TCP Keepalive settings for all TCP connections. This helps monitor the status of network connections. The following table describes the parameters to be configured.

      | Parameter | Description | Recommended value |
      | --- | --- | --- |
      | Linux: net.ipv4.tcp_keepalive_time; FreeBSD/macOS: net.inet.tcp.keepidle | The period of inactivity in seconds after which the first Keepalive probe is sent. | 600 seconds |
      | Linux: net.ipv4.tcp_keepalive_intvl; FreeBSD/macOS: net.inet.tcp.keepintvl | The interval in seconds between Keepalive probe retransmissions. | 60 seconds |
      | Linux: net.ipv4.tcp_keepalive_probes; FreeBSD/macOS: net.inet.tcp.keepcnt | The number of consecutive failed probes after which the connection is dropped. | 5 |
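
The recommended values follow from simple arithmetic against the 900-second idle timeout. The sketch below assumes that a Keepalive probe counts as traffic and therefore resets the idle timer on the load balancer, which is what the solution above relies on. The function names are hypothetical.

```python
SLB_IDLE_TIMEOUT_S = 900  # idle timeout stated in the Cause above


def keeps_connection_open(keepidle_s: int, timeout_s: int = SLB_IDLE_TIMEOUT_S) -> bool:
    """The first probe must be sent before the load balancer's idle timeout expires."""
    return keepidle_s < timeout_s


def dead_peer_detection_s(keepidle_s: int, keepintvl_s: int, keepcnt: int) -> int:
    """Time until an unresponsive peer is dropped: idle time plus all failed probes."""
    return keepidle_s + keepintvl_s * keepcnt


print(keeps_connection_open(600))         # True: 600 s < 900 s
print(keeps_connection_open(7200))        # False: the Linux default of 7200 s is too late
print(dead_peer_detection_s(600, 60, 5))  # 900
```

Any idle time comfortably below 900 seconds works; 600 seconds leaves a margin without generating frequent probe traffic.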

      Linux

      • Apply settings temporarily

        # Set global Keepalive parameters (root permissions required)
        sudo sysctl -w net.ipv4.tcp_keepalive_time=600   # Corresponds to keepidle (600 seconds)
        sudo sysctl -w net.ipv4.tcp_keepalive_intvl=60   # Corresponds to keepintvl (60 seconds)
        sudo sysctl -w net.ipv4.tcp_keepalive_probes=5   # Corresponds to keepcount (5)
      • Apply settings permanently

        Append the following settings to /etc/sysctl.conf (root permissions required), and then run sysctl -p to apply them.

        echo "net.ipv4.tcp_keepalive_time = 600" | sudo tee -a /etc/sysctl.conf
        echo "net.ipv4.tcp_keepalive_intvl = 60" | sudo tee -a /etc/sysctl.conf
        echo "net.ipv4.tcp_keepalive_probes = 5" | sudo tee -a /etc/sysctl.conf
        sudo sysctl -p

      FreeBSD/macOS

      • Apply settings temporarily

        # Set global Keepalive parameters (root permissions required)
        # FreeBSD and macOS express keepidle and keepintvl in milliseconds
        sudo sysctl -w net.inet.tcp.keepidle=600000   # 600 seconds
        sudo sysctl -w net.inet.tcp.keepintvl=60000   # 60 seconds
        sudo sysctl -w net.inet.tcp.keepcnt=5
      • Apply settings permanently

        Append the following settings to /etc/sysctl.conf (root permissions required).

        echo "net.inet.tcp.keepidle = 600000" | sudo tee -a /etc/sysctl.conf
        echo "net.inet.tcp.keepintvl = 60000" | sudo tee -a /etc/sysctl.conf
        echo "net.inet.tcp.keepcnt = 5" | sudo tee -a /etc/sysctl.conf
    • Application-level settings

      You can use language-specific APIs to set TCP Keepalive parameters for a single connection.

      Java

      Socket.setKeepAlive(true) only turns Keepalive on; it does not control the probe timing. To set the idle time, probe interval, and probe count for a single connection, use the extended socket options in jdk.net.ExtendedSocketOptions, which are available in JDK 11 and later.

      Note

      The following code requires JDK 11 or later and an operating system that supports these options, such as Linux or macOS. On platforms that do not support them, setOption throws UnsupportedOperationException. We recommend that you test for compatibility before use in a production environment.

      import java.io.IOException;
      import java.net.InetSocketAddress;
      import java.net.Socket;
      import java.net.StandardSocketOptions;
      import jdk.net.ExtendedSocketOptions;

      public class TcpKeepaliveExample {
          public static void main(String[] args) {
              try (Socket socket = new Socket()) {
                  // 1. Enable Keepalive
                  socket.setOption(StandardSocketOptions.SO_KEEPALIVE, true);
                  // 2. Set Keepalive parameters (values are in seconds)
                  // Keepidle: idle time before the first probe
                  socket.setOption(ExtendedSocketOptions.TCP_KEEPIDLE, 600);
                  // Keepintvl: interval between probes
                  socket.setOption(ExtendedSocketOptions.TCP_KEEPINTERVAL, 60);
                  // Keepcount: number of failed probes before the connection is dropped
                  socket.setOption(ExtendedSocketOptions.TCP_KEEPCOUNT, 5);
                  // Connect to the server
                  socket.connect(new InetSocketAddress("example.com", 80));
                  // ... Other operations ...
              } catch (IOException | UnsupportedOperationException e) {
                  // UnsupportedOperationException: the platform does not support an option
                  e.printStackTrace();
              }
          }
      }

      Python

      The Python socket module supports direct configuration of TCP Keepalive parameters.

      Note

      Different operating systems may use different parameter names. For example, macOS may require TCP_KEEPALIVE instead of TCP_KEEPIDLE. Some parameters may require root permissions to set.

      import socket
      def create_keepalive_socket():
          sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
          # 1. Enable Keepalive
          sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
          # 2. Set Keepalive parameters (Linux/FreeBSD)
          # Keepidle: 600 seconds
          sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 600)
          # Keepintvl: 60 seconds
          sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 60)
          # Keepcount: 5
          sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)
          return sock
      # Example
      sock = create_keepalive_socket()
      sock.connect(("example.com", 80))
      # ... Other operations ...
      sock.close()
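
Because option names differ between operating systems, as the note above mentions, a portable helper can look them up at run time. The following is a minimal sketch, assuming that macOS exposes the idle-time option as TCP_KEEPALIVE and that options missing from the socket module should be skipped rather than treated as errors. The helper name is hypothetical.

```python
import socket


def enable_keepalive(sock: socket.socket, idle: int = 600, interval: int = 60, count: int = 5) -> None:
    """Enable TCP Keepalive and set its timing where the platform exposes the options."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Linux and FreeBSD expose TCP_KEEPIDLE; macOS names the same setting TCP_KEEPALIVE.
    idle_option = getattr(socket, "TCP_KEEPIDLE", None)
    if idle_option is None:
        idle_option = getattr(socket, "TCP_KEEPALIVE", None)
    if idle_option is not None:
        sock.setsockopt(socket.IPPROTO_TCP, idle_option, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)


# The options are set before connecting, so no network access is needed here.
demo = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(demo)
print(demo.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
demo.close()
```

Call enable_keepalive on the socket before or after connect; the settings apply only to that connection and leave the system-wide defaults unchanged.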

      Golang

      The Golang net package provides basic Keepalive configuration. However, you must use the low-level syscall package to set detailed parameters.

      Note

      Different operating systems may use different parameter names. Some parameters may require root permissions to set.

      package main
      import (
          "fmt"
          "net"
          "syscall"
      )
      func main() {
          // Create a TCP connection
          conn, err := net.Dial("tcp", "example.com:80")
          if err != nil {
              panic(err)
          }
          defer conn.Close()
          // Get the underlying file descriptor
          file, err := conn.(*net.TCPConn).File()
          if err != nil {
              panic(err)
          }
          defer file.Close()
          fd := int(file.Fd())
      
          // Enable Keepalive
          err = syscall.SetsockoptInt(fd, syscall.SOL_SOCKET, syscall.SO_KEEPALIVE, 1)
          if err != nil {
              panic(fmt.Errorf("set SO_KEEPALIVE: %v", err))
          }
          // Set Keepidle (idle time)
          err = syscall.SetsockoptInt(fd, syscall.IPPROTO_TCP, syscall.TCP_KEEPIDLE, 600)
          if err != nil {
              panic(fmt.Errorf("set TCP_KEEPIDLE: %v", err))
          }
          // Set Keepintvl (retransmission interval)
          err = syscall.SetsockoptInt(fd, syscall.IPPROTO_TCP, syscall.TCP_KEEPINTVL, 60)
          if err != nil {
              panic(fmt.Errorf("set TCP_KEEPINTVL: %v", err))
          }
          // Set Keepcount (number of failures)
          err = syscall.SetsockoptInt(fd, syscall.IPPROTO_TCP, syscall.TCP_KEEPCNT, 5)
          if err != nil {
              panic(fmt.Errorf("set TCP_KEEPCNT: %v", err))
          }
          // ... Other operations ...
      }

"Not a OLAP table" error during data backup

  • Symptom: When you back up data by creating a snapshot in StarRocks, you receive the following error message:

    Unexpected exception: Table '<table_name>' is not a OLAP table
  • Causes:

    • The instance is in shared-data mode.

      StarRocks shared-data instances do not support data backup and recovery. This feature is only available for shared-nothing instances.

    • The table engine type is incompatible.

      The StarRocks backup feature only supports tables that use the OLAP engine. This error occurs if the table engine is not OLAP.

  • Solution:

    • Check the instance type.

      In the StarRocks Instance List, check the Instance Type. If the instance is of the Shared-data type, it does not support data backup and recovery. We recommend using a shared-nothing instance to enable backup and recovery. For more information, see Backup and recovery.

    • Check the table engine type.

      Examine the DDL definition of the target table to confirm if ENGINE=OLAP is set.

      SHOW CREATE TABLE <table_name>;

      If the table engine is not OLAP, recreate the table according to your business needs and ensure that you specify ENGINE=OLAP.

"Primary-key index exceeds the limit" error on import

  • Symptom

    When you write data to a Primary Key model table, the following error occurs:
    msg: Cancelled, msg: Primary-key index exceeds the limit. tablet_id: 2506733, consumption: 33176971421, limit: 32116807950. Memory stats of top five tablets: 3656508(4465M) 3656496(4464M) 3656544(4464M) 3656520(4462M) 3656532(4461M): be: backend-0.backend.xxx.svc.cluster.local.xxx

  • Troubleshooting approach

    • Analyze the current memory usage of the BE node to identify any resource bottlenecks.

    • Determine the cluster type (shared-nothing or shared-data) and check the corresponding configuration parameters.

  • Detailed troubleshooting steps

    • Analyze memory usage

      • This error occurs because the primary key index's memory consumption has exceeded the BE node's memory limit.

      • Check the mem_limit configuration of the BE node to assess its available memory capacity. You can obtain this information by running SHOW FRONTENDS or SHOW BACKENDS.

    • Solutions

      • Solution 1: Enable the persistent index (Recommended)

        • Shared-nothing: Set enable_persistent_index to true.

        • Shared-data: Set persistent_index_type to cloud_native.

      • Solution 2: Implement an effective partitioning strategy

        • Partition the primary key table effectively (for example, by time or region) to avoid writing to the entire table at once.

        • After partitioning, each write operation affects only a subset of partitions. The primary key index only needs to load data for the corresponding partitions, thereby reducing the memory pressure of a single write operation.

"Reached timeout" error during import

  • Symptoms

    • A Flink job fails to import data into StarRocks and reports the following error:
      Message: [E1008] Reached timeout=7500ms @x.x.x.x:8060.

    • An INSERT INTO job fails and reports the following error:
      java.sql.SQLException: [E1008] Reached timeout=7500ms @10.106.7.182:8060.

  • Troubleshooting approach

    • If the timeout value in the error message is not 30000 ms (the default value for rpc_connect_timeout_ms), check if the rpc_connect_timeout_ms parameter on the BE node has been manually adjusted.

    • For INSERT INTO import jobs, you need to check whether the query_timeout parameter is set (for example, query_timeout = 15). Currently, the internal logic of StarRocks sets the RPC timeout threshold to half of the query_timeout value, in milliseconds (ms). Therefore, if query_timeout=15, the corresponding timeout is 7,500 ms.
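
The relationship between query_timeout and the timeout shown in the error can be checked with a short calculation. This sketch only encodes the rule stated above (half of query_timeout, expressed in milliseconds); the function names are hypothetical.

```python
def rpc_timeout_ms(query_timeout_s: int) -> int:
    """RPC timeout in milliseconds: half of query_timeout, which is given in seconds."""
    return query_timeout_s * 1000 // 2


def query_timeout_for(rpc_timeout_in_ms: int) -> float:
    """Inverse: the query_timeout (seconds) that would produce the timeout in the error."""
    return rpc_timeout_in_ms * 2 / 1000


print(rpc_timeout_ms(15))       # 7500, matching "Reached timeout=7500ms"
print(query_timeout_for(7500))  # 15.0
```

Working backwards from the value in the error message is a quick way to tell whether a session-level query_timeout, rather than rpc_connect_timeout_ms, set the limit.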

  • Detailed troubleshooting steps

    • If you have modified the rpc_connect_timeout_ms parameter on the BE node, we recommend restoring it to the default value (30000 ms) to avoid false timeouts.

    • The error Reached timeout=7500ms usually indicates that the brpc thread load on the BE node is high, which causes delays in processing RPC requests and in turn triggers a timeout.

    • Analyze the data distribution of the target table by running SHOW TABLET FROM target_database.target_table ORDER BY RowCount DESC; to check if the table's tablet distribution is reasonable. For example, if the data volume of a single tablet far exceeds the recommended range of 1 GB to 10 GB, some BE nodes may experience high load, affecting write performance.

    • Solutions

      • Solution 1 (Recommended)
        Optimize the table's bucketing strategy by selecting an appropriate high-cardinality column as the bucketing key for DISTRIBUTED BY HASH(...) to ensure even data distribution across tablets.

      • Solution 2 (Temporary mitigation)
        If the CPU and I/O load on the BE nodes are not at their limits, you can tune the following parameters:

        • Increase brpc_num_threads: The default value is the number of BE CPU cores. You can try adjusting it to 2 or 4 times the original value. Do not exceed 4 times to avoid increased thread contention.

        • Increase flush_thread_num_per_store: The default value is 2. You can adjust it to 4 to improve data flushing concurrency.

"NULL value in non-nullable column" error

  • Symptom: When importing data into a StarRocks table, the following error occurs: Error: NULL value in non-nullable column 'xxx'.

  • Cause: Writing a NULL value to a NOT NULL column violates the table schema, causing the import job to fail.

  • Solutions

    • Solution 1: Fix the source data

      Before writing data to StarRocks, filter or replace NULL values to ensure the data conforms to the target table's constraints.

    • Solution 2: Modify the table schema

      If your business logic allows the field to be null, modify the table schema to remove the NOT NULL constraint.

"Too many versions" error with Flink connector

  • Symptom

    While continuously writing data to a StarRocks table using the Flink Connector, the following error occurs: because of too many versions, current/limit: 1009/1000.

  • Cause: In StarRocks' Primary Key model or Unique Key model (with merge-on-write), each data import generates a new version. To prevent metadata bloat and ensure query performance, the system limits each partition to a maximum of 1000 versions by default.

  • Solution

    • Check the partition compaction score.

      Run the following SQL statement to check the compaction pressure on the partitions of the target table:

      SELECT 
          TABLE_NAME,
          PARTITION_NAME,
          AVG_CS AS avg_compaction_score,
          MAX_CS AS max_compaction_score
      FROM information_schema.partitions_meta 
      WHERE TABLE_NAME = 'your_table_name';

      If the max_compaction_score (MAX_CS) of the affected partition is much higher than 100, a large number of small versions are pending merge and compaction is not keeping up with the import rate.

    • Manually trigger compaction.

      Run the following command: ALTER TABLE your_db.your_table COMPACT your_partition_name;

    • Optimize Flink sink parameters.

      Increase the Flink sink parameters, including sink.buffer-flush.max-bytes, sink.buffer-flush.max-rows, and sink.buffer-flush.interval-ms, to reduce the import frequency and avoid generating too many small versions.

"Failed to get status for file" error

  • Symptom

    When querying a table in an external data lake (such as Paimon or Iceberg), the query fails with the following error:
    (1064, 'Failed to get status for file: oss://data-lakehouse-oss-normal/dataware.db/dwd_annotation2_user/metadata/00097-10647858-814a-499e-b300-51c570ee7ee0.metadata.json'). The OSS API returns the following error message:

    <Error>
      <Code>AccessDenied</Code>
      <Message>You have no right to access this object because of bucket acl.</Message>
      <RequestId>68EC744EB6CD8C3539FAB32A</RequestId>
      <HostId>data-lakehouse-oss-normal.oss-cn-shenzhen-internal.aliyuncs.com</HostId>
      <EC>0003-00000001</EC>
      <RecommendDoc>https://api.aliyun.com/troubleshoot?q=0003-00000001</RecommendDoc>
    </Error>
  • Cause: When querying an external data lake (such as Hive, Iceberg, or Hudi) with the External Catalog feature, StarRocks requires access to object storage like Alibaba Cloud OSS. Improper permissions or configurations cause failures when reading metadata or data files.

  • Solution

    • Verify the AccessKey pair validity: Confirm that the configured accessKeyId and accessKeySecret can access the target OSS bucket.

    • If the accessKeyId and accessKeySecret are configured correctly, check if you are performing cross-account access to an OSS bucket. For cross-account access, you need to modify the configurations. For details, see How do I access OSS across accounts?.

Check disk usage of persistent indexes

After you enable the persistent index (enable_persistent_index = true or persistent_index_type = 'cloud_native'), the primary key index is stored on disk. You can check its disk usage by querying the information_schema.be_tablets table.

-- Query the index size of tables, ordered by index size in descending order
SELECT 
    tables_config.TABLE_NAME,
    t1.TABLE_ID,
    t1.index_sum_mb
FROM (
    -- Calculate the total index size (MB) for each table
    SELECT 
        TABLE_ID,
        SUM(INDEX_DISK)/1024/1024 AS index_sum_mb
    FROM information_schema.be_tablets 
    GROUP BY TABLE_ID
) t1 
JOIN information_schema.tables_config AS tables_config ON tables_config.TABLE_ID = t1.TABLE_ID 
ORDER BY index_sum_mb DESC
-- Optional: Add a LIMIT clause to restrict the number of results to avoid large outputs
-- LIMIT 100
;

View in-progress write transactions and tablets

Track ongoing or recently completed import jobs to identify which tablets are being written to.

SELECT 
    txn_table.*,
    tc.table_name
FROM (
    SELECT 
        bt.TABLET_ID,
        bt.COMMIT_TIME,
        bt.PUBLISH_TIME,
        bt.TABLE_ID
    FROM information_schema.be_txns bt
    JOIN information_schema.be_tablets btt 
        ON bt.TABLET_ID = btt.TABLET_ID
) AS txn_table
JOIN information_schema.tables_config tc 
    ON txn_table.TABLE_ID = tc.TABLE_ID;

Analyze CPU or memory load spikes

Use the audit log to identify queries with high resource consumption.

SELECT 
    queryId,
    timestamp,
    ROUND(memCostBytes / 1024 / 1024 / 1024, 2) AS memCostGB,
    cpuCostNs
FROM _starrocks_audit_db_.starrocks_audit_tbl
WHERE timestamp BETWEEN '2025-xx-xx hh:mm:ss' AND '2025-xx-xx hh:mm:ss'
ORDER BY cpuCostNs DESC, memCostGB DESC
LIMIT 20;

Analyze I/O load spikes

I/O spikes are typically caused by large-scale scans, such as full table scans or queries that miss partitions or indexes.

SELECT 
    queryId,
    timestamp,
    ROUND(scanBytes / 1024 / 1024 / 1024, 2) AS scanTotalGB
FROM _starrocks_audit_db_.starrocks_audit_tbl
WHERE timestamp BETWEEN '2025-xx-xx hh:mm:ss' AND '2025-xx-xx hh:mm:ss'
ORDER BY scanTotalGB DESC
LIMIT 20;

"Insufficient storage" error during BE node scale-in

  • Symptom

    When you scale in BE nodes from the console of a fully managed StarRocks cluster, the following error is reported: invalid status: [insufficient storage].

  • Cause: The scale-in validation fails. The cluster must meet the following storage condition after the scale-in, or the operation will be rejected:

    Used Storage < Total Capacity after Scale-in × 0.7

    The terms are defined as follows:

    • Total Capacity after Scale-in = The sum of the total capacity of the remaining nodes (Total Nodes - Scaled-in Nodes).

    • Used Storage = The sum of (Total Capacity - Available Capacity) for all nodes.

    • Obtain the TotalCapacity and AvailCapacity values by running SHOW BACKENDS.
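
The validation rule can be checked before you submit a scale-in request. The following is a minimal sketch of the formula above, assuming that the TotalCapacity and AvailCapacity values from SHOW BACKENDS have already been converted to a common unit such as GB. The function name, node names, and capacities are hypothetical.

```python
def can_scale_in(nodes: dict[str, tuple[float, float]], removed: set[str], ratio: float = 0.7) -> bool:
    """Check: used storage < total capacity after scale-in * ratio.

    nodes maps a BE node name to (total_capacity, available_capacity) in the same unit.
    removed is the set of node names to be scaled in.
    """
    # Used storage is summed over all nodes, including the ones to be removed.
    used = sum(total - avail for total, avail in nodes.values())
    # Only the nodes that remain contribute capacity after the scale-in.
    remaining_total = sum(total for name, (total, _) in nodes.items() if name not in removed)
    return used < remaining_total * ratio


# Hypothetical cluster: three BE nodes with 1000 GB each.
half_full = {"be-0": (1000, 500), "be-1": (1000, 500), "be-2": (1000, 500)}
print(can_scale_in(half_full, {"be-2"}))  # False: 1500 GB used, limit is 2000 * 0.7 = 1400 GB

lighter = {"be-0": (1000, 600), "be-1": (1000, 600), "be-2": (1000, 600)}
print(can_scale_in(lighter, {"be-2"}))    # True: 1200 GB used is below 1400 GB
```

In the first case, either freeing at least 100 GB or expanding the disks of the remaining nodes would bring the cluster under the threshold.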

  • Solution

    • Check the current cluster capacity.

      Run the following SQL command to get the total and available capacity of each BE node:

      SHOW BACKENDS\G

      Focus on the TotalCapacity and AvailCapacity fields to calculate whether the used storage exceeds 70% of the total capacity after the scale-in.

    • Expand the disk and retry.

      If the disk capacity is insufficient, expand the disk size of the BE node in the EMR console. After you ensure that Used storage < Total capacity after scale-in × 0.7, retry the scale-in operation.

"RAM.Permission.NotAllow" error in StarRocks Manager

  • Symptom

    When a RAM user logs in to the StarRocks Manager page for a fully managed cluster, the error You are not authorized to perform the operation (code: RAM.Permission.NotAllow) appears.

  • Cause: The RAM user lacks the necessary permissions for EMR Serverless StarRocks, preventing access to the StarRocks Manager page. For more information, see Grant permissions to a RAM user.

  • Solution

    • Solution 1: Attach a system policy.

      Log in to the RAM console and attach the AliyunEMRStarRocksFullAccess system policy to the target RAM user to grant full operational permissions for EMR Serverless StarRocks.

    • Solution 2: Grant specific permissions.

      To avoid granting full access, identify the specific missing permission from the RequestId in the error message, and then grant only that permission to the RAM user in the RAM console. For example, if the emr-serverless-starrocks:ListInstances permission is missing, you can create a custom policy to grant it individually.

    To view all permissions included in this system policy, search for AliyunEMRStarRocksFullAccess on the Policies page in the RAM console.