Big data instance types

Big data instance families combine large-scale local storage with high internal network bandwidth. They are optimized for Hadoop MapReduce, Hadoop Distributed File System (HDFS), Hive, HBase, Spark, Elasticsearch, and Kafka workloads. Most families offer a 1:4 CPU-to-memory ratio.

Note

Instance availability varies by region. Before you select an instance type, check purchase availability by region and review the instance type selection guide. For metric definitions, see instance metric descriptions. To estimate costs, use the ECS Price Calculator.

Recommended instance families

| Recommended | Not recommended (use the recommended families if these are sold out) |
| --- | --- |
| d3s, d3c, d2c, d2s | d1ne |

Choose between d3s and d3c

Both d3s and d3c use Intel Ice Lake processors and support hot disk swapping, but they serve different workload profiles:

| Attribute | d3s (storage-intensive) | d3c (compute-intensive) |
| --- | --- | --- |
| Max local storage per instance | 32 × 11,918 GB (~380 TB) | 4 × 13,743 GB (~55 TB) |
| Max network bandwidth | 80 Gbit/s | 40 Gbit/s |
| Disk IOPS spec | Not published | Up to 100,000 IOPS |
| OS support | Not restricted (Linux and Windows supported) | Linux only |
| Storage-compute decoupling | Not mentioned in product documentation | Supported (EMR JindoFS + OSS) |

Choose d3s when storage capacity and sequential throughput are the primary constraints — for example, large HDFS clusters with high data volumes per node. Choose d3c when you need higher disk IOPS, storage-compute decoupling via EMR JindoFS and Object Storage Service (OSS), or hot/cold data separation.
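
The approximate capacity figures in the comparison above follow from multiplying the disk count by the per-disk size. The following is a minimal sketch of that arithmetic. `raw_tb` is a hypothetical helper name (it is not part of any Alibaba Cloud tooling), and 1 TB is taken as 1,000 GB.

```shell
# raw_tb DISK_COUNT GB_PER_DISK
# Hypothetical helper: prints raw local capacity in decimal TB (1 TB = 1000 GB).
raw_tb() {
  awk -v n="$1" -v gb="$2" 'BEGIN { printf "%.1f\n", n * gb / 1000 }'
}

raw_tb 32 11918   # largest d3s instance type
raw_tb 4 13743    # largest d3c instance type
```

The two calls print 381.4 and 55.0, which correspond to the rounded ~380 TB and ~55 TB figures.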

Local disk considerations

Warning

Local disk data durability depends on the reliability of the physical host. A hardware failure on the host can result in data loss. Store only temporary or replicated data on local disks. For more information, see Local disks.

Frameworks such as HDFS and Kafka replicate data across multiple nodes by design. Size your cluster with enough replicas to maintain durability if a single node fails.
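
As a rough sizing aid for the replication point above, usable capacity is approximately the raw capacity divided by the replication factor. The sketch below assumes a replication factor of 3 (the HDFS default for `dfs.replication`) and ignores file system overhead and reserved space. The helper name `usable_tb` and the 10-node cluster are illustrative assumptions.

```shell
# usable_tb NODES DISKS_PER_NODE GB_PER_DISK REPLICAS
# Hypothetical helper: prints approximate usable capacity in decimal TB.
usable_tb() {
  awk -v nodes="$1" -v disks="$2" -v gb="$3" -v r="$4" \
    'BEGIN { printf "%.1f\n", nodes * disks * gb / 1000 / r }'
}

# Example: 10 nodes of ecs.d3s.16xlarge (32 x 11,918 GB each), 3 replicas
usable_tb 10 32 11918 3
```

With these inputs, roughly one third of the raw capacity remains for unique data. Rerun the calculation with the replica count your framework is actually configured to use.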

Additional constraints:

  • Instances with local SSDs do not support instance type changes.

  • Local disks are tied to specific instance types. The count and capacity vary by instance type. Local disks cannot be purchased separately or moved to other instances.

  • Snapshots cannot be created for local disks. To create an image from an instance with local SSDs, snapshot only the system disk and any cloud data disks (not local disks), then combine those snapshots into an image.

  • Images that combine system disk snapshots with local SSD data disk snapshots cannot be created.

  • Standard SSDs can be attached to instances with local SSDs, and their capacity can be extended.

  • Certain instance operations affect data on local disks. For details, see Impacts of instance operations on data stored on local disks.

Initialize local disks

Linux kernel v2.6.37 and later enable the lazyinit feature by default, which defers inode table initialization until the file system is mounted. On instances with many local disks, this deferred initialization can consume up to 600 MB/s of disk throughput and affect service stability. Linux kernel v4.x increased the concurrency limit for lazy initialization. For the upstream fix, see this kernel commit.
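
To see whether an instance falls under the version threshold described above, you can compare the running kernel release against 2.6.37. This is a minimal sketch. `lazyinit_default_on` is a hypothetical helper that looks only at the version string, so it does not account for distribution kernels that backport or disable the feature.

```shell
# lazyinit_default_on KERNEL_RELEASE
# Hypothetical helper: prints "yes" if the release is 2.6.37 or later, else "no".
lazyinit_default_on() {
  printf '%s\n' "$1" | awk -F'[.-]' '{
    major = $1 + 0; minor = $2 + 0; patch = $3 + 0
    if (major > 2 || (major == 2 && (minor > 6 || (minor == 6 && patch >= 37))))
      print "yes"
    else
      print "no"
  }'
}

lazyinit_default_on "$(uname -r)"
```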

To initialize all local disks before starting services:

  1. List all local SATA HDDs on the instance.

  2. Run the following command for each local disk to disable lazy initialization. This example formats /dev/vdb with an ext4 file system:

    mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/vdb &

    Run this command in parallel for each disk (note the trailing &).

  3. After all disks finish formatting, run the following command and wait until the I/O activity for every disk drops to 0:

    iostat -x 5

  4. Mount all disks.
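
The steps above can be sketched as a small script. This is an illustration under stated assumptions, not an official procedure. The device names (/dev/vdb, /dev/vdc) and the /mnt/diskN mount points are hypothetical, and `DRY_RUN` defaults to 1 so that the script only prints the commands, because mkfs.ext4 destroys any data on the target device.

```shell
# Assumptions: device list and /mnt/diskN mount points are hypothetical.
# DRY_RUN=1 (the default here) prints commands instead of running them.
MKFS_OPTS="lazy_itable_init=0,lazy_journal_init=0"

format_disks() {
  for dev in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "mkfs.ext4 -E $MKFS_OPTS $dev"
    else
      mkfs.ext4 -E "$MKFS_OPTS" "$dev" &   # one background job per disk
    fi
  done
  wait   # returns once every background mkfs has exited
}

mount_disks() {
  i=1
  for dev in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "mount $dev /mnt/disk$i"
    else
      mkdir -p "/mnt/disk$i" && mount "$dev" "/mnt/disk$i"
    fi
    i=$((i + 1))
  done
}

format_disks /dev/vdb /dev/vdc
# Step 3 (watching `iostat -x 5` until disk I/O is idle) stays manual here.
mount_disks /dev/vdb /dev/vdc
```

Set `DRY_RUN=0` only after you have confirmed that every device in the list is a local disk that holds no data you need.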

d3s, storage-intensive big data instance family

Key specs: Up to 32 × 11,918 GB local SATA HDDs (~380 TB raw), up to 80 Gbit/s network bandwidth, 2.7 GHz Intel® Xeon® Scalable (Ice Lake) processors with 3.5 GHz all-core turbo frequency.

Use cases:

  • Hadoop MapReduce, HDFS, Hive, and HBase workloads

  • Spark in-memory computing and MLlib

  • Elasticsearch and Kafka deployments

Hardware:

  • All instances are I/O optimized

  • Supported cloud disk types: ESSDs and ESSD AutoPL disks

  • Network: IPv4 and IPv6. For IPv6 setup, see IPv6 communication. Network performance scales with instance size.

Disk failure handling: d3s supports online replacement and hot swapping of failed disks without instance shutdown. When a disk fails, you receive a system event. Initiate the disk repair process to resolve it. For details, see O&M scenarios and system events for instances equipped with local disks.

Important

Data on a failed disk cannot be restored after you initiate the repair process.

Instance types:

| Instance type | vCPUs | Memory (GiB) | Local storage | Network baseline/burst bandwidth (Gbit/s) | Packet forwarding rate (pps) | Disk baseline/burst bandwidth (Gbit/s) |
| --- | --- | --- | --- | --- | --- | --- |
| ecs.d3s.2xlarge | 8 | 32 | 4 × 11,918 GB (4 × 11,100 GiB) | 10/burstable up to 15 | 2,000,000 | 3/burstable up to 5 |
| ecs.d3s.4xlarge | 16 | 64 | 8 × 11,918 GB (8 × 11,100 GiB) | 25/none | 3,000,000 | 5/none |
| ecs.d3s.8xlarge | 32 | 128 | 16 × 11,918 GB (16 × 11,100 GiB) | 40/none | 6,000,000 | 8/none |
| ecs.d3s.12xlarge | 48 | 192 | 24 × 11,918 GB (24 × 11,100 GiB) | 60/none | 9,000,000 | 12/none |
| ecs.d3s.16xlarge | 64 | 256 | 32 × 11,918 GB (32 × 11,100 GiB) | 80/none | 12,000,000 | 16/none |
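
The table above lists each local disk size in both GB (10^9 bytes) and GiB (2^30 bytes). The following sketch shows the conversion between the two. `gib_to_gb` is a hypothetical helper that uses integer arithmetic and truncates to whole GB, which appears to match how the table figures were derived.

```shell
# gib_to_gb GIB
# Hypothetical helper: converts GiB (2^30 bytes) to whole GB (10^9 bytes), truncating.
gib_to_gb() {
  echo $(( $1 * 1073741824 / 1000000000 ))
}

gib_to_gb 11100   # per-disk size on d3s
```

The call prints 11918, the per-disk GB figure listed for d3s.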

d3c, compute-intensive big data instance family

Key specs: Up to 4 × 13,743 GB local disks (~55 TB raw), up to 40 Gbit/s network bandwidth, third-generation 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors with 3.5 GHz all-core turbo frequency.

Note

d3c supports Linux images only. Select a Linux image when creating an instance.

Use cases:

  • Hadoop MapReduce, HDFS, Hive, and HBase workloads

  • Storage-compute decoupling with EMR JindoFS and OSS (hot/cold data separation)

  • Spark in-memory computing and MLlib

  • Elasticsearch and Kafka deployments

Hardware:

  • All instances are I/O optimized

  • Supported cloud disk types: ESSDs and ESSD AutoPL disks

  • Network: IPv4 and IPv6. For IPv6 setup, see IPv6 communication. Network performance scales with instance size.

Disk failure handling: d3c supports online replacement and hot swapping of failed disks without instance shutdown. When a disk fails, you receive a system event. Initiate the disk repair process to resolve it. For details, see O&M scenarios and system events for instances equipped with local disks.

Important

Data on a failed disk cannot be restored after you initiate the repair process.

Instance types:

| Instance type | vCPUs | Memory (GiB) | Local storage | Network baseline/burst bandwidth (Gbit/s) | Packet forwarding rate (pps) | Disk baseline/burst IOPS | Disk baseline/burst bandwidth (Gbit/s) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ecs.d3c.3xlarge | 14 | 56.0 | 1 × 13,743 GB (1 × 12,800 GiB) | 8/burstable up to 10 | 1,600,000 | 40,000/none | 3/none |
| ecs.d3c.7xlarge | 28 | 112.0 | 2 × 13,743 GB (2 × 12,800 GiB) | 16/burstable up to 25 | 2,500,000 | 50,000/none | 4/none |
| ecs.d3c.14xlarge | 56 | 224.0 | 4 × 13,743 GB (4 × 12,800 GiB) | 40/none | 5,000,000 | 100,000/none | 8/none |
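
One way to quantify the difference between the storage-intensive and compute-intensive families is local storage per vCPU. The sketch below computes that ratio for the largest d3s and d3c instance types, using the figures from the instance type tables. `gb_per_vcpu` is a hypothetical helper name.

```shell
# gb_per_vcpu DISK_COUNT GB_PER_DISK VCPUS
# Hypothetical helper: prints local storage per vCPU, rounded to whole GB.
gb_per_vcpu() {
  awk -v n="$1" -v gb="$2" -v v="$3" 'BEGIN { printf "%.0f\n", n * gb / v }'
}

gb_per_vcpu 32 11918 64   # ecs.d3s.16xlarge
gb_per_vcpu 4 13743 56    # ecs.d3c.14xlarge
```

The calls print 5959 and 982, so the largest d3s type carries about six times as much local storage per vCPU as the largest d3c type.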

d2c, compute-intensive big data instance family

Key specs: Up to 12 × 3,972 GB local SATA HDDs (~48 TB raw), up to 35 Gbit/s network bandwidth, 2.5 GHz Intel® Xeon® Platinum 8269CY (Cascade Lake) processors.

Use cases:

  • Hadoop MapReduce, HDFS, Hive, and HBase workloads

  • Storage-compute decoupling with EMR JindoFS and OSS (hot/cold data separation)

  • Spark in-memory computing and MLlib

  • Elasticsearch and Kafka deployments

Hardware:

  • All instances are I/O optimized

  • Supported cloud disk types: Enhanced SSDs (ESSDs), ESSD AutoPL disks, standard SSDs, and ultra disks

  • Network: IPv4 and IPv6. For IPv6 setup, see IPv6 communication. Network performance scales with instance size.

Disk failure handling: d2c supports online replacement and hot swapping of failed disks without instance shutdown. When a disk fails, you receive a system event. Initiate the disk repair process to resolve it. For details, see O&M scenarios and system events for instances equipped with local disks.

Important

Data on a failed disk cannot be restored after you initiate the repair process.

Instance types:

| Instance type | vCPUs | Memory (GiB) | Local storage | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) |
| --- | --- | --- | --- | --- | --- |
| ecs.d2c.6xlarge | 24 | 88.0 | 3 × 3,972 GB (3 × 3,700 GiB) | 12.0 | 1,600,000 |
| ecs.d2c.12xlarge | 48 | 176.0 | 6 × 3,972 GB (6 × 3,700 GiB) | 20.0 | 2,000,000 |
| ecs.d2c.24xlarge | 96 | 352.0 | 12 × 3,972 GB (12 × 3,700 GiB) | 35.0 | 4,500,000 |

d2s, storage-intensive big data instance family

Key specs: Up to 30 × 7,838 GB local SATA HDDs (~235 TB raw), up to 35 Gbit/s network bandwidth, 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.

Use cases:

  • Hadoop MapReduce, HDFS, Hive, and HBase workloads

  • Spark in-memory computing and MLlib

  • Elasticsearch and Kafka deployments

Hardware:

  • All instances are I/O optimized

  • Supported cloud disk types: ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks

  • Network: IPv4 and IPv6. For IPv6 setup, see IPv6 communication. Network performance scales with instance size.

Disk failure handling: d2s supports online replacement and hot swapping of failed disks without instance shutdown. When a disk fails, you receive a system event. Initiate the disk repair process to resolve it. For details, see O&M scenarios and system events for instances equipped with local disks.

Important

Data on a failed disk cannot be restored after you initiate the repair process.

Instance types:

| Instance type | vCPUs | Memory (GiB) | Local storage | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) |
| --- | --- | --- | --- | --- | --- |
| ecs.d2s.5xlarge | 20 | 88.0 | 8 × 7,838 GB (8 × 7,300 GiB) | 12.0 | 1,600,000 |
| ecs.d2s.10xlarge | 40 | 176.0 | 15 × 7,838 GB (15 × 7,300 GiB) | 20.0 | 2,000,000 |
| ecs.d2s.20xlarge | 80 | 352.0 | 30 × 7,838 GB (30 × 7,300 GiB) | 35.0 | 4,500,000 |

d1ne, network-enhanced big data instance family (not recommended)

Note

d1ne is no longer recommended. Use d3s, d3c, d2c, or d2s instead.

Key specs: Up to 28 × 5,905 GB local SATA HDDs (~165 TB raw), up to 35 Gbit/s network bandwidth, 1:4 CPU-to-memory ratio.

Use cases:

  • Hadoop MapReduce, HDFS, Hive, and HBase workloads

  • Spark in-memory computing and MLlib

  • Elasticsearch deployments

Hardware:

  • Processor: 2.5 GHz Intel® Xeon® E5-2682 v4 (Broadwell) or Intel® Xeon® Platinum 8163 (Skylake)

  • All instances are I/O optimized

  • Supported cloud disk types: standard SSDs and ultra disks only

  • Network: IPv4 and IPv6. For IPv6 setup, see IPv6 communication. Network performance scales with instance size.

Instance types:

| Instance type | vCPUs | Memory (GiB) | Local storage | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) |
| --- | --- | --- | --- | --- | --- |
| ecs.d1ne.2xlarge | 8 | 32.0 | 4 × 5,905 GB (4 × 5,500 GiB) | 6.0 | 1,000,000 |
| ecs.d1ne.4xlarge | 16 | 64.0 | 8 × 5,905 GB (8 × 5,500 GiB) | 12.0 | 1,600,000 |
| ecs.d1ne.6xlarge | 24 | 96.0 | 12 × 5,905 GB (12 × 5,500 GiB) | 16.0 | 2,000,000 |
| ecs.d1ne-c8d3.8xlarge | 32 | 128.0 | 12 × 5,905 GB (12 × 5,500 GiB) | 20.0 | 2,000,000 |
| ecs.d1ne.8xlarge | 32 | 128.0 | 16 × 5,905 GB (16 × 5,500 GiB) | 20.0 | 2,500,000 |
| ecs.d1ne-c14d3.14xlarge | 56 | 160.0 | 12 × 5,905 GB (12 × 5,500 GiB) | 35.0 | 4,500,000 |
| ecs.d1ne.14xlarge | 56 | 224.0 | 28 × 5,905 GB (28 × 5,500 GiB) | 35.0 | 4,500,000 |