Big data instances - Elastic Compute Service (ECS) - Alibaba Cloud Help Center

Big data instance families combine large-scale local storage with high internal network bandwidth. They are optimized for Hadoop MapReduce, Hadoop Distributed File System (HDFS), Hive, HBase, Spark, Elasticsearch, and Kafka workloads. Most families offer a 1:4 CPU-to-memory ratio.

Note

Instance availability varies by region. Before you select an instance type, check purchase availability by region and review the instance type selection guide. For metric definitions, see instance metric descriptions. To estimate costs, use the ECS Price Calculator.

Recommended instance families

Recommended	Not recommended (use the recommended families if these are sold out)
d3s, d3c, d2c, d2s	d1ne

Choose between d3s and d3c

Both d3s and d3c use Intel Ice Lake processors and support hot disk swapping, but they serve different workload profiles:

	d3s (storage-intensive)	d3c (compute-intensive)
Max local storage per instance	32 × 11,918 GB (~380 TB)	4 × 13,743 GB (~55 TB)
Max network bandwidth	80 Gbit/s	40 Gbit/s
Disk IOPS spec	Not published	Up to 100,000 IOPS
OS support	Not restricted (Linux and Windows supported)	Linux only
Storage-compute decoupling	Not mentioned in product documentation	Supported (EMR JindoFS + OSS)

Choose d3s when storage capacity and sequential throughput are the primary constraints, for example in large HDFS clusters with high data volumes per node. Choose d3c when you need higher disk IOPS, storage-compute decoupling via EMR JindoFS and Object Storage Service (OSS), or hot/cold data separation.

Local disk considerations

Warning

Local disk data durability depends on the reliability of the physical host. A hardware failure on the host can result in data loss. Store only temporary or replicated data on local disks. For more information, see Local disks.

Frameworks such as HDFS and Kafka replicate data across multiple nodes by design. Size your cluster with enough replicas to maintain durability if a single node fails.
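As a rough sizing aid, the sketch below estimates usable capacity by dividing total raw local storage by the replication factor. The function name `usable_capacity_gb`, the node count of 10, and the replication factor of 3 (the HDFS default for `dfs.replication`) are assumptions for illustration. The per-node figure is the raw capacity of the largest d3s type (32 × 11,918 GB = 381,376 GB). The estimate ignores file system overhead and any space reserved for other data.

```shell
# Approximate usable capacity in GB for a replicated cluster.
# Arguments: node count, raw GB per node, replication factor.
# Uses integer arithmetic, so the result is rounded down.
usable_capacity_gb() {
  echo $(( $1 * $2 / $3 ))
}

# Hypothetical cluster: 10 nodes, 381376 GB raw per node, 3 replicas.
usable_capacity_gb 10 381376 3   # prints 1271253
```

Under these assumptions, about 3.8 PB of raw disk yields roughly 1.27 PB of usable space.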

Additional constraints:

Instances with local SSDs do not support instance type changes.
Local disks are tied to specific instance types. The count and capacity vary by instance type. Local disks cannot be purchased separately or moved to other instances.
Snapshots cannot be created for local disks. To create an image from an instance with local SSDs, snapshot only the system disk and any cloud data disks (not local disks), then combine those snapshots into an image.
Images that combine system disk snapshots with local SSD data disk snapshots cannot be created.
Standard SSDs can be attached to instances with local SSDs, and their capacity can be extended.
Certain instance operations affect data on local disks. For details, see Impacts of instance operations on data stored on local disks.

Initialize local disks

Linux kernel v2.6.37 and later enable the lazy initialization (lazy init) feature by default. Lazy init defers inode table initialization from format time to a background kernel thread that starts when the file system is first mounted. On instances with many local disks, this deferred initialization can consume up to 600 MB/s of disk throughput and affect service stability. Linux kernel v4.x increased the concurrency limit for lazy initialization. For the upstream fix, see this kernel commit.

To initialize all local disks before starting services:

List all local serial advanced technology attachment (SATA) HDDs on the instance.
Format each local disk with lazy initialization disabled. Formatting erases any existing data on the disk. This example formats /dev/vdb with an ext4 file system:
```
mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/vdb &
```
Run the command once for each disk. The trailing & runs each format job in the background, so the disks are formatted in parallel.
After all disks finish formatting, run the following command and wait until the I/O activity for every disk drops to 0:
```
iostat -x 5
```
Mount all disks.
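The formatting step can be sketched as a small script. The function names `format_disk` and `format_all`, the `DRY_RUN` variable, and the device names `/dev/vdb` and `/dev/vdc` are hypothetical; only the `mkfs.ext4` options come from the steps above. Because formatting erases the disk, the sketch only prints the commands it would run unless `DRY_RUN` is set to 0.

```shell
# Print the mkfs command for one device; with DRY_RUN=0, run it instead.
format_disk() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 $1"
  else
    mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 "$1"
  fi
}

# Start one background format job per device, then block until all jobs exit.
format_all() {
  for dev in "$@"; do
    format_disk "$dev" &
  done
  wait
}

# Hypothetical device names; replace them with the local disks on your instance.
format_all /dev/vdb /dev/vdc
```

After confirming the device names, for example with `lsblk`, set `DRY_RUN=0` to run the commands for real. Then check I/O activity with `iostat -x 5` before mounting the disks.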

d3s, storage-intensive big data instance family

Key specs: Up to 32 × 11,918 GB local SATA HDDs (~380 TB raw), up to 80 Gbit/s network bandwidth, 2.7 GHz Intel® Xeon® Scalable (Ice Lake) processors with 3.5 GHz all-core turbo frequency.

Use cases:

Hadoop MapReduce, HDFS, Hive, and HBase workloads
Spark in-memory computing and MLlib
Elasticsearch and Kafka deployments

Hardware:

All instances are I/O optimized
Supported cloud disk types: ESSDs and ESSD AutoPL disks
Network: IPv4 and IPv6. For IPv6 setup, see IPv6 communication. Network performance scales with instance size.

Disk failure handling: d3s supports online replacement and hot swapping of failed disks without instance shutdown. When a disk fails, you receive a system event. Initiate the disk repair process to resolve it. For details, see O&M scenarios and system events for instances equipped with local disks.

Important

Data on a failed disk cannot be restored after you initiate the repair process.

Instance types:

Instance type	vCPUs	Memory (GiB)	Local storage	Network baseline/burst bandwidth (Gbit/s)	Packet forwarding rate (pps)	Disk baseline/burst bandwidth (Gbit/s)
ecs.d3s.2xlarge	8	32	4 × 11,918 GB (4 × 11,100 GiB)	10/burstable up to 15	2,000,000	3/burstable up to 5
ecs.d3s.4xlarge	16	64	8 × 11,918 GB (8 × 11,100 GiB)	25/none	3,000,000	5/none
ecs.d3s.8xlarge	32	128	16 × 11,918 GB (16 × 11,100 GiB)	40/none	6,000,000	8/none
ecs.d3s.12xlarge	48	192	24 × 11,918 GB (24 × 11,100 GiB)	60/none	9,000,000	12/none
ecs.d3s.16xlarge	64	256	32 × 11,918 GB (32 × 11,100 GiB)	80/none	12,000,000	16/none
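The table gives each disk size in both GB (10^9 bytes) and GiB (2^30 bytes). A minimal sketch of the conversion, using a hypothetical helper named `gib_to_gb`, reproduces the GB figure from the GiB figure when the result is truncated to a whole number. It assumes a shell with 64-bit integer arithmetic.

```shell
# Convert GiB (2^30 bytes) to GB (10^9 bytes), truncated to a whole number.
gib_to_gb() {
  echo $(( $1 * 1073741824 / 1000000000 ))
}

gib_to_gb 11100   # prints 11918, the per-disk size listed for d3s
```

The same calculation maps 12,800 GiB to 13,743 GB, the per-disk size listed for d3c.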

d3c, compute-intensive big data instance family

Key specs: Up to 4 × 13,743 GB local disks (~55 TB raw), up to 40 Gbit/s network bandwidth, third-generation 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors with 3.5 GHz all-core turbo frequency.

Note

d3c supports Linux images only. Select a Linux image when creating an instance.

Use cases:

Hadoop MapReduce, HDFS, Hive, and HBase workloads
Storage-compute decoupling with EMR JindoFS and OSS (hot/cold data separation)
Spark in-memory computing and MLlib
Elasticsearch and Kafka deployments

Hardware:

All instances are I/O optimized
Supported cloud disk types: ESSDs and ESSD AutoPL disks
Network: IPv4 and IPv6. For IPv6 setup, see IPv6 communication. Network performance scales with instance size.

Disk failure handling: d3c supports online replacement and hot swapping of failed disks without instance shutdown. When a disk fails, you receive a system event. Initiate the disk repair process to resolve it. For details, see O&M scenarios and system events for instances equipped with local disks.

Important

Data on a failed disk cannot be restored after you initiate the repair process.

Instance types:

Instance type	vCPUs	Memory (GiB)	Local storage	Network baseline/burst bandwidth (Gbit/s)	Packet forwarding rate (pps)	Disk baseline/burst IOPS	Disk baseline/burst bandwidth (Gbit/s)
ecs.d3c.3xlarge	14	56.0	1 × 13,743 GB (1 × 12,800 GiB)	8/burstable up to 10	1,600,000	40,000/none	3/none
ecs.d3c.7xlarge	28	112.0	2 × 13,743 GB (2 × 12,800 GiB)	16/burstable up to 25	2,500,000	50,000/none	4/none
ecs.d3c.14xlarge	56	224.0	4 × 13,743 GB (4 × 12,800 GiB)	40/none	5,000,000	100,000/none	8/none

d2c, compute-intensive big data instance family

Key specs: Up to 12 × 3,972 GB local SATA HDDs (~48 TB raw), up to 35 Gbit/s network bandwidth, 2.5 GHz Intel® Xeon® Platinum 8269CY (Cascade Lake) processors.

Use cases:

Hadoop MapReduce, HDFS, Hive, and HBase workloads
Storage-compute decoupling with EMR JindoFS and OSS (hot/cold data separation)
Spark in-memory computing and MLlib
Elasticsearch and Kafka deployments

Hardware:

All instances are I/O optimized
Supported cloud disk types: Enhanced SSDs (ESSDs), ESSD AutoPL disks, standard SSDs, and ultra disks
Network: IPv4 and IPv6. For IPv6 setup, see IPv6 communication. Network performance scales with instance size.

Disk failure handling: d2c supports online replacement and hot swapping of failed disks without instance shutdown. When a disk fails, you receive a system event. Initiate the disk repair process to resolve it. For details, see O&M scenarios and system events for instances equipped with local disks.

Important

Data on a failed disk cannot be restored after you initiate the repair process.

Instance types:

Instance type	vCPUs	Memory (GiB)	Local storage	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)
ecs.d2c.6xlarge	24	88.0	3 × 3,972 GB (3 × 3,700 GiB)	12.0	1,600,000
ecs.d2c.12xlarge	48	176.0	6 × 3,972 GB (6 × 3,700 GiB)	20.0	2,000,000
ecs.d2c.24xlarge	96	352.0	12 × 3,972 GB (12 × 3,700 GiB)	35.0	4,500,000

d2s, storage-intensive big data instance family

Key specs: Up to 30 × 7,838 GB local SATA HDDs (~235 TB raw), up to 35 Gbit/s network bandwidth, 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.

Use cases:

Hadoop MapReduce, HDFS, Hive, and HBase workloads
Spark in-memory computing and MLlib
Elasticsearch and Kafka deployments

Hardware:

All instances are I/O optimized
Supported cloud disk types: ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks
Network: IPv4 and IPv6. For IPv6 setup, see IPv6 communication. Network performance scales with instance size.

Disk failure handling: d2s supports online replacement and hot swapping of failed disks without instance shutdown. When a disk fails, you receive a system event. Initiate the disk repair process to resolve it. For details, see O&M scenarios and system events for instances equipped with local disks.

Important

Data on a failed disk cannot be restored after you initiate the repair process.

Instance types:

Instance type	vCPUs	Memory (GiB)	Local storage	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)
ecs.d2s.5xlarge	20	88.0	8 × 7,838 GB (8 × 7,300 GiB)	12.0	1,600,000
ecs.d2s.10xlarge	40	176.0	15 × 7,838 GB (15 × 7,300 GiB)	20.0	2,000,000
ecs.d2s.20xlarge	80	352.0	30 × 7,838 GB (30 × 7,300 GiB)	35.0	4,500,000

d1ne, network-enhanced big data instance family (not recommended)

Note

d1ne is no longer recommended. Use d3s, d3c, d2c, or d2s instead.

Key specs: Up to 28 × 5,905 GB local SATA HDDs (~165 TB raw), up to 35 Gbit/s network bandwidth, 1:4 CPU-to-memory ratio.

Use cases:

Hadoop MapReduce, HDFS, Hive, and HBase workloads
Spark in-memory computing and MLlib
Elasticsearch deployments

Hardware:

Processor: 2.5 GHz Intel® Xeon® E5-2682 v4 (Broadwell) or Intel® Xeon® Platinum 8163 (Skylake)
All instances are I/O optimized
Supported cloud disk types: standard SSDs and ultra disks only
Network: IPv4 and IPv6. For IPv6 setup, see IPv6 communication. Network performance scales with instance size.

Instance types:

Instance type	vCPUs	Memory (GiB)	Local storage	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)
ecs.d1ne.2xlarge	8	32.0	4 × 5,905 GB (4 × 5,500 GiB)	6.0	1,000,000
ecs.d1ne.4xlarge	16	64.0	8 × 5,905 GB (8 × 5,500 GiB)	12.0	1,600,000
ecs.d1ne.6xlarge	24	96.0	12 × 5,905 GB (12 × 5,500 GiB)	16.0	2,000,000
ecs.d1ne-c8d3.8xlarge	32	128.0	12 × 5,905 GB (12 × 5,500 GiB)	20.0	2,000,000
ecs.d1ne.8xlarge	32	128.0	16 × 5,905 GB (16 × 5,500 GiB)	20.0	2,500,000
ecs.d1ne-c14d3.14xlarge	56	160.0	12 × 5,905 GB (12 × 5,500 GiB)	35.0	4,500,000
ecs.d1ne.14xlarge	56	224.0	28 × 5,905 GB (28 × 5,500 GiB)	35.0	4,500,000