This benchmark compares ossfs 2.0, ossfs 1.0, and goofys on sequential I/O and concurrent small-file reads, measuring throughput, CPU usage, and peak memory consumption.
Test environment
- Hardware environment
  - Instance type: ecs.g9i.48xlarge
  - vCPUs: 192
  - Memory: 768 GiB
  - Network bandwidth: 64 Gbps
- Software environment
  - Operating system: Alibaba Cloud Linux 3.2104 LTS 64-bit
  - Kernel version: 5.10.134-18.al8.x86_64
  - Tool versions: ossfs 2.0.4, ossfs 1.91.8, and goofys 0.24.0
Mount configurations
Mount options used in this benchmark:
This test uses HTTPS endpoints. In a trusted environment, HTTP endpoints reduce CPU overhead at the same throughput.
ossfs 2.0.4
- Mount configuration file (ossfs2.conf)
  The upload part size is set to 33554432 bytes (32 MiB).
    # The endpoint of the bucket's region
    --oss_endpoint=https://oss-cn-hangzhou-internal.aliyuncs.com
    # The bucket name
    --oss_bucket=bucket-test
    # The AccessKey ID and AccessKey secret
    --oss_access_key_id=yourAccessKeyID
    --oss_access_key_secret=yourAccessKeySecret
    # The upload part size, in bytes
    --upload_buffer_size=33554432
- Mount command
  Mount bucket-test to /mnt/ossfs2/ using ossfs2.conf:
    ossfs2 mount /mnt/ossfs2/ -c /etc/ossfs2.conf
ossfs 1.91.8
Mount bucket-test to /mnt/ossfs with direct read and cache optimization enabled:
ossfs bucket-test /mnt/ossfs -ourl=https://oss-cn-hangzhou-internal.aliyuncs.com -odirect_read -oreaddir_optimize
goofys 0.24.0
Mount bucket-test to /mnt/goofys:
goofys --endpoint https://oss-cn-hangzhou-internal.aliyuncs.com --subdomain bucket-test --stat-cache-ttl 60s --type-cache-ttl 60s /mnt/goofys
Test scenarios
The same bucket was mounted with each tool in turn, and read and write performance was then measured, using fio for the sequential tests. The results are listed below.
Single-threaded sequential write (100 GB)
ossfs 1.0 write performance is limited by disk I/O.
- Test command
  Single-threaded direct write of 100 GB with a 1 MB block size:
    fio --name=file-100G --ioengine=libaio --rw=write --bs=1M --size=100G --numjobs=1 --direct=1 --directory=/mnt/oss/fio_direct_write --group_reporting
- Test results

| Tool | Bandwidth | CPU core utilization (100% for a single fully loaded core) | Peak memory |
|---|---|---|---|
| ossfs 2.0 | 2.2 GB/s | 207% | 2167 MB |
| ossfs 1.0 | 118 MB/s | 5% | 15 MB |
| goofys | 450 MB/s | 250% | 7.5 GB |
Single-threaded sequential read (100 GB)
- Test command
  Clear the page cache, then run a single-threaded sequential read with 1 MB blocks:
    echo 1 > /proc/sys/vm/drop_caches
    fio --name=file-100G --ioengine=libaio --direct=1 --rw=read --bs=1M --directory=/mnt/oss/fio_direct_write --group_reporting --numjobs=1
- Test results

| Tool | Bandwidth | CPU core utilization (100% for a single fully loaded core) | Peak memory |
|---|---|---|---|
| ossfs 2.0 | 4.3 GB/s | 610% | 1629 MB |
| ossfs 1.0 | 1.0 GB/s | 530% | 260 MB |
| goofys | 1.3 GB/s | 270% | 976 MB |
Multi-threaded sequential read (4 × 100 GB)
- Generate test files
  Create four 100 GB test files in /mnt/oss/fio:
    fio --name=file-100g --ioengine=libaio --direct=1 --iodepth=1 --numjobs=4 --nrfiles=1 --rw=write --bs=1M --size=100G --group_reporting --thread --directory=/mnt/oss/fio
- Test command
  Clear the page cache, then read the four 100 GB files in /mnt/oss/fio with 4 threads and 1 MB blocks for 30 seconds:
    echo 1 > /proc/sys/vm/drop_caches
    fio --name=file-100g --ioengine=libaio --direct=1 --iodepth=1 --numjobs=4 --nrfiles=1 --rw=read --bs=1M --size=100G --group_reporting --thread --directory=/mnt/oss/fio --time_based --runtime=30
- Test results

| Tool | Bandwidth | CPU core utilization (100% for a single fully loaded core) | Peak memory |
|---|---|---|---|
| ossfs 2.0 | 7.4 GB/s | 890% | 6.2 GB |
| ossfs 1.0 | 1.8 GB/s | 739% | 735 MB |
| goofys | 2.8 GB/s | 7800% | 2.7 GB |
Concurrent small-file read (128 threads, 100K × 128 KB)
OSS has a default 10,000 QPS limit. To reproduce these results, ensure no other services consume the test account's QPS quota.
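To get a feel for how this limit relates to the small-file test, a rough bandwidth ceiling can be derived from the QPS cap and the file size. The sketch below assumes that each file read costs exactly one OSS request and ignores metadata and listing requests. The function name ceilingMiBps and that request model are illustrative assumptions, not something measured in this benchmark.

```go
package main

import "fmt"

// ceilingMiBps returns the highest aggregate read bandwidth, in MiB/s,
// that a request-rate cap allows when every file is fetched with a
// single request. The name and the one-request-per-file model are
// assumptions made for illustration only.
func ceilingMiBps(qpsLimit, fileSizeKiB int) float64 {
	return float64(qpsLimit) * float64(fileSizeKiB) / 1024.0
}

func main() {
	// 10,000 QPS and 128 KB files, the values used in this scenario.
	fmt.Printf("ceiling: %.0f MiB/s\n", ceilingMiBps(10000, 128))
}
```

Under these assumptions the cap works out to 1250 MiB/s, so any other traffic that consumes QPS on the same account would directly lower the bandwidth a small-file test can reach.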
- Steps
  - Create a Go program named rw-bench.go. The program concurrently creates or reads files in a target directory and reports throughput.
  - Compile rw-bench.go:
      go build rw-bench.go
  - Create 100,000 files (128 KB each) in the mounted directory:
      mkdir -p <path_to_mounted_test_directory> && ./rw-bench --dir <path_to_mounted_test_directory> --file-size-KB 128 --file-count 100000 --write
  - Clear the page cache and run the test five times. Record steady-state results after latency stabilizes.
      echo 1 > /proc/sys/vm/drop_caches
      ./rw-bench --dir <path_to_mounted_test_directory> --threads 128
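The source of rw-bench.go is not reproduced in this section. As a rough idea of what such a tool might look like, here is a minimal stand-in written from the description above. Only the flag names (--dir, --file-size-KB, --file-count, --write, --threads) are taken from the commands in the steps. The default values, function names, file naming scheme, and output format are assumptions, and this sketch is not the program that produced the published results.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"path/filepath"
	"sync"
	"sync/atomic"
	"time"
)

// forEach runs job(0) .. job(n-1) on a fixed pool of workers and returns
// the summed byte counts plus the first error encountered, if any.
func forEach(n, workers int, job func(i int) (int64, error)) (int64, error) {
	if workers < 1 {
		workers = 1
	}
	var total int64
	var firstErr error
	var once sync.Once
	var wg sync.WaitGroup
	indexes := make(chan int)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range indexes {
				if b, err := job(i); err != nil {
					once.Do(func() { firstErr = err })
				} else {
					atomic.AddInt64(&total, b)
				}
			}
		}()
	}
	for i := 0; i < n; i++ {
		indexes <- i
	}
	close(indexes)
	wg.Wait()
	return atomic.LoadInt64(&total), firstErr
}

// writeFiles creates count zero-filled files of size bytes each in dir.
// The "file-%06d" naming scheme is a hypothetical choice.
func writeFiles(dir string, count, size, threads int) (int64, error) {
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return 0, err
	}
	payload := make([]byte, size)
	return forEach(count, threads, func(i int) (int64, error) {
		name := filepath.Join(dir, fmt.Sprintf("file-%06d", i))
		return int64(size), os.WriteFile(name, payload, 0o644)
	})
}

// readFiles reads every non-directory entry in dir once and returns the bytes read.
func readFiles(dir string, threads int) (int64, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return 0, err
	}
	return forEach(len(entries), threads, func(i int) (int64, error) {
		if entries[i].IsDir() {
			return 0, nil
		}
		data, err := os.ReadFile(filepath.Join(dir, entries[i].Name()))
		return int64(len(data)), err
	})
}

func main() {
	dir := flag.String("dir", "", "target directory")
	sizeKB := flag.Int("file-size-KB", 128, "file size in KB (write mode)")
	count := flag.Int("file-count", 1000, "number of files to create (write mode)")
	threads := flag.Int("threads", 128, "number of concurrent workers")
	write := flag.Bool("write", false, "create files instead of reading them")
	flag.Parse()
	start := time.Now()
	op := func() (int64, error) { return readFiles(*dir, *threads) }
	if *write {
		op = func() (int64, error) { return writeFiles(*dir, *count, *sizeKB*1024, *threads) }
	}
	if *dir == "" {
		flag.Usage() // nothing to do without a target directory
	} else if n, err := op(); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	} else {
		secs := time.Since(start).Seconds()
		fmt.Printf("%d bytes in %.2fs (%.1f MiB/s)\n", n, secs, float64(n)/secs/(1<<20))
	}
}
```

A stand-in like this accepts the same invocations as the steps above. It reads through the normal buffered file API, so clearing the page cache before each read pass still matters.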
- Test results

| Tool | Bandwidth | CPU core utilization (100% for a single fully loaded core) | Peak memory |
|---|---|---|---|
| ossfs 2.0 | 1 GB/s | 247% | 176 MB |
| ossfs 1.0 | 45 MB/s | 25% | 412 MB |
| goofys | 1 GB/s | 750% | 1.3 GB |