ossfs 2.0 performance benchmarks


Benchmarks comparing ossfs 2.0, ossfs 1.0, and goofys across sequential I/O and concurrent small-file reads, measuring throughput, CPU usage, and memory consumption.

Test environment

  • Hardware environment

    • Instance type: ecs.g9i.48xlarge

    • vCPU: 192 vCPUs

    • Memory: 768 GiB

    • Network bandwidth: 64 Gbps

  • Software environment

    • Operating system: Alibaba Cloud Linux 3.2104 LTS 64-bit

    • Kernel version: 5.10.134-18.al8.x86_64

    • Tool versions: ossfs 2.0.4, ossfs 1.91.8, and goofys 0.24.0

Mount configurations

Mount options used in this benchmark:

Note

This test uses HTTPS endpoints. In a trusted environment, HTTP endpoints reduce CPU overhead at the same throughput.

ossfs 2.0.4

  • Mount configuration file (ossfs2.conf)

    Upload part size is set to 33554432 bytes (32 MiB).

    # The endpoint of the bucket's region
    --oss_endpoint=https://oss-cn-hangzhou-internal.aliyuncs.com
    
    # The bucket name
    --oss_bucket=bucket-test
    
    # The AccessKey ID and AccessKey secret
    --oss_access_key_id=yourAccessKeyID
    --oss_access_key_secret=yourAccessKeySecret
    
    # The upload part size, in bytes
    --upload_buffer_size=33554432
  • Mount command

    Mount bucket-test to /mnt/ossfs2/ using ossfs2.conf:

    ossfs2 mount /mnt/ossfs2/ -c /etc/ossfs2.conf
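The part size matters for the 100 GB test files used later. The Go sketch below works through the arithmetic under two assumptions that this benchmark does not state: that ossfs 2.0 uploads a file as a single multipart upload with parts of --upload_buffer_size bytes, and that the OSS limit of 10,000 parts per multipart upload applies. The function name maxObjectBytes is illustrative only.

```go
package main

import "fmt"

// maxObjectBytes returns the largest object that fits in maxParts parts
// of partSize bytes each.
func maxObjectBytes(partSize, maxParts int64) int64 {
	return partSize * maxParts
}

func main() {
	const partSize = 33554432 // --upload_buffer_size from ossfs2.conf (32 MiB)
	const maxParts = 10000    // assumed part-count limit per multipart upload

	limit := maxObjectBytes(partSize, maxParts)
	fmt.Printf("largest uploadable object: %.1f GiB\n", float64(limit)/(1<<30))
}
```

Under these assumptions, 32 MiB parts allow objects of up to 312.5 GiB, which leaves comfortable headroom for a 100 GB file.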

ossfs 1.91.8

Mount bucket-test to /mnt/ossfs with direct read and readdir optimization enabled:

ossfs bucket-test /mnt/ossfs -ourl=https://oss-cn-hangzhou-internal.aliyuncs.com -odirect_read -oreaddir_optimize

goofys 0.24.0

Mount bucket-test to /mnt/goofys:

goofys --endpoint https://oss-cn-hangzhou-internal.aliyuncs.com --subdomain bucket-test --stat-cache-ttl 60s --type-cache-ttl 60s /mnt/goofys

Test scenarios

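Each tool mounted the same bucket. fio measured sequential read and write performance, and a custom Go program measured concurrent small-file reads. In the commands below, replace /mnt/oss with the mount point of the tool under test (/mnt/ossfs2, /mnt/ossfs, or /mnt/goofys).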

Single-threaded sequential write (100 GB)

Note

ossfs 1.0 write performance is limited by disk I/O.

  • Test command

    Single-threaded direct write of 100 GB with 1 MB block size:

    fio --name=file-100G --ioengine=libaio --rw=write --bs=1M --size=100G --numjobs=1 --direct=1 --directory=/mnt/oss/fio_direct_write --group_reporting
  • Test results

    | Tool | Bandwidth | CPU utilization (100% = one fully loaded core) | Peak memory |
    | --- | --- | --- | --- |
    | ossfs 2.0 | 2.2 GB/s | 207% | 2167 MB |
    | ossfs 1.0 | 118 MB/s | 5% | 15 MB |
    | goofys | 450 MB/s | 250% | 7.5 GB |

Single-threaded sequential read (100 GB)

  • Test command

    Clear the page cache, then run a single-threaded sequential read with 1 MB blocks:

    echo 1 > /proc/sys/vm/drop_caches
    fio --name=file-100G --ioengine=libaio --direct=1 --rw=read --bs=1M --directory=/mnt/oss/fio_direct_write --group_reporting --numjobs=1
  • Test results

    | Tool | Bandwidth | CPU utilization (100% = one fully loaded core) | Peak memory |
    | --- | --- | --- | --- |
    | ossfs 2.0 | 4.3 GB/s | 610% | 1629 MB |
    | ossfs 1.0 | 1.0 GB/s | 530% | 260 MB |
    | goofys | 1.3 GB/s | 270% | 976 MB |
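The benchmark does not say how peak memory was collected. One common approach on Linux is to read the VmHWM field (peak resident set size, in kB) from /proc/<pid>/status of the mount process once the test finishes. The Go sketch below assumes that approach; parsePeakRSSKB is a hypothetical helper, not part of ossfs or goofys.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parsePeakRSSKB extracts the VmHWM value (in kB) from the text of a
// /proc/<pid>/status file. It returns an error if the field is absent.
func parsePeakRSSKB(status string) (int64, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Expected line shape: "VmHWM:   2219008 kB"
		if len(fields) >= 2 && fields[0] == "VmHWM:" {
			return strconv.ParseInt(fields[1], 10, 64)
		}
	}
	return 0, fmt.Errorf("VmHWM not found")
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: peakrss <pid>")
		return
	}
	data, err := os.ReadFile("/proc/" + os.Args[1] + "/status")
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	kb, err := parsePeakRSSKB(string(data))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("peak RSS: %d MiB\n", kb/1024)
}
```

Pass the PID of the mount process (for example, the one reported by pidof for the tool under test) as the only argument.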

Multi-threaded sequential read (4 × 100 GB)

  • Generate test files

    Create four 100 GB test files in /mnt/oss/fio:

    fio --name=file-100g --ioengine=libaio --direct=1 --iodepth=1 --numjobs=4 --nrfiles=1 --rw=write --bs=1M  --size=100G --group_reporting --thread --directory=/mnt/oss/fio
  • Test command

    Clear the page cache, then read the four 100 GB files in /mnt/oss/fio with 4 threads and 1 MB blocks for 30 seconds:

    echo 1 > /proc/sys/vm/drop_caches
    fio --name=file-100g --ioengine=libaio --direct=1 --iodepth=1 --numjobs=4 --nrfiles=1 --rw=read --bs=1M  --size=100G --group_reporting --thread --directory=/mnt/oss/fio --time_based --runtime=30
  • Test results

    | Tool | Bandwidth | CPU utilization (100% = one fully loaded core) | Peak memory |
    | --- | --- | --- | --- |
    | ossfs 2.0 | 7.4 GB/s | 890% | 6.2 GB |
    | ossfs 1.0 | 1.8 GB/s | 739% | 735 MB |
    | goofys | 2.8 GB/s | 7800% | 2.7 GB |
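When repeating the fio runs, the bandwidth figures can be collected programmatically by adding --output-format=json to the fio command and parsing the report. The Go sketch below assumes the JSON layout of recent fio 3.x releases, where each entry in jobs carries read.bw_bytes and write.bw_bytes in bytes per second; verify the field names against your fio version. The embedded sample is illustrative and is not output captured from this benchmark.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fioDirection holds the per-direction fields this sketch needs.
type fioDirection struct {
	BWBytes int64 `json:"bw_bytes"` // bandwidth in bytes per second
}

type fioJob struct {
	JobName string       `json:"jobname"`
	Read    fioDirection `json:"read"`
	Write   fioDirection `json:"write"`
}

type fioReport struct {
	Jobs []fioJob `json:"jobs"`
}

// totalBandwidth sums read and write bandwidth over all job entries.
// With --group_reporting there is normally a single aggregated entry.
func totalBandwidth(raw []byte) (readBps, writeBps int64, err error) {
	var report fioReport
	if err = json.Unmarshal(raw, &report); err != nil {
		return 0, 0, err
	}
	for _, job := range report.Jobs {
		readBps += job.Read.BWBytes
		writeBps += job.Write.BWBytes
	}
	return readBps, writeBps, nil
}

func main() {
	// Illustrative sample only; real reports contain many more fields.
	sample := []byte(`{"jobs":[{"jobname":"file-100g","read":{"bw_bytes":7400000000},"write":{"bw_bytes":0}}]}`)
	r, w, err := totalBandwidth(sample)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Printf("read %.1f GB/s, write %.1f GB/s\n", float64(r)/1e9, float64(w)/1e9)
}
```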

Concurrent small-file read (128 threads, 100K × 128 KB)

Note

OSS has a default 10,000 QPS limit. To reproduce these results, ensure no other services consume the test account's QPS quota.
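The QPS limit puts a rough upper bound on small-file read bandwidth. The Go sketch below works through the arithmetic, assuming (the benchmark does not state this) that reading one 128 KB file costs exactly one OSS request, that requests are the only bottleneck, and that 1 KB = 1024 bytes. Any additional metadata requests per file would lower the bound. The function name is illustrative.

```go
package main

import "fmt"

// ceilingBytesPerSec returns the highest bandwidth reachable when every
// file read costs requestsPerFile requests against a fixed QPS quota.
func ceilingBytesPerSec(qpsLimit, requestsPerFile, fileSizeBytes int64) int64 {
	if requestsPerFile <= 0 {
		return 0
	}
	filesPerSec := qpsLimit / requestsPerFile
	return filesPerSec * fileSizeBytes
}

func main() {
	const qpsLimit = 10000      // default OSS QPS limit from the note above
	const fileSize = 128 * 1024 // 128 KB files, taking 1 KB = 1024 bytes

	ceiling := ceilingBytesPerSec(qpsLimit, 1, fileSize)
	fmt.Printf("ceiling: %.2f GB/s\n", float64(ceiling)/1e9)
}
```

Under these assumptions the bound is about 1.31 GB/s, which is why other traffic on the same account can distort results in this scenario.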

  • Steps

    1. Create a Go program named rw-bench.go.

      The program concurrently creates or reads files in a target directory and reports throughput.

      Sample code

      package main
      
      import (
      	"flag"
      	"fmt"
      	"io"
      	"log"
      	"os"
      	"path/filepath"
      	"sync"
      	"time"
      )
      
      var dir = flag.String("dir", "", "work dir")
      var threads = flag.Int("threads", 8, "concurrency threads count")
      var isWrite = flag.Bool("write", false, "test write files")
      var fileSize = flag.Int64("file-size-KB", 128, "file size in KBytes")
      var fileCount = flag.Int("file-count", 0, "file count")
      
      type fileInfo struct {
      	Name string
      	Size int64
      }
      
      func getFileList(dir string, isWrite bool) []fileInfo {
      	var files []fileInfo
      
      	if isWrite {
      		for i := 0; i < *fileCount; i++ {
      			files = append(files, fileInfo{
      				Name: fmt.Sprintf("%v/%v.dat", dir, i),
      				Size: *fileSize * 1024,
      			})
      		}
      	} else {
      		err := filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {
      			if err != nil {
      				return err
      			}
      			if !info.IsDir() {
      				files = append(files, fileInfo{
      					Name: path,
      					Size: info.Size(),
      				})
      			}
      			return nil
      		})
      
      		if err != nil {
      			log.Fatalf("Error walking the path %v: %v\n", dir, err)
      		}
      	}
      
      	return files
      }
      
      func worker(taskChan <-chan fileInfo, wg *sync.WaitGroup, bytesChan chan<- int64, isWrite bool) {
      	defer wg.Done()
      	buffer := make([]byte, 1024*1024)
      
      	for fInfo := range taskChan {
      		var fd *os.File
      		var err error
      		if isWrite {
      			fd, err = os.OpenFile(fInfo.Name, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0644)
      			if err != nil {
      				fmt.Printf("Failed to create/open %v with %v\n", fInfo.Name, err)
      				continue
      			}
      		} else {
      			fd, err = os.OpenFile(fInfo.Name, os.O_RDONLY, 0)
      			if err != nil {
      				fmt.Printf("Failed to open %v with %v\n", fInfo.Name, err)
      				continue
      			}
      		}
      
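      		offset := int64(0)
      		var totalBytes int64
      		for offset < fInfo.Size {
      			var n int
      
      			// Use a per-iteration view so the shared 1 MiB buffer is never shrunk permanently.
      			chunk := buffer
      			if remaining := fInfo.Size - offset; remaining < int64(len(chunk)) {
      				chunk = chunk[:remaining]
      			}
      
      			if isWrite {
      				n, err = fd.WriteAt(chunk, offset)
      				if err != nil {
      					fmt.Printf("Failed to write file %v at %v, with %v\n", fInfo.Name, offset, err)
      					break
      				}
      			} else {
      				n, err = fd.ReadAt(chunk, offset)
      				if err != nil && err != io.EOF {
      					fmt.Printf("Failed to read file %v at %v, with %v\n", fInfo.Name, offset, err)
      					break
      				}
      			}
      
      			totalBytes += int64(n)
      			offset += int64(n)
      
      			// Stop when no progress is made, for example if the file is shorter than its reported size.
      			if n == 0 {
      				break
      			}
      		}
      
      		// Report close errors as well; on a FUSE mount a failed flush may surface only here.
      		if err := fd.Close(); err != nil {
      			fmt.Printf("Failed to close %v with %v\n", fInfo.Name, err)
      		}
      		bytesChan <- totalBytes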
      	}
      }
      
      func doBench(dir string, isWrite bool) {
      	files := getFileList(dir, isWrite)
      	var wg sync.WaitGroup
      
      	if isWrite {
      		fmt.Printf("start write bench with %v files\n", len(files))
      	} else {
      		fmt.Printf("start read bench with %v files\n", len(files))
      	}
      
      	taskChan := make(chan fileInfo, 1024)
      
      	go func(taskChan chan<- fileInfo) {
      		for _, fInfo := range files {
      			taskChan <- fInfo
      		}
      		close(taskChan)
      	}(taskChan)
      
      	bytesChan := make(chan int64, 1024)
      	for i := 0; i < *threads; i++ {
      		wg.Add(1)
      		go worker(taskChan, &wg, bytesChan, isWrite)
      	}
      
      	st := time.Now()
      	go func() {
      		wg.Wait()
      		close(bytesChan)
      	}()
      
      	var totalBytes int64
      	for bytes := range bytesChan {
      		totalBytes += bytes
      	}
      
      	ed := time.Now()
      	duration := ed.Sub(st)
      	throughput := float64(totalBytes) / (float64(duration.Nanoseconds()) / 1e9)
      
      	fmt.Printf("Total time: %v\n", duration)
      	if isWrite {
      		fmt.Printf("Write throughput: %.2f MBytes/s\n", throughput/1000/1000)
      	} else {
      		fmt.Printf("Read throughput: %.2f MBytes/s\n", throughput/1000/1000)
      	}
      }
      
      func main() {
      	flag.Parse()
      
      	workdir := *dir
      	if workdir == "" {
      		flag.Usage()
      		os.Exit(1)
      	}
      
      	if _, err := os.Stat(workdir); err != nil {
      		fmt.Printf("Failed to access %v with %v\n", workdir, err)
      		os.Exit(1)
      	}
      
      	doBench(workdir, *isWrite)
      }
    2. Compile the rw-bench.go program file.

      go build rw-bench.go
    3. Create 100,000 files (128 KB each) in the mounted directory:

      mkdir -p <path_to_mounted_test_directory> && ./rw-bench --dir <path_to_mounted_test_directory> --file-size-KB 128 --file-count 100000 --write
    4. Clear the page cache and run the test five times. Record steady-state results after latency stabilizes.

      echo 1 > /proc/sys/vm/drop_caches
      ./rw-bench --dir <path_to_mounted_test_directory> --threads 128
  • Test results

    | Tool | Bandwidth | CPU utilization (100% = one fully loaded core) | Peak memory |
    | --- | --- | --- | --- |
    | ossfs 2.0 | 1 GB/s | 247% | 176 MB |
    | ossfs 1.0 | 45 MB/s | 25% | 412 MB |
    | goofys | 1 GB/s | 750% | 1.3 GB |
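Bandwidth and CPU utilization can be combined into a single efficiency figure: bandwidth delivered per fully loaded core. The Go sketch below computes it from the values reported in the four results tables, taking 1 GB = 1000 MB (an assumption, since the tables do not say whether units are decimal or binary). It is one possible lens on the results, not a metric defined by the benchmark.

```go
package main

import "fmt"

// result holds one row of a results table.
type result struct {
	scenario string
	tool     string
	mbps     float64 // bandwidth in MB/s, taking 1 GB = 1000 MB
	cpuPct   float64 // 100 = one fully loaded core
}

// mbPerCore returns the bandwidth delivered per fully loaded core.
func mbPerCore(mbps, cpuPct float64) float64 {
	if cpuPct <= 0 {
		return 0
	}
	return mbps / (cpuPct / 100)
}

func main() {
	// Values copied from the results tables in this document.
	results := []result{
		{"seq write (1 thread)", "ossfs 2.0", 2200, 207},
		{"seq write (1 thread)", "ossfs 1.0", 118, 5},
		{"seq write (1 thread)", "goofys", 450, 250},
		{"seq read (1 thread)", "ossfs 2.0", 4300, 610},
		{"seq read (1 thread)", "ossfs 1.0", 1000, 530},
		{"seq read (1 thread)", "goofys", 1300, 270},
		{"seq read (4 threads)", "ossfs 2.0", 7400, 890},
		{"seq read (4 threads)", "ossfs 1.0", 1800, 739},
		{"seq read (4 threads)", "goofys", 2800, 7800},
		{"small-file read", "ossfs 2.0", 1000, 247},
		{"small-file read", "ossfs 1.0", 45, 25},
		{"small-file read", "goofys", 1000, 750},
	}
	for _, r := range results {
		fmt.Printf("%-22s %-10s %8.0f MB/s per core\n",
			r.scenario, r.tool, mbPerCore(r.mbps, r.cpuPct))
	}
}
```

In the small-file scenario, for example, ossfs 2.0 and goofys both reach 1 GB/s, but ossfs 2.0 does so at about 405 MB/s per core versus about 133 MB/s per core for goofys.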