Reduce OSS requests and improve mount target performance

Optimizing the metadata requests that ossfs 2.0 sends to OSS reduces API call costs, improves concurrency, and speeds up read/write operations on mount points.

Basic principles

ossfs 2.0 is built on the FUSE (Filesystem in Userspace) framework. It translates file system metadata operations into OSS requests, allowing you to access OSS storage resources through standard file system interfaces.

| Command | Interface conversion rules |
| --- | --- |
| lookup, stat | When executing lookup or stat operations and the metadata cache is invalid, ossfs 2.0 will first send a GetObjectMeta request to OSS to obtain the attribute information of the object with the same name. If the GetObjectMeta request returns a 404 response (indicating the object does not exist), it will further send a ListObject(max-keys=1) request to query whether a virtual folder object with the same name exists. |
| readdir, readdirplus | When executing readdir or readdirplus operations, ossfs 2.0 will repeatedly send ListObject requests to OSS. Note that ossfs 2.0 enables the readdirplus feature by default. When this feature is enabled, the results of ListObject requests will be used to update the metadata cache information for all subitems in the corresponding folder, thereby effectively reducing the number of subsequent metadata requests for child files. |

Scenario analysis

Accessing a file through a file system differs significantly from accessing the corresponding object directly in OSS.

File access methods

ossfs resolves file paths top-down from the root directory. For example, to obtain the attributes of /dir/object, the stat /dir/object command executes as follows:

  1. Resolve /dir: ossfs sends a GetObjectMeta request for dir. A 404 Not Found response indicates that no object with that name exists, so ossfs then sends a ListObject (max-keys=1) request for the prefix dir/. A 200 OK response indicates that a corresponding virtual folder exists.

  2. Resolve /dir/object: ossfs sends a GetObjectMeta request for dir/object. A 200 OK response returns the object's attribute information.

A single stat /dir/object command results in two GetObjectMeta requests and one ListObject request. Because each path component requires its own metadata lookups, the number of OSS requests grows with the file depth, which degrades performance.
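
The request pattern above can be approximated with a small counting model. This is an illustrative sketch only: estimate_stat_requests is a hypothetical helper, not part of ossfs. It assumes a cold metadata cache, that every intermediate directory is a virtual folder (GetObjectMeta returns 404 and one ListObject follows), and that the final path component is a regular object.

```python
from collections import Counter


def estimate_stat_requests(path: str) -> Counter:
    """Estimate OSS requests for one stat call with a cold metadata cache.

    Assumption (not taken from the ossfs source code): every intermediate
    directory is a virtual folder, and the last component is an object.
    """
    components = [part for part in path.split("/") if part]
    requests = Counter()
    if not components:
        return requests  # the mount root itself is not modeled here
    # Each intermediate directory: GetObjectMeta returns 404, then one
    # ListObject (max-keys=1) confirms the virtual folder.
    for _ in components[:-1]:
        requests["GetObjectMeta"] += 1
        requests["ListObject"] += 1
    # Final component: a single successful GetObjectMeta.
    requests["GetObjectMeta"] += 1
    return requests


if __name__ == "__main__":
    for path in ("/dir/object", "/a/b/c/object"):
        print(path, dict(estimate_stat_requests(path)))
```

For /dir/object the model yields two GetObjectMeta requests and one ListObject request, matching the count above. Under the same assumptions, each additional directory level adds one request of each type.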

Impact of file metadata caching

Important

ossfs 2.0 enables file metadata caching by default, with a default cache validity period of 60 seconds. The metadata cache is implemented on top of the FUSE low-level API, so its capacity is not fixed: the operating system kernel decides when to evict entries. Machines with more memory can typically cache more metadata.

The following example shows how metadata caching affects performance when reading attributes of 100 child files in the /dir/ directory.

  • Without metadata caching

    • Accessing files with a known file list:

      When you run stat /dir/object-<i> in a loop, each stat operation is converted into one GetObjectMeta request. In total, 100 GetObjectMeta requests are sent to OSS, and this volume of metadata requests degrades performance.

    • Accessing files with an unknown file list:

      When you run ls, the operation is converted into one ListObject request that returns the file list. Running stat /dir/object-<i> in a loop over that list then sends one GetObjectMeta request per file. In total, one ListObject request and 100 GetObjectMeta requests are sent to OSS, and this volume of metadata requests degrades performance.

  • With metadata caching

    • Accessing files with a known file list:

      When you run stat /dir/object-<i> in a loop for the first time, each stat operation is converted into one GetObjectMeta request, for a total of 100 GetObjectMeta requests. Within the cache validity period, subsequent stat calls on the same files hit the local metadata cache and send no requests to OSS.

    • Accessing files with an unknown file list:

      When you run ls, the operation is converted into one ListObject request to OSS, and the results update the local metadata cache. When you then run stat /dir/object-<i> in a loop, the metadata is already in the local cache, so no additional OSS requests are sent.

Metadata caching effectively reduces repeated requests to OSS. When traversing all files in a folder, running ls first preloads the cache and eliminates subsequent per-file OSS requests.
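
The four scenarios above can be reproduced with a toy request counter. The sketch below is a simplified model, not the ossfs implementation: ToyMetadataCache and run_scenario are hypothetical names, the cache is a plain dictionary with a time-to-live, and one ListObject request is assumed to return all 100 entries.

```python
import time
from collections import Counter


class ToyMetadataCache:
    """Toy model of a TTL metadata cache; NOT the ossfs implementation."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}          # key -> expiry timestamp
        self.requests = Counter()  # simulated OSS requests

    def _fresh(self, key: str) -> bool:
        return self.entries.get(key, float("-inf")) > self.clock()

    def stat(self, key: str) -> None:
        """Simulate stat: hit the cache or send one GetObjectMeta."""
        if self._fresh(key):
            return
        self.requests["GetObjectMeta"] += 1
        if self.ttl > 0:
            self.entries[key] = self.clock() + self.ttl

    def ls(self, keys) -> None:
        """Simulate ls with readdirplus: one ListObject fills the cache."""
        self.requests["ListObject"] += 1
        if self.ttl > 0:
            expiry = self.clock() + self.ttl
            for key in keys:
                self.entries[key] = expiry


def run_scenario(ttl_seconds: float, preload: bool) -> Counter:
    """Stat 100 files once, optionally after a simulated ls."""
    keys = [f"dir/object-{i}" for i in range(100)]
    cache = ToyMetadataCache(ttl_seconds)
    if preload:
        cache.ls(keys)
    for key in keys:
        cache.stat(key)
    return cache.requests


if __name__ == "__main__":
    for ttl, preload in ((0, False), (0, True), (60, False), (60, True)):
        print(f"ttl={ttl} preload={preload}:", dict(run_scenario(ttl, preload)))
```

A time-to-live of 0 stands in for "without metadata caching". In this model, only the combination of a positive time-to-live and a preceding ls removes the per-file GetObjectMeta requests.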

Optimization methods

Use the following methods to reduce metadata requests to OSS and improve performance:

Extend metadata cache time

If your data is immutable after upload or changes infrequently relative to the cache duration, increase the attr_timeout mount option to extend the metadata cache validity period and reduce repeated requests.

  • Business scenario: In a data annotation scenario, the system reads a batch of previously collected raw data, processes it, and then generates a new batch of data. In this scenario, the raw data will not be modified once uploaded to OSS.

  • Mount configuration: In the ossfs 2.0 configuration file, set the metadata cache validity period to 7200 seconds.

    # Bucket Endpoint (region node)
    --oss_endpoint=https://oss-cn-hangzhou-internal.aliyuncs.com
    
    # Bucket name
    --oss_bucket=bucketName
    
    # Metadata cache validity period
    --attr_timeout=7200
    
    # Access keys AccessKey ID and AccessKey Secret (optional for ossfs 2.0.1 and later versions)
    --oss_access_key_id=LTAI******************
    --oss_access_key_secret=8CE4**********************

Operate after obtaining file list

Before accessing individual files in a directory, run the ls command or send a ListObject request to preload all file metadata into the local cache. Combined with a longer cache validity period, this eliminates repeated per-file requests to OSS.

You can replace the ls command with any program that reads directory contents. The following examples list files in the /mnt/data/ directory.

Python

import os
os.listdir('/mnt/data/')

Go

entries, err := os.ReadDir("/mnt/data/")

C

#include <dirent.h>
#include <stddef.h>

int main(void) {
  DIR *dir = opendir("/mnt/data/");
  if (dir != NULL) {
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {}
    closedir(dir);
  }
  return 0;
}
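
The list-then-access pattern can be combined into one routine. The following is a minimal sketch, assuming the directory is an ossfs 2.0 mount with readdirplus enabled (the default described earlier); preload_and_stat is a hypothetical helper name, and the code uses only the Python standard library.

```python
import os


def preload_and_stat(directory: str) -> dict:
    """List a directory first, then stat each entry.

    On an ossfs 2.0 mount with readdirplus enabled, the listing is expected
    to fill the metadata cache, so the stat calls that follow should be
    served locally while the cache entries are valid.
    """
    names = os.listdir(directory)  # triggers readdir on the mount
    sizes = {}
    for name in names:
        info = os.stat(os.path.join(directory, name))
        sizes[name] = info.st_size
    return sizes


if __name__ == "__main__":
    # /mnt/data/ is the example mount path used in this document.
    print(len(preload_and_stat("/mnt/data/")), "entries")
```

The function works on any directory, so it can be tried on a local path before pointing it at a mount target.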

Use a negative cache to accelerate file creation

To create a new file, a file system executes two operations in sequence: lookup and create.

  1. The lookup operation determines whether the corresponding file exists. In ossfs 2.0, this operation is translated into a GetObjectMeta request and a ListObjects request.

  2. If a 404 Not Found error is returned, ossfs creates the file using the create operation. When ossfs 2.0 executes create, it also sends a GetObjectMeta request and a ListObjects request to query whether the file exists in OSS.

Therefore, the process of creating a new file involves four OSS metadata query operations.

ossfs 2.0 can cache 404 responses returned by OSS to reduce subsequent duplicate requests. To enable this feature, specify the following options when you mount the file system:

  • --oss_negative_cache_timeout=30 (The default value is 0 seconds. We recommend that you set this value to be less than the value of attr_timeout.)

  • --oss_negative_cache_size=10000 (Default value: 10000)

When the OSS negative cache is enabled, the 404 response to the lookup operation for a new file is cached. As a result, the subsequent query during the create operation hits the negative cache, and no request is sent to OSS. This reduces the number of OSS requests for the file creation process from four to two.
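
The reduction from four requests to two can be checked with a toy model of the lookup and create sequence. This sketch is not derived from the ossfs source code: count_create_requests is a hypothetical function, and the negative cache is modeled as a simple set of keys known to be absent, without expiry.

```python
from collections import Counter


def count_create_requests(negative_cache_enabled: bool) -> Counter:
    """Count simulated OSS metadata requests for creating one new file.

    Toy model of the lookup + create sequence described above.
    """
    requests = Counter()
    negative_cache = set()  # keys known to be absent (cached 404)

    def query_exists(key: str) -> bool:
        if key in negative_cache:
            return False  # served from the negative cache, no OSS request
        requests["GetObjectMeta"] += 1  # returns 404 for a new file
        requests["ListObjects"] += 1    # finds no virtual folder either
        if negative_cache_enabled:
            negative_cache.add(key)
        return False

    query_exists("dir/new-file")  # lookup
    query_exists("dir/new-file")  # existence check inside create
    return requests


if __name__ == "__main__":
    for enabled in (False, True):
        total = sum(count_create_requests(enabled).values())
        print(f"negative cache enabled={enabled}: {total} metadata requests")
```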

Important

After you enable the OSS negative cache, a cached 404 entry can hide a file that is created later. For example, if a 404 entry is cached for object-A and object-A is then created directly in OSS, the file becomes visible at the mount target only after the cache entry expires. The cache validity period is specified by oss_negative_cache_timeout. We do not recommend that you enable this feature in scenarios that require high data consistency.

Performance comparison

Test method: Mount an OSS Bucket with ossfs 2.0 on an ECS instance in the same region using an internal endpoint with metadata caching enabled, then read the metadata of 10,000 files in the mounted directory.

Test results

| Operation | Time consumed |
| --- | --- |
| Without preloading metadata cache (reading file metadata in the folder directly without executing the ls command in advance) | 111 seconds |
| With preloading metadata cache (executing the ls command in advance, then reading file metadata in the folder) | 18 seconds |

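Test conclusion: In this test, preloading the metadata cache cut the time to read the metadata of 10,000 files from 111 seconds to 18 seconds, roughly a sixfold improvement. Preloading the metadata cache before bulk file access, combined with an appropriate cache validity period, reduces OSS metadata requests and improves overall performance.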