How to create an OSS data shipping job (new version) - Alibaba Cloud Help Center

After Simple Log Service (SLS) collects data, you can ship it to an Object Storage Service (OSS) bucket for storage and analysis. This topic describes how to create an OSS data shipping job (new version).

Prerequisites

Create a project and a logstore. For more information, see Create a project and a logstore.
Collect data to your logstore. For more information, see Data collection.
Create a bucket in the same region as your SLS project. For more information, see Create a bucket in the console.

Supported regions

Simple Log Service ships data to an OSS bucket in the same region as the SLS project.

Important

This feature is available only in the following regions: China (Hangzhou), China (Shanghai), China (Nanjing), China (Hangzhou) Finance, China (Shanghai) Finance, China (Qingdao), China (Beijing), China (Zhangjiakou), China (Hohhot), China (Ulanqab), China (Chengdu), China (Shenzhen), China (Heyuan), China (Guangzhou), China (Hong Kong), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Philippines (Manila), Thailand (Bangkok), Japan (Tokyo), US (Silicon Valley), and US (Virginia).

China (Hangzhou) Finance supports only buckets that are accessible over the public endpoint in the OSS China (Hangzhou) Finance region. China (Shanghai) Finance supports only buckets in the OSS China (Shanghai) Finance region.

Create a data shipping job

Log on to the Simple Log Service console.
In the Projects section, click the project that you want to manage.
On the Log Storage > Logstores tab, click the > icon to the left of the target logstore and choose Data Processing > Export > Object Storage Service.
Hover over Object Storage Service and click the + icon.

In the Data Shipping to OSS panel, configure the following parameters and click OK.

Select New Version for the Shipping Version parameter. The following table describes the key parameters.

Important

After you create a data shipping job, a shipping operation for each shard is triggered when the specified batch size is reached or the batch interval has elapsed.
After you create the job, verify that it works as expected by checking its status and the data in OSS.

Parameter	Description
Job name	The unique name of the data shipping job.
Display Name	The display name of the data shipping job.
Job description	The description of the OSS data shipping job.
OSS bucket	The name of the destination OSS bucket. Important The bucket must exist in the same region as the SLS project and must not have Write-Once-Read-Many (WORM) enabled. For more information about WORM, see Bucket-level retention policies (BucketWorm). You can ship data to a bucket with the Standard, Infrequent Access (IA), Archive, Cold Archive, or Deep Cold Archive storage class. The storage class of the generated OSS objects defaults to that of the bucket. For more information, see Storage classes. Storage classes other than Standard have minimum storage durations and billable sizes. Choose a storage class for the destination bucket that meets your requirements. For more information, see Storage class comparison.
File Delivery Directory	The directory in the OSS bucket. The directory name cannot start with a forward slash (`/`) or a backslash (`\`). After you create the data shipping job, Simple Log Service ships data from the logstore to this directory in the destination OSS bucket.
Object Suffix	If you do not specify an object suffix, Simple Log Service automatically generates one based on the storage format and compression type, such as `.suffix`.
Partition Format	A format that dynamically generates a directory path in the OSS bucket based on the shipping time. The path cannot start with a forward slash (/). The default value is %Y/%m/%d/%H/%M. For examples, see Partition format. For parameter details, see the strptime API.
OSS Write RAM Role	The RAM role that grants the data shipping job permissions to write data to the OSS bucket. Default Role: Authorizes the data shipping job to assume the Alibaba Cloud system role `AliyunLogDefaultRole` to write data to the OSS bucket. Enter the ARN of `AliyunLogDefaultRole`. For information about how to obtain the ARN, see Access data by using a default role. Custom Role: Authorizes the data shipping job to assume a custom RAM role to write data to the OSS bucket. First, grant the custom RAM role permissions to write data to the OSS bucket. Then, enter the ARN of your custom RAM role in the OSS Write RAM Role field. For more information about obtaining the ARN, see the following topics: If the logstore and the OSS bucket belong to the same Alibaba Cloud account, see Step 2: Grant a RAM role permissions to write data to an OSS bucket. If the logstore and the OSS bucket belong to different Alibaba Cloud accounts, see Step 2: Grant RAM role role-b under Alibaba Cloud account B permissions to write data to an OSS bucket.
Logstore read RAM role	The RAM role that grants the data shipping job permissions to read data from the logstore. Default Role: Authorizes the data shipping job to assume the Alibaba Cloud system role `AliyunLogDefaultRole` to read data from the logstore. Enter the ARN of `AliyunLogDefaultRole`. For information about how to obtain the ARN, see Access data by using a default role. Custom Role: Authorizes the data shipping job to assume a custom RAM role to read data from the logstore. First, grant the custom RAM role permissions to read data from the logstore. Then, enter the ARN of your custom RAM role in the Logstore read RAM role field. For more information about obtaining the ARN, see the following topics: If the logstore and the OSS bucket belong to the same Alibaba Cloud account, see Step 1: Grant a RAM role permissions to read data from a logstore. If the logstore and the OSS bucket belong to different Alibaba Cloud accounts, see Step 1: Grant RAM role role-a under Alibaba Cloud account A permissions to read data from a logstore.
Storage Format	The file format for data stored in OSS. For more information, see CSV format, JSON format, Parquet format, and ORC format.
Compress	The compression method for data stored in OSS. none: Data is not compressed. snappy: Compresses data by using the snappy algorithm. For more information, see snappy. zstd: Compresses data by using the zstd algorithm. gzip: Compresses data by using the gzip algorithm.
Ship Tags	Specifies whether to include `__tag__` fields, which are reserved fields in Simple Log Service, in the shipped data. For more information, see reserved fields.
Batch size	The maximum size of uncompressed data, in MB, to ship from a shard in a single batch. A shipping operation is triggered when this size is reached. Value range: 5 to 256. Unit: MB. Note Batch size refers to the size of data to be batched after it is read, not the size of data already written to SLS. Data is read and shipped only after the batch interval has elapsed.
Batch interval	The maximum time to wait, in seconds, before shipping a batch of data from a shard. The interval starts when the first log entry of a batch is received. A shipping operation is triggered when the interval elapses. The value must be between 300 and 900 seconds. The default value is 300 seconds.
Shipping latency	The delay, in seconds, before data is shipped. For example, if you set this parameter to 3600, data is shipped with a 1-hour delay: data from 10:00:00 on 2023-06-05 is written to the OSS bucket no earlier than 11:00:00 on 2023-06-05. For information about the limitations, see Configuration limits.
Start Time Range	The time range of the data to ship, based on when the logs are received by SLS. The following options are available: All: Ships data starting from the first log entry received in the logstore. The job runs until you manually stop it. From Specific Time: Ships data starting from a specified point in time. The job runs until you manually stop it. Specific Time Range: Ships data within a specified start and end time. The job stops automatically at the end time. Note The time range refers to `__tag__:__receive_time__`. For more information, see reserved fields.
Time Zone	The time zone used to format the time in the directory path. If you specify a Time Zone and a Partition Format, the system generates the directory path in the OSS bucket based on your settings.
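
The value constraints in the table above can be checked locally before you fill in the panel. The following Python code is a minimal sketch, not part of any SLS SDK: `ShippingConfig`, `validate`, and `should_ship` are hypothetical names introduced for illustration. The trigger check follows the rule stated in this topic (batch size reached or batch interval elapsed), and the actual service behavior may differ in detail.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ShippingConfig:
    """Hypothetical container for a few parameters from the table above."""
    directory: str          # File Delivery Directory
    partition_format: str   # Partition Format
    batch_size_mb: int      # Batch size, in MB
    batch_interval_s: int   # Batch interval, in seconds


def validate(cfg: ShippingConfig) -> List[str]:
    """Return the documented constraints that the configuration violates."""
    errors = []
    if cfg.directory.startswith(("/", "\\")):
        errors.append("directory must not start with '/' or '\\'")
    if cfg.partition_format.startswith("/"):
        errors.append("partition format must not start with '/'")
    if not 5 <= cfg.batch_size_mb <= 256:
        errors.append("batch size must be between 5 and 256 MB")
    if not 300 <= cfg.batch_interval_s <= 900:
        errors.append("batch interval must be between 300 and 900 seconds")
    return errors


def should_ship(cfg: ShippingConfig, buffered_mb: float, seconds_waited: float) -> bool:
    """Assumed per-shard trigger: size reached OR interval elapsed."""
    return buffered_mb >= cfg.batch_size_mb or seconds_waited >= cfg.batch_interval_s


if __name__ == "__main__":
    cfg = ShippingConfig("test-table", "%Y/%m/%d/%H/%M", 256, 300)
    print(validate(cfg))
    print(should_ship(cfg, buffered_mb=12, seconds_waited=300))
```

An empty list from `validate` means that none of the constraints checked here is violated. It does not guarantee that the console accepts the configuration.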

View data in OSS

After data is successfully shipped to OSS, you can access it using the OSS console, an API, an SDK, or other tools. For more information, see File management.

The OSS object path is in the following format:

oss://OSS-BUCKET/OSS-PREFIX/PARTITION-FORMAT_RANDOM-ID

In this format, OSS-BUCKET is the OSS bucket name, OSS-PREFIX is the directory prefix, PARTITION-FORMAT is the partition format calculated from the shipping time using the strptime API, and RANDOM-ID is the unique ID of a shipping operation.

Note

Simple Log Service ships data to OSS in batches. Each shipping operation creates one object that contains a batch of data. The object path is determined by the earliest receive_time (the time when data arrives at SLS) in the batch. Note the following scenarios:

Shipping real-time data: Assume data is shipped every 5 minutes. A shipping operation at 00:00:00 on 2022-01-22 might process data received after 23:55:00 on 2022-01-21. Since the object path is based on the earliest timestamp in the batch, the resulting object could be placed in the 2022/01/21/ directory. Therefore, to analyze all data for 2022-01-22, you must check all objects in the 2022/01/22/ directory and the last few objects in the 2022/01/21/ directory.
Shipping historical data: If the logstore contains a small volume of data, a single data shipping operation may include data that spans multiple days. As a result, an object in the 2022/01/22/ directory might contain all the data for 2022-01-23, leaving the 2022/01/23/ directory empty.
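
The real-time scenario above means that a query for one day should also scan the previous day's directory. The following Python code is a minimal sketch that computes the two directory prefixes to list. It assumes a partition format that begins with the day-level pattern `%Y/%m/%d/`, and `prefixes_for_day` is a hypothetical helper, not an OSS or SLS API.

```python
from datetime import date, timedelta
from typing import List


def prefixes_for_day(oss_prefix: str, day: date, day_format: str = "%Y/%m/%d/") -> List[str]:
    """Return the OSS key prefixes that may hold data received on `day`.

    The previous day's prefix is included because an object is placed
    according to the earliest receive time in its batch.
    """
    previous = day - timedelta(days=1)
    return [
        f"{oss_prefix}/{previous.strftime(day_format)}",
        f"{oss_prefix}/{day.strftime(day_format)}",
    ]


if __name__ == "__main__":
    for prefix in prefixes_for_day("test-table", date(2022, 1, 22)):
        print(prefix)
```

For historical data that spans several days in one batch, a wider range of earlier directories may need to be scanned. This sketch covers only the adjacent-day case.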

Partition format

Each shipping operation corresponds to an OSS object path in the format oss://OSS-BUCKET/OSS-PREFIX/PARTITION-FORMAT_RANDOM-ID. The following table provides examples of partition formats for a shipping job created at 19:50:43 on 2022/01/20.

OSS bucket	OSS prefix	Partition format	Object suffix	OSS object path
test-bucket	test-table	%Y/%m/%d/%H/%M	.suffix	oss://test-bucket/test-table/2022/01/20/19/50_1484913043351525351_2850008.suffix
test-bucket	log_ship_oss_example	year=%Y/mon=%m/day=%d/log_%H%M	.suffix	oss://test-bucket/log_ship_oss_example/year=2022/mon=01/day=20/log_1950_1484913043351525351_2850008.suffix
test-bucket	log_ship_oss_example	ds=%Y%m%d/%H	.suffix	oss://test-bucket/log_ship_oss_example/ds=20220120/19_1484913043351525351_2850008.suffix
test-bucket	log_ship_oss_example	%Y%m%d/	.suffix	oss://test-bucket/log_ship_oss_example/20220120/_1484913043351525351_2850008.suffix Note This format may cause platforms like Hive to fail when parsing the OSS content. We recommend that you do not use this format.
test-bucket	log_ship_oss_example	%Y%m%d%H	.suffix	oss://test-bucket/log_ship_oss_example/2022012019_1484913043351525351_2850008.suffix
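
The rows above can be reproduced locally to preview a partition format before you create a job. The following Python code is a minimal sketch that uses the standard `datetime.strftime` method, which accepts the same `%Y`, `%m`, `%d`, `%H`, and `%M` directives. `object_path` is a hypothetical helper. The random ID is copied from the table, and the Time Zone setting is ignored because a naive timestamp is used.

```python
from datetime import datetime


def object_path(bucket: str, prefix: str, partition_format: str,
                shipped_at: datetime, random_id: str, suffix: str) -> str:
    """Build oss://OSS-BUCKET/OSS-PREFIX/PARTITION-FORMAT_RANDOM-ID plus the suffix."""
    partition = shipped_at.strftime(partition_format)
    return f"oss://{bucket}/{prefix}/{partition}_{random_id}{suffix}"


if __name__ == "__main__":
    shipped_at = datetime(2022, 1, 20, 19, 50, 43)
    random_id = "1484913043351525351_2850008"  # example value from the table
    for fmt in ("%Y/%m/%d/%H/%M", "year=%Y/mon=%m/day=%d/log_%H%M", "ds=%Y%m%d/%H"):
        print(object_path("test-bucket", "log_ship_oss_example", fmt,
                          shipped_at, random_id, ".suffix"))
```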

When you analyze OSS data with big data platforms like Hive, MaxCompute, or Alibaba Cloud Data Lake Analytics (DLA), set the partition format to a key=value format to use partition information. For example, in the path oss://test-bucket/log_ship_oss_example/year=2022/mon=01/day=20/log_195043_1484913043351525351_2850008.parquet, three partition columns are defined: year, mon, and day.
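
To illustrate how a key=value path carries partition information, the following Python code is a minimal sketch that extracts the partition columns from an object path. `partition_columns` is a hypothetical helper that only approximates how such platforms interpret directory names. It is not how Hive, MaxCompute, or DLA are implemented.

```python
from typing import Dict


def partition_columns(object_path: str) -> Dict[str, str]:
    """Collect key=value directory segments from an OSS object path."""
    columns = {}
    # Skip the final segment: it is the object name, not a directory.
    for segment in object_path.split("/")[:-1]:
        if "=" in segment:
            key, value = segment.split("=", 1)
            columns[key] = value
    return columns


if __name__ == "__main__":
    path = ("oss://test-bucket/log_ship_oss_example/year=2022/mon=01/day=20/"
            "log_195043_1484913043351525351_2850008.parquet")
    print(partition_columns(path))
```

A path without key=value directories, such as one generated by `%Y/%m/%d/%H/%M`, yields an empty dictionary, which is why that style exposes no named partition columns.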

SDK example

Create an OSS data shipping job