Quick start for OSS and OSS-HDFS

This topic explains how to quickly access OSS and OSS-HDFS.

Prerequisites

  • Activate OSS.

  • Create buckets.

  • Verify that you have the required permissions to access OSS.

    • EMR clusters managed in the new console have the required permissions by default. If you encounter an issue, see Authorize service roles.

    • EMR clusters managed in the old console have the required permissions by default. If you encounter an issue, see Assign roles.

    • To grant permissions in a non-EMR environment, see OSS and OSS-HDFS authorization.

  • (Optional, recommended) Activate OSS-HDFS and grant permissions to access OSS-HDFS.

  • Verify the deployed JindoSDK version.

    • EMR clusters include JindoSDK by default.

      Note

      To access OSS-HDFS, you must create a cluster of EMR-3.42.0 or later, or EMR-5.8.0 or later.

    • For non-EMR environments, see Deploy JindoSDK in an environment other than EMR.

      Note

      To access OSS-HDFS, you must deploy JindoSDK 4.x or later.

Paths

The only difference between accessing OSS and accessing OSS-HDFS is the endpoint in the path. All other usage is identical. The following sample root paths illustrate both systems.

  • OSS

    Sample root path: oss://examplebucket.oss-cn-shanghai-internal.aliyuncs.com/

    Description: This path is for an OSS bucket named examplebucket in the China (Shanghai) region, accessed through an internal endpoint.

    Note

    Because this path uses an internal endpoint, cross-region access is not supported by default.

  • OSS-HDFS

    Sample root path: oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/

    Description: This path is for an OSS-HDFS-enabled bucket named examplebucket in the China (Shanghai) region.

    Note

    With this endpoint, cross-region access is not supported by default.
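The two sample root paths differ only in the endpoint portion, which can be sketched as a pair of small shell helpers. This is an illustrative sketch, not part of JindoSDK: the function names oss_root and oss_hdfs_root are hypothetical, and the endpoint patterns are inferred only from the sample paths for the China (Shanghai) region. Other regions are assumed to follow the same pattern, so confirm the endpoint for your region before you rely on it.

```shell
# Hypothetical helpers. The endpoint patterns below are inferred from the
# sample root paths in this topic and are an assumption for other regions.

# Build an OSS root path that uses the internal endpoint of a region.
oss_root() {
  bucket=$1
  region=$2
  printf 'oss://%s.oss-%s-internal.aliyuncs.com/\n' "$bucket" "$region"
}

# Build an OSS-HDFS root path for a bucket with OSS-HDFS enabled.
oss_hdfs_root() {
  bucket=$1
  region=$2
  printf 'oss://%s.%s.oss-dls.aliyuncs.com/\n' "$bucket" "$region"
}

# Print both sample root paths for examplebucket in cn-shanghai.
oss_root examplebucket cn-shanghai
oss_hdfs_root examplebucket cn-shanghai
```

The bucket name and region ID are the only inputs. Everything after the root path, such as directories and object names, is the same for both systems.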

Access methods

You can access OSS and OSS-HDFS by using four methods: Hadoop Shell commands, Jindo CLI commands, POSIX commands, and the OSS console.

  • Hadoop Shell command

    Example:

      hadoop fs -ls oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/

    Description: JindoOssFileSystem, included in JindoSDK, is an implementation of the Hadoop FileSystem interface. When you run a Hadoop Shell command, the client uses the endpoint in the path to access OSS or OSS-HDFS. For more information, see Use Hadoop Shell commands to access OSS or OSS-HDFS.

  • Jindo CLI command

    Example:

      jindo fs -ls oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/

    Description: Accessing OSS or OSS-HDFS with Jindo CLI commands is similar to using Hadoop Shell commands. Jindo CLI provides additional features, such as data archiving, caching, and error analysis. For more information, see Use Jindo CLI commands to access OSS or OSS-HDFS.

  • POSIX command

    Example:

      mkdir -p /mnt/oss
      jindo-fuse /mnt/oss -ouri=oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/
      ls /mnt/oss

    Description: JindoFuse implements the FUSE API to mount an OSS or OSS-HDFS path to a local directory. This setup lets you access objects in OSS or OSS-HDFS as if they were local files. For more information, see Use POSIX commands to access OSS or OSS-HDFS.

  • OSS console

    Example: Use the console UI, as described in the following steps.

    Description: To manage your data in the OSS console, follow these steps:

    1. Log on to the OSS console.

    2. On the Buckets page, click the name of the target bucket.

    3. In the left navigation pane, choose File Management > Objects.

    4. Click the OSS Object or HDFS tab to access your data.
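Because the client uses the endpoint in the path to decide whether a command targets OSS or OSS-HDFS, it can help to check which system a given path points to before you run a command. The following sketch shows one way to do that in plain shell. The function name storage_system_of is hypothetical, the check relies only on the oss-dls endpoint suffix shown in the sample paths, and it does not reflect how JindoSDK resolves endpoints internally.

```shell
# Hypothetical helper: report which storage system an oss:// path targets,
# based only on the endpoint suffix seen in the sample paths of this topic.
storage_system_of() {
  # Reject anything that does not use the oss:// scheme.
  case $1 in
    oss://*) ;;
    *) echo "unknown"; return 0 ;;
  esac

  # Strip the scheme, then keep everything before the first slash (the host).
  rest=${1#oss://}
  host=${rest%%/*}

  # In this sketch, any host that does not end with the OSS-HDFS
  # endpoint suffix is treated as OSS.
  case $host in
    *.oss-dls.aliyuncs.com) echo "OSS-HDFS" ;;
    *) echo "OSS" ;;
  esac
}

storage_system_of oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/
storage_system_of oss://examplebucket.oss-cn-shanghai-internal.aliyuncs.com/
```

The helper matches on the host only, so directory or object names that happen to contain the suffix do not affect the result.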