Genomic data files
This topic describes how to manage genomic data files in the Genomics Analysis Platform, including how to upload, download, and delete files.
The Genomics Analysis Platform does not directly store genomic data. Instead, it relies on Alibaba Cloud Object Storage Service (OSS) to manage your genomic data files. When you activate the Genomics Analysis Platform, you grant it role-based permission to access your OSS resources. Tasks that run on the platform read their input from your OSS buckets and write the final results back to your OSS buckets.
The Data Management page in the Genomics Analysis Platform workspace has a built-in OSS feature. This feature lets you upload, download, and delete genomic data files. These operations have the same effect as performing them directly in the OSS console.
By default, the Files page of Data Management in a workspace displays the data files in the OSS bucket associated with the workspace. You can also switch to access any other OSS bucket in the same region for which you have permissions.
Management tools
You can use all the tools listed in Developer Tools for OSS to manage genomic data files. Multiple methods are supported, such as a web page, a graphical client, and a command line interface.
In the file upload dialog box, the platform provides the name and region of the OSS bucket associated with the workspace. You can use this information to perform upload, download, and delete operations.
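As an illustration of how the bucket name and region can be used outside the console, the following minimal Python sketch builds an OSS endpoint, an oss:// destination path, and an ossutil cp command line. The bucket name, region ID, and file names are placeholders, and the helper functions are hypothetical names introduced for this example. It assumes the region is given either as a general region ID such as cn-hangzhou or in the OSS form oss-cn-hangzhou, and that ossutil is installed and configured with your credentials. The sketch only builds the command; it does not run it.

```python
import shlex


def oss_endpoint(region_id: str) -> str:
    """Return the public OSS endpoint for a region.

    Accepts either "cn-hangzhou" or "oss-cn-hangzhou" (assumption: the
    dialog box may show either form).
    """
    if not region_id.startswith("oss-"):
        region_id = f"oss-{region_id}"
    return f"https://{region_id}.aliyuncs.com"


def oss_uri(bucket: str, key: str = "") -> str:
    """Build an oss://bucket/key path as used by ossutil."""
    return f"oss://{bucket}/{key.lstrip('/')}"


def ossutil_upload_command(local_path: str, bucket: str, region_id: str,
                           prefix: str = "", recursive: bool = False) -> list:
    """Build (but do not run) an `ossutil cp` upload command."""
    cmd = ["ossutil", "cp", local_path, oss_uri(bucket, prefix),
           "-e", oss_endpoint(region_id)]
    if recursive:
        cmd.append("-r")  # needed when local_path is a directory
    return cmd


if __name__ == "__main__":
    # Placeholder values; replace them with the bucket name and region
    # shown in the file upload dialog box.
    command = ossutil_upload_command(
        "sample_R1.fastq.gz", "example-bucket", "cn-hangzhou", "raw/")
    print(shlex.join(command))
```

To execute the command from a script, the list can be passed to subprocess.run. Downloads use the same cp command with the source and destination swapped.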
Upload and download local genomic data
You can use the tools mentioned above to upload local data to your Genomics Analysis Platform workspace. You can download the analysis results to your local machine in the same way. The following suggestions can help you choose the right tool:
Web page upload: The maximum size for a single file is 5 GB. You must keep the web page open during the upload. If the upload is interrupted, it cannot be resumed. This method is suitable for uploading small files.
OSSBrowser client: This client provides a graphical user interface (GUI). It has no file size limit and supports parallel uploads and resumable uploads. This method is simple, convenient, and suitable for uploading genomic data files from your personal computer.
ossutil command line interface: This tool has no file size limit and supports parallel uploads and resumable uploads. It is suitable for uploading genomic data files from a local server or a High-Performance Computing (HPC) cluster. You can also write scripts to automate uploads.
Offline migration (Data Transport): This method is for one-time, terabyte-scale to petabyte-scale data migration to the cloud. It is suitable for migrating large batches of historical data from an on-premises data center that has low bandwidth or no public network access.
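The suggestions above can be summarized as a small decision helper. The following Python sketch is one possible reading of them, not an official rule set. The function name, its parameters, and the thresholds for "small files" and "terabyte-scale" are assumptions made for illustration. Only the 5 GB web page limit comes from this topic, and it is treated here as 5 GiB.

```python
from enum import Enum

GIB = 1024 ** 3
TIB = 1024 ** 4

# From this topic: the web page accepts at most 5 GB per file
# (interpreted as 5 GiB here, which is an assumption).
WEB_UPLOAD_LIMIT_BYTES = 5 * GIB


class UploadTool(Enum):
    WEB_PAGE = "Web page upload"
    OSSBROWSER = "OSSBrowser client"
    OSSUTIL = "ossutil command line interface"
    DATA_TRANSPORT = "Offline migration (Data Transport)"


def suggest_upload_tool(largest_file_bytes: int, total_bytes: int, *,
                        has_gui: bool = True,
                        needs_automation: bool = False,
                        limited_bandwidth: bool = False,
                        small_file_bytes: int = 100 * 1024 ** 2) -> UploadTool:
    """Suggest an upload tool based on the guidance in this topic.

    small_file_bytes is a hypothetical cutoff for "small files"; it is
    capped at the web page limit.
    """
    # One-time, terabyte-scale or larger migration over a weak or absent link.
    if limited_bandwidth and total_bytes >= TIB:
        return UploadTool.DATA_TRANSPORT
    # Servers, HPC clusters, and scripted uploads.
    if needs_automation or not has_gui:
        return UploadTool.OSSUTIL
    # Small files can go through the web page, which cannot resume uploads.
    if largest_file_bytes <= min(small_file_bytes, WEB_UPLOAD_LIMIT_BYTES):
        return UploadTool.WEB_PAGE
    # Larger files from a personal computer: GUI with resumable uploads.
    return UploadTool.OSSBROWSER


if __name__ == "__main__":
    choice = suggest_upload_tool(30 * GIB, 60 * GIB)
    print(choice.value)
```

For example, a 30 GiB FASTQ file on a laptop maps to the OSSBrowser client, while the same file on a headless server maps to ossutil.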
Uploading genomic data to OSS is free of charge. However, you are charged for data storage and downloads. For more information, see Billing overview for Object Storage Service (OSS).
Process data from other sources
Data from other Alibaba Cloud accounts
If your genomic data is from another Alibaba Cloud user, such as a sequencing service provider, that user can use Alibaba Cloud Resource Access Management (RAM) to grant you permission to access the data bucket. For more information, see Can the Genomics Analysis Platform access OSS resources across accounts? in the FAQ. After you are granted permission, you can use the data in the Genomics Analysis Platform.
Data from other public cloud providers
If your genomic data is stored with another public cloud provider, you can use Alibaba Cloud's Data Online Migration service to import the data into your Genomics Analysis Platform workspace.