To seamlessly migrate your data from a self-hosted origin server or a third-party cloud storage service to Alibaba Cloud Object Storage (OSS) without service disruption, you can configure mirroring-based back-to-origin. When a client requests an object that does not exist in OSS, OSS automatically fetches the object from your specified origin server, returns it to the client, and stores it in your bucket. This feature ensures that all data remains accessible during migration, enabling a smooth transition.
How it works
The mirroring-based back-to-origin feature works as a server-side proxy. When a client sends a GET request for an object that does not exist in an OSS bucket, OSS checks if the request triggers a back-to-origin rule (for example, by matching an object name prefix and returning an HTTP 404 error). If a rule is triggered, OSS sends an HTTP request to the specified origin server to fetch the object. If the origin server returns a 200 OK status code, OSS returns the object to the client and simultaneously stores it in the bucket. If the origin server returns a 404 Not Found or another error status code, OSS returns the corresponding error to the client. In this process, OSS acts as a proxy, enabling on-demand data migration and one-time caching. Note that once an object is stored in OSS, it is not automatically updated even if the source object on the origin server changes.
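The flow described above can be modeled in a few lines. This is a minimal simulation, not OSS code: the `bucket` dict, the `fake_origin` callable, and the `handle_get` function are hypothetical stand-ins, and the rule is reduced to a single object name prefix.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in: a callable plays the role of the origin server.
OriginFetch = Callable[[str], Tuple[int, Optional[bytes]]]


def handle_get(bucket: Dict[str, bytes], key: str, prefix: str,
               fetch_from_origin: OriginFetch) -> Tuple[int, Optional[bytes]]:
    """Model a GET request against a bucket that has one mirroring rule."""
    if key in bucket:
        # The object already exists, so the origin is not contacted,
        # even if the origin copy changed after it was mirrored.
        return 200, bucket[key]
    if not key.startswith(prefix):
        # No rule is triggered: the client receives the ordinary 404.
        return 404, None
    status, body = fetch_from_origin(key)
    if status == 200 and body is not None:
        bucket[key] = body  # stored once, then served from the bucket
        return 200, body
    # Simplification: any other origin status is passed back as an error.
    return status, None


origin = {"examplefolder/a.txt": b"v1"}
calls: List[str] = []


def fake_origin(key: str) -> Tuple[int, Optional[bytes]]:
    calls.append(key)
    return (200, origin[key]) if key in origin else (404, None)


bucket: Dict[str, bytes] = {}
first = handle_get(bucket, "examplefolder/a.txt", "examplefolder/", fake_origin)
origin["examplefolder/a.txt"] = b"v2"  # the origin copy changes afterwards
second = handle_get(bucket, "examplefolder/a.txt", "examplefolder/", fake_origin)
missing = handle_get(bucket, "examplefolder/b.txt", "examplefolder/", fake_origin)
```

The second request is served from the bucket and still returns the first version, which illustrates the one-time caching behavior.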
Fetch missing objects from a website
This is the most basic scenario for configuring mirroring-based back-to-origin. When a client requests an object that does not exist in OSS, OSS automatically fetches it from a specified origin server and stores it in the bucket. This example shows how to configure a rule to fetch objects from https://example.com/ when a requested object is not found in the examplefolder/ directory of the examplebucket bucket.
Step 1: Configure a mirror back-to-origin rule
Go to the Buckets page and click the name of the target bucket.
In the left-side navigation pane, choose .
On the Mirroring-based Back-to-origin page, click Create Rule.
In the Create Rule panel, configure the parameters. Use the default values for any other parameters.
Parameter: Configuration
Method: Select Image.
Condition: Select Object Name Prefix and enter examplefolder/ in the text box.
Origin URL: In the first column (Protocol), select https. In the second column (Domain Name), enter example.com. Leave the third column (Path Prefix) empty. The path prefix is appended to the domain name to form the path of the origin URL.
Click OK.
Step 2: Verify the rule
Access https://examplebucket.oss-cn-hangzhou.aliyuncs.com/examplefolder/example.txt.
If the examplefolder/example.txt object does not exist in the examplebucket bucket, OSS requests the object from https://example.com/examplefolder/example.txt.
After fetching the object, OSS saves it as examplefolder/example.txt in the examplebucket bucket and returns the object to the client.
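If you prefer to manage the rule outside the console, the same settings can likely be expressed as a routing rule in the bucket's website configuration. The sketch below only builds the XML fragment with the standard library. The element names (`RoutingRule`, `KeyPrefixEquals`, `HttpErrorCodeReturnedEquals`, `RedirectType`, `MirrorURL`) are written as recalled from the PutBucketWebsite API and should be treated as assumptions to check against the API reference. `mirror_rule_xml` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def mirror_rule_xml(rule_number: int, key_prefix: str, mirror_url: str) -> str:
    """Build one routing rule for a mirroring-based back-to-origin rule."""
    rule = ET.Element("RoutingRule")
    ET.SubElement(rule, "RuleNumber").text = str(rule_number)

    # Condition: the object name starts with the prefix and OSS would return 404.
    condition = ET.SubElement(rule, "Condition")
    ET.SubElement(condition, "KeyPrefixEquals").text = key_prefix
    ET.SubElement(condition, "HttpErrorCodeReturnedEquals").text = "404"

    # Action: fetch the object from the origin URL and mirror it into the bucket.
    redirect = ET.SubElement(rule, "Redirect")
    ET.SubElement(redirect, "RedirectType").text = "Mirror"
    ET.SubElement(redirect, "MirrorURL").text = mirror_url
    return ET.tostring(rule, encoding="unicode")


xml_text = mirror_rule_xml(1, "examplefolder/", "https://example.com/")
print(xml_text)
```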
Replace directory and verify integrity
In some scenarios, the directory structure in your OSS bucket may differ from that of your origin server. You may also need to ensure the integrity of the objects fetched from the origin server. This use case shows how to map directories and use MD5 verification to ensure reliable data transfer.
When a client requests an object that does not exist in the examplefolder directory of the bucket-01 bucket in the China (Hangzhou) region, OSS fetches the object from the destfolder directory of the https://example.com website.
OSS verifies the MD5 hash of the fetched object. Objects with a mismatched MD5 hash are not saved to the bucket-01 bucket.
Step 1: Configure a mirror back-to-origin rule
Go to the Buckets page and click the name of the target bucket.
In the left-side navigation pane, choose .
On the Mirroring-based Back-to-origin page, click Create Rule.
In the Create Rule panel, configure the required parameters as described in the following table. Use the default values for other parameters.
Parameter: Configuration
Method: Select Image.
Condition: Select Object Name Prefix and set it to examplefolder/.
Replace or Delete File Prefix: Select Replace or Delete File Prefix and set it to destfolder/.
Note: This option is displayed only after you set Object Name Prefix for the back-to-origin condition.
Origin URL: Set the first column to https, the second column to example.com, and leave the third column empty.
MD5 Verification: Select Perform MD5 verification. If the response to the back-to-origin request contains the Content-MD5 header, OSS verifies whether the MD5 hash of the fetched object matches the value of the Content-MD5 header.
If the values match, the client receives the object, and OSS saves the object.
If the values do not match, the client still receives the object, but OSS does not save it. This is because calculating the MD5 hash requires the complete object data, and at that point, the object has already been streamed to the client.
Click OK.
Step 2: Verify the rule
Access https://bucket-01.oss-cn-hangzhou.aliyuncs.com/examplefolder/example.txt.
If the examplefolder/example.txt object does not exist in the bucket-01 bucket, OSS requests the object from https://example.com/destfolder/example.txt.
After fetching the object, OSS performs the following operations:
If the response to the back-to-origin request contains the Content-MD5 header, OSS calculates the MD5 hash of the fetched object and compares it with the value of the Content-MD5 header. If the values match, OSS saves the object as examplefolder/example.txt to the bucket-01 bucket and returns the object to the client. If the values do not match, OSS returns the object to the client but does not save it to the bucket-01 bucket.
If the response to the back-to-origin request does not contain the Content-MD5 header, OSS saves the object as examplefolder/example.txt to the bucket-01 bucket and returns the object to the client.
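The save-or-discard decision can be sketched as follows. This is an illustration of the comparison only, assuming the origin sends Content-MD5 in its standard form (the Base64 encoding of the raw 16-byte MD5 digest). `should_store` is a hypothetical helper, not an OSS API.

```python
import base64
import hashlib
from typing import Mapping


def should_store(body: bytes, headers: Mapping[str, str]) -> bool:
    """Return True if the fetched object would be saved to the bucket."""
    # Header names are case-insensitive, so normalize them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    expected = lowered.get("content-md5")
    if expected is None:
        return True  # no Content-MD5 header, so there is nothing to verify
    # Content-MD5 carries the Base64 encoding of the raw MD5 digest.
    actual = base64.b64encode(hashlib.md5(body).digest()).decode("ascii")
    return actual == expected.strip()


body = b"example object content"
good_md5 = base64.b64encode(hashlib.md5(body).digest()).decode("ascii")
bad_md5 = base64.b64encode(hashlib.md5(b"something else").digest()).decode("ascii")

print(should_store(body, {"Content-MD5": good_md5}))  # hash matches
print(should_store(body, {"Content-MD5": bad_md5}))   # hash differs
print(should_store(body, {}))                         # header absent
```

In all three cases the client still receives the object. Only the decision to save it changes.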
Route requests based on directory
If your business involves multiple origin servers, you can route requests to different servers based on the requested object path. This scenario is useful for consolidating data from multiple sources or migrating from a distributed storage architecture. For example, you have two origin servers with identical directory structures, Origin Server A (https://example.com) and Origin Server B (https://example.org), and you want to implement the following behavior:
When a client requests an object that does not exist in the bucket-02/dir1 directory in the China (Beijing) region, OSS fetches the object from the example1 directory of the https://example.com website.
When a client requests an object that does not exist in the bucket-02/dir2 directory, OSS fetches the object from the example2 directory of the https://example.org website.
Depending on whether redirect policies are configured on Origin Server A and Origin Server B, OSS decides whether to request the object from the redirected address.
Step 1: Configure mirror back-to-origin rules
Go to the Buckets page and click the name of the target bucket.
In the left-side navigation pane, choose .
On the Mirroring-based Back-to-origin page, click Create Rule.
In the Create Rule panel, configure two mirroring-based back-to-origin rules as described below. Use the default values for any other parameters.
Rule 1
Parameter: Configuration
Method: Select Image.
Condition: Select Object Name Prefix and set it to dir1/.
Replace or Delete File Prefix: Select Replace or Delete File Prefix and set it to example1/.
Note: This option is displayed only after you set Object Name Prefix for the back-to-origin condition.
Origin URL: Set the first column to https, the second column to example.com, and leave the third column empty.
3xx Response: Select Follow Origin to Redirect Request.
Note: If Follow Origin to Redirect Request is not selected, OSS returns the redirect address specified by the origin server directly to the client.
Rule 2
Parameter: Configuration
Method: Select Image.
Condition: Select Object Name Prefix and set it to dir2/.
Replace or Delete File Prefix: Select Replace or Delete File Prefix and set it to example2/.
Note: This option is displayed only after you set Object Name Prefix for the back-to-origin condition.
Origin URL: Set the first column to https, the second column to example.org, and leave the third column empty.
3xx Response: Select Follow Origin to Redirect Request.
Click OK.
Step 2: Verify the rules
Access https://bucket-02.oss-cn-beijing.aliyuncs.com/dir1/example.txt.
If the example.txt object does not exist in the dir1 directory of the bucket-02 bucket, OSS sends a request for the object to https://example.com/example1/example.txt.
If Origin Server A has a redirect rule for example1/example.txt, OSS sends a new request to the redirected address. After fetching the object, OSS saves it as dir1/example.txt to the bucket-02 bucket and returns it to the client.
If Origin Server A does not have a redirect rule for example1/example.txt, OSS fetches the object, saves it as dir1/example.txt to the bucket-02 bucket, and returns it to the client.
If a client requests https://bucket-02.oss-cn-beijing.aliyuncs.com/dir2/example.txt, the object fetched through the mirroring-based back-to-origin rule is stored as dir2/example.txt in the bucket-02 bucket.
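The mapping from a requested object name to an origin URL in this scenario can be sketched as below. The `MirrorRule` class and `origin_url` function are hypothetical and model only the prefix condition, the prefix replacement, and first-match ordering by rule number. Redirect handling is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MirrorRule:
    rule_number: int
    key_prefix: str      # value of the Object Name Prefix condition
    replace_prefix: str  # value of Replace or Delete File Prefix
    origin: str          # protocol and domain name, without a trailing slash


def origin_url(rules: List[MirrorRule], key: str) -> Optional[str]:
    """Return the origin URL for a missing object, or None if no rule matches."""
    # Rules are checked in ascending rule number, and the first match wins.
    for rule in sorted(rules, key=lambda r: r.rule_number):
        if key.startswith(rule.key_prefix):
            rest = key[len(rule.key_prefix):]
            return f"{rule.origin}/{rule.replace_prefix}{rest}"
    return None


RULES = [
    MirrorRule(1, "dir1/", "example1/", "https://example.com"),
    MirrorRule(2, "dir2/", "example2/", "https://example.org"),
]

print(origin_url(RULES, "dir1/example.txt"))
print(origin_url(RULES, "dir2/example.txt"))
print(origin_url(RULES, "dir3/example.txt"))
```

Note that the object is always saved under the requested name (for example, dir1/example.txt). The replaced prefix affects only the URL requested from the origin.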
Fetch from a private bucket and forward parameters
When your origin server is a private OSS bucket, you must configure the necessary access permissions. You may also need to forward specific parameters from the client request to the origin server. This use case shows how to configure back-to-origin for a private OSS bucket and forward parameters. For example, you have two buckets in the China (Shanghai) region: bucket-03 (public-read) and bucket-04 (private). You want to implement the following behavior:
When a client requests an object that does not exist in the examplefolder directory of the bucket-03 bucket, OSS fetches the object from the examplefolder directory of the bucket-04 bucket.
OSS passes the query string from the request URL to the origin server.
OSS passes the HTTP headers header1, header2, and header3 from the request to the origin server.
Step 1: Configure a mirror back-to-origin rule
Go to the Buckets page and click the name of the target bucket.
In the left-side navigation pane, choose .
On the Mirroring-based Back-to-origin page, click Create Rule.
In the Create Rule panel, configure the required parameters as described in the following table. Use the default values for any other parameters.
Parameter: Configuration
Method: Select Image.
Condition: Select Object Name Prefix and set it to examplefolder/.
Origin Type: Select Back-to-origin to Private OSS Bucket, and then select bucket-04 from the Source Bucket drop-down list.
After this option is configured, when a client requests an object that does not exist, OSS uses the default role AliyunOSSMirrorDefaultRole to fetch the data from the specified private origin bucket. This role requires the AliyunOSSReadOnlyAccess permission, which ensures that OSS can access the origin data only in read-only mode and cannot modify or delete it.
To configure mirroring-based back-to-origin for a private OSS bucket, a RAM user must have the ram:GetRole permission. This permission is used to check whether the AliyunOSSMirrorDefaultRole role exists.
If the role exists, it is used directly.
If the role does not exist, we recommend that you use the primary Alibaba Cloud account associated with the RAM user to create the AliyunOSSMirrorDefaultRole role in advance and grant it the AliyunOSSReadOnlyAccess permission. This practice avoids granting high-risk permissions, such as creating roles (ram:CreateRole) and attaching policies to roles (ram:AttachPolicyToRole), to the RAM user. After the role is authorized, the RAM user can reuse the existing role, which reduces permission configuration risks.
Origin URL: Set the first column to https and leave the other fields empty.
Origin Parameter: Select Transfer with Query String. OSS passes the query string from the request URL to the origin server.
Set Transmission Rule of HTTP Header: Select Transmit Specific HTTP Headers and add the HTTP headers header1, header2, and header3. Back-to-origin rules do not support forwarding certain standard HTTP headers, such as authorization, authorization2, range, content-length, and date, or any headers that start with x-oss-, oss-, or x-drs-.
Important: When fetching from a private bucket, do not select the option to forward all HTTP headers. Doing so causes the back-to-origin fetch to fail.
Click OK.
Step 2: Verify the rule
Access https://bucket-03.oss-cn-shanghai.aliyuncs.com/examplefolder/example.png?caller=lucas&production=oss.
If the examplefolder/example.png object does not exist in the bucket-03 bucket, OSS sends a request for the object to https://bucket-04.oss-cn-shanghai.aliyuncs.com/examplefolder/example.png?caller=lucas&production=oss.
The bucket-04 bucket returns the example.png object to OSS based on the forwarded ?caller=lucas&production=oss parameters.
OSS saves the fetched object as examplefolder/example.png in the bucket-03 bucket.
If the request also carries the header1, header2, and header3 HTTP headers, OSS also passes them to the bucket-04 bucket.
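The header selection in this scenario can be approximated with a small filter. This is a sketch under assumptions: the blocked names and prefixes are only the examples listed in the rule configuration above, so the real list may be longer, and `headers_to_forward` is a hypothetical helper rather than part of any SDK.

```python
from typing import Dict, Iterable

# Examples of headers that back-to-origin rules do not forward.
# Assumption: this list is illustrative and not exhaustive.
BLOCKED_NAMES = {"authorization", "authorization2", "range", "content-length", "date"}
BLOCKED_PREFIXES = ("x-oss-", "oss-", "x-drs-")


def headers_to_forward(request_headers: Dict[str, str],
                       allowed: Iterable[str]) -> Dict[str, str]:
    """Keep only headers that are explicitly allowed and not blocked."""
    allow = {name.lower() for name in allowed}
    forwarded: Dict[str, str] = {}
    for name, value in request_headers.items():
        lower = name.lower()
        if lower not in allow:
            continue  # not in the list of specific headers to transmit
        if lower in BLOCKED_NAMES or lower.startswith(BLOCKED_PREFIXES):
            continue  # never forwarded, even if listed
        forwarded[name] = value
    return forwarded


incoming = {
    "header1": "a",
    "Header2": "b",
    "Authorization": "secret",
    "x-oss-meta-owner": "lucas",
    "User-Agent": "curl/8.0",
}
result = headers_to_forward(
    incoming, ["header1", "header2", "header3", "authorization", "x-oss-meta-owner"]
)
print(result)
```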
Production use cases
Seamless data migration
For more information about the migration solution, see Seamlessly migrate services to Alibaba Cloud OSS by using mirroring-based back-to-origin.
Refresh cached objects
Mirroring-based back-to-origin is a one-time caching mechanism. If an object on the origin server is updated, OSS does not automatically refresh or re-fetch it. You can use the following methods to manually refresh cached objects.
Manual deletion: Delete the object from the OSS bucket by using the console or an API. The next time the object is accessed, the back-to-origin rule is triggered again.
Lifecycle rules: Configure an expiration policy for the mirrored objects. They are automatically deleted after a specified period, enabling periodic refreshes.
Object name versioning: When you update an object on the origin server, use a new name (for example, style.v2.css). This is the recommended approach to avoid caching issues.
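For the object name versioning approach, a small helper can derive the new name consistently. This is only an illustrative sketch. The `versioned_key` function and the `.v<number>` naming pattern are assumptions modeled on the style.v2.css example.

```python
import posixpath


def versioned_key(key: str, version: int) -> str:
    """Insert ".v<version>" before the file extension of an object name."""
    directory, filename = posixpath.split(key)
    stem, ext = posixpath.splitext(filename)
    return posixpath.join(directory, f"{stem}.v{version}{ext}")


print(versioned_key("style.css", 2))
print(versioned_key("assets/css/style.css", 3))
```

Because each version has a name that does not yet exist in the bucket, the first request for it triggers the back-to-origin rule again.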
Risk prevention and fault tolerance
Origin server load: Ensure that your origin server has sufficient bandwidth and processing capacity to handle back-to-origin requests. During the initial migration phase, the volume of back-to-origin requests may be high. We recommend that you monitor the load on your origin server and consider pre-warming the data during off-peak hours.
Cost control: To avoid unexpected high costs, we recommend that you set up cost alerts in the Alibaba Cloud Management Console to monitor the volume of back-to-origin requests.
Security configuration: Ensure that your origin server is accessible to OSS. If the origin URL uses the HTTPS protocol, make sure the origin server's certificate is issued by a trusted Certificate Authority (CA), the domain name matches, and the certificate has not expired.
Log query: Use the real-time log query feature to view logs related to back-to-origin. The User-Agent for back-to-origin requests contains the string aliyun-oss-mirror.
Quotas and limits
Number and order of rules: You can configure up to 20 back-to-origin rules for each bucket. Rules are matched in ascending order of their RuleNumber. Once a rule is matched, it is executed, and subsequent rules are not checked. You can adjust the matching priority by using the Up or Down options next to a rule.
QPS and bandwidth:
Regions in the Chinese mainland: The default total QPS is 2,000, and the total bandwidth is 2 Gbit/s.
Regions outside the Chinese mainland: The default total QPS is 1,000, and the total bandwidth is 1 Gbit/s.
This limit applies to the total mirroring-based back-to-origin capacity for all buckets that belong to a single Alibaba Cloud account in the corresponding region. Requests that exceed this limit are throttled, and a 503 error is returned. To request a higher quota, contact Technical Support.
Origin server address: The address must be a publicly accessible domain name or IP address that complies with RFC 3986 encoding standards. Internal network addresses are not supported.
Timeout: The default timeout for mirroring-based back-to-origin is 10 seconds.
Chunked back-to-origin: If your origin server supports range requests and you require the chunked back-to-origin feature, contact Technical Support.
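Because requests over the QPS or bandwidth quota are throttled with a 503 error, a client may want to retry with increasing delays. The sketch below shows one common pattern, exponential backoff. The `get_with_backoff` function, its parameters, and the delay values are assumptions for illustration and are not recommendations from OSS.

```python
import time
from typing import Callable, List, Tuple


def get_with_backoff(fetch: Callable[[], Tuple[int, bytes]],
                     max_attempts: int = 4,
                     base_delay: float = 0.5,
                     sleep: Callable[[float], None] = time.sleep) -> Tuple[int, bytes]:
    """Call fetch() and retry with doubling delays while it returns 503."""
    status, body = fetch()
    for attempt in range(1, max_attempts):
        if status != 503:
            break
        sleep(base_delay * 2 ** (attempt - 1))  # 0.5 s, 1 s, 2 s, ...
        status, body = fetch()
    return status, body


# Usage with a fake fetcher that is throttled twice before it succeeds.
responses = iter([(503, b""), (503, b""), (200, b"ok")])
delays: List[float] = []
outcome = get_with_backoff(lambda: next(responses), sleep=delays.append)
print(outcome, delays)
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.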
FAQ
Mirrored object size differs from source
If you find a size discrepancy between the mirrored object and the source object, follow these steps to investigate.
Check the Last-Modified timestamps of the mirrored object and the source object.
import oss2
import requests
from datetime import datetime
from oss2.credentials import EnvironmentVariableCredentialsProvider

# Obtain credentials from environment variables. Before running this code,
# make sure the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are set.
auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())

# Specify the endpoint for the region where your bucket is located.
# For example, for China (Hangzhou), the endpoint is https://oss-cn-hangzhou.aliyuncs.com.
endpoint = "https://oss-cn-hangzhou.aliyuncs.com"
# Specify the region corresponding to the endpoint, e.g., cn-hangzhou.
# This parameter is required for V4 signatures.
region = "cn-hangzhou"

# Replace "yourBucketName" with the name of the bucket where you configured the rule.
bucket = oss2.Bucket(auth, endpoint, "yourBucketName", region=region)

# Specify the full path of the mirrored object.
object_key = 'yourObjectKey'
# Specify the full URL of the source object.
source_url = 'yourSourceUrl'

# Get the Last-Modified timestamp of the mirrored object.
oss_object_info = bucket.get_object_meta(object_key)
oss_last_modified = oss_object_info.headers['last-modified']
print(f"OSS Last-Modified: {oss_last_modified}")

# Get the Last-Modified timestamp of the source object.
response = requests.head(source_url, timeout=10)
source_last_modified = response.headers.get('last-modified')
print(f"Source Last-Modified: {source_last_modified}")

if source_last_modified is None:
    # Without this header the timestamps cannot be compared.
    print("The origin server did not return a Last-Modified header.")
else:
    # Convert the timestamp strings to datetime objects for comparison.
    time_format = '%a, %d %b %Y %H:%M:%S %Z'
    oss_time = datetime.strptime(oss_last_modified, time_format)
    source_time = datetime.strptime(source_last_modified, time_format)

    if oss_time < source_time:
        print("The source object has been updated.")
    elif oss_time > source_time:
        print("The mirrored object is newer.")
    else:
        print("The timestamps of the two objects are identical.")
If the Last-Modified timestamp of the source object is later than the Last-Modified timestamp of the mirrored object, the source object may have been updated after the mirrored object was created.
Note: When OSS fetches an object from an origin server and writes it to a bucket, it does not preserve the Last-Modified timestamp of the source object. Instead, OSS sets the Last-Modified timestamp of the mirrored object to the time it was created or updated in OSS.
If the Last-Modified timestamp of the source object is earlier than or equal to the Last-Modified timestamp of the mirrored object, the source object has not been updated since the mirrored object was created. The next step is to check the MD5 or CRC64 checksums of both objects.
Compare the MD5 or CRC64 checksums of the mirrored object and the source object.
# -*- coding: utf-8 -*-
import hashlib

# The Python standard library does not support CRC64.
# This example uses the third-party crcmod library.
# Install crcmod: pip install crcmod
import crcmod
import oss2
import requests
from oss2.credentials import EnvironmentVariableCredentialsProvider

# Obtain credentials from environment variables. Before running this code,
# make sure the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are set.
auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())

# Specify the endpoint for the region where your bucket is located.
# For example, for China (Hangzhou), the endpoint is https://oss-cn-hangzhou.aliyuncs.com.
endpoint = "https://oss-cn-hangzhou.aliyuncs.com"
# Specify the region corresponding to the endpoint, e.g., cn-hangzhou.
# This parameter is required for V4 signatures.
region = "cn-hangzhou"

# Replace "yourBucketName" with the name of the bucket where you configured the rule.
bucket = oss2.Bucket(auth, endpoint, "yourBucketName", region=region)

# Specify the full path of the mirrored object.
object_key = 'yourObjectKey'
# Specify the full URL of the source object.
source_url = 'yourSourceUrl'

# Get the metadata of the mirrored object.
oss_object_info = bucket.get_object_meta(object_key)
# The ETag is usually the MD5 hash. OSS returns it in uppercase, so normalize the case.
oss_md5 = oss_object_info.headers.get('etag', '').strip('"').lower()
# OSS reports the CRC64 value as a decimal string.
oss_crc64 = oss_object_info.headers.get('x-oss-hash-crc64ecma', '')
print(f"OSS MD5: {oss_md5}")
print(f"OSS CRC64: {oss_crc64}")

# Get the content of the source object and calculate its MD5 and CRC64.
response = requests.get(source_url, timeout=30)
if response.status_code == 200:
    source_content = response.content
    source_md5 = hashlib.md5(source_content).hexdigest()
    print(f"Source MD5: {source_md5}")

    # CRC64 with the ECMA-182 polynomial, which is the variant that the
    # x-oss-hash-crc64ecma header uses. Format the result as a decimal string.
    crc64_func = crcmod.mkCrcFun(0x142F0E1EBA9EA3693, initCrc=0,
                                 xorOut=0xFFFFFFFFFFFFFFFF, rev=True)
    source_crc64 = str(crc64_func(source_content))
    print(f"Source CRC64: {source_crc64}")

    # Compare the MD5 values.
    if oss_md5 == source_md5:
        print("MD5 checksums are identical.")
    else:
        print("MD5 checksums do not match.")

    # Compare the CRC64 values.
    if oss_crc64 == source_crc64:
        print("CRC64 checksums are identical.")
    else:
        print("CRC64 checksums do not match.")
else:
    print(f"Failed to fetch source file. HTTP Status Code: {response.status_code}")
If the MD5 or CRC64 checksums are identical, the content of the two objects is the same. In this case, their sizes should also be identical.
If the MD5 or CRC64 checksums do not match, the content of the two objects is different. Proceed to the next step to check for special request headers.
Check for special request headers.

Check if the back-to-origin request contains special HTTP request headers, such as Accept-Encoding: gzip, deflate, br. This header indicates that the client can accept compressed data.
If the back-to-origin request uses HTTP compression and the requested object meets the compression criteria, the sizes of the two objects will differ.
If the Accept-Encoding header is present, do not forward it.
If you have configured the rule to forward all HTTP headers, add accept-encoding to the list of prohibited headers.
If you have configured the rule to forward specific HTTP headers, ensure that accept-encoding is not included in the list of specified headers.
Troubleshoot back-to-origin failures
If you encounter an origin fetch failure (such as a 424 MirrorFailed error), you can troubleshoot the issue by following the steps below.
Check the reachability of the origin server.
# Replace the URL with your actual origin server address and file path
curl -I "https://www.example.com/images/test.jpg"
Check the DNS resolution.
# Replace the domain name with your actual origin domain name
nslookup www.example.com
Check the HTTPS certificate (if the origin server uses HTTPS).
# Replace the domain name with your actual origin domain name
openssl s_client -connect www.example.com:443 -servername www.example.com
Analyze the issue by using the real-time log query feature of OSS.
No mirrored object created
A client HEAD request retrieves only object metadata, such as size and type, without downloading the content. Therefore, HEAD requests do not trigger mirroring-based back-to-origin rules to fetch an object from the origin server and write it to the OSS bucket.
Unexpected status code from back-to-origin
When a request triggers mirroring-based back-to-origin, if the origin server returns a status code other than 404, 200, or 206, analyze the origin server's response.
Origin is OSS: Check the following configuration items.
Prohibit specific HTTP headers from being forwarded: Prohibit forwarding the host header to avoid exposing origin server information and to ensure back-to-origin requests are processed as expected. If you do not prohibit forwarding the host header, the back-to-origin request will pass the host value of the target bucket to the origin server. Because each bucket's host value is unique, if the requested host does not match the origin's actual host, the origin server returns a 403 error. OSS then returns a 424 error to the client.

Back-to-origin for a private OSS bucket: If permissions are not configured, check whether the ACL of the target bucket and its objects is set to public-read. If permissions are configured, check whether the role authorization policy for mirroring-based back-to-origin has changed, resulting in insufficient permissions. The default role for mirroring-based back-to-origin is AliyunOSSMirrorDefaultRole, and its default system policy is AliyunOSSReadOnlyAccess.
Origin is not OSS: Analyze the server-side logs and check configurations for Server Name Indication (SNI), back-to-origin parameters, and header forwarding to identify the specific cause of the origin server error. The origin server may return status codes such as 401 (Unauthorized), 403 (Forbidden), or a 5xx (Server Internal Error).
Back-to-origin rule matching order
Rules are matched based on the rule number (RuleNumber) in ascending order. After the first rule that meets the conditions is matched, that rule is immediately executed and no subsequent rules are matched.
Fetch from a VPC or internal IP
Fetching from a VPC or an internal IP address is not supported. The origin server must have a publicly accessible address. To access a service in a VPC, expose it to the public internet by using a NAT Gateway or an internet-facing SLB instance.
OSS object not updated after source update
Mirroring-based back-to-origin is a one-time pull mechanism and does not automatically synchronize updates from the origin server. To fetch an updated object, you must manually delete the mirrored object from OSS or use an object name versioning strategy.