STR_HASH

更新时间:
复制 MD 格式

STR_HASH is a sharding function for CHAR and VARCHAR shard keys. It extracts a substring from the shard key value and uses that substring—either as a string or an integer—to calculate which database shard and table shard a row belongs to.

Compared to the default HASH function, STR_HASH lets you control which part of the shard key drives routing. This is useful when the shard key follows a fixed format (such as an order ID with a predictable suffix) and you need deterministic one-to-one mapping between a shard key value and a physical shard.

Important

STR_HASH only supports point queries. Range queries trigger a full table scan and degrade query performance.

Syntax

STR_HASH( shardKey [, startIndex, endIndex [, valType [, randSeed ] ] ] )

Only shardKey is required. Omitting all optional parameters routes rows by hashing the full shard key value as a string.

Parameters

ParameterDescriptionDefault
shardKeyThe shard key column name. Must be CHAR or VARCHAR.Required
startIndexStart position of the substring (0-based, inclusive). Set to -1 to use the last endIndex characters instead.-1
endIndexEnd position of the substring (0-based, exclusive). Set to -1 to use the first startIndex characters instead.-1
valTypeHow the substring is interpreted for routing: 0 = as a string (uses UNI_HASH), 1 = as an integer (value must not exceed 9,223,372,036,854,775,807 and must not be a floating-point number).0
randSeedRandom seed for hash calculation when valType is 0. Change this only when the default seed produces uneven shard distribution. Common alternative values include 131, 13131, and 1313131.31

How startIndex and endIndex work together

ConditionSubstring extracted
startIndex = j, endIndex = k (j >= 0, k >= 0, k > j)Characters at positions [j, k) of the original string
startIndex = -1, endIndex = k (k >= 0)Last k characters; the full string if it has fewer than k characters
startIndex = k, endIndex = -1 (k >= 0)First k characters; the full string if it has fewer than k characters
startIndex = -1, endIndex = -1The full string (no truncation)

Examples using the string ABCDEFG:

RangeResult
[1, 5)BCDE
[2, 2)'' (empty)
[4, 100)EFG
[100, 105)'' (empty)

randSeed notes

Important

randSeed applies only when valType is 0. Changing randSeed on an existing table requires exporting all data and reimporting it under the new sharding algorithm. Data is not redistributed automatically.

Limits

  • The shard key must be CHAR or VARCHAR. Numeric columns are not supported.

  • STR_HASH parameters cannot be changed after the table is created.

  • String routing is case-sensitive: Order001 and order001 route to different shards.

Use cases

Precise routing by a fixed string format

Use STR_HASH when the shard key follows a fixed format and you need each physical table shard to correspond to exactly one shard key value range. A typical example: an order management system that shards by year-month (using YYYYMM) for database partitioning, then shards table partitions by the last three digits of each order ID—which always fall in the range 000999.

CREATE TABLE orders (
    id INT NOT NULL AUTO_INCREMENT,
    order_id VARCHAR(30) NOT NULL,
    create_time DATETIME DEFAULT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
dbpartition BY YYYYMM(`create_time`)
tbpartition BY STR_HASH(`order_id`, -1, 3, 1) tbpartitions 1024;

This extracts the last three characters of order_id, converts them to an integer (valType=1), and applies a modulo operation across 1,024 table shards. Each shard maps to exactly one three-digit suffix—something the default HASH function cannot guarantee, because its hashCode may map different strings to the same integer.

String shard keys with point queries

When queries always filter on the exact shard key value (no range conditions), STR_HASH routes efficiently for any CHAR or VARCHAR column—such as transaction IDs, logistics tracking numbers, or user identifiers.

Examples

All examples use order_id VARCHAR(32) as the shard key, targeting 4 database shards and 8 table shards (2 table shards per database shard).

Route by the last 4 characters as an integer

CREATE TABLE test_str_hash_tb (
    id INT NOT NULL AUTO_INCREMENT,
    order_id VARCHAR(32) NOT NULL,
    create_time DATETIME DEFAULT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
dbpartition BY STR_HASH(`order_id`, -1, 4, 1)
tbpartition BY STR_HASH(`order_id`, -1, 4, 1) tbpartitions 2;

Route by characters 3 through 7 (positions 2–7) as a string

CREATE TABLE test_str_hash_tb (
    id INT NOT NULL AUTO_INCREMENT,
    order_id VARCHAR(32) NOT NULL,
    create_time DATETIME DEFAULT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
dbpartition BY STR_HASH(`order_id`, 2, 7)
tbpartition BY STR_HASH(`order_id`, 2, 7) tbpartitions 2;

Route by the first 5 characters as a string

CREATE TABLE test_str_hash_tb (
    id INT NOT NULL AUTO_INCREMENT,
    order_id VARCHAR(32) NOT NULL,
    create_time DATETIME DEFAULT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
dbpartition BY STR_HASH(`order_id`, 5, -1)
tbpartition BY STR_HASH(`order_id`, 5, -1) tbpartitions 2;

FAQ

What's the difference between STR_HASH and HASH?

Both functions route rows using a string shard key, but they use different algorithms:

STR_HASHHASH
Substring extractionSupportedNot supported
Routing algorithmUNI_HASHSimple modulo on hashCode
One-to-one shard mappingAchievable (when substring is deterministic)Not guaranteed—different strings may hash to the same integer

Use STR_HASH when you need deterministic one-to-one mapping between a shard key value (or its substring) and a physical shard.