Full-text index and query acceleration support

更新时间:
复制 MD 格式

SelectDB supports full-text search through a set of MATCH_* operators backed by an inverted index, and accelerates standard SQL predicates on indexed columns using the same index structure.

Full-text search operators

All MATCH_* operators share the same syntax: WHERE <column> <operator> '<keywords>'.

Operator What it matches
MATCH_ANY Rows containing any of the specified keywords
MATCH_ALL Rows containing all of the specified keywords
MATCH_PHRASE Rows where keywords appear adjacent and in order
MATCH_PHRASE (with slop) Rows where keywords appear within a maximum token distance
MATCH_PHRASE (strict order) Rows where keywords appear in order within a maximum token distance
MATCH_PHRASE_PREFIX Rows where keywords appear as a phrase with the last token matched as a prefix
MATCH_REGEXP Rows where any post-tokenization token matches the regular expression
MATCH_PHRASE_EDGE Rows where the first token is matched as a suffix, middle tokens exactly, and the last token as a prefix

MATCH_ANY

Returns rows where the column contains at least one of the specified keywords.

SELECT * FROM table_name WHERE content MATCH_ANY 'keyword1 keyword2';

MATCH_ALL

Returns rows where the column contains every specified keyword.

SELECT * FROM table_name WHERE content MATCH_ALL 'keyword1 keyword2';

MATCH_PHRASE

Returns rows where all keywords appear adjacent and in the same order.

To accelerate MATCH_PHRASE queries with an index, set "support_phrase" = "true" in the index properties.
SELECT * FROM table_name WHERE content MATCH_PHRASE 'keyword1 keyword2';

MATCH_PHRASE with slop

Allows up to N other tokens between the keywords, where N is the slop value appended as ~N.

-- Allows a maximum of 3 other tokens between keyword1 and keyword2
SELECT * FROM table_name WHERE content MATCH_PHRASE 'keyword1 keyword2 ~3';

MATCH_PHRASE with strict order

Combines slop with + to enforce keyword order. Keywords must appear within N tokens of each other and in the specified order.

SELECT * FROM table_name WHERE content MATCH_PHRASE 'keyword1 keyword2 ~3+';

MATCH_PHRASE_PREFIX

Performs a phrase match where the last token is matched as a prefix. If only one token is provided, it performs a prefix match on that single token.

-- The last token is matched as a prefix
SELECT * FROM table_name WHERE content MATCH_PHRASE_PREFIX 'keyword1 key';

-- A single token defaults to a prefix match
SELECT * FROM table_name WHERE content MATCH_PHRASE_PREFIX 'keyword1';

MATCH_REGEXP

Matches rows where any token — after tokenization — satisfies the regular expression.

SELECT * FROM table_name WHERE content MATCH_REGEXP '^key_word.*';

MATCH_PHRASE_EDGE

Performs an edge phrase match: the first token is matched as a suffix, middle tokens are matched exactly, and the last token is matched as a prefix. All tokens must be adjacent.

SELECT * FROM table_name WHERE content MATCH_PHRASE_EDGE 'search engine optim';

Inverted index query acceleration

The inverted index accelerates standard SQL predicates in addition to full-text search. When an inverted index is defined on a column, the following operators and functions automatically use index acceleration.

Category Operators / Functions
Equality and collection =, !=, IN, NOT IN
Range >, >=, <, <=, BETWEEN
Null value checks IS NULL, IS NOT NULL
Array functions array_contains, array_overlaps
-- Range: returns rows where price is between 100 (inclusive) and 200 (exclusive)
SELECT * FROM t WHERE price >= 100 AND price < 200;

-- Collection: returns rows where tags contains 'a', 'b', or 'c'
SELECT * FROM t WHERE tags IN ('a', 'b', 'c');

-- Array: returns rows where the attributes array contains 'color'
SELECT * FROM t WHERE array_contains(attributes, 'color');