SQL syntax reference

更新时间:
复制 MD 格式

The Lindorm compute engine supports a standard SQL dialect compatible with Apache Doris. This reference covers the supported data types, SQL operators, and built-in functions available in OLAP resource groups.

Supported data types

Data typeExampleSizeRange / notes
TINYINT1271 byteThe value range depends on the length and whether the value has a sign.
SMALLINT327672 bytesThe value range depends on the length and whether the value has a sign.
INTEGER21474836474 bytesThe value range depends on the length and whether the value has a sign.
BIGINT92233720368547758078 bytesThe value range depends on the length and whether the value has a sign.
BOOLEANtrue1 byteValues: true or false
FLOAT1.2344 bytesSingle-precision. Use for monitoring and non-precision-critical scenarios.
DOUBLE1.234568 bytesDouble-precision. Supports scientific notation in SQL statements.
DECIMAL(P[,S])4873.6293048479VariableVariable-precision decimal. P (total digits): 1–38. S (digits after decimal point): 0–P. Storage space increases with precision (P value). Use FLOAT or DOUBLE when high precision is not required.
VARBINARYx'baba'VariableVariable-length binary data for raw binary storage.
STRING'abcde'VariableUp to 1,048,576 characters. Equivalent to VARCHAR(1048576).
DATE'2025-04-15'Date only (no time component).
DATETIME'2025-04-15 00:00:00.123456'Date and time combined.

SQL operators

OLAP resource groups are compatible with Apache Doris operators. The following sections describe each operator with its syntax, behavior, and examples.

Complete SELECT syntax

All SELECT clauses combine in the following order:

[WITH name AS (SELECT ...)]
SELECT [DISTINCT] expr_list
FROM table_or_subquery
[WHERE filter_condition]
[GROUP BY expr_list [HAVING condition]]
[ORDER BY expr_list [ASC | DESC] [NULLS FIRST | NULLS LAST]]
[LIMIT [offset,] count | LIMIT count OFFSET offset]

Projection

Syntax

SELECT expr0, expr1, ...

The SELECT clause projects columns or expressions from the source data. Projection expressions support column references, transformation functions, DISTINCT deduplication, and arithmetic. Use AS to alias projected columns.

Examples

-- Select all columns
SELECT * FROM nation;

-- Select a specific column
SELECT n_name FROM nation;

-- Apply a function
SELECT lower(n_name) FROM nation;

-- Deduplicate values
SELECT DISTINCT n_regionkey FROM nation;

-- Arithmetic expression
SELECT n_nationkey + 1 FROM nation;

-- Column alias
SELECT n_nationkey + 1 AS colname FROM nation;

Filter

Syntax

... WHERE expr0 [AND expr1] [OR expr2] ...

The WHERE clause filters rows by evaluating Boolean expressions. Supports AND, OR, LIKE fuzzy matching, and regular expression matching with regexp.

Examples

-- Equality filter
SELECT n_name FROM nation WHERE n_regionkey = 2;

-- IN list
SELECT n_nationkey FROM nation WHERE n_name IN ('CHINA', 'VIETNAM');

-- Range filter
SELECT n_name FROM nation WHERE n_nationkey BETWEEN 10 AND 20;

-- LIKE fuzzy match
SELECT n_name FROM nation WHERE n_name LIKE 'C%';

-- Regular expression match
SELECT n_name FROM nation WHERE n_name regexp '^(C|U).*';

-- Subquery filter
SELECT n_name FROM nation WHERE n_regionkey IN (SELECT r_regionkey FROM region WHERE r_name = 'ASIA');

Sort

Syntax

... ORDER BY expr0, expr1 [ASC | DESC] [NULLS FIRST | NULLS LAST]

ORDER BY sorts the result set by one or more columns. The default direction is ascending. Use NULLS FIRST or NULLS LAST to control where NULL values appear.

Examples

-- Default ascending sort
SELECT * FROM nation ORDER BY n_name;

-- Descending sort
SELECT * FROM nation ORDER BY n_name DESC;

-- NULLs first
SELECT * FROM nation ORDER BY n_name NULLS FIRST;

-- NULLs last
SELECT * FROM nation ORDER BY n_name NULLS LAST;

Pagination

Syntax

... LIMIT [offset,] count
-- or equivalently:
... LIMIT count OFFSET offset

LIMIT restricts the number of output rows. Combine with ORDER BY to get deterministic results.

Examples

-- First 5 rows
SELECT * FROM nation LIMIT 5;

-- 10 rows starting at offset 5
SELECT * FROM nation LIMIT 5, 10;

-- Ordered pagination: 5 rows starting at offset 3
SELECT * FROM nation ORDER BY n_name LIMIT 3, 5;

-- Equivalent using OFFSET keyword
SELECT * FROM nation ORDER BY n_name LIMIT 5 OFFSET 3;

Aggregation

Syntax

SELECT AGG_FUNC(expr), ... [GROUP BY expr_list] [HAVING condition]

GROUP BY groups rows by one or more columns and computes aggregates per group. Use HAVING to filter groups after aggregation.

Examples

-- Full-table count
SELECT count(*) FROM nation;

-- Maximum value
SELECT max(o_totalprice) FROM orders;

-- Count per group
SELECT c_nationkey, count(*) FROM customer GROUP BY c_nationkey;

-- Filter groups with HAVING
SELECT c_nationkey, count(*) AS user_count FROM customer GROUP BY c_nationkey HAVING user_count > 6000;

Common table expressions

Syntax

WITH name1 AS (SELECT ...) [, name2 AS (SELECT ...)]
SELECT ...

A Common Table Expression (CTE) defines a named temporary result set scoped to a single SQL statement. CTEs can be referenced multiple times in the main query and improve readability for complex queries.

Example

WITH china_users AS (
  SELECT c_custkey FROM customer WHERE c_nationkey = 18
),
canada_users AS (
  SELECT c_custkey FROM customer WHERE c_nationkey = 3
)
SELECT * FROM china_users
UNION ALL
SELECT * FROM canada_users;

Set operators

UNION

Syntax

query1 UNION [ALL | DISTINCT] query2

UNION combines results from two queries. UNION and UNION DISTINCT both remove duplicates. UNION ALL keeps all rows including duplicates.

Examples

-- UNION (removes duplicates, same as UNION DISTINCT)
SELECT n_regionkey FROM nation UNION SELECT n_regionkey FROM nation;

-- Explicit DISTINCT
SELECT n_regionkey FROM nation UNION DISTINCT SELECT n_regionkey FROM nation;

-- Keep duplicates
SELECT n_regionkey FROM nation UNION ALL SELECT n_regionkey FROM nation;

INTERSECT

Syntax

query1 INTERSECT query2

INTERSECT returns rows that appear in both queries, with duplicates removed.

Example

SELECT n_name FROM nation INTERSECT SELECT n_name FROM nation;

EXCEPT / MINUS

Syntax

query1 EXCEPT query2
-- or equivalently:
query1 MINUS query2

EXCEPT (or its alias MINUS) returns rows from the left query that do not appear in the right query, with duplicates removed.

Examples

SELECT n_name FROM nation EXCEPT SELECT n_name FROM nation;

SELECT n_name FROM nation MINUS SELECT n_name FROM nation;

Join

Syntax

... FROM table1 [LEFT | RIGHT | FULL] [OUTER | SEMI | ANTI] JOIN table2 [ON condition]

JOIN merges rows from two tables according to a join condition. Supported join types:

Join typeDescription
INNER JOINReturns rows with matching keys in both tables.
LEFT OUTER JOINReturns all rows from the left table; NULL-fills unmatched right rows.
RIGHT OUTER JOINReturns all rows from the right table; NULL-fills unmatched left rows.
FULL OUTER JOINReturns all rows from both tables; NULL-fills unmatched rows on either side.
CROSS JOINReturns the Cartesian product of both tables.
SELF JOINJoins a table to itself.
LEFT SEMI JOINReturns left-table rows that have at least one match in the right table.
RIGHT SEMI JOINReturns right-table rows that have at least one match in the left table.
LEFT ANTI JOINReturns left-table rows with no match in the right table.
RIGHT ANTI JOINReturns right-table rows with no match in the left table.

Examples

-- Implicit CROSS JOIN (comma syntax)
SELECT a.n_name, b.n_name FROM nation a, nation b;

-- Explicit JOIN (defaults to INNER JOIN)
SELECT a.n_name, b.n_name FROM nation a JOIN nation b;

-- CROSS JOIN
SELECT a.n_name, b.n_name FROM nation a CROSS JOIN nation b;

-- INNER JOIN with ON condition
SELECT r_name, n_name FROM nation JOIN region ON nation.n_regionkey = region.r_regionkey;

-- INNER JOIN using WHERE (equivalent)
SELECT r_name, n_name FROM nation, region WHERE r_regionkey = n_regionkey;

-- LEFT OUTER JOIN
SELECT r_name, n_name FROM nation LEFT OUTER JOIN region ON n_regionkey = r_regionkey;

-- RIGHT OUTER JOIN
SELECT r_name, n_name FROM nation RIGHT OUTER JOIN region ON n_regionkey = r_regionkey;

-- FULL OUTER JOIN
SELECT r_name, n_name FROM nation FULL OUTER JOIN region ON n_regionkey = r_regionkey;

-- LEFT SEMI JOIN
SELECT * FROM nation LEFT SEMI JOIN region ON n_regionkey = r_regionkey;

-- RIGHT SEMI JOIN
SELECT * FROM nation RIGHT SEMI JOIN region ON n_regionkey = r_regionkey;

-- LEFT ANTI JOIN
SELECT * FROM nation LEFT ANTI JOIN region ON n_regionkey = r_regionkey;

-- RIGHT ANTI JOIN
SELECT * FROM nation RIGHT ANTI JOIN region ON n_regionkey = r_regionkey;

Window functions

Syntax

window_function() OVER ([PARTITION BY expr] [ORDER BY expr] [ROWS BETWEEN ... AND ...])

Window functions compute a value for each row based on a set of related rows defined by the OVER clause. Unlike GROUP BY aggregates, they return one value per row without collapsing the result set. This lets you add computed columns alongside the original data—useful for ranking, running totals, and trend analysis.

Examples

-- Row number across all rows, ordered by region key
SELECT row_number() OVER (ORDER BY n_regionkey), n_regionkey, n_name FROM nation;

-- Row number within each partition
SELECT row_number() OVER (PARTITION BY n_regionkey ORDER BY n_regionkey), n_regionkey, n_name FROM nation;

-- Running total of order prices per year, per customer
SELECT
  o_orderdate,
  o_totalprice,
  sum(o_totalprice) OVER (
    PARTITION BY substr(o_orderdate, 1, 4)
    ORDER BY o_orderdate
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS sum
FROM orders
WHERE o_custkey = 53978
ORDER BY o_orderdate;

Common functions

OLAP resource groups support most standard SQL functions. The following sections list commonly used functions by category.

System info

SignatureExampleDescription
VARCHAR CATALOG()CATALOG()Returns the name of the current catalog.
VARCHAR DATABASE()DATABASE()Returns the name of the current database.
VARCHAR CURRENT_USER()CURRENT_USER()Returns the name of the current connected user.
VARCHAR WEBUI()WEBUI()Returns the WebUI address of the resource group, matching what is displayed in the console.

Type conversion and string functions

SignatureExampleDescription
CAST(input AS type)CAST('9.2' AS double)Converts a value from one data type to another.
INT CHAR_LENGTH(VARCHAR str)CHAR_LENGTH('abc')Returns the number of characters in a string.
VARCHAR CONCAT(VARCHAR, ...)CONCAT('b', 'a')Concatenates two or more strings.
INT INSTR(VARCHAR str, VARCHAR substr)INSTR('alibaba', 'ba')Returns the 1-based position of the first occurrence of substr in str. Returns 0 if not found.
INT STRCMP(VARCHAR lhs, VARCHAR rhs)STRCMP('abc', 'bbc')Compares two strings lexicographically. Returns -1 if lhs < rhs, 0 if equal, 1 if lhs > rhs.
VARCHAR SUBSTR(VARCHAR str, pos[, len])SUBSTR('alibaba', 4)Returns a substring starting at the given 1-based position, optionally limited to len characters.
VARCHAR LOWER(VARCHAR str)LOWER('Alibaba')Converts a string to lowercase.
VARCHAR UPPER(VARCHAR str)UPPER('Alibaba')Converts a string to uppercase.

JSON functions

SignatureExampleDescription
JSON PARSE_JSON(VARCHAR str)PARSE_JSON('{"key":"value"}')Parses a JSON string into a JSON object.
VARCHAR JSON_STRING(JSON j)JSON_STRING(json_object_expr)Serializes a JSON object to a string.
JSON -> VARCHAR pathPARSE_JSON('{"key":"value"}') -> '$.key'Reads the element at the given path from a JSON object.
JSON_KEYS(VARCHAR|JSON json_obj)JSON_KEYS('{"a":1,"b":2,"c":3}')Returns an array of top-level keys in the JSON object.
INT JSON_LENGTH(VARCHAR|JSON json_obj)JSON_LENGTH('[1,2,3]')Returns the number of elements in a JSON object or array.
BOOL JSON_EXISTS(VARCHAR|JSON json_obj, VARCHAR path)JSON_EXISTS('{"key":1}', '$.key')Returns whether the JSON object contains the specified path.
INT GET_JSON_INT(VARCHAR|JSON json_obj, VARCHAR path)GET_JSON_INT('{"key":2}', '$.key')Returns the integer value at the specified path.
DOUBLE GET_JSON_DOUBLE(VARCHAR|JSON json_obj, VARCHAR path)GET_JSON_DOUBLE('{"key":2.2}', '$.key')Returns the double value at the specified path.
VARCHAR GET_JSON_STRING(VARCHAR|JSON json_obj, VARCHAR path)GET_JSON_STRING('{"key":{"key2":2}}', '$.key')Returns the string representation of the value at the specified path.
JSON_EACH(VARCHAR|JSON json_obj)SELECT * FROM jsontable, LATERAL JSON_EACH(col_json)Table function for use with LATERAL JOIN. Expands a JSON column into multiple rows, appending key-value pairs to each original row.

Date and time functions

SignatureExampleDescription
DATETIME NOW()NOW()Returns the current timestamp, accurate to milliseconds.
DATE CURDATE()CURDATE()Returns the current date.
BIGINT UNIX_TIMESTAMP(DATETIME date)UNIX_TIMESTAMP('2021-01-07 14:13:20')Converts a datetime value to a Unix timestamp.
VARCHAR FROM_UNIXTIME(BIGINT unix_timestamp[, VARCHAR format])FROM_UNIXTIME(1610000000)Converts a Unix timestamp to a formatted string.
BIGINT DATE_DIFF(VARCHAR unit, DATETIME expr1, DATETIME expr2)DATE_DIFF('hour', '2021-01-07 14:13:20', '2021-01-05 14:13:20')Returns the difference between two datetimes in the specified unit (e.g., 'hour', 'day').
DATETIME DATE_ADD(DATETIME|DATE date, INTERVAL expr type)DATE_ADD('2021-01-07 14:13:20', INTERVAL 2 DAY)Adds a time interval to a date or datetime.
DATETIME DATE_SUB(DATETIME|DATE date, INTERVAL expr type)DATE_SUB('2021-01-07 14:13:20', INTERVAL 2 DAY)Subtracts a time interval from a date or datetime.
VARCHAR DATE_FORMAT(DATETIME date, VARCHAR format)DATE_FORMAT('2021-01-07 14:13:20', '%W %M %Y')Formats a datetime value as a string using the specified format pattern.

Aggregation functions

SignatureExampleDescription
AVG([DISTINCT] expr)AVG(o_totalprice), AVG(DISTINCT o_totalprice)Returns the average value. With DISTINCT, duplicates are removed before averaging.
COUNT(expr)COUNT(*), COUNT(col_name)COUNT(*) counts all rows. COUNT(col_name) counts rows where the column is not NULL.
COUNT(DISTINCT expr [, expr, ...])COUNT(DISTINCT col1, col2)Returns the number of distinct value combinations.
COUNT_IF(condition)COUNT_IF(o_totalprice > 150000)Returns the number of rows that satisfy the condition.
MAX(expr)MAX(o_totalprice)Returns the maximum value.
MAX_BY(x, y)MAX_BY(o_custkey, o_totalprice)Returns the value of x from the row where y is at its maximum—for example, the customer ID from the highest-value order.
MIN(expr)MIN(o_totalprice)Returns the minimum value.
MIN_BY(x, y)MIN_BY(o_custkey, o_totalprice)Returns the value of x from the row where y is at its minimum—for example, the customer ID from the lowest-value order.
SUM([DISTINCT] expr)SUM(o_totalprice), SUM(DISTINCT o_totalprice)Returns the sum. With DISTINCT, duplicates are removed before summing.

Window functions

All window functions use the OVER clause to define the window frame.

Aggregate window functions

SignatureExampleDescription
AVG(expr)AVG(column_name)Returns the average value within the window. Ignores NULL values.
COUNT(expr)COUNT(column_name)Returns the count of rows within the window.
MAX(expr)MAX(column_name)Returns the maximum value within the window.
MIN(expr)MIN(column_name)Returns the minimum value within the window.
SUM(expr)SUM(column_name)Returns the sum of values within the window.

Value window functions

SignatureExampleDescription
FIRST_VALUE(expr [IGNORE NULLS])FIRST_VALUE(column_name IGNORE NULLS)Returns the first value in the window frame. Use IGNORE NULLS to skip NULL values.
LAST_VALUE(expr [IGNORE NULLS])LAST_VALUE(column_name IGNORE NULLS)Returns the last value in the window frame. Opposite of FIRST_VALUE.
LAG(expr [IGNORE NULLS] [, offset[, default]])LAG(column_name, 2, 0)Returns the value from offset rows before the current row. Returns default if no such row exists. Use for row-over-row comparisons.
LEAD(expr [IGNORE NULLS] [, offset[, default]])LEAD(column_name IGNORE NULLS, 3)Returns the value from offset rows after the current row. Returns default if no such row exists.

Usage notes

  • LAG and LEAD default to an offset of 1 and a default value of NULL when those arguments are omitted.

  • Use IGNORE NULLS with FIRST_VALUE, LAST_VALUE, LAG, and LEAD to skip NULL values when scanning the window frame.

  • FIRST_VALUE and LAST_VALUE results depend on the ORDER BY and frame bounds defined in the OVER clause. Without an explicit frame, the default frame extends to the end of the partition, which can produce unexpected results for LAST_VALUE.

Ranking functions

All three ranking functions assign a numeric rank to each row within a partition. They differ in how they handle ties:

FunctionTies get same rank?Gaps after ties?Values always consecutive?
RANK()YesYesNo
DENSE_RANK()YesNoYes
ROW_NUMBER()N/A (unique per row)N/AYes

Usage notes

  • RANK(): Tied rows share the same rank, and the next rank skips the occupied positions. For example, if two rows tie at rank 1, the next row gets rank 3.

  • DENSE_RANK(): Tied rows share the same rank, but the next rank is always the next integer. For the same example, the next row gets rank 2.

  • ROW_NUMBER(): Assigns a unique, continuously incrementing integer to every row regardless of value—no ties, no gaps.

SignatureExample
RANK()RANK()
DENSE_RANK()DENSE_RANK()
ROW_NUMBER()ROW_NUMBER()