DML-ApsaraDB for HBase(Hbase)-阿里云帮助中心

Lindorm Cassandra Query Language (CQL) supports three data manipulation statements: SELECT, UPDATE, and DELETE.

Syntax conventions

Symbol	Description
`UPPERCASE`	Keyword. Type exactly as shown.
`lowercase`	Non-literal placeholder. Replace with an actual value.
`[ ]`	Optional. Omit the brackets when you type the clause.
`( )`	Group. Choose from the items inside.
`\|`	Or. Choose one of the alternatives.
`*`	Repeatable. The preceding element can appear zero or more times.
`::=`	Is defined as.

SELECT

Queries one or more columns from one or more rows in a table.

Lindorm CQL does not support JOIN clauses or subqueries. SELECT applies to a single table only.

Syntax

select_statement ::=  SELECT [ JSON | HOTDATA ] ( select_clause | '*' )
                      FROM table_name
                      [ WHERE where_clause ]
                      [ GROUP BY group_by_clause ]
                      [ ORDER BY ordering_clause ]
                      [ PER PARTITION LIMIT (integer | bind_marker) ]
                      [ LIMIT (integer | bind_marker) ]
                      [ ALLOW FILTERING ]
select_clause    ::=  selector [ AS identifier ] ( ',' selector [ AS identifier ] )
selector         ::=  column_name
                      | term
                      | CAST '(' selector AS cql_type ')'
                      | function_name '(' [ selector ( ',' selector )* ] ')'
                      | COUNT '(' '*' ')'
where_clause     ::=  relation ( AND relation )*
relation         ::=  column_name operator term
                      '(' column_name ( ',' column_name )* ')' operator tuple_literal
                      TOKEN '(' column_name ( ',' column_name )* ')' operator term
operator         ::=  '=' | '<' | '>' | '<=' | '>=' | '!=' | IN | CONTAINS | CONTAINS KEY
group_by_clause  ::=  column_name ( ',' column_name )*
ordering_clause  ::=  column_name [ ASC | DESC ] ( ',' column_name [ ASC | DESC ] )*

Examples

-- Basic query with IN filter
CREATE TABLE persioninfo (c1 int PRIMARY KEY, c2 text, c3 text);
SELECT c1, c2 FROM persioninfo WHERE c1 IN (199, 200, 207);

-- Return results as JSON
SELECT JSON c1, c2 FROM persioninfo WHERE c1 = 199;

-- Use column aliases
SELECT c1 AS name, c2 AS address FROM persioninfo;

-- Range query on a clustering column
CREATE TABLE persioninfo (c1 int, c2 text, c3 text, PRIMARY KEY(c1, c2));
SELECT * FROM persioninfo
  WHERE c1 = 12
    AND c2 > '2011-02-03'
    AND c2 <= '2012-01-01';

-- Count rows
SELECT COUNT(*) AS user_count FROM persioninfo;

SELECT clause

A SELECT clause specifies which columns to return. It can be a wildcard (*) to return all columns, or a comma-separated list of selectors.

Supported selectors

Selector	Description
`column_name`	Returns the values of the specified column.
`term`	A literal value. Typically nested inside a function.
`CAST(selector AS cql_type)`	Converts the result of a nested selector to another CQL type.
`function_name(selector, ...)`	Calls a function with selectors as arguments.
`COUNT(*)`	Counts non-empty results.

WRITETIME function

WRITETIME(column_name) is available only in SELECT clauses. It returns the timestamp metadata stored with each column value. It accepts exactly one argument: a column name.

WHERE clause

The WHERE clause filters which rows to return. Relations can be defined on primary key columns or columns with a secondary index.

Key constraints:

Non-equality relations (<, >, <=, >=) are not allowed on partition key columns. The IN operator is treated as an equality relation.
For clustering columns, relations must select a contiguous set of rows.

Allowed vs. not allowed

The following query is allowed because all three primary key columns are constrained in order:

-- Allowed: userid (partition key) + blog_title + posted_at (clustering columns in order)
CREATE TABLE posts (
    userid text,
    blog_title text,
    posted_at timestamp,
    entry_title text,
    content text,
    category int,
    PRIMARY KEY (userid, blog_title, posted_at)
);

SELECT entry_title, content FROM posts
  WHERE userid = 'john doe'
    AND blog_title = 'John''s Blog'
    AND posted_at >= '2012-01-01' AND posted_at < '2012-01-31';

The following query is not allowed because blog_title is skipped, which breaks the contiguous row requirement:

-- Not allowed: blog_title must be constrained before posted_at
SELECT entry_title, content FROM posts
  WHERE userid = 'john doe'
    AND posted_at >= '2012-01-01' AND posted_at < '2012-01-31';

IN relation

The IN relation is allowed only on the last column of the partition key and the last column of the primary key.

Tuple notation for clustering columns

Tuple notation lets you combine multiple clustering columns in a single relation:

-- Rows where (blog_title, posted_at) is greater than ('John''s Blog', '2012-01-01')
SELECT * FROM posts
  WHERE userid = 'john doe'
    AND (blog_title, posted_at) > ('John''s Blog', '2012-01-01');

Tuple notation also works with IN:

SELECT * FROM posts
  WHERE userid = 'john doe'
    AND (blog_title, posted_at) IN (('John''s Blog', '2012-01-01'), ('Extreme Chess', '2014-06-01'));

GROUP BY clause

GROUP BY groups rows that share the same column values into a single row.

Constraints:

Group only at the partition key or clustering column level.
Column names in GROUP BY must follow the primary key column order.
Equality-constrained primary key columns cannot appear in GROUP BY.

GROUP BY transfers data to a server node for aggregation. Avoid it on large datasets, as it may significantly increase query latency.

ORDER BY clause

ORDER BY controls the sort order of results. Specify column names followed by ASC (ascending, the default) or DESC (descending).

The valid orderings depend on the table's CLUSTERING ORDER definition:

If no CLUSTERING ORDER is defined: you can sort by clustering column order or its reverse.
If CLUSTERING ORDER is defined: you can sort by that order or its reverse.

LIMIT clause

LIMIT restricts the number of rows returned. Use PER PARTITION LIMIT to cap results per partition.

ALLOW FILTERING

By default, Lindorm CQL rejects queries that require server-side filtering — queries whose performance cannot be predicted from the number of returned results alone. You can use the ALLOW FILTERING clause to explicitly run queries that contain filters. After you specify ALLOW FILTERING, the performance of the query may be unpredictable.

Example

Given the following table and index:

CREATE TABLE users (
    username text PRIMARY KEY,
    firstname text,
    lastname text,
    birth_year int,
    country text
);

CREATE INDEX ON users(birth_year);

Both of these queries are allowed because Lindorm CQL can guarantee their performance is proportional to the result size:

SELECT * FROM users;
SELECT * FROM users WHERE birth_year = 1981;

This query is rejected by default because it may scan all indexed rows for birth_year = 1981 even if few users are from France:

-- Rejected without ALLOW FILTERING
SELECT * FROM users WHERE birth_year = 1981 AND country = 'FR';

Add ALLOW FILTERING to run it explicitly:

SELECT * FROM users WHERE birth_year = 1981 AND country = 'FR' ALLOW FILTERING;

HOTDATA

When a table is created with hot/cold data separation, use the HOTDATA keyword to query only the hot data tier:

SELECT HOTDATA * FROM persioninfo;

Without HOTDATA, the query scans both hot and cold data and returns the combined results.

Operations on search indexes

Lindorm CQL supports fuzzy queries, multi-dimensional queries, and sorting on columns that have a search index.

Fuzzy matching with LIKE

Use % as a wildcard to match any number of characters:

Pattern	Match type	Example
`LIKE 'v2'`	Exact	Value equals `v2`
`LIKE '%v2'`	Suffix	Value ends with `v2`
`LIKE 'v2%'`	Prefix	Value starts with `v2`
`LIKE '%v2%'`	Contains	Value contains `v2`

-- Exact match
SELECT * FROM persioninfo WHERE c2 LIKE 'v2';
-- Suffix match
SELECT * FROM kss.tb WHERE c2 LIKE '%v2';
-- Prefix match
SELECT * FROM kss.tb WHERE c2 LIKE 'v2%';
-- Contains match
SELECT * FROM kss.tb WHERE c2 LIKE '%v2%';

search_query keyword

search_query lets you embed a Solr-compatible query expression directly in a SELECT statement:

SELECT selectors
  FROM table
  WHERE (indexed_column_expression | search_query = 'search_expression')
  [ LIMIT n ]
  [ ORDER BY column_name ];

search_query cannot be combined with other WHERE conditions. For example, WHERE search_query = 'c1:name' AND column = 'a' is not supported.

Supported search_expression options

Option	Type	Description
`q`	string	Main query expression
`fq`	string or array of strings	Filter query expressions
`facet`	object	Facet query expression
`sort`	string	Sort expression
`start`	number	Starting index for pagination
`timeAllowed`	number	Query time limit (ms)
`TZ`	string	Time zone (any valid Java TimeZone zone ID)
`paging`	string	Pagination mode (`"driver"`)
`distrib.singlePass`	boolean	Single-pass distributed query
`shards.failover`	boolean	Fail over to other shards on error. Default: `true`
`shards.tolerant`	boolean	Continue when some shards fail. Default: `false`
`commit`	boolean	Commit before querying
`route.partition`	array of strings	Partition routing expression
`route.range`	array of strings	Range routing expression
`query.name`	string	Query name for monitoring

For the full list of characters and syntax supported in search_expression, see Local parameters in queries.

Example

CREATE TABLE persioninfo (
    name text PRIMARY KEY,
    city text,
    id int,
    friend nested
) WITH extensions = {'CONSISTENCY_TYPE': 'strong', 'MUTABILITY': 'IMMUTABLE'};

CREATE SEARCH INDEX sidx ON persioninfo WITH COLUMNS (id, city);

INSERT INTO persioninfo (name, city, id, friend)
  VALUES ('MY', 'hangzhou', 18, '[{"lili":18}, {"gg":20}]');

-- Query using a Solr expression
SELECT name, id FROM persioninfo
  WHERE search_query = '+city:hangzhou +age:[1 TO 18]';

UPDATE

Updates one or more columns for specified rows in a table.

Syntax

update_statement ::=  UPDATE table_name
                      [ USING update_parameter ( AND update_parameter )* ]
                      SET assignment ( ',' assignment )*
                      WHERE where_clause
update_parameter ::=  ( integer | bind_marker )
assignment       ::=  simple_selection '=' term
                     | column_name '=' column_name ( '+' | '-' ) term
                     | column_name '=' list_literal '+' column_name
simple_selection ::=  column_name
                     | column_name '[' term ']'
                     | column_name '.' field_name
condition        ::=  simple_selection operator term

Key behaviors

The WHERE clause must include all primary key columns.
Unlike SQL, UPDATE does not check whether the row exists. If the row exists, it is updated. If not, it is created. After the statement runs, there is no way to determine which action occurred. This is called upsert behavior.
All updates within the same partition key are applied atomically and in isolation.
Counter columns support increment and decrement operations only (for example, c = c + 3). The column name on both sides of the = must be the same. Unlike Apache Cassandra CQL, Lindorm CQL allows counter columns and non-counter columns to coexist in the same table.

Example

CREATE TABLE product (productname text PRIMARY KEY, company text, subdepartment text);

UPDATE product
  SET company       = 'aliyun cloud',
      subdepartment = 'database nosql'
  WHERE productname = 'lindorm';

DELETE

Deletes rows or specific columns from a table.

Syntax

delete_statement ::=  DELETE [ simple_selection ( ',' simple_selection ) ]
                      FROM table_name
                      [ USING update_parameter ( AND update_parameter )* ]
                      WHERE where_clause

Key behaviors

To delete specific columns, list them after the DELETE keyword. The rows identified by the WHERE clause are kept; only those columns are removed.
To delete entire rows, omit the column list after DELETE.
Use IN in the WHERE clause to delete multiple rows in a single statement.
Range-based row deletion is not supported.
All delete operations within the same partition key are applied atomically and in isolation.

Examples

Delete a specific column from matching rows

INSERT INTO t1 (c1, c2, c3) VALUES ('11', '22', '33');
DELETE c2 FROM t1 WHERE c1 IN ('11');

Delete entire rows

-- Omit the column list to delete the entire row
DELETE FROM t1 WHERE c1 = '11';