全局二级索引支持在指定列上建立索引,生成的索引表中数据按您指定的索引列进行排序,主表的每一个写入都将自动异步同步到索引表。您只向主表中写入数据,根据索引表进行查询,在许多场景下,将极大的提高查询的效率。本文为您介绍电话话单查询场景下如何使用全局二级索引。

以我们常见的电话话单查询为例,建立主表如下:
CellNumber StartTime(Unix时间戳) CalledNumber Duration BaseStationNumber
123456 1532574644 654321 60 1
234567 1532574714 765432 10 1
234567 1532574734 123456 20 3
345678 1532574795 123456 5 2
345678 1532574861 123456 100 2
456789 1532584054 345678 200 3
  • CellNumberStartTime作为表的联合主键,分别代表主叫号码通话发生时间
  • CalleNumberDurationBaseStationNumber三列为表的预定义列,分别代表被叫号码通话时长基站号码

每次用户通话结束后,都会将此次通话的信息记录到该表中。可以分别在被叫号码基站号码列上建立二级索引,来满足不同角度的查询需求(具体建立索引的示例代码见附录)。

假设有以下几种查询需求:

  • 查询号码234567的所有主叫话单。

    由于表格存储为全局有序模型,所有行按主键进行排序,并且提供顺序扫描(getRange)接口,所以只需要在调用getRange接口时,将PK0列的最大及最小值均设置为234567,PK1列(通话发生时间)的最小值设置为0,最大值设置为INT_MAX,对主表进行扫描即可:

    private static void getRangeFromMainTable(SyncClient client, long cellNumber)
    {
        RangeRowQueryCriteria rangeRowQueryCriteria = new RangeRowQueryCriteria(TABLE_NAME);
    
        // 构造主键
        PrimaryKeyBuilder startPrimaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
        startPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_1, PrimaryKeyValue.fromLong(cellNumber));
        startPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_2, PrimaryKeyValue.fromLong(0));
        rangeRowQueryCriteria.setInclusiveStartPrimaryKey(startPrimaryKeyBuilder.build());
    
        // 构造主键
        PrimaryKeyBuilder endPrimaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
        endPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_1, PrimaryKeyValue.fromLong(cellNumber));
        endPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_2, PrimaryKeyValue.INF_MAX);
        rangeRowQueryCriteria.setExclusiveEndPrimaryKey(endPrimaryKeyBuilder.build());
    
        rangeRowQueryCriteria.setMaxVersions(1);
    
        String strNum = String.format("%d", cellNumber);
        System.out.println("号码" + strNum + "的所有主叫话单:");
        while (true) {
            GetRangeResponse getRangeResponse = client.getRange(new GetRangeRequest(rangeRowQueryCriteria));
            for (Row row : getRangeResponse.getRows()) {
                System.out.println(row);
            }
    
            // 若nextStartPrimaryKey不为null, 则继续读取.
            if (getRangeResponse.getNextStartPrimaryKey() != null) {
                rangeRowQueryCriteria.setInclusiveStartPrimaryKey(getRangeResponse.getNextStartPrimaryKey());
            } else {
                break;
            }
        }
    }
  • 查询号码123456的被叫话单。

    表格存储的模型是对所有行按照主键进行排序,由于被叫号码存在于表的预定义列中,所以无法进行快速查询。因此可以在被叫号码索引表上进行查询。

    索引表IndexOnBeCalledNumber

    PK0 PK1 PK2
    CalledNumber CellNumber StartTime
    123456 234567 1532574734
    123456 345678 1532574795
    123456 345678 1532574861
    654321 123456 1532574644
    765432 234567 1532574714
    345678 456789 1532584054
    说明 系统会自动进行索引列补齐。即把主表的PK添加到索引列后面,共同作为索引表的PK。所以索引表中有三列PK。

    由于索引表IndexOnBeCalledNumber是按被叫号码作为主键,可以直接扫描索引表得到结果:

    private static void getRangeFromIndexTable(SyncClient client, long cellNumber) {
        RangeRowQueryCriteria rangeRowQueryCriteria = new RangeRowQueryCriteria(INDEX0_NAME);
    
        // 构造主键
        PrimaryKeyBuilder startPrimaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
        startPrimaryKeyBuilder.addPrimaryKeyColumn(DEFINED_COL_NAME_1, PrimaryKeyValue.fromLong(cellNumber));
        startPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_1, PrimaryKeyValue.INF_MIN);
        startPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_2, PrimaryKeyValue.INF_MIN);
        rangeRowQueryCriteria.setInclusiveStartPrimaryKey(startPrimaryKeyBuilder.build());
    
        // 构造主键
        PrimaryKeyBuilder endPrimaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
        endPrimaryKeyBuilder.addPrimaryKeyColumn(DEFINED_COL_NAME_1, PrimaryKeyValue.fromLong(cellNumber));
        endPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_1, PrimaryKeyValue.INF_MAX);
        endPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_2, PrimaryKeyValue.INF_MAX);
        rangeRowQueryCriteria.setExclusiveEndPrimaryKey(endPrimaryKeyBuilder.build());
    
        rangeRowQueryCriteria.setMaxVersions(1);
    
        String strNum = String.format("%d", cellNumber);
        System.out.println("号码" + strNum + "的所有被叫话单:");
        while (true) {
            GetRangeResponse getRangeResponse = client.getRange(new GetRangeRequest(rangeRowQueryCriteria));
            for (Row row : getRangeResponse.getRows()) {
                System.out.println(row);
            }
    
            // 若nextStartPrimaryKey不为null, 则继续读取.
            if (getRangeResponse.getNextStartPrimaryKey() != null) {
                rangeRowQueryCriteria.setInclusiveStartPrimaryKey(getRangeResponse.getNextStartPrimaryKey());
            } else {
                break;
            }
        }
    }
  • 查询基站002从时间1532574740开始的所有话单。

    与上述示例类似,但是查询不仅把基站号码列作为条件,同时把通话发生时间列作为查询条件,因此我们可以在基站号码通话发生时间列上建立组合索引。

    索引表IndexOnBaseStation1

    PK0 PK1 PK2
    BaseStationNumber StartTime CellNumber
    001 1532574644 123456
    001 1532574714 234567
    002 1532574795 345678
    002 1532574861 345678
    003 1532574734 234567
    003 1532584054 456789

    然后在IndexOnBaseStation1索引表上进行查询:

    private static void getRangeFromIndexTable(SyncClient client,
                                               long baseStationNumber,
                                               long startTime) {
        RangeRowQueryCriteria rangeRowQueryCriteria = new RangeRowQueryCriteria(INDEX1_NAME);
    
        // 构造主键
        PrimaryKeyBuilder startPrimaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
        startPrimaryKeyBuilder.addPrimaryKeyColumn(DEFINED_COL_NAME_3, PrimaryKeyValue.fromLong(baseStationNumber));
        startPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_2, PrimaryKeyValue.fromLong(startTime));
        startPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_1, PrimaryKeyValue.INF_MIN);
        rangeRowQueryCriteria.setInclusiveStartPrimaryKey(startPrimaryKeyBuilder.build());
    
        // 构造主键
        PrimaryKeyBuilder endPrimaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
        endPrimaryKeyBuilder.addPrimaryKeyColumn(DEFINED_COL_NAME_3, PrimaryKeyValue.fromLong(baseStationNumber));
        endPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_2, PrimaryKeyValue.INF_MAX);
        endPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_1, PrimaryKeyValue.INF_MAX);
        rangeRowQueryCriteria.setExclusiveEndPrimaryKey(endPrimaryKeyBuilder.build());
    
        rangeRowQueryCriteria.setMaxVersions(1);
    
        String strBaseStationNum = String.format("%d", baseStationNumber);
        String strStartTime = String.format("%d", startTime);
        System.out.println("基站" + strBaseStationNum + "从时间" + strStartTime + "开始的所有被叫话单:");
        while (true) {
            GetRangeResponse getRangeResponse = client.getRange(new GetRangeRequest(rangeRowQueryCriteria));
            for (Row row : getRangeResponse.getRows()) {
                System.out.println(row);
            }
    
            // 若nextStartPrimaryKey不为null, 则继续读取.
            if (getRangeResponse.getNextStartPrimaryKey() != null) {
                rangeRowQueryCriteria.setInclusiveStartPrimaryKey(getRangeResponse.getNextStartPrimaryKey());
            } else {
                break;
            }
        }
    }
  • 查询发生在基站003上时间从15325748611532584054的所有通话记录的通话时长。

    在该查询中不仅把基站号码列与通话发生时间列作为查询条件,而且只把通话时长列作为返回结果。您可以上一个查询中的索引,查索引表成功后反查主表得到通话时长:

    private static void getRowFromIndexAndMainTable(SyncClient client,
                                                    long baseStationNumber,
                                                    long startTime,
                                                    long endTime,
                                                    String colName) {
        RangeRowQueryCriteria rangeRowQueryCriteria = new RangeRowQueryCriteria(INDEX1_NAME);
    
        // 构造主键
        PrimaryKeyBuilder startPrimaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
        startPrimaryKeyBuilder.addPrimaryKeyColumn(DEFINED_COL_NAME_3, PrimaryKeyValue.fromLong(baseStationNumber));
        startPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_2, PrimaryKeyValue.fromLong(startTime));
        startPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_1, PrimaryKeyValue.INF_MIN);
        rangeRowQueryCriteria.setInclusiveStartPrimaryKey(startPrimaryKeyBuilder.build());
    
        // 构造主键
        PrimaryKeyBuilder endPrimaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
        endPrimaryKeyBuilder.addPrimaryKeyColumn(DEFINED_COL_NAME_3, PrimaryKeyValue.fromLong(baseStationNumber));
        endPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_2, PrimaryKeyValue.fromLong(endTime));
        endPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_1, PrimaryKeyValue.INF_MAX);
        rangeRowQueryCriteria.setExclusiveEndPrimaryKey(endPrimaryKeyBuilder.build());
    
        rangeRowQueryCriteria.setMaxVersions(1);
    
        String strBaseStationNum = String.format("%d", baseStationNumber);
        String strStartTime = String.format("%d", startTime);
        String strEndTime = String.format("%d", endTime);
    
        System.out.println("基站" + strBaseStationNum + "从时间" + strStartTime + "到" + strEndTime + "的所有话单通话时长:");
        while (true) {
            GetRangeResponse getRangeResponse = client.getRange(new GetRangeRequest(rangeRowQueryCriteria));
            for (Row row : getRangeResponse.getRows()) {
                PrimaryKey curIndexPrimaryKey = row.getPrimaryKey();
                PrimaryKeyColumn mainCalledNumber = curIndexPrimaryKey.getPrimaryKeyColumn(PRIMARY_KEY_NAME_1);
                PrimaryKeyColumn callStartTime = curIndexPrimaryKey.getPrimaryKeyColumn(PRIMARY_KEY_NAME_2);
                PrimaryKeyBuilder mainTablePKBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
                mainTablePKBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_1, mainCalledNumber.getValue());
                mainTablePKBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_2, callStartTime.getValue());
                PrimaryKey mainTablePK = mainTablePKBuilder.build(); // 构造主表PK
    
                // 反查主表
                SingleRowQueryCriteria criteria = new SingleRowQueryCriteria(TABLE_NAME, mainTablePK);
                criteria.addColumnsToGet(colName); // 读取主表的"通话时长"列
                // 设置读取最新版本
                criteria.setMaxVersions(1);
                GetRowResponse getRowResponse = client.getRow(new GetRowRequest(criteria));
                Row mainTableRow = getRowResponse.getRow();
    
                System.out.println(mainTableRow);
            }
    
            // 若nextStartPrimaryKey不为null, 则继续读取.
            if (getRangeResponse.getNextStartPrimaryKey() != null) {
                rangeRowQueryCriteria.setInclusiveStartPrimaryKey(getRangeResponse.getNextStartPrimaryKey());
            } else {
                break;
            }
        }
    }

    为了提高查询效率,可以在基站号码列与通话发生时间列上建立组合索引,并把通话时长列作为索引表的属性列:

    数据库中的记录将会如下所示:

    索引表IndexOnBaseStation2

    PK0 PK1 PK2 Defined0
    BaseStationNumber StartTime CellNumber Duration
    001 1532574644 123456 60
    001 1532574714 234567 10
    002 1532574795 345678 5
    002 1532574861 345678 100
    003 1532574734 234567 20
    003 1532584054 456789 200

    然后在IndexOnBaseStation2索引表上进行查询:

    private static void getRangeFromIndexTable(SyncClient client,
                                               long baseStationNumber,
                                               long startTime,
                                               long endTime,
                                               String colName) {
        RangeRowQueryCriteria rangeRowQueryCriteria = new RangeRowQueryCriteria(INDEX2_NAME);
    
        // 构造主键
        PrimaryKeyBuilder startPrimaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
        startPrimaryKeyBuilder.addPrimaryKeyColumn(DEFINED_COL_NAME_3, PrimaryKeyValue.fromLong(baseStationNumber));
        startPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_2, PrimaryKeyValue.fromLong(startTime));
        startPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_1, PrimaryKeyValue.INF_MIN);
        rangeRowQueryCriteria.setInclusiveStartPrimaryKey(startPrimaryKeyBuilder.build());
    
        // 构造主键
        PrimaryKeyBuilder endPrimaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
        endPrimaryKeyBuilder.addPrimaryKeyColumn(DEFINED_COL_NAME_3, PrimaryKeyValue.fromLong(baseStationNumber));
        endPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_2, PrimaryKeyValue.fromLong(endTime));
        endPrimaryKeyBuilder.addPrimaryKeyColumn(PRIMARY_KEY_NAME_1, PrimaryKeyValue.INF_MAX);
        rangeRowQueryCriteria.setExclusiveEndPrimaryKey(endPrimaryKeyBuilder.build());
    
        // 设置读取列
        rangeRowQueryCriteria.addColumnsToGet(colName);
    
        rangeRowQueryCriteria.setMaxVersions(1);
    
        String strBaseStationNum = String.format("%d", baseStationNumber);
        String strStartTime = String.format("%d", startTime);
        String strEndTime = String.format("%d", endTime);
    
        System.out.println("基站" + strBaseStationNum + "从时间" + strStartTime + "到" + strEndTime + "的所有话单通话时长:");
        while (true) {
            GetRangeResponse getRangeResponse = client.getRange(new GetRangeRequest(rangeRowQueryCriteria));
            for (Row row : getRangeResponse.getRows()) {
                System.out.println(row);
            }
    
            // 若nextStartPrimaryKey不为null, 则继续读取.
            if (getRangeResponse.getNextStartPrimaryKey() != null) {
                rangeRowQueryCriteria.setInclusiveStartPrimaryKey(getRangeResponse.getNextStartPrimaryKey());
            } else {
                break;
            }
        }
    }
    ​```

    由此可见,如果不把通话时长列作为索引表的属性列,在每次查询时都需先从索引表中解出主表的PK,然后对主表进行随机读。当然,把通话时长列作为索引表的属性列后,该列被同时存储在了主表及索引表中,增加了总的存储空间占用。

  • 查询发生在基站003上时间从15325748611532584054的所有通话记录的总通话时长,平均通话时长,最大通话时长,最小通话时长。

    相对于上一条查询,这里不要求返回每一条通话记录的时长,只要求返回所有通话时长的统计信息。用户可以使用与上条查询相同的的查询方式,然后自行对返回的每条通话时长做计算并得到最终结果。也可以使用SQL-on-OTS,省去客户端的计算,直接使用SQL语句返回最终统计结果,SQL-on-OTS的开通使用文档,请参见可见OLAP on Table Store:基于Data Lake Analytics的Serverless SQL大数据分析。其兼容绝大多数MySql语法,可以更方便的进行更复杂的、更贴近用户业务逻辑的计算。