本文为您介绍如何进行PyODPS的排序。

前提条件

请提前开通MaxCompute和DataWorks服务,并在DataWorks完成业务流程的创建。本例使用DataWorks简单模式。

操作步骤

  1. 准备测试数据。
    1. 单击此处下载鸢尾花数据集iris.data,重命名为iris.csv
    2. 登录DataWorks控制台,单击数据开发 > 业务流程 > 新建表 ,以DDL模式创建表pyodps_iris。创建表iris.png单击生成表结构 > 提交到生成环境,完成创建表。
      建表语句如下。
      CREATE TABLE if not exists pyodps_iris
      (
      sepallength  DOUBLE comment '片长度(cm)',
      sepalwidth   DOUBLE comment '片宽度(cm)',
      petallength  DOUBLE comment '瓣长度(cm)',
      petalwidth   DOUBLE comment '瓣宽度(cm)',
      name         STRING comment '种类'
      );
    3. 将iris.csv的数据导入到表pyodps.iris中。单击表pyodps.iris,右键选择导入数据,在数据导入向导中上传iris.csv文件完成数据导入。数据导入
  2. 新建PyODPS节点。单击业务流程 > 数据开发,右键选择数据开发节点 > PyODPS,并将节点命名为数据排序。数据排序
    数据排序的代码如下。
    from  odps.df import DataFrame
    
    iris = DataFrame(o.get_table('pyodps_iris'))
    
    #排序
    
    print iris.sort('sepalwidth').head(5)
    
    #降序排列两种方式
    
    #设置参数ascending=False;进行降序排列
    print iris.sort('sepalwidth',ascending=False).head(5)
    
    #设置-实现降序排列
    print iris.sort(-iris.sepalwidth).head(5)
    
    #多字段排序
    print iris.sort(['sepalwidth','petallength']).head(5)
    
    #多字段排序时,如果是升序降序不同,ascending参数可以用于传入一个列表,长度必须等同于排序的字段,它们的值都是BOOLEAN类型
    print iris.sort(['sepalwidth','petallength'],ascending=[True,False]).head(5)
    print iris.sort(['sepalwidth',-iris.petallength]).head(5)
  3. 单击运行按钮,运行该节点。运行节点
  4. 运行日志中查看运行结果。运行日志
    完整的运行结果如下。
    Sql compiled:
    CREATE TABLE tmp_pyodps_d1b06785_dc18_4288_ad34_de860de1be08 LIFECYCLE 1 AS
    SELECT *
    FROM WB_BestPractice_dev.`pyodps_iris` t1
    ORDER BY sepalwidth
    LIMIT 10000
    Instance ID: 20191010061554817gwml0lim
    
       sepallength  sepalwidth  petallength  petalwidth             name
    0          5.0         2.0          3.5         1.0  Iris-versicolor
    1          6.0         2.2          5.0         1.5   Iris-virginica
    2          6.2         2.2          4.5         1.5  Iris-versicolor
    3          6.0         2.2          4.0         1.0  Iris-versicolor
    4          5.5         2.3          4.0         1.3  Iris-versicolor
    Sql compiled:
    CREATE TABLE tmp_pyodps_3cb90bb2_fb95_43fb_ae84_f2b5a27d72dc LIFECYCLE 1 AS
    SELECT *
    FROM WB_BestPractice_dev.`pyodps_iris` t1
    ORDER BY sepalwidth DESC
    LIMIT 10000
    Instance ID: 20191010061601287gs086792
    
    
       sepallength  sepalwidth  petallength  petalwidth         name
    0          5.7         4.4          1.5         0.4  Iris-setosa
    1          5.5         4.2          1.4         0.2  Iris-setosa
    2          5.2         4.1          1.5         0.1  Iris-setosa
    3          5.8         4.0          1.2         0.2  Iris-setosa
    4          5.4         3.9          1.3         0.4  Iris-setosa
    Sql compiled:
    CREATE TABLE tmp_pyodps_97b080bb_e014_48e8_a310_4b45fcd6a2ed LIFECYCLE 1 AS
    SELECT *
    FROM WB_BestPractice_dev.`pyodps_iris` t1
    ORDER BY sepalwidth DESC
    LIMIT 10000
    Instance ID: 20191010061606927g6emz192
    
       sepallength  sepalwidth  petallength  petalwidth         name
    0          5.7         4.4          1.5         0.4  Iris-setosa
    1          5.5         4.2          1.4         0.2  Iris-setosa
    2          5.2         4.1          1.5         0.1  Iris-setosa
    3          5.8         4.0          1.2         0.2  Iris-setosa
    4          5.4         3.9          1.3         0.4  Iris-setosa
    Sql compiled:
    CREATE TABLE tmp_pyodps_6fe37b6e_6705_4052_b733_211eb9bd16ac LIFECYCLE 1 AS
    SELECT *
    FROM WB_BestPractice_dev.`pyodps_iris` t1
    ORDER BY sepalwidth, petallength
    LIMIT 10000
    Instance ID: 20191010061611714gn586792
    
       sepallength  sepalwidth  petallength  petalwidth             name
    0          5.0         2.0          3.5         1.0  Iris-versicolor
    1          6.0         2.2          4.0         1.0  Iris-versicolor
    2          6.2         2.2          4.5         1.5  Iris-versicolor
    3          6.0         2.2          5.0         1.5   Iris-virginica
    4          4.5         2.3          1.3         0.3      Iris-setosa
    Sql compiled:
    CREATE TABLE tmp_pyodps_a52c805c_94a1_4a75_a6af_4fc9ed06ae68 LIFECYCLE 1 AS
    SELECT *
    FROM WB_BestPractice_dev.`pyodps_iris` t1
    ORDER BY sepalwidth, petallength DESC
    LIMIT 10000
    Instance ID: 20191010061616553gw3m9592
    
       sepallength  sepalwidth  petallength  petalwidth             name
    0          5.0         2.0          3.5         1.0  Iris-versicolor
    1          6.0         2.2          5.0         1.5   Iris-virginica
    2          6.2         2.2          4.5         1.5  Iris-versicolor
    3          6.0         2.2          4.0         1.0  Iris-versicolor
    4          6.3         2.3          4.4         1.3  Iris-versicolor
    Sql compiled:
    CREATE TABLE tmp_pyodps_aac5538e_9b40_4078_b3c6_852b99c663c1 LIFECYCLE 1 AS
    SELECT *
    FROM WB_BestPractice_dev.`pyodps_iris` t1
    ORDER BY sepalwidth, petallength DESC
    LIMIT 10000
    Instance ID: 20191010061621329gvmkc292
    
    
       sepallength  sepalwidth  petallength  petalwidth             name
    0          5.0         2.0          3.5         1.0  Iris-versicolor
    1          6.0         2.2          5.0         1.5   Iris-virginica
    2          6.2         2.2          4.5         1.5  Iris-versicolor
    3          6.0         2.2          4.0         1.0  Iris-versicolor
    4          6.3         2.3          4.4         1.3  Iris-versicolor