全部产品
阿里云办公

JSON 数据类型操作

更新时间:2017-07-11 17:11:11

JSON 类型几乎已成为互联网及物联网(IoT)的基础数据类型,其重要性不言而喻,具体协议请参见 JSON 官网

PostgreSQL 对 JSON 的支持已经比较完善,阿里云深度优化云数据库 HybridDB for PostgreSQL,基于 PostgreSQL 语法进行了 JSON 数据类型的支持。

检查现有版本是否支持 JSON

执行如下命令,检查是否已经支持 JSON:

=> SELECT '""'::json;

若系统出现如下信息,则说明已经支持 JSON 类型,可以使用实例了。若执行不成功,请重新启动实例。

  1. json
  2. ------
  3. ""
  4. (1 row)

若系统出现如下信息,则说明尚未支持 JSON 类型。

  1. ERROR: type "json" does not exist
  2. LINE 1: SELECT '""'::json;
  3. ^

上述命令是一次从字符串到 JSON 格式的强制转换,这也基本上是其操作上的本质。

JSON 在数据库中的转换

数据库的操作主要分为读和写,JSON 的数据写入一般是字符串到 JSON。字符串中的内容必须符合 JSON 标准,包括字符串、数字、数组、对象等内容。如:

字符串

  1. => SELECT '"hijson"'::json;
  2. json
  3. -------
  4. "hijson"
  5. (1 row)

::在 PostgreSQL/Greenplum/HybridDB for PostgreSQL 中代表强制类型转换。在此转换的时候,会调用 JSON 类型的输入函数。因此,类型转换时同样会做 JSON 格式的检查,如下所示:

  1. => SELECT '{hijson:1024}'::json;
  2. ERROR: invalid input syntax for type json
  3. LINE 1: SELECT '{hijson:1024}'::json;
  4. ^
  5. DETAIL: Token "hijson" is invalid.
  6. CONTEXT: JSON data, line 1: {hijson...
  7. =>

上述"hijson"两边的"是必不可少的,因为在标准中,KEY 值对应的是一个字符串,所以这里的{hijson:1024}在语法上会报错。

除了类型上的强制转换,还有从数据库记录到 JSON 串的转换。

我们正常使用 JSON,不会只用一个 String 或一个 Number,而是一个包含一个或多个键值对的对象。所以,对 Greenplum 而言,支持到对象的转换,即支持了 JSON 的绝大多数场景,如:

  1. => select row_to_json(row('{"a":"a"}', 'b'));
  2. row_to_json
  3. ---------------------------------
  4. {"f1":"{\"a\":\"a\"}","f2":"b"}
  5. (1 row)
  6. => select row_to_json(row('{"a":"a"}'::json, 'b'));
  7. row_to_json
  8. ---------------------------
  9. {"f1":{"a":"a"},"f2":"b"}
  10. (1 row)

由此也可以看出字符串和 JSON 的区别,这样就可以很方便地将一整条记录转换成 JSON。

JSON 内部数据类型的定义

  • 对象

    对象是 JSON 中最常用的,如:

    1. => select '{"key":"value"}'::json;
    2. json
    3. -----------------
    4. {"key":"value"}
    5. (1 row)
  • 整数 & 浮点数

    JSON 的协议只有三种数字:整数、浮点数和常数表达式,当前 Greenplum 对这三种都有很好的支持。

    1. => SELECT '1024'::json;
    2. json
    3. ------
    4. 1024
    5. (1 row)
    6. => SELECT '0.1'::json;
    7. json
    8. ------
    9. 0.1
    10. (1 row)

    特殊情况下,需要如下信息:

    1. => SELECT '1e100'::json;
    2. json
    3. -------
    4. 1e100
    5. (1 row)
    6. => SELECT '{"f":1e100}'::json;
    7. json
    8. -------------
    9. {"f":1e100}
    10. (1 row)

    并且,包括下面这个长度超长的数字:

    1. => SELECT '9223372036854775808'::json;
    2. json
    3. ---------------------
    4. 9223372036854775808
    5. (1 row)
  • 数组

    1. => SELECT '[[1,2], [3,4,5]]'::json;
    2. json
    3. ------------------
    4. [[1,2], [3,4,5]]
    5. (1 row)

操作符

JSON 支持的操作符类型

  1. => select oprname,oprcode from pg_operator where oprleft = 3114;
  2. oprname | oprcode
  3. ---------+---------------------------
  4. -> | json_object_field
  5. ->> | json_object_field_text
  6. -> | json_array_element
  7. ->> | json_array_element_text
  8. #> | json_extract_path_op
  9. #>> | json_extract_path_text_op
  10. (6 rows)

基本使用方法

  1. => SELECT '{"f":"1e100"}'::json -> 'f';
  2. ?column?
  3. ----------
  4. "1e100"
  5. (1 row)
  6. => SELECT '{"f":"1e100"}'::json ->> 'f';
  7. ?column?
  8. ----------
  9. 1e100
  10. (1 row)
  11. => select '{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}'::json#>array['f4','f6'];
  12. ?column?
  13. -----------
  14. "stringy"
  15. (1 row)
  16. => select '{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}'::json#>'{f4,f6}';
  17. ?column?
  18. -----------
  19. "stringy"
  20. (1 row)
  21. => select '{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}'::json#>>'{f2,0}';
  22. ?column?
  23. ----------
  24. f3
  25. (1 row)

JSON 函数

支持的函数

  1. postgres=# \df *json*
  2. List of functions
  3. Schema | Name | Result data type | Argument data types | Type
  4. ------------+---------------------------+------------------+-----------------------------------------------------------+--------
  5. pg_catalog | array_to_json | json | anyarray | normal
  6. pg_catalog | array_to_json | json | anyarray, boolean | normal
  7. pg_catalog | json_array_element | json | from_json json, element_index integer | normal
  8. pg_catalog | json_array_element_text | text | from_json json, element_index integer | normal
  9. pg_catalog | json_array_elements | SETOF json | from_json json, OUT value json | normal
  10. pg_catalog | json_array_length | integer | json | normal
  11. pg_catalog | json_each | SETOF record | from_json json, OUT key text, OUT value json | normal
  12. pg_catalog | json_each_text | SETOF record | from_json json, OUT key text, OUT value text | normal
  13. pg_catalog | json_extract_path | json | from_json json, VARIADIC path_elems text[] | normal
  14. pg_catalog | json_extract_path_op | json | from_json json, path_elems text[] | normal
  15. pg_catalog | json_extract_path_text | text | from_json json, VARIADIC path_elems text[] | normal
  16. pg_catalog | json_extract_path_text_op | text | from_json json, path_elems text[] | normal
  17. pg_catalog | json_in | json | cstring | normal
  18. pg_catalog | json_object_field | json | from_json json, field_name text | normal
  19. pg_catalog | json_object_field_text | text | from_json json, field_name text | normal
  20. pg_catalog | json_object_keys | SETOF text | json | normal
  21. pg_catalog | json_out | cstring | json | normal
  22. pg_catalog | json_populate_record | anyelement | base anyelement, from_json json, use_json_as_text boolean | normal
  23. pg_catalog | json_populate_recordset | SETOF anyelement | base anyelement, from_json json, use_json_as_text boolean | normal
  24. pg_catalog | json_recv | json | internal | normal
  25. pg_catalog | json_send | bytea | json | normal
  26. pg_catalog | row_to_json | json | record | normal
  27. pg_catalog | row_to_json | json | record, boolean | normal
  28. pg_catalog | to_json | json | anyelement | normal
  29. (24 rows)

基本使用方法

  1. => SELECT array_to_json('{{1,5},{99,100}}'::int[]);
  2. array_to_json
  3. ------------------
  4. [[1,5],[99,100]]
  5. (1 row)
  6. => SELECT row_to_json(row(1,'foo'));
  7. row_to_json
  8. ---------------------
  9. {"f1":1,"f2":"foo"}
  10. (1 row)
  11. => SELECT json_array_length('[1,2,3,{"f1":1,"f2":[5,6]},4]');
  12. json_array_length
  13. -------------------
  14. 5
  15. (1 row)
  16. => select * from json_each('{"f1":[1,2,3],"f2":{"f3":1},"f4":null,"f5":99,"f6":"stringy"}') q;
  17. key | value
  18. -----+-----------
  19. f1 | [1,2,3]
  20. f2 | {"f3":1}
  21. f4 | null
  22. f5 | 99
  23. f6 | "stringy"
  24. (5 rows)
  25. => select json_each_text('{"f1":[1,2,3],"f2":{"f3":1},"f4":null,"f5":"null"}');
  26. json_each_text
  27. -------------------
  28. (f1,"[1,2,3]")
  29. (f2,"{""f3"":1}")
  30. (f4,)
  31. (f5,null)
  32. (4 rows)
  33. => select json_array_elements('[1,true,[1,[2,3]],null,{"f1":1,"f2":[7,8,9]},false]');
  34. json_array_elements
  35. -----------------------
  36. 1
  37. true
  38. [1,[2,3]]
  39. null
  40. {"f1":1,"f2":[7,8,9]}
  41. false
  42. (6 rows)
  43. create type jpop as (a text, b int, c timestamp);
  44. => select * from json_populate_record(null::jpop,'{"a":"blurfl","x":43.2}', false) q;
  45. a | b | c
  46. --------+---+---
  47. blurfl | |
  48. (1 row)
  49. => select * from json_populate_recordset(null::jpop,'[{"a":"blurfl","x":43.2},{"b":3,"c":"2012-01-20 10:42:53"}]',false) q;
  50. a | b | c
  51. --------+---+--------------------------
  52. blurfl | |
  53. | 3 | Fri Jan 20 10:42:53 2012
  54. (2 rows)

完整操作事例

创建表

  1. create table tj(id serial, ary int[], obj json, num integer);
  2. => insert into tj(ary, obj, num) values('{1,5}'::int[], '{"obj":1}', 5);
  3. INSERT 0 1
  4. => select row_to_json(q) from (select id, ary, obj, num from tj) as q;
  5. row_to_json
  6. -------------------------------------------
  7. {"f1":1,"f2":[1,5],"f3":{"obj":1},"f4":5}
  8. (1 row)
  9. => insert into tj(ary, obj, num) values('{2,5}'::int[], '{"obj":2}', 5);
  10. INSERT 0 1
  11. => select row_to_json(q) from (select id, ary, obj, num from tj) as q;
  12. row_to_json
  13. -------------------------------------------
  14. {"f1":1,"f2":[1,5],"f3":{"obj":1},"f4":5}
  15. {"f1":2,"f2":[2,5],"f3":{"obj":2},"f4":5}
  16. (2 rows)

多表 JOIN

  1. create table tj2(id serial, ary int[], obj json, num integer);
  2. => insert into tj2(ary, obj, num) values('{2,5}'::int[], '{"obj":2}', 5);
  3. INSERT 0 1
  4. => select * from tj, tj2 where tj.obj->>'obj' = tj2.obj->>'obj';
  5. id | ary | obj | num | id | ary | obj | num
  6. ----+-------+-----------+-----+----+-------+-----------+-----
  7. 2 | {2,5} | {"obj":2} | 5 | 1 | {2,5} | {"obj":2} | 5
  8. (1 row)
  9. => select * from tj, tj2 where json_object_field_text(tj.obj, 'obj') = json_object_field_text(tj2.obj, 'obj');
  10. id | ary | obj | num | id | ary | obj | num
  11. ----+-------+-----------+-----+----+-------+-----------+-----
  12. 2 | {2,5} | {"obj":2} | 5 | 1 | {2,5} | {"obj":2} | 5
  13. (1 row)

JSON 函数索引

  1. CREATE TEMP TABLE test_json (
  2. json_type text,
  3. obj json
  4. );
  5. => insert into test_json values('aa', '{"f2":{"f3":1},"f4":{"f5":99,"f6":"foo"}}');
  6. INSERT 0 1
  7. => insert into test_json values('cc', '{"f7":{"f3":1},"f8":{"f5":99,"f6":"foo"}}');
  8. INSERT 0 1
  9. => select obj->'f2' from test_json where json_type = 'aa';
  10. ?column?
  11. ----------
  12. {"f3":1}
  13. (1 row)
  14. => create index i on test_json (json_extract_path_text(obj, '{f4}'));
  15. CREATE INDEX
  16. => select * from test_json where json_extract_path_text(obj, '{f4}') = '{"f5":99,"f6":"foo"}';
  17. json_type | obj
  18. -----------+-------------------------------------------
  19. aa | {"f2":{"f3":1},"f4":{"f5":99,"f6":"foo"}}
  20. (1 row)

注意:JSON 类型暂时不能支持作为分布键来使用;也不支持 JSON 聚合函数。

下面是 Python 访问的一个例子:

  1. #! /bin/env python
  2. import time
  3. import json
  4. import psycopg2
  5. def gpquery(sql):
  6. conn = None
  7. try:
  8. conn = psycopg2.connect("dbname=sanity1x2")
  9. conn.autocommit = True
  10. cur = conn.cursor()
  11. cur.execute(sql)
  12. return cur.fetchall()
  13. except Exception as e:
  14. if conn:
  15. try:
  16. conn.close()
  17. except:
  18. pass
  19. time.sleep(10)
  20. print e
  21. return None
  22. def main():
  23. sql = "select obj from tj;"
  24. #rows = Connection(host, port, user, pwd, dbname).query(sql)
  25. rows = gpquery(sql)
  26. for row in rows:
  27. print json.loads(row[0])
  28. if __name__ == "__main__":
  29. main()
本文导读目录