在 JSON 数组中搜索值 - Amazon Athena

在 JSON 数组中搜索值

要确定特定值是否存在于 JSON 编码的数组内,请使用 json_array_contains 函数。

以下查询列出参加“project2”的用户的名称。

WITH dataset AS ( SELECT * FROM (VALUES (JSON '{"name": "Bob Smith", "org": "legal", "projects": ["project1"]}'), (JSON '{"name": "Susan Smith", "org": "engineering", "projects": ["project1", "project2", "project3"]}'), (JSON '{"name": "Jane Smith", "org": "finance", "projects": ["project1", "project2"]}') ) AS t (users) ) SELECT json_extract_scalar(users, '$.name') AS user FROM dataset WHERE json_array_contains(json_extract(users, '$.projects'), 'project2')

此查询会返回一个用户列表。

+-------------+ | user | +-------------+ | Susan Smith | +-------------+ | Jane Smith | +-------------+

以下查询示例列出了完成项目的用户的名称以及已完成的项目的总数。它执行以下操作:

  • 为清晰起见,使用嵌套 SELECT 语句。

  • 提取项目的数组。

  • 使用 CAST 将数组转换为键-值对的本机数组。

  • 使用 UNNEST 运算符提取每个数组元素。

  • 按完成项目来筛选获取的值,并对其进行计数。

注意

当使用 CASTMAP 时,您可以将键元素指定为 VARCHAR(Presto 中的本机字符串),但将值保留为 JSON,因为 MAP 中的值具有不同类型:第一个键-值对为字符串,第二个键-值对为布尔型。

WITH dataset AS ( SELECT * FROM (VALUES (JSON '{"name": "Bob Smith", "org": "legal", "projects": [{"name":"project1", "completed":false}]}'), (JSON '{"name": "Susan Smith", "org": "engineering", "projects": [{"name":"project2", "completed":true}, {"name":"project3", "completed":true}]}'), (JSON '{"name": "Jane Smith", "org": "finance", "projects": [{"name":"project2", "completed":true}]}') ) AS t (users) ), employees AS ( SELECT users, CAST(json_extract(users, '$.projects') AS ARRAY(MAP(VARCHAR, JSON))) AS projects_array FROM dataset ), names AS ( SELECT json_extract_scalar(users, '$.name') AS name, projects FROM employees, UNNEST (projects_array) AS t(projects) ) SELECT name, count(projects) AS completed_projects FROM names WHERE cast(element_at(projects, 'completed') AS BOOLEAN) = true GROUP BY name

此查询返回以下结果:

+----------------------------------+ | name | completed_projects | +----------------------------------+ | Susan Smith | 2 | +----------------------------------+ | Jane Smith | 1 | +----------------------------------+