comando de subconsulta - OpenSearch Servicio Amazon

Las traducciones son generadas a través de traducción automática. En caso de conflicto entre la traducción y la version original de inglés, prevalecerá la version en inglés.

comando de subconsulta

nota

Para ver qué integraciones AWS de fuentes de datos admiten este PPL comando, consulte. Comandos

Usa el subquery comando para realizar consultas complejas y anidadas dentro de tus sentencias de Piped Processing Language ()PPL.

source=logs | where field in [ subquery source=events | where condition | fields field ]

En este ejemplo, la búsqueda principal (source=logs) se filtra según los resultados de la subconsulta (). source=events

El comando subquery admite varios niveles de anidación para el análisis de datos complejos.

Ejemplo de subconsulta anidada

source=logs | where id in [ subquery source=users | where user in [ subquery source=actions | where action="login" | fields user] | fields uid ]
InSubquery Uso
  • source = outer | where a in [ source = inner | fields b ]

  • source = outer | where (a) in [ source = inner | fields b ]

  • source = outer | where (a,b,c) in [ source = inner | fields d,e,f ]

  • source = outer | where a not in [ source = inner | fields b ]

  • source = outer | where (a) not in [ source = inner | fields b ]

  • source = outer | where (a,b,c) not in [ source = inner | fields d,e,f ]

  • source = outer a in [ source = inner | fields b ](filtrado de búsqueda con subconsulta)

  • source = outer a not in [ source = inner | fields b ](filtrado de búsqueda con subconsulta)

  • source = outer | where a in [ source = inner1 | where b not in [ source = inner2 | fields c ] | fields b ](anidado)

  • source = table1 | inner join left = l right = r on l.a = r.a AND r.a in [ source = inner | fields d ] | fields l.a, r.a, b, c(como filtro de unión)

SQLEjemplos de migración con In-subQuery PPL

TPC-H Q4 (subconsulta interna con agregación)

select o_orderpriority, count(*) as order_count from orders where o_orderdate >= date '1993-07-01' and o_orderdate < date '1993-07-01' + interval '3' month and o_orderkey in ( select l_orderkey from lineitem where l_commitdate < l_receiptdate ) group by o_orderpriority order by o_orderpriority

Reescrito por consulta: PPL InSubquery

source = orders | where o_orderdate >= "1993-07-01" and o_orderdate < "1993-10-01" and o_orderkey IN [ source = lineitem | where l_commitdate < l_receiptdate | fields l_orderkey ] | stats count(1) as order_count by o_orderpriority | sort o_orderpriority | fields o_orderpriority, order_count

TPC-H Q20 (anidado en la subconsulta)

select s_name, s_address from supplier, nation where s_suppkey in ( select ps_suppkey from partsupp where ps_partkey in ( select p_partkey from part where p_name like 'forest%' ) ) and s_nationkey = n_nationkey and n_name = 'CANADA' order by s_name

PPL InSubquery Reescrito por consulta:

source = supplier | where s_suppkey IN [ source = partsupp | where ps_partkey IN [ source = part | where like(p_name, "forest%") | fields p_partkey ] | fields ps_suppkey ] | inner join left=l right=r on s_nationkey = n_nationkey and n_name = 'CANADA' nation | sort s_name
ExistsSubquery uso

Supuestos:a, b son campos del exterior de la tablac, d son campos del interior de la tablae, f son campos del interior de la tabla2.

  • source = outer | where exists [ source = inner | where a = c ]

  • source = outer | where not exists [ source = inner | where a = c ]

  • source = outer | where exists [ source = inner | where a = c and b = d ]

  • source = outer | where not exists [ source = inner | where a = c and b = d ]

  • source = outer exists [ source = inner | where a = c ](filtrado de búsqueda con subconsulta)

  • source = outer not exists [ source = inner | where a = c ](filtrado de búsqueda con subconsulta)

  • source = table as t1 exists [ source = table as t2 | where t1.a = t2.a ](el alias de la tabla es útil en una subconsulta existente)

  • source = outer | where exists [ source = inner1 | where a = c and exists [ source = inner2 | where c = e ] ](anidado)

  • source = outer | where exists [ source = inner1 | where a = c | where exists [ source = inner2 | where c = e ] ](anidado)

  • source = outer | where exists [ source = inner | where c > 10 ](existe una relación no correlacionada)

  • source = outer | where not exists [ source = inner | where c > 10 ](existe una relación no correlacionada)

  • source = outer | where exists [ source = inner ] | eval l = "nonEmpty" | fields l(existe una relación especial no correlacionada)

ScalarSubquery uso

Supuestos: a b son campos del exterior de la tablac, d son campos del interior de la tablae, f son campos de la tabla anidados

Subconsulta escalar no correlacionada

En Seleccionar:

  • source = outer | eval m = [ source = inner | stats max(c) ] | fields m, a

  • source = outer | eval m = [ source = inner | stats max(c) ] + b | fields m, a

En dónde:

  • source = outer | where a > [ source = inner | stats min(c) ] | fields a

En el filtro de búsqueda:

  • source = outer a > [ source = inner | stats min(c) ] | fields a

Subconsulta escalar correlacionada

En Seleccionar:

  • source = outer | eval m = [ source = inner | where outer.b = inner.d | stats max(c) ] | fields m, a

  • source = outer | eval m = [ source = inner | where b = d | stats max(c) ] | fields m, a

  • source = outer | eval m = [ source = inner | where outer.b > inner.d | stats max(c) ] | fields m, a

En dónde:

  • source = outer | where a = [ source = inner | where outer.b = inner.d | stats max(c) ]

  • source = outer | where a = [ source = inner | where b = d | stats max(c) ]

  • source = outer | where [ source = inner | where outer.b = inner.d OR inner.d = 1 | stats count() ] > 0 | fields a

En el filtro de búsqueda:

  • source = outer a = [ source = inner | where b = d | stats max(c) ]

  • source = outer [ source = inner | where outer.b = inner.d OR inner.d = 1 | stats count() ] > 0 | fields a

Subconsulta escalar anidada

  • source = outer | where a = [ source = inner | stats max(c) | sort c ] OR b = [ source = inner | where c = 1 | stats min(d) | sort d ]

  • source = outer | where a = [ source = inner | where c = [ source = nested | stats max(e) by f | sort f ] | stats max(d) by c | sort c | head 1 ]

(Relación) Subconsulta

InSubquery, ExistsSubquery y ScalarSubquery son todas expresiones de subconsulta. Pero no RelationSubquery es una expresión de subconsulta, es un plan de subconsulta que se usa comúnmente en las cláusulas Join o From.

  • source = table1 | join left = l right = r [ source = table2 | where d > 10 | head 5 ](subconsulta en el lado derecho de la unión)

  • source = [ source = table1 | join left = l right = r [ source = table2 | where d > 10 | head 5 ] | stats count(a) by b ] as outer | head 1

Contexto adicional

InSubqueryExistsSubquery, y ScalarSubquery son expresiones de subconsulta que se utilizan habitualmente en where cláusulas y filtros de búsqueda.

Comando Where:

| where <boolean expression> | ...

Filtro de búsqueda:

search source=* <boolean expression> | ...

Se puede usar una expresión de subconsulta en una expresión booleana:

| where orders.order_id in [ source=returns | where return_reason="damaged" | field order_id ]

El orders.order_id in [ source=... ] es un. <boolean expression>

En general, denominamos InSubquery expresión a este tipo de cláusula de subconsulta. Es una<boolean expression>.

Subconsulta con diferentes tipos de unión

En el ejemplo se utiliza un ScalarSubquery:

source=employees | join source=sales on employees.employee_id = sales.employee_id | where sales.sale_amount > [ source=targets | where target_met="true" | fields target_value ]

A diferencia de InSubquery ExistsSubquery, y ScalarSubquery, a no RelationSubquery es una expresión de subconsulta. En cambio, es un plan de subconsultas.

SEARCH source=customer | FIELDS c_custkey | LEFT OUTER JOIN left = c, right = o ON c.c_custkey = o.o_custkey [ SEARCH source=orders | WHERE o_comment NOT LIKE '%unusual%packages%' | FIELDS o_orderkey, o_custkey ] | STATS ...