Las traducciones son generadas a través de traducción automática. En caso de conflicto entre la traducción y la version original de inglés, prevalecerá la version en inglés.
comando de subconsulta
nota
Para ver qué integraciones AWS de fuentes de datos admiten este PPL comando, consulte. Comandos
Usa el subquery
comando para realizar consultas complejas y anidadas dentro de tus sentencias de Piped Processing Language ()PPL.
source=logs | where field in [ subquery source=events | where condition | fields field ]
En este ejemplo, la búsqueda principal (source=logs
) se filtra según los resultados de la subconsulta (). source=events
El comando subquery admite varios niveles de anidación para el análisis de datos complejos.
Ejemplo de subconsulta anidada
source=logs | where id in [ subquery source=users | where user in [ subquery source=actions | where action="login" | fields user] | fields uid ]
InSubquery Uso
source = outer | where a in [ source = inner | fields b ]
source = outer | where (a) in [ source = inner | fields b ]
source = outer | where (a,b,c) in [ source = inner | fields d,e,f ]
source = outer | where a not in [ source = inner | fields b ]
source = outer | where (a) not in [ source = inner | fields b ]
source = outer | where (a,b,c) not in [ source = inner | fields d,e,f ]
source = outer a in [ source = inner | fields b ]
(filtrado de búsqueda con subconsulta)source = outer a not in [ source = inner | fields b ]
(filtrado de búsqueda con subconsulta)source = outer | where a in [ source = inner1 | where b not in [ source = inner2 | fields c ] | fields b ]
(anidado)source = table1 | inner join left = l right = r on l.a = r.a AND r.a in [ source = inner | fields d ] | fields l.a, r.a, b, c
(como filtro de unión)
SQLEjemplos de migración con In-subQuery PPL
TPC-H Q4 (subconsulta interna con agregación)
select o_orderpriority, count(*) as order_count from orders where o_orderdate >= date '1993-07-01' and o_orderdate < date '1993-07-01' + interval '3' month and o_orderkey in ( select l_orderkey from lineitem where l_commitdate < l_receiptdate ) group by o_orderpriority order by o_orderpriority
Reescrito por consulta: PPL InSubquery
source = orders | where o_orderdate >= "1993-07-01" and o_orderdate < "1993-10-01" and o_orderkey IN [ source = lineitem | where l_commitdate < l_receiptdate | fields l_orderkey ] | stats count(1) as order_count by o_orderpriority | sort o_orderpriority | fields o_orderpriority, order_count
TPC-H Q20 (anidado en la subconsulta)
select s_name, s_address from supplier, nation where s_suppkey in ( select ps_suppkey from partsupp where ps_partkey in ( select p_partkey from part where p_name like 'forest%' ) ) and s_nationkey = n_nationkey and n_name = 'CANADA' order by s_name
PPL InSubquery Reescrito por consulta:
source = supplier | where s_suppkey IN [ source = partsupp | where ps_partkey IN [ source = part | where like(p_name, "forest%") | fields p_partkey ] | fields ps_suppkey ] | inner join left=l right=r on s_nationkey = n_nationkey and n_name = 'CANADA' nation | sort s_name
ExistsSubquery uso
Supuestos:a
, b
son campos del exterior de la tablac
, d
son campos del interior de la tablae
, f
son campos del interior de la tabla2.
source = outer | where exists [ source = inner | where a = c ]
source = outer | where not exists [ source = inner | where a = c ]
source = outer | where exists [ source = inner | where a = c and b = d ]
source = outer | where not exists [ source = inner | where a = c and b = d ]
source = outer exists [ source = inner | where a = c ]
(filtrado de búsqueda con subconsulta)source = outer not exists [ source = inner | where a = c ]
(filtrado de búsqueda con subconsulta)source = table as t1 exists [ source = table as t2 | where t1.a = t2.a ]
(el alias de la tabla es útil en una subconsulta existente)source = outer | where exists [ source = inner1 | where a = c and exists [ source = inner2 | where c = e ] ]
(anidado)source = outer | where exists [ source = inner1 | where a = c | where exists [ source = inner2 | where c = e ] ]
(anidado)source = outer | where exists [ source = inner | where c > 10 ]
(existe una relación no correlacionada)source = outer | where not exists [ source = inner | where c > 10 ]
(existe una relación no correlacionada)source = outer | where exists [ source = inner ] | eval l = "nonEmpty" | fields l
(existe una relación especial no correlacionada)
ScalarSubquery uso
Supuestos: a
b
son campos del exterior de la tablac
, d
son campos del interior de la tablae
, f
son campos de la tabla anidados
Subconsulta escalar no correlacionada
En Seleccionar:
source = outer | eval m = [ source = inner | stats max(c) ] | fields m, a
source = outer | eval m = [ source = inner | stats max(c) ] + b | fields m, a
En dónde:
source = outer | where a > [ source = inner | stats min(c) ] | fields a
En el filtro de búsqueda:
source = outer a > [ source = inner | stats min(c) ] | fields a
Subconsulta escalar correlacionada
En Seleccionar:
source = outer | eval m = [ source = inner | where outer.b = inner.d | stats max(c) ] | fields m, a
source = outer | eval m = [ source = inner | where b = d | stats max(c) ] | fields m, a
source = outer | eval m = [ source = inner | where outer.b > inner.d | stats max(c) ] | fields m, a
En dónde:
source = outer | where a = [ source = inner | where outer.b = inner.d | stats max(c) ]
source = outer | where a = [ source = inner | where b = d | stats max(c) ]
source = outer | where [ source = inner | where outer.b = inner.d OR inner.d = 1 | stats count() ] > 0 | fields a
En el filtro de búsqueda:
source = outer a = [ source = inner | where b = d | stats max(c) ]
source = outer [ source = inner | where outer.b = inner.d OR inner.d = 1 | stats count() ] > 0 | fields a
Subconsulta escalar anidada
-
source = outer | where a = [ source = inner | stats max(c) | sort c ] OR b = [ source = inner | where c = 1 | stats min(d) | sort d ]
-
source = outer | where a = [ source = inner | where c = [ source = nested | stats max(e) by f | sort f ] | stats max(d) by c | sort c | head 1 ]
(Relación) Subconsulta
InSubquery
, ExistsSubquery
y ScalarSubquery
son todas expresiones de subconsulta. Pero no RelationSubquery
es una expresión de subconsulta, es un plan de subconsulta que se usa comúnmente en las cláusulas Join o From.
source = table1 | join left = l right = r [ source = table2 | where d > 10 | head 5 ]
(subconsulta en el lado derecho de la unión)source = [ source = table1 | join left = l right = r [ source = table2 | where d > 10 | head 5 ] | stats count(a) by b ] as outer | head 1
Contexto adicional
InSubquery
ExistsSubquery
, y ScalarSubquery
son expresiones de subconsulta que se utilizan habitualmente en where
cláusulas y filtros de búsqueda.
Comando Where:
| where <boolean expression> | ...
Filtro de búsqueda:
search source=* <boolean expression> | ...
Se puede usar una expresión de subconsulta en una expresión booleana:
| where orders.order_id in [ source=returns | where return_reason="damaged" | field order_id ]
El orders.order_id in [ source=... ]
es un. <boolean expression>
En general, denominamos InSubquery
expresión a este tipo de cláusula de subconsulta. Es una<boolean expression>
.
Subconsulta con diferentes tipos de unión
En el ejemplo se utiliza un ScalarSubquery
:
source=employees | join source=sales on employees.employee_id = sales.employee_id | where sales.sale_amount > [ source=targets | where target_met="true" | fields target_value ]
A diferencia de InSubquery ExistsSubquery, y ScalarSubquery, a no RelationSubquery es una expresión de subconsulta. En cambio, es un plan de subconsultas.
SEARCH source=customer | FIELDS c_custkey | LEFT OUTER JOIN left = c, right = o ON c.c_custkey = o.o_custkey [ SEARCH source=orders | WHERE o_comment NOT LIKE '%unusual%packages%' | FIELDS o_orderkey, o_custkey ] | STATS ...