parse command
The parse command parses a text field with a regular expression and appends the result to the search result.
Note
To see which AWS data source integrations support this PPL command, see Commands.
Syntax
Use the following syntax:
parse <field> <pattern>
field
- Mandatory.
- The field must be a text field.
pattern
- Mandatory string.
- This is the regular expression pattern used to extract new fields from the given text field.
- If a new field name already exists, it will replace the original field.
Regular expression
The regular expression pattern is matched against the whole text field of each document using the Java regex engine. Each named capture group in the expression becomes a new STRING field.
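The whole-field matching and named capture groups described above can be sketched in plain Java. This is a minimal illustration, not the PPL implementation: the class ParseSketch and the helper extract are hypothetical names, and returning an empty string when the pattern does not match is an assumption (the documentation only states the empty-string result for null fields).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParseSketch {
    // Hypothetical helper: returns the value of one named capture group,
    // matching the pattern against the whole input string.
    static String extract(String text, String regex, String group) {
        if (text == null) {
            // The documentation states that parsing a null field returns an empty string.
            return "";
        }
        Matcher m = Pattern.compile(regex).matcher(text);
        // matches() succeeds only if the pattern covers the entire input,
        // unlike find(), which accepts a match anywhere in the input.
        // Returning "" when there is no match is an assumption of this sketch.
        return m.matches() ? m.group(group) : "";
    }

    public static void main(String[] args) {
        // Patterns taken from the examples on this page, escaped for Java string literals.
        System.out.println(extract("jane_doe@example.com", ".+@(?<host>.+)", "host"));     // example.com
        System.out.println(extract("880 Example Lane", "\\d+ (?<address>.+)", "address")); // Example Lane
    }
}
```

Because the match must cover the whole field, a pattern such as '@(?<host>.+)' without the leading '.+' would not match an email address in this sketch.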
Example 1: Create a new field
The example shows how to create a new field host for each document. host will be the host name after @ in the email field. Parsing a null field will return an empty string.
PPL query:
os> source=accounts | parse email '.+@(?<host>.+)' | fields email, host ;
fetched rows / total rows = 4/4
+-----------------------+-------------+
| email                 | host        |
|-----------------------+-------------|
| jane_doe@example.com  | example.com |
| john_doe@example.net  | example.net |
| null                  |             |
| juan_li@example.org   | example.org |
+-----------------------+-------------+
Example 2: Override an existing field
The example shows how to override the existing address field with the street number removed.
PPL query:
os> source=accounts | parse address '\d+ (?<address>.+)' | fields address ;
fetched rows / total rows = 4/4
+------------------+
| address          |
|------------------|
| Example Lane     |
| Example Street   |
| Example Avenue   |
| Example Court    |
+------------------+
Example 3: Filter and sort by a cast parsed field
The example shows how to keep street numbers higher than 500 in the address field and sort the results by street number.
PPL query:
os> source=accounts | parse address '(?<streetNumber>\d+) (?<street>.+)' | where cast(streetNumber as int) > 500 | sort num(streetNumber) | fields streetNumber, street ;
fetched rows / total rows = 3/3
+----------------+----------------+
| streetNumber   | street         |
|----------------+----------------|
| ***            | Example Street |
| ***            | Example Avenue |
| 880            | Example Lane   |
+----------------+----------------+
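Parsed fields are strings, so the query casts streetNumber before comparing and sorts it as a number. The following Java sketch mirrors that filter-then-sort logic to show why the numeric conversion matters. It is an illustration under stated assumptions: the class StreetNumberSketch, the method filterAndSort, and all sample addresses other than "880 Example Lane" are hypothetical, and skipping non-matching addresses is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StreetNumberSketch {
    // Same pattern as the query above, escaped for a Java string literal.
    static final Pattern ADDRESS = Pattern.compile("(?<streetNumber>\\d+) (?<street>.+)");

    // Returns {streetNumber, street} pairs whose number exceeds min, sorted numerically.
    static List<String[]> filterAndSort(List<String> addresses, int min) {
        List<String[]> rows = new ArrayList<>();
        for (String address : addresses) {
            Matcher m = ADDRESS.matcher(address);
            if (!m.matches()) {
                continue; // sketch assumption: ignore addresses the pattern does not cover
            }
            String number = m.group("streetNumber"); // captured value is a string
            if (Integer.parseInt(number) > min) {    // numeric comparison, like cast(... as int)
                rows.add(new String[] {number, m.group("street")});
            }
        }
        // Sort by numeric value; a plain string sort would place "1200" before "671".
        rows.sort(Comparator.comparingInt((String[] r) -> Integer.parseInt(r[0])));
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical sample data for illustration only.
        List<String> sample = Arrays.asList(
            "880 Example Lane", "1200 Example Street", "45 Example Court", "671 Example Avenue");
        for (String[] row : filterAndSort(sample, 500)) {
            System.out.println(row[0] + " | " + row[1]);
        }
    }
}
```

With the hypothetical sample, the sketch prints the rows for 671, 880, and 1200 in that order, and drops 45.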
Limitations
There are a few limitations with the parse command:
- Fields defined by parse cannot be parsed again.
  The following command will not work:
  source=accounts | parse address '\d+ (?<street>.+)' | parse street '\w+ (?<road>\w+)'
- Fields defined by parse cannot be overridden with other commands.
  where will not match any documents since street cannot be overridden:
  source=accounts | parse address '\d+ (?<street>.+)' | eval street='1' | where street='1' ;
- The text field used by parse cannot be overridden.
  street will not be successfully parsed since address is overridden:
  source=accounts | parse address '\d+ (?<street>.+)' | eval address='1' ;
- Fields defined by parse cannot be filtered or sorted after using them in the stats command.
  where in the following command will not work:
  source=accounts | parse email '.+@(?<host>.+)' | stats avg(age) by host | where host=pyrami.com ;