

# Parser processors
<a name="parser-processors"></a>

Parser processors convert raw or semi-structured log data into structured formats. Each pipeline can have at most one parser processor, which must be the first processor in the pipeline.

**Conditional processing not supported**  
Parser processors (except Grok) do not support conditional processing with the `when` parameter. This includes OCSF, CSV, JSON, KeyValue, VPC, Route53, WAF, Postgres, and CloudFront parsers. For more information, see [Expression syntax for conditional processing](conditional-processing.md).

## OCSF processor
<a name="ocsf-processor"></a>

Parses and transforms log data according to Open Cybersecurity Schema Framework (OCSF) standards.

**Configuration**  
Configure the OCSF processor with the following parameters:

```
processor:
  - ocsf:
      version: "1.5"
      mapping_version: 1.5.0
      schema:
          microsoft_office365_management_activity:
```

**Parameters**

`version` (required)  
The OCSF schema version to use for transformation. Must be 1.5.

`mapping_version` (required)  
The OCSF mapping version for transformation. Must be 1.5.0.

`schema` (required)  
Schema object specifying the data source type. The supported schemas depend on the pipeline source type; each source type has its own set of compatible OCSF schemas. You must use a schema that matches your pipeline's source type.

This table lists the supported schema combinations.


| Pipeline Source Type | Supported Schemas | Version | Mapping Version | 
| --- | --- | --- | --- | 
| cloudwatch_logs | cloud_trail | 1.5 | Not required | 
| cloudwatch_logs | route53_resolver | 1.5 | Not required | 
| cloudwatch_logs | vpc_flow | 1.5 | Not required | 
| cloudwatch_logs | eks_audit | 1.5 | Not required | 
| cloudwatch_logs | aws_waf | 1.5 | Not required | 
| s3 | Any OCSF schema | Any | Any | 
| microsoft_office365 | microsoft_office365 | 1.5 | 1.5.0 | 
| microsoft_entraid | microsoft_entraid | 1.5 | 1.5.0 | 
| microsoft_windows_event | microsoft_windows_event | 1.5 | 1.5.0 | 
| paloaltonetworks_nextgenerationfirewall | paloaltonetworks_nextgenerationfirewall | 1.5 | 1.5.0 | 
| okta_auth0 | okta_auth0 | 1.5 | 1.5.0 | 
| okta_sso | okta_sso | 1.5 | 1.5.0 | 
| crowdstrike_falcon | crowdstrike_falcon | 1.5 | 1.5.0 | 
| github_auditlogs | github_auditlogs | 1.5 | 1.5.0 | 
| sentinelone_endpointsecurity | sentinelone_endpointsecurity | 1.5 | 1.5.0 | 
| servicenow_cmdb | servicenow_cmdb | 1.5 | 1.5.0 | 
| wiz_cnapp | wiz_cnapp | 1.5 | 1.5.0 | 
| zscaler_internetaccess | zscaler_internetaccess | 1.5 | 1.5.0 | 

## CSV processor
<a name="csv-processor"></a>

Parses CSV formatted data into structured fields.

**Configuration**  
Configure the CSV processor with the following parameters:

```
processor:
  - csv:      
      column_names: ["col1", "col2", "col3"]
      delimiter: ","
      quote_character: '"'
```

**Parameters**

`column_names` (optional)  
Array of column names for parsed fields. Maximum 100 columns, each name up to 128 characters. If not provided, defaults to column_1, column_2, and so on.

`delimiter` (optional)  
Character used to separate CSV fields. Must be a single character. Defaults to comma (,).

`quote_character` (optional)  
Character used to quote CSV fields containing delimiters. Must be a single character. Defaults to double quote (").

To use the processor without specifying additional parameters, use the following configuration:

```
processor:
  - csv: {}
```
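To illustrate how the `delimiter` and `quote_character` parameters interact, here is a small Python sketch using the standard `csv` module (this is an illustration of the parsing behavior, not the processor's implementation; the log line and column names are invented):

```python
import csv
import io

# Sample log line: the second field contains the delimiter, so it is quoted.
line = 'alice,"GET /home,200",eu-west-1'

# Mirror the processor defaults: delimiter "," and quote_character '"'.
reader = csv.reader(io.StringIO(line), delimiter=",", quotechar='"')
row = next(reader)

# Map the parsed values onto the configured column_names.
column_names = ["col1", "col2", "col3"]
record = dict(zip(column_names, row))
print(record)  # {'col1': 'alice', 'col2': 'GET /home,200', 'col3': 'eu-west-1'}
```

Note that the quoted field keeps its embedded comma intact, which is exactly why `quote_character` matters for fields that contain the delimiter.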

## Grok processor
<a name="grok-processor"></a>

Parses unstructured data using Grok patterns. At most one Grok processor is supported per pipeline. For details on the Grok transformer in CloudWatch Logs, see [Processors that you can use](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CloudWatch-Logs-Transformation-Processors.html) in the *CloudWatch Logs User Guide*.

**Configuration**  
Configure the Grok processor with the following parameters:

Whether the data source is a dictionary or CloudWatch Logs, you can use this configuration:

```
processor:
  - grok:
      match:
       source_key: ["%{WORD:level} %{GREEDYDATA:msg}"]
```

**Parameters**

`match` (required)  
Field mapping with Grok patterns. Only one field mapping is allowed.

`match.<field>` (required)  
Array containing a single Grok pattern. Maximum 512 characters per pattern.

`when` (optional)  
Conditional expression that determines whether this processor executes. Maximum length is 256 characters. See [Expression syntax for conditional processing](conditional-processing.md).

**Important**  
If the Grok processor is used as the parser (first processor) in a pipeline and its `when` condition evaluates to false, the entire pipeline does not execute for that log event. Parsers must run for downstream processors to receive structured data.
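Grok patterns are shorthand for named regular expressions. As a rough illustration of how the pattern above matches a log line (a simplified sketch with a two-entry pattern library, not the service's implementation; real Grok ships many more patterns), consider this Python expansion:

```python
import re

# Minimal pattern library for illustration; real Grok defines many more.
GROK_PATTERNS = {"WORD": r"\w+", "GREEDYDATA": r".*"}

def grok_to_regex(pattern):
    """Expand %{NAME:field} tokens into named regex capture groups."""
    return re.sub(
        r"%\{(\w+):(\w+)\}",
        lambda m: f"(?P<{m.group(2)}>{GROK_PATTERNS[m.group(1)]})",
        pattern,
    )

regex = grok_to_regex("%{WORD:level} %{GREEDYDATA:msg}")
match = re.match(regex, "ERROR connection timed out")
fields = match.groupdict()
print(fields)  # {'level': 'ERROR', 'msg': 'connection timed out'}
```

Each `%{PATTERN:name}` token becomes a named capture group, so the matched pieces land in the structured fields (`level`, `msg`) that downstream processors consume.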

## VPC processor
<a name="vpc-processor"></a>

Parses VPC Flow Log data into structured fields.

**Configuration**  
Configure the VPC processor with the following parameters:

```
processor:
  - parse_vpc: {}
```

## JSON processor
<a name="json-processor"></a>

Parses JSON data into structured fields.

**Configuration**  
Configure the JSON processor with the following parameters:

```
processor:
  - parse_json:
      source: "message"
      destination: "parsed_json"
```

**Parameters**

`source` (optional)  
The field containing the JSON data to parse. If omitted, the entire log message is processed.

`destination` (optional)  
The field where the parsed JSON will be stored. If omitted, parsed fields are added to the root level.
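For example, with `source: "message"` and `destination: "parsed_json"`, the processor behaves conceptually like this Python sketch (the log event and its field values are invented for illustration):

```python
import json

# A log event whose "message" field holds an embedded JSON string.
event = {"message": '{"status": 200, "path": "/health"}'}

# Parse the source field and store the result under the destination field.
event["parsed_json"] = json.loads(event["message"])
print(event["parsed_json"]["status"])  # 200
```

If `destination` were omitted, the parsed keys (`status`, `path`) would instead be merged into the top level of the event.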

## Route 53 processor
<a name="route53-processor"></a>

Parses Route 53 resolver log data into structured fields.

**Configuration**  
Configure the Route 53 processor with the following parameters:

```
processor:
  - parse_route53: {}
```

## Key-value processor
<a name="key-value-processor"></a>

Parses key-value pair formatted data into structured fields.

**Configuration**  
Configure the key-value processor with the following parameters:

```
processor:
  - key_value:
      source: "message"
      destination: "parsed_kv"
      field_delimiter: "&"
      key_value_delimiter: "="
```

**Parameters**

`source` (optional)  
Field containing key-value data. Maximum 128 characters.

`destination` (optional)  
Target field for parsed key-value pairs. Maximum 128 characters.

`field_delimiter` (optional)  
Pattern to split key-value pairs. Maximum 10 characters.

`key_value_delimiter` (optional)  
Pattern to split keys from values. Maximum 10 characters.

`overwrite_if_destination_exists` (optional)  
Whether to overwrite existing destination field.

`prefix` (optional)  
Prefix to add to extracted keys. Maximum 128 characters.

`non_match_value` (optional)  
Value for keys without matches. Maximum 128 characters.

To use the processor without specifying additional parameters, use the following configuration:

```
processor:
  - key_value: {}
```
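As a conceptual sketch of how the parameters above fit together (my own Python approximation, not the processor's actual code; the sample message and `N/A` value are invented), parsing a query-string-style message might look like:

```python
def parse_key_value(text, field_delimiter="&", key_value_delimiter="=",
                    prefix="", non_match_value=None):
    """Split text into pairs, then split each pair into key and value."""
    result = {}
    for pair in text.split(field_delimiter):
        if key_value_delimiter in pair:
            key, value = pair.split(key_value_delimiter, 1)
            result[prefix + key] = value
        else:
            # A pair with no key-value delimiter gets non_match_value.
            result[prefix + pair] = non_match_value
    return result

print(parse_key_value("user=alice&role=admin&orphan", non_match_value="N/A"))
# {'user': 'alice', 'role': 'admin', 'orphan': 'N/A'}
```

Here `field_delimiter` (`&`) separates the pairs, `key_value_delimiter` (`=`) splits each pair, and the token without a delimiter (`orphan`) receives the configured `non_match_value`.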