class DataFormat
Language | Type name |
---|---|
.NET | Amazon.CDK.AWS.Glue.DataFormat |
Java | software.amazon.awscdk.services.glue.DataFormat |
Python | aws_cdk.aws_glue.DataFormat |
TypeScript (source) | @aws-cdk/aws-glue » DataFormat |
Defines the input/output formats and ser/de for a single DataFormat.
Example
declare const myDatabase: glue.Database;
new glue.Table(this, 'MyTable', {
  database: myDatabase,
  tableName: 'my_table',
  columns: [{
    name: 'col1',
    type: glue.Schema.STRING,
  }],
  partitionKeys: [{
    name: 'year',
    type: glue.Schema.SMALL_INT,
  }, {
    name: 'month',
    type: glue.Schema.SMALL_INT,
  }],
  dataFormat: glue.DataFormat.JSON,
});
Initializer
new DataFormat(props: DataFormatProps)
Parameters
- props DataFormatProps
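When none of the predefined static formats fits, a custom format can be assembled from the four props. The following is a minimal sketch, assuming the `InputFormat`, `OutputFormat`, `SerializationLibrary`, and `ClassificationString` constants exported by `@aws-cdk/aws-glue`; the variable name `hiveJson` and the choice of the Hive JSON SerDe are illustrative only.

```typescript
import * as glue from '@aws-cdk/aws-glue';

// Illustrative custom format: plain-text files parsed with the Hive JSON SerDe
// (the predefined DataFormat.JSON uses the OpenX JSON SerDe instead).
const hiveJson = new glue.DataFormat({
  inputFormat: glue.InputFormat.TEXT,
  outputFormat: glue.OutputFormat.HIVE_IGNORE_KEY_TEXT,
  serializationLibrary: glue.SerializationLibrary.HIVE_JSON,
  // Optional: omit when no classification should be recorded on the table.
  classificationString: glue.ClassificationString.JSON,
});
```

The resulting object can be passed as the `dataFormat` of a `glue.Table` in the same way as `glue.DataFormat.JSON` in the example above.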
Properties
Name | Type | Description |
---|---|---|
inputFormat | InputFormat | InputFormat for this data format. |
outputFormat | OutputFormat | OutputFormat for this data format. |
serializationLibrary | SerializationLibrary | Serialization library for this data format. |
classificationString? | ClassificationString | Classification string given to tables with this data format. |
static APACHE_LOGS | DataFormat | DataFormat for Apache Web Server Logs. |
static AVRO | DataFormat | DataFormat for Apache Avro. |
static CLOUDTRAIL_LOGS | DataFormat | DataFormat for CloudTrail logs stored on S3. |
static CSV | DataFormat | DataFormat for CSV Files. |
static JSON | DataFormat | Stored as plain text files in JSON format. |
static LOGSTASH | DataFormat | DataFormat for Logstash Logs, using the GROK SerDe. |
static ORC | DataFormat | DataFormat for Apache ORC (Optimized Row Columnar). |
static PARQUET | DataFormat | DataFormat for Apache Parquet. |
static TSV | DataFormat | DataFormat for TSV (Tab-Separated Values). |
inputFormat
Type: InputFormat
InputFormat for this data format.
outputFormat
Type: OutputFormat
OutputFormat for this data format.
serializationLibrary
Type: SerializationLibrary
Serialization library for this data format.
classificationString?
Type: ClassificationString (optional)
Classification string given to tables with this data format.
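As a rough illustration of reading these properties, the sketch below inspects one of the predefined formats. It assumes the `className` field exposed by `InputFormat`, `OutputFormat`, and `SerializationLibrary` and the `value` field exposed by `ClassificationString` in `@aws-cdk/aws-glue`; the variable name `fmt` is illustrative only.

```typescript
import * as glue from '@aws-cdk/aws-glue';

const fmt = glue.DataFormat.PARQUET;

// Each wrapper carries the fully qualified Hive class name as a string.
console.log(fmt.inputFormat.className);
console.log(fmt.outputFormat.className);
console.log(fmt.serializationLibrary.className);

// classificationString is optional, so guard against undefined.
console.log(fmt.classificationString?.value ?? '(no classification)');
```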
static APACHE_LOGS
Type: DataFormat
DataFormat for Apache Web Server Logs.
Also works for CloudFront logs.
See also: https://docs.aws.amazon.com/athena/latest/ug/apache.html
static AVRO
Type: DataFormat
DataFormat for Apache Avro.
See also: https://docs.aws.amazon.com/athena/latest/ug/avro.html
static CLOUDTRAIL_LOGS
Type: DataFormat
DataFormat for CloudTrail logs stored on S3.
See also: https://docs.aws.amazon.com/athena/latest/ug/cloudtrail.html
static CSV
Type: DataFormat
DataFormat for CSV Files.
See also: https://docs.aws.amazon.com/athena/latest/ug/csv.html
static JSON
Type: DataFormat
Stored as plain text files in JSON format.
Uses the OpenX JSON SerDe for serialization and deserialization.
See also: https://docs.aws.amazon.com/athena/latest/ug/json.html
static LOGSTASH
Type: DataFormat
DataFormat for Logstash Logs, using the GROK SerDe.
See also: https://docs.aws.amazon.com/athena/latest/ug/grok.html
static ORC
Type: DataFormat
DataFormat for Apache ORC (Optimized Row Columnar).
See also: https://docs.aws.amazon.com/athena/latest/ug/orc.html
static PARQUET
Type: DataFormat
DataFormat for Apache Parquet.
See also: https://docs.aws.amazon.com/athena/latest/ug/parquet.html
static TSV
Type: DataFormat
DataFormat for TSV (Tab-Separated Values).
See also: https://docs.aws.amazon.com/athena/latest/ug/lazy-simple-serde.html