S3ParquetSource (AWS SDK for Java)

java.lang.Object
- com.amazonaws.services.glue.model.S3ParquetSource

All Implemented Interfaces:

StructuredPojo, Serializable, Cloneable
```
@Generated(value="com.amazonaws:aws-java-sdk-code-generator")
public class S3ParquetSource
extends Object
implements Serializable, Cloneable, StructuredPojo
```
Specifies an Apache Parquet data store stored in Amazon S3.

See Also:

AWS API Documentation, Serialized Form

Constructor Summary

Constructors
Constructor and Description
`S3ParquetSource()`

Method Summary

Modifier and Type	Method and Description
`S3ParquetSource`	`clone()`
`boolean`	`equals(Object obj)`
`S3DirectSourceAdditionalOptions`	`getAdditionalOptions()` Specifies additional connection options.
`String`	`getCompressionType()` Specifies how the data is compressed.
`List<String>`	`getExclusions()` A string containing a JSON list of Unix-style glob patterns to exclude.
`String`	`getGroupFiles()` Grouping files is turned on by default when the input contains more than 50,000 files.
`String`	`getGroupSize()` The target group size in bytes.
`Integer`	`getMaxBand()` This option controls the duration in milliseconds after which the Amazon S3 listing is likely to be consistent.
`Integer`	`getMaxFilesInBand()` This option specifies the maximum number of files to save from the last maxBand milliseconds.
`String`	`getName()` The name of the data store.
`List<GlueSchema>`	`getOutputSchemas()` Specifies the data schema for the S3 Parquet source.
`List<String>`	`getPaths()` A list of the Amazon S3 paths to read from.
`Boolean`	`getRecurse()` If set to true, recursively reads files in all subdirectories under the specified paths.
`int`	`hashCode()`
`Boolean`	`isRecurse()` If set to true, recursively reads files in all subdirectories under the specified paths.
`void`	`marshall(ProtocolMarshaller protocolMarshaller)` Marshalls this structured data using the given `ProtocolMarshaller`.
`void`	`setAdditionalOptions(S3DirectSourceAdditionalOptions additionalOptions)` Specifies additional connection options.
`void`	`setCompressionType(String compressionType)` Specifies how the data is compressed.
`void`	`setExclusions(Collection<String> exclusions)` A string containing a JSON list of Unix-style glob patterns to exclude.
`void`	`setGroupFiles(String groupFiles)` Grouping files is turned on by default when the input contains more than 50,000 files.
`void`	`setGroupSize(String groupSize)` The target group size in bytes.
`void`	`setMaxBand(Integer maxBand)` This option controls the duration in milliseconds after which the Amazon S3 listing is likely to be consistent.
`void`	`setMaxFilesInBand(Integer maxFilesInBand)` This option specifies the maximum number of files to save from the last maxBand milliseconds.
`void`	`setName(String name)` The name of the data store.
`void`	`setOutputSchemas(Collection<GlueSchema> outputSchemas)` Specifies the data schema for the S3 Parquet source.
`void`	`setPaths(Collection<String> paths)` A list of the Amazon S3 paths to read from.
`void`	`setRecurse(Boolean recurse)` If set to true, recursively reads files in all subdirectories under the specified paths.
`String`	`toString()` Returns a string representation of this object.
`S3ParquetSource`	`withAdditionalOptions(S3DirectSourceAdditionalOptions additionalOptions)` Specifies additional connection options.
`S3ParquetSource`	`withCompressionType(ParquetCompressionType compressionType)` Specifies how the data is compressed.
`S3ParquetSource`	`withCompressionType(String compressionType)` Specifies how the data is compressed.
`S3ParquetSource`	`withExclusions(Collection<String> exclusions)` A string containing a JSON list of Unix-style glob patterns to exclude.
`S3ParquetSource`	`withExclusions(String... exclusions)` A string containing a JSON list of Unix-style glob patterns to exclude.
`S3ParquetSource`	`withGroupFiles(String groupFiles)` Grouping files is turned on by default when the input contains more than 50,000 files.
`S3ParquetSource`	`withGroupSize(String groupSize)` The target group size in bytes.
`S3ParquetSource`	`withMaxBand(Integer maxBand)` This option controls the duration in milliseconds after which the Amazon S3 listing is likely to be consistent.
`S3ParquetSource`	`withMaxFilesInBand(Integer maxFilesInBand)` This option specifies the maximum number of files to save from the last maxBand milliseconds.
`S3ParquetSource`	`withName(String name)` The name of the data store.
`S3ParquetSource`	`withOutputSchemas(Collection<GlueSchema> outputSchemas)` Specifies the data schema for the S3 Parquet source.
`S3ParquetSource`	`withOutputSchemas(GlueSchema... outputSchemas)` Specifies the data schema for the S3 Parquet source.
`S3ParquetSource`	`withPaths(Collection<String> paths)` A list of the Amazon S3 paths to read from.
`S3ParquetSource`	`withPaths(String... paths)` A list of the Amazon S3 paths to read from.
`S3ParquetSource`	`withRecurse(Boolean recurse)` If set to true, recursively reads files in all subdirectories under the specified paths.

Methods inherited from class java.lang.Object
getClass, notify, notifyAll, wait, wait, wait

- Constructor Detail
  - S3ParquetSource
```
public S3ParquetSource()
```
- Method Detail
  - setName
```
public void setName(String name)
```
    The name of the data store.
    
    Parameters:
    
    name - The name of the data store.
  - getName
```
public String getName()
```
    The name of the data store.
    
    Returns:
    
    The name of the data store.
  - withName
```
public S3ParquetSource withName(String name)
```
    The name of the data store.
    
    Parameters:
    
    name - The name of the data store.
    
    Returns:
    
    Returns a reference to this object so that method calls can be chained together.
  - getPaths
```
public List<String> getPaths()
```
    A list of the Amazon S3 paths to read from.
    
    Returns:
    
    A list of the Amazon S3 paths to read from.
  - setPaths
```
public void setPaths(Collection<String> paths)
```
    A list of the Amazon S3 paths to read from.
    
    Parameters:
    
    paths - A list of the Amazon S3 paths to read from.
  - withPaths
```
public S3ParquetSource withPaths(String... paths)
```
    A list of the Amazon S3 paths to read from.
    
    NOTE: This method appends the values to the existing list (if any). Use setPaths(java.util.Collection) or withPaths(java.util.Collection) if you want to override the existing values.
    
    Parameters:
    
    paths - A list of the Amazon S3 paths to read from.
    
    Returns:
    
    Returns a reference to this object so that method calls can be chained together.
  - withPaths
```
public S3ParquetSource withPaths(Collection<String> paths)
```
    A list of the Amazon S3 paths to read from.
    
    Parameters:
    
    paths - A list of the Amazon S3 paths to read from.
    
    Returns:
    
    Returns a reference to this object so that method calls can be chained together.
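    The difference between the appending varargs form and the overriding collection form can be sketched with a small stand-in class. `PathsHolder` and the `s3://example-bucket/...` paths are hypothetical names used only for illustration; this is not the SDK class, just a minimal model of the behavior described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for illustration only; NOT the SDK class.
class PathsHolder {
    private List<String> paths;

    // Replaces any existing values with a copy of the given collection.
    void setPaths(Collection<String> newPaths) {
        this.paths = (newPaths == null) ? null : new ArrayList<>(newPaths);
    }

    List<String> getPaths() {
        return paths;
    }

    // Varargs form: appends to the existing list, creating it if needed.
    PathsHolder withPaths(String... more) {
        if (paths == null) {
            paths = new ArrayList<>(more.length);
        }
        paths.addAll(Arrays.asList(more));
        return this; // returning this is what allows chaining
    }

    // Collection form: overrides the existing values.
    PathsHolder withPaths(Collection<String> newPaths) {
        setPaths(newPaths);
        return this;
    }

    public static void main(String[] args) {
        PathsHolder holder = new PathsHolder()
                .withPaths("s3://example-bucket/a/")
                .withPaths("s3://example-bucket/b/"); // appended to the first path
        System.out.println(holder.getPaths());

        // The collection form discards the two paths added above.
        holder.withPaths(Arrays.asList("s3://example-bucket/c/"));
        System.out.println(holder.getPaths());
    }
}
```

    The same append-versus-override split applies to the other list-valued properties that offer both a varargs and a collection overload.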
  - setCompressionType
```
public void setCompressionType(String compressionType)
```
    Specifies how the data is compressed. This is generally not necessary if the data has a standard file extension. Possible values are "gzip" and "bzip".
    
    Parameters:
    
    compressionType - Specifies how the data is compressed. This is generally not necessary if the data has a standard file extension. Possible values are "gzip" and "bzip".
    
    See Also:
    
    ParquetCompressionType
  - getCompressionType
```
public String getCompressionType()
```
    Specifies how the data is compressed. This is generally not necessary if the data has a standard file extension. Possible values are "gzip" and "bzip".
    
    Returns:
    
    Specifies how the data is compressed. This is generally not necessary if the data has a standard file extension. Possible values are "gzip" and "bzip".
    
    See Also:
    
    ParquetCompressionType
  - withCompressionType
```
public S3ParquetSource withCompressionType(String compressionType)
```
    Specifies how the data is compressed. This is generally not necessary if the data has a standard file extension. Possible values are "gzip" and "bzip".
    
    Parameters:
    
    compressionType - Specifies how the data is compressed. This is generally not necessary if the data has a standard file extension. Possible values are "gzip" and "bzip".
    
    Returns:
    
    Returns a reference to this object so that method calls can be chained together.
    
    See Also:
    
    ParquetCompressionType
  - withCompressionType
```
public S3ParquetSource withCompressionType(ParquetCompressionType compressionType)
```
    Specifies how the data is compressed. This is generally not necessary if the data has a standard file extension. Possible values are "gzip" and "bzip".
    
    Parameters:
    
    compressionType - Specifies how the data is compressed. This is generally not necessary if the data has a standard file extension. Possible values are "gzip" and "bzip".
    
    Returns:
    
    Returns a reference to this object so that method calls can be chained together.
    
    See Also:
    
    ParquetCompressionType
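    Since the getter returns a `String` while one `withCompressionType` overload accepts an enum, a plausible reading is that the enum overload stores the enum's string form. The sketch below models that under stated assumptions: `CompressionHolder` and `DemoCompression` are hypothetical, and the two enum constants simply mirror the values mentioned in the text rather than the real `ParquetCompressionType` constants.

```java
// Hypothetical stand-in for illustration only; NOT the SDK class.
class CompressionHolder {
    // Hypothetical enum; its constants mirror the values named in the text.
    enum DemoCompression {
        GZIP("gzip"),
        BZIP("bzip");

        private final String value;

        DemoCompression(String value) {
            this.value = value;
        }

        @Override
        public String toString() {
            return value;
        }
    }

    private String compressionType;

    // String overload: stores the value as given, with no validation.
    CompressionHolder withCompressionType(String compressionType) {
        this.compressionType = compressionType;
        return this;
    }

    // Enum overload: assumed to store the enum's string form.
    CompressionHolder withCompressionType(DemoCompression compressionType) {
        this.compressionType = compressionType.toString();
        return this;
    }

    String getCompressionType() {
        return compressionType;
    }

    public static void main(String[] args) {
        CompressionHolder viaEnum = new CompressionHolder().withCompressionType(DemoCompression.GZIP);
        CompressionHolder viaString = new CompressionHolder().withCompressionType("gzip");
        // Both routes leave the same string in the field.
        System.out.println(viaEnum.getCompressionType().equals(viaString.getCompressionType()));
    }
}
```

    The enum overload gives compile-time checking of the value, while the string overload accepts values that the enum may not list yet.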
  - getExclusions
```
public List<String> getExclusions()
```
    A string containing a JSON list of Unix-style glob patterns to exclude. For example, "[\"**.pdf\"]" excludes all PDF files.
    
    Returns:
    
    A string containing a JSON list of Unix-style glob patterns to exclude. For example, "[\"**.pdf\"]" excludes all PDF files.
  - setExclusions
```
public void setExclusions(Collection<String> exclusions)
```
    A string containing a JSON list of Unix-style glob patterns to exclude. For example, "[\"**.pdf\"]" excludes all PDF files.
    
    Parameters:
    
    exclusions - A string containing a JSON list of Unix-style glob patterns to exclude. For example, "[\"**.pdf\"]" excludes all PDF files.
  - withExclusions
```
public S3ParquetSource withExclusions(String... exclusions)
```
    A string containing a JSON list of Unix-style glob patterns to exclude. For example, "[\"**.pdf\"]" excludes all PDF files.
    
    NOTE: This method appends the values to the existing list (if any). Use setExclusions(java.util.Collection) or withExclusions(java.util.Collection) if you want to override the existing values.
    
    Parameters:
    
    exclusions - A string containing a JSON list of Unix-style glob patterns to exclude. For example, "[\"**.pdf\"]" excludes all PDF files.
    
    Returns:
    
    Returns a reference to this object so that method calls can be chained together.
  - withExclusions
```
public S3ParquetSource withExclusions(Collection<String> exclusions)
```
    A string containing a JSON list of Unix-style glob patterns to exclude. For example, "[\"**.pdf\"]" excludes all PDF files.
    
    Parameters:
    
    exclusions - A string containing a JSON list of Unix-style glob patterns to exclude. For example, "[\"**.pdf\"]" excludes all PDF files.
    
    Returns:
    
    Returns a reference to this object so that method calls can be chained together.
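    To get a feel for what a pattern such as `**.pdf` excludes, the sketch below filters a list of object keys with the JDK's own glob matcher. This is only an approximation: the service's glob dialect may differ from `java.nio.file` globs, and `GlobExclusionDemo` and the sample keys are hypothetical.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GlobExclusionDemo {
    // Returns the keys that do NOT match the exclusion glob.
    static List<String> keep(List<String> keys, String glob) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        List<String> kept = new ArrayList<>();
        for (String key : keys) {
            if (!matcher.matches(Paths.get(key))) {
                kept.add(key);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
                "reports/summary.pdf",
                "data/part-0000.parquet",
                "notes.pdf");
        // "**" crosses directory boundaries, so both PDF keys are excluded.
        System.out.println(keep(keys, "**.pdf"));
    }
}
```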
  - setGroupSize
```
public void setGroupSize(String groupSize)
```
    The target group size in bytes. The default is computed based on the input data size and the size of your cluster. When there are fewer than 50,000 input files, "groupFiles" must be set to "inPartition" for this to take effect.
    
    Parameters:
    
    groupSize - The target group size in bytes. The default is computed based on the input data size and the size of your cluster. When there are fewer than 50,000 input files, "groupFiles" must be set to "inPartition" for this to take effect.
  - getGroupSize
```
public String getGroupSize()
```
    The target group size in bytes. The default is computed based on the input data size and the size of your cluster. When there are fewer than 50,000 input files, "groupFiles" must be set to "inPartition" for this to take effect.
    
    Returns:
    
    The target group size in bytes. The default is computed based on the input data size and the size of your cluster. When there are fewer than 50,000 input files, "groupFiles" must be set to "inPartition" for this to take effect.
  - withGroupSize
```
public S3ParquetSource withGroupSize(String groupSize)
```
    The target group size in bytes. The default is computed based on the input data size and the size of your cluster. When there are fewer than 50,000 input files, "groupFiles" must be set to "inPartition" for this to take effect.
    
    Parameters:
    
    groupSize - The target group size in bytes. The default is computed based on the input data size and the size of your cluster. When there are fewer than 50,000 input files, "groupFiles" must be set to "inPartition" for this to take effect.
    
    Returns:
    
    Returns a reference to this object so that method calls can be chained together.
  - setGroupFiles
```
public void setGroupFiles(String groupFiles)
```
    Grouping files is turned on by default when the input contains more than 50,000 files. To turn on grouping with fewer than 50,000 files, set this parameter to "inPartition". To disable grouping when there are more than 50,000 files, set this parameter to "none".
    
    Parameters:
    
    groupFiles - Grouping files is turned on by default when the input contains more than 50,000 files. To turn on grouping with fewer than 50,000 files, set this parameter to "inPartition". To disable grouping when there are more than 50,000 files, set this parameter to "none".
  - getGroupFiles
```
public String getGroupFiles()
```
    Grouping files is turned on by default when the input contains more than 50,000 files. To turn on grouping with fewer than 50,000 files, set this parameter to "inPartition". To disable grouping when there are more than 50,000 files, set this parameter to "none".
    
    Returns:
    
    Grouping files is turned on by default when the input contains more than 50,000 files. To turn on grouping with fewer than 50,000 files, set this parameter to "inPartition". To disable grouping when there are more than 50,000 files, set this parameter to "none".
  - withGroupFiles
```
public S3ParquetSource withGroupFiles(String groupFiles)
```
    Grouping files is turned on by default when the input contains more than 50,000 files. To turn on grouping with fewer than 50,000 files, set this parameter to "inPartition". To disable grouping when there are more than 50,000 files, set this parameter to "none".
    
    Parameters:
    
    groupFiles - Grouping files is turned on by default when the input contains more than 50,000 files. To turn on grouping with fewer than 50,000 files, set this parameter to "inPartition". To disable grouping when there are more than 50,000 files, set this parameter to "none".
    
    Returns:
    
    Returns a reference to this object so that method calls can be chained together.
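    The interaction between `groupFiles`, `groupSize`, and the 50,000-file threshold can be written down as a small decision rule. This is a reading of the descriptions above, not service code: `GroupingRule` is a hypothetical helper, and treating exactly 50,000 files as "not grouped by default" is an assumption, since the text only covers "more than" and "fewer than".

```java
// Hypothetical helper that restates the documented rule; NOT service code.
class GroupingRule {
    static final long DEFAULT_THRESHOLD = 50_000L;

    // True when file grouping would be active for the given setting and input size.
    static boolean groupingEnabled(String groupFiles, long inputFileCount) {
        if ("inPartition".equals(groupFiles)) {
            return true;  // explicitly turned on, regardless of file count
        }
        if ("none".equals(groupFiles)) {
            return false; // explicitly turned off, regardless of file count
        }
        // Unset: on by default only above the threshold (boundary is an assumption).
        return inputFileCount > DEFAULT_THRESHOLD;
    }

    // groupSize only matters when grouping is active.
    static boolean groupSizeTakesEffect(String groupFiles, long inputFileCount) {
        return groupingEnabled(groupFiles, inputFileCount);
    }

    public static void main(String[] args) {
        System.out.println(groupingEnabled(null, 60_000));          // default, large input
        System.out.println(groupSizeTakesEffect(null, 1_000));      // small input, groupFiles unset
        System.out.println(groupSizeTakesEffect("inPartition", 1_000));
        System.out.println(groupingEnabled("none", 60_000));
    }
}
```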
  - setRecurse
```
public void setRecurse(Boolean recurse)
```
    If set to true, recursively reads files in all subdirectories under the specified paths.
    
    Parameters:
    
    recurse - If set to true, recursively reads files in all subdirectories under the specified paths.
  - getRecurse
```
public Boolean getRecurse()
```
    If set to true, recursively reads files in all subdirectories under the specified paths.
    
    Returns:
    
    If set to true, recursively reads files in all subdirectories under the specified paths.
  - withRecurse
```
public S3ParquetSource withRecurse(Boolean recurse)
```
    If set to true, recursively reads files in all subdirectories under the specified paths.
    
    Parameters:
    
    recurse - If set to true, recursively reads files in all subdirectories under the specified paths.
    
    Returns:
    
    Returns a reference to this object so that method calls can be chained together.
  - isRecurse
```
public Boolean isRecurse()
```
    If set to true, recursively reads files in all subdirectories under the specified paths.
    
    Returns:
    
    If set to true, recursively reads files in all subdirectories under the specified paths.
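    Both `getRecurse()` and `isRecurse()` return a boxed `Boolean`, so a caller should allow for `null`, presumably when the property was never set (an assumption about the unset state). A minimal null-safe check, using a hypothetical `RecurseFlag` helper:

```java
// Hypothetical helper; shows a null-safe way to read a boxed Boolean flag.
class RecurseFlag {
    // Treats null (assumed to mean "not set") the same as false.
    static boolean isRecursive(Boolean recurse) {
        return Boolean.TRUE.equals(recurse);
    }

    public static void main(String[] args) {
        Boolean unset = null;
        // Writing `if (unset)` here would throw NullPointerException on unboxing.
        System.out.println(isRecursive(unset));
        System.out.println(isRecursive(Boolean.TRUE));
    }
}
```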
  - setMaxBand
```
public void setMaxBand(Integer maxBand)
```
    This option controls the duration in milliseconds after which the Amazon S3 listing is likely to be consistent. Files with modification timestamps falling within the last maxBand milliseconds are tracked specially when using job bookmarks to account for Amazon S3 eventual consistency. Most users don't need to set this option. The default is 900000 milliseconds, or 15 minutes.
    
    Parameters:
    
    maxBand - This option controls the duration in milliseconds after which the Amazon S3 listing is likely to be consistent. Files with modification timestamps falling within the last maxBand milliseconds are tracked specially when using job bookmarks to account for Amazon S3 eventual consistency. Most users don't need to set this option. The default is 900000 milliseconds, or 15 minutes.
  - getMaxBand
```
public Integer getMaxBand()
```
    This option controls the duration in milliseconds after which the Amazon S3 listing is likely to be consistent. Files with modification timestamps falling within the last maxBand milliseconds are tracked specially when using job bookmarks to account for Amazon S3 eventual consistency. Most users don't need to set this option. The default is 900000 milliseconds, or 15 minutes.
    
    Returns:
    
    This option controls the duration in milliseconds after which the Amazon S3 listing is likely to be consistent. Files with modification timestamps falling within the last maxBand milliseconds are tracked specially when using job bookmarks to account for Amazon S3 eventual consistency. Most users don't need to set this option. The default is 900000 milliseconds, or 15 minutes.
  - withMaxBand
```
public S3ParquetSource withMaxBand(Integer maxBand)
```
    This option controls the duration in milliseconds after which the Amazon S3 listing is likely to be consistent. Files with modification timestamps falling within the last maxBand milliseconds are tracked specially when using job bookmarks to account for Amazon S3 eventual consistency. Most users don't need to set this option. The default is 900000 milliseconds, or 15 minutes.
    
    Parameters:
    
    maxBand - This option controls the duration in milliseconds after which the Amazon S3 listing is likely to be consistent. Files with modification timestamps falling within the last maxBand milliseconds are tracked specially when using job bookmarks to account for Amazon S3 eventual consistency. Most users don't need to set this option. The default is 900000 milliseconds, or 15 minutes.
    
    Returns:
    
    Returns a reference to this object so that method calls can be chained together.
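    The band arithmetic is easy to check in isolation. The sketch below converts the default to minutes and tests whether a modification time falls inside the band. `MaxBandWindow` is a hypothetical helper, and treating the cutoff instant itself as inside the band is an assumption; the text does not say how the boundary is handled.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper illustrating the maxBand window; NOT service code.
class MaxBandWindow {
    static final int DEFAULT_MAX_BAND_MS = 900_000;

    // True when lastModified lies within the last maxBandMillis before `now`.
    static boolean inBand(Instant lastModified, Instant now, int maxBandMillis) {
        Instant cutoff = now.minusMillis(maxBandMillis);
        // Inclusive cutoff is an assumption.
        return !lastModified.isBefore(cutoff) && !lastModified.isAfter(now);
    }

    public static void main(String[] args) {
        // 900000 ms expressed in minutes.
        System.out.println(Duration.ofMillis(DEFAULT_MAX_BAND_MS).toMinutes());

        Instant now = Instant.parse("2024-01-01T12:00:00Z");
        Instant tenMinutesAgo = now.minus(Duration.ofMinutes(10));
        Instant thirtyMinutesAgo = now.minus(Duration.ofMinutes(30));
        System.out.println(inBand(tenMinutesAgo, now, DEFAULT_MAX_BAND_MS));
        System.out.println(inBand(thirtyMinutesAgo, now, DEFAULT_MAX_BAND_MS));
    }
}
```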
  - setMaxFilesInBand
```
public void setMaxFilesInBand(Integer maxFilesInBand)
```
    This option specifies the maximum number of files to save from the last maxBand milliseconds. If this number is exceeded, extra files are skipped and only processed in the next job run.
    
    Parameters:
    
    maxFilesInBand - This option specifies the maximum number of files to save from the last maxBand milliseconds. If this number is exceeded, extra files are skipped and only processed in the next job run.
  - getMaxFilesInBand
```
public Integer getMaxFilesInBand()
```
    This option specifies the maximum number of files to save from the last maxBand milliseconds. If this number is exceeded, extra files are skipped and only processed in the next job run.
    
    Returns:
    
    This option specifies the maximum number of files to save from the last maxBand milliseconds. If this number is exceeded, extra files are skipped and only processed in the next job run.
  - withMaxFilesInBand
```
public S3ParquetSource withMaxFilesInBand(Integer maxFilesInBand)
```
    This option specifies the maximum number of files to save from the last maxBand milliseconds. If this number is exceeded, extra files are skipped and only processed in the next job run.
    
    Parameters:
    
    maxFilesInBand - This option specifies the maximum number of files to save from the last maxBand milliseconds. If this number is exceeded, extra files are skipped and only processed in the next job run.
    
    Returns:
    
    Returns a reference to this object so that method calls can be chained together.
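    The cap can be pictured as splitting the files in the band into a kept part and a deferred part. The sketch below is one possible reading only: `BandCap` is hypothetical, and keeping files in list order is an assumption, since the text does not say which files are skipped when the limit is exceeded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper illustrating the cap; NOT service code.
class BandCap {
    // Returns two lists: index 0 holds the kept files, index 1 the deferred ones.
    static List<List<String>> split(List<String> filesInBand, int maxFilesInBand) {
        int cut = Math.min(Math.max(maxFilesInBand, 0), filesInBand.size());
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<>(filesInBand.subList(0, cut)));
        result.add(new ArrayList<>(filesInBand.subList(cut, filesInBand.size())));
        return result;
    }

    public static void main(String[] args) {
        List<String> inBand = Arrays.asList("f1", "f2", "f3", "f4", "f5");
        List<List<String>> parts = split(inBand, 3);
        System.out.println("kept: " + parts.get(0));
        System.out.println("deferred to next run: " + parts.get(1));
    }
}
```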
  - setAdditionalOptions
```
public void setAdditionalOptions(S3DirectSourceAdditionalOptions additionalOptions)
```
    Specifies additional connection options.
    
    Parameters:
    
    additionalOptions - Specifies additional connection options.
  - getAdditionalOptions
```
public S3DirectSourceAdditionalOptions getAdditionalOptions()
```
    Specifies additional connection options.
    
    Returns:
    
    Specifies additional connection options.
  - withAdditionalOptions
```
public S3ParquetSource withAdditionalOptions(S3DirectSourceAdditionalOptions additionalOptions)
```
    Specifies additional connection options.
    
    Parameters:
    
    additionalOptions - Specifies additional connection options.
    
    Returns:
    
    Returns a reference to this object so that method calls can be chained together.
  - getOutputSchemas
```
public List<GlueSchema> getOutputSchemas()
```
    Specifies the data schema for the S3 Parquet source.
    
    Returns:
    
    Specifies the data schema for the S3 Parquet source.
  - setOutputSchemas
```
public void setOutputSchemas(Collection<GlueSchema> outputSchemas)
```
    Specifies the data schema for the S3 Parquet source.
    
    Parameters:
    
    outputSchemas - Specifies the data schema for the S3 Parquet source.
  - withOutputSchemas
```
public S3ParquetSource withOutputSchemas(GlueSchema... outputSchemas)
```
    Specifies the data schema for the S3 Parquet source.
    
    NOTE: This method appends the values to the existing list (if any). Use setOutputSchemas(java.util.Collection) or withOutputSchemas(java.util.Collection) if you want to override the existing values.
    
    Parameters:
    
    outputSchemas - Specifies the data schema for the S3 Parquet source.
    
    Returns:
    
    Returns a reference to this object so that method calls can be chained together.
  - withOutputSchemas
```
public S3ParquetSource withOutputSchemas(Collection<GlueSchema> outputSchemas)
```
    Specifies the data schema for the S3 Parquet source.
    
    Parameters:
    
    outputSchemas - Specifies the data schema for the S3 Parquet source.
    
    Returns:
    
    Returns a reference to this object so that method calls can be chained together.
  - toString
```
public String toString()
```
    Returns a string representation of this object. This is useful for testing and debugging. Sensitive data will be redacted from this string using a placeholder value.
    
    Overrides:
    
    toString in class Object
    
    Returns:
    
    A string representation of this object.
    
    See Also:
    
    Object.toString()
  - equals
```
public boolean equals(Object obj)
```
    Overrides:
    
    equals in class Object
  - hashCode
```
public int hashCode()
```
    Overrides:
    
    hashCode in class Object
  - clone
```
public S3ParquetSource clone()
```
    Overrides:
    
    clone in class Object
  - marshall
```
public void marshall(ProtocolMarshaller protocolMarshaller)
```
    Description copied from interface: StructuredPojo
    
    Marshalls this structured data using the given ProtocolMarshaller.
    
    Specified by:
    
    marshall in interface StructuredPojo
    
    Parameters:
    
    protocolMarshaller - Implementation of ProtocolMarshaller used to marshall this object's data.

AWS SDK for Java 1.x API Reference - 1.12.778
